Commit 46421841 authored by i-robot, committed by Gitee

!879 Upload PSPNet

Merge pull request !879 from tanlin/master
parents 48aed5f3 f4d2d824
Showing 2001 additions and 0 deletions
# Contents
- [PSPNet Description](#pspnet-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environmental Requirements](#environmental-requirements)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Pre-training](#pre-training)
        - [Training](#training)
        - [Training Results](#training-results)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
        - [Evaluation Result](#evaluation-result)
- [Model Description](#model-description)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
# [PSPNet Description](#Contents)
PSPNet (Pyramid Scene Parsing Network) exploits global context information through different-region-based context aggregation, performed by its pyramid pooling module.
[Paper](https://arxiv.org/abs/1612.01105), CVPR 2017
# [Model Architecture](#Contents)
The pyramid pooling module fuses features at four different pyramid scales. To maintain a reasonable gap in representation, the module has four levels with bin sizes of 1×1, 2×2, 3×3 and 6×6 respectively.
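
To make the four bin sizes concrete, here is a minimal NumPy sketch of the pooling-and-concatenation step. It is an illustration only, not the code in `src/model`: the function names are hypothetical, nearest-neighbour upsampling stands in for bilinear interpolation, and the per-level 1×1 convolution used in the paper is omitted.

```python
import numpy as np


def adaptive_avg_pool(feature, bins):
    """Average-pool a (C, H, W) feature map into a (C, bins, bins) grid."""
    channels, height, width = feature.shape
    pooled = np.zeros((channels, bins, bins), dtype=feature.dtype)
    for i in range(bins):
        # Bin edges: floor for the start, ceil for the end (written as -(-a // b)).
        h0, h1 = (i * height) // bins, -(-(i + 1) * height // bins)
        for j in range(bins):
            w0, w1 = (j * width) // bins, -(-(j + 1) * width // bins)
            pooled[:, i, j] = feature[:, h0:h1, w0:w1].mean(axis=(1, 2))
    return pooled


def pyramid_pool(feature, bin_sizes=(1, 2, 3, 6)):
    """Concatenate the input with every pooled level upsampled back to (H, W)."""
    _, height, width = feature.shape
    levels = [feature]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(feature, bins)
        # Nearest-neighbour upsampling: map each output pixel to its bin index.
        rows = np.arange(height) * bins // height
        cols = np.arange(width) * bins // width
        levels.append(pooled[:, rows][:, :, cols])
    return np.concatenate(levels, axis=0)


feature_map = np.arange(2 * 6 * 6, dtype=float).reshape(2, 6, 6)
fused = pyramid_pool(feature_map)  # 2 input channels + 4 levels x 2 channels
```

The 1×1 level reduces to a global average, while the 6×6 level keeps the finest sub-region statistics; concatenating them gives each pixel access to context at all four scales.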
# [Dataset](#Contents)
- [PASCAL VOC 2012 and SBD Dataset Website](http://home.bharathh.info/pubs/codes/SBD/download.html)
    - It contains 11,355 finely annotated images, split into training and testing sets of 8,498 and 2,857 images respectively.
- [ADE20K Dataset Website](http://groups.csail.mit.edu/vision/datasets/ADE20K/)
    - It contains 22,210 finely annotated images, split into training and testing sets of 20,210 and 2,000 images respectively.
# [Environmental Requirements](#Contents)
- Hardware (Ascend)
    - Prepare an Ascend processor to build the hardware environment.
- Framework
    - [MindSpore](https://www.mindspore.cn/install)
- For details, please refer to the following resources:
    - [MindSpore tutorials](https://www.mindspore.cn/tutorials/en/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/master/index.html)
# [Script Description](#Contents)
## Script and Sample Code
```text
.
└─PSPNet
    ├── eval.py                                # Evaluation python file for ADE20K/VOC2012
    ├── export.py                              # export mindir
    ├── README.md                              # descriptions about PSPNet
    ├── src                                    # PSPNet
    │   ├── config                             # the training config file
    │   │   ├── ade20k_pspnet50.yaml
    │   │   └── voc2012_pspnet50.yaml
    │   ├── dataset                            # data processing
    │   │   ├── dataset.py
    │   │   └── transform.py
    │   ├── model                              # models for training and test
    │   │   ├── PSPNet.py
    │   │   ├── resnet.py
    │   │   └── cell.py                        # loss function
    │   └── utils
    │       ├── functions_args.py              # test helper
    │       ├── lr.py                          # learning rate
    │       ├── metric_and_evalcallback.py     # evalcallback
    │       ├── aux_loss.py                    # loss function helper
    │       └── p_util.py                      # some functions
    ├── scripts
    │   ├── run_distribute_train_ascend.sh     # multi-card distributed training on Ascend
    │   ├── run_train1p_ascend.sh              # single-card training on Ascend
    │   └── run_eval.sh                        # validation script
    └── train.py                               # The training python file for ADE20K/VOC2012
```
## Script Parameters
Set the script parameters in `src/config/ade20k_pspnet50.yaml` and `src/config/voc2012_pspnet50.yaml`.
### Model
```bash
name: "PSPNet"
backbone: "resnet50_v2"
base_size: 512 # based size for scaling
crop_size: 473
```
### Optimizer
```bash
init_lr: 0.005
momentum: 0.9
weight_decay: 0.0001
```
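
The full config files in this commit also set `power: 0.9` next to the base learning rate, which suggests the polynomial ("poly") decay used in the PSPNet paper. The sketch below shows that schedule under this assumption; `src/utils/lr.py` is not shown here, so `poly_lr` is a hypothetical helper rather than the repository's function.

```python
def poly_lr(base_lr, step, total_steps, power=0.9):
    """Polynomial decay: base_lr * (1 - step / total_steps) ** power."""
    if not 0 <= step <= total_steps:
        raise ValueError("step must lie in [0, total_steps]")
    return base_lr * (1.0 - step / total_steps) ** power


# 50 epochs of 1063 steps each, matching the single-card VOC2012 log in this README.
total = 50 * 1063
schedule = [poly_lr(0.005, step, total) for step in (0, total // 2, total)]
```

The rate starts at `init_lr`, falls slightly faster than linearly, and reaches zero on the last step.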
### Training
```bash
batch_size: 8 # batch size for training
batch_size_val: 8 # batch size for validation during training
ade_root: "./data/ADE/" # set dataset path
voc_root: "./data/voc/voc"
epochs: 100/50 # ade/voc2012
pretrained_model_path: "./data/resnet_deepbase.ckpt"
save_checkpoint_epochs: 10
keep_checkpoint_max: 10
```
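
In the YAML files these options are grouped under `DATA`, `TRAIN` and `TEST` sections, while `eval.py` reads them as flat attributes such as `args.classes` or `args.train_h`. A minimal sketch of how such a flattening step might work is given below; it is an assumption about the behaviour of `fa.load_cfg_from_cfg_file`, whose source is not shown, and it uses a hand-written dictionary in place of a parsed YAML file.

```python
from types import SimpleNamespace


def flatten_config(sections):
    """Merge per-section option dicts (DATA, TRAIN, TEST) into one flat namespace."""
    flat = {}
    for section_name, options in sections.items():
        for key, value in options.items():
            if key in flat:
                raise KeyError(f"duplicate option {key!r} in section {section_name}")
            flat[key] = value
    return SimpleNamespace(**flat)


# Stand-in for a parsed config; the values are taken from the VOC2012 settings.
parsed = {
    "DATA": {"classes": 21, "prefix": "voc"},
    "TRAIN": {"train_h": 473, "train_w": 473, "zoom_factor": 8},
    "TEST": {"split": "val", "base_size": 512},
}
cfg = flatten_config(parsed)
```

Flat access keeps call sites short, at the cost of requiring option names to be unique across sections.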
## Training Process
### Training
- Train on a single card
```shell
bash scripts/run_train1p_ascend.sh [YAML_PATH] [DEVICE_ID]
```
- Run distributed training on Ascend (the script launches 8 devices)
```shell
bash scripts/run_distribute_train_ascend.sh [RANK_TABLE_FILE] [YAML_PATH]
```
### Training Results
The training results are saved under the PSPNet path. You can view the log in `./LOG/log.txt`.
```bash
# training result(1p)-voc2012
epoch: 1 step: 1063, loss is 0.62588865
epoch time: 493974.632 ms, per step time: 464.699 ms
epoch: 2 step: 1063, loss is 0.68774235
epoch time: 428786.495 ms, per step time: 403.374 ms
epoch: 3 step: 1063, loss is 0.4055968
epoch time: 428773.945 ms, per step time: 403.362 ms
epoch: 4 step: 1063, loss is 0.7540638
epoch time: 428783.473 ms, per step time: 403.371 ms
epoch: 5 step: 1063, loss is 0.49349666
epoch time: 428776.845 ms, per step time: 403.365 ms
```
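
To track the loss over a long run, the log lines above can be parsed with a short script. This is a sketch that assumes the log keeps exactly the `epoch: N step: M, loss is X` format shown; `parse_losses` is a hypothetical helper, not part of the repository.

```python
import re

# Matches lines such as "epoch: 1 step: 1063, loss is 0.62588865".
LOSS_RE = re.compile(r"epoch: (\d+) step: (\d+), loss is ([0-9.eE+-]+)")


def parse_losses(log_text):
    """Return a list of (epoch, step, loss) tuples found in the log text."""
    return [(int(epoch), int(step), float(loss))
            for epoch, step, loss in LOSS_RE.findall(log_text)]


sample_log = (
    "epoch: 1 step: 1063, loss is 0.62588865\n"
    "epoch time: 493974.632 ms, per step time: 464.699 ms\n"
    "epoch: 2 step: 1063, loss is 0.68774235\n"
)
losses = parse_losses(sample_log)
```

For a real run, read the text from `./LOG/log.txt` and pass it to `parse_losses`.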
## Evaluation Process
### Evaluation
Before running the following command, check the checkpoint path (`ckpt`) used for evaluation in `src/config/ade20k_pspnet50.yaml` and `src/config/voc2012_pspnet50.yaml`.
```shell
bash scripts/run_eval.sh [YAML_PATH] [DEVICE_ID]
```
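
During evaluation, `eval.py` in this commit covers each rescaled image with overlapping 473×473 crops (stride rate 2/3) and averages the predictions where crops overlap. The sketch below reproduces only the window arithmetic of its `scale_process` function; `crop_windows` is a hypothetical name, and the image is assumed to be already padded to at least the crop size.

```python
import math


def crop_windows(height, width, crop_h=473, crop_w=473, stride_rate=2 / 3):
    """Return (top, bottom, left, right) windows that tile a padded image."""
    stride_h = math.ceil(crop_h * stride_rate)
    stride_w = math.ceil(crop_w * stride_rate)
    grid_h = math.ceil((height - crop_h) / stride_h) + 1
    grid_w = math.ceil((width - crop_w) / stride_w) + 1
    windows = []
    for index_h in range(grid_h):
        # Clamp the last window to the image border, then step back one crop.
        bottom = min(index_h * stride_h + crop_h, height)
        top = bottom - crop_h
        for index_w in range(grid_w):
            right = min(index_w * stride_w + crop_w, width)
            left = right - crop_w
            windows.append((top, bottom, left, right))
    return windows


windows = crop_windows(512, 683)
```

Because the last window in each direction is clamped to the border, every window has the full crop size and border pixels are still covered.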
### Evaluation Result
The results, which `scripts/run_eval.sh` writes to `./LOG/eval_log.txt`, were as follows:
```bash
ADE20K:mIoU/mAcc/allAcc 0.4164/0.5319/0.7996.
VOC2012:mIoU/mAcc/allAcc 0.7380/0.8229/0.9293.
```
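
The three figures are mean intersection-over-union, mean per-class accuracy and overall pixel accuracy. The NumPy sketch below shows how they follow from per-class pixel counts, mirroring the formulas in `cal_acc` of `eval.py`. The counting helper is an assumption, since `intersectionAndUnion` in `src/utils/p_util.py` is not shown, and predictions are assumed to lie in `[0, num_classes)`.

```python
import numpy as np


def intersection_and_union(pred, target, num_classes, ignore_index=255):
    """Per-class intersection, union and target pixel counts."""
    pred = pred.reshape(-1)
    target = target.reshape(-1)
    keep = target != ignore_index  # drop pixels labelled "ignore"
    pred, target = pred[keep], target[keep]
    matched = pred[pred == target]
    area_inter = np.bincount(matched, minlength=num_classes)
    area_pred = np.bincount(pred, minlength=num_classes)
    area_target = np.bincount(target, minlength=num_classes)
    return area_inter, area_pred + area_target - area_inter, area_target


def summarize(area_inter, area_union, area_target, eps=1e-10):
    """Return (mIoU, mAcc, allAcc) from accumulated per-class counts."""
    miou = float(np.mean(area_inter / (area_union + eps)))
    macc = float(np.mean(area_inter / (area_target + eps)))
    all_acc = float(area_inter.sum() / (area_target.sum() + eps))
    return miou, macc, all_acc


pred = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 255]])
counts = intersection_and_union(pred, target, num_classes=2)
miou, macc, all_acc = summarize(*counts)
```

Over a whole dataset, the counts are summed across images before `summarize` is applied, so large images weigh more than small ones.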
# [Model Description](#Contents)
## Performance
### Distributed Training Performance
| Parameter           | PSPNet                                                                |
| ------------------- | --------------------------------------------------------------------- |
| Resources           | Ascend 910; CPU 2.60GHz, 192 cores; memory: 755G                      |
| Upload date         | 2021.11.13                                                            |
| MindSpore version   | mindspore1.3.0                                                        |
| Training parameters | epoch=100, batch_size=8                                               |
| Optimizer           | SGD optimizer, momentum=0.9, weight_decay=0.0001                      |
| Loss function       | SoftmaxCrossEntropyLoss                                               |
| Training speed      | epoch time: 493974.632 ms, per step time: 464.699 ms (1p for voc2012) |
| Total time          | 6h10m34s (1pcs)                                                       |
| Script URL          | https://gitee.com/mindspore/models/tree/master/research/cv/PSPNet     |
| Random number seed  | set_seed = 1234                                                       |
# [Description of Random Situation](#Contents)
The random seed is set in `train.py`.
# [ModelZoo Homepage](#Contents)
Please visit the official website [homepage](https://gitee.com/mindspore/models).
DATA:
  data_root: /home/HEU_535/PSPNet/data/ADE/
  art_data_root: /cache/data/ADE
  train_list: /home/HEU_535/PSPNet/data/ADE/training_list.txt
  art_train_list: /cache/data/ADE/training_list.txt
  val_list: /home/HEU_535/PSPNet/data/ADE/val_list.txt
  art_val_list: /cache/data/ADE/val_list.txt
  classes: 150
  prefix: ADE
  save_dir: /home/HEU_535/PSPNet/checkpoints/
  backbone: resnet50
  pretrain_path: /home/HEU_535/PSPNet/data/resnet_deepbase.ckpt
  art_pretrain_path: /cache/data/ADE/resnet_deepbase.ckpt
  ckpt: /home/HEU_535/PSPNet/checkpoints/8P/ADE-100_316.ckpt
  obs_save: obs://harbin-engineering-uni/PSPnet/save_checkpoint/ADE/

TRAIN:
  arch: psp
  feature_size: 60
  train_h: 473
  train_w: 473
  scale_min: 0.5  # minimum random scale
  scale_max: 2.0  # maximum random scale
  rotate_min: -10  # minimum random rotate
  rotate_max: 10  # maximum random rotate
  zoom_factor: 8  # zoom factor for final prediction during training, be in [1, 2, 4, 8]
  ignore_label: 255
  aux_weight: 0.4
  data_name: ade
  batch_size: 8  # batch size for training
  art_batch_size: 4
  batch_size_val: 8  # batch size for validation during training
  base_lr: 0.005
  art_base_lr: 0.04
  epochs: 100
  start_epoch: 0
  power: 0.9
  momentum: 0.9
  weight_decay: 0.0001

TEST:
  test_list: /home/HEU_535/PSPNet/data/ADE/list/validation.txt
  split: val  # split in [train, val and test]
  base_size: 512  # based size for scaling
  test_h: 473
  test_w: 473
  scales: [1.0]  # evaluation scales, ms as [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
  index_start: 0  # evaluation start index in list
  index_step: 0  # evaluation step index in list, 0 means to end
  result_path: /home/HEU_535/PSPNet/result/ade/
  color_txt: /home/HEU_535/PSPNet/config/ade20k/ade20k_colors.txt
  name_txt: /home/HEU_535/PSPNet/config/ade20k/ade20k_names.txt
DATA:
  data_root: /home/HEU_535/PSPNet/data/voc/voc/
  art_data_root: /cache/data
  train_list: /home/HEU_535/PSPNet/data/voc/voc/train_list.txt
  art_train_list: /cache/data/train_list.txt
  val_list: /home/HEU_535/PSPNet/data/voc/voc/val_list.txt
  art_val_list: /cache/data/val_list.txt
  classes: 21
  prefix: voc
  save_dir: /home/HEU_535/PSPNet/checkpoints/
  backbone: resnet50
  pretrain_path: /home/HEU_535/PSPNet/data/resnet_deepbase.ckpt
  art_pretrain_path: /cache/data/resnet_deepbase.ckpt
  ckpt: /home/HEU_535/PSPNet/checkpoints/8P/voc-50_133.ckpt
  obs_save: obs://harbin-engineering-uni/PSPnet/save_checkpoint/voc/

TRAIN:
  arch: psp
  feature_size: 60
  train_h: 473
  train_w: 473
  scale_min: 0.5  # minimum random scale
  scale_max: 2.0  # maximum random scale
  rotate_min: -10  # minimum random rotate
  rotate_max: 10  # maximum random rotate
  zoom_factor: 8  # zoom factor for final prediction during training, be in [1, 2, 4, 8]
  ignore_label: 255
  aux_weight: 0.4
  data_name:
  batch_size: 8  # batch size for training
  art_batch_size: 4
  batch_size_val: 8  # batch size for validation during training, memory and speed tradeoff
  base_lr: 0.005
  art_base_lr: 0.02
  epochs: 50
  start_epoch: 0
  power: 0.9
  momentum: 0.9
  weight_decay: 0.0001

TEST:
  test_list: /home/HEU_535/PSPNet/dataset/voc2012/list/val.txt
  split: val  # split in [train, val and test]
  base_size: 512  # based size for scaling
  test_h: 473
  test_w: 473
  scales: [1.0]  # evaluation scales, ms as [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
  index_start: 0  # evaluation start index in list
  index_step: 0  # evaluation step index in list, 0 means to end
  result_path: /home/HEU_535/PSPNet/result/voc/
  color_txt: /home/HEU_535/PSPNet/config/voc2012/voc2012_colors.txt
  name_txt: /home/HEU_535/PSPNet/config/voc2012/voc2012_names.txt
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
""" VOC2012 DATASET EVALUATE """
import os
import time
import logging
import argparse
import cv2
import numpy
from src.dataset import pt_dataset, pt_transform
import src.utils.functions_args as fa
from src.utils.p_util import AverageMeter, intersectionAndUnion, check_makedirs, colorize
import mindspore.numpy as np
from mindspore import Tensor
import mindspore.dataset as ds
from mindspore import context
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore.train.serialization import load_param_into_net, load_checkpoint
cv2.ocl.setUseOpenCL(False)
# The launch scripts export DEVICE_ID; fall back to device 0 when it is unset.
device_id = int(os.getenv('DEVICE_ID', '0'))
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend",
                    device_id=device_id, save_graphs=False)


def get_parser():
    """
    Read parameter file
        -> for ADE20k: ./src/config/ade20k_pspnet50.yaml
        -> for voc2012: ./src/config/voc2012_pspnet50.yaml
    """
    parser = argparse.ArgumentParser(description='MindSpore Semantic Segmentation')
    parser.add_argument('--config', type=str, required=True, default='./src/config/voc2012_pspnet50.yaml',
                        help='config file')
    parser.add_argument('opts', help='see ./src/config/voc2012_pspnet50.yaml for all options', default=None,
                        nargs=argparse.REMAINDER)
    args_ = parser.parse_args()
    assert args_.config is not None
    cfg = fa.load_cfg_from_cfg_file(args_.config)
    if args_.opts is not None:
        cfg = fa.merge_cfg_from_list(cfg, args_.opts)
    return cfg


def get_logger():
    """ logger """
    logger_name = "main-logger"
    logger_ = logging.getLogger(logger_name)
    logger_.setLevel(logging.INFO)
    handler = logging.StreamHandler()
    fmt = "[%(asctime)s %(levelname)s %(filename)s line %(lineno)d %(process)d] %(message)s"
    handler.setFormatter(logging.Formatter(fmt))
    logger_.addHandler(handler)
    return logger_


def check(local_args):
    """ check args """
    assert local_args.classes > 1
    assert local_args.zoom_factor in [1, 2, 4, 8]
    assert local_args.split in ['train', 'val', 'test']
    if local_args.arch == 'psp':
        assert (local_args.train_h - 1) % 8 == 0 and (local_args.train_w - 1) % 8 == 0
    else:
        raise Exception('architecture not supported {} yet'.format(local_args.arch))


def main():
    """ The main function of the evaluate process """
    check(args)
    logger.info("=> creating model ...")
    logger.info("Classes: %s", args.classes)

    value_scale = 255
    mean = [0.485, 0.456, 0.406]
    mean = [item * value_scale for item in mean]
    std = [0.229, 0.224, 0.225]
    std = [item * value_scale for item in std]
    gray_folder = os.path.join(args.result_path, 'gray')
    color_folder = os.path.join(args.result_path, 'color')

    test_transform = pt_transform.Compose([pt_transform.Normalize(mean=mean, std=std, is_train=False)])
    test_data = pt_dataset.SemData(
        split='val', data_root=args.data_root,
        data_list=args.val_list,
        transform=test_transform)

    test_loader = ds.GeneratorDataset(test_data, column_names=["data", "label"],
                                      shuffle=False)
    # batch() returns a new dataset and its result is discarded here, so samples
    # stay unbatched (3-D images), which is what test() expects.
    test_loader.batch(1)
    colors = numpy.loadtxt(args.color_txt).astype('uint8')
    with open(args.name_txt) as name_file:
        names = [line.rstrip('\n') for line in name_file]

    from src.model import pspnet
    PSPNet = pspnet.PSPNet(
        feature_size=args.feature_size,
        num_classes=args.classes,
        backbone=args.backbone,
        pretrained=False,
        pretrained_path="",
        aux_branch=False,
        deep_base=True
    )
    ms_checkpoint = load_checkpoint(args.ckpt)
    load_param_into_net(PSPNet, ms_checkpoint, strict_load=True)
    PSPNet.set_train(False)
    test(test_loader, test_data.data_list, PSPNet, args.classes, mean, std, args.base_size, args.test_h,
         args.test_w, args.scales, gray_folder, color_folder, colors)
    if args.split != 'test':
        cal_acc(test_data.data_list, gray_folder, args.classes, names)


def net_process(model, image, mean, std=None, flip=True):
    """ Give the input to the model"""
    transpose = ops.Transpose()
    input_ = transpose(image, (2, 0, 1))  # (473, 473, 3) -> (3, 473, 473)
    mean = np.array(mean)
    if std is None:
        input_ = input_ - mean[:, None, None]
    else:
        # Convert std only after the None check; converting first would make the check always fail.
        std = np.array(std)
        input_ = (input_ - mean[:, None, None]) / std[:, None, None]

    expand_dim = ops.ExpandDims()
    input_ = expand_dim(input_, 0)
    if flip:
        # Append a horizontally flipped copy so both views run in one forward pass.
        flip_ = ops.ReverseV2(axis=[3])
        flip_input = flip_(input_)
        concat = ops.Concat(axis=0)
        input_ = concat((input_, flip_input))

    model.set_train(False)
    output = model(input_)
    _, _, h_i, w_i = input_.shape
    _, _, h_o, w_o = output.shape
    if (h_o != h_i) or (w_o != w_i):
        bi_linear = nn.ResizeBilinear()
        output = bi_linear(output, size=(h_i, w_i), align_corners=True)
    softmax = nn.Softmax(axis=1)
    output = softmax(output)
    if flip:
        # Flip the second prediction back and average it with the first.
        flip_ = ops.ReverseV2(axis=[2])
        output = (output[0] + flip_(output[1])) / 2
    else:
        output = output[0]
    output = transpose(output, (1, 2, 0))  # Tensor
    output = output.asnumpy()
    return output


def scale_process(model, image, classes, crop_h, crop_w, h, w, mean, std=None, stride_rate=2 / 3):
    """ Process input size """
    ori_h, ori_w, _ = image.shape
    pad_h = max(crop_h - ori_h, 0)
    pad_w = max(crop_w - ori_w, 0)
    pad_h_half = int(pad_h / 2)
    pad_w_half = int(pad_w / 2)
    if pad_h > 0 or pad_w > 0:
        image = cv2.copyMakeBorder(image, pad_h_half, pad_h - pad_h_half, pad_w_half, pad_w - pad_w_half,
                                   cv2.BORDER_CONSTANT, value=mean)

    new_h, new_w, _ = image.shape
    image = Tensor.from_numpy(image)
    stride_h = int(numpy.ceil(crop_h * stride_rate))
    stride_w = int(numpy.ceil(crop_w * stride_rate))
    grid_h = int(numpy.ceil(float(new_h - crop_h) / stride_h) + 1)
    grid_w = int(numpy.ceil(float(new_w - crop_w) / stride_w) + 1)
    prediction_crop = numpy.zeros((new_h, new_w, classes), dtype=float)
    count_crop = numpy.zeros((new_h, new_w), dtype=float)
    for index_h in range(0, grid_h):
        for index_w in range(0, grid_w):
            s_h = index_h * stride_h
            e_h = min(s_h + crop_h, new_h)
            s_h = e_h - crop_h
            s_w = index_w * stride_w
            e_w = min(s_w + crop_w, new_w)
            s_w = e_w - crop_w
            image_crop = image[s_h:e_h, s_w:e_w].copy()
            count_crop[s_h:e_h, s_w:e_w] += 1
            prediction_crop[s_h:e_h, s_w:e_w, :] += net_process(model, image_crop, mean, std)
    prediction_crop /= numpy.expand_dims(count_crop, 2)
    prediction_crop = prediction_crop[pad_h_half:pad_h_half + ori_h, pad_w_half:pad_w_half + ori_w]
    prediction = cv2.resize(prediction_crop, (w, h), interpolation=cv2.INTER_LINEAR)
    return prediction


def test(test_loader, data_list, model, classes, mean, std, base_size, crop_h, crop_w, scales, gray_folder,
         color_folder, colors):
    """ Generate evaluate image """
    logger.info('>>>>>>>>>>>>>>>> Start Evaluation >>>>>>>>>>>>>>>>')
    data_time = AverageMeter()
    batch_time = AverageMeter()
    model.set_train(False)
    end = time.time()
    for i, (input_, _) in enumerate(test_loader):
        data_time.update(time.time() - end)
        input_ = input_.asnumpy()
        image = numpy.transpose(input_, (1, 2, 0))
        h, w, _ = image.shape
        prediction = numpy.zeros((h, w, classes), dtype=float)
        for scale in scales:
            long_size = round(scale * base_size)
            new_h = long_size
            new_w = long_size
            if h > w:
                new_w = round(long_size / float(h) * w)
            else:
                new_h = round(long_size / float(w) * h)

            image_scale = cv2.resize(image, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
            prediction += scale_process(model, image_scale, classes, crop_h, crop_w, h, w, mean, std)
        prediction /= len(scales)
        prediction = numpy.argmax(prediction, axis=2)
        batch_time.update(time.time() - end)
        end = time.time()
        if ((i + 1) % 10 == 0) or (i + 1 == len(data_list)):
            logger.info('Test: [{}/{}] '
                        'Data {data_time.val:.3f} ({data_time.avg:.3f}) '
                        'Batch {batch_time.val:.3f} ({batch_time.avg:.3f}).'.format(i + 1, len(data_list),
                                                                                    data_time=data_time,
                                                                                    batch_time=batch_time))
        check_makedirs(gray_folder)
        check_makedirs(color_folder)
        gray = numpy.uint8(prediction)
        color = colorize(gray, colors)
        image_path, _ = data_list[i]
        image_name = image_path.split('/')[-1].split('.')[0]
        gray_path = os.path.join(gray_folder, image_name + '.png')
        color_path = os.path.join(color_folder, image_name + '.png')
        cv2.imwrite(gray_path, gray)
        color.save(color_path)
    logger.info('<<<<<<<<<<<<<<<<< End Evaluation <<<<<<<<<<<<<<<<<')


def cal_acc(data_list, pred_folder, classes, names):
    """ Calculation evaluating indicator """
    intersection_meter = AverageMeter()
    union_meter = AverageMeter()
    target_meter = AverageMeter()

    for i, (image_path, target_path) in enumerate(data_list):
        image_name = image_path.split('/')[-1].split('.')[0]
        pred = cv2.imread(os.path.join(pred_folder, image_name + '.png'), cv2.IMREAD_GRAYSCALE)
        target = cv2.imread(target_path, cv2.IMREAD_GRAYSCALE)
        if args.prefix == 'ADE':
            target -= 1
        intersection, union, target = intersectionAndUnion(pred, target, classes)
        intersection_meter.update(intersection)
        union_meter.update(union)
        target_meter.update(target)
        accuracy = sum(intersection_meter.val) / (sum(target_meter.val) + 1e-10)
        logger.info(
            'Evaluating {0}/{1} on image {2}, accuracy {3:.4f}.'.format(i + 1, len(data_list), image_name + '.png',
                                                                        accuracy))

    iou_class = intersection_meter.sum / (union_meter.sum + 1e-10)
    accuracy_class = intersection_meter.sum / (target_meter.sum + 1e-10)
    mIoU = numpy.mean(iou_class)
    mAcc = numpy.mean(accuracy_class)
    allAcc = sum(intersection_meter.sum) / (sum(target_meter.sum) + 1e-10)

    logger.info('Eval result: mIoU/mAcc/allAcc {:.4f}/{:.4f}/{:.4f}.'.format(mIoU, mAcc, allAcc))
    for i in range(classes):
        logger.info('Class_{} result: iou/accuracy {:.4f}/{:.4f}, name: {}.'.format(i, iou_class[i], accuracy_class[i],
                                                                                    names[i]))


if __name__ == '__main__':
    args = get_parser()
    logger = get_logger()
    main()
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""export checkpoint file into air, onnx, mindir models"""
import argparse
import numpy as np
import src.utils.functions_args as fa
from src.model import pspnet
import mindspore.common.dtype as dtype
from mindspore import Tensor, context, load_checkpoint, load_param_into_net, export

parser = argparse.ArgumentParser(description='PSPNet export')
parser.add_argument("--device_id", type=int, default=0, help="Device id")
parser.add_argument("--batch_size", type=int, default=1, help="batch size")
parser.add_argument("--yaml_path", type=str, required=True, default='./src/config/voc2012_pspnet50.yaml',
                    help='yaml file path')
parser.add_argument("--ckpt_file", type=str, required=True, default='./checkpoints/voc/ADE-50_1063.ckpt',
                    help="Checkpoint file path.")
parser.add_argument("--file_name", type=str, default="PSPNet", help="output file name.")
parser.add_argument("--file_format", type=str, choices=["AIR", "ONNX", "MINDIR"], default="MINDIR", help="file format")
parser.add_argument('--device_target', type=str, default="Ascend",
                    choices=['Ascend', 'GPU', 'CPU'], help='device target (default: Ascend)')
parser.add_argument("--project_path", type=str, default='/root/PSPNet/',
                    help="project_path,default is /root/PSPNet/")
args = parser.parse_args()

context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target)
if args.device_target == "Ascend":
    context.set_context(device_id=args.device_id)

if __name__ == '__main__':
    config_path = args.yaml_path
    cfg = fa.load_cfg_from_cfg_file(config_path)

    net = pspnet.PSPNet(
        feature_size=cfg.feature_size,
        num_classes=cfg.classes,
        backbone=cfg.backbone,
        pretrained=False,
        pretrained_path="",
        aux_branch=False,
        deep_base=True
    )
    param_dict = load_checkpoint(args.ckpt_file)
    load_param_into_net(net, param_dict, strict_load=True)
    net.set_train(False)

    # Dummy input that fixes the exported graph's shape: (batch, 3, 473, 473).
    img = Tensor(np.ones([args.batch_size, 3, 473, 473]), dtype.float32)
    print("################## Start export ###################")
    export(net, img, file_name=args.file_name, file_format=args.file_format)
    print("################## Finish export ###################")
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 2 ]
then
echo "=============================================================================================================="
echo "Usage: bash scripts/run_distribute_train_ascend.sh [RANK_TABLE_FILE] [YAML_PATH]"
echo "Please run the script as: "
echo "bash /PSPNet/scripts/run_distribute_train_ascend.sh [RANK_TABLE_FILE] [YAML_PATH]"
echo "for example: bash scripts/run_distribute_train_ascend.sh /PSPNet/scripts/config/RANK_TABLE_FILE PSPNet/config/voc2012_pspnet50.yaml"
echo "=============================================================================================================="
exit 1
fi
export RANK_SIZE=8
export RANK_TABLE_FILE=$1
export YAML_PATH=$2
export HCCL_CONNECT_TIMEOUT=6000
for((i=0;i<RANK_SIZE;i++))
do
export DEVICE_ID=$i
rm -rf LOG$i
mkdir ./LOG$i
cp ./*.py ./LOG$i
cp -r ./src ./LOG$i
cd ./LOG$i || exit
export RANK_ID=$i
echo "start training for rank $i, device $DEVICE_ID"
env > env.log
python3 train.py --config="$YAML_PATH"> ./log.txt 2>&1 &
cd ../
done
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 2 ]
then
echo "=============================================================================================================="
echo "Usage: bash /PSPNet/scripts/run_eval.sh [YAML_PATH] [DEVICE_ID]"
echo "for example: bash PSPNet/scripts/run_eval.sh PSPNet/config/voc2012_pspnet50.yaml 0"
echo "=============================================================================================================="
exit 1
fi
rm -rf LOG
mkdir ./LOG
export YAML_PATH=$1
export RANK_SIZE=1
export RANK_ID=0
export DEVICE_ID=$2
echo "start evaluating for device $DEVICE_ID"
env > env.log
python3 eval.py --config="$YAML_PATH" > ./LOG/eval_log.txt 2>&1 &
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 2 ]
then
echo "=============================================================================================================="
echo "Usage: bash PSPNet/scripts/run_train1p_ascend.sh [YAML_PATH] [DEVICE_ID]"
echo "for example: bash PSPNet/scripts/run_train1p_ascend.sh PSPNet/config/voc2012_pspnet50.yaml 0"
echo "=============================================================================================================="
exit 1
fi
rm -rf LOG
mkdir ./LOG
export YAML_PATH=$1
export RANK_SIZE=1
export RANK_ID=0
export DEVICE_ID=$2
echo "start training for device $DEVICE_ID"
env > env.log
python3 train.py --config="$YAML_PATH" > ./LOG/log.txt 2>&1 &
DATA:
  data_root: ./data/ADE/
  train_list: ./data/ADE/training_list.txt
  val_list: ./data/ADE/val_list.txt  # test_list: dataset/ade20k/list/validation.txt
  classes: 150
  prefix: ADE
  save_dir: ./checkpoints/
  backbone: resnet50
  pretrain_path: ./data/resnet_deepbase.ckpt
  ckpt: ./checkpoints/ade/ADE_1-100_2527.ckpt

TRAIN:
  arch: psp
  feature_size: 60
  train_h: 473
  train_w: 473
  scale_min: 0.5  # minimum random scale
  scale_max: 2.0  # maximum random scale
  rotate_min: -10  # minimum random rotate
  rotate_max: 10  # maximum random rotate
  zoom_factor: 8  # zoom factor for final prediction during training, be in [1, 2, 4, 8]
  ignore_label: 255
  aux_weight: 0.4
  data_name: ade
  batch_size: 8  # batch size for training
  batch_size_val: 8  # batch size for validation during training
  base_lr: 0.005
  epochs: 100
  start_epoch: 0
  power: 0.9
  momentum: 0.9
  weight_decay: 0.0001

TEST:
  test_list: ./dataset/ade20k/list/validation.txt
  split: val  # split in [train, val and test]
  base_size: 512  # based size for scaling
  test_h: 473
  test_w: 473
  scales: [1.0]  # evaluation scales, ms as [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
  index_start: 0  # evaluation start index in list
  index_step: 0  # evaluation step index in list, 0 means to end
  result_path: ./result/ade/
  color_txt: ./src/config/ade20k/ade20k_colors.txt
  name_txt: ./src/config/ade20k/ade20k_names.txt
DATA:
  data_root: /home/HEU_535/PSPNet/data/voc/voc/
  train_list: /home/HEU_535/PSPNet/data/voc/voc/train_list.txt
  val_list: /home/HEU_535/PSPNet/data/voc/voc/val_list.txt
  classes: 21
  prefix: voc
  save_dir: /home/HEU_535/PSPNet/checkpoints/
  backbone: resnet50
  pretrain_path: /home/HEU_535/PSPNet/data/resnet_deepbase.ckpt
  ckpt: /home/HEU_535/PSPNet/checkpoints/8P/voc-50_133.ckpt

TRAIN:
  arch: psp
  feature_size: 60
  train_h: 473
  train_w: 473
  scale_min: 0.5  # minimum random scale
  scale_max: 2.0  # maximum random scale
  rotate_min: -10  # minimum random rotate
  rotate_max: 10  # maximum random rotate
  zoom_factor: 8  # zoom factor for final prediction during training, be in [1, 2, 4, 8]
  ignore_label: 255
  aux_weight: 0.4
  data_name:
  batch_size: 8  # batch size for training
  batch_size_val: 8  # batch size for validation during training, memory and speed tradeoff
  base_lr: 0.005
  epochs: 50
  start_epoch: 0
  power: 0.9
  momentum: 0.9
  weight_decay: 0.0001

TEST:
  test_list: /home/HEU_535/PSPNet/dataset/voc2012/list/val.txt
  split: val  # split in [train, val and test]
  base_size: 512  # based size for scaling
  test_h: 473
  test_w: 473
  scales: [1.0]  # evaluation scales, ms as [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
  index_start: 0  # evaluation start index in list
  index_step: 0  # evaluation step index in list, 0 means to end
  result_path: /home/HEU_535/PSPNet/result/voc/
  color_txt: /home/HEU_535/PSPNet/config/voc2012/voc2012_colors.txt
  name_txt: /home/HEU_535/PSPNet/config/voc2012/voc2012_names.txt
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
""" read the dataset file """
import os
import os.path
import cv2
import numpy as np

IMG_EXTENSIONS = ['.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm']


def is_image_file(filename):
    """ check file """
    filename_lower = filename.lower()
    return any(filename_lower.endswith(extension) for extension in IMG_EXTENSIONS)


def make_dataset(split='train', data_root=None, data_list=None):
    """ get data list """
    assert split in ['train', 'val', 'test']
    if not os.path.isfile(data_list):
        raise RuntimeError("Image list file does not exist: " + data_list + "\n")
    image_label_list = []
    with open(data_list) as list_file:
        list_read = list_file.readlines()
    print("Totally {} samples in {} set.".format(len(list_read), split))
    print("Starting Checking image&label pair {} list...".format(split))
    for line in list_read:
        line = line.strip()
        line_split = line.split(' ')
        if split == 'test':
            # Test lists carry one path per line: the image only.
            if len(line_split) != 1:
                raise RuntimeError("Image list file read line error : " + line + "\n")
            image_name = os.path.join(data_root, line_split[0])
            label_name = image_name  # just set place holder for label_name, not for use
        else:
            # Train/val lists carry two space-separated paths: image and label.
            if len(line_split) != 2:
                raise RuntimeError("Image list file read line error : " + line + "\n")
            image_name = os.path.join(data_root, line_split[0])
            label_name = os.path.join(data_root, line_split[1])
        item = (image_name, label_name)
        image_label_list.append(item)
    print("Checking image&label pair {} list done!".format(split))
    return image_label_list
class SemData:
""" dataset class """
def __init__(self, split='train', data_root=None, data_list=None, transform=None, data_name=None):
self.split = split
self.data_list = make_dataset(split, data_root, data_list) # (image_name, label_name)
self.transform = transform
self.data_name = data_name
def __len__(self):
return len(self.data_list)
def __getitem__(self, index):
image_path, label_path = self.data_list[index]
image = cv2.imread(image_path, cv2.IMREAD_COLOR) # BGR 3 channel ndarray wiht shape H * W * 3
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # convert cv2 read image from BGR order to RGB order
image = np.float32(image)
label = cv2.imread(label_path, cv2.IMREAD_GRAYSCALE) # GRAY 1 channel ndarray with shape H * W
if self.data_name is not None:
label -= 1
if image.shape[0] != label.shape[0] or image.shape[1] != label.shape[1]:
raise RuntimeError("Image & label shape mismatch: " + image_path + " " + label_path + "\n")
if self.transform is not None:
image, label = self.transform(image, label)
return image.astype(np.float32), label.astype(np.int32)
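The list-file format that `make_dataset` expects can be illustrated with a small standalone sketch. The helper name `parse_list_line` and the `/data/voc` root are hypothetical and only serve the example; the per-line rules (one field for `test`, two space-separated fields otherwise) follow the function above.

```python
import os


def parse_list_line(line, data_root, split='train'):
    """Hypothetical helper mirroring the per-line logic of make_dataset."""
    fields = line.strip().split(' ')
    expected = 1 if split == 'test' else 2
    if len(fields) != expected:
        raise RuntimeError("Image list file read line error : " + line)
    image_name = os.path.join(data_root, fields[0])
    # The test split has no labels, so the image path doubles as a placeholder
    label_name = image_name if split == 'test' else os.path.join(data_root, fields[1])
    return image_name, label_name


# Illustrative paths only
train_pair = parse_list_line("images/a.jpg labels/a.png\n", "/data/voc")
test_pair = parse_list_line("images/b.jpg\n", "/data/voc", split='test')
```

A line with the wrong number of fields raises `RuntimeError`, which is how a malformed list file surfaces before any image is read.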
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
Functions for input transform
"""
import random
import math
import numbers
import collections.abc  # collections.Iterable was removed in Python 3.10
import numpy as np
import cv2


class Compose:
    """ compose the process functions """

    def __init__(self, segtransform):
        self.segtransform = segtransform

    def __call__(self, image, label):
        for t in self.segtransform:
            image, label = t(image, label)
        return image, label
class Normalize:
    """
    Normalize tensor with mean and standard deviation along channel:
    channel = (channel - mean) / std
    """

    def __init__(self, mean, std=None, is_train=True):
        if std is None:
            assert mean
        else:
            assert len(mean) == len(std)
        self.mean = np.array(mean)
        # Keep None as None: np.array(None) is a 0-d object array, so the
        # `self.std is None` check in __call__ would never succeed otherwise
        self.std = None if std is None else np.array(std)
        self.is_train = is_train

    def __call__(self, image, label):
        if not isinstance(image, np.ndarray) or not isinstance(label, np.ndarray):
            raise RuntimeError("segtransform.Normalize() only handle np.ndarray"
                               "[eg: data read by cv2.imread()].\n")
        if len(image.shape) > 3 or len(image.shape) < 2:
            raise RuntimeError("segtransform.Normalize() only handle np.ndarray with 3 dims or 2 dims.\n")
        if len(image.shape) == 2:
            image = np.expand_dims(image, axis=2)
        if not len(label.shape) == 2:
            raise RuntimeError("segtransform.Normalize() only handle np.ndarray label with 2 dims.\n")
        image = np.transpose(image, (2, 0, 1))  # (473, 473, 3) -> (3, 473, 473)
        if self.is_train:
            if self.std is None:
                image = image - self.mean[:, None, None]
            else:
                image = (image - self.mean[:, None, None]) / self.std[:, None, None]
        return image, label
class Resize:
    """Resize the input to the given size, 'size' is a 2-element tuple or list in the order of (h, w). """

    def __init__(self, size):
        assert (isinstance(size, collections.abc.Iterable) and len(size) == 2)
        self.size = size

    def __call__(self, image, label):
        # cv2.resize takes (w, h), hence the reversed size
        image = cv2.resize(image, self.size[::-1], interpolation=cv2.INTER_LINEAR)
        label = cv2.resize(label, self.size[::-1], interpolation=cv2.INTER_NEAREST)
        return image, label


class RandScale:
    """ Randomly resize image & label with scale factor in [scale_min, scale_max] """

    def __init__(self, scale, aspect_ratio=None):
        assert (isinstance(scale, collections.abc.Iterable) and len(scale) == 2)
        if isinstance(scale, collections.abc.Iterable) and len(scale) == 2 \
                and isinstance(scale[0], numbers.Number) and isinstance(scale[1], numbers.Number) \
                and 0 < scale[0] < scale[1]:
            self.scale = scale
        else:
            raise RuntimeError("segtransform.RandScale() scale param error.\n")
        if aspect_ratio is None:
            self.aspect_ratio = aspect_ratio
        elif isinstance(aspect_ratio, collections.abc.Iterable) and len(aspect_ratio) == 2 \
                and isinstance(aspect_ratio[0], numbers.Number) and isinstance(aspect_ratio[1], numbers.Number) \
                and 0 < aspect_ratio[0] < aspect_ratio[1]:
            self.aspect_ratio = aspect_ratio
        else:
            raise RuntimeError("segtransform.RandScale() aspect_ratio param error.\n")

    def __call__(self, image, label):
        temp_scale = self.scale[0] + (self.scale[1] - self.scale[0]) * random.random()
        temp_aspect_ratio = 1.0
        if self.aspect_ratio is not None:
            temp_aspect_ratio = self.aspect_ratio[0] + (self.aspect_ratio[1] - self.aspect_ratio[0]) * random.random()
            # sqrt keeps the area scale at temp_scale ** 2 while fx / fy equals the sampled ratio
            temp_aspect_ratio = math.sqrt(temp_aspect_ratio)
        scale_factor_x = temp_scale * temp_aspect_ratio
        scale_factor_y = temp_scale / temp_aspect_ratio
        image = cv2.resize(image, None, fx=scale_factor_x, fy=scale_factor_y, interpolation=cv2.INTER_LINEAR)
        label = cv2.resize(label, None, fx=scale_factor_x, fy=scale_factor_y, interpolation=cv2.INTER_NEAREST)
        return image, label
class Crop:
    """Crops the given ndarray image (H*W*C or H*W).
    Args:
        size (sequence or int): Desired output size of the crop. If size is an
            int instead of sequence like (h, w), a square crop (size, size) is made.
    """

    def __init__(self, size, crop_type='center', padding=None, ignore_label=255):
        # e.g. size=[473, 473], crop_type='rand', padding=mean, ignore_label=255
        if isinstance(size, int):
            self.crop_h = size
            self.crop_w = size
        elif isinstance(size, collections.abc.Iterable) and len(size) == 2 \
                and isinstance(size[0], int) and isinstance(size[1], int) \
                and size[0] > 0 and size[1] > 0:
            self.crop_h = size[0]
            self.crop_w = size[1]
        else:
            raise RuntimeError("crop size error.\n")
        if crop_type in ('center', 'rand'):
            self.crop_type = crop_type
        else:
            raise RuntimeError("crop type error: rand | center\n")
        if padding is None:
            self.padding = padding
        elif isinstance(padding, list):
            if all(isinstance(i, numbers.Number) for i in padding):
                self.padding = padding
            else:
                raise RuntimeError("padding in Crop() should be a number list\n")
            if len(padding) != 3:
                raise RuntimeError("padding channel is not equal with 3\n")
        else:
            raise RuntimeError("padding in Crop() should be a number list\n")
        if isinstance(ignore_label, int):
            self.ignore_label = ignore_label
        else:
            raise RuntimeError("ignore_label should be an integer number\n")

    def __call__(self, image, label):
        h, w = label.shape
        pad_h = max(self.crop_h - h, 0)
        pad_w = max(self.crop_w - w, 0)
        pad_h_half = int(pad_h / 2)
        pad_w_half = int(pad_w / 2)
        if pad_h > 0 or pad_w > 0:
            if self.padding is None:
                raise RuntimeError("segtransform.Crop() need padding while padding argument is None\n")
            # Pad the image with the given colour and the label with ignore_label
            image = cv2.copyMakeBorder(image, pad_h_half, pad_h - pad_h_half, pad_w_half, pad_w - pad_w_half,
                                       cv2.BORDER_CONSTANT, value=self.padding)
            label = cv2.copyMakeBorder(label, pad_h_half, pad_h - pad_h_half, pad_w_half, pad_w - pad_w_half,
                                       cv2.BORDER_CONSTANT, value=self.ignore_label)
        h, w = label.shape
        if self.crop_type == 'rand':
            h_off = random.randint(0, h - self.crop_h)
            w_off = random.randint(0, w - self.crop_w)
        else:
            h_off = int((h - self.crop_h) / 2)
            w_off = int((w - self.crop_w) / 2)
        image = image[h_off:h_off + self.crop_h, w_off:w_off + self.crop_w]
        label = label[h_off:h_off + self.crop_h, w_off:w_off + self.crop_w]
        return image, label
class RandRotate:
    """
    Randomly rotate image & label with rotate factor in [rotate_min, rotate_max]
    """

    def __init__(self, rotate, padding, ignore_label=255, p=0.5):
        assert (isinstance(rotate, collections.abc.Iterable) and len(rotate) == 2)
        if isinstance(rotate[0], numbers.Number) and isinstance(rotate[1], numbers.Number) and rotate[0] < rotate[1]:
            self.rotate = rotate
        else:
            raise RuntimeError("segtransform.RandRotate() rotate param error.\n")
        assert padding is not None
        assert isinstance(padding, list) and len(padding) == 3
        if all(isinstance(i, numbers.Number) for i in padding):
            self.padding = padding
        else:
            raise RuntimeError("padding in RandRotate() should be a number list\n")
        assert isinstance(ignore_label, int)
        self.ignore_label = ignore_label
        self.p = p

    def __call__(self, image, label):
        if random.random() < self.p:
            angle = self.rotate[0] + (self.rotate[1] - self.rotate[0]) * random.random()
            h, w = label.shape
            matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1)
            image = cv2.warpAffine(image, matrix, (w, h), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT,
                                   borderValue=self.padding)
            label = cv2.warpAffine(label, matrix, (w, h), flags=cv2.INTER_NEAREST, borderMode=cv2.BORDER_CONSTANT,
                                   borderValue=self.ignore_label)
        return image, label
class RandomHorizontalFlip:
    """ Random Horizontal Flip """

    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, image, label):
        if random.random() < self.p:
            image = cv2.flip(image, 1)
            label = cv2.flip(label, 1)
        return image, label


class RandomVerticalFlip:
    """ Random Vertical Flip """

    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, image, label):
        if random.random() < self.p:
            image = cv2.flip(image, 0)
            label = cv2.flip(label, 0)
        return image, label


class RandomGaussianBlur:
    """
    RandomGaussianBlur
    """

    def __init__(self, radius=5):
        self.radius = radius

    def __call__(self, image, label):
        if random.random() < 0.5:
            image = cv2.GaussianBlur(image, (self.radius, self.radius), 0)
        return image, label


class RGB2BGR:
    """
    Converts image from RGB order to BGR order
    """

    def __init__(self):
        pass

    def __call__(self, image, label):
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
        return image, label


class BGR2RGB:
    """
    Converts image from BGR order to RGB order
    """

    def __init__(self):
        pass

    def __call__(self, image, label):
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        return image, label
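The padding and offset arithmetic in `Crop.__call__` is easy to check by hand. The sketch below reproduces it for the `center` crop type without any image data; `crop_geometry` is a hypothetical helper introduced only for this illustration, and the 400 x 500 input is an arbitrary example.

```python
def crop_geometry(h, w, crop_h, crop_w):
    """Return the border widths and the center-crop offsets for an h x w input."""
    pad_h = max(crop_h - h, 0)
    pad_w = max(crop_w - w, 0)
    top, left = pad_h // 2, pad_w // 2
    # Same order as the cv2.copyMakeBorder arguments: top, bottom, left, right
    borders = (top, pad_h - top, left, pad_w - left)
    padded_h, padded_w = h + pad_h, w + pad_w
    h_off = (padded_h - crop_h) // 2
    w_off = (padded_w - crop_w) // 2
    return borders, (h_off, w_off)


# A 400 x 500 image cropped to 473 x 473: padded vertically, cropped horizontally
borders, offsets = crop_geometry(400, 500, 473, 473)
```

For this input the 73 missing rows are split 36 above and 37 below, and the crop window starts 13 columns in. With `crop_type='rand'` the offsets are instead drawn uniformly from the valid range.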
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
""" PSPNet loss function """
from mindspore import nn
from src.utils.metrics import SoftmaxCrossEntropyLoss


class Aux_CELoss_Cell(nn.Cell):
    """ loss """

    def __init__(self, num_classes=21, ignore_label=255):
        super(Aux_CELoss_Cell, self).__init__()
        self.num_classes = num_classes
        self.loss = SoftmaxCrossEntropyLoss(self.num_classes, ignore_label)

    def construct(self, net_out, target):
        """ the process of calculating the loss """
        if len(net_out) == 2:
            # The network returns (auxiliary prediction, main prediction)
            predict_aux, predict = net_out
            CE_loss = self.loss(predict, target)
            CE_loss_aux = self.loss(predict_aux, target)
            loss = CE_loss + (0.4 * CE_loss_aux)
            return loss
        return self.loss(net_out, target)
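The way `Aux_CELoss_Cell` combines the two heads can be sketched without MindSpore. Here `combine_losses` and the toy absolute-difference loss are hypothetical stand-ins; only the output order (auxiliary first, main second) and the 0.4 weight come from the code above.

```python
def combine_losses(net_out, target, loss_fn, aux_weight=0.4):
    """Weighted sum of main and auxiliary loss, or the plain loss for a single output."""
    if isinstance(net_out, tuple) and len(net_out) == 2:
        predict_aux, predict = net_out  # auxiliary output comes first
        return loss_fn(predict, target) + aux_weight * loss_fn(predict_aux, target)
    return loss_fn(net_out, target)


def toy_loss(prediction, target):
    """Stand-in for the cross-entropy loss, used only to make the arithmetic visible."""
    return abs(prediction - target)


with_aux = combine_losses((3.0, 1.0), 0.0, toy_loss)   # 1.0 + 0.4 * 3.0
without_aux = combine_losses(1.0, 0.0, toy_loss)       # 1.0
```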
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
""" PSPNet """
from src.model.resnet import resnet50
import mindspore
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore.train.serialization import load_param_into_net, load_checkpoint
import mindspore.common.initializer as weight_init


class ResNet(nn.Cell):
    """ The pretrained ResNet """

    def __init__(self, pretrained_path, pretrained=False, deep_base=False, BatchNorm_layer=nn.BatchNorm2d):
        super(ResNet, self).__init__()
        resnet = resnet50(deep_base=deep_base, BatchNorm_layer=BatchNorm_layer)
        if pretrained:
            params = load_checkpoint(pretrained_path)
            load_param_into_net(resnet, params)
        if deep_base:
            self.layer1 = nn.SequentialCell(resnet.conv1, resnet.bn1, resnet.relu, resnet.conv2, resnet.bn2,
                                            resnet.relu, resnet.conv3, resnet.bn3, resnet.relu, resnet.maxpool)
        else:
            self.layer1 = nn.SequentialCell(resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool)
        self.layer2 = resnet.layer1
        self.layer3 = resnet.layer2
        self.layer4 = resnet.layer3
        self.layer5 = resnet.layer4

    def construct(self, x):
        """ ResNet process """
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x_aux = self.layer4(x)  # 1024-channel features for the auxiliary branch
        x = self.layer5(x_aux)
        return x_aux, x
class AdaPool1(nn.Cell):
    """ 1x1 pooling """

    def __init__(self):
        super(AdaPool1, self).__init__()
        self.reduceMean = ops.ReduceMean(keep_dims=True)

    def construct(self, X):
        """ 1x1 pooling process """
        pooled_1x1 = self.reduceMean(X, (-2, -1))
        return pooled_1x1


class AdaPool2(nn.Cell):
    """ 2x2 pooling """

    def __init__(self):
        super(AdaPool2, self).__init__()
        self.reduceMean = ops.ReduceMean()
        self.reshape = ops.Reshape()

    def construct(self, X):
        """ 2x2 pooling process """
        batch_size, channels, _, _ = X.shape
        # Hard-coded for a 60 x 60 feature map: 2 blocks of 30 along each axis
        X = self.reshape(X, (batch_size, channels, 2, 30, 2, 30))
        pooled_2x2_out = self.reduceMean(X, (3, 5))
        return pooled_2x2_out


class AdaPool3(nn.Cell):
    """ 3x3 pooling """

    def __init__(self):
        super(AdaPool3, self).__init__()
        self.reduceMean = ops.ReduceMean()
        self.reshape = ops.Reshape()

    def construct(self, X):
        """ 3x3 pooling process """
        batch_size, channels, _, _ = X.shape
        # Hard-coded for a 60 x 60 feature map: 3 blocks of 20 along each axis
        X = self.reshape(X, (batch_size, channels, 3, 20, 3, 20))
        pooled_3x3_out = self.reduceMean(X, (3, 5))
        return pooled_3x3_out
class _PSPModule(nn.Cell):
    """ PSP module """

    def __init__(self, in_channels, pool_sizes, feature_shape, BatchNorm_layer=nn.BatchNorm2d):
        super(_PSPModule, self).__init__()
        out_channels = in_channels // len(pool_sizes)
        self.BatchNorm_layer = BatchNorm_layer
        self.stage1 = nn.SequentialCell(
            AdaPool1(),
            nn.Conv2d(in_channels, out_channels, kernel_size=1, has_bias=False),
            self.BatchNorm_layer(out_channels),
            nn.ReLU(),
        )
        # Note: this stage has no pooling layer; AdaPool2 is defined above but not used here
        self.stage2 = nn.SequentialCell(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, has_bias=False),
            self.BatchNorm_layer(out_channels),
            nn.ReLU()
        )
        self.stage3 = nn.SequentialCell(
            AdaPool3(),
            nn.Conv2d(in_channels, out_channels, kernel_size=1, has_bias=False),
            self.BatchNorm_layer(out_channels),
            nn.ReLU(),
        )
        # A 10 x 10 window with stride 10 turns a 60 x 60 map into 6 x 6 bins
        self.stage4 = nn.SequentialCell(
            nn.AvgPool2d(kernel_size=10, stride=10),
            nn.Conv2d(in_channels, out_channels, kernel_size=1, has_bias=False),
            self.BatchNorm_layer(out_channels),
            nn.ReLU()
        )
        self.cat = ops.Concat(axis=1)
        self.feature_shape = feature_shape
        self.resize_ops = ops.ResizeBilinear(
            (self.feature_shape[0], self.feature_shape[1]), True
        )
        self.cast = ops.Cast()

    def construct(self, x):
        """ PSP module process """
        x = self.cast(x, mindspore.float32)
        # Upsample every branch back to the input feature size before concatenation
        s1_out = self.resize_ops(self.stage1(x))
        s2_out = self.resize_ops(self.stage2(x))
        s3_out = self.resize_ops(self.stage3(x))
        s4_out = self.resize_ops(self.stage4(x))
        out = (x, s1_out, s2_out, s3_out, s4_out)
        out = self.cat(out)
        return out
class PSPNet(nn.Cell):
    """ PSPNet """

    def __init__(
            self,
            pool_sizes=None,
            feature_size=60,
            num_classes=21,
            backbone="resnet50",
            pretrained=True,
            pretrained_path="",
            aux_branch=False,
            deep_base=False,
            BatchNorm_layer=nn.BatchNorm2d
    ):
        """
        Args:
            pool_sizes (list): pyramid bin sizes, [1, 2, 3, 6] when None.
            feature_size (int): height and width of the backbone output feature map.
            num_classes (int): number of segmentation classes.
            backbone (str): backbone name, only "resnet50" is supported.
            pretrained (bool): whether to load backbone weights from pretrained_path.
            pretrained_path (str): checkpoint path of the pretrained backbone.
            aux_branch (bool): whether to build the auxiliary classification branch.
            deep_base (bool): whether the backbone uses the three-convolution stem.
            BatchNorm_layer: batch normalization cell class.
        """
        super(PSPNet, self).__init__()
        if pool_sizes is None:
            pool_sizes = [1, 2, 3, 6]
        if backbone == "resnet50":
            self.backbone = ResNet(
                pretrained=pretrained,
                pretrained_path=pretrained_path,
                deep_base=deep_base,
                BatchNorm_layer=BatchNorm_layer
            )
            aux_channel = 1024
            out_channel = 2048
        else:
            raise ValueError(
                "Unsupported backbone - `{}`, Use resnet50 .".format(backbone)
            )
        self.BatchNorm_layer = BatchNorm_layer
        self.feature_shape = [feature_size, feature_size]
        self.pool_sizes = [feature_size // pool_size for pool_size in pool_sizes]
        self.ppm = _PSPModule(in_channels=out_channel, pool_sizes=self.pool_sizes, feature_shape=self.feature_shape)
        # The PSP module concatenates its input with four branches, doubling the channels
        self.cls = nn.SequentialCell(
            nn.Conv2d(out_channel * 2, 512, kernel_size=3, padding=1, pad_mode="pad", has_bias=False),
            self.BatchNorm_layer(512),
            nn.ReLU(),
            nn.Dropout(0.9),  # the first positional argument is keep_prob in the MindSpore 1.x API
            nn.Conv2d(512, num_classes, kernel_size=1, has_bias=True)
        )
        self.aux_branch = aux_branch
        if self.aux_branch:
            self.auxiliary_branch = nn.SequentialCell(
                nn.Conv2d(aux_channel, 256, kernel_size=3, padding=1, pad_mode="pad", has_bias=False),
                self.BatchNorm_layer(256),
                nn.ReLU(),
                nn.Dropout(0.9),
                nn.Conv2d(256, num_classes, kernel_size=1, has_bias=True)
            )
        self.resize = nn.ResizeBilinear()
        self.shape = ops.Shape()
        self.init_weights(self.cls)

    def init_weights(self, *models):
        """ init the model parameters """
        for model in models:
            for _, cell in model.cells_and_names():
                if isinstance(cell, nn.Conv2d):
                    cell.weight.set_data(
                        weight_init.initializer(
                            weight_init.HeNormal(), cell.weight.shape, cell.weight.dtype
                        )
                    )
                if isinstance(cell, nn.Dense):
                    cell.weight.set_data(
                        weight_init.initializer(
                            weight_init.TruncatedNormal(0.01),
                            cell.weight.shape,
                            cell.weight.dtype,
                        )
                    )
                    # set_data expects a tensor, so build the constant through the initializer
                    cell.bias.set_data(
                        weight_init.initializer(
                            weight_init.Constant(1e-4), cell.bias.shape, cell.bias.dtype
                        )
                    )

    def construct(self, x):
        """ PSPNet process """
        x_shape = self.shape(x)
        x_aux, x = self.backbone(x)
        x = self.ppm(x)
        out = self.cls(x)
        # Upsample the logits back to the input resolution
        out = self.resize(out, size=(x_shape[2:4]), align_corners=True)
        if self.aux_branch:
            out_aux = self.auxiliary_branch(x_aux)
            output_aux = self.resize(out_aux, size=(x_shape[2:4]), align_corners=True)
            return output_aux, out
        return out
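The numbers that tie the pyramid pooling module to the classifier head can be derived with plain arithmetic. The sketch below assumes the defaults used above (2048 backbone channels, a 60 x 60 feature map, bins 1, 2, 3 and 6); `pyramid_shapes` is a hypothetical helper, not part of the repository.

```python
def pyramid_shapes(in_channels=2048, feature_size=60, bins=(1, 2, 3, 6)):
    """Return pooling kernels, pooled map sizes and the concatenated channel count."""
    branch_channels = in_channels // len(bins)
    # A window with kernel == stride == feature_size // bin yields a bin x bin map
    kernels = [feature_size // b for b in bins]
    pooled = [feature_size // k for k in kernels]
    # The input is concatenated with one reduced-channel branch per bin
    concat_channels = in_channels + len(bins) * branch_channels
    return kernels, pooled, concat_channels


kernels, pooled, concat_channels = pyramid_shapes()
```

The concatenated width of 4096 channels is why the first convolution of `self.cls` takes `out_channel * 2` input channels, and the kernel of 10 for the 6 x 6 bin matches the `nn.AvgPool2d(kernel_size=10, stride=10)` in `stage4`.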
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
""" THE Pretrained model ResNet """
import mindspore.nn as nn


def conv3x3(in_channels, out_channels, stride=1, dilation=1):
    """ 3x3 convolution """
    return nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, dilation=dilation, pad_mode="pad",
                     padding=1, has_bias=False)


class BasicBlock(nn.Cell):
    """ basic Block for resnet """
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, down_sample_layer=None, BatchNorm_layer=nn.BatchNorm2d):
        super(BasicBlock, self).__init__()
        self.BatchNorm_layer = BatchNorm_layer
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = self.BatchNorm_layer(planes)
        self.relu = nn.ReLU()
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = self.BatchNorm_layer(planes)
        self.down_sample_layer = down_sample_layer
        self.stride = stride

    def construct(self, x):
        """ process """
        residual = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        if self.down_sample_layer is not None:
            residual = self.down_sample_layer(x)
        out += residual
        out = self.relu(out)
        return out
class Bottleneck(nn.Cell):
    """ bottleneck for ResNet """
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, down_sample_layer=None, PSP=0, BatchNorm_layer=nn.BatchNorm2d):
        super(Bottleneck, self).__init__()
        self.BatchNorm_layer = BatchNorm_layer
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, has_bias=False)
        self.bn1 = self.BatchNorm_layer(planes)
        # PSP == 1 or 2 replaces the strided convolution with a dilated one (stride 1)
        if PSP == 1:
            self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, pad_mode="pad", padding=2, has_bias=False,
                                   dilation=2)
        elif PSP == 2:
            self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, pad_mode="pad", padding=4, has_bias=False,
                                   dilation=4)
        else:
            self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                                   padding=1, has_bias=False, pad_mode="pad")
        self.bn2 = self.BatchNorm_layer(planes)
        self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1, has_bias=False)
        self.bn3 = self.BatchNorm_layer(planes * self.expansion)
        self.relu = nn.ReLU()
        self.down_sample_layer = down_sample_layer
        self.stride = stride

    def construct(self, x):
        """ process """
        residual = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)
        out = self.conv3(out)
        out = self.bn3(out)
        if self.down_sample_layer is not None:
            residual = self.down_sample_layer(x)
        out += residual
        out = self.relu(out)
        return out
class ResNet(nn.Cell):
    """ ResNet """

    def __init__(self, block, layers, deep_base=False, BatchNorm_layer=nn.BatchNorm2d):
        super(ResNet, self).__init__()
        self.deep_base = deep_base
        self.BatchNorm_layer = BatchNorm_layer
        if not self.deep_base:
            self.inplanes = 64
            self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, has_bias=False, pad_mode="pad")
            self.bn1 = self.BatchNorm_layer(64)
        else:
            self.inplanes = 128
            self.conv1 = conv3x3(3, 64, stride=2)
            self.bn1 = self.BatchNorm_layer(64)
            self.conv2 = conv3x3(64, 64)
            self.bn2 = self.BatchNorm_layer(64)
            self.conv3 = conv3x3(64, 128)
            self.bn3 = self.BatchNorm_layer(128)
        self.relu = nn.ReLU()
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="same")
        self.layer1 = self._make_layer(block, 64, layers[0], PSP=0)
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2, PSP=0)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2, PSP=1)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2, PSP=2)
        self.avgpool = nn.AvgPool2d(7, stride=1)

    def _make_layer(self, block, planes, blocks, PSP, stride=1):
        """ make ResNet layer """
        down_sample_layer = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            if PSP == 0:
                down_sample_layer = nn.SequentialCell(
                    nn.Conv2d(self.inplanes, planes * block.expansion,
                              kernel_size=1, stride=stride, has_bias=False),
                    self.BatchNorm_layer(planes * block.expansion),
                )
            else:
                # Dilated layers keep the resolution, so the shortcut uses stride 1
                down_sample_layer = nn.SequentialCell(
                    nn.Conv2d(self.inplanes, planes * block.expansion,
                              kernel_size=1, stride=1, has_bias=False),
                    self.BatchNorm_layer(planes * block.expansion),
                )
        layers = [block(self.inplanes, planes, stride, down_sample_layer, PSP=PSP)]
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes, PSP=PSP))
        return nn.SequentialCell(*layers)

    def construct(self, x):
        """ ResNet process """
        x = self.relu(self.bn1(self.conv1(x)))
        if self.deep_base:
            x = self.relu(self.bn2(self.conv2(x)))
            x = self.relu(self.bn3(self.conv3(x)))
        x = self.maxpool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        return x
def resnet50(**kwargs):
    """Constructs a ResNet-50 model.
    Args:
        **kwargs: keyword arguments forwarded to ResNet (deep_base, BatchNorm_layer).
            Pretrained weights are not loaded here.
    """
    model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)
    return model
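Because `layer3` and `layer4` use dilation instead of stride, only three stages of this backbone reduce the resolution. The sketch below works through that arithmetic; the helper names are hypothetical, and the "same" rule (output length equals the ceiling of input length over stride) is the usual definition of that padding mode.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution with explicit ("pad" mode) padding."""
    return (size + 2 * padding - kernel) // stride + 1


def same_out(size, stride):
    """Output length under "same" padding: ceil(size / stride)."""
    return -(-size // stride)


def backbone_feature_size(input_size):
    """Spatial size of the final feature map for a square input (non deep_base stem)."""
    size = conv_out(input_size, kernel=7, stride=2, padding=3)  # conv1
    size = same_out(size, stride=2)                             # maxpool
    size = conv_out(size, kernel=3, stride=2, padding=1)        # first block of layer2
    return size  # layer1 has stride 1; layer3 and layer4 are dilated with stride 1


feature_size = backbone_feature_size(473)
```

A 473 x 473 crop therefore ends up as a 60 x 60 feature map, which is the `feature_size=60` default of `PSPNet`.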
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
loss function helper
"""
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.ops import operations as P
from mindspore import dtype as mstype


class SoftmaxCrossEntropyLoss(nn.Cell):
    """ The true calculation process of the loss function """

    def __init__(self, num_cls=21, ignore_label=255):
        super(SoftmaxCrossEntropyLoss, self).__init__()
        self.one_hot = P.OneHot(axis=-1)
        self.on_value = Tensor(1.0, mstype.float32)
        self.off_value = Tensor(0.0, mstype.float32)
        self.cast = P.Cast()
        self.ce = nn.SoftmaxCrossEntropyWithLogits()
        self.not_equal = P.NotEqual()
        self.num_cls = num_cls
        self.ignore_label = ignore_label
        self.mul = P.Mul()
        self.sum = P.ReduceSum(False)
        self.div = P.RealDiv()
        self.transpose = P.Transpose()
        self.reshape = P.Reshape()

    def construct(self, logits, labels):
        """ calculation process of the loss """
        labels_int = self.cast(labels, mstype.int32)
        labels_int = self.reshape(labels_int, (-1,))
        # NCHW -> NHWC, then flatten to one row of class logits per pixel
        logits_ = self.transpose(logits, (0, 2, 3, 1))
        logits_ = self.reshape(logits_, (-1, self.num_cls))
        # Weight is 0 for ignored pixels and 1 otherwise
        weights = self.not_equal(labels_int, self.ignore_label)
        weights = self.cast(weights, mstype.float32)
        one_hot_labels = self.one_hot(
            labels_int, self.num_cls, self.on_value, self.off_value
        )
        loss = self.ce(logits_, one_hot_labels)
        loss = self.mul(weights, loss)
        # Average over the pixels that are not ignored
        loss = self.div(self.sum(loss), self.sum(weights))
        return loss
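The masking in `SoftmaxCrossEntropyLoss.construct` amounts to averaging the per-pixel cross-entropy over the pixels whose label is not `ignore_label`. A minimal pure-Python sketch of that computation follows; it works on flat lists instead of tensors, and `masked_cross_entropy` is a hypothetical name.

```python
import math


def masked_cross_entropy(logits, labels, ignore_label=255):
    """Mean cross-entropy over pixels whose label differs from ignore_label."""
    total, count = 0.0, 0
    for pixel_logits, label in zip(logits, labels):
        if label == ignore_label:
            continue  # corresponds to a weight of 0 in the cell above
        # Numerically stable log-sum-exp of the class logits
        m = max(pixel_logits)
        log_sum = m + math.log(sum(math.exp(v - m) for v in pixel_logits))
        total += log_sum - pixel_logits[label]
        count += 1
    return total / count


# The second pixel is ignored, so only the first contributes
loss = masked_cross_entropy([[0.0, 0.0], [5.0, 1.0]], [0, 255])
```

As in the cell above, the mean is undefined when every pixel carries the ignore label, so a batch like that needs to be handled by the caller.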
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
Functions for parsing args
"""
import os
from ast import literal_eval
import copy
import yaml


class CfgNode(dict):
    """
    CfgNode represents an internal node in the configuration tree. It's a simple
    dict-like container that allows for attribute-based access to keys.
    """

    def __init__(self, init_dict=None, key_list=None, new_allowed=False):
        # Recursively convert nested dictionaries in init_dict into CfgNodes
        init_dict = {} if init_dict is None else init_dict
        key_list = [] if key_list is None else key_list
        for k, v in init_dict.items():
            if isinstance(v, dict):
                # Convert dict to CfgNode
                init_dict[k] = CfgNode(v, key_list=key_list + [k])
        super(CfgNode, self).__init__(init_dict)
        self.new_allowed = new_allowed

    def __getattr__(self, name):
        if name in self:
            return self[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value

    def __str__(self):
        def _indent(s_, num_spaces):
            s__ = s_.split("\n")
            if len(s__) == 1:
                return s_
            first = s__.pop(0)
            s__ = [(num_spaces * " ") + line for line in s__]
            s__ = "\n".join(s__)
            s__ = first + "\n" + s__
            return s__

        r = ""
        s = []
        for k, v in sorted(self.items()):
            separator = "\n" if isinstance(v, CfgNode) else " "
            attr_str = "{}:{}{}".format(str(k), separator, str(v))
            attr_str = _indent(attr_str, 2)
            s.append(attr_str)
        r += "\n".join(s)
        return r

    def __repr__(self):
        return "{}({})".format(self.__class__.__name__, super(CfgNode, self).__repr__())
def load_cfg_from_cfg_file(file_):
    """ load a yaml file and flatten its top-level sections into one CfgNode """
    cfg = {}
    assert os.path.isfile(file_) and file_.endswith('.yaml'), \
        '{} is not a yaml file'.format(file_)
    with open(file_, 'r') as f:
        cfg_from_file = yaml.safe_load(f)
    for key in cfg_from_file:
        for k, v in cfg_from_file[key].items():
            cfg[k] = v
    cfg = CfgNode(cfg)
    return cfg


def merge_cfg_from_list(cfg, cfg_list):
    """ override config values from a flat [key1, value1, key2, value2, ...] list """
    new_cfg = copy.deepcopy(cfg)
    assert len(cfg_list) % 2 == 0
    for full_key, v in zip(cfg_list[0::2], cfg_list[1::2]):
        subkey = full_key.split('.')[-1]
        assert subkey in cfg, 'Non-existent key: {}'.format(full_key)
        value = _decode_cfg_value(v)
        value = _check_and_coerce_cfg_value_type(
            value, cfg[subkey], subkey, full_key
        )
        setattr(new_cfg, subkey, value)
    return new_cfg
def _decode_cfg_value(v):
    """Decodes a raw config value (e.g., from a yaml config files or command
    line argument) into a Python object.
    """
    # All remaining processing is only applied to strings
    if not isinstance(v, str):
        return v
    # Try to interpret `v` as a:
    # string, number, tuple, list, dict, boolean, or None
    try:
        v = literal_eval(v)
    # The following two excepts allow v to pass through when it represents a
    # string.
    #
    # Longer explanation:
    # The type of v is always a string (before calling literal_eval), but
    # sometimes it *represents* a string and other times a data structure, like
    # a list. In the case that v represents a string, what we got back from the
    # yaml parser is 'foo' *without quotes* (so, not '"foo"'). literal_eval is
    # ok with '"foo"', but will raise a ValueError if given 'foo'. In other
    # cases, like paths (v = 'foo/bar' and not v = '"foo/bar"'), literal_eval
    # will raise a SyntaxError.
    except ValueError:
        pass
    except SyntaxError:
        pass
    return v
def _check_and_coerce_cfg_value_type(replacement, original, key, full_key):
    """Checks that `replacement`, which is intended to replace `original` is of
    the right type. The type is correct if it matches exactly or is one of a few
    cases in which the type can be easily coerced.
    """
    if key:
        pass
    original_type = type(original)
    replacement_type = type(replacement)
    # The types must match (with some exceptions)
    if replacement_type == original_type:
        return replacement

    # Cast replacement from from_type to to_type if the replacement and original
    # types match from_type and to_type
    def conditional_cast(from_type_, to_type_):
        """ helper """
        if replacement_type == from_type_ and original_type == to_type_:
            return True, to_type_(replacement)
        return False, None

    # Conditionally casts
    # list <-> tuple
    casts = [(tuple, list), (list, tuple)]
    # For py2: allow converting from str (bytes) to a unicode string
    try:
        casts.append((str, unicode))  # noqa: F821
    except NameError:
        # `unicode` does not exist in Python 3, where the lookup raises NameError
        pass
    for (from_type, to_type) in casts:
        converted, converted_value = conditional_cast(from_type, to_type)
        if converted:
            return converted_value
    raise ValueError(
        "Type mismatch ({} vs. {}) with values ({} vs. {}) for config "
        "key: {}".format(
            original_type, replacement_type, original, replacement, full_key
        )
    )
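How command-line overrides are decoded is easiest to see on a few inputs. The sketch below is a compact equivalent of `_decode_cfg_value`, written as a standalone function named `decode` purely for illustration; the sample strings are made up.

```python
from ast import literal_eval


def decode(value):
    """Parse a string as a Python literal, falling back to the raw string."""
    if not isinstance(value, str):
        return value
    try:
        return literal_eval(value)
    except (ValueError, SyntaxError):
        # Bare words and paths are not valid literals, so they stay strings
        return value


decoded_list = decode("[0.5, 2.0]")
decoded_bool = decode("True")
decoded_path = decode("foo/bar")
decoded_number = decode("0.01")
```

Since the decoded value must then match the type of the existing entry (apart from list and tuple, which are interchangeable), an override such as `"1"` for a float-valued key is rejected by `_check_and_coerce_cfg_value_type` with a `ValueError`.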
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""learning rates"""
import numpy as np


def cosine_lr(base_lr, decay_steps, total_steps):
    """ Cosine decay; yields one learning rate per step (generator) """
    for i in range(total_steps):
        step_ = min(i, decay_steps)
        yield base_lr * 0.5 * (1 + np.cos(np.pi * step_ / decay_steps))


def poly_lr(base_lr, decay_steps, total_steps, end_lr=0.0001, power=0.9):
    """ Polynomial decay from base_lr to end_lr; returns a list with one value per step """
    res = []
    for i in range(total_steps):
        step_ = min(i, decay_steps)
        res.append((base_lr - end_lr) * ((1.0 - step_ / decay_steps) ** power) + end_lr)
    return res


def exponential_lr(base_lr, decay_steps, decay_rate, total_steps, staircase=False):
    """ Exponential decay; yields one learning rate per step (generator) """
    for i in range(total_steps):
        if staircase:
            power_ = i // decay_steps
        else:
            power_ = float(i) / decay_steps
        yield base_lr * (decay_rate ** power_)
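A short worked example makes the shape of the polynomial schedule concrete. `poly_lr` is reproduced here only so the snippet runs on its own; the base rate of 0.01 and the step counts are illustrative values, not settings taken from the training configuration.

```python
def poly_lr(base_lr, decay_steps, total_steps, end_lr=0.0001, power=0.9):
    """Polynomial decay from base_lr to end_lr, one value per step."""
    res = []
    for i in range(total_steps):
        step_ = min(i, decay_steps)  # the rate stays at end_lr after decay_steps
        res.append((base_lr - end_lr) * ((1.0 - step_ / decay_steps) ** power) + end_lr)
    return res


# Decay over 4 steps, then hold for 2 more
schedule = poly_lr(base_lr=0.01, decay_steps=4, total_steps=6)
```

The schedule starts at `base_lr`, decreases monotonically, and holds `end_lr` from step `decay_steps` onward. Unlike `poly_lr`, the `cosine_lr` and `exponential_lr` functions are generators, so they need `list(...)` before being indexed.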
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
""" eval_callback """
from mindspore.nn.metrics.metric import Metric
from src.utils.metrics import SoftmaxCrossEntropyLoss


class pspnet_metric(Metric):
    """ callback class """

    def __init__(self, num_classes=150, ignore_label=255):
        super(pspnet_metric, self).__init__()
        self.loss = SoftmaxCrossEntropyLoss(num_classes, ignore_label)
        self.val_loss = 0
        self.count = 0
        self.clear()

    def clear(self):
        """ reset the accumulated loss and the batch counter """
        self.val_loss = 0
        self.count = 0

    def update(self, *inputs):
        """ accumulate the loss of one batch """
        if len(inputs) != 2:
            raise ValueError('Expect inputs (y_pred, y), but got {}'.format(len(inputs)))
        # The network output is (auxiliary prediction, main prediction)
        _, predict = inputs[0]
        the_loss = self.loss(predict, inputs[1])
        self.val_loss += the_loss
        self.count += 1

    def eval(self):
        """ return the mean loss over all batches seen since clear() """
        return self.val_loss / float(self.count)