diff --git a/research/cv/PSPNet/README.md b/research/cv/PSPNet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..2d1665ceeeea0031f63d1487d84b7c1ec874d0b1
--- /dev/null
+++ b/research/cv/PSPNet/README.md
@@ -0,0 +1,195 @@
+# Contents
+
+- [PSPNet Description](#PSPNet-description)
+- [Model Architecture](#PSPNet-Architecture)
+- [Dataset](#PSPNet-Dataset)
+- [Environmental Requirements](#Environmental)
+- [Script Description](#script-description)
+    - [Script and Sample Code](#script-and-sample-code)
+    - [Script Parameters](#script-parameters)
+    - [Training Process](#training-process)
+        - [Pre-training](#pre-training)
+        - [Training](#training)
+        - [Training Results](#training-results)
+    - [Evaluation Process](#evaluation-process)
+        - [Evaluation](#evaluation)
+        - [Evaluation Result](#evaluation-result)
+- [Model Description](#model-description)
+- [Description of Random Situation](#description-of-random-situation)
+- [ModelZoo Homepage](#modelzoo-homepage)
+
+# [PSPNet Description](#Contents)
+
+PSPNet (Pyramid Scene Parsing Network) exploits the capability of global context information by different-region-based context aggregation through its pyramid pooling module.
+
+[Paper](https://arxiv.org/abs/1612.01105) from CVPR 2017
+
+# [Model Architecture](#Contents)
+
+The pyramid pooling module fuses features under four different pyramid scales. To maintain a reasonable gap in representation, the module is a four-level one with bin sizes of 1×1, 2×2, 3×3 and 6×6, respectively.
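+
+The binning arithmetic behind these pyramid levels can be sketched in a few lines of NumPy (a minimal illustration, assuming the 60×60 backbone feature map produced with this repository's default `feature_size: 60`; the actual MindSpore cells live in `src/model/pspnet.py`):
+
+```python
+import numpy as np
+
+def adaptive_avg_pool(x, bins):
+    """Average-pool a (C, H, W) feature map down to (C, bins, bins)."""
+    c, h, w = x.shape
+    # split H and W into `bins` equal groups and average within each group
+    return x.reshape(c, bins, h // bins, bins, w // bins).mean(axis=(2, 4))
+
+feature = np.random.rand(512, 60, 60).astype(np.float32)
+pyramid = [adaptive_avg_pool(feature, b) for b in (1, 2, 3, 6)]
+print([p.shape for p in pyramid])
+# [(512, 1, 1), (512, 2, 2), (512, 3, 3), (512, 6, 6)]
+```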
+
+# [Dataset](#Content)
+
+- [PASCAL VOC 2012 and SBD Dataset Website](http://home.bharathh.info/pubs/codes/SBD/download.html)
+    - It contains 11,357 finely annotated images split into training and testing sets with 8,498 and 2,857 images respectively.
+- [ADE20K Dataset Website](http://groups.csail.mit.edu/vision/datasets/ADE20K/)
+    - It contains 22,210 finely annotated images split into training and testing sets with 20,210 and 2,000 images respectively.
+
+# [Environmental Requirements](#Contents)
+
+- Hardware (Ascend)
+    - Prepare the Ascend processor to build the hardware environment.
+- Framework:
+    - [MindSpore](https://www.mindspore.cn/install)
+- For details, please refer to the following resources:
+    - [MindSpore Course](https://www.mindspore.cn/tutorials/en/master/index.html)
+    - [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/master/index.html)
+
+# [Script Description](#Content)
+
+## Script and Sample Code
+
+```python
+.
+└─PSPNet
+    ├── eval.py                                # evaluation python file for ADE20K/VOC2012
+    ├── export.py                              # export MINDIR
+    ├── README.md                              # descriptions about PSPNet
+    ├── src
+    │   ├── config                             # the training config files
+    │   │   ├── ade20k_pspnet50.yaml
+    │   │   └── voc2012_pspnet50.yaml
+    │   ├── dataset                            # data processing
+    │   │   ├── pt_dataset.py
+    │   │   └── pt_transform.py
+    │   ├── model                              # models for training and test
+    │   │   ├── pspnet.py
+    │   │   ├── resnet.py
+    │   │   └── cell.py                        # loss function
+    │   └── utils
+    │       ├── functions_args.py              # test helper
+    │       ├── lr.py                          # learning rate
+    │       ├── metric_and_evalcallback.py     # evalcallback
+    │       ├── aux_loss.py                    # loss function helper
+    │       └── p_util.py                      # some functions
+    │
+    ├── scripts
+    │   ├── run_distribute_train_ascend.sh     # multi-card distributed training in Ascend
+    │   ├── run_train1p_ascend.sh              # single-card training in Ascend
+    │   └── run_eval.sh                        # validation script
+    └── train.py                               # the training python file for ADE20K/VOC2012
+```
+
+## Script Parameters
+
+Set script parameters in src/config/ade20k_pspnet50.yaml and src/config/voc2012_pspnet50.yaml.
+
+### Model
+
+```bash
+name: "PSPNet"
+backbone: "resnet50_v2"
+base_size: 512   # base size for scaling
+crop_size: 473
+```
+
+### Optimizer
+
+```bash
+init_lr: 0.005
+momentum: 0.9
+weight_decay: 0.0001
+```
+
+### Training
+
+```bash
+batch_size: 8                                        # batch size for training
+batch_size_val: 8                                    # batch size for validation during training
+ade_root: "./data/ADE/"                              # set dataset path
+voc_root: "./data/voc/voc"
+epochs: 100/50                                       # ade/voc2012
+pretrained_model_path: "./data/resnet_deepbase.ckpt"
+save_checkpoint_epochs: 10
+keep_checkpoint_max: 10
+```
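+
+These YAML files are parsed by the helper in `src/utils/functions_args.py`, which exposes every key as an attribute. A minimal sketch of how the scripts read them (printed values assume the defaults shipped in voc2012_pspnet50.yaml):
+
+```python
+import src.utils.functions_args as fa
+
+cfg = fa.load_cfg_from_cfg_file('src/config/voc2012_pspnet50.yaml')
+# keys from the DATA/TRAIN/TEST sections become plain attributes
+print(cfg.classes, cfg.batch_size, cfg.base_lr)  # 21 8 0.005
+```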
+
+## Training Process
+
+### Training
+
+- Train on a single card
+
+```shell
+    bash scripts/run_train1p_ascend.sh [YAML_PATH] [DEVICE_ID]
+```
+
+- Run distributed training in the Ascend processor environment
+
+```shell
+    bash scripts/run_distribute_train_ascend.sh [RANK_TABLE_FILE] [YAML_PATH]
+```
+
+### Training Result
+
+The training results will be saved in the PSPNet path. You can view the log in ./LOG/log.txt.
+
+```bash
+# training result(1p)-voc2012
+epoch: 1 step: 1063, loss is 0.62588865
+epoch time: 493974.632 ms, per step time: 464.699 ms
+epoch: 2 step: 1063, loss is 0.68774235
+epoch time: 428786.495 ms, per step time: 403.374 ms
+epoch: 3 step: 1063, loss is 0.4055968
+epoch time: 428773.945 ms, per step time: 403.362 ms
+epoch: 4 step: 1063, loss is 0.7540638
+epoch time: 428783.473 ms, per step time: 403.371 ms
+epoch: 5 step: 1063, loss is 0.49349666
+epoch time: 428776.845 ms, per step time: 403.365 ms
+```
+
+## Evaluation Process
+
+### Evaluation
+
+Check the checkpoint path in config/ade20k_pspnet50.yaml and config/voc2012_pspnet50.yaml used for evaluation before running the following command.
+
+```shell
+    bash run_eval.sh [YAML_PATH] [DEVICE_ID]
+```
+
+### Evaluation Result
+
+The results in eval/log are as follows:
+
+```bash
+ADE20K:mIoU/mAcc/allAcc 0.4164/0.5319/0.7996.
+VOC2012:mIoU/mAcc/allAcc 0.7380/0.8229/0.9293.
+```
+
+# [Model Description](#Content)
+
+## Performance
+
+### Distributed Training Performance
+
+| Parameter           | PSPNet                                                               |
+| ------------------- | -------------------------------------------------------------------- |
+| Resources           | Ascend 910; CPU 2.60GHz, 192 cores; memory, 755 GB                    |
+| Upload date         | 2021.11.13                                                            |
+| MindSpore version   | MindSpore 1.3.0                                                       |
+| Training parameters | epoch=100, batch_size=8                                               |
+| Optimizer           | SGD optimizer, momentum=0.9, weight_decay=0.0001                      |
+| Loss function       | SoftmaxCrossEntropyLoss                                               |
+| Training speed      | epoch time: 493974.632 ms, per step time: 464.699 ms (1p for voc2012) |
+| Total time          | 6h10m34s (1pcs)                                                       |
+| Script URL          | https://gitee.com/mindspore/models/tree/master/research/cv/PSPNet     |
+| Random number seed  | set_seed = 1234                                                       |
+
+# [Description of Random Situation](#Content)
+
+The random seed is set in `train.py` (set_seed = 1234).
+
+# [ModelZoo Homepage](#Content)
+
+Please visit the official website [homepage](https://gitee.com/mindspore/models).
diff --git a/research/cv/PSPNet/config/ade20k_pspnet50.yaml b/research/cv/PSPNet/config/ade20k_pspnet50.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..9e7f82ca4911a452b1587418470451ac349d203d
--- /dev/null
+++ b/research/cv/PSPNet/config/ade20k_pspnet50.yaml
@@ -0,0 +1,53 @@
+DATA:
+  data_root: /home/HEU_535/PSPNet/data/ADE/
+  art_data_root: /cache/data/ADE
+  train_list: /home/HEU_535/PSPNet/data/ADE/training_list.txt
+  art_train_list: /cache/data/ADE/training_list.txt
+  val_list: /home/HEU_535/PSPNet/data/ADE/val_list.txt
+  art_val_list: /cache/data/ADE/val_list.txt
+  classes: 150
+  prefix: ADE
+  save_dir: /home/HEU_535/PSPNet/checkpoints/
+  backbone: resnet50
+  pretrain_path: /home/HEU_535/PSPNet/data/resnet_deepbase.ckpt
+  art_pretrain_path: /cache/data/ADE/resnet_deepbase.ckpt
+  ckpt: /home/HEU_535/PSPNet/checkpoints/8P/ADE-100_316.ckpt
+  obs_save: obs://harbin-engineering-uni/PSPnet/save_checkpoint/ADE/
+
+TRAIN:
+  arch: psp
+  feature_size: 60
+  train_h: 473
+  train_w: 473
+  scale_min: 0.5   # minimum random scale
+  scale_max: 2.0   # maximum random scale
+  rotate_min: -10  # minimum random rotate
+  rotate_max: 10   # maximum random rotate
+  zoom_factor: 8   # zoom factor for final prediction during training, be in [1, 2, 4, 8]
+  ignore_label: 255
+  aux_weight: 0.4
+  data_name: ade
+  batch_size: 8    # batch size for training
+  art_batch_size: 4
+  batch_size_val: 8  # batch size for validation during training
+  base_lr: 0.005
+  art_base_lr: 0.04
+  epochs: 100
+  start_epoch: 0
+  power: 0.9
+  momentum: 0.9
+  weight_decay: 0.0001
+
+
+TEST:
+  test_list: /home/HEU_535/PSPNet/data/ADE/list/validation.txt
+  split: val       # split in [train, val and test]
+  base_size: 512   # base size for scaling
+  test_h: 473
+  test_w: 473
+  scales: [1.0]    # evaluation scales, ms as [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
+  index_start: 0   # evaluation start index in list
+  index_step: 0    # evaluation step index in list, 0 means to end
+  result_path: /home/HEU_535/PSPNet/result/ade/
+  color_txt: /home/HEU_535/PSPNet/config/ade20k/ade20k_colors.txt
+  name_txt: /home/HEU_535/PSPNet/config/ade20k/ade20k_names.txt
diff --git a/research/cv/PSPNet/config/voc2012_pspnet50.yaml b/research/cv/PSPNet/config/voc2012_pspnet50.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..bddf449b4d74897b1a1d6d01f2510c2bed848792
--- /dev/null
+++ b/research/cv/PSPNet/config/voc2012_pspnet50.yaml
@@ -0,0 +1,53 @@
+DATA:
+  data_root: /home/HEU_535/PSPNet/data/voc/voc/
+  art_data_root: /cache/data
+  train_list: /home/HEU_535/PSPNet/data/voc/voc/train_list.txt
+  art_train_list:
/cache/data/train_list.txt + val_list: /home/HEU_535/PSPNet/data/voc/voc/val_list.txt + art_val_list: /cache/data/val_list.txt + classes: 21 + prefix: voc + save_dir: /home/HEU_535/PSPNet/checkpoints/ + backbone: resnet50 + pretrain_path: /home/HEU_535/PSPNet/data/resnet_deepbase.ckpt + art_pretrain_path: /cache/data/resnet_deepbase.ckpt + ckpt: /home/HEU_535/PSPNet/checkpoints/8P/voc-50_133.ckpt + obs_save: obs://harbin-engineering-uni/PSPnet/save_checkpoint/voc/ + +TRAIN: + arch: psp + feature_size: 60 + train_h: 473 + train_w: 473 + scale_min: 0.5 # minimum random scale + scale_max: 2.0 # maximum random scale + rotate_min: -10 # minimum random rotate + rotate_max: 10 # maximum random rotate + zoom_factor: 8 # zoom factor for final prediction during training, be in [1, 2, 4, 8] + ignore_label: 255 + aux_weight: 0.4 + data_name: + batch_size: 8 # batch size for training + art_batch_size: 4 + batch_size_val: 8 # batch size for validation during training, memory and speed tradeoff + base_lr: 0.005 + art_base_lr: 0.02 + epochs: 50 + start_epoch: 0 + power: 0.9 + momentum: 0.9 + weight_decay: 0.0001 + + +TEST: + test_list: /home/HEU_535/PSPNet/dataset/voc2012/list/val.txt + split: val # split in [train, val and test] + base_size: 512 # based size for scaling + test_h: 473 + test_w: 473 + scales: [1.0] # evaluation scales, ms as [0.5, 0.75, 1.0, 1.25, 1.5, 1.75] + index_start: 0 # evaluation start index in list + index_step: 0 # evaluation step index in list, 0 means to end + result_path: /home/HEU_535/PSPNet/result/voc/ + color_txt: /home/HEU_535/PSPNet/config/voc2012/voc2012_colors.txt + name_txt: /home/HEU_535/PSPNet/config/voc2012/voc2012_names.txt diff --git a/research/cv/PSPNet/eval.py b/research/cv/PSPNet/eval.py new file mode 100644 index 0000000000000000000000000000000000000000..f8e845b88cd0762dae8a33ea114860a9ea9a4668 --- /dev/null +++ b/research/cv/PSPNet/eval.py @@ -0,0 +1,285 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================
+""" ADE20K/VOC2012 DATASET EVALUATE """
+import os
+import time
+import logging
+import argparse
+import cv2
+import numpy
+from src.dataset import pt_dataset, pt_transform
+import src.utils.functions_args as fa
+from src.utils.p_util import AverageMeter, intersectionAndUnion, check_makedirs, colorize
+import mindspore.numpy as np
+from mindspore import Tensor
+import mindspore.dataset as ds
+from mindspore import context
+import mindspore.nn as nn
+import mindspore.ops as ops
+from mindspore.train.serialization import load_param_into_net, load_checkpoint
+
+cv2.ocl.setUseOpenCL(False)
+device_id = int(os.getenv('DEVICE_ID'))
+context.set_context(mode=context.GRAPH_MODE, device_target="Ascend",
+                    device_id=device_id, save_graphs=False)
+
+
+def get_parser():
+    """
+    Read parameter file
+        -> for ADE20K: ./src/config/ade20k_pspnet50.yaml
+        -> for VOC2012: ./src/config/voc2012_pspnet50.yaml
+    """
+    parser = argparse.ArgumentParser(description='MindSpore Semantic Segmentation')
+    parser.add_argument('--config', type=str, required=True, default='./src/config/voc2012_pspnet50.yaml',
+                        help='config file')
+    parser.add_argument('opts', help='see ./src/config/voc2012_pspnet50.yaml for all options', default=None,
+                        nargs=argparse.REMAINDER)
+    args_ = parser.parse_args()
+    assert args_.config is not None
+    cfg = fa.load_cfg_from_cfg_file(args_.config)
+    if args_.opts is not None:
+        cfg = fa.merge_cfg_from_list(cfg, args_.opts)
+    return cfg
+
+
+def get_logger():
+    """ logger """
+    logger_name = "main-logger"
+    logger_ = logging.getLogger(logger_name)
+    logger_.setLevel(logging.INFO)
+    handler = logging.StreamHandler()
+    fmt = "[%(asctime)s %(levelname)s %(filename)s line %(lineno)d %(process)d] %(message)s"
+    handler.setFormatter(logging.Formatter(fmt))
+    logger_.addHandler(handler)
+    return logger_
+
+
+def check(local_args):
+    """ check args """
+    assert local_args.classes > 1
+    assert local_args.zoom_factor in [1, 2, 4, 8]
+    assert local_args.split in ['train', 'val', 'test']
+    if local_args.arch == 'psp':
+        assert (local_args.train_h - 1) % 8 == 0 and (local_args.train_w - 1) % 8 == 0
+    else:
+        raise Exception('architecture not supported {} yet'.format(local_args.arch))
+
+
+def main():
+    """ The main function of the evaluate process """
+    check(args)
+    logger.info("=> creating model ...")
+    logger.info("Classes: %s", args.classes)
+
+    value_scale = 255
+    mean = [0.485, 0.456, 0.406]
+    mean = [item * value_scale for item in mean]
+    std = [0.229, 0.224, 0.225]
+    std = [item * value_scale for item in std]
+    gray_folder = os.path.join(args.result_path, 'gray')
+    color_folder = os.path.join(args.result_path, 'color')
+
+    test_transform = pt_transform.Compose([pt_transform.Normalize(mean=mean, std=std, is_train=False)])
+    test_data = pt_dataset.SemData(
+        split='val', data_root=args.data_root,
+        data_list=args.val_list,
+        transform=test_transform)
+
+    test_loader = ds.GeneratorDataset(test_data, column_names=["data", "label"],
+                                      shuffle=False)
+    test_loader.batch(1)
+    colors = numpy.loadtxt(args.color_txt).astype('uint8')
+    names = [line.rstrip('\n') for line in open(args.name_txt)]
+
+    from src.model import pspnet
+    PSPNet = pspnet.PSPNet(
+        feature_size=args.feature_size,
+        num_classes=args.classes,
+        backbone=args.backbone,
+        pretrained=False,
+        pretrained_path="",
+        aux_branch=False,
+        deep_base=True
+    )
+
+    ms_checkpoint = load_checkpoint(args.ckpt)
+    load_param_into_net(PSPNet, ms_checkpoint, strict_load=True)
+    PSPNet.set_train(False)
+    test(test_loader, test_data.data_list, PSPNet, args.classes, mean, std, args.base_size, args.test_h,
+         args.test_w, args.scales, gray_folder, color_folder, colors)
+    if args.split != 'test':
+        cal_acc(test_data.data_list, gray_folder, args.classes, names)
+
+
+def net_process(model, image, mean, std=None, flip=True):
+    """ Give the input to the model """
+    transpose = ops.Transpose()
+    input_ = transpose(image, (2, 0, 1))  # (473, 473, 3) -> (3, 473, 473)
+    mean = np.array(mean)
+    if std is None:
+        input_ = input_ - mean[:, None, None]
+    else:
+        std = np.array(std)
+        input_ = (input_ - mean[:, None, None]) / std[:, None, None]
+
+    expand_dim = ops.ExpandDims()
+    input_ = expand_dim(input_, 0)
+    if flip:
+        # run the image and its horizontal flip in one batch
+        flip_ = ops.ReverseV2(axis=[3])
+        flip_input = flip_(input_)
+        concat = ops.Concat(axis=0)
+        input_ = concat((input_, flip_input))
+
+    model.set_train(False)
+    output = model(input_)
+    _, _, h_i, w_i = input_.shape
+    _, _, h_o, w_o = output.shape
+    if (h_o != h_i) or (w_o != w_i):
+        bi_linear = nn.ResizeBilinear()
+        output = bi_linear(output, size=(h_i, w_i), align_corners=True)
+    softmax = nn.Softmax(axis=1)
+    output = softmax(output)
+    if flip:
+        # average the prediction with its un-flipped counterpart
+        flip_ = ops.ReverseV2(axis=[2])
+        output = (output[0] + flip_(output[1])) / 2
+    else:
+        output = output[0]
+    output = transpose(output, (1, 2, 0))  # Tensor
+    output = output.asnumpy()
+    return output
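+
+
+# Note on the sliding-window evaluation in scale_process below (illustrative
+# arithmetic, assuming the default 473x473 crop on a padded 512-pixel side):
+# windows are placed every ceil(473 * 2 / 3) = 316 pixels, giving
+# ceil((512 - 473) / 316) + 1 = 2 overlapping crops per axis, and the summed
+# softmax outputs are normalized by the per-pixel visit counts.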
+def scale_process(model, image, classes, crop_h, crop_w, h, w, mean, std=None, stride_rate=2 / 3):
+    """ Process input size """
+    ori_h, ori_w, _ = image.shape
+    # pad the image up to the crop size with the dataset mean
+    pad_h = max(crop_h - ori_h, 0)
+    pad_w = max(crop_w - ori_w, 0)
+    pad_h_half = int(pad_h / 2)
+    pad_w_half = int(pad_w / 2)
+    if pad_h > 0 or pad_w > 0:
+        image = cv2.copyMakeBorder(image, pad_h_half, pad_h - pad_h_half, pad_w_half, pad_w - pad_w_half,
+                                   cv2.BORDER_CONSTANT, value=mean)
+
+    new_h, new_w, _ = image.shape
+    image = Tensor.from_numpy(image)
+    stride_h = int(numpy.ceil(crop_h * stride_rate))
+    stride_w = int(numpy.ceil(crop_w * stride_rate))
+    grid_h = int(numpy.ceil(float(new_h - crop_h) / stride_h) + 1)
+    grid_w = int(numpy.ceil(float(new_w - crop_w) / stride_w) + 1)
+    prediction_crop = numpy.zeros((new_h, new_w, classes), dtype=float)
+    count_crop = numpy.zeros((new_h, new_w), dtype=float)
+    for index_h in range(0, grid_h):
+        for index_w in range(0, grid_w):
+            # clamp each window to the image border
+            s_h = index_h * stride_h
+            e_h = min(s_h + crop_h, new_h)
+            s_h = e_h - crop_h
+            s_w = index_w * stride_w
+            e_w = min(s_w + crop_w, new_w)
+            s_w = e_w - crop_w
+            image_crop = image[s_h:e_h, s_w:e_w].copy()
+            count_crop[s_h:e_h, s_w:e_w] += 1
+            prediction_crop[s_h:e_h, s_w:e_w, :] += net_process(model, image_crop, mean, std)
+    prediction_crop /= numpy.expand_dims(count_crop, 2)
+    prediction_crop = prediction_crop[pad_h_half:pad_h_half + ori_h, pad_w_half:pad_w_half + ori_w]
+    prediction = cv2.resize(prediction_crop, (w, h), interpolation=cv2.INTER_LINEAR)
+    return prediction
+
+
+def test(test_loader, data_list, model, classes, mean, std, base_size, crop_h, crop_w, scales, gray_folder,
+         color_folder, colors):
+    """ Generate evaluate image """
+    logger.info('>>>>>>>>>>>>>>>> Start Evaluation >>>>>>>>>>>>>>>>')
+    data_time = AverageMeter()
+    batch_time = AverageMeter()
+    model.set_train(False)
+    end = time.time()
+    for i, (input_, _) in enumerate(test_loader):
+        data_time.update(time.time() - end)
+        input_ = input_.asnumpy()
+        image = numpy.transpose(input_, (1, 2, 0))
+        h, w, _ = image.shape
+        prediction = numpy.zeros((h, w, classes), dtype=float)
+        for scale in scales:
+            long_size = 
round(scale * base_size) + new_h = long_size + new_w = long_size + if h > w: + new_w = round(long_size / float(h) * w) + else: + new_h = round(long_size / float(w) * h) + + image_scale = cv2.resize(image, (new_w, new_h), interpolation=cv2.INTER_LINEAR) + prediction += scale_process(model, image_scale, classes, crop_h, crop_w, h, w, mean, std) + prediction /= len(scales) + prediction = numpy.argmax(prediction, axis=2) + batch_time.update(time.time() - end) + end = time.time() + if ((i + 1) % 10 == 0) or (i + 1 == len(data_list)): + logger.info('Test: [{}/{}] ' + 'Data {data_time.val:.3f} ({data_time.avg:.3f}) ' + 'Batch {batch_time.val:.3f} ({batch_time.avg:.3f}).'.format(i + 1, len(data_list), + data_time=data_time, + batch_time=batch_time)) + check_makedirs(gray_folder) + check_makedirs(color_folder) + gray = numpy.uint8(prediction) + color = colorize(gray, colors) + image_path, _ = data_list[i] + image_name = image_path.split('/')[-1].split('.')[0] + gray_path = os.path.join(gray_folder, image_name + '.png') + color_path = os.path.join(color_folder, image_name + '.png') + cv2.imwrite(gray_path, gray) + color.save(color_path) + logger.info('<<<<<<<<<<<<<<<<< End Evaluation <<<<<<<<<<<<<<<<<') + + +def cal_acc(data_list, pred_folder, classes, names): + """ Calculation evaluating indicator """ + intersection_meter = AverageMeter() + union_meter = AverageMeter() + target_meter = AverageMeter() + + for i, (image_path, target_path) in enumerate(data_list): + image_name = image_path.split('/')[-1].split('.')[0] + pred = cv2.imread(os.path.join(pred_folder, image_name + '.png'), cv2.IMREAD_GRAYSCALE) + target = cv2.imread(target_path, cv2.IMREAD_GRAYSCALE) + if args.prefix == 'ADE': + target -= 1 + intersection, union, target = intersectionAndUnion(pred, target, classes) + intersection_meter.update(intersection) + union_meter.update(union) + target_meter.update(target) + accuracy = sum(intersection_meter.val) / (sum(target_meter.val) + 1e-10) + logger.info( + 'Evaluating {0}/{1} on image {2}, accuracy {3:.4f}.'.format(i + 1, len(data_list), image_name + '.png', + accuracy)) + + iou_class = intersection_meter.sum / (union_meter.sum + 1e-10) + accuracy_class = intersection_meter.sum / (target_meter.sum + 1e-10) + mIoU = numpy.mean(iou_class) + mAcc = numpy.mean(accuracy_class) + allAcc = sum(intersection_meter.sum) / (sum(target_meter.sum) + 1e-10) + + logger.info('Eval result: mIoU/mAcc/allAcc {:.4f}/{:.4f}/{:.4f}.'.format(mIoU, mAcc, allAcc)) + for i in range(classes): + logger.info('Class_{} result: iou/accuracy {:.4f}/{:.4f}, name: {}.'.format(i, iou_class[i], accuracy_class[i], + names[i])) + + +if __name__ == '__main__': + args = get_parser() + logger = get_logger() + main() diff --git a/research/cv/PSPNet/export.py b/research/cv/PSPNet/export.py new file mode 100644 index 0000000000000000000000000000000000000000..68f54b95481fd1e7e1994e0e1102c3a1e046e4da --- /dev/null +++ b/research/cv/PSPNet/export.py @@ -0,0 +1,64 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""export checkpoint file into air, onnx, mindir models"""
+import argparse
+import numpy as np
+import src.utils.functions_args as fa
+from src.model import pspnet
+import mindspore.common.dtype as dtype
+from mindspore import Tensor, context, load_checkpoint, load_param_into_net, export
+
+parser = argparse.ArgumentParser(description='PSPNet export')
+parser.add_argument("--device_id", type=int, default=0, help="Device id")
+parser.add_argument("--batch_size", type=int, default=1, help="batch size")
+parser.add_argument("--yaml_path", type=str, required=True, default='./src/config/voc2012_pspnet50.yaml',
+                    help='yaml file path')
+parser.add_argument("--ckpt_file", type=str, required=True, default='./checkpoints/voc/ADE-50_1063.ckpt',
+                    help="Checkpoint file path.")
+parser.add_argument("--file_name", type=str, default="PSPNet", help="output file name.")
+parser.add_argument("--file_format", type=str, choices=["AIR", "ONNX", "MINDIR"], default="MINDIR", help="file format")
+parser.add_argument('--device_target', type=str, default="Ascend",
+                    choices=['Ascend', 'GPU', 'CPU'], help='device target (default: Ascend)')
+parser.add_argument("--project_path", type=str, default='/root/PSPNet/',
+                    help="project path, default is /root/PSPNet/")
+args = parser.parse_args()
+
+context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target)
+
+if args.device_target == "Ascend":
+    context.set_context(device_id=args.device_id)
+
+if __name__ == '__main__':
+    config_path = args.yaml_path
+    cfg = fa.load_cfg_from_cfg_file(config_path)
+
+    net = pspnet.PSPNet(
+        feature_size=cfg.feature_size,
+        num_classes=cfg.classes,
+        backbone=cfg.backbone,
+        pretrained=False,
+        pretrained_path="",
+        aux_branch=False,
+        deep_base=True
+    )
+    param_dict = load_checkpoint(args.ckpt_file)
+
+    load_param_into_net(net, param_dict, strict_load=True)
+    net.set_train(False)
+
+    img = Tensor(np.ones([args.batch_size, 3, 473, 473]), dtype.float32)
+    print("################## Start export ###################")
+    export(net, img, file_name=args.file_name, file_format=args.file_format)
+    print("################## Finish export ###################")
diff --git a/research/cv/PSPNet/requirements.txt b/research/cv/PSPNet/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/research/cv/PSPNet/scripts/run_distribute_train_ascend.sh b/research/cv/PSPNet/scripts/run_distribute_train_ascend.sh
new file mode 100644
index 0000000000000000000000000000000000000000..7110a25929c938413982515936f67f642d0c39a6
--- /dev/null
+++ b/research/cv/PSPNet/scripts/run_distribute_train_ascend.sh
@@ -0,0 +1,47 @@
+#!/bin/bash
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================ + +if [ $# != 2 ] +then + echo "==============================================================================================================" + echo "Usage: bash scripts/run_distribute_train_ascend.sh [RANK_TABLE_FILE] [YAML_PATH]" + echo "Please run the script as: " + echo "bash /PSPNet/scripts/run_distribute_train_ascend.sh [RANK_TABLE_FILE] [YAML_PATH]" + echo "for example: bash scripts/run_distribute_train_ascend.sh /PSPNet/scripts/config/RANK_TABLE_FILE PSPNet/config/voc2012_pspnet50.yaml" + echo "==============================================================================================================" + exit 1 +fi + +export RANK_SIZE=8 +export RANK_TABLE_FILE=$1 +export YAML_PATH=$2 +export HCCL_CONNECT_TIMEOUT=6000 + +for((i=0;i<RANK_SIZE;i++)) +do + export DEVICE_ID=$i + rm -rf LOG$i + mkdir ./LOG$i + cp ./*.py ./LOG$i + cp -r ./src ./LOG$i + cd ./LOG$i || exit + export RANK_ID=$i + echo "start training for rank $i, device $DEVICE_ID" + env > env.log + python3 train.py --config="$YAML_PATH"> ./log.txt 2>&1 & + + cd ../ +done \ No newline at end of file diff --git a/research/cv/PSPNet/scripts/run_eval.sh b/research/cv/PSPNet/scripts/run_eval.sh new file mode 100644 index 0000000000000000000000000000000000000000..5574e853c89b367cdb40dcb5e65da5bbc6dbed9f --- /dev/null +++ b/research/cv/PSPNet/scripts/run_eval.sh @@ -0,0 +1,35 @@ +#!/bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +if [ $# != 2 ] +then + echo "==============================================================================================================" + echo "Usage: bash /PSPNet/scripts/run_eval.sh [YAML_PATH] [DEVICE_ID]" + echo "for example: bash PSPNet/scripts/run_eval.sh PSPNet/config/voc2012_pspnet50.yaml 0" + echo "==============================================================================================================" + exit 1 +fi + +rm -rf LOG +mkdir ./LOG +export YAML_PATH=$1 +export RANK_SIZE=1 +export RANK_ID=0 +export DEVICE_ID=$2 +echo "start evaluating for device $DEVICE_ID" +env > env.log + +python3 eval.py --config="$YAML_PATH" > ./LOG/eval_log.txt 2>&1 & \ No newline at end of file diff --git a/research/cv/PSPNet/scripts/run_train1p_ascend.sh b/research/cv/PSPNet/scripts/run_train1p_ascend.sh new file mode 100644 index 0000000000000000000000000000000000000000..c87a815e5b0cb4a767f19ad4305808c8c788a2c4 --- /dev/null +++ b/research/cv/PSPNet/scripts/run_train1p_ascend.sh @@ -0,0 +1,35 @@ +#!/bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +if [ $# != 2 ] +then + echo "==============================================================================================================" + echo "Usage: bash PSPNet/scripts/run_train1p_ascend.sh [YAML_PATH] [DEVICE_ID]" + echo "for example: bash PSPNet/scripts/run_train1p_ascend.sh PSPNet/config/voc2012_pspnet50.yaml 0" + echo "==============================================================================================================" + exit 1 +fi + +rm -rf LOG +mkdir ./LOG +export YAML_PATH=$1 +export RANK_SIZE=1 +export RANK_ID=0 +export DEVICE_ID=$2 +echo "start training for device $DEVICE_ID" +env > env.log + +python3 train.py --config="$YAML_PATH" > ./LOG/log.txt 2>&1 & diff --git a/research/cv/PSPNet/src/config/ade20k_pspnet50.yaml b/research/cv/PSPNet/src/config/ade20k_pspnet50.yaml new file mode 100644 index 0000000000000000000000000000000000000000..818389f27b8ac30f78dee029260404f24437f18c --- /dev/null +++ b/research/cv/PSPNet/src/config/ade20k_pspnet50.yaml @@ -0,0 +1,47 @@ +DATA: + data_root: ./data/ADE/ + train_list: ./data/ADE/training_list.txt + val_list: ./data/ADE/val_list.txt # test_list: dataset/ade20k/list/validation.txt + classes: 150 + prefix: ADE + save_dir: ./checkpoints/ + backbone: resnet50 + pretrain_path: ./data/resnet_deepbase.ckpt + ckpt: ./checkpoints/ade/ADE_1-100_2527.ckpt + +TRAIN: + arch: psp + feature_size: 60 + train_h: 473 + train_w: 473 + scale_min: 0.5 # minimum random scale + scale_max: 2.0 # maximum random scale + rotate_min: -10 # minimum random rotate + rotate_max: 10 # maximum random rotate + zoom_factor: 8 # zoom factor for final prediction during training, be in [1, 2, 4, 8] + ignore_label: 255 + aux_weight: 0.4 + data_name: ade + batch_size: 8 # batch size for training + batch_size_val: 8 # batch size for validation during training + base_lr: 0.005 + epochs: 100 + start_epoch: 0 + power: 0.9 + momentum: 0.9 + weight_decay: 0.0001 + + +TEST: + test_list: ./dataset/ade20k/list/validation.txt + split: val # split in [train, val and test] + base_size: 512 # based size for scaling + test_h: 473 + test_w: 473 + scales: [1.0] # evaluation scales, ms as [0.5, 0.75, 1.0, 1.25, 1.5, 1.75] + index_start: 0 # evaluation start index in list + index_step: 0 # evaluation step index in list, 0 means to end + result_path: ./result/ade/ + color_txt: ./src/config/ade20k/ade20k_colors.txt + name_txt: ./src/config/ade20k/ade20k_names.txt + diff --git a/research/cv/PSPNet/src/config/voc2012_pspnet50.yaml b/research/cv/PSPNet/src/config/voc2012_pspnet50.yaml new file mode 100644 index 0000000000000000000000000000000000000000..9ea4df720db1b0d9630be5ccb668cbc1ae5557d6 --- /dev/null +++ b/research/cv/PSPNet/src/config/voc2012_pspnet50.yaml @@ -0,0 +1,46 @@ +DATA: + data_root: /home/HEU_535/PSPNet/data/voc/voc/ + train_list: /home/HEU_535/PSPNet/data/voc/voc/train_list.txt + val_list: /home/HEU_535/PSPNet/data/voc/voc/val_list.txt + classes: 21 + prefix: voc + save_dir: /home/HEU_535/PSPNet/checkpoints/ + backbone: resnet50 + pretrain_path: 
/home/HEU_535/PSPNet/data/resnet_deepbase.ckpt + ckpt: /home/HEU_535/PSPNet/checkpoints/8P/voc-50_133.ckpt + +TRAIN: + arch: psp + feature_size: 60 + train_h: 473 + train_w: 473 + scale_min: 0.5 # minimum random scale + scale_max: 2.0 # maximum random scale + rotate_min: -10 # minimum random rotate + rotate_max: 10 # maximum random rotate + zoom_factor: 8 # zoom factor for final prediction during training, be in [1, 2, 4, 8] + ignore_label: 255 + aux_weight: 0.4 + data_name: + batch_size: 8 # batch size for training + batch_size_val: 8 # batch size for validation during training, memory and speed tradeoff + base_lr: 0.005 + epochs: 50 + start_epoch: 0 + power: 0.9 + momentum: 0.9 + weight_decay: 0.0001 + + +TEST: + test_list: /home/HEU_535/PSPNet/dataset/voc2012/list/val.txt + split: val # split in [train, val and test] + base_size: 512 # based size for scaling + test_h: 473 + test_w: 473 + scales: [1.0] # evaluation scales, ms as [0.5, 0.75, 1.0, 1.25, 1.5, 1.75] + index_start: 0 # evaluation start index in list + index_step: 0 # evaluation step index in list, 0 means to end + result_path: /home/HEU_535/PSPNet/result/voc/ + color_txt: /home/HEU_535/PSPNet/config/voc2012/voc2012_colors.txt + name_txt: /home/HEU_535/PSPNet/config/voc2012/voc2012_names.txt diff --git a/research/cv/PSPNet/src/dataset/pt_dataset.py b/research/cv/PSPNet/src/dataset/pt_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..0231b65272be5fd24ff87a4a806444521edd29f5 --- /dev/null +++ b/research/cv/PSPNet/src/dataset/pt_dataset.py @@ -0,0 +1,81 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================
+""" read the dataset file """
+import os
+import os.path
+import cv2
+import numpy as np
+
+IMG_EXTENSIONS = ['.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm']
+
+
+def is_image_file(filename):
+    """ check file """
+    filename_lower = filename.lower()
+    return any(filename_lower.endswith(extension) for extension in IMG_EXTENSIONS)
+
+
+def make_dataset(split='train', data_root=None, data_list=None):
+    """ get data list """
+    assert split in ['train', 'val', 'test']
+    if not os.path.isfile(data_list):
+        raise RuntimeError("Image list file does not exist: " + data_list + "\n")
+    image_label_list = []
+    with open(data_list) as f:
+        list_read = f.readlines()
+    print("Totally {} samples in {} set.".format(len(list_read), split))
+    print("Starting Checking image&label pair {} list...".format(split))
+    for line in list_read:
+        line = line.strip()
+        line_split = line.split(' ')
+        if split == 'test':
+            if len(line_split) != 1:
+                raise RuntimeError("Image list file read line error : " + line + "\n")
+            image_name = os.path.join(data_root, line_split[0])
+            label_name = image_name  # just a placeholder for label_name, not for use
+        else:
+            if len(line_split) != 2:
+                raise RuntimeError("Image list file read line error : " + line + "\n")
+            image_name = os.path.join(data_root, line_split[0])
+            label_name = os.path.join(data_root, line_split[1])
+        item = (image_name, label_name)
+        image_label_list.append(item)
+    print("Checking image&label pair {} list done!".format(split))
+    return image_label_list
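+
+
+# Each line of a train/val list file pairs a relative image path with its
+# label path, e.g. (hypothetical entry):
+#     JPEGImages/2007_000032.jpg SegmentationClass/2007_000032.png
+# A test list carries only the image path; the label is just a placeholder.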
+class SemData:
+    """ dataset class """
+    def __init__(self, split='train', data_root=None, data_list=None, transform=None, data_name=None):
+        self.split = split
+        self.data_list = make_dataset(split, data_root, data_list)  # (image_name, label_name)
+        self.transform = transform
+        self.data_name = data_name
+
+    def __len__(self):
+        return len(self.data_list)
+
+    def __getitem__(self, index):
+        image_path, label_path = self.data_list[index]
+        image = cv2.imread(image_path, cv2.IMREAD_COLOR)  # BGR 3-channel ndarray with shape H * W * 3
+        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # convert cv2 read image from BGR order to RGB order
+        image = np.float32(image)
+        label = cv2.imread(label_path, cv2.IMREAD_GRAYSCALE)  # GRAY 1-channel ndarray with shape H * W
+        if self.data_name is not None:
+            label -= 1
+        if image.shape[0] != label.shape[0] or image.shape[1] != label.shape[1]:
+            raise RuntimeError("Image & label shape mismatch: " + image_path + " " + label_path + "\n")
+        if self.transform is not None:
+            image, label = self.transform(image, label)
+        return image.astype(np.float32), label.astype(np.int32)
diff --git a/research/cv/PSPNet/src/dataset/pt_transform.py b/research/cv/PSPNet/src/dataset/pt_transform.py
new file mode 100644
index 0000000000000000000000000000000000000000..f6044dbca06283251cb4ab6cbb79af6eab3bf243
--- /dev/null
+++ b/research/cv/PSPNet/src/dataset/pt_transform.py
@@ -0,0 +1,282 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""
+Functions for input transform
+"""
+import random
+import math
+import numbers
+import collections
+import numpy as np
+import cv2
+
+
+class Compose:
+    """ compose the process functions """
+
+    def __init__(self, segtransform):
+        self.segtransform = segtransform
+
+    def __call__(self, image, label):
+        for t in self.segtransform:
+            image, label = t(image, label)
+        return image, label
+
+
+class Normalize:
+    """
+    Normalize tensor with mean and standard deviation along channel:
+
+        channel = (channel - mean) / std
+
+    """
+
+    def __init__(self, mean, std=None, is_train=True):
+        if std is None:
+            assert mean
+        else:
+            assert len(mean) == len(std)
+        self.mean = np.array(mean)
+        self.std = None if std is None else np.array(std)
+        self.is_train = is_train
+
+    def __call__(self, image, label):
+        if not isinstance(image, np.ndarray) or not isinstance(label, np.ndarray):
+            raise RuntimeError("segtransform.Normalize() only handles np.ndarray"
+                               "[eg: data read by cv2.imread()].\n")
+        if len(image.shape) > 3 or len(image.shape) < 2:
+            raise RuntimeError("segtransform.Normalize() only handles np.ndarray with 3 dims or 2 dims.\n")
+        if len(image.shape) == 2:
+            image = np.expand_dims(image, axis=2)
+        if not len(label.shape) == 2:
+            raise RuntimeError("segtransform.Normalize() only handles np.ndarray label with 2 dims.\n")
+        image = np.transpose(image, (2, 0, 1))  # (473, 473, 3) -> (3, 473, 473)
+
+        if self.is_train:
+            if self.std is None:
+                image = image - self.mean[:, None, None]
+            else:
+                image = (image - self.mean[:, None, None]) / self.std[:, None, None]
+        return image, label
+
+
+class Resize:
+    """Resize the input to the given size, 'size' is a 2-element tuple or list in the order of (h, w).
""" + + def __init__(self, size): + assert (isinstance(size, collections.Iterable) and len(size) == 2) + self.size = size + + def __call__(self, image, label): + image = cv2.resize(image, self.size[::-1], interpolation=cv2.INTER_LINEAR) + label = cv2.resize(label, self.size[::-1], interpolation=cv2.INTER_NEAREST) + return image, label + + +class RandScale: + """ Randomly resize image & label with scale factor in [scale_min, scale_max] """ + + def __init__(self, scale, aspect_ratio=None): + assert (isinstance(scale, collections.Iterable) and len(scale) == 2) + if isinstance(scale, collections.Iterable) and len(scale) == 2 \ + and isinstance(scale[0], numbers.Number) and isinstance(scale[1], numbers.Number) \ + and 0 < scale[0] < scale[1]: + self.scale = scale + else: + raise RuntimeError("segtransform.RandScale() scale param error.\n") + if aspect_ratio is None: + self.aspect_ratio = aspect_ratio + elif isinstance(aspect_ratio, collections.Iterable) and len(aspect_ratio) == 2 \ + and isinstance(aspect_ratio[0], numbers.Number) and isinstance(aspect_ratio[1], numbers.Number) \ + and 0 < aspect_ratio[0] < aspect_ratio[1]: + self.aspect_ratio = aspect_ratio + else: + raise RuntimeError("segtransform.RandScale() aspect_ratio param error.\n") + + def __call__(self, image, label): + temp_scale = self.scale[0] + (self.scale[1] - self.scale[0]) * random.random() + temp_aspect_ratio = 1.0 + if self.aspect_ratio is not None: + temp_aspect_ratio = self.aspect_ratio[0] + (self.aspect_ratio[1] - self.aspect_ratio[0]) * random.random() + temp_aspect_ratio = math.sqrt(temp_aspect_ratio) + scale_factor_x = temp_scale * temp_aspect_ratio + scale_factor_y = temp_scale / temp_aspect_ratio + image = cv2.resize(image, None, fx=scale_factor_x, fy=scale_factor_y, interpolation=cv2.INTER_LINEAR) + label = cv2.resize(label, None, fx=scale_factor_x, fy=scale_factor_y, interpolation=cv2.INTER_NEAREST) + return image, label + + +class Crop: + """Crops the given ndarray image (H*W*C or H*W). + Args: + size (sequence or int): Desired output size of the crop. If size is an + int instead of sequence like (h, w), a square crop (size, size) is made. 
+ """ + + def __init__(self, size, crop_type='center', padding=None, ignore_label=255): + # [473, 473], 'rand', padding=mean, ignore255 + if isinstance(size, int): + self.crop_h = size + self.crop_w = size + elif isinstance(size, collections.Iterable) and len(size) == 2 \ + and isinstance(size[0], int) and isinstance(size[1], int) \ + and size[0] > 0 and size[1] > 0: + self.crop_h = size[0] + self.crop_w = size[1] + else: + raise RuntimeError("crop size error.\n") + if crop_type in ('center', 'rand'): + self.crop_type = crop_type + else: + raise RuntimeError("crop type error: rand | center\n") + if padding is None: + self.padding = padding + elif isinstance(padding, list): + if all(isinstance(i, numbers.Number) for i in padding): + self.padding = padding + else: + raise RuntimeError("padding in Crop() should be a number list\n") + if len(padding) != 3: + raise RuntimeError("padding channel is not equal with 3\n") + else: + raise RuntimeError("padding in Crop() should be a number list\n") + if isinstance(ignore_label, int): + self.ignore_label = ignore_label + else: + raise RuntimeError("ignore_label should be an integer number\n") + + def __call__(self, image, label): + h, w = label.shape + pad_h = max(self.crop_h - h, 0) + pad_w = max(self.crop_w - w, 0) + pad_h_half = int(pad_h / 2) + pad_w_half = int(pad_w / 2) + if pad_h > 0 or pad_w > 0: + if self.padding is None: + raise RuntimeError("segtransform.Crop() need padding while padding argument is None\n") + image = cv2.copyMakeBorder(image, pad_h_half, pad_h - pad_h_half, pad_w_half, pad_w - pad_w_half, + cv2.BORDER_CONSTANT, value=self.padding) + label = cv2.copyMakeBorder(label, pad_h_half, pad_h - pad_h_half, pad_w_half, pad_w - pad_w_half, + cv2.BORDER_CONSTANT, value=self.ignore_label) + + h, w = label.shape + if self.crop_type == 'rand': + h_off = random.randint(0, h - self.crop_h) + w_off = random.randint(0, w - self.crop_w) + else: + h_off = int((h - self.crop_h) / 2) + w_off = int((w - self.crop_w) / 2) + image = image[h_off:h_off + self.crop_h, w_off:w_off + self.crop_w] + label = label[h_off:h_off + self.crop_h, w_off:w_off + self.crop_w] + return image, label + + +class RandRotate: + """ + Randomly rotate image & label with rotate factor in [rotate_min, rotate_max] + """ + + def __init__(self, rotate, padding, ignore_label=255, p=0.5): + assert (isinstance(rotate, collections.Iterable) and len(rotate) == 2) + if isinstance(rotate[0], numbers.Number) and isinstance(rotate[1], numbers.Number) and rotate[0] < rotate[1]: + self.rotate = rotate + else: + raise RuntimeError("segtransform.RandRotate() scale param error.\n") + assert padding is not None + assert isinstance(padding, list) and len(padding) == 3 + if all(isinstance(i, numbers.Number) for i in padding): + self.padding = padding + else: + raise RuntimeError("padding in RandRotate() should be a number list\n") + assert isinstance(ignore_label, int) + self.ignore_label = ignore_label + self.p = p + + def __call__(self, image, label): + + if random.random() < self.p: + angle = self.rotate[0] + (self.rotate[1] - self.rotate[0]) * random.random() + h, w = label.shape + matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1) + image = cv2.warpAffine(image, matrix, (w, h), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT, + borderValue=self.padding) + label = cv2.warpAffine(label, matrix, (w, h), flags=cv2.INTER_NEAREST, borderMode=cv2.BORDER_CONSTANT, + borderValue=self.ignore_label) + return image, label + + +class RandomHorizontalFlip: + """ Random Horizontal Flip 
""" + + def __init__(self, p=0.5): + self.p = p + + def __call__(self, image, label): + if random.random() < self.p: + image = cv2.flip(image, 1) + label = cv2.flip(label, 1) + return image, label + + +class RandomVerticalFlip: + """ Random Vertical Flip """ + + def __init__(self, p=0.5): + self.p = p + + def __call__(self, image, label): + if random.random() < self.p: + image = cv2.flip(image, 0) + label = cv2.flip(label, 0) + return image, label + + +class RandomGaussianBlur: + """ + RandomGaussianBlur + """ + + def __init__(self, radius=5): + self.radius = radius + + def __call__(self, image, label): + if random.random() < 0.5: + image = cv2.GaussianBlur(image, (self.radius, self.radius), 0) + return image, label + + +class RGB2BGR: + """ + Converts image from RGB order to BGR order + """ + + def __init__(self): + pass + + def __call__(self, image, label): + image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR) + return image, label + + +class BGR2RGB: + """ + Converts image from BGR order to RGB order + """ + def __init__(self): + pass + + def __call__(self, image, label): + image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) + return image, label diff --git a/research/cv/PSPNet/src/model/cell.py b/research/cv/PSPNet/src/model/cell.py new file mode 100644 index 0000000000000000000000000000000000000000..22187ac1e32bad9506cdccd2a06f507aea39aa7f --- /dev/null +++ b/research/cv/PSPNet/src/model/cell.py @@ -0,0 +1,35 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +""" PSPNet loss function """ +from mindspore import nn +from src.utils.metrics import SoftmaxCrossEntropyLoss + + +class Aux_CELoss_Cell(nn.Cell): + """ loss """ + def __init__(self, num_classes=21, ignore_label=255): + super(Aux_CELoss_Cell, self).__init__() + self.num_classes = num_classes + self.loss = SoftmaxCrossEntropyLoss(self.num_classes, ignore_label) + + def construct(self, net_out, target): + """ the process of calculate loss """ + if len(net_out) == 2: + predict_aux, predict = net_out + CE_loss = self.loss(predict, target) + CE_loss_aux = self.loss(predict_aux, target) + loss = CE_loss + (0.4 * CE_loss_aux) + return loss + return self.loss(net_out, target) diff --git a/research/cv/PSPNet/src/model/pspnet.py b/research/cv/PSPNet/src/model/pspnet.py new file mode 100644 index 0000000000000000000000000000000000000000..763a45b14f903511f24a7c1af9711e54739a737c --- /dev/null +++ b/research/cv/PSPNet/src/model/pspnet.py @@ -0,0 +1,237 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+""" PSPNet """
+from src.model.resnet import resnet50
+import mindspore
+import mindspore.nn as nn
+import mindspore.ops as ops
+from mindspore.train.serialization import load_param_into_net, load_checkpoint
+import mindspore.common.initializer as weight_init
+
+
+class ResNet(nn.Cell):
+    """ The pretrained ResNet """
+
+    def __init__(self, pretrained_path, pretrained=False, deep_base=False, BatchNorm_layer=nn.BatchNorm2d):
+        super(ResNet, self).__init__()
+        resnet = resnet50(deep_base=deep_base, BatchNorm_layer=BatchNorm_layer)
+        if pretrained:
+            params = load_checkpoint(pretrained_path)
+            load_param_into_net(resnet, params)
+        if deep_base:
+            self.layer1 = nn.SequentialCell(resnet.conv1, resnet.bn1, resnet.relu, resnet.conv2, resnet.bn2,
+                                            resnet.relu, resnet.conv3, resnet.bn3, resnet.relu, resnet.maxpool)
+        else:
+            self.layer1 = nn.SequentialCell(resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool)
+        self.layer2 = resnet.layer1
+        self.layer3 = resnet.layer2
+        self.layer4 = resnet.layer3
+        self.layer5 = resnet.layer4
+
+    def construct(self, x):
+        """ ResNet process """
+        x = self.layer1(x)
+        x = self.layer2(x)
+        x = self.layer3(x)
+        x_aux = self.layer4(x)
+        x = self.layer5(x_aux)
+
+        return x_aux, x
+
+
+class AdaPool1(nn.Cell):
+    """ 1x1 pooling """
+
+    def __init__(self):
+        super(AdaPool1, self).__init__()
+        self.reduceMean = ops.ReduceMean(keep_dims=True)
+
+    def construct(self, X):
+        """ 1x1 pooling process: global average over H and W """
+        pooled_1x1 = self.reduceMean(X, (-2, -1))
+        return pooled_1x1
+
+
+class AdaPool2(nn.Cell):
+    """ 2x2 pooling """
+
+    def __init__(self):
+        super(AdaPool2, self).__init__()
+        self.reduceMean = ops.ReduceMean()
+        self.reshape = ops.Reshape()
+
+    def construct(self, X):
+        """ 2x2 pooling process: split the 60x60 map into 2x2 blocks of 30x30 and average each block """
+        batch_size, channels, _, _ = X.shape
+        X = self.reshape(X, (batch_size, channels, 2, 30, 2, 30))
+        pooled_2x2_out = self.reduceMean(X, (3, 5))
+        return pooled_2x2_out
+
+
+class AdaPool3(nn.Cell):
+    """ 3x3 pooling """
+
+    def __init__(self):
+        super(AdaPool3, self).__init__()
+        self.reduceMean = ops.ReduceMean()
+        self.reshape = ops.Reshape()
+
+    def construct(self, X):
+        """ 3x3 pooling process: split the 60x60 map into 3x3 blocks of 20x20 and average each block """
+        batch_size, channels, _, _ = X.shape
+        X = self.reshape(X, (batch_size, channels, 3, 20, 3, 20))
+        pooled_3x3_out = self.reduceMean(X, (3, 5))
+        return pooled_3x3_out
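+
+
+# With the default feature_size of 60, the pyramid levels below shrink the
+# 60x60 backbone output to 1x1, 2x2, 3x3 and 6x6 bins respectively; the 6x6
+# level is realised with AvgPool2d(kernel_size=10, stride=10) since
+# 60 // 6 = 10. Each level is reduced by a 1x1 convolution and upsampled
+# back to 60x60 before concatenation.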
+class _PSPModule(nn.Cell):
+    """ PSP module """
+
+    def __init__(self, in_channels, pool_sizes, feature_shape, BatchNorm_layer=nn.BatchNorm2d):
+        super(_PSPModule, self).__init__()
+        out_channels = in_channels // len(pool_sizes)
+        self.BatchNorm_layer = BatchNorm_layer
+        self.stage1 = nn.SequentialCell(
+            AdaPool1(),
+            nn.Conv2d(in_channels, out_channels, kernel_size=1, has_bias=False),
+            self.BatchNorm_layer(out_channels),
+            nn.ReLU(),
+        )
+        self.stage2 = nn.SequentialCell(
+            AdaPool2(),  # pool to 2x2 bins before the 1x1 convolution
+            nn.Conv2d(in_channels, out_channels, kernel_size=1, has_bias=False),
+            self.BatchNorm_layer(out_channels),
+            nn.ReLU()
+        )
+        self.stage3 = nn.SequentialCell(
+            AdaPool3(),
+            nn.Conv2d(in_channels, out_channels, kernel_size=1, has_bias=False),
+            self.BatchNorm_layer(out_channels),
+            nn.ReLU(),
+        )
+        self.stage4 = nn.SequentialCell(
+            nn.AvgPool2d(kernel_size=10, stride=10),
+            nn.Conv2d(in_channels, out_channels, kernel_size=1, has_bias=False),
+            self.BatchNorm_layer(out_channels),
+            nn.ReLU()
+        )
+        self.cat = ops.Concat(axis=1)
+        self.feature_shape = feature_shape
+        self.resize_ops = ops.ResizeBilinear(
+            (self.feature_shape[0], self.feature_shape[1]), True
+        )
+        self.cast = ops.Cast()
+
+    def construct(self, x):
+        """ PSP module process """
+        x = self.cast(x, mindspore.float32)
+        s1_out = self.resize_ops(self.stage1(x))
+        s2_out = self.resize_ops(self.stage2(x))
+        s3_out = self.resize_ops(self.stage3(x))
+        s4_out = self.resize_ops(self.stage4(x))
+        out = (x, s1_out, s2_out, s3_out, s4_out)
+        out = self.cat(out)
+
+        return out
+
+
+class PSPNet(nn.Cell):
+    """ PSPNet """
+
+    def __init__(
+            self,
+            pool_sizes=None,
+            feature_size=60,
+            num_classes=21,
+            backbone="resnet50",
+            pretrained=True,
+            pretrained_path="",
+            aux_branch=False,
+            deep_base=False,
+            BatchNorm_layer=nn.BatchNorm2d
+    ):
+        """ Build PSPNet with a ResNet-50 backbone and a four-level pyramid pooling module. """
+        super(PSPNet, self).__init__()
+        if pool_sizes is None:
+            pool_sizes = [1, 2, 3, 6]
+        if backbone == "resnet50":
+            self.backbone = ResNet(
+                pretrained=pretrained,
+                pretrained_path=pretrained_path,
+                deep_base=deep_base,
+                BatchNorm_layer=BatchNorm_layer
+            )
+            aux_channel = 1024
+            out_channel = 2048
+        else:
+            raise ValueError(
+                "Unsupported backbone - `{}`, Use resnet50 .".format(backbone)
+            )
+        self.BatchNorm_layer = BatchNorm_layer
+        self.feature_shape = [feature_size, feature_size]
+        self.pool_sizes = [feature_size // pool_size for pool_size in pool_sizes]
+        self.ppm = _PSPModule(in_channels=out_channel, pool_sizes=self.pool_sizes, feature_shape=self.feature_shape)
+        self.cls = nn.SequentialCell(
+            nn.Conv2d(out_channel * 2, 512, kernel_size=3, padding=1, pad_mode="pad", has_bias=False),
+            self.BatchNorm_layer(512),
+            nn.ReLU(),
+            nn.Dropout(0.9),
+            nn.Conv2d(512, num_classes, kernel_size=1, has_bias=True)
+        )
+        self.aux_branch = aux_branch
+        if self.aux_branch:
+            self.auxiliary_branch = nn.SequentialCell(
+                nn.Conv2d(aux_channel, 256, kernel_size=3, padding=1, pad_mode="pad", has_bias=False),
+                self.BatchNorm_layer(256),
+                nn.ReLU(),
+                nn.Dropout(0.9),
+                nn.Conv2d(256, num_classes, kernel_size=1, has_bias=True)
+            )
+        self.resize = nn.ResizeBilinear()
+        self.shape = ops.Shape()
+        self.init_weights(self.cls)
+
+    def init_weights(self, *models):
+        """ init the model parameters """
+        for model in models:
+            for _, cell in model.cells_and_names():
+                if isinstance(cell, nn.Conv2d):
+                    cell.weight.set_data(
+                        weight_init.initializer(
+                            weight_init.HeNormal(), cell.weight.shape, cell.weight.dtype
+                        )
+                    )
+                if isinstance(cell, nn.Dense):
+                    cell.weight.set_data(
+                        weight_init.initializer(
+                            weight_init.TruncatedNormal(0.01),
+                            cell.weight.shape,
+                            cell.weight.dtype,
+                        )
+                    )
+                    cell.bias.set_data(
+                        weight_init.initializer(1e-4, cell.bias.shape, cell.bias.dtype)
+                    )
+
+    def construct(self, x):
+        """ PSPNet process """
+        x_shape = self.shape(x)
+        x_aux, x = self.backbone(x)
+        x = self.ppm(x)
+        out = self.cls(x)
+        out = self.resize(out, size=(x_shape[2:4]), align_corners=True)
+        if self.aux_branch:
+            out_aux = self.auxiliary_branch(x_aux)
+            output_aux = self.resize(out_aux, size=(x_shape[2:4]), align_corners=True)
+            return output_aux, out
+        return out
diff --git a/research/cv/PSPNet/src/model/resnet.py b/research/cv/PSPNet/src/model/resnet.py
new file mode 100644
index 0000000000000000000000000000000000000000..0f808dce297083fe05a31d284e22331ccd27b07e
--- /dev/null
+++ 
b/research/cv/PSPNet/src/model/resnet.py @@ -0,0 +1,185 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +""" THE Pretrained model ResNet """ +import mindspore.nn as nn + + +def conv3x3(in_channels, out_channels, stride=1, dilation=1): + """ 3x3 convolution """ + return nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, dilation=dilation, pad_mode="pad", + padding=1, has_bias=False) + + +class BasicBlock(nn.Cell): + """ basic Block for resnet """ + expansion = 1 + + def __init__(self, inplanes, planes, stride=1, down_sample_layer=None, BatchNorm_layer=nn.BatchNorm2d): + super(BasicBlock, self).__init__() + self.BatchNorm_layer = BatchNorm_layer + self.conv1 = conv3x3(inplanes, planes, stride) + self.bn1 = self.BatchNorm_layer(planes) + self.relu = nn.ReLU() + self.conv2 = conv3x3(planes, planes) + self.bn2 = self.BatchNorm_layer(planes) + self.down_sample_layer = down_sample_layer + self.stride = stride + + def construct(self, x): + """ process """ + residual = x + + out = self.conv1(x) + out = self.bn1(out) + out = self.relu(out) + + out = self.conv2(out) + out = self.bn2(out) + + if self.down_sample_layer is not None: + residual = self.down_sample_layer(x) + + out += residual + out = self.relu(out) + + return out + + +class Bottleneck(nn.Cell): + """ bottleneck for ResNet """ + expansion = 4 + + def __init__(self, inplanes, planes, stride=1, down_sample_layer=None, PSP=0, BatchNorm_layer=nn.BatchNorm2d): + super(Bottleneck, self).__init__() + self.BatchNorm_layer = BatchNorm_layer + self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, has_bias=False) + self.bn1 = self.BatchNorm_layer(planes) + if PSP == 1: + self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, pad_mode="pad", padding=2, has_bias=False, + dilation=2) + elif PSP == 2: + self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, pad_mode="pad", padding=4, has_bias=False, + dilation=4) + + else: + self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, + padding=1, has_bias=False, pad_mode="pad") + + self.bn2 = self.BatchNorm_layer(planes) + self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1, has_bias=False) + self.bn3 = self.BatchNorm_layer(planes * self.expansion) + self.relu = nn.ReLU() + self.down_sample_layer = down_sample_layer + self.stride = stride + + def construct(self, x): + """ process """ + residual = x + + out = self.conv1(x) + out = self.bn1(out) + out = self.relu(out) + + out = self.conv2(out) + out = self.bn2(out) + out = self.relu(out) + + out = self.conv3(out) + out = self.bn3(out) + + if self.down_sample_layer is not None: + residual = self.down_sample_layer(x) + + out += residual + out = self.relu(out) + + return out + + +class ResNet(nn.Cell): + """ ResNet """ + + def __init__(self, block, layers, deep_base=False, BatchNorm_layer=nn.BatchNorm2d): + super(ResNet, self).__init__() + self.deep_base = deep_base + 
+        self.BatchNorm_layer = BatchNorm_layer
+        if not self.deep_base:
+            self.inplanes = 64
+            self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, has_bias=False, pad_mode="pad")
+            self.bn1 = self.BatchNorm_layer(64)
+        else:
+            # deep_base replaces the 7x7 stem with three stacked 3x3 convolutions
+            self.inplanes = 128
+            self.conv1 = conv3x3(3, 64, stride=2)
+            self.bn1 = self.BatchNorm_layer(64)
+            self.conv2 = conv3x3(64, 64)
+            self.bn2 = self.BatchNorm_layer(64)
+            self.conv3 = conv3x3(64, 128)
+            self.bn3 = self.BatchNorm_layer(128)
+        self.relu = nn.ReLU()
+        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="same")
+        self.layer1 = self._make_layer(block, 64, layers[0], PSP=0)
+        self.layer2 = self._make_layer(block, 128, layers[1], stride=2, PSP=0)
+        self.layer3 = self._make_layer(block, 256, layers[2], stride=2, PSP=1)
+        self.layer4 = self._make_layer(block, 512, layers[3], stride=2, PSP=2)
+        self.avgpool = nn.AvgPool2d(7, stride=1)
+
+    def _make_layer(self, block, planes, blocks, PSP, stride=1):
+        """ make ResNet layer """
+        down_sample_layer = None
+        if stride != 1 or self.inplanes != planes * block.expansion:
+            if PSP == 0:
+                down_sample_layer = nn.SequentialCell(
+                    nn.Conv2d(self.inplanes, planes * block.expansion,
+                              kernel_size=1, stride=stride, has_bias=False),
+                    self.BatchNorm_layer(planes * block.expansion),
+                )
+            else:
+                # dilated stages keep stride 1 in the shortcut as well
+                down_sample_layer = nn.SequentialCell(
+                    nn.Conv2d(self.inplanes, planes * block.expansion,
+                              kernel_size=1, stride=1, has_bias=False),
+                    self.BatchNorm_layer(planes * block.expansion),
+                )
+
+        layers = [block(self.inplanes, planes, stride, down_sample_layer, PSP=PSP)]
+        self.inplanes = planes * block.expansion
+        for _ in range(1, blocks):
+            layers.append(block(self.inplanes, planes, PSP=PSP))
+
+        return nn.SequentialCell(*layers)
+
+    def construct(self, x):
+        """ ResNet process """
+        x = self.relu(self.bn1(self.conv1(x)))
+        if self.deep_base:
+            x = self.relu(self.bn2(self.conv2(x)))
+            x = self.relu(self.bn3(self.conv3(x)))
+        x = self.maxpool(x)
+
+        x = self.layer1(x)
+        x = self.layer2(x)
+        x = self.layer3(x)
+        x = self.layer4(x)
+
+        return x
+
+
+def resnet50(**kwargs):
+    """Constructs a ResNet-50 model.
+
+    Args:
+        **kwargs: forwarded to ``ResNet`` (e.g. ``deep_base``, ``BatchNorm_layer``).
+    """
+    model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)
+    return model
diff --git a/research/cv/PSPNet/src/utils/aux_loss.py b/research/cv/PSPNet/src/utils/aux_loss.py
new file mode 100644
index 0000000000000000000000000000000000000000..4ad782d1f01a8c8510e0337fbdd290bd5e5c04e9
--- /dev/null
+++ b/research/cv/PSPNet/src/utils/aux_loss.py
@@ -0,0 +1,56 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""
+loss function helper
+"""
+import mindspore.nn as nn
+from mindspore import Tensor
+from mindspore.ops import operations as P
+from mindspore import dtype as mstype
+
+
+class SoftmaxCrossEntropyLoss(nn.Cell):
+    """ Softmax cross-entropy loss that masks out ``ignore_label`` pixels """
+    def __init__(self, num_cls=21, ignore_label=255):
+        super(SoftmaxCrossEntropyLoss, self).__init__()
+        self.one_hot = P.OneHot(axis=-1)
+        self.on_value = Tensor(1.0, mstype.float32)
+        self.off_value = Tensor(0.0, mstype.float32)
+        self.cast = P.Cast()
+        self.ce = nn.SoftmaxCrossEntropyWithLogits()
+        self.not_equal = P.NotEqual()
+        self.num_cls = num_cls
+        self.ignore_label = ignore_label
+        self.mul = P.Mul()
+        self.sum = P.ReduceSum(False)
+        self.div = P.RealDiv()
+        self.transpose = P.Transpose()
+        self.reshape = P.Reshape()
+
+    def construct(self, logits, labels):
+        """ calculation process of the loss """
+        labels_int = self.cast(labels, mstype.int32)
+        labels_int = self.reshape(labels_int, (-1,))
+        logits_ = self.transpose(logits, (0, 2, 3, 1))
+        logits_ = self.reshape(logits_, (-1, self.num_cls))
+        # pixels labeled `ignore_label` get weight 0 and drop out of the mean
+        weights = self.not_equal(labels_int, self.ignore_label)
+        weights = self.cast(weights, mstype.float32)
+        one_hot_labels = self.one_hot(
+            labels_int, self.num_cls, self.on_value, self.off_value
+        )
+        loss = self.ce(logits_, one_hot_labels)
+        loss = self.mul(weights, loss)
+        loss = self.div(self.sum(loss), self.sum(weights))
+        return loss
diff --git a/research/cv/PSPNet/src/utils/functions_args.py b/research/cv/PSPNet/src/utils/functions_args.py
new file mode 100644
index 0000000000000000000000000000000000000000..a82460ae18aee32d0e05855803e6d31ef81d623c
--- /dev/null
+++ b/research/cv/PSPNet/src/utils/functions_args.py
@@ -0,0 +1,178 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""
+Functions for parsing args
+"""
+import os
+from ast import literal_eval
+import copy
+import yaml
+
+
+class CfgNode(dict):
+    """
+    CfgNode represents an internal node in the configuration tree. It's a simple
+    dict-like container that allows for attribute-based access to keys.
+ """ + + def __init__(self, init_dict=None, key_list=None, new_allowed=False): + # Recursively convert nested dictionaries in init_dict into CfgNodes + init_dict = {} if init_dict is None else init_dict + key_list = [] if key_list is None else key_list + for k, v in init_dict.items(): + if isinstance(v, dict): + # Convert dict to CfgNode + init_dict[k] = CfgNode(v, key_list=key_list + [k]) + super(CfgNode, self).__init__(init_dict) + self.new_allowed = new_allowed + + def __getattr__(self, name): + if name in self: + return self[name] + raise AttributeError(name) + + def __setattr__(self, name, value): + self[name] = value + + def __str__(self): + def _indent(s_, num_spaces): + s__ = s_.split("\n") + if len(s__) == 1: + return s_ + first = s__.pop(0) + s__ = [(num_spaces * " ") + line for line in s__] + s__ = "\n".join(s__) + s__ = first + "\n" + s__ + return s__ + + r = "" + s = [] + for k, v in sorted(self.items()): + separator = "\n" if isinstance(v, CfgNode) else " " + attr_str = "{}:{}{}".format(str(k), separator, str(v)) + attr_str = _indent(attr_str, 2) + s.append(attr_str) + r += "\n".join(s) + return r + + def __repr__(self): + return "{}({})".format(self.__class__.__name__, super(CfgNode, self).__repr__()) + + +def load_cfg_from_cfg_file(file_): + """ load file """ + cfg = {} + assert os.path.isfile(file_) and file_.endswith('.yaml'), \ + '{} is not a yaml file'.format(file_) + + with open(file_, 'r') as f: + cfg_from_file = yaml.safe_load(f) + + for key in cfg_from_file: + for k, v in cfg_from_file[key].items(): + cfg[k] = v + + cfg = CfgNode(cfg) + return cfg + + +def merge_cfg_from_list(cfg, cfg_list): + """ aux function """ + new_cfg = copy.deepcopy(cfg) + assert len(cfg_list) % 2 == 0 + for full_key, v in zip(cfg_list[0::2], cfg_list[1::2]): + subkey = full_key.split('.')[-1] + assert subkey in cfg, 'Non-existent key: {}'.format(full_key) + value = _decode_cfg_value(v) + value = _check_and_coerce_cfg_value_type( + value, cfg[subkey], subkey, full_key + ) + setattr(new_cfg, subkey, value) + + return new_cfg + + +def _decode_cfg_value(v): + """Decodes a raw config value (e.g., from a yaml config files or command + line argument) into a Python object. + """ + # All remaining processing is only applied to strings + if not isinstance(v, str): + return v + # Try to interpret `v` as a: + # string, number, tuple, list, dict, boolean, or None + try: + v = literal_eval(v) + # The following two excepts allow v to pass through when it represents a + # string. + # + # Longer explanation: + # The type of v is always a string (before calling literal_eval), but + # sometimes it *represents* a string and other times a data structure, like + # a list. In the case that v represents a string, what we got back from the + # yaml parser is 'foo' *without quotes* (so, not '"foo"'). literal_eval is + # ok with '"foo"', but will raise a ValueError if given 'foo'. In other + # cases, like paths (v = 'foo/bar' and not v = '"foo/bar"'), literal_eval + # will raise a SyntaxError. + except ValueError: + pass + except SyntaxError: + pass + return v + + +def _check_and_coerce_cfg_value_type(replacement, original, key, full_key): + """Checks that `replacement`, which is intended to replace `original` is of + the right type. The type is correct if it matches exactly or is one of a few + cases in which the type can be easily coerced. 
+ """ + if key: + pass + + original_type = type(original) + replacement_type = type(replacement) + + # The types must match (with some exceptions) + if replacement_type == original_type: + return replacement + + # Cast replacement from from_type to to_type if the replacement and original + # types match from_type and to_type + def conditional_cast(from_type_, to_type_): + """ helper """ + if replacement_type == from_type_ and original_type == to_type_: + return True, to_type_(replacement) + return False, None + + # Conditionally casts + # list <-> tuple + casts = [(tuple, list), (list, tuple)] + # For py2: allow converting from str (bytes) to a unicode string + try: + casts.append((str, unicode)) # noqa: F821 + except RuntimeError: + pass + + for (from_type, to_type) in casts: + converted, converted_value = conditional_cast(from_type, to_type) + if converted: + return converted_value + + raise ValueError( + "Type mismatch ({} vs. {}) with values ({} vs. {}) for config " + "key: {}".format( + original_type, replacement_type, original, replacement, full_key + ) + ) diff --git a/research/cv/PSPNet/src/utils/lr.py b/research/cv/PSPNet/src/utils/lr.py new file mode 100644 index 0000000000000000000000000000000000000000..fd6b3bf1fbf1e16c4a3f83991a3a52b7327feca6 --- /dev/null +++ b/research/cv/PSPNet/src/utils/lr.py @@ -0,0 +1,42 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +"""learning rates""" +import numpy as np + + +def cosine_lr(base_lr, decay_steps, total_steps): + """ Learning rate strategy """ + for i in range(total_steps): + step_ = min(i, decay_steps) + yield base_lr * 0.5 * (1 + np.cos(np.pi * step_ / decay_steps)) + + +def poly_lr(base_lr, decay_steps, total_steps, end_lr=0.0001, power=0.9): + """ Learning rate strategy """ + res = [] + for i in range(total_steps): + step_ = min(i, decay_steps) + res.append((base_lr - end_lr) * ((1.0 - step_ / decay_steps) ** power) + end_lr) + return res + + +def exponential_lr(base_lr, decay_steps, decay_rate, total_steps, staircase=False): + """ Learning rate strategy """ + for i in range(total_steps): + if staircase: + power_ = i // decay_steps + else: + power_ = float(i) / decay_steps + yield base_lr * (decay_rate ** power_) diff --git a/research/cv/PSPNet/src/utils/metric_and_evalcallback.py b/research/cv/PSPNet/src/utils/metric_and_evalcallback.py new file mode 100644 index 0000000000000000000000000000000000000000..f8bd3f42dd8bfdd42eee15ba6b2c8705c21c415c --- /dev/null +++ b/research/cv/PSPNet/src/utils/metric_and_evalcallback.py @@ -0,0 +1,45 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+""" eval_callback """
+from mindspore.nn.metrics.metric import Metric
+from src.utils.aux_loss import SoftmaxCrossEntropyLoss
+
+
+class pspnet_metric(Metric):
+    """ metric that reports the mean validation loss for the eval callback """
+    def __init__(self, num_classes=150, ignore_label=255):
+        super(pspnet_metric, self).__init__()
+        self.loss = SoftmaxCrossEntropyLoss(num_classes, ignore_label)
+        self.val_loss = 0
+        self.count = 0
+        self.clear()
+
+    def clear(self):
+        """ reset the internal state """
+        self.val_loss = 0
+        self.count = 0
+
+    def update(self, *inputs):
+        """ accumulate the loss of one validation batch """
+        if len(inputs) != 2:
+            raise ValueError('Expect inputs (y_pred, y), but got {}'.format(len(inputs)))
+        # the network returns (aux_out, out); only the main output is scored
+        _, predict = inputs[0]
+        the_loss = self.loss(predict, inputs[1])
+        self.val_loss += the_loss
+        self.count += 1
+
+    def eval(self):
+        """ return the mean validation loss """
+        return self.val_loss / float(self.count)
diff --git a/research/cv/PSPNet/src/utils/p_util.py b/research/cv/PSPNet/src/utils/p_util.py
new file mode 100644
index 0000000000000000000000000000000000000000..eb66b5d388f7d0cc4384de5719800008b93f716f
--- /dev/null
+++ b/research/cv/PSPNet/src/utils/p_util.py
@@ -0,0 +1,71 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+""" utils """
+import os
+import numpy as np
+from PIL import Image
+
+
+class AverageMeter:
+    """Computes and stores the average and current value"""
+
+    def __init__(self):
+        self.count = 0
+        self.sum = 0
+        self.avg = 0
+        self.val = 0
+
+    def update(self, val, n=1):
+        """ accumulate a new value (weighted by n) and refresh the average """
+        self.val = val
+        self.sum += val * n
+        self.count += n
+        self.avg = self.sum / self.count
+
+
+def intersectionAndUnion(output, target, K, ignore_index=255):
+    """
+    'K' classes, output and target sizes are N or N * L or N * H * W, each value in range 0 to K - 1.
+ """ + assert (output.ndim in [1, 2, 3]) + assert output.shape == target.shape + output = output.reshape(output.size).copy() + target = target.reshape(target.size) + output[np.where(target == ignore_index)[0]] = ignore_index + intersection = output[np.where(output == target)[0]] + area_intersection, _ = np.histogram(intersection, bins=np.arange(K + 1)) + area_output, _ = np.histogram(output, bins=np.arange(K + 1)) + area_target, _ = np.histogram(target, bins=np.arange(K + 1)) + area_union = area_output + area_target - area_intersection + return area_intersection, area_union, area_target + + +def check_mkdir(dir_name): + """ check file dir """ + if not os.path.exists(dir_name): + os.mkdir(dir_name) + + +def check_makedirs(dir_name): + """ check file dir """ + if not os.path.exists(dir_name): + os.makedirs(dir_name) + + +def colorize(gray, palette): + """ gray: numpy array of the label and 1*3N size list palette """ + color = Image.fromarray(gray.astype(np.uint8)).convert('P') + color.putpalette(palette) + return color diff --git a/research/cv/PSPNet/train.py b/research/cv/PSPNet/train.py new file mode 100644 index 0000000000000000000000000000000000000000..d23a0f3e64078d7df8c546b381388edb1a91c63a --- /dev/null +++ b/research/cv/PSPNet/train.py @@ -0,0 +1,256 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================
+""" train PSPNet and get checkpoint files """
+import os
+import ast
+import argparse
+import src.utils.functions_args as fa
+from src.model import pspnet
+from src.model.cell import Aux_CELoss_Cell
+from src.dataset import pt_dataset
+from src.dataset import pt_transform as transform
+from src.utils.lr import poly_lr
+from src.utils.metric_and_evalcallback import pspnet_metric
+import mindspore
+from mindspore import nn
+from mindspore import context
+from mindspore import Tensor
+from mindspore.common import set_seed
+from mindspore.train.model import Model
+from mindspore.communication import init
+from mindspore.context import ParallelMode
+from mindspore.train.callback import Callback
+from mindspore.train.callback import LossMonitor, TimeMonitor
+from mindspore.train.callback import ModelCheckpoint, CheckpointConfig
+from mindspore.train.loss_scale_manager import FixedLossScaleManager
+import mindspore.dataset as ds
+
+set_seed(1234)
+rank_id = int(os.getenv('RANK_ID'))
+device_id = int(os.getenv('DEVICE_ID'))
+device_num = int(os.getenv('RANK_SIZE'))
+context.set_context(mode=context.GRAPH_MODE, device_target='Ascend')
+Model_Art = False
+
+
+def get_parser():
+    """
+    Read parameter file
+    -> for ADE20K: ./src/config/ade20k_pspnet50.yaml
+    -> for VOC2012: ./src/config/voc2012_pspnet50.yaml
+    """
+    global Model_Art
+    parser = argparse.ArgumentParser(description='MindSpore Semantic Segmentation')
+    parser.add_argument('--config', type=str, required=True,
+                        help='config file')
+    parser.add_argument('--model_art', type=ast.literal_eval, default=False,
+                        help='train on ModelArts or not, default: False')
+    parser.add_argument('--data_url', type=str, default='',
+                        help='Location of data.')
+    parser.add_argument('--train_url', type=str, default='',
+                        help='Location of training outputs.')
+    parser.add_argument('--dataset_name', type=str, default='',
+                        help='aux parameter for ModelArts')
+    parser.add_argument('opts', help='see ./src/config/voc2012_pspnet50.yaml for all options', default=None,
+                        nargs=argparse.REMAINDER)
+    args_ = parser.parse_args()
+    if args_.model_art:
+        import moxing as mox
+        mox.file.shift('os', 'mox')
+        Model_Art = True
+        root = "/cache/"
+        local_data_path = os.path.join(root, 'data')
+        if args_.dataset_name == 'ade':
+            obs_data_path = "obs://harbin-engineering-uni/PSPnet/data"
+        else:
+            obs_data_path = "obs://harbin-engineering-uni/PSPnet/voc"
+        print("########### Downloading data from OBS #############")
+        mox.file.copy_parallel(src_url=obs_data_path, dst_url=local_data_path)
+        print('########### data downloading is completed ############')
+    assert args_.config is not None
+    cfg = fa.load_cfg_from_cfg_file(args_.config)
+    if args_.opts is not None:
+        cfg = fa.merge_cfg_from_list(cfg, args_.opts)
+    return cfg
+
+
+class EvalCallBack(Callback):
+    """Track validation loss during training via a callback."""
+
+    def __init__(self, models, eval_dataset, eval_per_epochs, epochs_per_eval):
+        super(EvalCallBack, self).__init__()
+        self.models = models
+        self.eval_dataset = eval_dataset
+        self.eval_per_epochs = eval_per_epochs
+        self.epochs_per_eval = epochs_per_eval
+
+    def epoch_end(self, run_context):
+        """ evaluate during training """
+        cb_param = run_context.original_args()
+        cur_epoch = cb_param.cur_epoch_num
+        if cur_epoch % self.eval_per_epochs == 0:
+            val_loss = self.models.eval(self.eval_dataset, dataset_sink_mode=False)
+            self.epochs_per_eval["epoch"].append(cur_epoch)
+            self.epochs_per_eval["val_loss"].append(val_loss)
+            print("epoch {}: {}".format(cur_epoch, val_loss))
+
+    def get_dict(self):
+        """ get eval dict """
+        return self.epochs_per_eval
+
+
+def create_dataset(purpose, data_root, data_list, batch_size=8):
+    """ get dataset """
+    value_scale = 255
+    mean = [0.485, 0.456, 0.406]
+    mean = [item * value_scale for item in mean]
+    std = [0.229, 0.224, 0.225]
+    std = [item * value_scale for item in std]
+    if purpose == 'train':
+        cur_transform = transform.Compose([
+            transform.RandScale([0.5, 2.0]),
+            transform.RandRotate([-10, 10], padding=mean, ignore_label=255),
+            transform.RandomGaussianBlur(),
+            transform.RandomHorizontalFlip(),
+            transform.Crop([473, 473], crop_type='rand', padding=mean, ignore_label=255),
+            transform.Normalize(mean=mean, std=std, is_train=True)])
+        data = pt_dataset.SemData(
+            split=purpose, data_root=data_root,
+            data_list=data_list,
+            transform=cur_transform,
+            data_name=args.data_name
+        )
+        dataset = ds.GeneratorDataset(data, column_names=["data", "label"],
+                                      shuffle=True, num_shards=device_num, shard_id=rank_id)
+        dataset = dataset.batch(batch_size, drop_remainder=False)
+    else:
+        cur_transform = transform.Compose([
+            transform.Crop([473, 473], crop_type='center', padding=mean, ignore_label=255),
+            transform.Normalize(mean=mean, std=std, is_train=True)])
+        data = pt_dataset.SemData(
+            split=purpose, data_root=data_root,
+            data_list=data_list,
+            transform=cur_transform,
+            data_name=args.data_name
+        )
+        dataset = ds.GeneratorDataset(data, column_names=["data", "label"],
+                                      shuffle=False, num_shards=device_num, shard_id=rank_id)
+        dataset = dataset.batch(batch_size, drop_remainder=False)
+    return dataset
+
+
+def psp_train():
+    """ Train process """
+    if Model_Art:
+        pre_path = args.art_pretrain_path
+        data_path = args.art_data_root
+        train_list_path = args.art_train_list
+        val_list_path = args.art_val_list
+    else:
+        pre_path = args.pretrain_path
+        data_path = args.data_root
+        train_list_path = args.train_list
+        val_list_path = args.val_list
+    if device_num > 1:
+        context.set_auto_parallel_context(device_num=device_num, parallel_mode=ParallelMode.DATA_PARALLEL,
+                                          parameter_broadcast=True, gradients_mean=True)
+        init()
+
+        PSPNet = pspnet.PSPNet(
+            feature_size=args.feature_size, num_classes=args.classes, backbone=args.backbone, pretrained=True,
+            pretrained_path=pre_path, aux_branch=True, deep_base=True,
+            BatchNorm_layer=nn.SyncBatchNorm
+        )
+        train_dataset = create_dataset('train', data_path, train_list_path)
+        validation_dataset = create_dataset('val', data_path, val_list_path)
+    else:
+        PSPNet = pspnet.PSPNet(
+            feature_size=args.feature_size, num_classes=args.classes, backbone=args.backbone, pretrained=True,
+            pretrained_path=pre_path, aux_branch=True, deep_base=True
+        )
+        train_dataset = create_dataset('train', data_path, train_list_path)
+        validation_dataset = create_dataset('val', data_path, val_list_path)
+
+    # loss
+    train_net_loss = Aux_CELoss_Cell(args.classes, ignore_label=255)
+
+    steps_per_epoch = train_dataset.get_dataset_size()  # number of batches in an epoch
+    total_train_steps = steps_per_epoch * args.epochs
+
+    if device_num > 1:
+        lr_iter = poly_lr(args.art_base_lr, total_train_steps, total_train_steps, end_lr=0.0, power=0.9)
+        lr_iter_ten = poly_lr(args.art_base_lr, total_train_steps, total_train_steps, end_lr=0.0, power=0.9)
+    else:
+        lr_iter = poly_lr(args.base_lr, total_train_steps, total_train_steps, end_lr=0.0, power=0.9)
+        lr_iter_ten = poly_lr(args.base_lr, total_train_steps, total_train_steps, end_lr=0.0, power=0.9)
+
+    # two parameter groups: the pretrained backbone and the freshly
+    # initialized head each get their own learning-rate schedule
+    pretrain_params = list(filter(lambda x: 'backbone' in x.name, PSPNet.trainable_params()))
+    cls_params = list(filter(lambda x: 'backbone' not in x.name, PSPNet.trainable_params()))
+    group_params = [{'params': pretrain_params, 'lr': Tensor(lr_iter, mindspore.float32)},
+                    {'params': cls_params, 'lr': Tensor(lr_iter_ten, mindspore.float32)}]
+    opt = nn.SGD(
+        params=group_params,
+        momentum=0.9,
+        weight_decay=0.0001,
+        loss_scale=1024,
+    )
+    # loss scale
+    manager_loss_scale = FixedLossScaleManager(1024, False)
+
+    m_metric = {'val_loss': pspnet_metric(args.classes, 255)}
+
+    model = Model(
+        PSPNet, train_net_loss, optimizer=opt, loss_scale_manager=manager_loss_scale, metrics=m_metric
+    )
+
+    time_cb = TimeMonitor(data_size=steps_per_epoch)
+    loss_cb = LossMonitor()
+    epoch_per_eval = {"epoch": [], "val_loss": []}
+    eval_cb = EvalCallBack(model, validation_dataset, 10, epoch_per_eval)
+    config_ck = CheckpointConfig(
+        save_checkpoint_steps=10 * steps_per_epoch,
+        keep_checkpoint_max=12,
+    )
+
+    if Model_Art:
+        ckpoint_cb = ModelCheckpoint(
+            prefix=args.prefix, directory='/cache/save/' + str(device_id), config=config_ck
+        )
+    else:
+        ckpoint_cb = ModelCheckpoint(
+            prefix=args.prefix, directory=args.save_dir, config=config_ck
+        )
+    model.train(
+        args.epochs, train_dataset, callbacks=[loss_cb, time_cb, ckpoint_cb, eval_cb], dataset_sink_mode=True,
+    )
+    dict_eval = eval_cb.get_dict()
+    val_num_list = dict_eval["epoch"]
+    val_value = dict_eval["val_loss"]
+    for i in range(len(val_num_list)):
+        print(val_num_list[i], " : ", val_value[i])
+
+    if Model_Art:
+        print("######### upload to OBS #########")
+        import moxing as mox
+        mox.file.shift('os', 'mox')
+        mox.file.copy_parallel(src_url="/cache/save", dst_url=args.obs_save)
+
+
+if __name__ == "__main__":
+    args = get_parser()
+    psp_train()
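+
+# Usage sketch (illustrative, outside the training flow): how the polynomial
+# schedule used above decays, assuming init_lr 0.005 from the VOC2012 config.
+# `total_steps` here is hypothetical; psp_train() derives it from the actual
+# dataset size at runtime.
+#
+#     from src.utils.lr import poly_lr
+#
+#     total_steps = 100 * 1000  # epochs * steps_per_epoch (illustrative)
+#     sched = poly_lr(0.005, total_steps, total_steps, end_lr=0.0, power=0.9)
+#     print(sched[0])   # 0.005 at step 0
+#     print(sched[-1])  # approaches 0.0 at the final step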