diff --git a/research/cv/RefineNet/README.md b/research/cv/RefineNet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4a83b074ec735e286baa20a0b7ef6956c359c44c
--- /dev/null
+++ b/research/cv/RefineNet/README.md
@@ -0,0 +1,378 @@
+# Contents
+
+<!-- TOC -->
+
+- [Contents](#contents)
+- [RefineNet Description](#refinenet-description)
+    - [Overview](#overview)
+- [Model Architecture](#model-architecture)
+- [Dataset](#dataset)
+- [Features](#features)
+    - [Mixed Precision](#mixed-precision)
+- [Environment Requirements](#environment-requirements)
+- [Quick Start](#quick-start)
+- [Script Description](#script-description)
+    - [Script and Sample Code](#script-and-sample-code)
+    - [Script Parameters](#script-parameters)
+    - [Training Process](#training-process)
+        - [Usage](#usage)
+            - [Running on Ascend](#running-on-ascend)
+        - [Results](#results)
+    - [Evaluation Process](#evaluation-process)
+        - [Usage](#usage-1)
+            - [Running on Ascend](#running-on-ascend-1)
+        - [Results](#results-1)
+            - [Training Accuracy](#training-accuracy)
+- [Model Description](#model-description)
+    - [Performance](#performance)
+        - [Evaluation Performance](#evaluation-performance)
+- [Description of Random Situation](#description-of-random-situation)
+- [ModelZoo Homepage](#modelzoo-homepage)
+
+<!-- /TOC -->
+
+# RefineNet Description
+
+## Overview
+
+RefineNet is a generic multi-path refinement network that explicitly exploits all the information available along the down-sampling process and uses long-range residual connections to enable high-resolution prediction. In this way, the deeper layers that capture high-level semantic features can be refined directly with the fine-grained features from earlier convolutions. All components of RefineNet employ residual connections following the identity-mapping idea, which allows for effective end-to-end training.
+
+For details about the network, see the [paper][1]:
+`Guosheng Lin, Anton Milan, et al. RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation. arXiv:1611.06612v3 [cs.CV] 25 Nov 2016`
+
+[1]: https://arxiv.org/abs/1611.06612v3
+
+# Model Architecture
+
+With ResNet-101 as the backbone, RefineNet exploits convolutional features from multiple stages at different levels and fuses them to obtain a high-resolution prediction. See the [paper][2] for details.
+
+[2]: https://arxiv.org/pdf/1611.06612v3.pdf
+
+# Dataset
+
+Pascal VOC dataset and the Semantic Boundaries Dataset (SBD)
+
+- Download the segmentation datasets.
+
+- Prepare a training data list file. The list file saves the relative paths of image/annotation pairs, for example:
+
+    ```text
+    VOCdevkit/VOC2012/JPEGImages/2007_000032.jpg VOCdevkit/VOC2012/SegmentationClassGray/2007_000032.png
+    VOCdevkit/VOC2012/JPEGImages/2007_000039.jpg VOCdevkit/VOC2012/SegmentationClassGray/2007_000039.png
+    VOCdevkit/VOC2012/JPEGImages/2007_000063.jpg VOCdevkit/VOC2012/SegmentationClassGray/2007_000063.png
+    VOCdevkit/VOC2012/JPEGImages/2007_000068.jpg VOCdevkit/VOC2012/SegmentationClassGray/2007_000068.png
+    ......
+    ```
+
+You can also generate the data list files automatically by running the script: `python get_dataset_lst.py --data_dir=/PATH/TO/DATA`.
+
+- Configure and run get_dataset_MRcd.sh to convert the dataset to MindRecords. Parameters of scripts/get_dataset_MRcd.sh:
+
+    ```
+    --data_root    root path of the training data
+    --data_lst     training data list (prepared above)
+    --dst_path     where the MindRecord files are saved
+    --num_shards   number of shards of the MindRecord files
+    --shuffle      whether to shuffle
+    ```
+
+# Features
+
+## Mixed Precision
+
+[Mixed precision](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/enable_mixed_precision.html)
+training uses both single-precision and half-precision data to speed up the training of deep neural networks while maintaining the accuracy that single-precision training achieves. It increases computing speed and reduces memory usage, and enables training larger models or larger batch sizes on specific hardware.
+Taking the FP16 operator as an example, if the input data type is FP32, the MindSpore backend automatically reduces the precision to process the data. You can open the INFO log and search for "reduce precision" to view the operators whose precision was reduced.
+
+# Environment Requirements
+
+- Hardware (Ascend)
+    - Prepare the Ascend processor hardware environment.
+- Framework
+    - [MindSpore](https://www.mindspore.cn/install)
+- For more information, see the following resources:
+    - [MindSpore Tutorials](https://www.mindspore.cn/tutorial/training/zh-CN/master/index.html)
+    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/master/index.html)
+- Install the Python packages listed in requirements.txt.
+- Generate the config json file for 8-device training.
+
+# Quick Start
+
+After installing MindSpore via the official website, you can start training and evaluation as follows:
+
+- Running on Ascend
+
+Based on the original RefineNet paper, we first run a training experiment on the SBD dataset with the samples that overlap the VOC dataset removed, then fine-tune on the remaining VOC dataset, and finally evaluate on the voc_val dataset.
+
+Run the following script for single-device training:
+
+```bash
+run_standalone_train_ascend_r1.sh
+```
+
+Run the following script for 8-device training, fine-tuning from the ResNet-101 model:
+
+```bash
+run_distribute_train_ascend_r1.sh
+```
+
+Run the following script for 8-device training, fine-tuning the model from the previous step:
+
+```bash
+run_distribute_train_ascend_r2.sh
+```
+
+The evaluation procedure is as follows:
+
+1. Evaluate with the voc val dataset. The evaluation script is as follows:
+
+```bash
+run_eval_ascend.sh
+```
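+Putting the steps together, one end-to-end pass could be sketched as follows. This is only an illustration: every path, the rank table file, and the checkpoint names are placeholders that depend on your environment.
+
+```bash
+# generate the data list files from the raw VOC/SBD layout
+python src/tool/get_dataset_lst.py --data_dir=/PATH/TO/DATA
+# pack the SBD training list into MindRecord shards
+python src/tool/build_MRcd.py --data_root=/PATH/TO/DATA --data_lst=/PATH/TO/DATA/sbd_train_lst.txt \
+    --dst_path=/PATH/TO/MINDRECORD/sbd_train.mindrecord --num_shards=8
+# 8-device pretraining on SBD, then fine-tuning on VOC, then evaluation on voc_val
+# (run from the scripts directory)
+bash run_distribute_train_ascend_r1.sh /PATH/TO/rank_table_8pcs.json /PATH/TO/MINDRECORD/sbd_train.mindrecord /PATH/TO/resnet101.ckpt
+bash run_distribute_train_ascend_r2.sh /PATH/TO/rank_table_8pcs.json /PATH/TO/MINDRECORD/voc_train.mindrecord /PATH/TO/refinenet_r1.ckpt
+bash run_eval_ascend.sh /PATH/TO/voc_val_lst.txt /PATH/TO/refinenet_r2.ckpt 0
+```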
+# Script Description
+
+## Script and Sample Code
+
+```shell
+.
+└──refinenet
+  ├── scripts
+    ├── get_dataset_MRcd.sh                 # convert the raw data into MindRecord
+    ├── run_standalone_train_ascend_r1.sh   # launch Ascend standalone pretraining (1 device)
+    ├── run_standalone_train_ascend_r2.sh   # launch Ascend standalone fine-tuning (1 device)
+    ├── run_distribute_train_ascend_r1.sh   # launch Ascend distributed pretraining (8 devices)
+    ├── run_distribute_train_ascend_r2.sh   # launch Ascend distributed fine-tuning (8 devices)
+    ├── run_eval_ascend.sh                  # launch Ascend evaluation
+  ├── src
+    ├── tool
+      ├── get_dataset_lst.py                # generate the data list files
+      ├── build_MRcd.py                     # build the MindRecord files
+    ├── dataset.py                          # data preprocessing
+    ├── refinenet.py                        # RefineNet network structure
+    ├── learning_rates.py                   # learning rate generation
+    ├── loss.py                             # loss definition of RefineNet
+  ├── eval.py                               # evaluate the trained network
+  ├── export.py                             # export a checkpoint to an AIR file
+  ├── train.py                              # train the network
+  ├── requirements.txt                      # requirements file
+  └── README.md
+```
+
+## Script Parameters
+
+Default configuration:
+
+```bash
+"data_file":"/PATH/TO/MINDRECORD_NAME"        # dataset path
+"device_target":Ascend                        # training backend type
+"train_epochs":200                            # total number of epochs
+"batch_size":32                               # batch size of the input tensor
+"crop_size":513                               # crop size
+"base_lr":0.0015                              # initial learning rate
+"lr_type":cos                                 # decay mode used to generate the learning rate
+"min_scale":0.5                               # minimum scale of data augmentation
+"max_scale":2.0                               # maximum scale of data augmentation
+"ignore_label":255                            # ignore label
+"num_classes":21                              # number of classes
+"ckpt_pre_trained":"/PATH/TO/PRETRAIN_MODEL"  # path of the pretrained checkpoint to load
+"is_distribute":                              # set this flag for distributed training
+"save_epochs":5                               # interval (in epochs) between saved checkpoints
+"freeze_bn":                                  # set this flag to freeze batch normalization
+"keep_checkpoint_max":200                     # maximum number of checkpoints to keep
+```
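+The launch scripts pass these parameters to train.py on the command line. Stripped of the environment setup they perform, the underlying calls look roughly like the ones below (taken from run_standalone_train_ascend_r1.sh and run_distribute_train_ascend_r1.sh; the paths are placeholders):
+
+```bash
+# standalone pretraining on a single device
+python train.py --data_file=/PATH/TO/MINDRECORD --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
+    --device_id=0 --base_lr=0.0015 --batch_size=32
+# distributed pretraining: one such process is launched per device, with rank 0..7
+python train.py --device_id=0 --rank=0 --is_distribute --data_file=/PATH/TO/MINDRECORD \
+    --ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL --base_lr=0.0032 --batch_size=32
+```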
+## Training Process
+
+### Usage
+
+#### Running on Ascend
+
+Based on the original RefineNet paper, we first train on the SBD dataset (with the samples that overlap VOC2012 removed), then fine-tune with the voc_train split of Pascal VOC, and finally evaluate on the voc_val split.
+
+Run the following script for single-device pretraining:
+
+```bash
+# run_standalone_train_ascend_r1.sh
+Usage: bash run_standalone_train_ascend_r1.sh [DATASET_PATH] [PRETRAINED_PATH] [DEVICE_ID]
+```
+
+Run the following script for single-device fine-tuning of the model from the previous step:
+
+```bash
+# run_standalone_train_ascend_r2.sh
+Usage: bash run_standalone_train_ascend_r2.sh [DATASET_PATH] [PRETRAINED_PATH] [DEVICE_ID]
+```
+
+Run the following script for 8-device pretraining, fine-tuning from the ResNet-101 model:
+
+```bash
+# run_distribute_train_ascend_r1.sh
+Usage: bash run_distribute_train_ascend_r1.sh [RANK_TABLE_FILE] [DATASET_PATH] [PRETRAINED_PATH]
+```
+
+Run the following script for 8-device fine-tuning of the model from the previous step:
+
+```bash
+# run_distribute_train_ascend_r2.sh
+Usage: bash run_distribute_train_ascend_r2.sh [RANK_TABLE_FILE] [DATASET_PATH] [PRETRAINED_PATH]
+```
+
+### Results
+
+#### Running on Ascend
+
+- Training on the SBD dataset with the VOC2012 overlap removed, fine-tuning from the ResNet-101 model:
+
+```bash
+# standalone training result (1 device)
+epoch: 1 step: 284, loss is 0.7524967
+epoch time: 546527.635 ms, per step time: 1924.393 ms
+epoch: 2 step: 284, loss is 0.7311493
+epoch time: 298406.836 ms, per step time: 1050.728 ms
+epoch: 3 step: 284, loss is 0.36002275
+epoch time: 298394.940 ms, per step time: 1050.686 ms
+epoch: 4 step: 284, loss is 0.50077325
+epoch time: 298390.876 ms, per step time: 1050.672 ms
+epoch: 5 step: 284, loss is 0.62343127
+epoch time: 309631.879 ms, per step time: 1090.253 ms
+epoch: 6 step: 284, loss is 0.3367705
+epoch time: 298388.706 ms, per step time: 1050.664 ms
+...
+```
+
+```bash
+# distributed training result (8 devices)
+epoch: 1 step: 142, loss is 0.781318
+epoch time: 194373.504 ms, per step time: 1368.827 ms
+epoch: 2 step: 142, loss is 0.55504256
+epoch time: 54313.781 ms, per step time: 382.491 ms
+epoch: 3 step: 142, loss is 0.2290901
+epoch time: 54346.609 ms, per step time: 382.723 ms
+epoch: 4 step: 142, loss is 0.23693062
+epoch time: 54391.451 ms, per step time: 383.038 ms
+epoch: 5 step: 142, loss is 0.26892647
+epoch time: 59496.694 ms, per step time: 418.991 ms
+epoch: 6 step: 142, loss is 0.34565672
+epoch time: 54295.630 ms, per step time: 382.364 ms
+...
+```
+
+- Training on the VOC2012 dataset alone, fine-tuning the model from the previous step:
+
+```bash
+# standalone training result (1 device)
+epoch: 1 step: 45, loss is 0.27439225
+epoch time: 292909.346 ms, per step time: 6509.097 ms
+epoch: 2 step: 45, loss is 0.3075968
+epoch time: 47189.032 ms, per step time: 1048.645 ms
+epoch: 3 step: 45, loss is 0.33274153
+epoch time: 47213.959 ms, per step time: 1049.199 ms
+epoch: 4 step: 45, loss is 0.15978609
+epoch time: 47171.244 ms, per step time: 1048.250 ms
+epoch: 5 step: 45, loss is 0.1546418
+epoch time: 59120.354 ms, per step time: 1313.786 ms
+epoch: 6 step: 45, loss is 0.12949142
+epoch time: 47178.499 ms, per step time: 1048.411 ms
+...
+```
+
+```bash
+# distributed training result (8 devices)
+epoch: 1 step: 22, loss is 1.2161481
+epoch time: 142361.584 ms, per step time: 6470.981 ms
+epoch: 2 step: 22, loss is 0.11737871
+epoch time: 8448.342 ms, per step time: 384.016 ms
+epoch: 3 step: 22, loss is 0.09774251
+epoch time: 14003.816 ms, per step time: 636.537 ms
+epoch: 4 step: 22, loss is 0.0612365
+epoch time: 8421.547 ms, per step time: 382.798 ms
+epoch: 5 step: 22, loss is 0.09208072
+epoch time: 8432.817 ms, per step time: 383.310 ms
+epoch: 6 step: 22, loss is 0.1707601
+epoch time: 12969.236 ms, per step time: 589.511 ms
+...
+```
+
+## Evaluation Process
+
+### Usage
+
+#### Running on Ascend
+
+Configure the checkpoint with --ckpt_path and run the script; the mIOU is printed in eval_path/log.
+
+```bash
+./run_eval_ascend.sh                         # test the training result
+
+per-class IoU [0.92730402 0.89903323 0.42117934 0.82678775 0.69056955 0.72132475
+ 0.8930829 0.81315161 0.80125108 0.32330532 0.74447242 0.58100735
+ 0.77520672 0.74184709 0.8185944 0.79020087 0.51059369 0.7229567
+ 0.36999663 0.79072283 0.74327523]
+mean IoU 0.8038030230633278
+
+```
+
+An example of the test script is as follows:
+
+```bash
+if [ $# -ne 3 ]
+then
+    echo "Usage: bash run_eval_ascend.sh [DATASET_PATH] [PRETRAINED_PATH] [DEVICE_ID]"
+exit 1
+fi
+
+ulimit -u unlimited
+export DEVICE_NUM=1
+export DEVICE_ID=$3
+export RANK_ID=0
+export RANK_SIZE=1
+LOCAL_DIR=eval$DEVICE_ID
+rm -rf $LOCAL_DIR
+mkdir $LOCAL_DIR
+cp ../*.py $LOCAL_DIR
+cp *.sh $LOCAL_DIR
+cp -r ../src $LOCAL_DIR
+cd $LOCAL_DIR || exit
+echo "start training for device $DEVICE_ID"
+env > env.log
+python eval.py --data_lst=$DATASET_PATH --ckpt_path=$PRETRAINED_PATH --device_id=$DEVICE_ID --flip &> log &
+cd ..
+```
+
+### Results
+
+Run the applicable training script to obtain the results. To get the same results, follow the steps in the Quick Start.
+
+#### Training Accuracy
+
+| **Network** | mIOU | mIOU reported in the paper |
+| :---------: | :--: | :------------------------: |
+| refinenet | 80.3 | 80.3 |
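+A trained checkpoint can also be converted to an AIR file with export.py, which traces the network with a fixed 32 x 3 x 513 x 513 input. A minimal invocation might look as follows; the checkpoint path is a placeholder:
+
+```bash
+python export.py --checkpoint=/PATH/TO/refinenet.ckpt --num_classes=21
+```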
+# Model Description
+
+## Performance
+
+### Evaluation Performance
+
+| Parameters | Ascend 910 |
+| -------------------------- | -------------------------------------- |
+| Model Version | RefineNet |
+| Resource | Ascend 910 |
+| Uploaded Date | 2021-09-17 |
+| MindSpore Version | 1.2 |
+| Dataset | PASCAL VOC2012 + SBD |
+| Training Parameters | epoch = 200, batch_size = 32 |
+| Optimizer | Momentum |
+| Loss Function | Softmax cross entropy |
+| Outputs | probability |
+| Loss | 0.027490407 |
+| Performance | 54294.528 ms per epoch (8 devices); 298406.836 ms per epoch (1 device) |
+| Fine-tuned Checkpoint | 901M (.ckpt file) |
+| Scripts | [link](https://gitee.com/mindspore/models/tree/master/research/cv/refinenet) |
+
+# Description of Random Situation
+
+A fixed seed is set for the dataset pipeline in dataset.py (see the `set_seed` and `ds.config.set_seed` calls used by `get_dataset1`), and train.py also sets a random seed.
+
+# ModelZoo Homepage
+
+ Please check the official [homepage](https://gitee.com/mindspore/models/).
\ No newline at end of file
diff --git a/research/cv/RefineNet/eval.py b/research/cv/RefineNet/eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..e58d292c5970c71129082c237695271b061fc3f8
--- /dev/null
+++ b/research/cv/RefineNet/eval.py
@@ -0,0 +1,198 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""eval RefineNet"""
+import os
+import argparse
+import numpy as np
+import cv2
+from mindspore import Tensor
+import mindspore.common.dtype as mstype
+from mindspore import context
+from mindspore.train.serialization import load_checkpoint, load_param_into_net
+from src.refinenet import RefineNet, Bottleneck
+
+
+def parse_args():
+    """parse_args"""
+    parser = argparse.ArgumentParser('RefineNet Eval')
+
+    # val data
+    parser.add_argument('--data_lst', type=str, default='', help='list of val data')
+    parser.add_argument('--batch_size', type=int, default=32, help='batch size')
+    parser.add_argument('--crop_size', type=int, default=513, help='crop size')
+    parser.add_argument('--image_mean', type=list, default=[103.53, 116.28, 123.675], help='image mean')
+    parser.add_argument('--image_std', type=list, default=[57.375, 57.120, 58.395], help='image std')
+    parser.add_argument('--scales', type=float, action='append', default=[1.0], help='scales of evaluation')
+    parser.add_argument('--flip', action='store_true', help='perform left-right flip')
+    parser.add_argument('--ignore_label', type=int, default=255, help='ignore label')
+    parser.add_argument('--num_classes', type=int, default=21, help='number of classes')
+    parser.add_argument('--device_id', type=str, default='0', choices=['0', '1', '2', '3', '4', '5', '6', '7'],
+                        help='which device will be implemented')
+    parser.add_argument('--ckpt_path', type=str, default='', help='model to evaluate')
+    args, _ = parser.parse_known_args()
+    return args
+
+
+def cal_hist(a, b, n):
+    """accumulate an n x n confusion matrix between labels a and predictions b"""
+    k = (a >= 0) & (a < n)
+    return np.bincount(n * a[k].astype(np.int32) + b[k], minlength=n ** 2).reshape(n, n)
+
+
+def resize_long(img, long_size=513):
+    """resize the image so that its longer side equals long_size"""
+    h, w, _ = img.shape
+    if h > w:
+        new_h = long_size
+        new_w = int(1.0 * long_size * w / h)
+    else:
+        new_w = long_size
+        new_h = int(1.0 * long_size * h / w)
+    imo = cv2.resize(img, (new_w, new_h))
+    return imo
+
+
+def image_bgr_rgb(img):
+    img_data = img[:, :, ::-1]
+    return img_data
+
+
+def pre_process(args, img_, crop_size=513):
+    """pre_process"""
+    # resize
+    img_ = resize_long(img_, crop_size)
+    resize_h, resize_w, _ = img_.shape
+
+    # mean, std
+    image_mean = np.array(args.image_mean)
+    image_std = np.array(args.image_std)
+    img_ = (img_ - image_mean) / image_std
+    img_ = image_bgr_rgb(img_)
+    # pad to crop_size
+    pad_h = crop_size - img_.shape[0]
+    pad_w = crop_size - img_.shape[1]
+    if pad_h > 0 or pad_w > 0:
+        img_ = cv2.copyMakeBorder(img_, 0, pad_h, 0, pad_w, cv2.BORDER_CONSTANT, value=0)
+
+    # hwc to chw
+    img_ = img_.transpose((2, 0, 1))
+    return img_, resize_h, resize_w
+
+
+def eval_batch(args, eval_net, img_lst, crop_size=513, flip=True):
+    """eval_batch"""
+    result_lst = []
+    batch_size = len(img_lst)
+    batch_img = np.zeros((args.batch_size, 3, crop_size, crop_size), dtype=np.float32)
+    resize_hw = []
+    for l in range(batch_size):
+        img_ = img_lst[l]
+        img_, resize_h, resize_w = pre_process(args, img_, crop_size)
+        batch_img[l] = img_
+        resize_hw.append([resize_h, resize_w])
+    batch_img =
np.ascontiguousarray(batch_img) + net_out = eval_net(Tensor(batch_img, mstype.float32)) + net_out = net_out.asnumpy() + if flip: + batch_img = batch_img[:, :, :, ::-1] + net_out_flip = eval_net(Tensor(batch_img, mstype.float32)) + net_out += net_out_flip.asnumpy()[:, :, :, ::-1] + for bs in range(batch_size): + probs_ = net_out[bs][:, :resize_hw[bs][0], :resize_hw[bs][1]].transpose((1, 2, 0)) + ori_h, ori_w = img_lst[bs].shape[0], img_lst[bs].shape[1] + probs_ = cv2.resize(probs_, (ori_w, ori_h)) + result_lst.append(probs_) + return result_lst + + +def eval_batch_scales(args, eval_net, img_lst, scales, + base_crop_size=513, flip=True): + """eval_batch_scales""" + sizes_ = [int((base_crop_size - 1) * sc) + 1 for sc in scales] + probs_lst = eval_batch(args, eval_net, img_lst, crop_size=sizes_[0], flip=flip) + print(sizes_) + for crop_size_ in sizes_[1:]: + probs_lst_tmp = eval_batch(args, eval_net, img_lst, crop_size=crop_size_, flip=flip) + for pl, _ in enumerate(probs_lst): + probs_lst[pl] += probs_lst_tmp[pl] + + result_msk = [] + for i in probs_lst: + result_msk.append(i.argmax(axis=2)) + return result_msk + + +def net_eval(): + """net_eval""" + args = parse_args() + + # data list + with open(args.data_lst) as f: + img_lst = f.readlines() + + context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", save_graphs=False, + device_id=int(args.device_id)) + + network = RefineNet(Bottleneck, [3, 4, 23, 3], args.num_classes) + + # load model + param_dict = load_checkpoint(args.ckpt_path) + load_param_into_net(network, param_dict) + network.set_train(False) + + # evaluate + hist = np.zeros((args.num_classes, args.num_classes)) + batch_img_lst = [] + batch_msk_lst = [] + seg_path_list = [] + bi = 0 + image_num = 0 + for i, line in enumerate(img_lst): + img_path, msk_path = line.strip().split(' ') + img_ = cv2.imread(img_path) + msk_ = cv2.imread(msk_path, cv2.IMREAD_GRAYSCALE) + batch_img_lst.append(img_) + batch_msk_lst.append(msk_) + seg_path_list.append(msk_path) + bi += 1 + if bi == args.batch_size: + batch_res = eval_batch_scales(args, network, batch_img_lst, scales=args.scales, + base_crop_size=args.crop_size, flip=args.flip) + for mi in range(args.batch_size): + hist += cal_hist(batch_msk_lst[mi].flatten(), batch_res[mi].flatten(), args.num_classes) + seg_name = os.path.split(seg_path_list[mi])[1] + new_seg = batch_res[mi] + new_seg[new_seg > 0] = 255 + cv2.imwrite(seg_name, new_seg) + bi = 0 + batch_img_lst = [] + batch_msk_lst = [] + seg_path_list = [] + print('processed {} images'.format(i + 1)) + image_num = i + + if bi > 0: + batch_res = eval_batch_scales(args, network, batch_img_lst, scales=args.scales, + base_crop_size=args.crop_size, flip=args.flip) + for mi in range(bi): + hist += cal_hist(batch_msk_lst[mi].flatten(), batch_res[mi].flatten(), args.num_classes) + print('processed {} images'.format(image_num + 1)) + print(hist) + iu = np.diag(hist) / (hist.sum(1) + hist.sum(0) - np.diag(hist)) + print('per-class IoU', iu) + print('mean IoU', np.nanmean(iu)) + + + +if __name__ == '__main__': + net_eval() diff --git a/research/cv/RefineNet/export.py b/research/cv/RefineNet/export.py new file mode 100644 index 0000000000000000000000000000000000000000..b14169ffbf80680785f26d33c88bc249bfc68cf1 --- /dev/null +++ b/research/cv/RefineNet/export.py @@ -0,0 +1,36 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""export AIR file."""
+import argparse
+import numpy as np
+from mindspore import Tensor, context, load_checkpoint, load_param_into_net, export
+from src.refinenet import RefineNet, Bottleneck
+
+context.set_context(mode=context.GRAPH_MODE, save_graphs=False)
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description='checkpoint export')
+    parser.add_argument('--checkpoint', type=str, default='', help='checkpoint of refinenet (Default: None)')
+    parser.add_argument('--file_name', type=str, default='refinenet', help='output AIR file name (Default: refinenet)')
+    parser.add_argument('--num_classes', type=int, default=21, help='the number of classes (Default: 21)')
+    args = parser.parse_args()
+
+    network = RefineNet(Bottleneck, [3, 4, 23, 3], args.num_classes)
+    param_dict = load_checkpoint(args.checkpoint)
+
+    # load the parameters into the net
+    load_param_into_net(network, param_dict)
+    input_data = np.random.uniform(0.0, 1.0, size=[32, 3, 513, 513]).astype(np.float32)
+    export(network, Tensor(input_data), file_name=args.file_name, file_format='AIR')
diff --git a/research/cv/RefineNet/requirements.txt b/research/cv/RefineNet/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..b97d32dfff15e9a670ad171ec3037d8b755b90fb
--- /dev/null
+++ b/research/cv/RefineNet/requirements.txt
@@ -0,0 +1,4 @@
+mindspore
+numpy
+Pillow
+opencv-python
\ No newline at end of file
diff --git a/research/cv/RefineNet/scripts/run_distribute_train_ascend_r1.sh b/research/cv/RefineNet/scripts/run_distribute_train_ascend_r1.sh
new file mode 100644
index 0000000000000000000000000000000000000000..771aa7663d84c957a3e099493efa4c2ef87ebe19
--- /dev/null
+++ b/research/cv/RefineNet/scripts/run_distribute_train_ascend_r1.sh
@@ -0,0 +1,74 @@
+#! /bin/bash
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+if [ $# -ne 3 ]
+then
+    echo "Usage: bash run_distribute_train_ascend_r1.sh [RANK_TABLE_FILE] [DATASET_PATH] [PRETRAINED_PATH]"
+exit 1
+fi
+
+get_real_path(){
+  if [ "${1:0:1}" == "/" ]; then
+    echo "$1"
+  else
+    echo "$(realpath -m $PWD/$1)"
+  fi
+}
+
+RANK_TABLE_PATH=$(get_real_path $1)
+echo $RANK_TABLE_PATH
+
+if [ ! -f $RANK_TABLE_PATH ]
+then
+    echo "error: RANK_TABLE_FILE=$RANK_TABLE_PATH is not a file"
+exit 1
+fi
+
+DATASET_PATH=$2
+if [ ! -f $DATASET_PATH ]
+then
+    echo "error: DATASET_PATH=$DATASET_PATH is not a file"
+exit 1
+fi
+
+PRETRAINED_PATH=$(get_real_path $3)
+echo $PRETRAINED_PATH
+if [ !
-f $PRETRAINED_PATH ] +then + echo "error: PRETRAINED_PATH=$PRETRAINED_PATH is not a file" +exit 1 +fi + +ulimit -u unlimited +export DEVICE_NUM=8 +export RANK_SIZE=8 +export RANK_TABLE_FILE=$RANK_TABLE_PATH + +for((i=0; i<${DEVICE_NUM}; i++)) +do + export DEVICE_ID=$i + export RANK_ID=$i + rm -rf ./train_parallel$i + mkdir ./train_parallel$i + cp ../*.py ./train_parallel$i + cp *.sh ./train_parallel$i + cp -r ../src ./train_parallel$i + cd ./train_parallel$i || exit + echo "start training for rank $RANK_ID, device $DEVICE_ID" + env > env.log + python train.py --device_id=$i --rank=$i --is_distribute --data_file=$DATASET_PATH --ckpt_pre_trained=$PRETRAINED_PATH --base_lr=0.0032 --batch_size=32 &> log & + cd .. +done + diff --git a/research/cv/RefineNet/scripts/run_distribute_train_ascend_r2.sh b/research/cv/RefineNet/scripts/run_distribute_train_ascend_r2.sh new file mode 100644 index 0000000000000000000000000000000000000000..097859507f6c940965e923e2cd9cac4c6306d366 --- /dev/null +++ b/research/cv/RefineNet/scripts/run_distribute_train_ascend_r2.sh @@ -0,0 +1,74 @@ +#! /bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +if [ $# -ne 3 ] +then + echo "Usage: sh run_distribute_train_ascend.sh [RANK_TABLE_FILE] [DATASET_PATH] [PRETRAINED_PATH]" +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} + +RANK_TABLE_PATH=$(get_real_path $1) +echo $RANK_TABLE_PATH + +if [ ! -f $RANK_TABLE_PATH ] +then + echo "error: RANK_TABLE_FILE=$RANK_TABLE_PATH is not a file" +exit 1 +fi + +DATASET_PATH=$2 +if [ ! -f $DATASET_PATH ] +then + echo "error: DATASET_PATH=$DATASET_PATH is not a file" +exit 1 +fi + +PRETRAINED_PATH=$(get_real_path $3) +echo $PRETRAINED_PATH +if [ ! -f $PRETRAINED_PATH ] +then + echo "error: PRETRAINED_PATH=$PRETRAINED_PATH is not a file" +exit 1 +fi + +ulimit -u unlimited +export DEVICE_NUM=8 +export RANK_SIZE=8 +export RANK_TABLE_FILE=$RANK_TABLE_PATH + +for((i=0; i<${DEVICE_NUM}; i++)) +do + export DEVICE_ID=$i + export RANK_ID=$i + rm -rf ./train_parallel$i + mkdir ./train_parallel$i + cp ../*.py ./train_parallel$i + cp *.sh ./train_parallel$i + cp -r ../src ./train_parallel$i + cd ./train_parallel$i || exit + echo "start training for rank $RANK_ID, device $DEVICE_ID" + env > env.log + python train.py --device_id=$i --rank=$i --is_distribute --data_file=$DATASET_PATH --ckpt_pre_trained=$PRETRAINED_PATH --base_lr=0.00032 --batch_size=32 &> log & + cd .. +done + diff --git a/research/cv/RefineNet/scripts/run_eval_ascend.sh b/research/cv/RefineNet/scripts/run_eval_ascend.sh new file mode 100644 index 0000000000000000000000000000000000000000..a30cfe06859739734583ef23a987d91f19008cca --- /dev/null +++ b/research/cv/RefineNet/scripts/run_eval_ascend.sh @@ -0,0 +1,63 @@ +#! 
/bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +if [ $# -ne 3 ] +then + echo "Usage: sh run_eval_ascend.sh [DATASET_PATH] [PRETRAINED_PATH] [DEVICE_ID]" +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} + +DATASET_PATH=$(get_real_path $1) +PRETRAINED_PATH=$(get_real_path $2) +echo $DATASET_PATH +echo $PRETRAINED_PATH + +if [ ! -f $DATASET_PATH ] +then + echo "error: DATASET_PATH=$DATASET_PATH is not a file" +exit 1 +fi + +if [ ! -f $PRETRAINED_PATH ] +then + echo "error: PRETRAINED_PATH=$PRETRAINED_PATH is not a file" +exit 1 +fi + +ulimit -u unlimited +export DEVICE_NUM=1 +export DEVICE_ID=$3 +export RANK_ID=0 +export RANK_SIZE=1 +LOCAL_DIR=eval$DEVICE_ID +rm -rf $LOCAL_DIR +mkdir $LOCAL_DIR +cp ../*.py $LOCAL_DIR +cp *.sh $LOCAL_DIR +cp -r ../src $LOCAL_DIR +cd $LOCAL_DIR || exit +echo "start training for device $DEVICE_ID" +env > env.log +python eval.py --data_lst=$DATASET_PATH --ckpt_path=$PRETRAINED_PATH --device_id=$DEVICE_ID --flip &> log & +cd .. + diff --git a/research/cv/RefineNet/scripts/run_standalone_train_ascend_r1.sh b/research/cv/RefineNet/scripts/run_standalone_train_ascend_r1.sh new file mode 100644 index 0000000000000000000000000000000000000000..b5cd8b5716a8fe5a95863d856498d7a5b3dde3cb --- /dev/null +++ b/research/cv/RefineNet/scripts/run_standalone_train_ascend_r1.sh @@ -0,0 +1,63 @@ +#! /bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +if [ $# -ne 3 ] +then + echo "Usage: sh run_standalone_train_ascend.sh [DATASET_PATH] [PRETRAINED_PATH] [DEVICE_ID]" +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} + +DATASET_PATH=$(get_real_path $1) +PRETRAINED_PATH=$(get_real_path $2) +echo $DATASET_PATH +echo $PRETRAINED_PATH + +if [ ! -f $DATASET_PATH ] +then + echo "error: DATASET_PATH=$DATASET_PATH is not a file" +exit 1 +fi + +if [ ! 
-f $PRETRAINED_PATH ] +then + echo "error: PRETRAINED_PATH=$PRETRAINED_PATH is not a file" +exit 1 +fi + +ulimit -u unlimited +export DEVICE_NUM=1 +export DEVICE_ID=$3 +export RANK_ID=0 +export RANK_SIZE=1 +LOCAL_DIR=train$DEVICE_ID +rm -rf $LOCAL_DIR +mkdir $LOCAL_DIR +cp ../*.py $LOCAL_DIR +cp *.sh $LOCAL_DIR +cp -r ../src $LOCAL_DIR +cd $LOCAL_DIR || exit +echo "start training for device $DEVICE_ID" +env > env.log +python train.py --data_file=$DATASET_PATH --ckpt_pre_trained=$PRETRAINED_PATH --device_id=$DEVICE_ID --base_lr=0.0015 --batch_size=32 &> log & +cd .. + diff --git a/research/cv/RefineNet/scripts/run_standalone_train_ascend_r2.sh b/research/cv/RefineNet/scripts/run_standalone_train_ascend_r2.sh new file mode 100644 index 0000000000000000000000000000000000000000..70c02995462fb5c578c550f997df06ff2a78325b --- /dev/null +++ b/research/cv/RefineNet/scripts/run_standalone_train_ascend_r2.sh @@ -0,0 +1,63 @@ +#! /bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +if [ $# -ne 3 ] +then + echo "Usage: sh run_standalone_train_ascend.sh [DATASET_PATH] [PRETRAINED_PATH] [DEVICE_ID]" +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} + +DATASET_PATH=$(get_real_path $1) +PRETRAINED_PATH=$(get_real_path $2) +echo $DATASET_PATH +echo $PRETRAINED_PATH + +if [ ! -f $DATASET_PATH ] +then + echo "error: DATASET_PATH=$DATASET_PATH is not a file" +exit 1 +fi + +if [ ! -f $PRETRAINED_PATH ] +then + echo "error: PRETRAINED_PATH=$PRETRAINED_PATH is not a file" +exit 1 +fi + +ulimit -u unlimited +export DEVICE_NUM=1 +export DEVICE_ID=$3 +export RANK_ID=0 +export RANK_SIZE=1 +LOCAL_DIR=train$DEVICE_ID +rm -rf $LOCAL_DIR +mkdir $LOCAL_DIR +cp ../*.py $LOCAL_DIR +cp *.sh $LOCAL_DIR +cp -r ../src $LOCAL_DIR +cd $LOCAL_DIR || exit +echo "start training for device $DEVICE_ID" +env > env.log +python train.py --data_file=$DATASET_PATH --ckpt_pre_trained=$PRETRAINED_PATH --device_id=$DEVICE_ID --base_lr=0.00015 --batch_size=32 &> log & +cd .. + diff --git a/research/cv/RefineNet/src/dataset.py b/research/cv/RefineNet/src/dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..7bc771e597bf5d15d5b6168a6e04aacb2777dd24 --- /dev/null +++ b/research/cv/RefineNet/src/dataset.py @@ -0,0 +1,131 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +""" dataset """ +import numpy as np +import cv2 +import mindspore.dataset.vision.c_transforms as C +import mindspore.dataset as ds +from mindspore.common import set_seed +cv2.setNumThreads(0) +set_seed(1) + + +class SegDataset: + """init dataset""" + def __init__(self, + image_mean, + image_std, + data_file='', + batch_size=32, + crop_size=512, + max_scale=2.0, + min_scale=0.5, + ignore_label=255, + num_classes=21, + num_readers=2, + num_parallel_calls=4, + shard_id=None, + shard_num=None): + self.data_file = data_file + self.batch_size = batch_size + self.crop_size = crop_size + self.image_mean = np.array(image_mean, dtype=np.float32) + self.image_std = np.array(image_std, dtype=np.float32) + self.max_scale = max_scale + self.min_scale = min_scale + self.ignore_label = ignore_label + self.num_classes = num_classes + self.num_readers = num_readers + self.num_parallel_calls = num_parallel_calls + self.shard_id = shard_id + self.shard_num = shard_num + self.enable_flip = True + assert max_scale > min_scale + + def expand(self, image, label): + """expand image""" + if np.random.uniform(0.0, 1.0) > 0.5: + return image, label + h, w, c = image.shape + ratio = np.random.uniform(1.0, 4.0) + mean = (0, 0, 0) + expand_img = np.full((int(h * ratio), int(w * ratio), c), mean).astype(image.dtype) + left = int(np.random.uniform(0, w * ratio - w)) + top = int(np.random.uniform(0, h * ratio - h)) + expand_img[top:top + h, left:left + w] = image + image = expand_img + expand_label = np.full((int(h * ratio), int(w * ratio)), self.ignore_label).astype(label.dtype) + expand_label[top:top + h, left:left + w] = label + label = expand_label + return image, label + + def resize_long(self, img): + """resize""" + long_size = self.crop_size + h, w, _ = img.shape + if h > w: + new_h = long_size + new_w = int(1.0 * long_size * w / h) + else: + new_w = long_size + new_h = int(1.0 * long_size * h / w) + return new_h, new_w + + def RandomScaleAndCrop(self, image, label): + """random scale and crop""" + sc = np.random.uniform(self.min_scale, self.max_scale) + new_h, new_w = int(sc * image.shape[0]), int(sc * image.shape[1]) + image_out = cv2.resize(image, (new_w, new_h), interpolation=cv2.INTER_CUBIC) + label_out = cv2.resize(label, (new_w, new_h), interpolation=cv2.INTER_NEAREST) + image_out = (image_out - self.image_mean) / self.image_std + h_, w_ = max(new_h, self.crop_size), max(new_w, self.crop_size) + pad_h, pad_w = h_ - new_h, w_ - new_w + if pad_h > 0 or pad_w > 0: + image_out = cv2.copyMakeBorder(image_out, 0, pad_h, 0, pad_w, cv2.BORDER_CONSTANT, value=0) + label_out = cv2.copyMakeBorder(label_out, 0, pad_h, 0, pad_w, cv2.BORDER_CONSTANT, value=self.ignore_label) + offset_h = np.random.randint(0, h_ - self.crop_size + 1) + offset_w = np.random.randint(0, w_ - self.crop_size + 1) + image_out = image_out[offset_h: offset_h + self.crop_size, offset_w: offset_w + self.crop_size, :] + label_out = label_out[offset_h: offset_h + self.crop_size, offset_w: offset_w+self.crop_size] + return image_out, label_out + + def preprocess_(self, image, label): + """bgr image""" + image_out = image + label_out = cv2.imdecode(np.frombuffer(label, dtype=np.uint8), cv2.IMREAD_GRAYSCALE) + image_out, label_out = self.RandomScaleAndCrop(image_out, label_out) + if np.random.uniform(0.0, 1.0) > 0.5: + image_out = image_out[:, ::-1, :] + label_out = label_out[:, ::-1] + image_out = image_out.transpose((2, 0, 1)) + image_out = image_out.copy() + label_out = 
label_out.copy() + return image_out, label_out + + def get_dataset1(self): + """get dataset""" + ds.config.set_seed(1000) + data_set = ds.MindDataset(dataset_file=self.data_file, columns_list=["data", "label"], + shuffle=True, num_parallel_workers=self.num_readers, + num_shards=self.shard_num, shard_id=self.shard_id) + decode_op = C.Decode() + trans = [decode_op] + data_set = data_set.map(operations=trans, input_columns=["data"]) + transforms_list = self.preprocess_ + data_set = data_set.map(operations=transforms_list, input_columns=["data", "label"], + output_columns=["data", "label"], + num_parallel_workers=self.num_parallel_calls) + data_set = data_set.batch(self.batch_size, drop_remainder=True) + return data_set diff --git a/research/cv/RefineNet/src/learning_rates.py b/research/cv/RefineNet/src/learning_rates.py new file mode 100644 index 0000000000000000000000000000000000000000..b74501fdce0b6b214db4585c653ebaf4ba0155a5 --- /dev/null +++ b/research/cv/RefineNet/src/learning_rates.py @@ -0,0 +1,226 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +"""learning rates""" +import math +import numpy as np + + +def cosine_lr(base_lr, decay_steps, total_steps): + for i in range(total_steps): + step_ = min(i, decay_steps) + yield base_lr * 0.5 * (1 + np.cos(np.pi * step_ / decay_steps)) + + +def poly_lr(base_lr, decay_steps, total_steps, end_lr=0.0001, power=0.9): + for i in range(total_steps): + step_ = min(i, decay_steps) + yield (base_lr - end_lr) * ((1.0 - step_ / decay_steps) ** power) + end_lr + + +def exponential_lr(base_lr, decay_steps, decay_rate, total_steps, staircase=False): + for i in range(total_steps): + if staircase: + power_ = i // decay_steps + else: + power_ = float(i) / decay_steps + yield base_lr * (decay_rate ** power_) + + +def _generate_steps_lr(lr_init, lr_max, total_steps, warmup_steps): + """ + Applies three steps decay to generate learning rate array. + + Args: + lr_init(float): init learning rate. + lr_max(float): max learning rate. + total_steps(int): all steps in training. + warmup_steps(int): all steps in warmup epochs. + + Returns: + np.array, learning rate array. + """ + decay_epoch_index = [0.3 * total_steps, 0.6 * total_steps, 0.8 * total_steps] + lr_each_step = [] + for i in range(total_steps): + if i < warmup_steps: + lr = lr_init + (lr_max - lr_init) * i / warmup_steps + else: + if i < decay_epoch_index[0]: + lr = lr_max + elif i < decay_epoch_index[1]: + lr = lr_max * 0.1 + elif i < decay_epoch_index[2]: + lr = lr_max * 0.01 + else: + lr = lr_max * 0.001 + lr_each_step.append(lr) + return lr_each_step + + +def _generate_poly_lr(lr_init, lr_max, total_steps, warmup_steps): + """ + Applies polynomial decay to generate learning rate array. + + Args: + lr_init(float): init learning rate. + lr_max(float): max learning rate. + total_steps(int): all steps in training. + warmup_steps(int): all steps in warmup epochs. 
+
+    Returns:
+        np.array, learning rate array.
+    """
+    lr_each_step = []
+    if warmup_steps != 0:
+        inc_each_step = (float(lr_max) - float(lr_init)) / float(warmup_steps)
+    else:
+        inc_each_step = 0
+    for i in range(total_steps):
+        if i < warmup_steps:
+            lr = float(lr_init) + inc_each_step * float(i)
+        else:
+            base = (1.0 - (float(i) - float(warmup_steps)) / (float(total_steps) - float(warmup_steps)))
+            lr = float(lr_max) * base * base
+            if lr < 0.0:
+                lr = 0.0
+        lr_each_step.append(lr)
+    return lr_each_step
+
+
+def _generate_cosine_lr(lr_init, lr_max, total_steps, warmup_steps):
+    """
+    Applies cosine decay to generate learning rate array.
+
+    Args:
+        lr_init(float): init learning rate.
+        lr_max(float): max learning rate.
+        total_steps(int): all steps in training.
+        warmup_steps(int): all steps in warmup epochs.
+
+    Returns:
+        np.array, learning rate array.
+    """
+    decay_steps = total_steps - warmup_steps
+    lr_each_step = []
+    for i in range(total_steps):
+        if i < warmup_steps:
+            lr_inc = (float(lr_max) - float(lr_init)) / float(warmup_steps)
+            lr = float(lr_init) + lr_inc * (i + 1)
+        else:
+            linear_decay = (total_steps - i) / decay_steps
+            cosine_decay = 0.5 * (1 + math.cos(math.pi * 2 * 0.47 * i / decay_steps))
+            decayed = linear_decay * cosine_decay + 0.00001
+            lr = lr_max * decayed
+        lr_each_step.append(lr)
+    return lr_each_step
+
+
+def _generate_linear_lr(lr_init, lr_end, lr_max, total_steps, warmup_steps):
+    """
+    Applies linear decay to generate learning rate array.
+
+    Args:
+        lr_init(float): init learning rate.
+        lr_end(float): end learning rate
+        lr_max(float): max learning rate.
+        total_steps(int): all steps in training.
+        warmup_steps(int): all steps in warmup epochs.
+
+    Returns:
+        np.array, learning rate array.
+    """
+    lr_each_step = []
+    for i in range(total_steps):
+        if i < warmup_steps:
+            lr = lr_init + (lr_max - lr_init) * i / warmup_steps
+        else:
+            lr = lr_max - (lr_max - lr_end) * (i - warmup_steps) / (total_steps - warmup_steps)
+        lr_each_step.append(lr)
+    return lr_each_step
+
+
+def get_lr(lr_init, lr_end, lr_max, warmup_epochs, total_epochs, steps_per_epoch, lr_decay_mode):
+    """
+    generate learning rate array
+
+    Args:
+        lr_init(float): init learning rate
+        lr_end(float): end learning rate
+        lr_max(float): max learning rate
+        warmup_epochs(int): number of warmup epochs
+        total_epochs(int): total epoch of training
+        steps_per_epoch(int): steps of one epoch
+        lr_decay_mode(string): learning rate decay mode, including steps, poly, cosine or linear (default)
+
+    Returns:
+        np.array, learning rate array
+    """
+    lr_each_step = []
+    total_steps = steps_per_epoch * total_epochs
+    warmup_steps = steps_per_epoch * warmup_epochs
+
+    if lr_decay_mode == 'steps':
+        lr_each_step = _generate_steps_lr(lr_init, lr_max, total_steps, warmup_steps)
+    elif lr_decay_mode == 'poly':
+        lr_each_step = _generate_poly_lr(lr_init, lr_max, total_steps, warmup_steps)
+    elif lr_decay_mode == 'cosine':
+        lr_each_step = _generate_cosine_lr(lr_init, lr_max, total_steps, warmup_steps)
+    else:
+        lr_each_step = _generate_linear_lr(lr_init, lr_end, lr_max, total_steps, warmup_steps)
+
+    lr_each_step = np.array(lr_each_step).astype(np.float32)
+    return lr_each_step
+
+
+def linear_warmup_lr(current_step, warmup_steps, base_lr, init_lr):
+    lr_inc = (float(base_lr) - float(init_lr)) / float(warmup_steps)
+    lr = float(init_lr) + lr_inc * current_step
+    return lr
+
+
+def warmup_cosine_annealing_lr(lr, steps_per_epoch, warmup_epochs, max_epoch=120, global_step=0):
+    """
+    generate learning rate array with cosine
+
+    
Args: + lr(float): base learning rate + steps_per_epoch(int): steps size of one epoch + warmup_epochs(int): number of warmup epochs + max_epoch(int): total epochs of training + global_step(int): the current start index of lr array + Returns: + np.array, learning rate array + """ + base_lr = lr + warmup_init_lr = 0 + total_steps = int(max_epoch * steps_per_epoch) + warmup_steps = int(warmup_epochs * steps_per_epoch) + decay_steps = total_steps - warmup_steps + + lr_each_step = [] + for i in range(total_steps): + if i < warmup_steps: + lr = linear_warmup_lr(i + 1, warmup_steps, base_lr, warmup_init_lr) + else: + linear_decay = (total_steps - i) / decay_steps + cosine_decay = 0.5 * (1 + math.cos(math.pi * 2 * 0.47 * i / decay_steps)) + decayed = linear_decay * cosine_decay + 0.00001 + lr = base_lr * decayed + lr_each_step.append(lr) + + lr_each_step = np.array(lr_each_step).astype(np.float32) + learning_rate = lr_each_step[global_step:] + return learning_rate diff --git a/research/cv/RefineNet/src/loss.py b/research/cv/RefineNet/src/loss.py new file mode 100644 index 0000000000000000000000000000000000000000..6c9fbee2aeec44d7a7caa2bea3265d1cde083067 --- /dev/null +++ b/research/cv/RefineNet/src/loss.py @@ -0,0 +1,55 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""loss""" +from mindspore import Tensor +import mindspore.common.dtype as mstype +import mindspore.nn as nn +from mindspore.ops import operations as P + + +class SoftmaxCrossEntropyLoss(nn.Cell): + """SoftmaxCrossEntropyLoss""" + def __init__(self, num_cls=21, ignore_label=255): + super(SoftmaxCrossEntropyLoss, self).__init__() + self.one_hot = P.OneHot(axis=-1) + self.on_value = Tensor(1.0, mstype.float32) + self.off_value = Tensor(0.0, mstype.float32) + self.cast = P.Cast() + self.ce = nn.SoftmaxCrossEntropyWithLogits() + self.not_equal = P.NotEqual() + self.num_cls = num_cls + self.ignore_label = ignore_label + self.mul = P.Mul() + self.sum = P.ReduceSum(False) + self.div = P.RealDiv() + self.transpose = P.Transpose() + self.reshape = P.Reshape() + self.shape = P.Shape() + + def construct(self, logits, labels): + """construct""" + labels_int = self.cast(labels, mstype.int32) + labels_int = self.reshape(labels_int, (-1,)) + N, C = self.shape(logits)[0:2] + logits_ = self.reshape(logits, (N, C, -1)) + logits_ = self.transpose(logits_, (0, 2, 1)) + logits_ = self.reshape(logits_, (-1, C)) + weights = self.not_equal(labels_int, self.ignore_label) + weights = self.cast(weights, mstype.float32) + one_hot_labels = self.one_hot(labels_int, self.num_cls, self.on_value, self.off_value) + loss = self.ce(logits_, one_hot_labels) + loss = self.mul(weights, loss) + loss = self.div(self.sum(loss), self.sum(weights)) + return loss diff --git a/research/cv/RefineNet/src/refinenet.py b/research/cv/RefineNet/src/refinenet.py new file mode 100644 index 0000000000000000000000000000000000000000..d6bbee66e752b8f418e0017475de945ef605ea9b --- /dev/null +++ b/research/cv/RefineNet/src/refinenet.py @@ -0,0 +1,260 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""refinenet""" +import mindspore.nn as nn +import mindspore.ops as ops +from mindspore.common import set_seed +set_seed(1) + + +def conv3x3(in_planes, out_planes, stride=1, bias=False): + "3x3 convolution with padding" + return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, + pad_mode='pad', padding=1, has_bias=bias) + + +class CRPBlock(nn.Cell): + """chained residual pooling""" + def __init__(self, in_planes, out_planes, n_stages): + super(CRPBlock, self).__init__() + layers = [] + for i in range(n_stages): + if i == 0: + layer = conv3x3(in_planes, out_planes, stride=1, bias=False) + else: + layer = conv3x3(out_planes, out_planes, stride=1, bias=False) + layers.append(layer) + self.layers = nn.CellList(layers) + self.stride = 1 + self.n_stages = n_stages + self.maxpool = nn.MaxPool2d(kernel_size=5, stride=1, pad_mode='same') + + def construct(self, x): + top = x + for i in range(self.n_stages): + top = self.maxpool(top) + top = self.layers[i](top) + x = top + x + return x + + +class RCUBlock(nn.Cell): + """Residual Conv Unit""" + def __init__(self, in_planes, out_planes, n_blocks, n_stages): + super(RCUBlock, self).__init__() + layers = [] + for i in range(n_blocks): + seq = nn.SequentialCell([]) + for j in range(n_stages): + if j == 0: + relu1 = nn.ReLU() + if i == 0: + con1 = conv3x3(in_planes, out_planes, stride=1, bias=True) + else: + con1 = conv3x3(out_planes, out_planes, stride=1, bias=True) + seq.append(relu1) + seq.append(con1) + else: + relu2 = nn.ReLU() + con2 = conv3x3(out_planes, out_planes, stride=1, bias=False) + seq.append(relu2) + seq.append(con2) + layers.append(seq) + self.layers = nn.CellList(layers) + self.stride = 1 + self.n_blocks = n_blocks + self.n_stages = n_stages + + def construct(self, x): + for i in range(self.n_blocks): + residual = x + x = self.layers[i](x) + x += residual + return x + + +class Bottleneck(nn.Cell): + """bottleneck""" + expansion = 4 + + def __init__(self, inplanes, planes, stride=1, downsample=None, use_batch_statistics=False, weights_update=True): + super(Bottleneck, self).__init__() + self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, has_bias=False) + self.use_batch_statistics = use_batch_statistics + self.bn1 = nn.BatchNorm2d(planes, use_batch_statistics=self.use_batch_statistics) + self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, + pad_mode='pad', padding=1, has_bias=False) + self.bn2 = nn.BatchNorm2d(planes, use_batch_statistics=self.use_batch_statistics) + self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, has_bias=False) + self.bn3 = nn.BatchNorm2d(planes * 4, use_batch_statistics=self.use_batch_statistics) + self.relu = nn.ReLU() + self.downsample = downsample + self.stride = stride + if not weights_update: + self.conv1.weight.requires_grad = False + self.conv2.weight.requires_grad = False + self.conv3.weight.requires_grad = False + self.downsample[0].weight.requires_grad = False + + def construct(self, x): + """construct""" + residual = x + + out = self.conv1(x) + out = self.bn1(out) + out = self.relu(out) + out = self.conv2(out) + out = self.bn2(out) + out = self.relu(out) + out = self.conv3(out) + out = self.bn3(out) + if self.downsample is not None: + residual = self.downsample(x) + out += residual + out = self.relu(out) + return out + + +class RefineNet(nn.Cell): + """network""" + def __init__(self, block, layers, num_classes=21, use_batch_statistics=False): + self.inplanes = 64 + super(RefineNet, 
self).__init__() + self.do4 = nn.Dropout(keep_prob=1.0) + self.do3 = nn.Dropout(keep_prob=1.0) + self.do = nn.Dropout(keep_prob=1.0) + self.use_batch_statistics = use_batch_statistics + self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, pad_mode='pad', padding=3, + has_bias=False) + self.bn1 = nn.BatchNorm2d(64, use_batch_statistics=self.use_batch_statistics) + self.relu = nn.ReLU() + self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode='same') + self.layer1 = self._make_layer(block, 64, layers[0], weights_update=False) + self.layer2 = self._make_layer(block, 128, layers[1], stride=2, weights_update=False) + self.layer3 = self._make_layer(block, 256, layers[2], stride=2) + self.layer4 = self._make_layer(block, 512, layers[3], stride=2) + self.p_ims1d2_outl1_dimred = conv3x3(2048, 512, bias=False) + self.adapt_stage1_b = self._make_rcu(512, 512, 2, 2) + self.mflow_conv_g1_pool = self._make_crp(512, 512, 4) + self.mflow_conv_g1_b = self._make_rcu(512, 512, 3, 2) + self.mflow_conv_g1_b3_joint_varout_dimred = conv3x3(512, 256, bias=False) + self.p_ims1d2_outl2_dimred = conv3x3(1024, 256, bias=False) + self.adapt_stage2_b = self._make_rcu(256, 256, 2, 2) + self.adapt_stage2_b2_joint_varout_dimred = conv3x3(256, 256, bias=False) + self.mflow_conv_g2_pool = self._make_crp(256, 256, 4) + self.mflow_conv_g2_b = self._make_rcu(256, 256, 3, 2) + self.mflow_conv_g2_b3_joint_varout_dimred = conv3x3(256, 256, bias=False) + + self.p_ims1d2_outl3_dimred = conv3x3(512, 256, bias=False) + self.adapt_stage3_b = self._make_rcu(256, 256, 2, 2) + self.adapt_stage3_b2_joint_varout_dimred = conv3x3(256, 256, bias=False) + self.mflow_conv_g3_pool = self._make_crp(256, 256, 4) + self.mflow_conv_g3_b = self._make_rcu(256, 256, 3, 2) + self.mflow_conv_g3_b3_joint_varout_dimred = conv3x3(256, 256, bias=False) + + self.p_ims1d2_outl4_dimred = conv3x3(256, 256, bias=False) + self.adapt_stage4_b = self._make_rcu(256, 256, 2, 2) + self.adapt_stage4_b2_joint_varout_dimred = conv3x3(256, 256, bias=False) + self.mflow_conv_g4_pool = self._make_crp(256, 256, 4) + self.mflow_conv_g4_b = self._make_rcu(256, 256, 3, 2) + + self.clf_conv = nn.Conv2d(256, num_classes, kernel_size=3, stride=1, + pad_mode='pad', padding=1, has_bias=True) + self.resize = nn.ResizeBilinear() + + def _make_crp(self, in_planes, out_planes, stages): + """make_crp""" + layers = [CRPBlock(in_planes, out_planes, stages)] + return nn.SequentialCell(layers) + + def _make_rcu(self, in_planes, out_planes, blocks, stages): + """make_rcu""" + layers = [RCUBlock(in_planes, out_planes, blocks, stages)] + return nn.SequentialCell(layers) + + def _make_layer(self, block, planes, blocks, stride=1, weights_update=True): + """make different layer""" + downsample = None + if stride != 1 or self.inplanes != planes * block.expansion: + downsample = nn.SequentialCell([nn.Conv2d(self.inplanes, planes * block.expansion, kernel_size=1, + stride=stride, has_bias=False), + nn.BatchNorm2d(planes * block.expansion, + use_batch_statistics=self.use_batch_statistics)]) + layers = [] + layers.append(block(self.inplanes, planes, stride, downsample, weights_update=weights_update)) + self.inplanes = planes * block.expansion + for _ in range(1, blocks): + layers.append(block(self.inplanes, planes)) + return nn.SequentialCell(layers) + + def construct(self, x): + """construct""" + resize_shape = ops.Shape()(x)[2:] + + x = self.conv1(x) + x = self.bn1(x) + x = self.relu(x) + x = self.maxpool(x) + + l1 = self.layer1(x) + l2 = self.layer2(l1) + l3 = self.layer3(l2) + l4 = 
self.layer4(l3) + + l4 = self.do4(l4) + l3 = self.do3(l3) + + x4 = self.p_ims1d2_outl1_dimred(l4) + x4 = self.adapt_stage1_b(x4) + x4 = self.relu(x4) + x4 = self.mflow_conv_g1_pool(x4) + x4 = self.mflow_conv_g1_b(x4) + x4 = self.mflow_conv_g1_b3_joint_varout_dimred(x4) + resize_shape3 = ops.Shape()(l3)[2:] + x4 = self.resize(x4, resize_shape3, align_corners=True) + + x3 = self.p_ims1d2_outl2_dimred(l3) + x3 = self.adapt_stage2_b(x3) + x3 = self.adapt_stage2_b2_joint_varout_dimred(x3) + x3 = x3 + x4 + x3 = self.relu(x3) + x3 = self.mflow_conv_g2_pool(x3) + x3 = self.mflow_conv_g2_b(x3) + x3 = self.mflow_conv_g2_b3_joint_varout_dimred(x3) + resize_shape2 = ops.Shape()(l2)[2:] + x3 = self.resize(x3, resize_shape2, align_corners=True) + + x2 = self.p_ims1d2_outl3_dimred(l2) + x2 = self.adapt_stage3_b(x2) + x2 = self.adapt_stage3_b2_joint_varout_dimred(x2) + x2 = x2 + x3 + x2 = self.relu(x2) + x2 = self.mflow_conv_g3_pool(x2) + x2 = self.mflow_conv_g3_b(x2) + x2 = self.mflow_conv_g3_b3_joint_varout_dimred(x2) + resize_shape1 = ops.Shape()(l1)[2:] + x2 = self.resize(x2, size=resize_shape1, align_corners=True) + + x1 = self.p_ims1d2_outl4_dimred(l1) + x1 = self.adapt_stage4_b(x1) + x1 = self.adapt_stage4_b2_joint_varout_dimred(x1) + x1 = x1 + x2 + x1 = self.relu(x1) + x1 = self.mflow_conv_g4_pool(x1) + x1 = self.mflow_conv_g4_b(x1) + + logits = self.clf_conv(x1) + logits = self.resize(logits, resize_shape, align_corners=False) + return logits diff --git a/research/cv/RefineNet/src/tool/build_MRcd.py b/research/cv/RefineNet/src/tool/build_MRcd.py new file mode 100644 index 0000000000000000000000000000000000000000..63e790bab3d55572089aac46288ea70bdfe956b8 --- /dev/null +++ b/research/cv/RefineNet/src/tool/build_MRcd.py @@ -0,0 +1,68 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""build mindrecord""" +import os +import argparse +import numpy as np +from mindspore.mindrecord import FileWriter +from mindspore.common import set_seed + +seg_schema = {"file_name": {"type": "string"}, "label": {"type": "bytes"}, "data": {"type": "bytes"}} + + +def parse_args(): + parser = argparse.ArgumentParser('mindrecord') + parser.add_argument('--data_root', type=str, default='', help='root path of data') + parser.add_argument('--data_lst', type=str, default='', help='list of data') + parser.add_argument('--dst_path', type=str, default='', help='save path of mindrecords') + parser.add_argument('--num_shards', type=int, default=8, help='number of shards') + parser_args, _ = parser.parse_known_args() + return parser_args + + +if __name__ == '__main__': + args = parse_args() + + data = [] + with open(args.data_lst) as f: + lines = f.readlines() + set_seed(1000) + np.random.shuffle(lines) + dst_dir = '/'.join(args.dst_path.split('/')[:-1]) + if not os.path.exists(dst_dir): + os.makedirs(dst_dir) + + print('number of samples:', len(lines)) + writer = FileWriter(file_name=args.dst_path, shard_num=args.num_shards) + writer.add_schema(seg_schema, "seg_schema") + cnt = 0 + for l in lines: + img_path, label_path = l.strip().split(' ') + sample_ = {"file_name": img_path.split('/')[-1]} + with open(os.path.join(args.data_root, img_path), 'rb') as f: + sample_['data'] = f.read() + with open(os.path.join(args.data_root, label_path), 'rb') as f: + sample_['label'] = f.read() + data.append(sample_) + cnt += 1 + if cnt % 1000 == 0: + writer.write_raw_data(data) + print('number of samples written:', cnt) + data = [] + + if data: + writer.write_raw_data(data) + writer.commit() + print('number of samples written:', cnt) diff --git a/research/cv/RefineNet/src/tool/get_dataset_lst.py b/research/cv/RefineNet/src/tool/get_dataset_lst.py new file mode 100644 index 0000000000000000000000000000000000000000..9844b2754242b42b35a8d08ae675fe5542fd679d --- /dev/null +++ b/research/cv/RefineNet/src/tool/get_dataset_lst.py @@ -0,0 +1,157 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================
+"""Generate the image/label list files for VOC and SBD."""
+import argparse
+import os
+import numpy as np
+import scipy.io
+from PIL import Image
+
+parser = argparse.ArgumentParser('dataset list generator')
+parser.add_argument("--data_root", type=str, default='D:/datasets/', help='root path where the datasets are stored.')
+
+args, _ = parser.parse_known_args()
+
+data_dir = args.data_root
+print("Data dir is:", data_dir)
+
+# Pascal VOC 2012 paths
+VOC_IMG_DIR = os.path.join(data_dir, 'VOCdevkit/VOC2012/JPEGImages')
+VOC_ANNO_DIR = os.path.join(data_dir, 'VOCdevkit/VOC2012/SegmentationClass')
+VOC_ANNO_GRAY_DIR = os.path.join(data_dir, 'VOCdevkit/VOC2012/SegmentationClassGray')
+VOC_TRAIN_TXT = os.path.join(data_dir, 'VOCdevkit/VOC2012/ImageSets/Segmentation/train.txt')
+VOC_VAL_TXT = os.path.join(data_dir, 'VOCdevkit/VOC2012/ImageSets/Segmentation/val.txt')
+
+# Semantic Boundaries Dataset (SBD) paths
+SBD_ANNO_DIR = os.path.join(data_dir, 'benchmark_RELEASE/dataset/cls')
+SBD_IMG_DIR = os.path.join(data_dir, 'benchmark_RELEASE/dataset/img')
+SBD_ANNO_PNG_DIR = os.path.join(data_dir, 'benchmark_RELEASE/dataset/cls_png')
+SBD_ANNO_GRAY_DIR = os.path.join(data_dir, 'benchmark_RELEASE/dataset/cls_png_gray')
+SBD_TRAIN_TXT = os.path.join(data_dir, 'benchmark_RELEASE/dataset/train.txt')
+SBD_VAL_TXT = os.path.join(data_dir, 'benchmark_RELEASE/dataset/val.txt')
+
+VOC_TRAIN_LST_TXT = os.path.join(data_dir, 'voc_train_lst.txt')
+VOC_VAL_LST_TXT = os.path.join(data_dir, 'voc_val_lst.txt')
+VOC_AUG_TRAIN_LST_TXT = os.path.join(data_dir, 'sbd_train_lst.txt')
+
+
+def __get_data_list(data_list_file):
+    """read a split file and return its lines"""
+    with open(data_list_file, mode='r') as f:
+        return f.readlines()
+
+
+def conv_voc_colorpng_to_graypng():
+    """re-save the VOC palette PNGs so class indices are stored as gray values"""
+    if not os.path.exists(VOC_ANNO_GRAY_DIR):
+        os.makedirs(VOC_ANNO_GRAY_DIR)
+
+    for ann in os.listdir(VOC_ANNO_DIR):
+        ann_im = Image.open(os.path.join(VOC_ANNO_DIR, ann))
+        # np.array() keeps only the palette indices, so the round trip drops the color map
+        ann_im = Image.fromarray(np.array(ann_im))
+        ann_im.save(os.path.join(VOC_ANNO_GRAY_DIR, ann))
+
+
+def __gen_palette(cls_nums=256):
+    """generate the standard PASCAL VOC color palette"""
+    palette = np.zeros((cls_nums, 3), dtype=np.uint8)
+    for i in range(cls_nums):
+        lbl = i
+        j = 0
+        while lbl:
+            palette[i, 0] |= (((lbl >> 0) & 1) << (7 - j))
+            palette[i, 1] |= (((lbl >> 1) & 1) << (7 - j))
+            palette[i, 2] |= (((lbl >> 2) & 1) << (7 - j))
+            lbl >>= 3
+            j += 1
+    return palette.flatten()
+
+
+def conv_sbd_mat_to_png():
+    """convert the SBD .mat annotations to png"""
+    if not os.path.exists(SBD_ANNO_PNG_DIR):
+        os.makedirs(SBD_ANNO_PNG_DIR)
+    if not os.path.exists(SBD_ANNO_GRAY_DIR):
+        os.makedirs(SBD_ANNO_GRAY_DIR)
+
+    palette = __gen_palette()
+    for an in os.listdir(SBD_ANNO_DIR):
+        img_id = an[:-4]
+        mat = scipy.io.loadmat(os.path.join(SBD_ANNO_DIR, an))
+        anno = mat['GTcls'][0]['Segmentation'][0].astype(np.uint8)
+        anno_png = Image.fromarray(anno)
+        # save as gray png
+        anno_png.save(os.path.join(SBD_ANNO_GRAY_DIR, img_id + '.png'))
+        # save as color png using the palette
+        anno_png.putpalette(palette)
+        anno_png.save(os.path.join(SBD_ANNO_PNG_DIR, img_id + '.png'))
+
+
+def create_voc_train_lst_txt():
+    """create the voc train list txt"""
+    voc_train_data_lst = __get_data_list(VOC_TRAIN_TXT)
+    with open(VOC_TRAIN_LST_TXT, mode='w') as f:
+        for id_ in voc_train_data_lst:
+            id_ = id_.strip()
+            img_ = os.path.join(VOC_IMG_DIR, id_ + '.jpg')
+            anno_ = os.path.join(VOC_ANNO_GRAY_DIR, id_ + '.png')
+            f.write(img_ + ' ' + anno_ + '\n')
+
+
+def create_voc_val_lst_txt():
+    """create the voc val list txt"""
+    voc_val_data_lst = __get_data_list(VOC_VAL_TXT)
+    with open(VOC_VAL_LST_TXT, mode='w') as f:
+        for id_ in voc_val_data_lst:
+            id_ = id_.strip()
+            img_ = os.path.join(VOC_IMG_DIR, id_ + '.jpg')
+            anno_ = os.path.join(VOC_ANNO_GRAY_DIR, id_ + '.png')
+            f.write(img_ + ' ' + anno_ + '\n')
+
+
+def create_sbd_train_aug_lst_txt():
+    """create the sbd train list txt, excluding images that overlap with VOC"""
+    voc_train_data_lst = __get_data_list(VOC_TRAIN_TXT)
+    voc_val_data_lst = __get_data_list(VOC_VAL_TXT)
+
+    sbd_train_data_lst = __get_data_list(SBD_TRAIN_TXT)
+    sbd_val_data_lst = __get_data_list(SBD_VAL_TXT)
+
+    with open(VOC_AUG_TRAIN_LST_TXT, mode='w') as f:
+        for id_ in sbd_train_data_lst + sbd_val_data_lst:
+            # skip SBD samples that already appear in the VOC train/val splits
+            if id_ in voc_train_data_lst + voc_val_data_lst:
+                continue
+            id_ = id_.strip()
+            img_ = os.path.join(SBD_IMG_DIR, id_ + '.jpg')
+            anno_ = os.path.join(SBD_ANNO_GRAY_DIR, id_ + '.png')
+            f.write(img_ + ' ' + anno_ + '\n')
+
+
+if __name__ == '__main__':
+    print('converting voc color png to gray png ...')
+    conv_voc_colorpng_to_graypng()
+    print('converting done.')
+
+    create_voc_train_lst_txt()
+    print('voc train list generated.')
+
+    create_voc_val_lst_txt()
+    print('voc val list generated.')
+
+    print('converting sbd annotations to png ...')
+    conv_sbd_mat_to_png()
+    print('converting done.')
+
+    create_sbd_train_aug_lst_txt()
+    print('sbd train list generated.')
diff --git a/research/cv/RefineNet/train.py b/research/cv/RefineNet/train.py
new file mode 100644
index 0000000000000000000000000000000000000000..754487eb1395fdd9eaca23ec1d8fa0f13027259f
--- /dev/null
+++ b/research/cv/RefineNet/train.py
@@ -0,0 +1,160 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""train RefineNet"""
+import argparse
+import math
+from mindspore import Parameter, context
+from mindspore.train.model import Model
+import mindspore.nn as nn
+from mindspore.train.callback import ModelCheckpoint, CheckpointConfig
+from mindspore.train.serialization import load_checkpoint, load_param_into_net
+from mindspore.train.callback import LossMonitor, TimeMonitor
+from mindspore.train.loss_scale_manager import FixedLossScaleManager
+from mindspore.common import set_seed
+from mindspore.communication.management import init, get_rank, get_group_size
+from mindspore.common.initializer import initializer, HeUniform
+from mindspore.context import ParallelMode
+from src import dataset as data_generator
+from src import loss, learning_rates
+from src.refinenet import RefineNet, Bottleneck
+
+set_seed(1)
+
+
+def parse_args():
+    """parse_args"""
+    parser = argparse.ArgumentParser('MindSpore RefineNet training')
+    # dataset
+    parser.add_argument('--data_file', type=str, default='', help='path and name of one MindRecord file')
+    parser.add_argument('--batch_size', type=int, default=32, help='batch size')
+    parser.add_argument('--crop_size', type=int, default=513, help='crop size')
+    parser.add_argument('--image_mean', type=float, nargs=3, default=[123.675, 116.28, 103.53],
+                        help='image mean (RGB)')
+    parser.add_argument('--image_std', type=float, nargs=3, default=[58.395, 57.120, 57.375],
+                        help='image std (RGB)')
+    parser.add_argument('--min_scale', type=float, default=0.5, help='minimum scale of data augmentation')
+    parser.add_argument('--max_scale', type=float, default=3.0, help='maximum scale of data augmentation')
+    parser.add_argument('--ignore_label', type=int, default=255, help='ignore label')
+    parser.add_argument('--num_classes', type=int, default=21, help='number of classes')
+
+    # optimizer
+    parser.add_argument('--train_epochs', type=int, default=200, help='number of training epochs')
+    parser.add_argument('--warmup_epochs', type=int, default=10, help='number of warmup epochs')
+    parser.add_argument('--lr_type', type=str, default='cos', help='type of learning rate schedule')
+    parser.add_argument('--base_lr', type=float, default=0.0015, help='base learning rate')
+    parser.add_argument('--lr_decay_step', type=int, default=450, help='learning rate decay step')
+    parser.add_argument('--lr_decay_rate', type=float, default=0.8, help='learning rate decay rate')
+    parser.add_argument('--loss_scale', type=float, default=1024.0, help='loss scale')
+
+    # model
+    parser.add_argument('--model', type=str, default='refinenet', help='select model')
+    parser.add_argument('--freeze_bn', action='store_true', help='freeze batch norm')
+    parser.add_argument('--ckpt_pre_trained', type=str, default='', help='path of the pretrained checkpoint')
+
+    # train
+    parser.add_argument('--device_target', type=str, default='Ascend', choices=['Ascend', 'CPU'],
+                        help='device where the code will be implemented. (Default: Ascend)')
+    parser.add_argument('--device_id', type=str, default='0', choices=['0', '1', '2', '3', '4', '5', '6', '7'],
+                        help='id of the device to run on')
+    parser.add_argument('--is_distributed', action='store_true', help='distributed training')
+    parser.add_argument('--rank', type=int, default=0, help='local rank of distributed training')
+    parser.add_argument('--group_size', type=int, default=1, help='world size of distributed training')
+    parser.add_argument('--save_epochs', type=int, default=5, help='epoch interval between checkpoints')
+    parser.add_argument('--keep_checkpoint_max', type=int, default=200, help='max number of checkpoints to keep')
+    parser.add_argument('--data_lst', type=str, default='', help='list of val data')
+    args, _ = parser.parse_known_args()
+    return args
+
+
+def weights_init(net):
+    """initialize conv weights with HeUniform"""
+    for _, cell in net.cells_and_names():
+        if isinstance(cell, nn.Conv2d):
+            cell.weight = Parameter(initializer(HeUniform(negative_slope=math.sqrt(5)), cell.weight.shape,
+                                                cell.weight.dtype), name=cell.weight.name)
+
+
+def train():
+    """train"""
+    args = parse_args()
+    context.set_context(mode=context.GRAPH_MODE, enable_auto_mixed_precision=True, save_graphs=False,
+                        device_target=args.device_target, device_id=int(args.device_id))
+
+    if args.is_distributed:
+        init()
+        args.rank = get_rank()
+        args.group_size = get_group_size()
+        parallel_mode = ParallelMode.DATA_PARALLEL
+        context.set_auto_parallel_context(parallel_mode=parallel_mode, gradients_mean=True, device_num=args.group_size)
+    # dataset
+    dataset = data_generator.SegDataset(image_mean=args.image_mean,
+                                        image_std=args.image_std,
+                                        data_file=args.data_file,
+                                        batch_size=args.batch_size,
+                                        crop_size=args.crop_size,
+                                        max_scale=args.max_scale,
+                                        min_scale=args.min_scale,
+                                        ignore_label=args.ignore_label,
+                                        num_classes=args.num_classes,
+                                        num_readers=2,
+                                        num_parallel_calls=4,
+                                        shard_id=args.rank,
+                                        shard_num=args.group_size,
+                                        )
+    dataset = dataset.get_dataset1()
+    network = RefineNet(Bottleneck, [3, 4, 23, 3], args.num_classes)
+
+    # loss
+    loss_ = loss.SoftmaxCrossEntropyLoss(args.num_classes, args.ignore_label)
+    weights_init(network)
+    if args.ckpt_pre_trained:
+        param_dict = load_checkpoint(args.ckpt_pre_trained)
+        load_param_into_net(network, param_dict)
+
+    # optimizer
+    iters_per_epoch = dataset.get_dataset_size()
+    total_train_steps = iters_per_epoch * args.train_epochs
+    if args.lr_type == 'cos':
+        lr_iter = learning_rates.cosine_lr(args.base_lr, total_train_steps, total_train_steps)
+    elif args.lr_type == 'poly':
+        lr_iter = learning_rates.poly_lr(args.base_lr, total_train_steps, total_train_steps, end_lr=0.0, power=0.9)
+    elif args.lr_type == 'exp':
+        lr_iter = learning_rates.exponential_lr(args.base_lr, args.lr_decay_step, args.lr_decay_rate,
+                                                total_train_steps, staircase=True)
+    elif args.lr_type == 'cos_warmup':
+        lr_iter = learning_rates.warmup_cosine_annealing_lr(args.base_lr, iters_per_epoch,
+                                                            args.warmup_epochs, args.train_epochs)
+    else:
+        raise ValueError('unknown learning rate type')
+    opt = nn.Momentum(params=network.trainable_params(), learning_rate=lr_iter, momentum=0.9, weight_decay=0.0005,
+                      loss_scale=args.loss_scale)
+
+    # loss scale: full fp16 (O3) on Ascend, plain fp32 (O0) on CPU
+    manager_loss_scale = FixedLossScaleManager(args.loss_scale, drop_overflow_update=False)
+    amp_level = "O0" if args.device_target == "CPU" else "O3"
+    model = Model(network, loss_, optimizer=opt, amp_level=amp_level, loss_scale_manager=manager_loss_scale)
+
+    # callbacks for monitoring and saving ckpts
+    time_cb = TimeMonitor(data_size=iters_per_epoch)
+    loss_cb = LossMonitor()
+    cbs = [time_cb, loss_cb]
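+    # save checkpoints only on rank 0 to avoid duplicate files in distributed runs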
+ if args.rank == 0: + config_ck = CheckpointConfig(save_checkpoint_steps=args.save_epochs*iters_per_epoch, + keep_checkpoint_max=args.keep_checkpoint_max) + ckpoint_cb = ModelCheckpoint(prefix=args.model, directory="./ckpt_"+str(args.rank), config=config_ck) + cbs.append(ckpoint_cb) + model.train(args.train_epochs, dataset, callbacks=cbs, dataset_sink_mode=(args.device_target != "CPU")) + + +if __name__ == '__main__': + train()