Skip to content
Snippets Groups Projects
Commit b0eb47dd authored by zhaoting's avatar zhaoting
Browse files

delete deeplabv3+ in official

parent f817d065
No related branches found
No related tags found
No related merge requests found
Showing
with 0 additions and 1472 deletions
# 目录
<!-- TOC -->
- [目录](#目录)
- [DeepLabV3+描述](#deeplabv3+描述)
- [描述](#描述)
- [模型架构](#模型架构)
- [数据集](#数据集)
- [特性](#特性)
- [混合精度](#混合精度)
- [环境要求](#环境要求)
- [快速入门](#快速入门)
- [脚本说明](#脚本说明)
- [脚本及样例代码](#脚本及样例代码)
- [脚本参数](#脚本参数)
- [训练过程](#训练过程)
- [用法](#用法)
- [Ascend处理器环境运行](#ascend处理器环境运行)
- [结果](#结果)
- [评估过程](#评估过程)
- [用法](#用法-1)
- [Ascend处理器环境运行](#ascend处理器环境运行-1)
- [结果](#结果-1)
- [训练准确率](#训练准确率)
- [模型描述](#模型描述)
- [性能](#性能)
- [评估性能](#评估性能)
- [随机情况说明](#随机情况说明)
- [ModelZoo主页](#modelzoo主页)
<!-- /TOC -->
# DeepLabV3+描述
## 描述
DeepLab是一系列图像语义分割模型,DeepLabv3+通过encoder-decoder进行多尺度信息的融合,同时保留了原来的空洞卷积和ASSP层,
其骨干网络使用了Resnet模型,提高了语义分割的健壮性和运行速率。
有关网络详细信息,请参阅[论文][1]
`Chen, Liang-Chieh, et al. "Encoder-decoder with atrous separable convolution for semantic image segmentation." Proceedings of the European conference on computer vision (ECCV). 2018.`
[1]: https://arxiv.org/abs/1802.02611
# 模型架构
以ResNet-101为骨干,通过encoder-decoder进行多尺度信息的融合,使用空洞卷积进行密集特征提取。
# 数据集
Pascal VOC数据集和语义边界数据集(Semantic Boundaries Dataset,SBD)
- 下载分段数据集。
- 准备Backbone模型
准备resnet101模型,点此下载(https://download.mindspore.cn/model_zoo/r1.2/resnet101_ascend_v120_imagenet2012_official_cv_bs32_acc78/resnet101_ascend_v120_imagenet2012_official_cv_bs32_acc78.ckpt).
- 准备训练数据清单文件。清单文件用于保存图片和标注对的相对路径。如下:
```text
VOCdevkit/VOC2012/JPEGImages/2007_000032.jpg VOCdevkit/VOC2012/SegmentationClassGray/2007_000032.png
VOCdevkit/VOC2012/JPEGImages/2007_000039.jpg VOCdevkit/VOC2012/SegmentationClassGray/2007_000039.png
VOCdevkit/VOC2012/JPEGImages/2007_000063.jpg VOCdevkit/VOC2012/SegmentationClassGray/2007_000063.png
VOCdevkit/VOC2012/JPEGImages/2007_000068.jpg VOCdevkit/VOC2012/SegmentationClassGray/2007_000068.png
......
```
你也可以通过运行脚本:`python get_dataset_list.py --data_root=/PATH/TO/DATA` 来自动生成数据清单文件。
- 配置并运行get_dataset_mindrecord.sh,将数据集转换为MindRecords。scripts/get_dataset_mindrecord.sh中的参数:
```
--data_root 训练数据的根路径
--data_lst 训练数据列表(如上准备)
--dst_path MindRecord所在路径
--num_shards MindRecord的分片数
--shuffle 是否混洗
```
# 特性
## 混合精度
采用[混合精度](https://www.mindspore.cn/tutorials/experts/zh-CN/master/others/mixed_precision.html)的训练方法使用支持单精度和半精度数据来提高深度学习神经网络的训练速度,同时保持单精度训练所能达到的网络精度。混合精度训练提高计算速度、减少内存使用的同时,支持在特定硬件上训练更大的模型或实现更大批次的训练。
以FP16算子为例,如果输入数据类型为FP32,MindSpore后台会自动降低精度来处理数据。用户可打开INFO日志,搜索“reduce precision”查看精度降低的算子。
# 环境要求
- 硬件(Ascend)
- 准备Ascend处理器搭建硬件环境。
- 框架
- [MindSpore](https://www.mindspore.cn/install)
- 如需查看详情,请参见如下资源:
- [MindSpore教程](https://www.mindspore.cn/tutorials/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/master/index.html)
- 安装requirements.txt中的python包。
- 生成config json文件用于8卡训练。
```
# 从项目根目录进入
cd src/tools/
python3 get_multicards_json.py 10.111.*.*
# 10.111.*.*为计算机IP地址
```
# 快速入门
通过官方网站安装MindSpore后,您可以按照如下步骤进行训练和评估:
- Ascend处理器环境运行
在DeepLabV3+原始论文的基础上,我们对VOCaug(也称为trainaug)数据集进行了两次训练实验,并对voc val数据集进行了评估。
运行以下训练脚本配置单卡训练参数:
```bash
进入shell脚本修改 data_file和ckpt_pre_trained参数
# example:
data_file=/home/DataSet/VOC2012/vocaug_mindrecords/vocaug.mindrecord0
ckpt_pre_trained=/home/model/deeplabv3/predtrained/resnet101_ascend_v120_imagenet2012_official_cv_bs32_acc78.ckpt
bash run_alone_train.sh
```
按照以下训练步骤进行8卡训练:
1.使用VOCaug数据集训练s16,微调ResNet-101预训练模型。脚本如下:
```bash
进入shell脚本修改 data_file和ckpt_pre_trained参数
# example:
data_file=/home/DataSet/VOC2012/vocaug_mindrecords/vocaug.mindrecord0
ckpt_pre_trained=/home/model/deeplabv3/predtrained/resnet101_ascend_v120_imagenet2012_official_cv_bs32_acc78.ckpt
bash run_distribute_train_s16_r1.sh ~/hccl_8p.json
```
2.使用VOCaug数据集训练s8,微调上一步的模型。脚本如下:
```bash
进入shell脚本修改 data_file和ckpt_pre_trained参数
# example:
data_file=/home/DataSet/VOC2012/vocaug_mindrecords/vocaug.mindrecord0
ckpt_pre_trained=/home/model/deeplabv3/predtrained/resnet101_ascend_v120_imagenet2012_official_cv_bs32_acc78.ckpt
bash run_distribute_train_s8_r1.sh ~/hccl_8p.json
```
3.使用VOCtrain数据集训练s8,微调上一步的模型。脚本如下:
```bash
进入shell脚本修改 data_file和ckpt_pre_trained参数
注意:本次训练预训练权重采用上一步训练的权重文件,数据集也有所变化
# example:
data_file=/home/DataSet/VOC2012/voctrain_mindrecords/votrain.mindrecord0
ckpt_pre_trained=/home/model/deeplabv3/ckpt/deeplabv3-800_330.ckpt
bash run_distribute_train_s8_r2.sh ~/hccl_8p.json
```
评估步骤如下:
1. 进入对应的shell脚本修改参数
```shell
# example:
data_root=/home/DataSet/VOC2012
data_lst=/home/DataSet/VOC2012/voc_val_lst.txt
ckpt_path=/home/model/deeplabv3/ckpt/deeplabv3-800_330.ckpt
```
2.使用voc val数据集评估s16。评估脚本如下:
```bash
bash run_eval_s16.sh
```
3.使用voc val数据集评估多尺度s16。评估脚本如下:
```bash
bash run_eval_s16_multiscale.sh
```
4.使用voc val数据集评估多尺度和翻转s16。评估脚本如下:
```bash
bash run_eval_s16_multiscale_flip.sh
```
5.使用voc val数据集评估s8。评估脚本如下:
```bash
bash run_eval_s8.sh
```
6.使用voc val数据集评估多尺度s8。评估脚本如下:
```bash
bash run_eval_s8_multiscale.sh
```
7.使用voc val数据集评估多尺度和翻转s8。评估脚本如下:
```bash
bash run_eval_s8_multiscale_flip.sh
```
# 脚本说明
## 脚本及样例代码
```shell
.
└──deeplabv3plus
├── script
├── get_dataset_mindrecord.sh # 将原始数据转换为MindRecord数据集
├── run_alone_train.sh # 启动Ascend单机训练(单卡)
├── run_distribute_train_s16_r1.sh # 使用s16结构的VOCaug数据集启动Ascend分布式训练(8卡)
├── run_distribute_train_s8_r1.sh # 使用s8结构的VOCaug数据集启动Ascend分布式训练(8卡)
├── run_distribute_train_s8_r2.sh # 使用s8结构的VOCtrain数据集启动Ascend分布式训练(8卡)
├── run_eval_s16.sh # 使用s16结构启动Ascend评估
├── run_eval_s16_multiscale.sh # 使用多尺度s16结构启动Ascend评估
├── run_eval_s16_multiscale_filp.sh # 使用多尺度和翻转s16结构启动Ascend评估
├── run_eval_s8.sh # 使用s8结构启动Ascend评估
├── run_eval_s8_multiscale.sh # 使用多尺度s8结构启动Ascend评估
├── run_eval_s8_multiscale_filp.sh # 使用多尺度和翻转s8结构启动Ascend评估
├── src
├── tools
├── get_dataset_list.py # 获取数据清单文件
├── get_dataset_mindrecord.py # 获取MindRecord文件
├── get_multicards_json.py # 获取rank table文件
├── get_pretrained_model.py # 获取resnet预训练模型
├── dataset.py # 数据预处理
├── deeplab_v3plus.py # DeepLabV3+网络结构
├── learning_rates.py # 生成学习率
├── loss.py # DeepLabV3+的损失定义
├── eval.py # 评估网络
├── train.py # 训练网络
├──requirements.txt # requirements文件
└──README.md
```
## 脚本参数
默认配置
```bash
"data_file":"/PATH/TO/MINDRECORD_NAME" # 数据集路径
"device_target":Ascend # 训练后端类型
"train_epochs":300 # 总轮次数
"batch_size":32 # 输入张量的批次大小
"crop_size":513 # 裁剪大小
"base_lr":0.08 # 初始学习率
"lr_type":cos # 用于生成学习率的衰减模式
"min_scale":0.5 # 数据增强的最小尺度
"max_scale":2.0 # 数据增强的最大尺度
"ignore_label":255 # 忽略标签
"num_classes":21 # 类别数
"model":DeepLabV3plus_s16 # 选择模型
"ckpt_pre_trained":"/PATH/TO/PRETRAIN_MODEL" # 加载预训练检查点的路径
"is_distributed": # 分布式训练,设置该参数为True
"save_steps":410 # 用于保存的迭代间隙
"freeze_bn": # 设置该参数freeze_bn为True
"keep_checkpoint_max":200 # 用于保存的最大检查点
```
## 训练过程
### 用法
#### Ascend处理器环境运行
在DeepLabV3+原始论文的基础上,我们对vocaug(也称为trainaug)数据集进行了两次训练实验,并对voc val数据集进行了评估。
运行以下训练脚本配置单卡训练参数:
```bash
# run_alone_train.sh
python ${train_code_path}/train.py --data_file=/PATH/TO/MINDRECORD_NAME \
--train_dir=${train_path}/ckpt \
--train_epochs=200 \
--batch_size=32 \
--crop_size=513 \
--base_lr=0.015 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--save_steps=1500 \
--keep_checkpoint_max=200 >log 2>&1 &
```
按照以下训练步骤进行8卡训练:
1.使用VOCaug数据集训练s16,微调ResNet-101预训练模型。脚本如下:
```bash
# run_distribute_train_s16_r1.sh
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=$i
export DEVICE_ID=`expr $i + $RANK_START_ID`
echo 'start rank='$i', device id='$DEVICE_ID'...'
mkdir ${train_path}/device$DEVICE_ID
cd ${train_path}/device$DEVICE_ID
ython ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
--data_file=/PATH/TO/MINDRECORD_NAME \
--train_epochs=300 \
--batch_size=32 \
--crop_size=513 \
--base_lr=0.08 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=410 \
--keep_checkpoint_max=200 >log 2>&1 &
done
```
2.使用VOCaug数据集训练s8,微调上一步的模型。脚本如下:
```bash
# run_distribute_train_s8_r1.sh
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=$i
export DEVICE_ID=`expr $i + $RANK_START_ID`
echo 'start rank='$i', device id='$DEVICE_ID'...'
mkdir ${train_path}/device$DEVICE_ID
cd ${train_path}/device$DEVICE_ID
python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
--data_file=/PATH/TO/MINDRECORD_NAME \
--train_epochs=800 \
--batch_size=16 \
--crop_size=513 \
--base_lr=0.02 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--loss_scale=2048 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=820 \
--keep_checkpoint_max=200 >log 2>&1 &
done
```
3.使用VOCtrain数据集训练s8,微调上一步的模型。脚本如下:
```bash
# run_distribute_train_s8_r2.sh
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=$i
export DEVICE_ID=`expr $i + $RANK_START_ID`
echo 'start rank='$i', device id='$DEVICE_ID'...'
mkdir ${train_path}/device$DEVICE_ID
cd ${train_path}/device$DEVICE_ID
python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
--data_file=/PATH/TO/MINDRECORD_NAME \
--train_epochs=300 \
--batch_size=16 \
--crop_size=513 \
--base_lr=0.008 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--loss_scale=2048 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=110 \
--keep_checkpoint_max=200 >log 2>&1 &
done
```
#### ModelArts环境运行
按以下样例配置训练参数,启动ModelArts训练:
```shell
python train.py --train_url=/PATH/TO/OUTPUT_DIR \
--data_url=/PATH/TO/MINDRECORD \
--model=DeepLabV3plus_s16 \
--modelArts_mode=True \
--dataset_filename=MINDRECORD_NAME \
--pretrainedmodel_filename=PRETRAIN_MODELNAME \
--train_epochs=300 \
--batch_size=32 \
--crop_size=513 \
--base_lr=0.08 \
--lr_type=cos \
--save_steps=410 \
```
### 结果
#### Ascend处理器环境运行
- 使用s16结构训练VOCaug
```bash
# 分布式训练结果(8P)
epoch: 1 step: 41, loss is 0.81338423
epoch time: 202199.339 ms, per step time: 4931.691 ms
epoch: 2 step: 41, loss is 0.34089813
epoch time: 23811.338 ms, per step time: 580.764 ms
epoch: 3 step: 41, loss is 0.32335973
epoch time: 23794.863 ms, per step time: 580.363 ms
epoch: 4 step: 41, loss is 0.18254203
epoch time: 23796.674 ms, per step time: 580.407 ms
epoch: 5 step: 41, loss is 0.27708685
epoch time: 23794.654 ms, per step time: 580.357 ms
epoch: 6 step: 41, loss is 0.37388346
epoch time: 23845.658 ms, per step time: 581.601 ms
...
```
- 使用s8结构训练VOCaug
```bash
# 分布式训练结果(8P)
epoch: 1 step: 82, loss is 0.073864505
epoch time: 226610.999 ms, per step time: 2763.549 ms
epoch: 2 step: 82, loss is 0.06908825
epoch time: 44474.187 ms, per step time: 542.368 ms
epoch: 3 step: 82, loss is 0.059860937
epoch time: 44485.142 ms, per step time: 542.502 ms
epoch: 4 step: 82, loss is 0.084193744
epoch time: 44472.924 ms, per step time: 542.353 ms
epoch: 5 step: 82, loss is 0.072242916
epoch time: 44466.738 ms, per step time: 542.277 ms
epoch: 6 step: 82, loss is 0.04948996
epoch time: 44474.549 ms, per step time: 542.373 ms
...
```
- 使用s8结构训练VOCtrain
```bash
# 分布式训练结果(8P)
epoch: 1 step: 11, loss is 0.0055908263
epoch time: 183966.044 ms, per step time: 16724.186 ms
epoch: 2 step: 11, loss is 0.008914589
epoch time: 5985.108 ms, per step time: 544.101 ms
epoch: 3 step: 11, loss is 0.0073758443
epoch time: 5977.932 ms, per step time: 543.448 ms
epoch: 4 step: 11, loss is 0.00677738
epoch time: 5978.866 ms, per step time: 543.533 ms
epoch: 5 step: 11, loss is 0.0053799236
epoch time: 5987.879 ms, per step time: 544.353 ms
epoch: 6 step: 11, loss is 0.0049248594
epoch time: 5979.642 ms, per step time: 543.604 ms
...
```
#### ModelArts环境运行
- 使用s16结构训练VOCaug
```bash
epoch: 1 step: 41, loss is 0.6122837
epoch: 2 step: 41, loss is 0.4066103
epoch: 3 step: 41, loss is 0.3504579
...
```
## 评估过程
### 用法
#### Ascend处理器环境运行
使用--ckpt_path配置检查点,运行脚本,在eval_path/eval_log中打印mIOU。
```bash
./run_eval_s16.sh # 测试s16
./run_eval_s16_multiscale.sh # 测试s16 + 多尺度
./run_eval_s16_multiscale_flip.sh # 测试s16 + 多尺度 + 翻转
./run_eval_s8.sh # 测试s8
./run_eval_s8_multiscale.sh # 测试s8 + 多尺度
./run_eval_s8_multiscale_flip.sh # 测试s8 + 多尺度 + 翻转
```
测试脚本示例如下:
```bash
python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--data_lst=/PATH/TO/DATA_lst.txt \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--scales=0.5 \
--scales=0.75 \
--scales=1.0 \
--scales=1.25 \
--scales=1.75 \
--flip \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
```
### 结果
运行适用的训练脚本获取结果。要获得相同的结果,请按照快速入门中的步骤操作。
#### 训练准确率
| **网络** | OS=16 | OS=8 | MS |翻转| mIOU |论文中的mIOU |
| :----------: | :-----: | :----: | :----: | :-----: | :-----: | :-------------: |
| deeplab_v3+ | √ | | | | 79.78 | 78.85 |
| deeplab_v3+ | √ | | √ | | 80.59 |80.09 |
| deeplab_v3+ | √ | | √ | √ | 80.76 | 80.22 |
| deeplab_v3+ | | √ | | | 79.56 | 79.35 |
| deeplab_v3+ | | √ | √ | | 80.43 |80.43 |
| deeplab_v3+ | | √ | √ | √ | 80.69 | 80.57 |
注意:OS指输出步长(output stride), MS指多尺度(multiscale)。
# 模型描述
## 性能
### 评估性能
| 参数 | Ascend 910|
| -------------------------- | -------------------------------------- |
| 模型版本 | DeepLabV3+ |
| 资源 | Ascend 910 |
| 上传日期 | 2021-03-16 |
| MindSpore版本 | 1.1.1 |
| 数据集 | PASCAL VOC2012 + SBD |
| 训练参数 | epoch = 300, batch_size = 32 (s16_r1) epoch = 800, batch_size = 16 (s8_r1) epoch = 300, batch_size = 16 (s8_r2) |
| 优化器 | Momentum |
| 损失函数 | Softmax交叉熵 |
| 输出 | 概率 |
| 损失 | 0.0041095633 |
| 性能 | 187736.386 ms(单卡,s16)<br> 44474.187 ms(八卡,s16) |
| 微调检查点 | 453M (.ckpt文件) |
| 脚本 | [链接](https://gitee.com/mindspore/models/tree/master/official/cv/deeplabv3plus) |
# 随机情况说明
dataset.py中设置了“create_dataset”函数内的种子,同时还使用了train.py中的随机种子。
# ModelZoo主页
请浏览官网[主页](https://gitee.com/mindspore/models)
import os
import argparse
import numpy as np
import cv2
from mindspore import Tensor
import mindspore.common.dtype as mstype
import mindspore.nn as nn
from mindspore import context
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from src.deeplab_v3plus import DeepLabV3Plus
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", save_graphs=False,
device_id=int(os.getenv('DEVICE_ID')))
def parse_args():
parser = argparse.ArgumentParser('MindSpore DeepLabV3+ eval')
# val data
parser.add_argument('--data_root', type=str, default='', help='root path of val data')
parser.add_argument('--data_lst', type=str, default='', help='list of val data')
parser.add_argument('--batch_size', type=int, default=16, help='batch size')
parser.add_argument('--crop_size', type=int, default=513, help='crop size')
parser.add_argument('--image_mean', type=list, default=[103.53, 116.28, 123.675], help='image mean')
parser.add_argument('--image_std', type=list, default=[57.375, 57.120, 58.395], help='image std')
parser.add_argument('--scales', type=float, action='append', help='scales of evaluation')
parser.add_argument('--flip', action='store_true', help='perform left-right flip')
parser.add_argument('--ignore_label', type=int, default=255, help='ignore label')
parser.add_argument('--num_classes', type=int, default=21, help='number of classes')
# model
parser.add_argument('--model', type=str, default='', help='select model')
parser.add_argument('--freeze_bn', action='store_true', default=False, help='freeze bn')
parser.add_argument('--ckpt_path', type=str, default='', help='model to evaluate')
args, _ = parser.parse_known_args()
return args
def cal_hist(a, b, n):
k = (a >= 0) & (a < n)
return np.bincount(n * a[k].astype(np.int32) + b[k], minlength=n ** 2).reshape(n, n)
def resize_long(img, long_size=513):
h, w, _ = img.shape
if h > w:
new_h = long_size
new_w = int(1.0 * long_size * w / h)
else:
new_w = long_size
new_h = int(1.0 * long_size * h / w)
imo = cv2.resize(img, (new_w, new_h))
return imo
class BuildEvalNetwork(nn.Cell):
def __init__(self, network):
super(BuildEvalNetwork, self).__init__()
self.network = network
self.softmax = nn.Softmax(axis=1)
def construct(self, input_data):
output = self.network(input_data)
output = self.softmax(output)
return output
def pre_process(args, img_, crop_size=513):
# resize
img_ = resize_long(img_, crop_size)
resize_h, resize_w, _ = img_.shape
# mean, std
image_mean = np.array(args.image_mean)
image_std = np.array(args.image_std)
img_ = (img_ - image_mean) / image_std
# pad to crop_size
pad_h = crop_size - img_.shape[0]
pad_w = crop_size - img_.shape[1]
if pad_h > 0 or pad_w > 0:
img_ = cv2.copyMakeBorder(img_, 0, pad_h, 0, pad_w, cv2.BORDER_CONSTANT, value=0)
# hwc to chw
img_ = img_.transpose((2, 0, 1))
return img_, resize_h, resize_w
def eval_batch(args, eval_net, img_lst, crop_size=513, flip=True):
result_lst = []
batch_size = len(img_lst)
batch_img = np.zeros((args.batch_size, 3, crop_size, crop_size), dtype=np.float32)
resize_hw = []
for l in range(batch_size):
img_ = img_lst[l]
img_, resize_h, resize_w = pre_process(args, img_, crop_size)
batch_img[l] = img_
resize_hw.append([resize_h, resize_w])
batch_img = np.ascontiguousarray(batch_img)
net_out = eval_net(Tensor(batch_img, mstype.float32))
net_out = net_out.asnumpy()
if flip:
batch_img = batch_img[:, :, :, ::-1]
net_out_flip = eval_net(Tensor(batch_img, mstype.float32))
net_out += net_out_flip.asnumpy()[:, :, :, ::-1]
for bs in range(batch_size):
probs_ = net_out[bs][:, :resize_hw[bs][0], :resize_hw[bs][1]].transpose((1, 2, 0))
ori_h, ori_w = img_lst[bs].shape[0], img_lst[bs].shape[1]
probs_ = cv2.resize(probs_, (ori_w, ori_h))
result_lst.append(probs_)
return result_lst
def eval_batch_scales(args, eval_net, img_lst, scales,
base_crop_size=513, flip=True):
sizes_ = [int((base_crop_size - 1) * sc) + 1 for sc in scales]
probs_lst = eval_batch(args, eval_net, img_lst, crop_size=sizes_[0], flip=flip)
print(sizes_)
for crop_size_ in sizes_[1:]:
probs_lst_tmp = eval_batch(args, eval_net, img_lst, crop_size=crop_size_, flip=flip)
for pl, _ in enumerate(probs_lst):
probs_lst[pl] += probs_lst_tmp[pl]
result_msk = []
for i in probs_lst:
result_msk.append(i.argmax(axis=2))
return result_msk
def net_eval():
args = parse_args()
# data list
with open(args.data_lst) as f:
img_lst = f.readlines()
# network
if args.model == 'DeepLabV3plus_s16':
network = DeepLabV3Plus('eval', args.num_classes, 16, args.freeze_bn)
elif args.model == 'DeepLabV3plus_s8':
network = DeepLabV3Plus('eval', args.num_classes, 8, args.freeze_bn)
else:
raise NotImplementedError('model [{:s}] not recognized'.format(args.model))
eval_net = BuildEvalNetwork(network)
# load model
param_dict = load_checkpoint(args.ckpt_path)
load_param_into_net(eval_net, param_dict)
eval_net.set_train(False)
# evaluate
hist = np.zeros((args.num_classes, args.num_classes))
batch_img_lst = []
batch_msk_lst = []
bi = 0
image_num = 0
for i, line in enumerate(img_lst):
img_path, msk_path = line.strip().split(' ')
img_path = os.path.join(args.data_root, img_path)
msk_path = os.path.join(args.data_root, msk_path)
img_ = cv2.imread(img_path)
msk_ = cv2.imread(msk_path, cv2.IMREAD_GRAYSCALE)
batch_img_lst.append(img_)
batch_msk_lst.append(msk_)
bi += 1
if bi == args.batch_size:
batch_res = eval_batch_scales(args, eval_net, batch_img_lst, scales=args.scales,
base_crop_size=args.crop_size, flip=args.flip)
for mi in range(args.batch_size):
hist += cal_hist(batch_msk_lst[mi].flatten(), batch_res[mi].flatten(), args.num_classes)
bi = 0
batch_img_lst = []
batch_msk_lst = []
print('processed {} images'.format(i+1))
image_num = i
if bi > 0:
batch_res = eval_batch_scales(args, eval_net, batch_img_lst, scales=args.scales,
base_crop_size=args.crop_size, flip=args.flip)
for mi in range(bi):
hist += cal_hist(batch_msk_lst[mi].flatten(), batch_res[mi].flatten(), args.num_classes)
print('processed {} images'.format(image_num + 1))
print(hist)
iu = np.diag(hist) / (hist.sum(1) + hist.sum(0) - np.diag(hist))
print('per-class IoU', iu)
print('mean IoU', np.nanmean(iu))
if __name__ == '__main__':
net_eval()
"""export AIR file."""
import argparse
import numpy as np
from mindspore import Tensor, context, load_checkpoint, load_param_into_net, export
from src.deeplab_v3plus import DeepLabV3Plus
context.set_context(mode=context.GRAPH_MODE, save_graphs=False)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='checkpoint export')
parser.add_argument('--checkpoint', type=str, default='', help='checkpoint of deeplabv3 (Default: None)')
parser.add_argument('--model', type=str, default='DeepLabV3plus_s8',
choices=['DeepLabV3plus_s16', 'DeepLabV3plus_s8'],
help='Select model structure (Default: DeepLabV3plus_s8)')
parser.add_argument('--num_classes', type=int, default=21, help='the number of classes (Default: 21)')
args = parser.parse_args()
if args.model == 'DeepLabV3plus_s16':
network = DeepLabV3Plus('eval', args.num_classes, 16, True)
else:
network = DeepLabV3Plus('eval', args.num_classes, 8, True)
param_dict = load_checkpoint(args.checkpoint)
# load the parameter into net
load_param_into_net(network, param_dict)
input_data = np.random.uniform(0.0, 1.0, size=[32, 3, 513, 513]).astype(np.float32)
export(network, Tensor(input_data), file_name=args.model + '-300_11.air', file_format='MINDIR')
"""hub config."""
from src.deeplab_v3plus import DeepLabV3Plus
def create_network(name, *args, **kwargs):
freeze_bn = True
num_classes = kwargs["num_classes"]
if name == 'DeepLabV3plus_s16':
DeepLabV3plus_s16_network = DeepLabV3Plus('eval', num_classes, 16, freeze_bn)
return DeepLabV3plus_s16_network
if name == 'DeepLabV3plus_s8':
DeepLabV3plus_s8_network = DeepLabV3Plus('eval', num_classes, 8, freeze_bn)
return DeepLabV3plus_s8_network
raise NotImplementedError(f"{name} is not implemented in the repo")
mindspore
numpy
Pillow
python-opencv
#!/bin/bash
export DEVICE_ID=7
python /PATH/TO/MODEL_ZOO_CODE/data/get_dataset_mindrecord.py --data_root=/PATH/TO/DATA_ROOT \
--data_lst=/PATH/TO/DATA_lst.txt \
--dst_path=/PATH/TO/MINDRECORED_NAME.mindrecord \
--num_shards=1 \
--shuffle=True
\ No newline at end of file
#!/bin/bash
export DEVICE_ID=5
export SLOG_PRINT_TO_STDOUT=0
train_path=/PATH/TO/EXPERIMENTS_DIR
train_code_path=/PATH/TO/MODEL_ZOO_CODE
if [ -d ${train_path} ]; then
rm -rf ${train_path}
fi
mkdir -p ${train_path}
mkdir ${train_path}/device${DEVICE_ID}
mkdir ${train_path}/ckpt
cd ${train_path}/device${DEVICE_ID} || exit
python ${train_code_path}/train.py --data_file=/PATH/TO/MINDRECORD_NAME \
--train_dir=${train_path}/ckpt \
--train_epochs=200 \
--batch_size=32 \
--crop_size=513 \
--base_lr=0.015 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--save_steps=1500 \
--keep_checkpoint_max=200 >log 2>&1 &
\ No newline at end of file
#!/bin/bash
ulimit -c unlimited
train_path=/PATH/TO/EXPERIMENTS_DIR
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
export RANK_TABLE_FILE=${train_code_path}/src/tools/rank_table_8p.json
export RANK_SIZE=8
export RANK_START_ID=0
if [ -d ${train_path} ]; then
rm -rf ${train_path}
fi
mkdir -p ${train_path}
mkdir ${train_path}/ckpt
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=${i}
export DEVICE_ID=$((i + RANK_START_ID))
echo 'start rank='${i}', device id='${DEVICE_ID}'...'
mkdir ${train_path}/device${DEVICE_ID}
cd ${train_path}/device${DEVICE_ID} || exit
python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
--data_file=/PATH/TO/MINDRECORD_NAME \
--train_epochs=300 \
--batch_size=32 \
--crop_size=513 \
--base_lr=0.08 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=410 \
--keep_checkpoint_max=200 >log 2>&1 &
done
#!/bin/bash
ulimit -c unlimited
train_path=/PATH/TO/EXPERIMENTS_DIR
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
export RANK_TABLE_FILE=${train_code_path}/src/tools/rank_table_8p.json
export RANK_SIZE=8
export RANK_START_ID=0
if [ -d ${train_path} ]; then
rm -rf ${train_path}
fi
mkdir -p ${train_path}
mkdir ${train_path}/ckpt
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=${i}
export DEVICE_ID=$((i + RANK_START_ID))
echo 'start rank='${i}', device id='${DEVICE_ID}'...'
mkdir ${train_path}/device${DEVICE_ID}
cd ${train_path}/device${DEVICE_ID} || exit
python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
--data_file=/PATH/TO/MINDRECORD_NAME \
--train_epochs=800 \
--batch_size=16 \
--crop_size=513 \
--base_lr=0.02 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--loss_scale=2048 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=820 \
--keep_checkpoint_max=200 >log 2>&1 &
done
#!/bin/bash
ulimit -c unlimited
train_path=/PATH/TO/EXPERIMENTS_DIR
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
export RANK_TABLE_FILE=${train_code_path}/src/tools/rank_table_8p.json
export RANK_SIZE=8
export RANK_START_ID=0
if [ -d ${train_path} ]; then
rm -rf ${train_path}
fi
mkdir -p ${train_path}
mkdir ${train_path}/ckpt
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=${i}
export DEVICE_ID=$((i + RANK_START_ID))
echo 'start rank='${i}', device id='${DEVICE_ID}'...'
mkdir ${train_path}/device${DEVICE_ID}
cd ${train_path}/device${DEVICE_ID} || exit
python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
--data_file=/PATH/TO/MINDRECORD_NAME \
--train_epochs=300 \
--batch_size=16 \
--crop_size=513 \
--base_lr=0.008 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--loss_scale=2048 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=110 \
--keep_checkpoint_max=200 >log 2>&1 &
done
#!/bin/bash
export DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--data_lst=/PATH/TO/DATA_lst.txt \
--batch_size=32 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--scales=1.0 \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
#!/bin/bash
export DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--data_lst=/PATH/TO/DATA_lst.txt \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--scales=0.5 \
--scales=0.75 \
--scales=1.0 \
--scales=1.25 \
--scales=1.75 \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
\ No newline at end of file
#!/bin/bash
export DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--data_lst=/PATH/TO/DATA_lst.txt \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s16 \
--scales=0.5 \
--scales=0.75 \
--scales=1.0 \
--scales=1.25 \
--scales=1.75 \
--flip \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
#!/bin/bash
export DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--data_lst=/PATH/TO/DATA_lst.txt \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--scales=1.0 \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
#!/bin/bash
export DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--data_lst=/PATH/TO/DATA_lst.txt \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--scales=0.5 \
--scales=0.75 \
--scales=1.0 \
--scales=1.25 \
--scales=1.75 \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
#!/bin/bash
export DEVICE_ID=3
export SLOG_PRINT_TO_STDOUT=0
train_code_path=/PATH/TO/MODEL_ZOO_CODE
eval_path=/PATH/TO/EVAL
if [ -d ${eval_path} ]; then
rm -rf ${eval_path}
fi
mkdir -p ${eval_path}
python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--data_lst=/PATH/TO/DATA_lst.txt \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=DeepLabV3plus_s8 \
--scales=0.5 \
--scales=0.75 \
--scales=1.0 \
--scales=1.25 \
--scales=1.75 \
--flip \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
import numpy as np
import cv2
import mindspore.dataset as de
cv2.setNumThreads(0)
class SegDataset:
def __init__(self,
image_mean,
image_std,
data_file='',
batch_size=32,
crop_size=512,
max_scale=2.0,
min_scale=0.5,
ignore_label=255,
num_classes=21,
num_readers=2,
num_parallel_calls=4,
shard_id=None,
shard_num=None):
self.data_file = data_file
self.batch_size = batch_size
self.crop_size = crop_size
self.image_mean = np.array(image_mean, dtype=np.float32)
self.image_std = np.array(image_std, dtype=np.float32)
self.max_scale = max_scale
self.min_scale = min_scale
self.ignore_label = ignore_label
self.num_classes = num_classes
self.num_readers = num_readers
self.num_parallel_calls = num_parallel_calls
self.shard_id = shard_id
self.shard_num = shard_num
assert max_scale > min_scale
def preprocess_(self, image, label):
# bgr image
image_out = cv2.imdecode(np.frombuffer(image, dtype=np.uint8), cv2.IMREAD_COLOR)
label_out = cv2.imdecode(np.frombuffer(label, dtype=np.uint8), cv2.IMREAD_GRAYSCALE)
sc = np.random.uniform(self.min_scale, self.max_scale)
new_h, new_w = int(sc * image_out.shape[0]), int(sc * image_out.shape[1])
image_out = cv2.resize(image_out, (new_w, new_h), interpolation=cv2.INTER_CUBIC)
label_out = cv2.resize(label_out, (new_w, new_h), interpolation=cv2.INTER_NEAREST)
image_out = (image_out - self.image_mean) / self.image_std
h_, w_ = max(new_h, self.crop_size), max(new_w, self.crop_size)
pad_h, pad_w = h_ - new_h, w_ - new_w
if pad_h > 0 or pad_w > 0:
image_out = cv2.copyMakeBorder(image_out, 0, pad_h, 0, pad_w, cv2.BORDER_CONSTANT, value=0)
label_out = cv2.copyMakeBorder(label_out, 0, pad_h, 0, pad_w, cv2.BORDER_CONSTANT, value=self.ignore_label)
offset_h = np.random.randint(0, h_ - self.crop_size + 1)
offset_w = np.random.randint(0, w_ - self.crop_size + 1)
image_out = image_out[offset_h: offset_h + self.crop_size, offset_w: offset_w + self.crop_size, :]
label_out = label_out[offset_h: offset_h + self.crop_size, offset_w: offset_w + self.crop_size]
if np.random.uniform(0.0, 1.0) > 0.5:
image_out = image_out[:, ::-1, :]
label_out = label_out[:, ::-1]
image_out = image_out.transpose((2, 0, 1))
image_out = image_out.copy()
label_out = label_out.copy()
return image_out, label_out
def get_dataset(self, repeat=1):
data_set = de.MindDataset(self.data_file, columns_list=["data", "label"],
shuffle=True, num_parallel_workers=self.num_readers,
num_shards=self.shard_num, shard_id=self.shard_id)
transforms_list = self.preprocess_
data_set = data_set.map(operations=transforms_list, input_columns=["data", "label"],
output_columns=["data", "label"],
num_parallel_workers=self.num_parallel_calls)
data_set = data_set.shuffle(buffer_size=self.batch_size * 10)
data_set = data_set.batch(self.batch_size, drop_remainder=True)
data_set = data_set.repeat(repeat)
return data_set
import mindspore.nn as nn
from mindspore.ops import operations as P
def conv1x1(in_planes, out_planes, stride=1):
return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, weight_init='xavier_uniform')
def conv3x3(in_planes, out_planes, stride=1, dilation=1, padding=1):
return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, pad_mode='pad', padding=padding,
dilation=dilation, weight_init='xavier_uniform')
class Resnet(nn.Cell):
def __init__(self, block, block_num, output_stride, use_batch_statistics=True):
super(Resnet, self).__init__()
self.inplanes = 64
self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, pad_mode='pad', padding=3,
weight_init='xavier_uniform')
self.bn1 = nn.BatchNorm2d(self.inplanes, use_batch_statistics=use_batch_statistics)
self.relu = nn.ReLU()
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode='same')
self.layer1 = self._make_layer(block, 64, block_num[0], use_batch_statistics=use_batch_statistics)
self.layer2 = self._make_layer(block, 128, block_num[1], stride=2, use_batch_statistics=use_batch_statistics)
if output_stride == 16:
self.layer3 = self._make_layer(block, 256, block_num[2], stride=2,
use_batch_statistics=use_batch_statistics)
self.layer4 = self._make_layer(block, 512, block_num[3], stride=1, base_dilation=2, grids=[1, 2, 4],
use_batch_statistics=use_batch_statistics)
elif output_stride == 8:
self.layer3 = self._make_layer(block, 256, block_num[2], stride=1, base_dilation=2,
use_batch_statistics=use_batch_statistics)
self.layer4 = self._make_layer(block, 512, block_num[3], stride=1, base_dilation=4, grids=[1, 2, 4],
use_batch_statistics=use_batch_statistics)
def _make_layer(self, block, planes, blocks, stride=1, base_dilation=1, grids=None, use_batch_statistics=True):
if stride != 1 or self.inplanes != planes * block.expansion:
downsample = nn.SequentialCell([
conv1x1(self.inplanes, planes * block.expansion, stride),
nn.BatchNorm2d(planes * block.expansion, use_batch_statistics=use_batch_statistics)
])
if grids is None:
grids = [1] * blocks
layers = [
block(self.inplanes, planes, stride, downsample, dilation=base_dilation * grids[0],
use_batch_statistics=use_batch_statistics)
]
self.inplanes = planes * block.expansion
for i in range(1, blocks):
layers.append(
block(self.inplanes, planes, dilation=base_dilation * grids[i],
use_batch_statistics=use_batch_statistics))
return nn.SequentialCell(layers)
def construct(self, x):
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.maxpool(out)
out = self.layer1(out)
low_level_feat = out
out = self.layer2(out)
out = self.layer3(out)
out = self.layer4(out)
return out, low_level_feat
class Bottleneck(nn.Cell):
expansion = 4
def __init__(self, inplanes, planes, stride=1, downsample=None, dilation=1, use_batch_statistics=True):
super(Bottleneck, self).__init__()
self.conv1 = conv1x1(inplanes, planes)
self.bn1 = nn.BatchNorm2d(planes, use_batch_statistics=use_batch_statistics)
self.conv2 = conv3x3(planes, planes, stride, dilation, dilation)
self.bn2 = nn.BatchNorm2d(planes, use_batch_statistics=use_batch_statistics)
self.conv3 = conv1x1(planes, planes * self.expansion)
self.bn3 = nn.BatchNorm2d(planes * self.expansion, use_batch_statistics=use_batch_statistics)
self.relu = nn.ReLU()
self.downsample = downsample
def construct(self, x):
identity = x
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.conv2(out)
out = self.bn2(out)
out = self.relu(out)
out = self.conv3(out)
out = self.bn3(out)
if self.downsample is not None:
identity = self.downsample(x)
out = out + identity
out = self.relu(out)
return out
class ASPPConv(nn.Cell):
def __init__(self, in_channels, out_channels, atrous_rate=1, use_batch_statistics=True):
super(ASPPConv, self).__init__()
if atrous_rate == 1:
conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, has_bias=False, weight_init='xavier_uniform')
else:
conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, pad_mode='pad', padding=atrous_rate,
dilation=atrous_rate, weight_init='xavier_uniform')
bn = nn.BatchNorm2d(out_channels, use_batch_statistics=use_batch_statistics)
relu = nn.ReLU()
self.aspp_conv = nn.SequentialCell([conv, bn, relu])
def construct(self, x):
out = self.aspp_conv(x)
return out
class ASPPPooling(nn.Cell):
def __init__(self, in_channels, out_channels, use_batch_statistics=True):
super(ASPPPooling, self).__init__()
self.conv = nn.SequentialCell([
nn.Conv2d(in_channels, out_channels, kernel_size=1, weight_init='xavier_uniform'),
nn.BatchNorm2d(out_channels, use_batch_statistics=use_batch_statistics),
nn.ReLU()
])
self.shape = P.Shape()
def construct(self, x):
size = self.shape(x)
out = nn.AvgPool2d(size[2])(x)
out = self.conv(out)
out = P.ResizeNearestNeighbor((size[2], size[3]), True)(out)
return out
class ASPP(nn.Cell):
def __init__(self, atrous_rates, phase='train', in_channels=2048, num_classes=21,
use_batch_statistics=True):
super(ASPP, self).__init__()
self.phase = phase
out_channels = 256
self.aspp1 = ASPPConv(in_channels, out_channels, atrous_rates[0], use_batch_statistics=use_batch_statistics)
self.aspp2 = ASPPConv(in_channels, out_channels, atrous_rates[1], use_batch_statistics=use_batch_statistics)
self.aspp3 = ASPPConv(in_channels, out_channels, atrous_rates[2], use_batch_statistics=use_batch_statistics)
self.aspp4 = ASPPConv(in_channels, out_channels, atrous_rates[3], use_batch_statistics=use_batch_statistics)
self.aspp_pooling = ASPPPooling(in_channels, out_channels)
self.conv1 = nn.Conv2d(out_channels * (len(atrous_rates) + 1), out_channels, kernel_size=1,
weight_init='xavier_uniform')
self.bn1 = nn.BatchNorm2d(out_channels, use_batch_statistics=use_batch_statistics)
self.relu = nn.ReLU()
self.concat = P.Concat(axis=1)
self.drop = nn.Dropout(0.3)
def construct(self, x):
x1 = self.aspp1(x)
x2 = self.aspp2(x)
x3 = self.aspp3(x)
x4 = self.aspp4(x)
x5 = self.aspp_pooling(x)
x = self.concat((x1, x2))
x = self.concat((x, x3))
x = self.concat((x, x4))
x = self.concat((x, x5))
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
if self.phase == 'train':
x = self.drop(x)
return x
class DeepLabV3Plus(nn.Cell):
def __init__(self, phase='train', num_classes=21, output_stride=16, freeze_bn=False):
super(DeepLabV3Plus, self).__init__()
use_batch_statistics = not freeze_bn
self.resnet = Resnet(Bottleneck, [3, 4, 23, 3], output_stride=output_stride,
use_batch_statistics=use_batch_statistics)
self.aspp = ASPP([1, 6, 12, 18], phase, 2048, num_classes,
use_batch_statistics=use_batch_statistics)
self.shape = P.Shape()
self.conv2 = nn.Conv2d(256, 48, kernel_size=1, weight_init='xavier_uniform')
self.bn2 = nn.BatchNorm2d(48, use_batch_statistics=use_batch_statistics)
self.relu = nn.ReLU()
self.concat = P.Concat(axis=1)
self.last_conv = nn.SequentialCell([
conv3x3(304, 256, stride=1, dilation=1, padding=1),
nn.BatchNorm2d(256, use_batch_statistics=use_batch_statistics),
nn.ReLU(),
conv3x3(256, 256, stride=1, dilation=1, padding=1),
nn.BatchNorm2d(256, use_batch_statistics=use_batch_statistics),
nn.ReLU(),
conv1x1(256, num_classes, stride=1)
])
def construct(self, x):
size = self.shape(x)
out, low_level_features = self.resnet(x)
size2 = self.shape(low_level_features)
out = self.aspp(out)
out = P.ResizeNearestNeighbor((size2[2], size2[3]), True)(out)
low_level_features = self.conv2(low_level_features)
low_level_features = self.bn2(low_level_features)
low_level_features = self.relu(low_level_features)
out = self.concat((out, low_level_features))
out = self.last_conv(out)
out = P.ResizeBilinear((size[2], size[3]), True)(out)
return out
import numpy as np
def cosine_lr(base_lr, decay_steps, total_steps):
for i in range(total_steps):
step_ = min(i, decay_steps)
yield base_lr * 0.5 * (1 + np.cos(np.pi * step_ / decay_steps))
def poly_lr(base_lr, decay_steps, total_steps, end_lr=0.0001, power=0.9):
for i in range(total_steps):
step_ = min(i, decay_steps)
yield (base_lr - end_lr) * ((1.0 - step_ / decay_steps) ** power) + end_lr
def exponential_lr(base_lr, decay_steps, decay_rate, total_steps, staircase=False):
for i in range(total_steps):
if staircase:
power_ = i // decay_steps
else:
power_ = float(i) / decay_steps
yield base_lr * (decay_rate ** power_)
from mindspore import Tensor
import mindspore.common.dtype as mstype
import mindspore.nn as nn
from mindspore.ops import operations as P
class SoftmaxCrossEntropyLoss(nn.Cell):
def __init__(self, num_cls=21, ignore_label=255):
super(SoftmaxCrossEntropyLoss, self).__init__()
self.one_hot = P.OneHot(axis=-1)
self.on_value = Tensor(1.0, mstype.float32)
self.off_value = Tensor(0.0, mstype.float32)
self.cast = P.Cast()
self.ce = nn.SoftmaxCrossEntropyWithLogits()
self.not_equal = P.NotEqual()
self.num_cls = num_cls
self.ignore_label = ignore_label
self.mul = P.Mul()
self.sum = P.ReduceSum(False)
self.div = P.RealDiv()
self.transpose = P.Transpose()
self.reshape = P.Reshape()
def construct(self, logits, labels):
labels_int = self.cast(labels, mstype.int32)
labels_int = self.reshape(labels_int, (-1,))
logits_ = self.transpose(logits, (0, 2, 3, 1))
logits_ = self.reshape(logits_, (-1, self.num_cls))
weights = self.not_equal(labels_int, self.ignore_label)
weights = self.cast(weights, mstype.float32)
one_hot_labels = self.one_hot(labels_int, self.num_cls, self.on_value, self.off_value)
loss = self.ce(logits_, one_hot_labels)
loss = self.mul(weights, loss)
loss = self.div(self.sum(loss), self.sum(weights))
return loss
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment