Commit 0a12b587, authored by i-robot, committed by Gitee

!2346 [Wuhan University of Technology][University Contribution][MindSpore][EGNet][GPU]

Merge pull request !2346 from yihaizhong/master
# Contents

- [Contents](#contents)
- [EGNet Description](#egnet-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
    - [Dataset Preprocessing](#dataset-preprocessing)
    - [Pretrained Models](#pretrained-models)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
- [Training Process](#training-process)
    - [Training](#training)
    - [Distributed Training](#distributed-training)
- [Evaluation Process](#evaluation-process)
    - [Evaluation](#evaluation)
- [Export Process](#export-process)
    - [Export](#export)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Evaluation Performance](#evaluation-performance)
            - [EGNet on DUTS-TR (Ascend)](#egnet-on-duts-tr-ascend)
            - [EGNet on DUTS-TR (GPU)](#egnet-on-duts-tr-gpu)
        - [Inference Performance](#inference-performance)
            - [EGNet on Saliency Detection Datasets (Ascend)](#egnet-on-saliency-detection-datasets-ascend)
            - [EGNet on Saliency Detection Datasets (GPU)](#egnet-on-saliency-detection-datasets-gpu)
- [ModelZoo Homepage](#modelzoo-homepage)

## EGNet Description
EGNet addresses salient object detection in static images. It consists of three parts: an edge feature extraction branch, a salient object feature extraction branch, and a one-to-one guidance module. The edge features help localize the salient objects and make their boundaries more accurate. Compared against 15 state-of-the-art methods on six different datasets, EGNet achieves the best performance.

The [PyTorch source code of EGNet](https://github.com/JXingZhao/EGNet) is provided by the paper's authors. It contains the run scripts, model files, and data processing files, along with download links for the datasets, initialization models, and pretrained models, and can be used directly for training and testing.

[Paper](https://arxiv.org/abs/1908.08297): Zhao J X, Liu J J, Fan D P, et al. EGNet: Edge guidance network for salient object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 8779-8788.
## Model Architecture

The EGNet network consists of three parts: NLSEM (the edge extraction module), PSFEM (the salient object feature extraction module), and O2OGM (the one-to-one guidance module). Two convolution stages over the original image produce edge information, while deeper convolution stages extract salient object features. The edge features are then fused (FF) one-to-one with the salient object features taken at different depths; each fused result passes through further convolutions to produce a saliency map at that scale, and the final output is a fused saliency detection map.
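The one-to-one guidance flow can be sketched in a few lines. This is a conceptual illustration only, with NumPy stand-ins for the real convolutional blocks; the actual network is defined in src/egnet.py:

```python
# Conceptual sketch of O2OGM: one edge feature guides each multi-depth salient
# feature; "fuse" and "readout" are illustrative stand-ins, not the real layers.
import numpy as np

def fuse(edge_feat, sal_feat):
    # FF step modeled as elementwise addition (alignment/upsampling omitted)
    return edge_feat + sal_feat

def readout(feat):
    # stand-in for the conv layers that turn features into a saliency map
    return 1.0 / (1.0 + np.exp(-feat.mean(axis=0)))  # sigmoid of channel mean

edge = np.random.rand(16, 200, 200)                            # NLSEM output
sal_feats = [np.random.rand(16, 200, 200) for _ in range(3)]   # PSFEM outputs

side_maps = [readout(fuse(edge, s)) for s in sal_feats]        # one map per depth
final_map = np.mean(side_maps, axis=0)                         # fused prediction
print(final_map.shape)  # (200, 200)
```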
## Dataset

All datasets are placed under a single root directory; the folders below are created relative to it.

- Training set: [DUTS-TR dataset](http://saliencydetection.net/duts/download/DUTS-TR.zip), 210 MB, 10533 color images with a maximum side length of 400 pixels, all collected from the ImageNet DET training/validation sets.

Create a folder named "DUTS-TR", download the dataset from the link above into it, and extract it there.
```bash
├──DUTS-TR
├──DUTS-TR-Image
├──DUTS-TR-Mask
```
- Test set: [DUTS-TE dataset](http://saliencydetection.net/duts/download/DUTS-TE.zip), 32.3 MB, 5019 color images with a maximum side length of 400 pixels, all collected from the ImageNet DET test set and the SUN dataset.

Create a folder named "DUTS-TE", download the dataset from the link above into it, and extract it there.
```bash
├──DUTS-TE
├──DUTS-TE-Image
├──DUTS-TE-Mask
```
- Test set: [SOD dataset](https://www.elderlab.yorku.ca/?smd_process_download=1&download_id=8285), 21.2 MB, 300 color images with a maximum side length of 400 pixels; a collection of salient object boundaries based on the Berkeley Segmentation Dataset (BSD).

Create a folder named "SOD", download the dataset from the link above into it, and extract it there.
```bash
├──SOD
├──Imgs
```
- Test set: [ECSSD dataset](http://www.cse.cuhk.edu.hk/leojia/projects/hsaliency/data/ECSSD/images.zip, http://www.cse.cuhk.edu.hk/leojia/projects/hsaliency/data/ECSSD/ground_truth_mask.zip), 64.6 MB, 1000 color images with a maximum side length of 400 pixels.

Create a folder named "ECSSD", download the original images and the ground truth from the links above into it, and extract them there.
```bash
├──ECSSD
├──ground_truth_mask
├──images
```
- Test set: [PASCAL-S dataset](https://academictorrents.com/download/6c49defd6f0e417c039637475cde638d1363037e.torrent), 175 MB, 850 color images. This dataset differs noticeably from other salient object detection datasets: it contains no strongly dominant salient objects and was annotated mainly from human eye-movement data, which makes it comparatively difficult.

Download the dataset from the link above and extract it in place. In the dataset root, create folders named "PASCAL-S" and Imgs, and move datasets/imgs/pascal and datasets/masks/pascal into the Imgs folder.
```bash
├──PASCAL-S
├──Imgs
```
- Test set: [DUT-OMRON dataset](http://saliencydetection.net/dut-omron/download/DUT-OMRON-image.zip, http://saliencydetection.net/dut-omron/download/DUT-OMRON-gt-pixelwise.zip.zip), 107 MB, 5168 color images with a maximum side length of 400 pixels. Each image contains one or more salient objects against a relatively complex background, with large-scale ground-truth annotations for eye fixations, bounding boxes, and pixel-wise masks.

Create a folder named "DUT-OMRON-image", download the dataset from the links above into it, and extract it there.
```bash
├──DUT-OMRON-image
├──DUT-OMRON-image
├──pixelwiseGT-new-PNG
```
- Test set: [HKU-IS dataset](https://i.cs.hku.hk/~gbli/deep_saliency.html), 893 MB, 4447 color images with a maximum side length of 400 pixels. Each image meets at least one of the following three criteria: 1) multiple scattered salient objects; 2) at least one salient object touching the image boundary; 3) salient objects whose appearance resembles the background.

Create a folder named "HKU-IS", download the dataset from the link above into it, and extract it there. Note: the data is processed by src/dataset.py.

```bash
├──HKU-IS
├──imgs
├──gt
```
### Dataset Preprocessing

Run the dataset_preprocess.sh script to unify the dataset formats, crop the images, and generate the corresponding lst files: each test set gets a test.lst, and the training set gets both test.lst and train_pair_edge.lst.
```shell
# DATA_ROOT: root directory containing all datasets
# OUTPUT_ROOT: output directory
bash dataset_preprocess.sh [DATA_ROOT] [OUTPUT_ROOT]
```
1. The processed DUTS-TR dataset is laid out as follows: DUTS-TR-Mask holds the ground truth, DUTS-TR-Image holds the original images, test.lst lists the image files, and train_pair_edge.lst lists the image, ground truth, and edge map for each training sample.
```bash
├──DUTS-TR
├──DUTS-TR-Image
├──DUTS-TR-Mask
├──test.lst
├──train_pair_edge.lst
```
2. The processed DUTS-TE dataset is laid out as follows: DUTS-TE-Mask holds the ground truth, DUTS-TE-Image holds the original images, and test.lst lists the image files.
```bash
├──DUTS-TE
├──DUTS-TE-Image
├──DUTS-TE-Mask
├──test.lst
```
3. The other five test sets (all except DUTS-TE) are unified into the following layout (HKU-IS shown as an example): ground_truth_mask holds the ground truth, images holds the original images, and test.lst lists the image files.
```bash
├──HKU-IS
├──ground_truth_mask
├──images
├──test.lst
```
4. test.lst lists the image files of a dataset; train_pair_edge.lst lists the image, ground truth, and edge map files of each training sample.
```bash
test.lst format (HKU-IS as an example):
0004.png
0005.png
0006.png
....
9056.png
9057.png
```
```bash
train_pair_edge.lst format (DUTS-TR):
DUTS-TR-Image/ILSVRC2012_test_00007606.jpg DUTS-TR-Mask/ILSVRC2012_test_00007606.png DUTS-TR-Mask/ILSVRC2012_test_00007606_edge.png
DUTS-TR-Image/n03770439_12912.jpg DUTS-TR-Mask/n03770439_12912.png DUTS-TR-Mask/n03770439_12912_edge.png
DUTS-TR-Image/ILSVRC2012_test_00062061.jpg DUTS-TR-Mask/ILSVRC2012_test_00062061.png DUTS-TR-Mask/ILSVRC2012_test_00062061_edge.png
....
DUTS-TR-Image/n02398521_31039.jpg DUTS-TR-Mask/n02398521_31039.png DUTS-TR-Mask/n02398521_31039_edge.png
DUTS-TR-Image/n07768694_14708.jpg DUTS-TR-Mask/n07768694_14708.png DUTS-TR-Mask/n07768694_14708_edge.png
```
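For illustration, the two list files can be read with a few lines of Python (a sketch only; the repository's own loader lives in src/dataset.py):

```python
# Minimal readers for the two lst formats shown above.
import os

def read_test_lst(root):
    # test.lst: one image file name per line
    with open(os.path.join(root, "test.lst")) as f:
        return [line.strip() for line in f if line.strip()]

def read_train_pairs(root):
    # train_pair_edge.lst: "<image> <mask> <edge>" per line, relative to root
    triples = []
    with open(os.path.join(root, "train_pair_edge.lst")) as f:
        for line in f:
            img, mask, edge = line.split()
            triples.append((os.path.join(root, img),
                            os.path.join(root, mask),
                            os.path.join(root, edge)))
    return triples
```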
## Pretrained Models

PyTorch pretrained models (vgg16 and resnet50)

The VGG backbone uses the vgg16 architecture, which has 13 convolutional layers and 3 fully connected layers; this model does not use the fully connected layers.

Download the [VGG16 pretrained model](https://download.mindspore.cn/thirdparty/vgg16_20M.pth).

The ResNet backbone uses the resnet50 architecture, which has 50 layers counting convolutional and fully connected layers; this model does not use the fully connected layers. It consists of 5 stages: the first stage preprocesses the input, and the following four stages contain 3, 4, 6, and 3 Bottleneck blocks respectively.

Download the [ResNet50 pretrained model](https://download.mindspore.cn/thirdparty/resnet50_caffe.pth).

MindSpore pretrained models

Download the PyTorch pretrained models, then run the following script to obtain the corresponding MindSpore checkpoints. Note: this script additionally requires a PyTorch installation (tested with version 1.3, CPU or GPU).
```bash
# MODEL_NAME: model name, vgg or resnet
# PTH_FILE: absolute path of the model file to convert
# MSP_FILE: absolute path of the output model file
bash convert_model.sh [MODEL_NAME] [PTH_FILE] [MSP_FILE]
```
## Environment Requirements

- Hardware (Ascend/GPU)
    - Set up the hardware environment with Ascend or GPU processors.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For more information, see the following resources:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorials/zh-CN/master/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/master/index.html)
## Quick Start

After installing MindSpore from the official website, you can follow the steps below for training and evaluation:

Set the relevant options in default_config.yaml: train_path is the training-set directory, base_model is the backbone type (vgg or resnet), test_path is the test-set root directory, and vgg/resnet are the pretrained-model paths. The scripts under scripts/ also accept parameters, which override the values set in default_config.yaml. Note: run all scripts from the scripts directory.
- Running on Ascend
```shell
# crop the datasets
python data_crop.py --data_name=[DATA_NAME] --data_root=[DATA_ROOT] --output_path=[OUTPUT_PATH]
# run a standalone training example
bash run_standalone_train.sh
# run a distributed training example
bash run_distribute_train.sh 8 [RANK_TABLE_FILE]
# run an evaluation example
bash run_eval.sh
```
The training-set path is set via the train_path item in default_config.yaml.
- Running on GPU

```shell
# run a standalone training example
bash run_standalone_train_gpu.sh
# run a distributed training example
# DEVICE_NUM: number of GPUs to use, e.g. 8
# USED_DEVICES: comma-separated GPU ids matching DEVICE_NUM, e.g. 0,1,2,3,4,5,6,7
bash run_distribute_train_gpu.sh [DEVICE_NUM] [USED_DEVICES]
# run an evaluation example
bash run_eval_gpu.sh
```
## Script Description

### Script and Sample Code
```bash
├── model_zoo
    ├── EGNet
        ├── README_CN.md                    # EGNet description (Chinese)
        ├── model_utils                     # config / ModelArts helper scripts
        │   ├──config.py                    # parses the parameter configuration file
        ├── scripts
        │   ├──run_standalone_train.sh      # launch standalone Ascend training (1 card)
        │   ├──run_distribute_train.sh      # launch distributed Ascend training (8 cards)
        │   ├──run_eval.sh                  # launch Ascend evaluation
        │   ├──run_standalone_train_gpu.sh  # launch standalone GPU training (1 card)
        │   ├──run_distribute_train_gpu.sh  # launch distributed GPU training (multi-card)
        │   ├──run_eval_gpu.sh              # launch GPU evaluation
        │   ├──dataset_preprocess.sh        # preprocess the datasets and generate lst files
        │   ├──convert_model.sh             # convert the pretrained models
        ├── src
        │   ├──dataset.py                   # data loading
        │   ├──egnet.py                     # EGNet network definition
        │   ├──vgg.py                       # VGG backbone (vgg16)
        │   ├──resnet.py                    # ResNet backbone (resnet50)
        │   ├──sal_edge_loss.py             # loss definition
        │   ├──train_forward_backward.py    # forward and backward pass definition
        ├── pretrained_model_convert        # convert PyTorch pretrained models to MindSpore
        │   ├──pth_to_msp.py                # convert .pth files to .ckpt files
        │   ├──resnet_msp.py                # MindSpore network of the ResNet pretrained model
        │   ├──resnet_pth.py                # PyTorch network of the ResNet pretrained model
        │   ├──vgg_msp.py                   # MindSpore network of the VGG pretrained model
        │   ├──vgg_pth.py                   # PyTorch network of the VGG pretrained model
        ├── sal2edge.py                     # preprocessing: turn saliency masks into edge maps
        ├── data_crop.py                    # crop the data and generate test.lst
        ├── train.py                        # training script
        ├── eval.py                         # evaluation script
        ├── export.py                       # model export script
        ├── default_config.yaml             # parameter configuration file
        ├── requirements.txt                # additional dependencies
```
### Script Parameters

Both training and evaluation parameters can be configured in default_config.yaml.

- EGNet configuration; the key parameters are listed here:
```text
device_target: "Ascend"                             # device to run on ["Ascend", "GPU"]
base_model: "resnet"                                # backbone network, ["vgg", "resnet"]
batch_size: 1                                       # training batch size
n_ave_grad: 10                                      # gradient accumulation steps
epoch_size: 30                                      # total training epochs
image_height: 200                                   # height of the images fed to the model
image_width: 200                                    # width of the images fed to the model
train_path: "./data/DUTS-TR/"                       # training dataset path
test_path: "./data"                                 # test dataset root directory
vgg: "/home/EGnet/EGnet/model/vgg16.ckpt"           # path of the VGG pretrained model
resnet: "/home/EGnet/EGnet/model/resnet50.ckpt"     # path of the ResNet pretrained model
model: "EGNet/run-nnet/models/final_vgg_bone.ckpt"  # checkpoint file used for testing
```
For more configuration details, see default_config.yaml.
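Note that the effective training batch size is batch_size × n_ave_grad = 10, because gradients are accumulated over n_ave_grad steps before each update. A small sketch of reading these values through the repository's config helper (base_config is the object that train.py and eval.py import):

```python
# Minimal sketch: read key settings through the repo's config helper;
# attribute names mirror the keys in default_config.yaml.
from model_utils.config import base_config

effective_batch = base_config.batch_size * base_config.n_ave_grad
print(base_config.base_model)  # "vgg" or "resnet"
print(effective_batch)         # 1 * 10 = 10 samples per optimizer update
```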
## Training Process

- Crop the datasets:

```bash
python data_crop.py --data_name=[DATA_NAME] --data_root=[DATA_ROOT] --output_path=[OUTPUT_PATH]
```

### Training
- Running on Ascend

```bash
bash run_standalone_train.sh
```
- Online ModelArts training
online_train_path (the storage path of the DUTS-TR training set in the OBS bucket)
```bash
├──DUTS-TR
├──DUTS-TR-Image
├──DUTS-TR-Mask
├──train_pair_edge.lst
```
online_pretrained_path (the storage path of the pretrained models in the OBS bucket)
train_online = True (enables online training)
After training, you can find the checkpoint files in the default ./EGNet/run-nnet/models/ folder.
- Running on GPU

```bash
bash run_standalone_train_gpu.sh
```

### Distributed Training

- Running on Ascend
```bash
bash run_distribute_train.sh 8 [RANK_TABLE_FILE]
```
- Online ModelArts distributed training

The configuration for online distributed training is basically the same as for single-card training; just add the parameter is_distributed = True.
The shell script above runs distributed training in the background. You can check the results in the train/train.log file.
- Running on GPU
```bash
# DEVICE_NUM: number of GPUs to use, e.g. 8
# USED_DEVICES: comma-separated GPU ids matching DEVICE_NUM, e.g. 0,1,2,3,4,5,6,7
bash run_distribute_train_gpu.sh [DEVICE_NUM] [USED_DEVICES]
```
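Both distributed launchers rely on MindSpore initializing a communication backend and on the dataset being sharded per rank. A minimal sketch of that wiring, with a toy data source standing in for the repository's ImageDataTrain (the real logic lives in train.py and src/dataset.py):

```python
# Sketch of distributed setup; run under mpirun (GPU) or with a rank table (Ascend).
import numpy as np
from mindspore.communication.management import init, get_rank, get_group_size
from mindspore.dataset import GeneratorDataset

init()  # initialize the communication backend (NCCL on GPU, HCCL on Ascend)
rank_id, rank_size = get_rank(), get_group_size()

# toy stand-in for the repo's ImageDataTrain generator (illustrative only)
data = [(np.zeros((3, 200, 200), np.float32), np.zeros((1, 200, 200), np.float32))
        for _ in range(8)]
ds = GeneratorDataset(data, ["image", "label"],
                      num_shards=rank_size, shard_id=rank_id, shuffle=True)
ds = ds.batch(1, drop_remainder=True)  # each rank sees a distinct shard
```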
## Evaluation Process

### Evaluation
- Running on Ascend
```bash
bash run_eval.sh
```
- Running on GPU: first set the model item in default_config.yaml to the path of the checkpoint you want to evaluate:

```text
model: "EGNet/run-nnet/models/final_vgg_bone.ckpt" # checkpoint file used for testing
```

```bash
bash run_eval_gpu.sh
```
## Export Process

### Export

Before exporting, set the ckpt_file option in default_config.yaml, or pass it on the command line via --ckpt_file.
```shell
python export.py --ckpt_file=[CKPT_FILE]
```
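export.py presumably wraps MindSpore's standard export flow; a sketch of that flow is below. build_model comes from src/egnet.py, but its exact signature, the input shape, and the MINDIR format are assumptions here:

```python
# Sketch of a typical MindSpore export; export.py is the authoritative version.
import numpy as np
from mindspore import Tensor, export, load_checkpoint

from src.egnet import build_model

net = build_model("vgg")                        # assumed signature, see src/egnet.py
load_checkpoint("final_vgg_bone.ckpt", net=net)  # illustrative checkpoint name
net.set_train(False)

dummy = Tensor(np.zeros([1, 3, 200, 200], np.float32))  # 200x200 per the config
export(net, dummy, file_name="EGNet", file_format="MINDIR")
```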
## Model Description

### Performance

#### Evaluation Performance

##### EGNet on DUTS-TR (Ascend)
| Parameters | Ascend | Ascend |
| -------------------------- | ----------------------------------------------------------- | ---------------------- |
| Speed | 1 card: 593.460 ms/step; 8 cards: 460.952 ms/step | 1 card: 569.524 ms/step; 8 cards: 466.667 ms/step |
| Total time | 1 card: 5h3m; 8 cards: 4h2m | 1 card: 4h59m; 8 cards: 4h5m |
| Fine-tuned checkpoint | 412M (.ckpt file) | 426M (.ckpt file) |
| Scripts | [EGNet script](https://gitee.com/mindspore/models/tree/master/research/cv/EGnet) | [EGNet script](https://gitee.com/mindspore/models/tree/master/research/cv/EGnet) |
##### EGNet on DUTS-TR (GPU)

| Parameters | GPU | GPU |
| -------------------------- | ----------------------------------------------------------- | ---------------------- |
| Model version | EGNet (VGG) | EGNet (ResNet) |
| Resource | GeForce RTX 2080 Ti (single card), V100 (multi-card) | GeForce RTX 2080 Ti (single card), V100 (multi-card) |
| Upload date | 2021-12-02 | 2021-12-02 |
| MindSpore version | 1.3.0 | 1.3.0 |
| Dataset | DUTS-TR | DUTS-TR |
| Training parameters | epoch=30, steps=1050, batch_size=10, lr=2e-5 | epoch=30, steps=1050, batch_size=10, lr=5e-5 |
| Optimizer | Adam | Adam |
| Loss function | binary cross-entropy | binary cross-entropy |
| Speed | 1 card: 1148.571 ms/step; 2 cards: 921.905 ms/step | 1 card: 1323.810 ms/step; 2 cards: 1057.143 ms/step |
| Total time | 1 card: 10h3m; 2 cards: 8h4m | 1 card: 11h35m; 2 cards: 9h15m |
| Fine-tuned checkpoint | 412M (.ckpt file) | 426M (.ckpt file) |
| Scripts | [EGNet script](https://gitee.com/mindspore/models/tree/master/research/cv/EGnet) | [EGNet script](https://gitee.com/mindspore/models/tree/master/research/cv/EGnet) |
#### Inference Performance

##### EGNet on Saliency Detection Datasets (Ascend)
| Parameters | Ascend | Ascend |
| ------------------- | --------------------------- | --------------------------- |
| Upload date | 2021-12-25 | 2021-12-25 |
| MindSpore version | 1.3.0 | 1.3.0 |
| Dataset | SOD, 300 images | SOD, 300 images |
| Metrics (1 card) | MaxF: 0.865; MAE: 0.154; S: 0.731 | MaxF: 0.876; MAE: 0.145; S: 0.738 |
| Metrics (multi-card) | MaxF: 0.866; MAE: 0.153; S: 0.736 | MaxF: 0.879; MAE: 0.144; S: 0.740 |
| Dataset | ECSSD, 1000 images | ECSSD, 1000 images |
| Metrics (1 card) | MaxF: 0.936; MAE: 0.074; S: 0.863 | MaxF: 0.947; MAE: 0.064; S: 0.876 |
| Metrics (multi-card) | MaxF: 0.935; MAE: 0.080; S: 0.859 | MaxF: 0.945; MAE: 0.068; S: 0.873 |
| Dataset | PASCAL-S, 850 images | PASCAL-S, 850 images |
| Metrics (1 card) | MaxF: 0.877; MAE: 0.118; S: 0.765 | MaxF: 0.886; MAE: 0.106; S: 0.779 |
| Metrics (multi-card) | MaxF: 0.878; MAE: 0.119; S: 0.765 | MaxF: 0.888; MAE: 0.108; S: 0.778 |
| Dataset | DUT-OMRON, 5168 images | DUT-OMRON, 5168 images |
| Metrics (1 card) | MaxF: 0.782; MAE: 0.142; S: 0.752 | MaxF: 0.799; MAE: 0.133; S: 0.767 |
| Metrics (multi-card) | MaxF: 0.781; MAE: 0.145; S: 0.749 | MaxF: 0.799; MAE: 0.133; S: 0.764 |
| Dataset | HKU-IS, 4447 images | HKU-IS, 4447 images |
| Metrics (1 card) | MaxF: 0.919; MAE: 0.073; S: 0.867 | MaxF: 0.929; MAE: 0.063; S: 0.881 |
| Metrics (multi-card) | MaxF: 0.914; MAE: 0.079; S: 0.860 | MaxF: 0.925; MAE: 0.068; S: 0.876 |
##### EGNet on Saliency Detection Datasets (GPU)

| Parameters | GPU | GPU |
| ------------------- | --------------------------- | --------------------------- |
| Model version | EGNet (VGG) | EGNet (ResNet) |
| Resource | GeForce RTX 2080 Ti | GeForce RTX 2080 Ti |
| Upload date | 2021-12-02 | 2021-12-02 |
| MindSpore version | 1.3.0 | 1.3.0 |
| Dataset | DUTS-TE, 5019 images | DUTS-TE, 5019 images |
| Metrics (1 card) | MaxF: 0.852; MAE: 0.094; S: 0.819 | MaxF: 0.862; MAE: 0.089; S: 0.829 |
| Metrics (multi-card) | MaxF: 0.853; MAE: 0.098; S: 0.816 | MaxF: 0.862; MAE: 0.095; S: 0.825 |
| Dataset | SOD, 300 images | SOD, 300 images |
| Metrics (1 card) | MaxF: 0.877; MAE: 0.149; S: 0.739 | MaxF: 0.876; MAE: 0.150; S: 0.732 |
| Metrics (multi-card) | MaxF: 0.876; MAE: 0.158; S: 0.734 | MaxF: 0.874; MAE: 0.153; S: 0.736 |
| Dataset | ECSSD, 1000 images | ECSSD, 1000 images |
| Metrics (1 card) | MaxF: 0.940; MAE: 0.069; S: 0.868 | MaxF: 0.947; MAE: 0.064; S: 0.876 |
| Metrics (multi-card) | MaxF: 0.938; MAE: 0.079; S: 0.863 | MaxF: 0.947; MAE: 0.066; S: 0.878 |
| Dataset | PASCAL-S, 850 images | PASCAL-S, 850 images |
| Metrics (1 card) | MaxF: 0.881; MAE: 0.110; S: 0.771 | MaxF: 0.879; MAE: 0.112; S: 0.772 |
| Metrics (multi-card) | MaxF: 0.883; MAE: 0.116; S: 0.772 | MaxF: 0.882; MAE: 0.115; S: 0.774 |
| Dataset | DUT-OMRON, 5168 images | DUT-OMRON, 5168 images |
| Metrics (1 card) | MaxF: 0.787; MAE: 0.139; S: 0.754 | MaxF: 0.799; MAE: 0.139; S: 0.761 |
| Metrics (multi-card) | MaxF: 0.789; MAE: 0.144; S: 0.753 | MaxF: 0.800; MAE: 0.143; S: 0.762 |
| Dataset | HKU-IS, 4447 images | HKU-IS, 4447 images |
| Metrics (1 card) | MaxF: 0.923; MAE: 0.067; S: 0.873 | MaxF: 0.928; MAE: 0.063; S: 0.878 |
| Metrics (multi-card) | MaxF: 0.921; MAE: 0.074; S: 0.868 | MaxF: 0.928; MAE: 0.067; S: 0.878 |
## ModelZoo Homepage

Please visit the official [homepage](https://gitee.com/mindspore/models).
data_crop.py:

# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# ... (remainder of the license header elided)

import os
import argparse
from concurrent import futures

import cv2
import pandas as pd


def crop_one(input_img_path, output_img_path):
    """
    center crop one image
    """
    # ... (cropping logic elided in this diff)


def crop(data_root, output_path):
    """
    crop all images with thread pool
    """
    if not os.path.exists(data_root):
        raise FileNotFoundError("data root not exist: " + data_root)
    if not os.path.exists(output_path):
        os.makedirs(output_path)
    # ... (task submission elided)
    futures.wait(all_task)
    print("all done!")


def save(data_root, output_path):
    file_list = []
    for path in os.listdir(data_root):
        ...  # (file-name collection elided)
    df = pd.DataFrame(file_list, columns=["one"])
    df.to_csv(os.path.join(output_path, "test.lst"), columns=["one"], index=False, header=False)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Crop Image to 200*200")
    parser.add_argument("--data_name", type=str, help="dataset name", required=True)  # (choices elided)
    # ... (the --data_root and --output_path arguments are elided; one defaults to "/home/data")
    args = parser.parse_known_args()[0]
    if args.data_name == "DUTS-TE":
        Mask = "DUTS-TE-Mask"
        Image = "DUTS-TE-Image"
    elif args.data_name == "DUTS-TR":
        Mask = "DUTS-TR-Mask"
        Image = "DUTS-TR-Image"
    else:
        Mask = "ground_truth_mask"
        Image = "images"
    crop(os.path.join(args.data_root, args.data_name, Mask),
         os.path.join(args.output_path, args.data_name, Mask))
    # ... (image cropping and test.lst generation elided)
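Per the argparse description above, the elided crop_one body performs a center crop to 200×200. A hypothetical stand-in, not the repository implementation:

```python
# Hypothetical center-crop, standing in for the elided crop_one body above.
import cv2

def crop_one_sketch(input_img_path, output_img_path, size=200):
    img = cv2.imread(input_img_path, cv2.IMREAD_UNCHANGED)
    h, w = img.shape[:2]
    s = min(h, w)
    top, left = (h - s) // 2, (w - s) // 2
    square = img[top:top + s, left:left + s]  # central square region
    cv2.imwrite(output_img_path, cv2.resize(square, (size, size)))
```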
default_config.yaml:
# ==============================================================================
# basic parameters
n_color: 3                                       # color channels of input images
device_target: "Ascend"                          # device to run the model ["Ascend", "GPU"]

# Dataset settings
train_path: "/home/data2/egnet/DUTS-TR-10498"    # training dataset dir
test_path: "/home/data2/egnet/data200"           # testing dataset root

# Training settings
base_model: "resnet"                             # backbone network ["resnet", "vgg"], used for both train and eval
vgg: "/home/EGNet/EGNet/model/vgg16.ckpt"        # path of the VGG pre-trained model
resnet: "/home/EGNet/EGNet/model/resnet50.ckpt"  # path of the ResNet pre-trained model
is_distributed: False                            # enable distributed training
epoch: 30                                        # number of training epochs
batch_size: 1
n_ave_grad: 10                                   # step size for gradient accumulation
num_thread: 4                                    # number of dataset worker threads
save_fold: "EGNet"                               # root directory for training information
train_save_name: "nnet"
epoch_save: 1
epoch_show: 1
show_every: 10
save_tmp: 200
loss_scale: 1

# Training from a checkpoint
pre_trained: ""                                  # checkpoint file
start_epoch: 1                                   # start epoch for training

# Testing settings
model: "/home/data3/egnet_models/resnet/msp/final_bone_1128_1.ckpt"  # model for evaluation
test_fold: "result"
test_save_name: "EGNet_"
test_mode: 1
sal_mode: "t"                                    # which dataset to evaluate ['e','t','d','h','p','s']
test_batch_size: 1                               # test batch size; do not edit for now

# Misc
mode: "train"                                    # ['train','test']
visdom: False

# Online training settings
train_online: False
online_train_path: ""
online_pretrained_path: ""
train_url: ""
pretrained_url: "pretrained"                     # used for both train and eval

# Online testing settings
eval_online: False
online_eval_path: ""
online_ckpt_path: ""

# Export settings
file_name: "EGNet"
eval.py:

# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# ... (remainder of the license header elided)

import os

from model_utils.config import base_config
from src.dataset import create_dataset
from src.egnet import build_model


def main(config):
    if config.eval_online:
        import moxing as mox
        # ... (branches for the other sal_mode values elided)
        if config.sal_mode == "e":  # elif in the full source
            Evalname = "ECSSD"
        config.test_path = os.path.join("/cache", config.test_path)
        local_data_url = os.path.join(config.test_path, "%s" % (Evalname))
        local_list_eval = os.path.join(config.test_path, "%s/test.lst" % (Evalname))
        mox.file.copy_parallel(config.online_eval_path, local_data_url)
        mox.file.copy_parallel(os.path.join(config.online_eval_path, "test.lst"), local_list_eval)
        ckpt_path = os.path.join("/cache", os.path.dirname(config.model))
        # ... (evaluation loop elided)


class Metric:
    """
    for metric
    """

    def __init__(self):
        self.epsilon = 1e-4
        self.beta = 0.3
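The MaxF / MAE / S numbers reported in the README tables are the maximum F-measure, the mean absolute error, and the S-measure. A sketch of the first two is below; beta here plays the role of β² in the F-measure, matching the 0.3 stored by this Metric class, and the repository's implementation remains authoritative:

```python
# Illustrative computation of MAE and max F-measure for one saliency map.
import numpy as np

def mae(pred, gt):
    # mean absolute error; pred and gt are arrays in [0, 1]
    return np.abs(pred - gt).mean()

def max_f_measure(pred, gt, beta=0.3, eps=1e-4):
    # sweep binarization thresholds and keep the best F score
    best = 0.0
    for t in np.linspace(0, 1, 256):
        binary = pred >= t
        tp = (binary & (gt > 0.5)).sum()
        precision = tp / (binary.sum() + eps)
        recall = tp / ((gt > 0.5).sum() + eps)
        f = (1 + beta) * precision * recall / (beta * precision + recall + eps)
        best = max(best, f)
    return best
```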
pretrained_model_convert/pth_to_msp.py:
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import os.path
import argparse

import vgg_pth
import resnet_pth
import torch.nn
import vgg_msp
import resnet_msp
import mindspore.nn


def convert_vgg(pretrained_file, result):
    vgg16_pth = vgg_pth.vgg16()
    if torch.cuda.is_available():
        vgg16_pth.load_state_dict(torch.load(pretrained_file))
    else:
        vgg16_pth.load_state_dict(torch.load(pretrained_file, map_location=torch.device("cpu")))
    vgg16_msp = vgg_msp.vgg16()
    # assumes both networks enumerate their weights and biases in the same order
    for p_pth, p_msp in zip(vgg16_pth.parameters(), vgg16_msp.get_parameters()):
        p_msp.set_data(mindspore.Tensor(p_pth.detach().numpy()))
    mindspore.save_checkpoint(vgg16_msp, result)


def convert_resnet(pretrained_file, result):
    resnet50_pth = resnet_pth.resnet50()
    resnet50_msp = resnet_msp.resnet50()
    if torch.cuda.is_available():
        resnet50_pth.load_state_dict(torch.load(pretrained_file), strict=False)
    else:
        resnet50_pth.load_state_dict(torch.load(pretrained_file, map_location=torch.device("cpu")), strict=False)
    p_pth_list = list()
    for p_pth in resnet50_pth.parameters():
        p_pth_list.append(p_pth.cpu().detach().numpy())
    # PyTorch stores BN running statistics as buffers, not parameters, so collect them separately
    bn_list = list()
    for m in resnet50_pth.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            bn_list.append(m.running_mean.cpu().numpy())
            bn_list.append(m.running_var.cpu().numpy())
    p_index = 0
    bn_index = 0
    # MindSpore exposes the BN statistics as parameters named "moving_*"
    for n_msp, p_msp in resnet50_msp.parameters_and_names():
        if "moving_" not in n_msp:
            p_msp.set_data(mindspore.Tensor(p_pth_list[p_index]))
            p_index += 1
        else:
            p_msp.set_data(mindspore.Tensor(bn_list[bn_index]))
            bn_index += 1
    mindspore.save_checkpoint(resnet50_msp, result)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", choices=["vgg", "resnet"], type=str)
    parser.add_argument("--pth_file", type=str, default="vgg16_20M.pth", help="input pth file")
    parser.add_argument("--msp_file", type=str, default="vgg16_pretrained.ckpt", help="output msp file")
    args = parser.parse_args()
    if not os.path.exists(args.pth_file):
        raise FileNotFoundError(args.pth_file)
    if args.model == "vgg":
        convert_vgg(args.pth_file, args.msp_file)
    elif args.model == "resnet":
        convert_resnet(args.pth_file, args.msp_file)
    else:
        print("unknown model")
    print("success")
pretrained_model_convert/resnet_msp.py:

# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Resnet model define"""
import mindspore.nn as nn
from mindspore import load_checkpoint
affine_par = True
def conv3x3(in_planes, out_planes, stride=1):
return nn.Conv2d(in_planes, out_planes, kernel_size=3, padding="same", stride=stride, has_bias=False)
class Bottleneck(nn.Cell):
"""
Bottleneck layer
"""
expansion = 4
def __init__(self, in_planes, planes, stride=1, dilation_=1, downsample=None):
super(Bottleneck, self).__init__()
self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, stride=stride, has_bias=False)
self.bn1 = nn.BatchNorm2d(planes, affine=affine_par, use_batch_statistics=False)
for i in self.bn1.get_parameters():
i.requires_grad = False
padding = 1
if dilation_ == 2:
padding = 2
elif dilation_ == 4:
padding = 4
self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=padding, pad_mode="pad", has_bias=False,
dilation=dilation_)
self.bn2 = nn.BatchNorm2d(planes, affine=affine_par, use_batch_statistics=False)
for i in self.bn2.get_parameters():
i.requires_grad = False
self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, has_bias=False)
self.bn3 = nn.BatchNorm2d(planes * 4, affine=affine_par, use_batch_statistics=False)
for i in self.bn3.get_parameters():
i.requires_grad = False
self.relu = nn.ReLU()
self.downsample = downsample
self.stride = stride
def construct(self, x):
"""
forword
"""
residual = x
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.conv2(out)
out = self.bn2(out)
out = self.relu(out)
out = self.conv3(out)
out = self.bn3(out)
if self.downsample is not None:
residual = self.downsample(x)
out += residual
out = self.relu(out)
return out
class ResNet(nn.Cell):
"""
resnet
"""
def __init__(self, block, layers):
self.in_planes = 64
super(ResNet, self).__init__()
self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, pad_mode="pad",
has_bias=False)
self.bn1 = nn.BatchNorm2d(64, affine=affine_par, use_batch_statistics=False)
for i in self.bn1.get_parameters():
i.requires_grad = False
self.relu = nn.ReLU()
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="same") # change
self.layer1 = self._make_layer(block, 64, layers[0])
self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
self.layer4 = self._make_layer(block, 512, layers[3], stride=1, dilation=2)
def _make_layer(self, block, planes, blocks, stride=1, dilation=1):
"""
make layer
"""
downsample = None
if stride != 1 or self.in_planes != planes * block.expansion or dilation == 2 or dilation == 4:
downsample = nn.SequentialCell(
nn.Conv2d(self.in_planes, planes * block.expansion,
kernel_size=1, stride=stride, has_bias=False),
nn.BatchNorm2d(planes * block.expansion, affine=affine_par, use_batch_statistics=False),
)
for i in downsample[1].get_parameters():
i.requires_grad = False
layers = [block(self.in_planes, planes, stride, dilation_=dilation, downsample=downsample)]
self.in_planes = planes * block.expansion
for i in range(1, blocks):
layers.append(block(self.in_planes, planes, dilation_=dilation))
return nn.SequentialCell(*layers)
def load_pretrained_model(self, model_file):
"""
load pretrained model
"""
load_checkpoint(model_file, net=self)
def construct(self, x):
"""
forward
"""
tmp_x = []
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
tmp_x.append(x)
x = self.maxpool(x)
x = self.layer1(x)
tmp_x.append(x)
x = self.layer2(x)
tmp_x.append(x)
x = self.layer3(x)
tmp_x.append(x)
x = self.layer4(x)
tmp_x.append(x)
return tmp_x
# adding prefix "base" to parameter names for load_checkpoint().
class Tmp(nn.Cell):
def __init__(self, base):
super(Tmp, self).__init__()
self.base = base
def resnet50():
base = ResNet(Bottleneck, [3, 4, 6, 3])
return Tmp(base)
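A quick smoke test of this definition (illustrative only, not part of the repository file). Because of the Tmp wrapper, parameters are named base.*, lining up with the converted checkpoints; the forward pass returns five intermediate feature maps:

```python
# Hypothetical usage of the MindSpore ResNet wrapper defined above.
import numpy as np
import mindspore

net = resnet50()
x = mindspore.Tensor(np.zeros((1, 3, 200, 200), np.float32))
feats = net.base(x)  # five feature maps: stem output plus layer1-layer4
print([f.shape for f in feats])
```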
pretrained_model_convert/resnet_pth.py:

# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import torch.nn as nn

affine_par = True


def conv3x3(in_planes, out_planes, stride=1):
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, dilation_=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, stride=stride, bias=False)  # change
        self.bn1 = nn.BatchNorm2d(planes, affine=affine_par)
        for i in self.bn1.parameters():
            i.requires_grad = False
        padding = 1
        if dilation_ == 2:
            padding = 2
        elif dilation_ == 4:
            padding = 4
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1,  # change
                               padding=padding, bias=False, dilation=dilation_)
        self.bn2 = nn.BatchNorm2d(planes, affine=affine_par)
        for i in self.bn2.parameters():
            i.requires_grad = False
        self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * 4, affine=affine_par)
        for i in self.bn3.parameters():
            i.requires_grad = False
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)
        out = self.conv3(out)
        out = self.bn3(out)
        if self.downsample is not None:
            residual = self.downsample(x)
        out += residual
        out = self.relu(out)
        return out


class ResNet(nn.Module):
    def __init__(self, block, layers):
        self.inplanes = 64
        super(ResNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64, affine=affine_par)
        for i in self.bn1.parameters():
            i.requires_grad = False
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=1, dilation__=2)
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                m.weight.data.normal_(0, 0.01)
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1, dilation__=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion or dilation__ == 2 or dilation__ == 4:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion, affine=affine_par),
            )
            for i in downsample[1].parameters():
                i.requires_grad = False
        layers = []
        layers.append(block(self.inplanes, planes, stride, dilation_=dilation__, downsample=downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes, dilation_=dilation__))
        return nn.Sequential(*layers)

    def forward(self, x):
        tmp_x = []
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        tmp_x.append(x)
        x = self.maxpool(x)
        x = self.layer1(x)
        tmp_x.append(x)
        x = self.layer2(x)
        tmp_x.append(x)
        x = self.layer3(x)
        tmp_x.append(x)
        x = self.layer4(x)
        tmp_x.append(x)
        return tmp_x


def resnet50():
    model = ResNet(Bottleneck, [3, 4, 6, 3])
    return model
pretrained_model_convert/vgg_msp.py:

# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import mindspore.nn as nn
import mindspore


def vgg(cfg, i, batch_norm=False):
    """Make stage network of VGG."""
    layers = []
    in_channels = i
    stage = 1
    pad = nn.Pad(((0, 0), (0, 0), (1, 1), (1, 1))).to_float(mindspore.dtype.float32)
    for v in cfg:
        if v == "M":
            stage += 1
            layers += [pad, nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="valid")]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, pad_mode="pad", padding=1, has_bias=True)
            if batch_norm:
                layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU()]
            else:
                layers += [conv2d, nn.ReLU()]
            in_channels = v
    return layers


# adding prefix "base" to parameter names for load_checkpoint().
class Tmp(nn.Cell):
    def __init__(self, base):
        super(Tmp, self).__init__()
        self.base = base


def vgg16():
    cfg = {'tun': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
           'tun_ex': [512, 512, 512]}
    base = nn.CellList(vgg(cfg['tun'], 3))
    base = Tmp(base)   # first wrap: parameter names become "base.*"
    return Tmp(base)   # second wrap: parameter names become "base.base.*"
pretrained_model_convert/vgg_pth.py:

# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import torch
import torch.nn as nn


def vgg(cfg, i, batch_norm=False):
    layers = []
    in_channels = i
    stage = 1
    for v in cfg:
        if v == 'M':
            stage += 1
            if stage == 6:
                layers += [nn.MaxPool2d(kernel_size=3, stride=2, padding=1)]
            else:
                layers += [nn.MaxPool2d(kernel_size=3, stride=2, padding=1)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            if batch_norm:
                layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
            else:
                layers += [conv2d, nn.ReLU(inplace=True)]
            in_channels = v
    return layers


def vgg16():
    cfg = {'tun': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
           'tun_ex': [512, 512, 512]}
    return torch.nn.ModuleList(vgg(cfg['tun'], 3))
requirements.txt:

opencv
pyyaml
pytorch
pandas
Pillow
sal2edge.py:

# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# ... (remainder of the license header and the imports elided)


def sal2edge_one(image_file, output_file):
    """
    process one image
    """
    if not os.path.exists(image_file):
        print("file not exist:", image_file)
        return
    image = cv2.imread(image_file, cv2.IMREAD_GRAYSCALE)
    b_image = image > 128
    b_image = b_image.astype(np.float64)
    dx, dy = np.gradient(b_image)
    # ... (edge map construction and write elided)


def sal2edge(data_root, output_path, image_list_file):
    # ... (docstring and input existence checks elided; they return early on failure)
    image_list = np.loadtxt(image_list_file, str)
    file_list = []
    ext = ".png"
    for image in image_list:
        file_list.append(image[:-4])
    pair_file = open(data_root + "/../train_pair_edge.lst", "w")
    with futures.ThreadPoolExecutor(max_workers=os.cpu_count()) as tp:
        all_task = []
        for file in file_list:
            img_path = os.path.join(data_root, file + ext)
            result_path = os.path.join(output_path, file + "_edge" + ext)
            all_task.append(tp.submit(sal2edge_one, img_path, result_path))
            pair_file.write(f"DUTS-TR-Image/{file}.jpg DUTS-TR-Mask/{file}.png DUTS-TR-Mask/{file}_edge.png\n")
        futures.wait(all_task)
    pair_file.close()
    print("all done!")
scripts/convert_model.sh:
#!/bin/bash
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
# If the number of arguments differs from what is required, print usage information
if [ $# != 3 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash convert_model.sh [MODEL_NAME] [PTH_FILE] [MSP_FILE]"
echo "for example: bash convert_model.sh /weights/vgg16.pth ./weights/vgg16.ckpt"
echo "================================================================================================================="
exit 1
fi
# Get absolute path
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
# Get current script path
BASE_PATH=$(cd "`dirname $0`" || exit; pwd)
MODEL_NAME=$1
PTH_FILE=$(get_real_path $2)
MSP_FILE=$(get_real_path $3)
cd $BASE_PATH/..
python pretrained_model_convert/pth_to_msp.py \
--model=$MODEL_NAME \
--pth_file="$PTH_FILE" \
--msp_file="$MSP_FILE"
scripts/dataset_preprocess.sh:

#!/bin/bash
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
# If the number of arguments differs from what is required, print usage information
if [ $# != 2 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash data_crop.sh [DATA_ROOT] [OUTPUT_ROOT]"
echo "for example: bash data_crop.sh /data/ ./data_crop/"
echo "================================================================================================================="
exit 1
fi
# Get absolute path
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
# Get current script path
BASE_PATH=$(cd "`dirname $0`" || exit; pwd)
DATA_ROOT=$(get_real_path $1)
OUTPUT_ROOT=$(get_real_path $2)
cd $DATA_ROOT
mkdir tmp
TMP_ROOT=$DATA_ROOT/tmp
# DUT-OMRON
mkdir $TMP_ROOT/DUT-OMRON
cp -r DUT-OMRON-image/DUT-OMRON-image $TMP_ROOT/DUT-OMRON/images
# ground_truth_mask
cp -r DUT-OMRON-image/pixelwiseGT-new-PNG $TMP_ROOT/DUT-OMRON/ground_truth_mask
# ECSSD nothing
#HKU-IS
mkdir $TMP_ROOT/HKU-IS
cp -r HKU-IS/imgs $TMP_ROOT/HKU-IS/images
cp -r HKU-IS/gt $TMP_ROOT/HKU-IS/ground_truth_mask
#PASCAL-S
mkdir $TMP_ROOT/PASCAL-S
mkdir $TMP_ROOT/PASCAL-S/ground_truth_mask
mkdir $TMP_ROOT/PASCAL-S/images
cp PASCAL-S/Imgs/*.png $TMP_ROOT/PASCAL-S/ground_truth_mask
cp PASCAL-S/Imgs/*.jpg $TMP_ROOT/PASCAL-S/images
# SOD
mkdir $TMP_ROOT/SOD
mkdir $TMP_ROOT/SOD/ground_truth_mask
mkdir $TMP_ROOT/SOD/images
cp SOD/Imgs/*.png $TMP_ROOT/SOD/ground_truth_mask/
cp SOD/Imgs/*.jpg $TMP_ROOT/SOD/images/
cd $BASE_PATH/..
python data_crop.py --data_name=ECSSD --data_root="$DATA_ROOT" --output_path="$OUTPUT_ROOT"
python data_crop.py --data_name=SOD --data_root="$TMP_ROOT" --output_path="$OUTPUT_ROOT"
python data_crop.py --data_name=DUT-OMRON --data_root="$TMP_ROOT" --output_path="$OUTPUT_ROOT"
python data_crop.py --data_name=PASCAL-S --data_root="$TMP_ROOT" --output_path="$OUTPUT_ROOT"
python data_crop.py --data_name=HKU-IS --data_root="$TMP_ROOT" --output_path="$OUTPUT_ROOT"
python data_crop.py --data_name=DUTS-TE --data_root="$DATA_ROOT" --output_path="$OUTPUT_ROOT"
python data_crop.py --data_name=DUTS-TR --data_root="$DATA_ROOT" --output_path="$OUTPUT_ROOT"
# prevent wrong path
if [ -d $TMP_ROOT/SOD ]; then
rm -rf $TMP_ROOT
fi
python sal2edge.py --data_root="$OUTPUT_ROOT/DUTS-TR/DUTS-TR-Mask/" --output_path="$OUTPUT_ROOT/DUTS-TR/DUTS-TR-Mask/" --image_list_file="$OUTPUT_ROOT/DUTS-TR/test.lst"
scripts/run_distribute_train.sh (fragment):

# ... (license header and per-rank setup loop elided)
python -u ./train.py --is_distributed True > train.log 2>&1 &
cd ../
done
scripts/run_distribute_train_gpu.sh:

#!/bin/bash
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
# If the number of arguments differs from what is required, print usage information
if [ $# != 2 ]
then
echo "=============================================================================================================="
echo "Please run the script as: "
echo "bash run_distributed_train_gpu.sh [DEVICE_NUM] [USED_DEVICES]"
echo "for example: bash run_distributed_train_gpu.sh 8 0,1,2,3,4,5,6,7"
echo "================================================================================================================="
exit 1
fi
# Get absolute path
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
# Get current script path
BASE_PATH=$(cd "`dirname $0`" || exit; pwd)
export RANK_SIZE=$1
export CUDA_VISIBLE_DEVICES=$2
cd $BASE_PATH/..
mpirun -n $RANK_SIZE --allow-run-as-root \
python -u train.py --device_target=GPU --is_distributed True &> distribute_train.log &
echo "The train log is at ../distribute_train.log."
scripts/run_eval.sh (fragment):

# ... (license header elided)
# limitations under the License.
# ============================================================================
cd ..
python eval.py --device_target=Ascend \
               --test_fold='./result/ECSSD' \
--test_fold='./result/ECSSD' \
--model='./EGNet/run-nnet/models/final_resnet_bone.ckpt' \
--sal_mode=e \
--base_model=resnet >test_e.log
python eval.py --device_target=Ascend \
               --test_fold='./result/PASCAL-S' \
--test_fold='./result/PASCAL-S' \
--model='./EGNet/run-nnet/models/final_resnet_bone.ckpt' \
--sal_mode=p \
--base_model=resnet >test_p.log
python eval.py --device_target=Ascend \
               --test_fold='./result/DUT-OMRON' \
--test_fold='./result/DUT-OMRON' \
--model='./EGNet/run-nnet/models/final_resnet_bone.ckpt' \
--sal_mode=d \
--base_model=resnet >test_d.log
python eval.py --device_target=Ascend \
               --test_fold='./result/HKU-IS' \
--test_fold='./result/HKU-IS' \
--model='./EGNet/run-nnet/models/final_resnet_bone.ckpt' \
--sal_mode=h \
--base_model=resnet >test_h.log
python eval.py --device_target=Ascend \
               --test_fold='./result/SOD' \
--test_fold='./result/SOD' \
--model='./EGNet/run-nnet/models/final_resnet_bone.ckpt' \
--sal_mode=s \
--base_model=resnet >test_s.log
python eval.py --device_target=Ascend \
               --test_fold='./result/DUTS-TE' \
--test_fold='./result/DUTS-TE' \
--model='./EGNet/run-nnet/models/final_resnet_bone.ckpt' \
--sal_mode=t \
--base_model=resnet >test_t.log
scripts/run_eval_gpu.sh:

#!/bin/bash
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
# Get absolute path
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
# Get current script path
BASE_PATH=$(cd "`dirname $0`" || exit; pwd)
cd $BASE_PATH/..
echo "evalating ECSSD"
python eval.py --device_target=GPU \
--test_fold='./result/ECSSD' \
--sal_mode=e >test_e.log
echo "evalating PASCAL-S"
python eval.py --device_target=GPU \
--test_fold='./result/PASCAL-S' \
--sal_mode=p >test_p.log
echo "evalating DUT-OMRON"
python eval.py --device_target=GPU \
--test_fold='./result/DUT-OMRON' \
--sal_mode=d >test_d.log
echo "evalating HKU-IS"
python eval.py --device_target=GPU \
--test_fold='./result/HKU-IS' \
--sal_mode=h >test_h.log
echo "evalating SOD"
python eval.py --device_target=GPU \
--test_fold='./result/SOD' \
--sal_mode=s >test_s.log
echo "evalating DUTS-TE"
python eval.py --device_target=GPU \
--test_fold='./result/DUTS-TE' \
--sal_mode=t >test_t.log
scripts/run_standalone_train.sh (fragment):

# ... (license header elided)
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
cd ..
python train.py --device_target=Ascend --base_model=vgg >train.log
scripts/run_standalone_train_gpu.sh:

#!/bin/bash
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
# Get absolute path
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
# Get current script path
BASE_PATH=$(cd "`dirname $0`" || exit; pwd)
cd $BASE_PATH/..
python train.py --device_target=GPU &>train.log &
echo "The train log is at ../train.log."
src/dataset.py:

# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# ... (remainder of the license header and standard imports elided)
import os

import cv2
import numpy as np
from model_utils.config import base_config
from mindspore.dataset import GeneratorDataset
from mindspore.communication.management import get_rank, get_group_size

if base_config.train_online:
    import moxing as mox
    mox.file.shift('os', 'mox')


class ImageDataTrain:
    """
    training dataset
    """

    def __init__(self, train_path=""):
        self.sal_root = train_path
        self.sal_source = os.path.join(train_path, "train_pair_edge.lst")
        # ... (remainder of ImageDataTrain elided)


class ImageDataTest:
    """
    test dataset
    """

    def __init__(self, test_mode=1, sal_mode="e", test_path="", test_fold=""):
        if test_mode == 1:
            if sal_mode == "e":
                ...  # (per-dataset path setup elided; one branch per sal_mode)
        # ... (remaining initialization elided)

    def __getitem__(self, item):
        image, _ = load_image_test(os.path.join(self.image_root, self.image_list[item]))
        label = load_sal_label(os.path.join(self.test_root, self.image_list[item][0:-4] + ".png"))
        return image, label, item % self.image_num

    def save_folder(self):
        ...  # (body elided)


# get the dataloader (Note: without data augmentation, except saliency with random flip)
def create_dataset(batch_size, mode="train", num_thread=1, test_mode=1, sal_mode="e", train_path="", test_path="",
                   test_fold="", is_distributed=False, rank_id=0, rank_size=1):
    """
    create dataset
    """
    # ... (GeneratorDataset construction and sharding elided)
    return ds.batch(batch_size, drop_remainder=drop_remainder, num_parallel_workers=num_thread), dataset


def save_img(img, path, is_distributed=False):
    # in distributed runs, only rank 0 writes images
    if is_distributed and get_rank() != 0:
        return
    range_ = np.max(img) - np.min(img)
    img = (img - np.min(img)) / range_
    img = img * 255 + 0.5
    # ... (clipping and write elided)


def load_image(pah):
    if not os.path.exists(pah):
        print("File Not Exists,", pah)
    im = cv2.imread(pah)
    in_ = np.array(im, dtype=np.float32)
    in_ -= np.array((104.00699, 116.66877, 122.67892))
    # ... (remainder elided)


def load_image_test(pah):
    if ...:  # (extension check elided)
        pah = pah + ".png"
    else:
        pah = pah + ".jpg"
    if not os.path.exists(pah):
        print("File Not Exists")
    im = cv2.imread(pah)
    # ... (remainder elided)
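A hypothetical call into create_dataset, with argument values taken from default_config.yaml (train.py is the authoritative caller):

```python
# Illustrative only: build the training pipeline and count batches per epoch.
ds, dataset = create_dataset(batch_size=1, mode="train", num_thread=4,
                             train_path="./data/DUTS-TR/")
print(ds.get_dataset_size())  # number of batches per epoch
```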
......
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment