Skip to content
Snippets Groups Projects
Commit 7bcb2ddd authored by i-robot's avatar i-robot Committed by Gitee
Browse files

!631 ssd_resnet34

Merge pull request !631 from 嘎嘎皮5.0/master
parents e24de128 5813dc27
No related branches found
No related tags found
No related merge requests found
Showing
with 2119 additions and 0 deletions
# Contents
- [Contents](#contents)
- [SSD Description](#ssd-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Prepare the model](#prepare-the-model)
- [Run the scripts](#run-the-scripts)
- [Script Description](#script-description)
- [Script and Sample Code](#script-and-sample-code)
- [Script Parameters](#script-parameters)
- [Training Process](#training-process)
- [Training on Ascend](#training-on-ascend)
- [Evaluation Process](#evaluation-process)
- [Evaluation on Ascend](#evaluation-on-ascend)
- [Performance](#performance)
- [Export Process](#Export-process)
- [Export](#Export)
- [Inference Process](#Inference-process)
- [Inference](#Inference)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)
## [SSD Description](#contents)
SSD discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape.Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes.
[Paper](https://arxiv.org/abs/1512.02325): Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg.European Conference on Computer Vision (ECCV), 2016 (In press).
## [Model Architecture](#contents)
The SSD approach is based on a feed-forward convolutional network that produces a fixed-size collection of bounding boxes and scores for the presence of object class instances in those boxes, followed by a non-maximum suppression step to produce the final detections. The early network layers are based on a standard architecture used for high quality image classification, which is called the base network. Then add auxiliary structure to the network to produce detections.
We present four different base architecture.
- **ssd300**, reference from the paper. Using mobilenetv2 as backbone and the same bbox predictor as the paper present.
- ***ssd-mobilenet-v1-fpn**, using mobilenet-v1 and FPN as feature extractor with weight-shared box predcitors.
- ***ssd-resnet50-fpn**, using resnet50 and FPN as feature extractor with weight-shared box predcitors.
- **ssd-vgg16**, reference from the paper. Using vgg16 as backbone and the same bbox predictor as the paper present.
- **ssd-resnet34**,reference from the paper. Using resnet34 as backbone and the same bbox predictor as the paper present.
## [Dataset](#contents)
Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below.
Dataset used: [COCO2017](<http://images.cocodataset.org/>)
- Dataset size:19G
- Train:18G,118000 images
- Val:1G,5000 images
- Annotations:241M,instances,captions,person_keypoints etc
- Data format:image and json files
- Note:Data will be processed in dataset.py
## [Environment Requirements](#contents)
- Install [MindSpore](https://www.mindspore.cn/install/en).
- Download the dataset COCO2017.
- We use COCO2017 as training dataset in this example by default, and you can also use your own datasets.
First, install Cython ,pycocotool and opencv to process data and to get evaluation result.
```shell
pip install Cython
pip install pycocotools
pip install opencv-python
```
1. If coco dataset is used. **Select dataset to coco when run script.**
Change the `coco_root` and other settings you need in `src/config_xxx.py`. The directory structure is as follows:
```shell
.
└─coco_dataset
├─annotations
├─instance_train2017.json
└─instance_val2017.json
├─val2017
└─train2017
```
2. If VOC dataset is used. **Select dataset to voc when run script.**
Change `classes`, `num_classes`, `voc_json` and `voc_root` in `src/config_xxx.py`. `voc_json` is the path of json file with coco format for evaluation, `voc_root` is the path of VOC dataset, the directory structure is as follows:
```shell
.
└─voc_dataset
└─train
├─0001.jpg
└─0001.xml
...
├─xxxx.jpg
└─xxxx.xml
└─eval
├─0001.jpg
└─0001.xml
...
├─xxxx.jpg
└─xxxx.xml
```
3. If your own dataset is used. **Select dataset to other when run script.**
Organize the dataset information into a TXT file, each row in the file is as follows:
```shell
train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
```
Each row is an image annotation which split by space, the first column is a relative path of image, the others are box and class infomations of the format [xmin,ymin,xmax,ymax,class]. We read image from an image path joined by the `image_dir`(dataset directory) and the relative path in `anno_path`(the TXT file path), `image_dir` and `anno_path` are setting in `src/config_xxx.py`.
## [Quick Start](#contents)
### Prepare the model
1. Chose the model by changing the `using_model` in `src/config.py`. The optional models are: `ssd300`, `ssd_mobilenet_v1_fpn`, `ssd_vgg16`, `ssd_resnet50_fpn`,`ssd_resnet34`
2. Change the dataset config in the corresponding config. `src/config_xxx.py`, `xxx` is the corresponding backbone network name
3. If you are running with `ssd_mobilenet_v1_fpn` , `ssd_resnet50_fpn`or , `ssd_resnet34` , you need a pretrained model for `mobilenet_v1` ,`resnet50`or `resnet34`. Set the checkpoint path to `feature_extractor_base_param` in `src/config_xxx.py`. For more detail about training pre-trained model, please refer to the corresponding backbone network.
### Run the scripts
After installing MindSpore via the official website, you can start training and evaluation as follows:
- running on Ascend
```shell
# distributed training on Ascend
sh scripts/run_distribute_train.sh [RANK_TABLE_FILE] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH][PRE_TRAINED_PATH](optional)
# run eval on Ascend
sh scripts/run_eval.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [CHECKPOINT_PATH] [MINDRECORD_PATH]
# run inference on Ascend310, MINDIR_PATH is the mindir model which you can export from checkpoint using export.py
bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DEVICE_ID]
```
## [Script Description](#contents)
### [Script and Sample Code](#contents)
```shell
└─ ssd_resnet34
├─ ascend310_infer
├─ scripts
├─ run_distribute_train.sh ## shell script for distributed on ascend 910
├─ run_eval.sh ## shell script for eval on ascend 910
├─ run_infer_310.sh ## shell script for eval on ascend 310
└─ run_standalone_train.sh ## shell script for standalone on ascend 910
├─ src
├─ __init__.py ## init file
├─ anchor_generator.py ## anchor generator
├─ box_util.py ## bbox utils
├─ callback.py ## the callback of train and evaluation
├─ config.py ## total config
├─ config_ssd_resnet34.py ## ssd_resnet34 config
├─ dataset.py ## create dataset and process dataset
├─ eval_utils.py ## eval utils
├─ lr_schedule.py ## learning ratio generator
├─ init_params.py ## parameters utils
├─ resnet34.py ## resnet34 architecture
├─ ssd.py ## ssd architecture
└─ ssd_resnet34.py ## ssd_resnet34 architecture
├─ eval.py ## eval script
├─ export.py ## export mindir script
├─ postprocess.py ## eval on ascend 310
├─ README.md ## English descriptions about SSD
├─ README_CN.md ## Chinese descriptions about SSD
├─ requirements.txt ## Requirements
└─ train.py ## train script
```
### [Script Parameters](#contents)
```shell
Major parameters in train.py and config.py as follows:
"device_num": 1 # Use device nums
"lr": 0.075 # Learning rate init value
"dataset": coco # Dataset name
"epoch_size": 500 # Epoch size
"batch_size": 32 # Batch size of input tensor
"pre_trained": None # Pretrained checkpoint file path
"pre_trained_epoch_size": 0 # Pretrained epoch size
"save_checkpoint_epochs": 10 # The epoch interval between two checkpoints. By default, the checkpoint will be saved per 10 epochs
"loss_scale": 1024 # Loss scale
"filter_weight": False # Load parameters in head layer or not. If the class numbers of train dataset is different from the class numbers in pre_trained checkpoint, please set True.
"freeze_layer": "none" # Freeze the backbone parameters or not, support none and backbone.
"class_num": 81 # Dataset class number
"image_shape": [300, 300] # Image height and width used as input to the model
"mindrecord_dir": "/data/MindRecord_COCO" # MindRecord path
"coco_root": "/data/coco2017" # COCO2017 dataset path
"voc_root": "/data/voc_dataset" # VOC original dataset path
"voc_json": "annotations/voc_instances_val.json" # is the path of json file with coco format for evaluation
"image_dir": "" # Other dataset image path, if coco or voc used, it will be useless
"anno_path": "" # Other dataset annotation path, if coco or voc used, it will be useless
```
### [Training Process](#contents)
To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/convert_dataset.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.**
#### Training on Ascend
- Distribute mode
```shell
sh scripts/run_distribute_train.sh [RANK_TABLE_FILE] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH][PRE_TRAINED_PATH](optional)
```
- Standalone training
```shell
sh scripts/run_standalone_train.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH][PRE_TRAINED_PATH](optional)
```
We need five or six parameters for this scripts.
- `RANK_TABLE_FILE :` the path of [rank_table.json](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools), it is better to use absolute path.
- `DATASET`:the dataset mode for distributed train.
- `DATASET_PATH`:the dataset path for distributed train.
- `MINDRECIRD_PATH`:the mindrecord path for distributed train.
- `TRAIN_OUT_PATH`:the output path of train for distributed train.
- `PRE_TRAINED_PATH :` the path of pretrained checkpoint file, it is better to use absolute path.
Training result will be stored in the train path, whose folder name "log". Under this, you can find checkpoint file together with result like the followings in log
```shell
epoch: 1 step: 458, loss is 4.185711
epoch time: 138740.569 ms, per step time: 302.927 ms
epoch: 2 step: 458, loss is 4.3121023
epoch time: 47116.166 ms, per step time: 102.874 ms
epoch: 3 step: 458, loss is 3.2209284
epoch time: 47149.108 ms, per step time: 102.946 ms
epoch: 4 step: 458, loss is 3.5159926
epoch time: 47174.645 ms, per step time: 103.001 ms
...
epoch: 497 step: 458, loss is 1.0916114
epoch time: 47164.002 ms, per step time: 102.978 ms
epoch: 498 step: 458, loss is 1.157409
epoch time: 47172.836 ms, per step time: 102.997 ms
epoch: 499 step: 458, loss is 1.2065268
epoch time: 47155.245 ms, per step time: 102.959 ms
epoch: 500 step: 458, loss is 1.1856415
epoch time: 47666.430 ms, per step time: 104.075 ms
```
#### Transfer Training
You can train your own model based on either pretrained classification model or pretrained detection model. You can perform transfer training by following steps.
1. Convert your own dataset to COCO or VOC style. Otherwise you have to add your own data preprocess code.
2. Change config_xxx.py according to your own dataset, especially the `num_classes`.
3. Prepare a pretrained checkpoint. You can load the pretrained checkpoint by `pre_trained` argument. Transfer training means a new training job, so just keep `pre_trained_epoch_size` same as default value `0`.
4. Set argument `filter_weight` to `True` while calling `train.py`, this will filter the final detection box weight from the pretrained model.
5. Build your own bash scripts using new config and arguments for further convenient.
### [Evaluation Process](#contents)
#### Evaluation on Ascend
```shell
sh scripts/run_eval.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [CHECKPOINT_PATH] [MINDRECORD_PATH]
```
We need five parameters for this scripts.
- `DEVICE_ID`: the device id for eval.
- `DATASET`:the dataset mode of evaluation dataset.
- `DATASET_PATH`:the dataset path for evaluation.
- `CHECKPOINT_PATH`: the absolute path for checkpoint file.
- `MINDRECIRD_PATH`:the mindrecord path for evaluation.
> checkpoint can be produced in training process.
Inference result will be stored in the eval path, whose folder name "log". Under this, you can find result like the followings in log.
```shell
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.240
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.360
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.258
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.016
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.229
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.446
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.256
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.389
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.427
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.077
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.439
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.734
========================================
mAP: 0.24011857000302622
```
## Inference Process
### [Export MindIR](#contents)
```shell
python export.py --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT]
```
The ckpt_file parameter is required,
`EXPORT_FORMAT` should be in ["AIR", "MINDIR"]
### Infer on Ascend310
Before performing inference, the mindir file must bu exported by `export.py` script. We only provide an example of inference using MINDIR model.
Current batch_Size can only be set to 1. The precision calculation process needs about 70G+ memory space, otherwise the process will be killed for execeeding memory limits.
```shell
# Ascend310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DVPP] [ANNO_FILE] [DEVICE_ID]
```
- `DVPP` is mandatory, and must choose from ["DVPP", "CPU"], it's case-insensitive. Note that the image shape of ssd_vgg16 inference is [300, 300], The DVPP hardware restricts width 16-alignment and height even-alignment. Therefore, the network needs to use the CPU operator to process images.
- `DEVICE_ID` is optional, default value is 0.
### result
Inference result is saved in current path, you can find result like this in acc.log file.
```bash
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.250
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.374
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.266
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.241
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.462
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.260
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.399
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.435
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.090
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.449
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.739
0.249879750926743
```
## [Model Description](#contents)
### [Performance](#contents)
#### Train Performance
| Parameters | Ascend |
| ------------------- | ----------------------------|
| Model Version | SSD_ResNet34 |
| Resource | Ascend 910;CPU 2.60GHz,192 cores;Memory 755 G|
| Uploaded Date | 08/31/2021 (month/day/year) |
| MindSpore Version | 1.3 |
| Dataset | COCO2017 |
| Training Parameters | epoch = 500, batch_size = 32|
| Optimizer | Momentum |
| Loss Function | Sigmoid Cross Entropy,SmoothL1Loss|
| Speed | 8pcs: 101ms/step |
| Total time | 8pcs: 8.34h |
#### Inference Performance
| Parameters | Ascend |
| ------------------- | --------------------------- |
| Model Version | SSD_ResNet34 |
| Resource | Ascend 910 |
| Uploaded Date | 08/31/2021 (month/day/year) |
| MindSpore Version | 1.2 |
| Dataset | COCO2017 |
| outputs | mAP |
| Accuracy | IoU=0.50: 24.0% |
| Model for inference | 98.77M(.ckpt file) |
## [Description of Random Situation](#contents)
In dataset.py, we set the seed inside “create_dataset" function. We also use random seed in train.py.
## [ModelZoo Homepage](#contents)
Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
\ No newline at end of file
# 目录
<!-- TOC -->
- [目录](#目录)
- [SSD说明](#ssd说明)
- [模型架构](#模型架构)
- [数据集](#数据集)
- [环境要求](#环境要求)
- [快速入门](#快速入门)
- [脚本说明](#脚本说明)
- [脚本及样例代码](#脚本及样例代码)
- [脚本参数](#脚本参数)
- [训练过程](#训练过程)
- [Ascend上训练](#ascend上训练)
- [评估过程](#评估过程)
- [Ascend处理器环境评估](#ascend处理器环境评估)
- [性能](#性能)
- [导出过程](#导出过程)
- [导出](#导出)
- [推理过程](#推理过程)
- [推理](#推理)
- [随机情况说明](#随机情况说明)
- [ModelZoo主页](#modelzoo主页)
<!-- /TOC -->
# SSD说明
SSD将边界框的输出空间离散成一组默认框,每个特征映射位置具有不同的纵横比和尺度。在预测时,网络对每个默认框中存在的对象类别进行评分,并对框进行调整以更好地匹配对象形状。此外,网络将多个不同分辨率的特征映射的预测组合在一起,自然处理各种大小的对象。
[论文](https://arxiv.org/abs/1512.02325): Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg.European Conference on Computer Vision (ECCV), 2016 (In press).
# 模型架构
SSD方法基于前向卷积网络,该网络产生固定大小的边界框集合,并针对这些框内存在的对象类实例进行评分,然后通过非极大值抑制步骤进行最终检测。早期的网络层基于高质量图像分类的标准体系结构,被称为基础网络。后来通过向网络添加辅助结构进行检测。
# 数据集
使用的数据集: [COCO2017](<http://images.cocodataset.org/>)
- 数据集大小:19 GB
- 训练集:18 GB,118000张图像
- 验证集:1 GB,5000张图像
- 标注:241 MB,实例,字幕,person_keypoints等
- 数据格式:图像和json文件
- 注意:数据在dataset.py中处理
# 环境要求
- 安装[MindSpore](https://www.mindspore.cn/install)
- 下载数据集COCO2017。
- 本示例默认使用COCO2017作为训练数据集,您也可以使用自己的数据集。
1. 如果使用coco数据集。**执行脚本时选择数据集coco。**
安装Cython和pycocotool,也可以安装mmcv进行数据处理。
```python
pip install Cython
pip install pycocotools
```
并在`config.py`中更改COCO_ROOT和其他您需要的设置。目录结构如下:
```text
.
└─cocodataset
├─annotations
├─instance_train2017.json
└─instance_val2017.json
├─val2017
└─train2017
```
2. 如果使用自己的数据集。**执行脚本时选择数据集为other。**
将数据集信息整理成TXT文件,每行如下:
```text
train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
```
每行是按空间分割的图像标注,第一列是图像的相对路径,其余为[xmin,ymin,xmax,ymax,class]格式的框和类信息。我们从`IMAGE_DIR`(数据集目录)和`ANNO_PATH`(TXT文件路径)的相对路径连接起来的图像路径中读取图像。在`config_ssd_resnet34.py`中设置`IMAGE_DIR`和`ANNO_PATH`。
# 快速入门
通过官方网站安装MindSpore后,您可以按照如下步骤进行训练和评估:
- Ascend处理器环境运行
```shell
# Ascend分布式训练
sh scripts/run_distribute_train.sh [RANK_TABLE_FILE] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH][PRE_TRAINED_PATH](optional)
```
```shell
# Ascend单卡训练
sh scripts/run_standalone_train.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH][PRE_TRAINED_PATH](optional)
```
```shell
# Ascend处理器环境运行eval
sh scripts/run_eval.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [CHECKPOINT_PATH] [MINDRECORD_PATH]
```
# 脚本说明
## 脚本及样例代码
```text
└─ ssd_resnet34
├─ ascend310_infer
├─ scripts
├─ run_distribute_train.sh ## Ascend 910分布式shell脚本
├─ run_standalone_train.sh ## Ascend 910单卡shell脚本
├─ run_infer_310.sh ## Ascend 310评估shell脚本
└─ run_eval.sh ## Ascend910 评估shell脚本
├─ src
├─ __init__.py ## 初始化文件
├─ anchor_generator.py ## anchor生成器
├─ box_util.py ## bbox工具
├─ callback.py ## 用于边训练边推理的callback
├─ config.py ## 总配置
├─ config_ssd_resnet34.py ## ssd_resnet34配置
├─ dataset.py ## 创建并处理数据集
├─ eval_utils.py ## eval工具
├─ init_params.py ## 参数工具
├─ lr_schedule.py ## 学习率生成器
├─ resnet34.py ## resnet34架构
├─ ssd.py ## SSD架构
└─ ssd_resnet34.py ## ssd_resnet34架构
├─ eval.py ## 评估脚本
├─ export.py ## 将checkpoint文件导出为mindir用于310推理
├─ postprocess.py ## Ascend 310 评估
├─ README.md ## SSD英文相关说明
├─ README_CN.md ## SSD中文相关说明
├─ requirements.txt ## 需求文档
└─ train.py ## 训练脚本
```
## 脚本参数
```text
train.py和config_ssd_resnet34.py中主要参数如下:
"device_num": 1 # 使用设备数量
"lr": 0.075 # 学习率初始值
"dataset": coco # 数据集名称
"epoch_size": 500 # 轮次大小
"batch_size": 32 # 输入张量的批次大小
"pre_trained": None # 预训练检查点文件路径
"pre_trained_epoch_size": 0 # 预训练轮次大小
"save_checkpoint_epochs": 10 # 两个检查点之间的轮次间隔。默认情况下,每10个轮次都会保存检查点。
"loss_scale": 1024 # 损失放大
"class_num": 81 # 数据集类数
"image_shape": [300, 300] # 作为模型输入的图像高和宽
"mindrecord_dir": "/data/MindRecord_COCO" # MindRecord路径
"coco_root": "/data/coco2017" # COCO2017数据集路径
"voc_root": "" # VOC原始数据集路径
"image_dir": "" # 其他数据集图片路径,如果使用coco或voc,此参数无效。
"anno_path": "" # 其他数据集标注路径,如果使用coco或voc,此参数无效。
```
## 训练过程
运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir``anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/convert_dataset.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。**
### Ascend上训练
- 分布式
```shell script
sh scripts/run_distribute_train.sh [RANK_TABLE_FILE] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH][PRE_TRAINED_PATH](optional)
```
此脚本需要五或六个参数。
- `RANK_TABLE_FILE`[rank_table.json](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools)的路径。最好使用绝对路径。
- `DATASET`:分布式训练的数据集模式。
- `DATASET_PATH`:分布式训练的数据集路径。
- `MINDRECORD_PATH`:分布式训练的mindrecord文件。
- `TRAIN_OUTPUT_PATH`:训练输出的检查点文件路径。最好使用绝对路径。
- `PRE_TRAINED_PATH`:预训练检查点文件的路径。最好使用绝对路径。
训练结果保存在train路径中,文件夹名为"log"。 您可在此文件夹中找到检查点文件以及结果,如下所示。
```text
epoch: 1 step: 458, loss is 4.185711
epoch time: 138740.569 ms, per step time: 302.927 ms
epoch: 2 step: 458, loss is 4.3121023
epoch time: 47116.166 ms, per step time: 102.874 ms
epoch: 3 step: 458, loss is 3.2209284
epoch time: 47149.108 ms, per step time: 102.946 ms
epoch: 4 step: 458, loss is 3.5159926
epoch time: 47174.645 ms, per step time: 103.001 ms
...
epoch: 497 step: 458, loss is 1.0916114
epoch time: 47164.002 ms, per step time: 102.978 ms
epoch: 498 step: 458, loss is 1.157409
epoch time: 47172.836 ms, per step time: 102.997 ms
epoch: 499 step: 458, loss is 1.2065268
epoch time: 47155.245 ms, per step time: 102.959 ms
epoch: 500 step: 458, loss is 1.1856415
epoch time: 47666.430 ms, per step time: 104.075 ms
```
## 评估过程
### Ascend处理器环境评估
```shell script
sh scripts/run_eval.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [CHECKPOINT_PATH] [MINDRECORD_PATH]
```
此脚本需要五个参数。
- `DEVICE_ID`: 评估的设备ID。
- `DATASET`:评估数据集的模式。
- `DATASET_PATH`:评估的数据集路径。
- `CHECKPOINT_PATH`:检查点文件的绝对路径。
- `MINDRECORD_PATH`:评估的mindrecord文件。
​ 推理结果保存在eval路径中,文件夹名为“log”。您可以在日志中找到类似以下的结果。
```text
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.240
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.360
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.258
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.016
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.229
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.446
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.256
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.389
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.427
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.077
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.439
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.734
========================================
mAP: 0.24011857000302622
```
## 推理过程
### [导出MindIR](#contents)
```shell
python export.py --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT]
```
参数ckpt_file为必填项,
`EXPORT_FORMAT` 必须在 ["AIR", "MINDIR"]中选择。
### 在Ascend310执行推理
在执行推理前,mindir文件必须通过`export.py`脚本导出。以下展示了使用minir模型执行推理的示例。
目前仅支持batch_Size为1的推理。精度计算过程需要70G+的内存,否则进程将会因为超出内存被系统终止。
```shell
# Ascend310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DVPP] [ANNO_FILE] [DEVICE_ID]
```
- `DVPP` 为必填项,需要在["DVPP", "CPU"]选择,大小写均可。需要注意的是ssd_vgg16执行推理的图片尺寸为[300, 300],由于DVPP硬件限制宽为16整除,高为2整除,因此,这个网络需要通过CPU算子对图像进行前处理。
- `DEVICE_ID` 可选,默认值为0。
### 结果
推理结果保存在脚本执行的当前路径,你可以在acc.log中看到以下精度计算结果。
```bash
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.250
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.374
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.266
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.241
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.462
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.260
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.399
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.435
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.090
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.449
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.739
0.249879750926743
```
## 模型描述
### 性能
#### 训练性能
| 参数 | Ascend |
| ------------- | ---------------- |
| 模型版本 | SSD_ResNet34 |
| 资源 | Ascend 910 CPU 2.60GHz,192 cores;Memory 755 G|
| 上传日期 | 2021-08-31 |
| MindSpore版本 | 1.3 |
| 数据集 | COCO2017 |
| 训练参数 | epoch:500 batch_size:32|
| 优化器 | Momentum |
| 损失函数 | Sigmoid Cross Entropy,SmoothL1Loss |
| 速度 | 101 ms/step |
| 总耗时 | 8.34小时 |
#### 推理性能
| 参数 | Ascend |
| ------------- | ---------------- |
| 模型版本 | SSD_ResNet34 |
| 资源 | Ascend 910 |
| 上传日期 | 2021-08-31 |
| MindSpore版本 | 1.3 |
| 数据集 | COCO2017 |
| 输出 | mAP |
| 准确率 | IoU=0.50: 24.0% |
| 推理模型 | 98.77M(.ckpt文件) |
## 随机情况说明
dataset.py中设置了“create_dataset”函数内的种子,同时还使用了train.py中的随机种子。
## ModelZoo主页
请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)
\ No newline at end of file
cmake_minimum_required(VERSION 3.14.1)
project(Ascend310Infer)
add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O0 -g -std=c++17 -Werror -Wall -fPIE -Wl,--allow-shlib-undefined")
set(PROJECT_SRC_ROOT ${CMAKE_CURRENT_LIST_DIR}/)
option(MINDSPORE_PATH "mindspore install path" "")
include_directories(${MINDSPORE_PATH})
include_directories(${MINDSPORE_PATH}/include)
include_directories(${PROJECT_SRC_ROOT})
find_library(MS_LIB libmindspore.so ${MINDSPORE_PATH}/lib)
file(GLOB_RECURSE MD_LIB ${MINDSPORE_PATH}/_c_dataengine*)
add_executable(main src/main.cc src/utils.cc)
target_link_libraries(main ${MS_LIB} ${MD_LIB} gflags)
find_package(gflags REQUIRED)
\ No newline at end of file
aipp_op {
aipp_mode : static
input_format : YUV420SP_U8
related_input_rank : 0
csc_switch : true
rbuv_swap_switch : false
matrix_r0c0 : 256
matrix_r0c1 : 0
matrix_r0c2 : 359
matrix_r1c0 : 256
matrix_r1c1 : -88
matrix_r1c2 : -183
matrix_r2c0 : 256
matrix_r2c1 : 454
matrix_r2c2 : 0
input_bias_0 : 0
input_bias_1 : 128
input_bias_2 : 128
mean_chn_0 : 124
mean_chn_1 : 117
mean_chn_2 : 104
var_reci_chn_0 : 0.0171247538316637
var_reci_chn_1 : 0.0175070028011204
var_reci_chn_2 : 0.0174291938997821
}
\ No newline at end of file
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ -d out ]; then
rm -rf out
fi
mkdir out
cd out || exit
if [ -f "Makefile" ]; then
make clean
fi
cmake .. \
-DMINDSPORE_PATH="`pip3.7 show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath`"
make
\ No newline at end of file
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef MINDSPORE_INFERENCE_UTILS_H_
#define MINDSPORE_INFERENCE_UTILS_H_
#include <sys/stat.h>
#include <dirent.h>
#include <vector>
#include <string>
#include <memory>
#include "include/api/types.h"
std::vector<std::string> GetAllFiles(std::string_view dirName);
DIR *OpenDir(std::string_view dirName);
std::string RealPath(std::string_view path);
mindspore::MSTensor ReadFileToTensor(const std::string &file);
int WriteResult(const std::string& imageFile, const std::vector<mindspore::MSTensor> &outputs);
#endif
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <sys/time.h>
#include <gflags/gflags.h>
#include <dirent.h>
#include <iostream>
#include <string>
#include <algorithm>
#include <iosfwd>
#include <vector>
#include <fstream>
#include <sstream>
#include "include/api/model.h"
#include "include/api/context.h"
#include "include/api/types.h"
#include "include/api/serialization.h"
#include "include/dataset/vision_ascend.h"
#include "include/dataset/execute.h"
#include "include/dataset/vision.h"
#include "inc/utils.h"
using mindspore::Context;
using mindspore::Serialization;
using mindspore::Model;
using mindspore::Status;
using mindspore::ModelType;
using mindspore::GraphCell;
using mindspore::kSuccess;
using mindspore::MSTensor;
using mindspore::dataset::Execute;
using mindspore::dataset::TensorTransform;
using mindspore::dataset::vision::DvppDecodeResizeJpeg;
using mindspore::dataset::vision::Resize;
using mindspore::dataset::vision::HWC2CHW;
using mindspore::dataset::vision::Normalize;
using mindspore::dataset::vision::Decode;
DEFINE_string(mindir_path, "", "mindir path");
DEFINE_string(dataset_path, ".", "dataset path");
DEFINE_int32(device_id, 0, "device id");
DEFINE_string(aipp_path, "./aipp.cfg", "aipp path");
DEFINE_string(cpu_dvpp, "DVPP", "cpu or dvpp process");
DEFINE_int32(image_height, 300, "image height");
DEFINE_int32(image_width, 300, "image width");
int main(int argc, char **argv) {
gflags::ParseCommandLineFlags(&argc, &argv, true);
if (RealPath(FLAGS_mindir_path).empty()) {
std::cout << "Invalid mindir" << std::endl;
return 1;
}
auto context = std::make_shared<Context>();
auto ascend310 = std::make_shared<mindspore::Ascend310DeviceInfo>();
ascend310->SetDeviceID(FLAGS_device_id);
ascend310->SetBufferOptimizeMode("off_optimize");
context->MutableDeviceInfo().push_back(ascend310);
mindspore::Graph graph;
Serialization::Load(FLAGS_mindir_path, ModelType::kMindIR, &graph);
if (FLAGS_cpu_dvpp == "DVPP") {
if (RealPath(FLAGS_aipp_path).empty()) {
std::cout << "Invalid aipp path" << std::endl;
return 1;
} else {
ascend310->SetInsertOpConfigPath(FLAGS_aipp_path);
}
}
Model model;
Status ret = model.Build(GraphCell(graph), context);
if (ret != kSuccess) {
std::cout << "ERROR: Build failed." << std::endl;
return 1;
}
auto all_files = GetAllFiles(FLAGS_dataset_path);
if (all_files.empty()) {
std::cout << "ERROR: no input data." << std::endl;
return 1;
}
std::map<double, double> costTime_map;
size_t size = all_files.size();
for (size_t i = 0; i < size; ++i) {
struct timeval start = {0};
struct timeval end = {0};
double startTimeMs;
double endTimeMs;
std::vector<MSTensor> inputs;
std::vector<MSTensor> outputs;
std::cout << "Start predict input files:" << all_files[i] << std::endl;
if (FLAGS_cpu_dvpp == "DVPP") {
auto resizeShape = {static_cast <uint32_t>(FLAGS_image_height), static_cast <uint32_t>(FLAGS_image_width)};
Execute resize_op(std::shared_ptr<DvppDecodeResizeJpeg>(new DvppDecodeResizeJpeg(resizeShape)));
auto imgDvpp = std::make_shared<MSTensor>();
resize_op(ReadFileToTensor(all_files[i]), imgDvpp.get());
inputs.emplace_back(imgDvpp->Name(), imgDvpp->DataType(), imgDvpp->Shape(),
imgDvpp->Data().get(), imgDvpp->DataSize());
} else {
std::shared_ptr<TensorTransform> decode(new Decode());
std::shared_ptr<TensorTransform> hwc2chw(new HWC2CHW());
std::shared_ptr<TensorTransform> normalize(
new Normalize({123.675, 116.28, 103.53}, {58.395, 57.120, 57.375}));
auto resizeShape = {FLAGS_image_height, FLAGS_image_width};
std::shared_ptr<TensorTransform> resize(new Resize(resizeShape));
Execute composeDecode({decode, resize, normalize, hwc2chw});
auto img = MSTensor();
auto image = ReadFileToTensor(all_files[i]);
composeDecode(image, &img);
std::vector<MSTensor> model_inputs = model.GetInputs();
if (model_inputs.empty()) {
std::cout << "Invalid model, inputs is empty." << std::endl;
return 1;
}
inputs.emplace_back(model_inputs[0].Name(), model_inputs[0].DataType(), model_inputs[0].Shape(),
img.Data().get(), img.DataSize());
}
gettimeofday(&start, nullptr);
ret = model.Predict(inputs, &outputs);
gettimeofday(&end, nullptr);
if (ret != kSuccess) {
std::cout << "Predict " << all_files[i] << " failed." << std::endl;
return 1;
}
startTimeMs = (1.0 * start.tv_sec * 1000000 + start.tv_usec) / 1000;
endTimeMs = (1.0 * end.tv_sec * 1000000 + end.tv_usec) / 1000;
costTime_map.insert(std::pair<double, double>(startTimeMs, endTimeMs));
WriteResult(all_files[i], outputs);
}
double average = 0.0;
int inferCount = 0;
for (auto iter = costTime_map.begin(); iter != costTime_map.end(); iter++) {
double diff = 0.0;
diff = iter->second - iter->first;
average += diff;
inferCount++;
}
average = average / inferCount;
std::stringstream timeCost;
timeCost << "NN inference cost average time: "<< average << " ms of infer_count " << inferCount << std::endl;
std::cout << "NN inference cost average time: "<< average << "ms of infer_count " << inferCount << std::endl;
std::string fileName = "./time_Result" + std::string("/test_perform_static.txt");
std::ofstream fileStream(fileName.c_str(), std::ios::trunc);
fileStream << timeCost.str();
fileStream.close();
costTime_map.clear();
return 0;
}
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <fstream>
#include <algorithm>
#include <iostream>
#include "inc/utils.h"
using mindspore::MSTensor;
using mindspore::DataType;
std::vector<std::string> GetAllFiles(std::string_view dirName) {
struct dirent *filename;
DIR *dir = OpenDir(dirName);
if (dir == nullptr) {
return {};
}
std::vector<std::string> res;
while ((filename = readdir(dir)) != nullptr) {
std::string dName = std::string(filename->d_name);
if (dName == "." || dName == ".." || filename->d_type != DT_REG) {
continue;
}
res.emplace_back(std::string(dirName) + "/" + filename->d_name);
}
std::sort(res.begin(), res.end());
for (auto &f : res) {
std::cout << "image file: " << f << std::endl;
}
return res;
}
int WriteResult(const std::string& imageFile, const std::vector<MSTensor> &outputs) {
std::string homePath = "./result_Files";
for (size_t i = 0; i < outputs.size(); ++i) {
size_t outputSize;
std::shared_ptr<const void> netOutput;
netOutput = outputs[i].Data();
outputSize = outputs[i].DataSize();
int pos = imageFile.rfind('/');
std::string fileName(imageFile, pos + 1);
fileName.replace(fileName.find('.'), fileName.size() - fileName.find('.'), '_' + std::to_string(i) + ".bin");
std::string outFileName = homePath + "/" + fileName;
FILE * outputFile = fopen(outFileName.c_str(), "wb");
fwrite(netOutput.get(), outputSize, sizeof(char), outputFile);
fclose(outputFile);
outputFile = nullptr;
}
return 0;
}
mindspore::MSTensor ReadFileToTensor(const std::string &file) {
if (file.empty()) {
std::cout << "Pointer file is nullptr" << std::endl;
return mindspore::MSTensor();
}
std::ifstream ifs(file);
if (!ifs.good()) {
std::cout << "File: " << file << " is not exist" << std::endl;
return mindspore::MSTensor();
}
if (!ifs.is_open()) {
std::cout << "File: " << file << "open failed" << std::endl;
return mindspore::MSTensor();
}
ifs.seekg(0, std::ios::end);
size_t size = ifs.tellg();
mindspore::MSTensor buffer(file, mindspore::DataType::kNumberTypeUInt8, {static_cast<int64_t>(size)}, nullptr, size);
ifs.seekg(0, std::ios::beg);
ifs.read(reinterpret_cast<char *>(buffer.MutableData()), size);
ifs.close();
return buffer;
}
DIR *OpenDir(std::string_view dirName) {
if (dirName.empty()) {
std::cout << " dirName is null ! " << std::endl;
return nullptr;
}
std::string realPath = RealPath(dirName);
struct stat s;
lstat(realPath.c_str(), &s);
if (!S_ISDIR(s.st_mode)) {
std::cout << "dirName is not a valid directory !" << std::endl;
return nullptr;
}
DIR *dir;
dir = opendir(realPath.c_str());
if (dir == nullptr) {
std::cout << "Can not open dir " << dirName << std::endl;
return nullptr;
}
std::cout << "Successfully opened the dir " << dirName << std::endl;
return dir;
}
std::string RealPath(std::string_view path) {
char realPathMem[PATH_MAX] = {0};
char *realPathRet = nullptr;
realPathRet = realpath(path.data(), realPathMem);
if (realPathRet == nullptr) {
std::cout << "File: " << path << " is not exist.";
return "";
}
std::string realPath(realPathMem);
std::cout << path << " realpath is: " << realPath << std::endl;
return realPath;
}
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Evaluation for SSD"""
import os
import ast
import argparse
import time
import numpy as np
from mindspore import context, Tensor
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from src.ssd import SsdInferWithDecoder, ssd_resnet34
from src.dataset import create_ssd_dataset, create_mindrecord
from src.config import config
from src.eval_utils import metrics
from src.box_utils import default_boxes
def ssd_eval(dataset_path, ckpt_path, anno_json):
"""SSD evaluation."""
batch_size = 1
ds = create_ssd_dataset(dataset_path, batch_size=batch_size, repeat_num=1,
is_training=False, use_multiprocessing=False)
if config.model == "ssd_resnet34":
net = ssd_resnet34(config=config)
else:
raise ValueError(f'config.model: {config.model} is not supported')
net = SsdInferWithDecoder(net, Tensor(default_boxes), config)
print("Load Checkpoint!")
param_dict = load_checkpoint(ckpt_path)
net.init_parameters_data()
load_param_into_net(net, param_dict)
net.set_train(False)
i = batch_size
total = ds.get_dataset_size() * batch_size
start = time.time()
pred_data = []
print("\n========================================\n")
print("total images num: ", total)
print("Processing, please wait a moment.")
for data in ds.create_dict_iterator(output_numpy=True, num_epochs=1):
img_id = data['img_id']
img_np = data['image']
image_shape = data['image_shape']
output = net(Tensor(img_np))
for batch_idx in range(img_np.shape[0]):
pred_data.append({"boxes": output[0].asnumpy()[batch_idx],
"box_scores": output[1].asnumpy()[batch_idx],
"img_id": int(np.squeeze(img_id[batch_idx])),
"image_shape": image_shape[batch_idx]})
percent = round(i / total * 100., 2)
print(f' {str(percent)} [{i}/{total}]', end='\r')
i += batch_size
cost_time = int((time.time() - start) * 1000)
print(f' 100% [{total}/{total}] cost {cost_time} ms')
mAP = metrics(pred_data, anno_json)
print("\n========================================\n")
print(f"mAP: {mAP}")
def get_eval_args():
"""Get eval args"""
parser = argparse.ArgumentParser(description='SSD evaluation')
parser.add_argument("--data_url", type=str)
parser.add_argument("--train_url", type=str, default="")
parser.add_argument("--mindrecord", type=str)
parser.add_argument("--run_online", type=ast.literal_eval, default=False)
parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.")
parser.add_argument("--dataset", type=str, default="coco", help="Dataset, default is coco.")
parser.add_argument("--checkpoint_path", type=str, required=True, help="Checkpoint file path.")
parser.add_argument("--run_platform", type=str, default="Ascend", choices=("Ascend", "GPU", "CPU"),
help="run platform, support Ascend ,GPU and CPU.")
return parser.parse_args()
if __name__ == '__main__':
args_opt = get_eval_args()
if args_opt.run_online:
import moxing as mox
config.checkpoint_path = "/cache/checkpoint_path/checkpoint.ckpt"
config.coco_root = "/cache/data_url"
config.mindrecord_dir = "/cache/mindrecord_url"
mox.file.copy_parallel(args_opt.data_url, config.coco_root)
mox.file.copy_parallel(args_opt.checkpoint_path, config.checkpoint_path)
else:
config.checkpoint_path = args_opt.checkpoint_path
config.coco_root = args_opt.data_url
config.mindrecord_dir = args_opt.mindrecord
if args_opt.dataset == "coco":
json_path = os.path.join(config.coco_root, config.instances_set.format(config.val_data_type))
elif args_opt.dataset == "voc":
json_path = os.path.join(config.voc_root, config.voc_json)
else:
raise ValueError('SSD eval only support dataset mode is coco and voc!')
context.set_context(mode=context.GRAPH_MODE, device_target=args_opt.run_platform, device_id=args_opt.device_id)
mindrecord_file = create_mindrecord(args_opt.dataset, "ssd_eval.mindrecord", False)
if args_opt.run_online:
import moxing as mox
mox.file.copy_parallel(mindrecord_file, args_opt.mindrecord)
print("Start Eval!")
ssd_eval(mindrecord_file, config.checkpoint_path, json_path)
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Transfer data format"""
import argparse
import numpy as np
import mindspore
from mindspore import context, Tensor
from mindspore.train.serialization import load_checkpoint, load_param_into_net, export
from src.ssd import SsdInferWithDecoder, ssd_resnet34
from src.config import config
from src.box_utils import default_boxes
parser = argparse.ArgumentParser(description='SSD export')
parser.add_argument("--device_id", type=int, default=0, help="Device id")
parser.add_argument("--batch_size", type=int, default=1, help="batch size")
parser.add_argument("--ckpt_file", type=str, required=True, help="Checkpoint file path.")
parser.add_argument("--file_name", type=str, default="ssd", help="output file name.")
parser.add_argument('--file_format', type=str, choices=["AIR", "MINDIR"], default='AIR', help='file format')
parser.add_argument("--device_target", type=str, choices=["Ascend", "GPU", "CPU"], default="Ascend",
help="device target")
args = parser.parse_args()
context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target)
if args.device_target == "Ascend":
context.set_context(device_id=args.device_id)
if __name__ == '__main__':
if config.model == "ssd_resnet34":
net = ssd_resnet34(config=config)
else:
raise ValueError(f'config.model: {config.model} is not supported')
net = SsdInferWithDecoder(net, Tensor(default_boxes), config)
param_dict = load_checkpoint(args.ckpt_file)
net.init_parameters_data()
load_param_into_net(net, param_dict)
net.set_train(False)
input_shp = [args.batch_size, 3] + config.img_shape
input_array = Tensor(np.random.uniform(-1.0, 1.0, size=input_shp), mindspore.float32)
export(net, input_array, file_name=args.file_name, file_format=args.file_format)
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""post process for 310 inference"""
import os
import argparse
import numpy as np
from PIL import Image
from src.config import config
from src.eval_utils import metrics
batch_size = 1
parser = argparse.ArgumentParser(description="ssd acc calculation")
parser.add_argument("--result_path", type=str, required=True, help="result files path.")
parser.add_argument("--img_path", type=str, required=True, help="image file path.")
parser.add_argument("--anno_file", type=str, required=True, help="annotation file.")
parser.add_argument("--drop", action="store_true", help="drop iscrowd images or not.")
args = parser.parse_args()
def get_imgSize(file_name):
img = Image.open(file_name)
return img.size
def get_result(result_path, img_id_file_path):
"""print the mAP"""
if args.drop:
from pycocotools.coco import COCO
train_cls = config.classes
train_cls_dict = {}
for i, cls in enumerate(train_cls):
train_cls_dict[cls] = i
coco = COCO(args.anno_file)
classs_dict = {}
cat_ids = coco.loadCats(coco.getCatIds())
for cat in cat_ids:
classs_dict[cat["id"]] = cat["name"]
files = os.listdir(img_id_file_path)
pred_data = []
for file in files:
img_ids_name = file.split('.')[0]
img_id = int(np.squeeze(img_ids_name))
if args.drop:
anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=None)
anno = coco.loadAnns(anno_ids)
annos = []
iscrowd = False
for label in anno:
bbox = label["bbox"]
class_name = classs_dict[label["category_id"]]
iscrowd = iscrowd or label["iscrowd"]
if class_name in train_cls:
x_min, x_max = bbox[0], bbox[0] + bbox[2]
y_min, y_max = bbox[1], bbox[1] + bbox[3]
annos.append(list(map(round, [y_min, x_min, y_max, x_max])) + [train_cls_dict[class_name]])
if iscrowd or (not annos):
continue
img_size = get_imgSize(os.path.join(img_id_file_path, file))
image_shape = np.array([img_size[1], img_size[0]])
result_path_0 = os.path.join(result_path, img_ids_name + "_0.bin")
result_path_1 = os.path.join(result_path, img_ids_name + "_1.bin")
boxes = np.fromfile(result_path_0, dtype=np.float32).reshape(config.num_ssd_boxes, 4)
box_scores = np.fromfile(result_path_1, dtype=np.float32).reshape(config.num_ssd_boxes, config.num_classes)
pred_data.append({
"boxes": boxes,
"box_scores": box_scores,
"img_id": img_id,
"image_shape": image_shape
})
mAP = metrics(pred_data, args.anno_file)
print("mAP:{}".format(mAP))
if __name__ == '__main__':
get_result(args.result_path, args.img_path)
pycocotools
opencv-python
xml-python
Pillow
numpy
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
echo "=============================================================================================================="
echo "Please run the script as: "
echo "sh scripts/run_distribute_train.sh [RANK_TABLE_FILE] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH] [PRE_TRAINED_PATH](optional)"
echo "for example: sh scripts/run_distribute_train.sh /home/neu/hrnet_final/rank_table_file_path.json coco /home/neu/ssd-coco /home/neu/coco-mindrecord .train_out /home/neu/ssdresnet34lj/resnet34.ckpt(optional)"
echo "It is better to use absolute path."
echo "================================================================================================================="
if [ $# != 5 ] && [ $# != 6 ]
then
echo "Using: sh scripts/run_distribute_train.sh [RANK_TABLE_FILE] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH]"
echo "or"
echo "Using: sh scripts/run_distribute_train.sh [RANK_TABLE_FILE] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH] [PRE_TRAINED_PATH]"
exit 1
fi
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
PATH1=$(get_real_path $3) # dataset_path
PATH2=$(get_real_path $4) # mindrecord_path
PATH3=$(get_real_path $5) # train_output_path
PATH4=$(get_real_path $6) # pre_trained_path
PATH5=$(get_real_path $1) # rank_table_file_path
if [ ! -d $PATH1 ]
then
echo "error: DATASET_PATH=$PATH1 is not a directory."
exit 1
fi
if [ ! -d $PATH2 ]
then
echo "error: MINDRECORD_PATH=$PATH2 is not a directory."
exit 1
fi
if [ ! -d $PATH3 ]
then
echo "error: TRAIN_OUTPUT_PATH=$PATH3 is not a directory."
fi
if [ ! -f $PATH4 ] && [ $# == 6 ]
then
echo "error: PRE_TRAINED_PATH=$PATH4 is not a file."
exit 1
fi
if [ ! -f $PATH5 ]
then
echo "error: RANK_TABLE_FILE_PATH=$PATH5 is not a file."
exit 1
fi
ulimit -u unlimited
export DEVICE_NUM=8
export RANK_SIZE=8
export RANK_TABLE_FILE=$PATH5
export SERVER_ID=0
rank_start=$((DEVICE_NUM * SERVER_ID))
for((i=0; i<${DEVICE_NUM}; i++))
do
export DEVICE_ID=${i}
export RANK_ID=$((rank_start + i))
rm -rf ./train_parallel$i
mkdir ./train_parallel$i
echo ./train_parallel$i
cp ./train.py ./train_parallel$i
cp -r ./src ./train_parallel$i
cd ./train_parallel$i || exit
echo "Start training for rank $RANK_ID, device $DEVICE_ID."
env > env.log
if [ $# == 5 ]
then
python train.py --data_url $PATH1 --mindrecord_url $PATH2 --train_url $PATH3 --run_platform Ascend --lr 0.075 --epoch_size 500 --dataset $2 --distribute True --device_num 8 &> log &
else
python train.py --data_url $PATH1 --mindrecord_url $PATH2 --train_url $PATH3 --run_platform Ascend --lr 0.075 --epoch_size 500 --dataset $2 --pre_trained $PATH4 --distribute True --device_num 8 &> log &
fi
cd ..
done
\ No newline at end of file
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
echo "=============================================================================================================="
echo "Please run the script as: "
echo "sh scripts/run_eval.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [CHECKPOINT_PATH] [MINDRECORD_PATH]"
echo "for example: sh scripts/run_eval.sh 0 coco /home/neu/ssd-coco /home/neu/ssdresnet34lj/ckpt0/ssd-990_458.ckpt /home/neu/coco-mindrecord"
echo "It is better to use absolute path."
echo "================================================================================================================="
if [ $# != 5 ]
then
echo "Using: sh scripts/run_eval.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [CHECKPOINT_PATH] [MINDRECORD_PATH]"
exit 1
fi
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
PATH1=$(get_real_path $3)
PATH2=$(get_real_path $4)
PATH3=$(get_real_path $5)
if [ ! -d $PATH1 ]
then
echo "error: DATASET_PATH=$PATH1 is not a dictionary."
exit 1
fi
if [ ! -f $PATH2 ]
then
echo "error: CHECKPOINT_PATH=$PATH2 is not a file."
exit 1
fi
if [ ! -d $PATH3 ]
then
echo "error: MINDRECORD_PATH=$PATH3 is not a dictionary."
exit 1
fi
ulimit -u unlimited
export DEVICE_NUM=1
export DEVICE_ID=$1
export RANK_SIZE=$DEVICE_NUM
export RANK_ID=0
if [ -d "eval" ];
then
rm -rf ./eval
fi
mkdir ./eval
cp ./eval.py ./eval
cp -r ./src ./eval
cd ./eval || exit
env > env.log
echo "start evaluation for device $DEVICE_ID"
python eval.py --data_url $PATH1 --dataset $2 --device_id $1 --run_platform Ascend --checkpoint_path $PATH2 --mindrecord $PATH3 &> log &
cd ..
\ No newline at end of file
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [[ $# -lt 4 || $# -gt 5 ]]; then
echo "Usage: bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DVPP] [ANNO_FILE] [DEVICE_ID]
DVPP is mandatory, and must choose from [DVPP|CPU], it's case-insensitive
ANNO_PATH is mandatory, and should specify annotation file path of your data including file name.
DEVICE_ID is optional, it can be set by environment variable device_id, otherwise the value is zero"
exit 1
fi
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
model=$(get_real_path $1)
data_path=$(get_real_path $2)
DVPP=${3^^}
anno=$(get_real_path $4)
device_id=0
if [ $# == 5 ]; then
device_id=$5
fi
echo "mindir name: "$model
echo "dataset path: "$data_path
echo "image process mode: "$DVPP
echo "anno file: "$anno
echo "device id: "$device_id
export ASCEND_HOME=/usr/local/Ascend/
if [ -d ${ASCEND_HOME}/ascend-toolkit ]; then
export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/atc/bin:$PATH
export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/ascend-toolkit/latest/atc/lib64:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH
export TBE_IMPL_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe
export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:${TBE_IMPL_PATH}:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/python/site-packages:$PYTHONPATH
export ASCEND_OPP_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp
else
export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/atc/ccec_compiler/bin:$ASCEND_HOME/atc/bin:$PATH
export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/atc/lib64:$ASCEND_HOME/acllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH
export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:$ASCEND_HOME/atc/python/site-packages:$PYTHONPATH
export ASCEND_OPP_PATH=$ASCEND_HOME/opp
fi
function compile_app()
{
cd ../ascend310_infer || exit
bash build.sh &> build.log
}
function infer()
{
cd - || exit
if [ -d result_Files ]; then
rm -rf ./result_Files
fi
if [ -d time_Result ]; then
rm -rf ./time_Result
fi
mkdir result_Files
mkdir time_Result
if [ "$DVPP" == "DVPP" ];then
../ascend310_infer/out/main --mindir_path=$model --dataset_path=$data_path --device_id=$device_id --cpu_dvpp=$DVPP --aipp_path=../ascend310_infer/aipp.cfg --image_height=640 --image_width=640 &> infer.log
elif [ "$DVPP" == "CPU" ]; then
../ascend310_infer/out/main --mindir_path=$model --dataset_path=$data_path --cpu_dvpp=$DVPP --device_id=$device_id --image_height=300 --image_width=300 &> infer.log
else
echo "image process mode must be in [DVPP|CPU]"
exit 1
fi
}
function cal_acc()
{
python3.7 ../postprocess.py --result_path=./result_Files --img_path=$data_path --anno_file=$anno --drop &> acc.log &
}
compile_app
if [ $? -ne 0 ]; then
echo "compile app code failed"
exit 1
fi
infer
if [ $? -ne 0 ]; then
echo " execute inference failed"
exit 1
fi
cal_acc
if [ $? -ne 0 ]; then
echo "calculate accuracy failed"
exit 1
fi
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
echo "=============================================================================================================="
echo "Please run the script as: "
echo "sh scripts/run_standalone_train.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH] [PRE_TRAINED_PATH]"
echo "for example: sh scripts/run_standalone_train.sh 0 coco /home/neu/ssd-coco /home/neu/coco-mindrecord ./train_out /home/neu/ssdresnet34lj/resnet34.ckpt(optional)"
echo "It is better to use absolute path."
echo "================================================================================================================="
if [ $# != 5 ] && [ $# != 6 ]
then
echo "Using: sh scripts/run_standalone_train.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH]"
echo "or"
echo "Using: sh scripts/run_standalone_train.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [MINDRECORD_PATH] [TRAIN_OUTPUT_PATH] [PRE_TRAINED_PATH]"
exit 1
fi
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
PATH1=$(get_real_path $3) # dataset_path
PATH2=$(get_real_path $4) # mindrecord_path
PATH3=$(get_real_path $5) # train_output_path
PATH4=$(get_real_path $6) # pre_trained_path
if [ ! -d $PATH1 ]
then
echo "error: DATASET_PATH=$PATH1 is not a directory."
exit 1
fi
if [ ! -d $PATH2 ]
then
echo "error: MINDRECORD_PATH=$PATH2 is not a directory."
exit 1
fi
if [ ! -d $PATH3 ]
then
echo "error: TRAIN_OUTPUT_PATH=$PATH3 is not a directory."
fi
if [ ! -f $PATH4 ] && [ $# == 6 ]
then
echo "error: PRE_TRAINED_PATH=$PATH4 is not a file."
exit 1
fi
ulimit -u unlimited
export DEVICE_NUM=1
export DEVICE_ID=$1
export RANK_ID=0
export RANK_SIZE=1
if [ -d "./train" ]
then
rm -rf ./train
echo "Remove dir ./train."
fi
mkdir ./train
echo "Create a dir ./train."
cp ./train.py ./train
cp -r ./src ./train
cd ./train || exit
echo "Start training for device $DEVICE_ID"
env > env.log
if [ $# == 5 ]
then
python train.py --data_url $PATH1 --mindrecord_url $PATH2 --train_url $PATH3 --run_platform Ascend --lr 0.075 --epoch_size 1000 --dataset $2 &> log &
else
python train.py --data_url $PATH1 --mindrecord_url $PATH2 --train_url $PATH3 --run_platform Ascend --lr 0.075 --epoch_size 1000 --dataset $2 --pre_trained $PATH4 &> log &
fi
cd ..
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Anchor Generator"""
import numpy as np
class GridAnchorGenerator:
"""
Anchor Generator
"""
def __init__(self, image_shape, scale, scales_per_octave, aspect_ratios):
super(GridAnchorGenerator, self).__init__()
self.scale = scale
self.scales_per_octave = scales_per_octave
self.aspect_ratios = aspect_ratios
self.image_shape = image_shape
def generate(self, step):
"""Generate anchor"""
scales = np.array([2**(float(scale) / self.scales_per_octave)
for scale in range(self.scales_per_octave)]).astype(np.float32)
aspects = np.array(list(self.aspect_ratios)).astype(np.float32)
scales_grid, aspect_ratios_grid = np.meshgrid(scales, aspects)
scales_grid = scales_grid.reshape([-1])
aspect_ratios_grid = aspect_ratios_grid.reshape([-1])
feature_size = [self.image_shape[0] / step, self.image_shape[1] / step]
grid_height, grid_width = feature_size
base_size = np.array([self.scale * step, self.scale * step]).astype(np.float32)
anchor_offset = step / 2.0
ratio_sqrt = np.sqrt(aspect_ratios_grid)
heights = scales_grid / ratio_sqrt * base_size[0]
widths = scales_grid * ratio_sqrt * base_size[1]
y_centers = np.arange(grid_height).astype(np.float32)
y_centers = y_centers * step + anchor_offset
x_centers = np.arange(grid_width).astype(np.float32)
x_centers = x_centers * step + anchor_offset
x_centers, y_centers = np.meshgrid(x_centers, y_centers)
x_centers_shape = x_centers.shape
y_centers_shape = y_centers.shape
widths_grid, x_centers_grid = np.meshgrid(widths, x_centers.reshape([-1]))
heights_grid, y_centers_grid = np.meshgrid(heights, y_centers.reshape([-1]))
x_centers_grid = x_centers_grid.reshape(*x_centers_shape, -1)
y_centers_grid = y_centers_grid.reshape(*y_centers_shape, -1)
widths_grid = widths_grid.reshape(-1, *x_centers_shape)
heights_grid = heights_grid.reshape(-1, *y_centers_shape)
bbox_centers = np.stack([y_centers_grid, x_centers_grid], axis=3)
bbox_sizes = np.stack([heights_grid, widths_grid], axis=3)
bbox_centers = bbox_centers.reshape([-1, 2])
bbox_sizes = bbox_sizes.reshape([-1, 2])
bbox_corners = np.concatenate([bbox_centers - 0.5 * bbox_sizes, bbox_centers + 0.5 * bbox_sizes], axis=1)
self.bbox_corners = bbox_corners / np.array([*self.image_shape, *self.image_shape]).astype(np.float32)
self.bbox_centers = np.concatenate([bbox_centers, bbox_sizes], axis=1)
self.bbox_centers = self.bbox_centers / np.array([*self.image_shape, *self.image_shape]).astype(np.float32)
print(self.bbox_centers.shape)
return self.bbox_centers, self.bbox_corners
def generate_multi_levels(self, steps):
"""Gennerate multi levels"""
bbox_centers_list = []
bbox_corners_list = []
for step in steps:
bbox_centers, bbox_corners = self.generate(step)
bbox_centers_list.append(bbox_centers)
bbox_corners_list.append(bbox_corners)
self.bbox_centers = np.concatenate(bbox_centers_list, axis=0)
self.bbox_corners = np.concatenate(bbox_corners_list, axis=0)
return self.bbox_centers, self.bbox_corners
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Bbox utils"""
import math
import itertools as it
import numpy as np
from .config import config
from .anchor_generator import GridAnchorGenerator
class GeneratDefaultBoxes():
"""
Generate Default boxes for SSD, follows the order of (W, H, archor_sizes).
`self.default_boxes` has a shape of [archor_sizes, H, W, 4], the last dimension is [y, x, h, w].
`self.default_boxes_tlbr` has a shape as `self.default_boxes`, the last dimension is [y1, x1, y2, x2].
"""
def __init__(self):
fk = config.img_shape[0] / np.array(config.steps)
scale_rate = (config.max_scale - config.min_scale) / (len(config.num_default) - 1)
scales = [config.min_scale + scale_rate * i for i in range(len(config.num_default))] + [1.0]
self.default_boxes = []
for idex, feature_size in enumerate(config.feature_size):
sk1 = scales[idex]
sk2 = scales[idex + 1]
sk3 = math.sqrt(sk1 * sk2)
if idex == 0 and not config.aspect_ratios[idex]:
w, h = sk1 * math.sqrt(2), sk1 / math.sqrt(2)
all_sizes = [(0.1, 0.1), (w, h), (h, w)]
else:
all_sizes = [(sk1, sk1)]
for aspect_ratio in config.aspect_ratios[idex]:
w, h = sk1 * math.sqrt(aspect_ratio), sk1 / math.sqrt(aspect_ratio)
all_sizes.append((w, h))
all_sizes.append((h, w))
all_sizes.append((sk3, sk3))
assert len(all_sizes) == config.num_default[idex]
for i, j in it.product(range(feature_size), repeat=2):
for w, h in all_sizes:
cx, cy = (j + 0.5) / fk[idex], (i + 0.5) / fk[idex]
self.default_boxes.append([cy, cx, h, w])
def to_tlbr(cy, cx, h, w):
return cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2
# For IoU calculation
self.default_boxes_tlbr = np.array(tuple(to_tlbr(*i) for i in self.default_boxes), dtype='float32')
self.default_boxes = np.array(self.default_boxes, dtype='float32')
if 'use_anchor_generator' in config and config.use_anchor_generator:
generator = GridAnchorGenerator(config.img_shape, 4, 2, [1.0, 2.0, 0.5])
default_boxes, default_boxes_tlbr = generator.generate_multi_levels(config.steps)
else:
default_boxes_tlbr = GeneratDefaultBoxes().default_boxes_tlbr
default_boxes = GeneratDefaultBoxes().default_boxes
y1, x1, y2, x2 = np.split(default_boxes_tlbr[:, :4], 4, axis=-1)
vol_anchors = (x2 - x1) * (y2 - y1)
matching_threshold = config.match_threshold
def ssd_bboxes_encode(boxes):
"""
Labels anchors with ground truth inputs.
Args:
boxex: ground truth with shape [N, 5], for each row, it stores [y, x, h, w, cls].
Returns:
gt_loc: location ground truth with shape [num_anchors, 4].
gt_label: class ground truth with shape [num_anchors, 1].
num_matched_boxes: number of positives in an image.
"""
def jaccard_with_anchors(bbox):
"""Compute jaccard score a box and the anchors."""
# Intersection bbox and volume.
ymin = np.maximum(y1, bbox[0])
xmin = np.maximum(x1, bbox[1])
ymax = np.minimum(y2, bbox[2])
xmax = np.minimum(x2, bbox[3])
w = np.maximum(xmax - xmin, 0.)
h = np.maximum(ymax - ymin, 0.)
# Volumes.
inter_vol = h * w
union_vol = vol_anchors + (bbox[2] - bbox[0]) * (bbox[3] - bbox[1]) - inter_vol
jaccard = inter_vol / union_vol
return np.squeeze(jaccard)
pre_scores = np.zeros((config.num_ssd_boxes), dtype=np.float32)
t_boxes = np.zeros((config.num_ssd_boxes, 4), dtype=np.float32)
t_label = np.zeros((config.num_ssd_boxes), dtype=np.int64)
for bbox in boxes:
label = int(bbox[4])
scores = jaccard_with_anchors(bbox)
idx = np.argmax(scores)
scores[idx] = 2.0
mask = (scores > matching_threshold)
mask = mask & (scores > pre_scores)
pre_scores = np.maximum(pre_scores, scores * mask)
t_label = mask * label + (1 - mask) * t_label
for i in range(4):
t_boxes[:, i] = mask * bbox[i] + (1 - mask) * t_boxes[:, i]
index = np.nonzero(t_label)
# Transform to tlbr.
bboxes = np.zeros((config.num_ssd_boxes, 4), dtype=np.float32)
bboxes[:, [0, 1]] = (t_boxes[:, [0, 1]] + t_boxes[:, [2, 3]]) / 2
bboxes[:, [2, 3]] = t_boxes[:, [2, 3]] - t_boxes[:, [0, 1]]
# Encode features.
bboxes_t = bboxes[index]
default_boxes_t = default_boxes[index]
bboxes_t[:, :2] = (bboxes_t[:, :2] - default_boxes_t[:, :2]) / (default_boxes_t[:, 2:] * config.prior_scaling[0])
tmp = np.maximum(bboxes_t[:, 2:4] / default_boxes_t[:, 2:4], 0.000001)
bboxes_t[:, 2:4] = np.log(tmp) / config.prior_scaling[1]
bboxes[index] = bboxes_t
num_match = np.array([len(np.nonzero(t_label)[0])], dtype=np.int32)
return bboxes, t_label.astype(np.int32), num_match
def ssd_bboxes_decode(boxes):
"""Decode predict boxes to [y, x, h, w]"""
boxes_t = boxes.copy()
default_boxes_t = default_boxes.copy()
boxes_t[:, :2] = boxes_t[:, :2] * config.prior_scaling[0] * default_boxes_t[:, 2:] + default_boxes_t[:, :2]
boxes_t[:, 2:4] = np.exp(boxes_t[:, 2:4] * config.prior_scaling[1]) * default_boxes_t[:, 2:4]
bboxes = np.zeros((len(boxes_t), 4), dtype=np.float32)
bboxes[:, [0, 1]] = boxes_t[:, [0, 1]] - boxes_t[:, [2, 3]] / 2
bboxes[:, [2, 3]] = boxes_t[:, [0, 1]] + boxes_t[:, [2, 3]] / 2
return np.clip(bboxes, 0, 1)
def intersect(box_a, box_b):
"""Compute the intersect of two sets of boxes."""
max_yx = np.minimum(box_a[:, 2:4], box_b[2:4])
min_yx = np.maximum(box_a[:, :2], box_b[:2])
inter = np.clip((max_yx - min_yx), a_min=0, a_max=np.inf)
return inter[:, 0] * inter[:, 1]
def jaccard_numpy(box_a, box_b):
"""Compute the jaccard overlap of two sets of boxes."""
inter = intersect(box_a, box_b)
area_a = ((box_a[:, 2] - box_a[:, 0]) *
(box_a[:, 3] - box_a[:, 1]))
area_b = ((box_b[2] - box_b[0]) *
(box_b[3] - box_b[1]))
union = area_a + area_b - inter
return inter / union
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Train and evaluation"""
import os
import numpy as np
from mindspore import Tensor
from mindspore.train.callback import Callback
from src.dataset import create_ssd_dataset
from src.eval_utils import metrics
from src.box_utils import default_boxes
from src.config import config
from src.ssd import SsdInferWithDecoder
class EvalCallBack(Callback):
"""EvalCallBack"""
def __init__(self, eval_dataset, net, eval_per_epoch, train_url, json_path):
self.net = net
self.eval_dataset = eval_dataset
self.eval_per_epoch = eval_per_epoch
self.train_url = train_url
self.json_path = json_path
self.best_top_mAP = 0
self.best_top_epoch = 0
def epoch_end(self, run_context):
"""Epoch_end"""
cb_param = run_context.original_args()
cur_epoch = cb_param.cur_epoch_num
if cur_epoch % self.eval_per_epoch == 0:
net = SsdInferWithDecoder(self.net, Tensor(default_boxes), config)
net.set_train(False)
pred_data = []
for data in self.eval_dataset.create_dict_iterator(output_numpy=True, num_epochs=1):
img_id = data['img_id']
img_np = data['image']
image_shape = data['image_shape']
output = net(Tensor(img_np))
for batch_idx in range(img_np.shape[0]):
pred_data.append({"boxes": output[0].asnumpy()[batch_idx],
"box_scores": output[1].asnumpy()[batch_idx],
"img_id": int(np.squeeze(img_id[batch_idx])),
"image_shape": image_shape[batch_idx]})
mAP = metrics(pred_data, self.json_path)
if mAP > self.best_top_mAP:
self.best_top_mAP = mAP
self.best_top_epoch = cur_epoch
print(f"mAP: {mAP}" + f" best_top_mAP: {self.best_top_mAP}" + f" best_top_epochs:{self.best_top_epoch}")
net.set_train(True)
def eval_callback(val_data_url, model, json_path, train_url, eval_per_epoch):
"""Eval_callback"""
val_data_list = []
for i in range(8):
val_data = os.path.join(val_data_url, "ssd_eval.mindrecord" + str(i))
val_data_list.append(val_data)
dataset = create_ssd_dataset(val_data_list, batch_size=32, repeat_num=1,
is_training=False, use_multiprocessing=False)
callback = EvalCallBack(dataset, model, eval_per_epoch, train_url, json_path)
return callback
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment