diff --git a/research/cv/faster_rcnn_dcn/README.md b/research/cv/faster_rcnn_dcn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..7a41176a9d4cbd08d23b193a103ba2e0d4546dc8
--- /dev/null
+++ b/research/cv/faster_rcnn_dcn/README.md
@@ -0,0 +1,506 @@
+# Contents
+
+- [Contents](#contents)
+- [Faster R-CNN-DCN description](#faster-r-cnn-dcn-description)
+- [Deformable Convolution description](#deformable-convolution-description)
+- [Model architecture](#model-architecture)
+- [Dataset](#dataset)
+- [Environmental requirements](#environmental-requirements)
+- [Quick start](#quick-start)
+- [Script description](#script-description)
+    - [Script and sample code](#script-and-sample-code)
+    - [Training process](#training-process)
+        - [Usage](#usage)
+        - [Result](#result)
+    - [Evaluation process](#evaluation-process)
+        - [Usage](#usage)
+        - [Result](#result)
+- [Model description](#model-description)
+    - [Performance](#performance)
+        - [Training performance](#training-performance)
+        - [Evaluation performance](#evaluation-performance)
+- [ModelZoo Home page](#modelzoo-home-page)
+
+<!-- /TOC -->
+
+# Faster R-CNN-DCN description
+
+Before Faster R-CNN, object detection networks such as SPPNet and Fast R-CNN relied on external region proposal algorithms to hypothesize object locations. Although the inference time of these detection networks had been shortened, computing the region proposals remained a bottleneck.
+
+Faster R-CNN showed that the convolutional feature maps used by region-based detectors (such as Fast R-CNN) can also be used to generate region proposals. Building a region proposal network (RPN) on top of these convolutional features only requires a few additional convolutional layers; because the RPN shares the full-image convolutional features with the detection network, region proposals come at almost no extra cost, while the network outputs bounding-box coordinates and objectness scores. The RPN is therefore a fully convolutional network that can be trained end-to-end to generate high-quality region proposals, which are then passed to Fast R-CNN for detection.
+
+[Paper](https://arxiv.org/abs/1506.01497): Ren S, He K, Girshick R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 39(6).
+
+# Deformable Convolution description
+
+In recent years, convolutional neural networks have made rapid progress in computer vision and are widely used for image recognition, semantic segmentation, and object detection. However, because the geometric structure of a standard convolution is fixed, its ability to model geometric deformations is limited; deformable convolution was proposed to address this.
+
+Deformable convolution adds extra offsets to the spatial sampling positions of the convolution kernel and learns these offsets from the target task without additional supervision. Since the sampling grid is no longer restricted to a rigid rectangle but adapts to the object of interest, the desired features can be extracted more accurately.
+
+The V2 version of deformable convolution is used in this network.
+
+[Paper](https://arxiv.org/pdf/1811.11168): Zhu X, Hu H, Lin S, et al. Deformable ConvNets v2: More Deformable, Better Results[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 9308-9316.
+
+# Model architecture
+
+Faster R-CNN-DCN is a two-stage object detection network.
The network uses an RPN, which shares the convolutional features of the entire image with the detection network, so region proposals can be computed at almost no extra cost. The whole model further merges the RPN and Fast R-CNN into a single network by sharing convolutional features.
+
+A deformable convolutional network is added by replacing the convolutional layers in stages 3-5 of ResNet with deformable convolutional layers, so that the sampling grid of the convolution kernel follows the object shape more closely and the desired features can be extracted more accurately.
+
+# Dataset
+
+Dataset used: [COCO 2017](<https://cocodataset.org/>)
+
+- Dataset size: 19G
+    - Training set: 18G, 118,000 images
+    - Validation set: 1G, 5,000 images
+    - Annotations: 241M (instances, captions, person_keypoints)
+- Data format: images and JSON files
+    - Note: The data is processed in dataset.py.
+
+# Environmental requirements
+
+- Hardware (Ascend/GPU)
+    - Use the Ascend processor to build the hardware environment.
+- Get the Docker image
+    - [Ascend Hub](https://ascend.huawei.com/ascendhub/#/home)
+
+- Install [MindSpore](https://www.mindspore.cn/install).
+
+- Download the COCO 2017 dataset.
+
+- This example uses COCO 2017 as the training dataset by default, but you can also use your own dataset.
+
+    1. If the COCO dataset is used, **select the dataset COCO when executing the script.**
+        Install Cython and pycocotools.
+
+        ```python
+        pip install Cython
+
+        pip install pycocotools
+        ```
+
+        Change COCO_ROOT and other required settings in `default_config.yaml` or `default_config_gpu.yaml` according to the running needs of the model. The directory structure is as follows:
+
+        ```path
+        .
+        └─cocodataset
+          ├─annotations
+            ├─instances_train2017.json
+            └─instances_val2017.json
+          ├─val2017
+          └─train2017
+        ```
+
+    2. If you use your own dataset, **select the dataset as other when executing the script.**
+        Organize the dataset information into a TXT file, where each line has the following content:
+
+        ```txt
+        train2017/0000001.jpg 0,259,401,459,7,0 35,28,324,201,2,0 0,30,59,80,2,0
+        ```
+
+        Each row describes one image and its annotations, separated by spaces: the first column is the relative path of the image, and the remaining columns are boxes in the format [xmin,ymin,xmax,ymax,class,is_crowd], i.e. the class id and a flag indicating whether the box marks a group of objects. Images are read from `image_dir` (dataset directory) combined with the relative paths listed in `anno_path` (the TXT file). `image_dir` and `anno_path` can be set in `default_config.yaml` or `default_config_gpu.yaml`. A minimal parsing sketch is given after the Quick start notes below.
+
+# Quick start
+
+After installing MindSpore through the official website, you can follow the steps below for training and evaluation:
+
+Notice:
+
+1. It takes a long time to generate the MindRecord file for the first run.
+2. The pre-trained model is a ResNet-50 checkpoint trained on ImageNet2012. You can use the [resnet50](https://gitee.com/mindspore/models/tree/master/official/cv/resnet) script in ModelZoo to train it.
+3. BACKBONE_MODEL is trained through the ResNet-50 [resnet50](https://gitee.com/mindspore/models/tree/master/official/cv/resnet) script in ModelZoo.
+4. PRETRAINED_MODEL is the converted weight file. VALIDATION_JSON_FILE is a label file. CHECKPOINT_PATH is the checkpoint file after training.
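+
+For reference, the custom-dataset TXT format described in the [Dataset](#dataset) section can be parsed as in the minimal Python sketch below. This is only an illustration: the helper name and the returned structure are not part of this repository (the actual data processing lives in `src/dataset.py`).
+
+```python
+def parse_annotation_line(line):
+    """Parse one line of the custom TXT annotation format.
+
+    A line looks like "<relative image path> <box1> <box2> ...",
+    where each box is "xmin,ymin,xmax,ymax,class,is_crowd".
+    """
+    parts = line.strip().split(" ")
+    image_path = parts[0]
+    boxes = []
+    for box_str in parts[1:]:
+        # Coordinates and flags are treated as integers here for simplicity.
+        xmin, ymin, xmax, ymax, cls, is_crowd = (int(v) for v in box_str.split(","))
+        boxes.append({"bbox": [xmin, ymin, xmax, ymax], "class": cls, "is_crowd": bool(is_crowd)})
+    return image_path, boxes
+
+
+# Example with the sample line shown in the Dataset section:
+sample = "train2017/0000001.jpg 0,259,401,459,7,0 35,28,324,201,2,0 0,30,59,80,2,0"
+path, boxes = parse_annotation_line(sample)
+print(path, len(boxes))  # train2017/0000001.jpg 3
+```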
+ +> For GPU training please use [GPU pretrained ResNet-50 model](https://download.mindspore.cn/model_zoo/r1.3/resnet50_gpu_v130_imagenet_official_cv_bs32_acc0/) (resnet50_gpu_v130_imagenet_official_cv_bs32_acc0) + +## Run on Ascend + +```shell + +# Stand-alone training +bash run_standalone_train_ascend.sh [PRETRAINED_MODEL] [BACKBONE] [COCO_ROOT] [MINDRECORD_DIR](option) + +# Distributed training +bash run_distribute_train_ascend.sh [RANK_TABLE_FILE] [PRETRAINED_MODEL] [BACKBONE] [COCO_ROOT] [MINDRECORD_DIR](option) + +# Evaluation +bash run_eval_ascend.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH] [BACKBONE] [COCO_ROOT] [MINDRECORD_DIR](option) + +# Inference +bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [ANNO_PATH] [DEVICE_ID] +``` + +## Run on GPU + +Use [pretrained ResNet-50 model](https://download.mindspore.cn/model_zoo/r1.3/resnet50_gpu_v130_imagenet_official_cv_bs32_acc0/) (resnet50_gpu_v130_imagenet_official_cv_bs32_acc0) + +```shell + +# Stand-alone training +bash run_standalone_train_gpu.sh [PRETRAINED_MODEL] [COCO_ROOT] [DEVICE_ID] [MINDRECORD_DIR](option) + +# Distributed training +bash run_distribute_train_gpu.sh [DEVICE_NUM] [PRETRAINED_MODEL] [COCO_ROOT] [MINDRECORD_DIR](option) + +# Evaluation +bash run_eval_gpu.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH] [COCO_ROOT] [DEVICE_ID] [MINDRECORD_DIR](option) + +``` + +- ModelArts for training (If you want to run on modelarts, you can refer to the following documents [modelarts](https://support.huaweicloud.com/modelarts/)) + + ```python + # Using 8 card training on ModelArts + # (1) Execute a or b + # a. Set "enable_modelarts=True" in the default_config.yaml file + # Set "distribute=True" in the default_config.yaml file + # Set "dataset_path='/cache/data'" in the default_config.yaml file + # Set "epoch_size: 20" in the default_config.yaml file + # (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" in the default_config.yaml file + # Set other parameters in the default_config.yaml file + # b. Set "enable_modelarts=True" on the web page + # Set "distribute=True" on the web page + # Set "dataset_path=/cache/data" on the web page + # Set "epoch_size: 20" on the web page + # (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" on the web page + # Set other parameters on the web page + # (2) Prepare model code + # (3) If you choose to fine-tune your model, please upload your pre-trained model to the S3 bucket + # (4) Perform a or b (recommended to choose a) + # a. First, compress the data set into a ".zip" file. + # Second, upload your compressed data set to the S3 bucket (You can also upload uncompressed data sets, but that may be very slow.) + # b. Upload the original data set to the S3 bucket. + # (Data set conversion occurs during the training process, which takes more time. The conversion will be performed every time you train.) + # (5) Set your code path on the web page to "/path/faster_rcnn" + # (6) Set the startup file to "train.py" on the web page + # (7) Set "training data set", "training output file path", "job log path", etc. on the web page + # (8) Start training + # + # Use single card training on ModelArts + # (1) Execute a or b + # a. Set "enable_modelarts=True" in the default_config.yaml file + # Set "dataset_path='/cache/data'" in the default_config.yaml file + # Set "epoch_size: 20" in the default_config.yaml file + # (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" in the default_config.yaml file + # Set other parameters in the default_config.yaml file + # b. 
Set "enable_modelarts=True" on the web page + # Set "dataset_path='/cache/data'" on the web page + # Set "epoch_size: 20" on the web page + # (Optional) Set "checkpoint_url='s3://dir_to_your_pretrained/'" on the web page + # Set other parameters on the web page + # (2) Prepare model code + # (3) If you choose to fine-tune your model, upload your pre-trained model to the S3 bucket + # (4) Perform a or b (recommended to choose a) + # a. First, compress the data set into a ".zip" file. + # Second, upload your compressed data set to the S3 bucket (You can also upload uncompressed data sets, but that may be very slow.) + # b. Upload the original data set to the S3 bucket. + # (Data set conversion occurs during the training process, which takes more time. The conversion will be performed every time you train.) + # (5) Set your code path on the web page to "/path/faster_rcnn" + # (6) Set the startup file to "train.py" on the web page + # (7) Set "training data set", "training output file path", "job log path", etc. on the web page + # (8) Create training job + # + # Use single card evaluation on ModelArts + # (1) Execute a or b + # a. Set "enable_modelarts=True" in the default_config.yaml file + # Set "checkpoint_url='s3://dir_to_your_trained_model/'" in the default_config.yaml file + # Set "checkpoint='./faster_rcnn/faster_rcnn_trained.ckpt'" in the default_config.yaml file + # Set "dataset_path='/cache/data'" in the default_config.yaml file + # Set other parameters in the default_config.yaml file + # b. Set "enable_modelarts=True" on the web page + # Set "checkpoint_url='s3://dir_to_your_trained_model/'" on the webpage + # Set "checkpoint='./faster_rcnn/faster_rcnn_trained.ckpt'" on the webpage + # Set "dataset_path='/cache/data'" on the web page + # Set other parameters on the web page + # (2) Prepare model code + # (3) Upload your trained model to the S3 bucket + # (4) Perform a or b (recommended to choose a) + # a. First, compress the data set into a ".zip" file. + # Second, upload your compressed data set to the S3 bucket (You can also upload uncompressed data sets, but that may be very slow.) + # b. Upload the original data set to the S3 bucket. + # (Data set conversion occurs during the training process, which takes more time. The conversion will be performed every time you train.) + # (5) Set your code path on the web page to "/path/faster_rcnn" + # (6) Set the startup file to "eval.py" on the web page + # (7) Set "training data set", "training output file path", "job log path", etc. on the web page + # (8) Create training job + ``` + +# Script description + +## Script and sample code + +```shell +. 
+└─faster_rcnn_dcn + ├─README.md // Faster R-CNN related instructions + ├─ascend310_infer // Implement 310 inference source code + ├─scripts + ├─run_standalone_train_ascend.sh // Ascend stand-alone shell script + ├─run_standalone_train_gpu.sh // GPU stand-alone shell script + ├─run_distribute_train_ascend.sh // Ascend distributed shell script + ├─run_distribute_train_gpu.sh // GPU distributed shell script + ├─run_infer_310.sh // Ascend distributed shell script + ├─run_eval_ascend.sh // Ascend distributed shell script + └─run_eval_gpu.sh // GPU distributed shell script + ├─src + ├─FasterRcnn + ├─__init__.py // init file + ├─anchor_generator.py // Anchor generator + ├─bbox_assign_sample.py // First stage sampler + ├─bbox_assign_sample_stage2.py // Second stage sampler + ├─dcn_v2.py // Variable convolutional V2 network + ├─faster_rcnn_resnet50.py // Faster R-CNN network with Resnet50 as the backbone + ├─fpn_neck.py // Feature Pyramid Network + ├─proposal_generator.py // Candidate generator + ├─rcnn.py // R-CNN network + ├─resnet.py // Backbone network + ├─roi_align.py // ROI alignment network + └─rpn.py // Regional candidate network + ├─dataset.py // Create and process the data set + ├─lr_schedule.py // Learning rate generator + ├─network_define.py // Faster R-CNN network definition + ├─util.py // Evaluation related operations + └─model_utils + ├─config.py // Get .yaml configuration parameters + ├─device_adapter.py // Get the id on the cloud + ├─local_adapter.py // Get local id + └─moxing_adapter.py // Data preparation on the cloud + ├─default_config.yaml // Resnet50 related configuration for Ascend + ├─default_config_gpu.yaml // Resnet50 related configuration for GPU + ├─export.py // Script to export AIR, MINDIR model + ├─eval.py // Evaluation script + ├─postprogress.py // 310 inference post-processing script + └─train.py // Training script +``` + +## Training process + +### Usage + +#### Run on Ascend + +```shell +# Ascend stand-alone training +bash run_standalone_train_ascend.sh [PRETRAINED_MODEL] [COCO_ROOT] [MINDRECORD_DIR](option) + +# Ascend distributed training +bash run_distribute_train_ascend.sh [RANK_TABLE_FILE] [PRETRAINED_MODEL] [COCO_ROOT] [MINDRECORD_DIR](option) +``` + +#### Run on GPU + +```shell +# GPU stand-alone training +bash run_standalone_train_gpu.sh [PRETRAINED_MODEL] [COCO_ROOT] [DEVICE_ID] [MINDRECORD_DIR](option) + +# GPU distributed training +bash run_distribute_train_gpu.sh [DEVICE_NUM] [PRETRAINED_MODEL] [COCO_ROOT] [MINDRECORD_DIR](option) +``` + +Notes: + +1. The rank_table.json specified by RANK_TABLE_FILE is required to run distributed tasks. You can use [hccl_tools](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) to generate this file. +2. PRETRAINED_MODEL should be a trained ResNet-50 checkpoint. 
If you need to load the checkpoints of the trained FasterRcnn, you need to modify train.py as follows: + +```python +# Comment out the following code +# load_path = args_opt.pre_trained +# if load_path != "": +# param_dict = load_checkpoint(load_path) +# for item in list(param_dict.keys()): +# if not item.startswith('backbone'): +# param_dict.pop(item) +# load_param_into_net(net, param_dict) + +# When loading the trained FasterRcnn checkpoint, you need to load the network parameters and optimizer to the model, so you can add the following code after defining the optimizer: + lr = Tensor(dynamic_lr(config, rank_size=device_num), mstype.float32) + opt = SGD(params=net.trainable_params(), learning_rate=lr, momentum=config.momentum, + weight_decay=config.weight_decay, loss_scale=config.loss_scale) + + if load_path != "": + param_dict = load_checkpoint(load_path) + for item in list(param_dict.keys()): + if item in ("global_step", "learning_rate") or "rcnn.reg_scores" in item or "rcnn.cls_scores" in item: + param_dict.pop(item) + load_param_into_net(opt, param_dict) + load_param_into_net(net, param_dict) +``` + +3. defaule_config.yaml contains the original data set path, you can choose "coco_root" or "image_dir". + +### Result + +The training results are saved in the example path, and the folder name starts with "train" or "train_parallel". You can find the checkpoint file and results in loss_rankid.log, as shown below. + +```log +# Distributed training results(8P) +339 epoch: 1 step: 1 total_loss: 5.00443 +340 epoch: 1 step: 2 total_loss: 1.09367 +340 epoch: 1 step: 3 total_loss: 0.90158 +... +346 epoch: 1 step: 15 total_loss: 0.31314 +347 epoch: 1 step: 16 total_loss: 0.84451 +347 epoch: 1 step: 17 total_loss: 0.63137 +``` + +## Evaluation process + +### Usage + +#### Run on Ascend + +```shell +# Ascend evaluation +bash run_eval_ascend.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH] [COCO_ROOT] [MINDRECORD_DIR](option) +``` + +> Generate checkpoints during training. +> +> The number of images in the data set must be the same as the number of tags in the VALIDATION_JSON_FILE file, otherwise the accuracy result display format may be abnormal. + +#### Run on GPU + +```shell +# GPU evaluation +bash run_eval_gpu.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH] [COCO_ROOT] [DEVICE_ID] [MINDRECORD_DIR](option) +``` + +> Generate checkpoints during training. +> +> The number of images in the data set must be the same as the number of tags in the VALIDATION_JSON_FILE file, otherwise the accuracy result display format may be abnormal. + +### Result on Ascend + +The evaluation result will be saved in the example path, the folder name is "eval". Under this folder, you can find results similar to the following in the log. 
+ +```log +Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.406 +Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.624 +Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.441 +Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.264 +Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.439 +Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.533 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.330 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.517 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.541 +Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.384 +Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.577 +Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.675 +``` + +### Result on GPU + +The evaluation result will be saved in the example path, the folder name is "eval". Under this folder, you can find results similar to the following in the log. + +```log +Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.402 +Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.615 +Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.434 +Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.256 +Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.429 +Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.522 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.331 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.516 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.540 +Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.374 +Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.570 +Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.670 +``` + +## Model export + +```shell +python export.py --config_path [CONFIG_PATH] --ckpt_file [CKPT_PATH] --device_target [DEVICE_TARGET] --file_format[EXPORT_FORMAT] +``` + +`EXPORT_FORMAT` Optional ["AIR", "MINDIR"] + +## Inference process + +### Instructions + +It is necessary to complete the export of the model in the Shengteng 910 environment before inference. The following example only supports mindir inference with batch_size=1. + +```shell +# Ascend310 inference +bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [ANNO_PATH] [DEVICE_ID] +``` + +### Result on Ascend + +The result of the inference is saved in the current directory, and the result similar to the following can be found in the acc.log log file. 
+ +```log +Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.403 +Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.620 +Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.434 +Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.252 +Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.436 +Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.523 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.328 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.513 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.536 +Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.370 +Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.573 +Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.667 + +``` + +### Result on GPU + +The result of the inference is saved in the current directory, and the result similar to the following can be found in the acc.log log file. + +```log +Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.402 +Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.615 +Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.434 +Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.256 +Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.429 +Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.522 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.331 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.516 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.540 +Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.374 +Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.570 +Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.670 + +``` + +# Model description + +## Performance + +### Training performance + +| parameter |Ascend | GPU | +| -------------------------- | -----------------------------------------------------------|------------------------------------------------------------| +| Model version | V1 | V1 | +| resource | Ascend 910;CPU 2.60GHz, 192 cores;RAM:755G | GeForce RTX 3090;CPU 2.90GHz, 64 cores;RAM:252G | +| Upload date | 2021/11/5 | 2021/11/13 | +| MindSpore Version | 1.3.0 | 1.5.0rc1 | +| Dataset | COCO 2017 | COCO 2017 | +| Training parameters | epoch=70, batch_size=2 | epoch=72, batch_size=2 | +| Optimizer | SGD | SGD | +| Loss function | Softmax Cross entropy, Sigmoid Cross entropy, SmoothL1Loss | Softmax Cross entropy, Sigmoid Cross entropy, SmoothL1Loss | +| speed | 8 cards:448 milliseconds/step | 8 cards:655 milliseconds/step | +| total time | 8 cards:66.2 hours | 8 cards:96,8 hours | +| parameters(M) | 486 | 486 | + +### Evaluation performance + +| parameter | Ascend | GPU | +| ------------------- | ----------------- | ----------------- | +| Model version | V1 | V1 | +| resource | Ascend 910 | GeForce RTX 3090 | +| Upload date | 2021/11/5 | 2021/11/13 | +| MindSpore Version | 1.3.0 | 1.5.0rc1 | +| Dataset | COCO2017 | COCO2017 | +| batch_size | 2 | 2 | +| Output | mAP | mAP | +| Accuracy | IoU=0.50:62.0% | IoU=0.50:61.5% | +| Evaluation model | 486M (.ckpt file) | 486M (.ckpt file) | + +# ModelZoo home page + +Please visit the official website [homepage](https://gitee.com/mindspore/models). 
diff --git a/research/cv/faster_rcnn_dcn/default_config_gpu.yaml b/research/cv/faster_rcnn_dcn/default_config_gpu.yaml new file mode 100644 index 0000000000000000000000000000000000000000..5c7ffdd121a0b8203d866d2588ff2dda702c77a8 --- /dev/null +++ b/research/cv/faster_rcnn_dcn/default_config_gpu.yaml @@ -0,0 +1,213 @@ +# Builtin Configurations(DO NOT CHANGE THESE CONFIGURATIONS unless you know exactly what you are doing) +enable_modelarts: False +data_url: "" +train_url: "" +checkpoint_url: "" +data_path: "/cache/data" +output_path: "/cache/train" +load_path: "/cache/checkpoint_path" +device_target: GPU +enable_profiling: False + +# ============================================================================== +# config +img_width: 1280 +img_height: 768 +keep_ratio: True +flip_ratio: 0.5 +expand_ratio: 1.0 + +# anchor +feature_shapes: +- [192, 320] +- [96, 160] +- [48, 80] +- [24, 40] +- [12, 20] +anchor_scales: [8] +anchor_ratios: [0.5, 1.0, 2.0] +anchor_strides: [4, 8, 16, 32, 64] +num_anchors: 3 + +# resnet +#resnet50 [3,4,6,3] +#renet101[3,4,23,3] +resnet_block: [3, 4, 6, 3] +resnet_in_channels: [64, 256, 512, 1024] +resnet_out_channels: [256, 512, 1024, 2048] + +# fpn +fpn_in_channels: [256, 512, 1024, 2048] +fpn_out_channels: 256 +fpn_num_outs: 5 + +# rpn +rpn_in_channels: 256 +rpn_feat_channels: 256 +rpn_loss_cls_weight: 1.0 +rpn_loss_reg_weight: 1.0 +rpn_cls_out_channels: 1 +rpn_target_means: [0., 0., 0., 0.] +rpn_target_stds: [1.0, 1.0, 1.0, 1.0] + +# bbox_assign_sampler +neg_iou_thr: 0.3 +pos_iou_thr: 0.7 +min_pos_iou: 0.3 +num_bboxes: 245520 +num_gts: 128 +num_expected_neg: 256 +num_expected_pos: 128 + +# proposal +activate_num_classes: 2 +use_sigmoid_cls: True + +# roi_align +roi_layer: {type: 'RoIAlign', out_size: 7, sample_num: 2} +roi_align_out_channels: 256 +roi_align_featmap_strides: [4, 8, 16, 32] +roi_align_finest_scale: 56 +roi_sample_num: 640 + +# bbox_assign_sampler_stage2 +neg_iou_thr_stage2: 0.5 +pos_iou_thr_stage2: 0.5 +min_pos_iou_stage2: 0.5 +num_bboxes_stage2: 2000 +num_expected_pos_stage2: 128 +num_expected_neg_stage2: 512 +num_expected_total_stage2: 512 + +# rcnn +rcnn_num_layers: 2 +rcnn_in_channels: 256 +rcnn_fc_out_channels: 1024 +rcnn_loss_cls_weight: 1 +rcnn_loss_reg_weight: 1 +rcnn_target_means: [0., 0., 0., 0.] 
+rcnn_target_stds: [0.1, 0.1, 0.2, 0.2] + +# train proposal +rpn_proposal_nms_across_levels: False +rpn_proposal_nms_pre: 2000 +rpn_proposal_nms_post: 2000 +rpn_proposal_max_num: 2000 +rpn_proposal_nms_thr: 0.7 +rpn_proposal_min_bbox_size: 0 + +# test proposal +rpn_nms_across_levels: False +rpn_nms_pre: 1000 +rpn_nms_post: 1000 +rpn_max_num: 1000 +rpn_nms_thr: 0.7 +rpn_min_bbox_min_size: 0 +test_score_thr: 0.05 +test_iou_thr: 0.5 +test_max_per_img: 100 +test_batch_size: 2 + +rpn_head_use_sigmoid: True +rpn_head_weight: 1.0 + +# LR +#resnet101 base_lr=0.02 +base_lr: 0.04 +warmup_step: 500 +warmup_ratio: 0.0625 +sgd_step: [8, 11] +sgd_momentum: 0.9 + +# train +batch_size: 2 +loss_scale: 256 +momentum: 0.91 +weight_decay: 0.00001 +epoch_size: 72 +save_checkpoint_epochs: 1 +keep_checkpoint_max: 20 +save_checkpoint_path: "save_ckpt/" +save_on_master: True + +# Number of threads used to process the dataset in parallel +num_parallel_workers: 8 +# Parallelize Python operations with multiple worker processes +python_multiprocessing: True +mindrecord_dir: "" +coco_root: "" +train_data_type: "train2017" +val_data_type: "val2017" +instance_set: "annotations/instances_{}.json" +coco_classes: ['background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', + 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', + 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', + 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', + 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', + 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', + 'kite', 'baseball bat', 'baseball glove', 'skateboard', + 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', + 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', + 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', + 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', + 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', + 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', + 'refrigerator', 'book', 'clock', 'vase', 'scissors', + 'teddy bear', 'hair drier', 'toothbrush'] +num_classes: 81 + +# annotations file(json format or user defined text format) +anno_path: '' +image_dir: '' + +# train.py FasterRcnn training +run_distribute: False +dataset: "coco" +pre_trained: "" +device_id: 0 +device_num: 1 +rank_id: 0 + +# eval.py FasterRcnn evaluation +checkpoint_path: " " + +# export.py fasterrcnn_export +file_name: "Faster_Rcnn_DCN" +file_format: "MINDIR" +ckpt_file: "" + +# postprocess +result_path: '' + +--- +# Config description for each option +enable_modelarts: 'Whether training on modelarts, default: False' +data_url: 'Dataset url for obs' +train_url: 'Training output url for obs' +data_path: 'Dataset path for local' +output_path: 'Training output path for local' +result_dir: "result files path." +label_dir: "image file path." + +device_target: "device where the code will be implemented, default is Ascend" +file_name: "output file name." +parameter_server: 'Run parameter server train' +width: 'input width' +height: 'input height' +enable_profiling: 'Whether enable profiling while training, default: False' +run_distribute: 'Run distribute, default is false.' +save_on_master: "Save ckpt on master or all rank, True for master, False for all ranks." + +pre_trained: 'Pretrained checkpoint path' +device_id: 'Device id, default is 0.' +device_num: 'Use device nums, default is 1.' +rank_id: 'Rank id, default is 0.' +file_format: 'file format' +anno_path: "Ann file, default is val.json." 
+checkpoint_path: "Checkpoint file path." +ckpt_file: 'fasterrcnn ckpt file.' +result_path: "result file path." + +--- +device_target: ['Ascend', 'GPU', 'CPU'] +file_format: ["AIR", "ONNX", "MINDIR"] diff --git a/research/cv/faster_rcnn_dcn/eval.py b/research/cv/faster_rcnn_dcn/eval.py index 656e536d5a3ba6284b1b53ef237674fbbe2a719b..b32139dd234cb1dad9d679f6e4361ef9d17841c0 100644 --- a/research/cv/faster_rcnn_dcn/eval.py +++ b/research/cv/faster_rcnn_dcn/eval.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -17,25 +17,37 @@ import os import time from collections import defaultdict + import numpy as np -from pycocotools.coco import COCO -import mindspore.common.dtype as mstype from mindspore import context -from mindspore.train.serialization import load_checkpoint, load_param_into_net +from mindspore.common import dtype as mstype from mindspore.common import set_seed, Parameter +from mindspore.train.serialization import load_checkpoint, load_param_into_net +from pycocotools.coco import COCO +from src.FasterRcnn.faster_rcnn_resnet import Faster_Rcnn_Resnet from src.dataset import data_to_mindrecord_byte_image, create_fasterrcnn_dataset, parse_json_annos_from_txt -from src.util import coco_eval, bbox2result_1image, results2json from src.model_utils.config import config -from src.model_utils.moxing_adapter import moxing_wrapper from src.model_utils.device_adapter import get_device_id -from src.FasterRcnn.faster_rcnn_resnet import Faster_Rcnn_Resnet +from src.model_utils.moxing_adapter import moxing_wrapper +from src.util import coco_eval, bbox2result_1image, results2json set_seed(1) context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target, device_id=get_device_id()) + def fasterrcnn_eval(dataset_path, ckpt_path, anno_path): - """FasterRcnn evaluation.""" + """ + Evaluate FasterRCNN on the provided dataset + + Args: + dataset_path: Path to dataset + ckpt_path: Path to checkpoint + anno_path: Path to annotation + + Returns: + + """ if not os.path.isfile(ckpt_path): raise RuntimeError("CheckPoint file {} is not valid.".format(ckpt_path)) ds = create_fasterrcnn_dataset(config, dataset_path, batch_size=config.test_batch_size, is_training=False) @@ -117,12 +129,15 @@ def fasterrcnn_eval(dataset_path, ckpt_path, anno_path): acc_file.close() print("eval end") + def modelarts_pre_process(): + """Prepare everything for modelarts""" config.coco_root = config.data_path config.mindrecord_dir = os.path.join(config.coco_root, "MindRecord_COCO") config.checkpoint_path = os.path.join(config.load_path, config.checkpoint_path) config.acclog_path = os.path.join(config.output_path, "mAP.log") + @moxing_wrapper(pre_process=modelarts_pre_process) def eval_fasterrcnn(): """ eval_fasterrcnn """ @@ -153,6 +168,7 @@ def eval_fasterrcnn(): print("Start Eval!") fasterrcnn_eval(mindrecord_file, config.checkpoint_path, config.anno_path) + if __name__ == '__main__': config.acclog_path = "./mAP.log" eval_fasterrcnn() diff --git a/research/cv/faster_rcnn_dcn/export.py b/research/cv/faster_rcnn_dcn/export.py index 803d220b97acffbfe5711febc5776485bd2ae815..3e53604e6815e1c68ff940765a45093a9705ba8b 100644 --- a/research/cv/faster_rcnn_dcn/export.py +++ b/research/cv/faster_rcnn_dcn/export.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # 
Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -15,21 +15,22 @@ """export checkpoint file into air, onnx, mindir models""" import numpy as np - -import mindspore.common.dtype as mstype from mindspore import Tensor, load_checkpoint, load_param_into_net, export, context +from mindspore.common import dtype as mstype + from src.FasterRcnn.faster_rcnn_resnet import FasterRcnn_Infer from src.model_utils.config import config -from src.model_utils.moxing_adapter import moxing_wrapper from src.model_utils.device_adapter import get_device_id - +from src.model_utils.moxing_adapter import moxing_wrapper context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target) if config.device_target == "Ascend": context.set_context(device_id=get_device_id()) + def modelarts_pre_process(): - pass + """Prepare everything for modelarts""" + @moxing_wrapper(pre_process=modelarts_pre_process) def export_fasterrcnn(): @@ -53,5 +54,6 @@ def export_fasterrcnn(): export(net, img, img_metas, file_name=config.file_name, file_format=config.file_format) + if __name__ == '__main__': export_fasterrcnn() diff --git a/research/cv/faster_rcnn_dcn/scripts/run_distribute_train_gpu.sh b/research/cv/faster_rcnn_dcn/scripts/run_distribute_train_gpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..203e85f9adadd9b1417bf202d7d7b9774ad2d003 --- /dev/null +++ b/research/cv/faster_rcnn_dcn/scripts/run_distribute_train_gpu.sh @@ -0,0 +1,71 @@ +#!/bin/bash +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +echo "==============================================================================================================" +echo "Please run the script as: " +echo "bash run_distribute_train_gpu.sh DEVICE_NUM PRETRAINED_PATH COCO_ROOT MINDRECORD_DIR(optional)" +echo "for example: bash run_distribute_train_gpu.sh 8 /path/pretrain.ckpt cocodataset mindrecord_dir(optional)" +echo "It is better to use absolute path." +echo "==============================================================================================================" + +if [ $# != 3 ] && [ $# != 4 ] +then + echo "Usage: bash run_distribute_train_gpu.sh [DEVICE_NUM] [PRETRAINED_PATH] [COCO_ROOT] [MINDRECORD_DIR](option)" +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} + +export RANK_SIZE=$1 +PRETRAINED_PATH=$(get_real_path $2) +PATH3=$(get_real_path $3) + +rm -rf run_distribute_train +mkdir run_distribute_train +cp -rf ../src/ ../train.py ../*.yaml ./run_distribute_train +cd run_distribute_train || exit + +mindrecord_dir=$PATH3/FASTERRCNN_MINDRECORD/ +if [ $# -eq 4 ] +then + mindrecord_dir=$(get_real_path $4) + if [ ! 
-d $mindrecord_dir ] + then + echo "error: mindrecord_dir=$mindrecord_dir is not a dir" + exit 1 + fi +fi +echo $mindrecord_dir + +BASE_PATH=$(cd ./"`dirname $0`" || exit; pwd) +CONFIG_FILE="${BASE_PATH}/../../default_config_gpu.yaml" + +echo "start training on $RANK_SIZE devices" + +mpirun -n $RANK_SIZE --allow-run-as-root --output-filename log_output --merge-stderr-to-stdout \ + python train.py \ + --config_path=$CONFIG_FILE \ + --run_distribute=True \ + --device_target="GPU" \ + --device_num=$RANK_SIZE \ + --pre_trained=$PRETRAINED_PATH \ + --coco_root=$PATH3 \ + --mindrecord_dir=$mindrecord_dir > log 2>&1 & diff --git a/research/cv/faster_rcnn_dcn/scripts/run_eval_ascend.sh b/research/cv/faster_rcnn_dcn/scripts/run_eval_ascend.sh index 551f92a6917de0634fb5c9ff4b02eeaba67510a4..3e696c4f40e532e37e31016982ad336a30d8f7b3 100644 --- a/research/cv/faster_rcnn_dcn/scripts/run_eval_ascend.sh +++ b/research/cv/faster_rcnn_dcn/scripts/run_eval_ascend.sh @@ -1,5 +1,5 @@ #!/bin/bash -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. diff --git a/research/cv/faster_rcnn_dcn/scripts/run_eval_gpu.sh b/research/cv/faster_rcnn_dcn/scripts/run_eval_gpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..fc1d917fbca07a16fc492d59239a554330b735db --- /dev/null +++ b/research/cv/faster_rcnn_dcn/scripts/run_eval_gpu.sh @@ -0,0 +1,89 @@ +#!/bin/bash +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +if [ $# != 4 ] && [ $# != 5 ] +then + echo "Usage: bash run_eval_gpu.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH] [COCO_ROOT] [DEVICE_ID] [MINDRECORD_DIR](option)" +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} +PATH1=$(get_real_path $1) +PATH2=$(get_real_path $2) +PATH3=$(get_real_path $3) +echo $PATH3 +echo $PATH1 +echo $PATH2 + +if [ ! -f $PATH1 ] +then + echo "error: ANNO_PATH=$PATH1 is not a file" +exit 1 +fi + +if [ ! -f $PATH2 ] +then + echo "error: CHECKPOINT_PATH=$PATH2 is not a file" +exit 1 +fi + +if [ ! -d $PATH3 ] +then + echo "error: COCO_ROOT=$PATH3 is not a dir" +exit 1 +fi + +mindrecord_dir=$PATH3/FASTERRCNN_MINDRECORD/ +if [ $# == 5 ] +then + mindrecord_dir=$(get_real_path $5) + if [ ! 
-d $mindrecord_dir ] + then + echo "error: mindrecord_dir=$mindrecord_dir is not a dir" + exit 1 + fi +fi +echo $mindrecord_dir + +BASE_PATH=$(cd ./"`dirname $0`" || exit; pwd) +CONFIG_FILE="${BASE_PATH}/../default_config_gpu.yaml" + +export DEVICE_NUM=1 +export RANK_SIZE=$DEVICE_NUM +export DEVICE_ID=$4 +export RANK_ID=0 + +if [ -d "eval" ]; +then + rm -rf ./eval +fi +mkdir ./eval +cp ../*.py ./eval +cp ../*.yaml ./eval +cp *.sh ./eval +cp -r ../src ./eval +cd ./eval || exit +env > env.log +echo "start eval for device $DEVICE_ID" +python eval.py --config_path=$CONFIG_FILE --device_id=$DEVICE_ID --anno_path=$PATH1 --checkpoint_path=$PATH2 \ +--coco_root=$PATH3 --mindrecord_dir=$mindrecord_dir &> log & +cd .. diff --git a/research/cv/faster_rcnn_dcn/scripts/run_standalone_train_gpu.sh b/research/cv/faster_rcnn_dcn/scripts/run_standalone_train_gpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..858cb06fd2b5432ea25db2f85627d343db65f5f9 --- /dev/null +++ b/research/cv/faster_rcnn_dcn/scripts/run_standalone_train_gpu.sh @@ -0,0 +1,82 @@ +#!/bin/bash +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +if [ $# != 3 ] && [ $# != 4 ] +then + echo "Usage: bash run_standalone_train_gpu.sh [PRETRAINED_PATH] [COCO_ROOT] [DEVICE_ID] [MINDRECORD_DIR](option)" +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} + +PATH1=$(get_real_path $1) +PATH2=$(get_real_path $2) +echo $PATH1 +echo $PATH2 + +if [ ! -f $PATH1 ] +then + echo "error: PRETRAINED_PATH=$PATH1 is not a file" +exit 1 +fi + +if [ ! -d $PATH2 ] +then + echo "error: COCO_ROOT=$PATH2 is not a dir" +exit 1 +fi + +mindrecord_dir=$PATH2/FASTERRCNN_MINDRECORD/ +if [ $# == 4 ] +then + mindrecord_dir=$(get_real_path $4) + if [ ! -d $mindrecord_dir ] + then + echo "error: mindrecord_dir=$mindrecord_dir is not a dir" + exit 1 + fi +fi +echo $mindrecord_dir + +BASE_PATH=$(cd ./"`dirname $0`" || exit; pwd) +CONFIG_FILE="${BASE_PATH}/../default_config_gpu.yaml" + +export DEVICE_NUM=1 +export DEVICE_ID=$3 +export RANK_ID=0 +export RANK_SIZE=1 + +if [ -d "train" ]; +then + rm -rf ./train +fi +mkdir ./train +cp ../*.py ./train +cp ../*.yaml ./train +cp *.sh ./train +cp -r ../src ./train +cd ./train || exit +echo "start training for device $DEVICE_ID" +env > env.log +python train.py --config_path=$CONFIG_FILE --coco_root=$PATH2 --mindrecord_dir=$mindrecord_dir \ +--device_id=$DEVICE_ID --pre_trained=$PATH1 --device_target="GPU" > log 2>&1 & +cd .. 
diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/anchor_generator.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/anchor_generator.py index 82e10093baa985d806e5d3752cdaaf0175be81a1..bf1d486c83488d245f70674e75f5da5341ef7065 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/anchor_generator.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/anchor_generator.py @@ -1,4 +1,4 @@ -# Copyright 2020-2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -16,7 +16,8 @@ import numpy as np -class AnchorGenerator(): + +class AnchorGenerator: """Anchor generator for FasterRcnn.""" def __init__(self, base_size, scales, ratios, scale_major=True, ctr=None): """Anchor generator init method.""" diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/bbox_assign_sample.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/bbox_assign_sample.py index 28f550c4c4c184f52ecdd98b7a759fc821e274f9..0249db8217f6d4547a0b2347b6fc28e77814f5ee 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/bbox_assign_sample.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/bbox_assign_sample.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -15,10 +15,10 @@ """FasterRcnn-DCN positive and negative sample screening for RPN.""" import numpy as np -import mindspore.nn as nn -from mindspore.ops import operations as P +from mindspore import nn +from mindspore.common import dtype as mstype from mindspore.common.tensor import Tensor -import mindspore.common.dtype as mstype +from mindspore.ops import operations as P class BboxAssignSample(nn.Cell): diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/bbox_assign_sample_stage2.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/bbox_assign_sample_stage2.py index 7543cabdf06b0b9412148757c651935af94961fd..f89ecdc720c7744b22d299ddad36fb1f2fcfdfc9 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/bbox_assign_sample_stage2.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/bbox_assign_sample_stage2.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
@@ -15,10 +15,10 @@ """FasterRcnn-DCN tpositive and negative sample screening for Rcnn.""" import numpy as np -import mindspore.nn as nn -import mindspore.common.dtype as mstype -from mindspore.ops import operations as P +from mindspore import nn +from mindspore.common import dtype as mstype from mindspore.common.tensor import Tensor +from mindspore.ops import operations as P class BboxAssignSampleForRcnn(nn.Cell): @@ -110,12 +110,22 @@ class BboxAssignSampleForRcnn(nn.Cell): def construct(self, gt_bboxes_i, gt_labels_i, valid_mask, bboxes, gt_valids): """construct""" - gt_bboxes_i = self.select(self.cast(self.tile(self.reshape(self.cast(gt_valids, mstype.int32), \ - (self.num_gts, 1)), (1, 4)), mstype.bool_), \ - gt_bboxes_i, self.check_gt_one) - bboxes = self.select(self.cast(self.tile(self.reshape(self.cast(valid_mask, mstype.int32), \ - (self.num_bboxes, 1)), (1, 4)), mstype.bool_), \ - bboxes, self.check_anchor_two) + gt_bboxes_i = self.select( + self.cast( + self.tile(self.reshape(self.cast(gt_valids, mstype.int32), (self.num_gts, 1)), (1, 4)), + mstype.bool_ + ), + gt_bboxes_i, + self.check_gt_one + ) + bboxes = self.select( + self.cast( + self.tile(self.reshape(self.cast(valid_mask, mstype.int32), (self.num_bboxes, 1)), (1, 4)), + mstype.bool_ + ), + bboxes, + self.check_anchor_two + ) overlaps = self.iou(bboxes, gt_bboxes_i) @@ -130,7 +140,7 @@ class BboxAssignSampleForRcnn(nn.Cell): assigned_gt_inds2 = self.select(neg_sample_iou_mask, self.assigned_gt_zeros, self.assigned_gt_inds) pos_sample_iou_mask = self.greaterequal(max_overlaps_w_gt, self.scalar_pos_iou_thr) - assigned_gt_inds3 = self.select(pos_sample_iou_mask, \ + assigned_gt_inds3 = self.select(pos_sample_iou_mask, max_overlaps_w_gt_index + self.assigned_gt_ones, assigned_gt_inds2) for j in range(self.num_gts): diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/dcn_v2.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/dcn_v2.py index 604a748cc31ebbdea60d949761ad7b66c2770821..e78b833e9c68931c6bc1ead3af6ce06f8e8615b9 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/dcn_v2.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/dcn_v2.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
@@ -14,11 +14,12 @@ # ============================================================================ """Deformable Convolution operator V2""" -import numpy as np import mindspore as ms +import mindspore.common.dtype as mstype import mindspore.nn as nn import mindspore.ops as ops -import mindspore.common.dtype as mstype +import numpy as np + np.random.seed(0) ms.common.set_seed(0) @@ -64,7 +65,6 @@ def _get_offset_base(offset_shape, stride): def _get_feature_by_index(x, p_h, p_w): - """gather feature by specified index""" # x (n, c, h_in, w_in) # p_h (n, h, w, k*k) diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/faster_rcnn_resnet.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/faster_rcnn_resnet.py index b0202e6229c4ceed59696f56d37a45dcf87aa731..d85519f9f7af97eb272657cf76e789afc3af25f2 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/faster_rcnn_resnet.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/faster_rcnn_resnet.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -15,20 +15,21 @@ """FasterRcnn-DCN based on ResNet.""" import numpy as np -import mindspore.nn as nn from mindspore import context -from mindspore.ops import operations as P +from mindspore import nn +from mindspore.common import dtype as mstype from mindspore.common.tensor import Tensor -import mindspore.common.dtype as mstype from mindspore.ops import functional as F -from .resnet import ResNetFea, ResidualBlockUsing, ResidualBlockUsingDCN +from mindspore.ops import operations as P + +from .anchor_generator import AnchorGenerator from .bbox_assign_sample_stage2 import BboxAssignSampleForRcnn from .fpn_neck import FeatPyramidNeck from .proposal_generator import Proposal from .rcnn import Rcnn -from .rpn import RPN +from .resnet import ResNetFea, ResidualBlockUsing, ResidualBlockUsingDCN from .roi_align import SingleRoIExtractor -from .anchor_generator import AnchorGenerator +from .rpn import RPN class Faster_Rcnn_Resnet(nn.Cell): @@ -222,7 +223,7 @@ class Faster_Rcnn_Resnet(nn.Cell): self.test_num_proposal = self.test_batch_size * self.rpn_max_num def init_tensor(self, config): - + """Initialize tensor""" roi_align_index = [np.array(np.ones((config.num_expected_pos_stage2 + config.num_expected_neg_stage2, 1)) * i, dtype=self.dtype) for i in range(self.train_batch_size)] @@ -478,12 +479,15 @@ class Faster_Rcnn_Resnet(nn.Cell): return multi_level_anchors + class FasterRcnn_Infer(nn.Cell): + """Perform inference of FasterRCNN""" def __init__(self, config): super(FasterRcnn_Infer, self).__init__() self.network = Faster_Rcnn_Resnet(config) self.network.set_train(False) def construct(self, img_data, img_metas): + """Forward pass throw network""" output = self.network(img_data, img_metas, None, None, None) return output diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/fpn_neck.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/fpn_neck.py index c2916f285628d4c5e31de8cef9cc474e819c3ae8..f77407062ae42bb34d3682fad61e1f2a37ed77ef 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/fpn_neck.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/fpn_neck.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
@@ -15,20 +15,41 @@ """FasterRcnn-DCN feature pyramid network.""" import numpy as np -import mindspore.nn as nn -from mindspore.ops import operations as P -from mindspore.common.tensor import Tensor +from mindspore import nn from mindspore.common import dtype as mstype from mindspore.common.initializer import initializer +from mindspore.common.tensor import Tensor +from mindspore.ops import operations as P def bias_init_zeros(shape): - """Bias init method.""" + """ + Bias init method + + Args: + shape: Tensor shape + + Returns: + Bias tensor initialized with zeros + """ return Tensor(np.array(np.zeros(shape).astype(np.float32))) def _conv(in_channels, out_channels, kernel_size=3, stride=1, padding=0, pad_mode='pad'): - """Conv2D wrapper.""" + """ + Conv2D wrapper + + Args: + in_channels: Input channels + out_channels: Output channels + kernel_size: Size of kernel + stride: Strides + padding: Paddings + pad_mode: Padding mode + + Returns: + Wrapped Conv2D layer + """ shape = (out_channels, in_channels, kernel_size, kernel_size) weights = initializer("XavierUniform", shape=shape, dtype=mstype.float32).init_data() shape_bias = (out_channels,) diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/proposal_generator.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/proposal_generator.py index 2fe5f31aa48a8a2d6675db5c56a9af9e039313c7..6987f8883e9b46a20f5467f8d95d885da560325b 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/proposal_generator.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/proposal_generator.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -15,10 +15,10 @@ """FasterRcnn-DCN proposal generator.""" import numpy as np -import mindspore.nn as nn -import mindspore.common.dtype as mstype -from mindspore.ops import operations as P from mindspore import Tensor +from mindspore import nn +from mindspore.common import dtype as mstype +from mindspore.ops import operations as P class Proposal(nn.Cell): diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/rcnn.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/rcnn.py index 11defd618a8618d7aa3fab32500368132264f951..e0922a12021ea3805282f39abd71b2ca80910ab6 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/rcnn.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/rcnn.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
@@ -15,13 +15,13 @@ """FasterRcnn-DCN Rcnn network.""" import numpy as np -import mindspore.common.dtype as mstype -import mindspore.nn as nn -from mindspore.ops import operations as P -from mindspore.common.tensor import Tensor +from mindspore import context +from mindspore import nn +from mindspore.common import dtype as mstype from mindspore.common.initializer import initializer from mindspore.common.parameter import Parameter -from mindspore import context +from mindspore.common.tensor import Tensor +from mindspore.ops import operations as P class DenseNoTranpose(nn.Cell): @@ -37,6 +37,7 @@ class DenseNoTranpose(nn.Cell): self.device_type = "Ascend" if context.get_context("device_target") == "Ascend" else "Others" def construct(self, x): + """Forward pass throw model""" if self.device_type == "Ascend": x = self.cast(x, mstype.float16) weight = self.cast(self.weight, mstype.float16) diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/resnet.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/resnet.py index 4e8a0ee9037457e60d8cdb1a5d224cdfea5ebff4..38ae443442ae20c8b2030196ade65759a99ce0e4 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/resnet.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/resnet.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -15,19 +15,42 @@ """Resnet backbone.""" import numpy as np -import mindspore.nn as nn -from mindspore.ops import operations as P +from mindspore import nn from mindspore.common.tensor import Tensor from mindspore.ops import functional as F +from mindspore.ops import operations as P + from .dcn_v2 import DeformConv2d + def weight_init_ones(shape): - """Weight init.""" + """ + Weight init + + Args: + shape: Data shape + + Returns: + Initialized weights + """ return Tensor(np.full(shape, 0.01).astype(np.float32)) def _conv(in_channels, out_channels, kernel_size=3, stride=1, padding=0, pad_mode='pad'): - """Conv2D wrapper.""" + """ + Conv2D wrapper + + Args: + in_channels: Input channels + out_channels: Output channels + kernel_size: Size of kernel + stride: Strides + padding: Paddings + pad_mode: Padding mode + + Returns: + Wrapped Conv2D layer + """ shape = (out_channels, in_channels, kernel_size, kernel_size) weights = weight_init_ones(shape) return nn.Conv2d(in_channels, out_channels, @@ -36,7 +59,18 @@ def _conv(in_channels, out_channels, kernel_size=3, stride=1, padding=0, pad_mod def _BatchNorm2dInit(out_chls, momentum=0.1, affine=True, use_batch_statistics=True): - """Batchnorm2D wrapper.""" + """ + Batchnorm2D wrapper + + Args: + out_chls: Number of output channels + momentum: Momentum + affine: Flag to apply affine transformations + use_batch_statistics: Flag to use batch statistics + + Returns: + Wrapped BatchNorm2d layer + """ dtype = np.float32 gamma_init = Tensor(np.array(np.ones(out_chls)).astype(dtype)) beta_init = Tensor(np.array(np.ones(out_chls) * 0).astype(dtype)) diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/roi_align.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/roi_align.py index ef2e37b08c11f91b8368fd9db750887f0bf07c07..62cdad91e3b4f13bb3ca985a2704d10681c5ebe3 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/roi_align.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/roi_align.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # 
Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -15,12 +15,13 @@ """FasterRcnn-DCN ROIAlign module.""" import numpy as np -import mindspore.nn as nn -import mindspore.common.dtype as mstype -from mindspore.ops import operations as P -from mindspore.ops import composite as C -from mindspore.nn import layer as L +from mindspore import nn +from mindspore.common import dtype as mstype from mindspore.common.tensor import Tensor +from mindspore.nn import layer as L +from mindspore.ops import composite as C +from mindspore.ops import operations as P + class ROIAlign(nn.Cell): """ @@ -44,6 +45,7 @@ class ROIAlign(nn.Cell): self.sample_num = int(sample_num) self.align_op = P.ROIAlign(self.out_size[0], self.out_size[1], self.spatial_scale, self.sample_num) + def construct(self, features, rois): return self.align_op(features, rois) @@ -120,16 +122,28 @@ class SingleRoIExtractor(nn.Cell): self.twos = Tensor(np.array(np.ones((self.batch_size, 1)), dtype=self.dtype) * 2) self.res_ = Tensor(np.array(np.zeros((self.batch_size, self.out_channels, self.out_size, self.out_size)), dtype=self.dtype)) + def num_inputs(self): + """Number of inputs""" return len(self.featmap_strides) def init_weights(self): - pass + """Initialize weights""" def log2(self, value): + """Calc logarithm""" return self.log(value) / self.log(self.twos) def build_roi_layers(self, featmap_strides): + """ + Build ROI layers + + Args: + featmap_strides: Strides of featuremaps + + Returns: + ROI layers + """ roi_layers = [] for s in featmap_strides: layer_cls = ROIAlign(self.out_size, self.out_size, diff --git a/research/cv/faster_rcnn_dcn/src/FasterRcnn/rpn.py b/research/cv/faster_rcnn_dcn/src/FasterRcnn/rpn.py index f8e324a66248e16fd680a1bd6aba9b71a3c4ab5f..5748d33c8d29b2d0c0a2b9f54e6640d83ebf36ef 100644 --- a/research/cv/faster_rcnn_dcn/src/FasterRcnn/rpn.py +++ b/research/cv/faster_rcnn_dcn/src/FasterRcnn/rpn.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -15,12 +15,13 @@ """RPN for FasterRCNN-DCN""" import numpy as np -import mindspore.nn as nn -import mindspore.common.dtype as mstype from mindspore import context, Tensor -from mindspore.ops import operations as P -from mindspore.ops import functional as F +from mindspore import nn +from mindspore.common import dtype as mstype from mindspore.common.initializer import initializer +from mindspore.ops import functional as F +from mindspore.ops import operations as P + from .bbox_assign_sample import BboxAssignSample @@ -65,6 +66,7 @@ class RpnRegClsBlock(nn.Cell): has_bias=True, weight_init=weight_reg, bias_init=bias_reg) def construct(self, x): + """Forward pass throw network""" x = self.relu(self.rpn_conv(x)) x1 = self.rpn_cls(x) diff --git a/research/cv/faster_rcnn_dcn/src/dataset.py b/research/cv/faster_rcnn_dcn/src/dataset.py index 23697df6a9bc7fb887443e92a49b1838ee4b50a4..72cba6b04f5b4e92acdd548fe5e58925bf0202c9 100644 --- a/research/cv/faster_rcnn_dcn/src/dataset.py +++ b/research/cv/faster_rcnn_dcn/src/dataset.py @@ -1,4 +1,4 @@ -# Copyright 2021-2022 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
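The `SingleRoIExtractor` in `roi_align.py` above keeps a `log2` helper (`log(value) / log(2)`) that is typically used to decide which feature-pyramid level each RoI should pool from. The NumPy sketch below shows how such a scale-to-level mapping usually works; the `finest_scale` value, the epsilon, and the function name are assumptions for illustration, not values taken from the repository.

```python
import numpy as np


def map_roi_to_fpn_level(rois, finest_scale=56, num_levels=4):
    # rois: (N, 4) boxes given as (x1, y1, x2, y2)
    scale = np.sqrt((rois[:, 2] - rois[:, 0]) * (rois[:, 3] - rois[:, 1]))
    # level grows by one each time the RoI scale doubles relative to finest_scale
    levels = np.floor(np.log2(scale / finest_scale + 1e-6))
    return np.clip(levels, 0, num_levels - 1).astype(np.int32)


rois = np.array([[0, 0, 56, 56], [0, 0, 224, 112], [0, 0, 800, 600]], dtype=np.float32)
print(map_roi_to_fpn_level(rois))  # [0 1 3]: larger RoIs pool from coarser pyramid levels
```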
@@ -17,12 +17,12 @@ from __future__ import division import os -import numpy as np -from numpy import random import cv2 -import mindspore.dataset as de -import mindspore.dataset.vision as C +import numpy as np +import numpy.random as random +from mindspore import dataset as de +from mindspore.dataset.vision import c_transforms as C from mindspore.mindrecord import FileWriter @@ -74,6 +74,7 @@ def bbox_overlaps(bboxes1, bboxes2, mode='iou'): class PhotoMetricDistortion: """Photo Metric Distortion""" + def __init__(self, brightness_delta=32, contrast_range=(0.5, 1.5), @@ -85,6 +86,18 @@ class PhotoMetricDistortion: self.hue_delta = hue_delta def __call__(self, img, boxes, labels): + """ + Apply random brightness + Args: + img: Source image + boxes: Bounding boxes positions + labels: Labels + + Returns: + Image with random brightness; + Bounding boxes positions; + Labels + """ # random brightness img = img.astype('float32') @@ -134,7 +147,8 @@ class PhotoMetricDistortion: class Expand: - """expand image""" + """Expand an image""" + def __init__(self, mean=(0, 0, 0), to_rgb=True, ratio_range=(1, 4)): if to_rgb: self.mean = mean[::-1] @@ -143,6 +157,16 @@ class Expand: self.min_ratio, self.max_ratio = ratio_range def __call__(self, img, boxes, labels): + """ + Apply an image expansion + Args: + img: Source image + boxes: Bounding boxes positions + labels: Labels + + Returns: + Expanded images + """ if random.randint(2): return img, boxes, labels @@ -159,6 +183,17 @@ class Expand: def rescale_with_tuple(img, scale): + """ + Rescale an image with given scales + + Args: + img: Source image + scale: Scales to rescale with them + + Returns: + Rescaled image + Scale factor + """ h, w = img.shape[:2] scale_factor = min(max(scale) / max(h, w), min(scale) / min(h, w)) new_size = int(w * float(scale_factor) + 0.5), int(h * float(scale_factor) + 0.5) @@ -168,17 +203,44 @@ def rescale_with_tuple(img, scale): def rescale_with_factor(img, scale_factor): + """ + Rescale ab image with given scales + + Args: + img: Source image + scale_factor: Scale given as factor + + Returns: + Rescaled image + """ h, w = img.shape[:2] new_size = int(w * float(scale_factor) + 0.5), int(h * float(scale_factor) + 0.5) return cv2.resize(img, new_size, interpolation=cv2.INTER_NEAREST) def rescale_column(img, img_shape, gt_bboxes, gt_label, gt_num, config): - """rescale operation for image""" + """ + Rescale operation for an image + + Args: + img: Source image + img_shape: Shape of the image + gt_bboxes: Ground truth bounding boxes + gt_label: Ground truth labels + gt_num: Number of ground truth labels + config: Config object with training parameters + + Returns: + Padded image + Shape of image + Ground truth bounding boxes + Ground truth labels + Number of ground truth labels + """ img_data, scale_factor = rescale_with_tuple(img, (config.img_width, config.img_height)) if img_data.shape[0] > config.img_height: img_data, scale_factor2 = rescale_with_tuple(img_data, (config.img_height, config.img_height)) - scale_factor = scale_factor*scale_factor2 + scale_factor = scale_factor * scale_factor2 gt_bboxes = gt_bboxes * scale_factor gt_bboxes[:, 0::2] = np.clip(gt_bboxes[:, 0::2], 0, img_data.shape[1] - 1) @@ -194,14 +256,32 @@ def rescale_column(img, img_shape, gt_bboxes, gt_label, gt_num, config): img_shape = (config.img_height, config.img_width, 1.0) img_shape = np.asarray(img_shape, dtype=np.float32) - return (pad_img_data, img_shape, gt_bboxes, gt_label, gt_num) + return pad_img_data, img_shape, gt_bboxes, gt_label, gt_num + def 
rescale_column_test(img, img_shape, gt_bboxes, gt_label, gt_num, config): - """rescale operation for image of eval""" + """ + Rescale operation for an image of evaluation data + + Args: + img: Source image + img_shape: Shape of the image + gt_bboxes: Ground truth bounding boxes + gt_label: Ground truth labels + gt_num: Number of ground truth labels + config: Config object with training parameters + + Returns: + Padded image + Shape of image + Ground truth bounding boxes + Ground truth labels + Number of ground truth labels + """ img_data, scale_factor = rescale_with_tuple(img, (config.img_width, config.img_height)) if img_data.shape[0] > config.img_height: img_data, scale_factor2 = rescale_with_tuple(img_data, (config.img_height, config.img_height)) - scale_factor = scale_factor*scale_factor2 + scale_factor = scale_factor * scale_factor2 pad_h = config.img_height - img_data.shape[0] pad_w = config.img_width - img_data.shape[1] @@ -213,11 +293,28 @@ def rescale_column_test(img, img_shape, gt_bboxes, gt_label, gt_num, config): img_shape = np.append(img_shape, (scale_factor, scale_factor)) img_shape = np.asarray(img_shape, dtype=np.float32) - return (pad_img_data, img_shape, gt_bboxes, gt_label, gt_num) + return pad_img_data, img_shape, gt_bboxes, gt_label, gt_num def resize_column(img, img_shape, gt_bboxes, gt_label, gt_num, config): - """resize operation for image""" + """ + Resize operation for an image + + Args: + img: Source image + img_shape: Shape of the image + gt_bboxes: Ground truth bounding boxes + gt_label: Ground truth labels + gt_num: Number of ground truth labels + config: Config object with training parameters + + Returns: + Padded image + Shape of image + Ground truth bounding boxes + Ground truth labels + Number of ground truth labels + """ img_data = img h, w = img_data.shape[:2] img_data = cv2.resize( @@ -235,11 +332,28 @@ def resize_column(img, img_shape, gt_bboxes, gt_label, gt_num, config): gt_bboxes[:, 0::2] = np.clip(gt_bboxes[:, 0::2], 0, img_shape[1] - 1) gt_bboxes[:, 1::2] = np.clip(gt_bboxes[:, 1::2], 0, img_shape[0] - 1) - return (img_data, img_shape, gt_bboxes, gt_label, gt_num) + return img_data, img_shape, gt_bboxes, gt_label, gt_num def resize_column_test(img, img_shape, gt_bboxes, gt_label, gt_num, config): - """resize operation for image of eval""" + """ + Resize operation for an image of evaluation data + + Args: + img: Source image + img_shape: Shape of the image + gt_bboxes: Ground truth bounding boxes + gt_label: Ground truth labels + gt_num: Number of ground truth labels + config: Config object with training parameters + + Returns: + Padded image + Shape of image + Ground truth bounding boxes + Ground truth labels + Number of ground truth labels + """ img_data = img h, w = img_data.shape[:2] img_data = cv2.resize( @@ -257,21 +371,55 @@ def resize_column_test(img, img_shape, gt_bboxes, gt_label, gt_num, config): gt_bboxes[:, 0::2] = np.clip(gt_bboxes[:, 0::2], 0, img_shape[1] - 1) gt_bboxes[:, 1::2] = np.clip(gt_bboxes[:, 1::2], 0, img_shape[0] - 1) - return (img_data, img_shape, gt_bboxes, gt_label, gt_num) + return img_data, img_shape, gt_bboxes, gt_label, gt_num def impad_to_multiple_column(img, img_shape, gt_bboxes, gt_label, gt_num, config): - """impad operation for image""" + """ + Image padding operation for sn image + + Args: + img: Source image + img_shape: Shape of the image + gt_bboxes: Ground truth bounding boxes + gt_label: Ground truth labels + gt_num: Number of ground truth labels + config: Config object with training parameters + + 
Returns: + Padded image + Shape of image + Ground truth bounding boxes + Ground truth labels + Number of ground truth labels + """ img_data = cv2.copyMakeBorder(img, 0, config.img_height - img.shape[0], 0, config.img_width - img.shape[1], cv2.BORDER_CONSTANT, value=0) img_data = img_data.astype(np.float32) - return (img_data, img_shape, gt_bboxes, gt_label, gt_num) + return img_data, img_shape, gt_bboxes, gt_label, gt_num def imnormalize_column(img, img_shape, gt_bboxes, gt_label, gt_num): - """imnormalize operation for image""" + """ + Image normalization operation for an image + + Args: + img: Source image + img_shape: Shape of the image + gt_bboxes: Ground truth bounding boxes + gt_label: Ground truth labels + gt_num: Number of ground truth labels + config: Config object with training parameters + + Returns: + Padded image + Shape of image + Ground truth bounding boxes + Ground truth labels + Number of ground truth labels + """ mean = np.asarray([123.675, 116.28, 103.53]) std = np.asarray([58.395, 57.12, 57.375]) img_data = img.copy().astype(np.float32) @@ -280,11 +428,28 @@ def imnormalize_column(img, img_shape, gt_bboxes, gt_label, gt_num): cv2.multiply(img_data, 1 / np.float64(std.reshape(1, -1)), img_data) # inplace img_data = img_data.astype(np.float32) - return (img_data, img_shape, gt_bboxes, gt_label, gt_num) + return img_data, img_shape, gt_bboxes, gt_label, gt_num def flip_column(img, img_shape, gt_bboxes, gt_label, gt_num): - """flip operation for image""" + """ + Flip operation for an image + + Args: + img: Source image + img_shape: Shape of the image + gt_bboxes: Ground truth bounding boxes + gt_label: Ground truth labels + gt_num: Number of ground truth labels + config: Config object with training parameters + + Returns: + Padded image + Shape of image + Ground truth bounding boxes + Ground truth labels + Number of ground truth labels + """ img_data = img img_data = np.flip(img_data, axis=1) flipped = gt_bboxes.copy() @@ -293,11 +458,28 @@ def flip_column(img, img_shape, gt_bboxes, gt_label, gt_num): flipped[..., 0::4] = w - gt_bboxes[..., 2::4] - 1 flipped[..., 2::4] = w - gt_bboxes[..., 0::4] - 1 - return (img_data, img_shape, flipped, gt_label, gt_num) + return img_data, img_shape, flipped, gt_label, gt_num def transpose_column(img, img_shape, gt_bboxes, gt_label, gt_num): - """transpose operation for image""" + """ + Transpose operation for an image + + Args: + img: Source image + img_shape: Shape of the image + gt_bboxes: Ground truth bounding boxes + gt_label: Ground truth labels + gt_num: Number of ground truth labels + config: Config object with training parameters + + Returns: + Padded image + Shape of image + Ground truth bounding boxes + Ground truth labels + Number of ground truth labels + """ img_data = img.transpose(2, 0, 1).copy() img_data = img_data.astype(np.float32) img_shape = img_shape.astype(np.float32) @@ -305,27 +487,73 @@ def transpose_column(img, img_shape, gt_bboxes, gt_label, gt_num): gt_label = gt_label.astype(np.int32) gt_num = gt_num.astype(np.bool) - return (img_data, img_shape, gt_bboxes, gt_label, gt_num) + return img_data, img_shape, gt_bboxes, gt_label, gt_num def photo_crop_column(img, img_shape, gt_bboxes, gt_label, gt_num): - """photo crop operation for image""" + """ + Photo crop operation for an image + + Args: + img: Source image + img_shape: Shape of the image + gt_bboxes: Ground truth bounding boxes + gt_label: Ground truth labels + gt_num: Number of ground truth labels + config: Config object with training parameters + + Returns: 
+ Padded image + Shape of image + Ground truth bounding boxes + Ground truth labels + Number of ground truth labels + """ random_photo = PhotoMetricDistortion() img_data, gt_bboxes, gt_label = random_photo(img, gt_bboxes, gt_label) - return (img_data, img_shape, gt_bboxes, gt_label, gt_num) + return img_data, img_shape, gt_bboxes, gt_label, gt_num def expand_column(img, img_shape, gt_bboxes, gt_label, gt_num): - """expand operation for image""" + """ + Expand operation for an image + + Args: + img: Source image + img_shape: Shape of the image + gt_bboxes: Ground truth bounding boxes + gt_label: Ground truth labels + gt_num: Number of ground truth labels + config: Config object with training parameters + + Returns: + Padded image + Shape of image + Ground truth bounding boxes + Ground truth labels + Number of ground truth labels + """ expand = Expand() img, gt_bboxes, gt_label = expand(img, gt_bboxes, gt_label) - return (img, img_shape, gt_bboxes, gt_label, gt_num) + return img, img_shape, gt_bboxes, gt_label, gt_num def preprocess_fn(image, box, is_training, config): - """Preprocess function for dataset.""" + """ + Preprocess function for dataset + + Args: + image: Source image + box: Bounding box + is_training: Flag is this training mode + config: Config object with training parameters + + Returns: + Preprocessed images + """ + def _infer_data(image_bgr, image_shape, gt_box_new, gt_label_new, gt_iscrowd_new_revert): image_shape = image_shape[:2] input_data = image_bgr, image_shape, gt_box_new, gt_label_new, gt_iscrowd_new_revert @@ -340,7 +568,17 @@ def preprocess_fn(image, box, is_training, config): return output_data def _data_aug(image, box, is_training): - """Data augmentation function.""" + """ + Data augmentation function + + Args: + image: Source image + box: Bounding box + is_training: Flag is this training mode + + Returns: + Augmented images + """ image_bgr = image.copy() image_bgr[:, :, 0] = image[:, :, 2] image_bgr[:, :, 1] = image[:, :, 1] @@ -380,7 +618,17 @@ def preprocess_fn(image, box, is_training, config): def create_coco_label(is_training, config): - """Get image path and annotation from COCO.""" + """ + Get image path and annotation from COCO + + Args: + is_training: Flag is this training mode + config: Config object with training parameters + + Returns: + Image files + Image annotations dict + """ from pycocotools.coco import COCO coco_root = config.coco_root @@ -431,7 +679,16 @@ def create_coco_label(is_training, config): def parse_json_annos_from_txt(anno_file, config): - """for user defined annotations text file, parse it to json format data""" + """ + For user defined annotations text file, parse it to json format data + + Args: + anno_file: Annotation file + config: Config object with training parameters + + Returns: + Annotations in JSON format + """ if not os.path.isfile(anno_file): raise RuntimeError("Evaluation annotation file {} is not valid.".format(anno_file)) @@ -478,7 +735,18 @@ def parse_json_annos_from_txt(anno_file, config): def create_train_data_from_txt(image_dir, anno_path): - """Filter valid image file, which both in image_dir and anno_path.""" + """ + Filter valid image file, which both in image_dir and anno_path. 
+ + Args: + image_dir: Directory with images + anno_path: Annotation path + + Returns: + Image files + Image annotations dict + """ + def anno_parser(annos_str): """Parse annotation from string to list.""" annos = [] @@ -489,6 +757,7 @@ def create_train_data_from_txt(image_dir, anno_path): iscrowd = int(anno[5]) annos.append([xmin, ymin, xmax, ymax, cls_id, iscrowd]) return annos + image_files = [] image_anno_dict = {} if not os.path.isdir(image_dir): @@ -510,7 +779,19 @@ def create_train_data_from_txt(image_dir, anno_path): def data_to_mindrecord_byte_image(config, dataset="coco", is_training=True, prefix="fasterrcnn.mindrecord", file_num=8): - """Create MindRecord file.""" + """ + Create MindRecord file + + Args: + config: Config object with training parameters + dataset: Dataset name + is_training: Flag is it training mode + prefix: Prefix for mindrecords names + file_num: Number of filters + + Returns: + + """"" mindrecord_dir = config.mindrecord_dir mindrecord_path = os.path.join(mindrecord_dir, prefix) writer = FileWriter(mindrecord_path, file_num) @@ -536,7 +817,22 @@ def data_to_mindrecord_byte_image(config, dataset="coco", is_training=True, pref def create_fasterrcnn_dataset(config, mindrecord_file, batch_size=2, device_num=1, rank_id=0, is_training=True, num_parallel_workers=8, python_multiprocessing=False): - """Create FasterRcnn dataset with MindDataset.""" + """ + Create FasterRcnn dataset with MindDataset + + Args: + config: Config object with training parameters + mindrecord_file: Mindrecord file + batch_size: Size of batch + device_num: Number of device + rank_id: ID of current device + is_training: Flag is it training mode + num_parallel_workers: Number of parallel workers + python_multiprocessing: Flag to use python multiprocessing + + Returns: + Dataset object + """ cv2.setNumThreads(0) de.config.set_prefetch_size(8) ds = de.MindDataset(mindrecord_file, columns_list=["image", "annotation"], num_shards=device_num, shard_id=rank_id, diff --git a/research/cv/faster_rcnn_dcn/src/lr_schedule.py b/research/cv/faster_rcnn_dcn/src/lr_schedule.py index 2f36cf0b0400c6414f4290493575f0ceca28522e..0d015ddfcbd6e37704f76004e94403c0e3e0f452 100644 --- a/research/cv/faster_rcnn_dcn/src/lr_schedule.py +++ b/research/cv/faster_rcnn_dcn/src/lr_schedule.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
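For the user-defined TXT annotation format handled by `create_train_data_from_txt` above, each line carries a relative image path followed by comma-separated boxes. The self-contained sketch below mirrors the `anno_parser` helper; the outer function name is illustrative and the sample line comes from the README's format description.

```python
def parse_annotation_line(line):
    """Split one TXT annotation line into an image path and a list of
    [xmin, ymin, xmax, ymax, class, is_crowd] boxes."""
    parts = line.strip().split(' ')
    image_path, raw_boxes = parts[0], parts[1:]
    boxes = []
    for raw in raw_boxes:
        xmin, ymin, xmax, ymax, cls_id, iscrowd = map(int, raw.split(','))
        boxes.append([xmin, ymin, xmax, ymax, cls_id, iscrowd])
    return image_path, boxes


path, boxes = parse_annotation_line("train2017/0000001.jpg 0,259,401,459,7,0 35,28,324,201,2,0")
print(path)   # train2017/0000001.jpg
print(boxes)  # [[0, 259, 401, 459, 7, 0], [35, 28, 324, 201, 2, 0]]
```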
@@ -16,18 +16,54 @@ import math + def linear_warmup_learning_rate(current_step, warmup_steps, base_lr, init_lr): + """ + Scheduler for linear warmup of learning rate + + Args: + current_step: Number of the current step + warmup_steps: Number of steps to warmup + base_lr: Base value of learning rate + init_lr: Initial value of learning rate + + Returns: + Current value of learning rate + """ lr_inc = (float(base_lr) - float(init_lr)) / float(warmup_steps) learning_rate = float(init_lr) + lr_inc * current_step return learning_rate + def a_cosine_learning_rate(current_step, base_lr, warmup_steps, decay_steps): + """ + Generate values for cosine annealing + + Args: + current_step: Number of the current step + base_lr: Base value of learning rate + warmup_steps: Number of steps to warmup + decay_steps: General number of steps + + Returns: + Current value of learning rate + """ base = float(current_step - warmup_steps) / float(decay_steps) learning_rate = (1 + math.cos(base * math.pi)) / 2 * base_lr return learning_rate + def dynamic_lr(config, steps_per_epoch): - """dynamic learning rate generator""" + """ + Dynamic learning rate generator + + Args: + config: Config object with training parameters + steps_per_epoch: Number of steps per epoch + + Returns: + List of learning rate values + """ base_lr = config.base_lr total_steps = steps_per_epoch * (config.epoch_size + 1) warmup_steps = int(config.warmup_step) diff --git a/research/cv/faster_rcnn_dcn/src/model_utils/config.py b/research/cv/faster_rcnn_dcn/src/model_utils/config.py index 576681b9372db14a714c684e5247373366f39803..cb24fdc0763ab47c2d99fdbe6fc568db0bcdef41 100644 --- a/research/cv/faster_rcnn_dcn/src/model_utils/config.py +++ b/research/cv/faster_rcnn_dcn/src/model_utils/config.py @@ -14,12 +14,14 @@ # ============================================================================ """Parse arguments""" -import os -import ast import argparse +import ast +import os from pprint import pprint, pformat + import yaml + class Config: """ Configuration namespace. Convert dictionary to members. 
@@ -32,9 +34,11 @@ class Config: setattr(self, k, Config(v) if isinstance(v, dict) else v) def __str__(self): + """Convert object to string""" return pformat(self.__dict__) def __repr__(self): + """Get string representation""" return self.__str__() @@ -124,4 +128,5 @@ def get_config(): print("Please check the above information for the configurations", flush=True) return Config(final_config) + config = get_config() diff --git a/research/cv/faster_rcnn_dcn/src/model_utils/local_adapter.py b/research/cv/faster_rcnn_dcn/src/model_utils/local_adapter.py index 769fa6dc78e59eb66dbc8e6773accdc1d08b649e..ee29eece6fdfe5e2b4f60248be0dc6cc3a80ab83 100644 --- a/research/cv/faster_rcnn_dcn/src/model_utils/local_adapter.py +++ b/research/cv/faster_rcnn_dcn/src/model_utils/local_adapter.py @@ -17,20 +17,45 @@ import os + def get_device_id(): + """ + Get device ID + + Returns: + Device ID + """ device_id = os.getenv('DEVICE_ID', '0') return int(device_id) def get_device_num(): + """ + Get device number + + Returns: + Number of device + """ device_num = os.getenv('RANK_SIZE', '1') return int(device_num) def get_rank_id(): + """ + Get rank ID + + Returns: + Rank ID + """ global_rank_id = os.getenv('RANK_ID', '0') return int(global_rank_id) def get_job_id(): + """ + Get job ID + + Returns: + Job ID + """ return "Local Job" diff --git a/research/cv/faster_rcnn_dcn/src/model_utils/moxing_adapter.py b/research/cv/faster_rcnn_dcn/src/model_utils/moxing_adapter.py index 77f8730d824b160cf4f9767b0be8995c06124862..697ed56141acc28c8acf178603e46657917b1ee3 100644 --- a/research/cv/faster_rcnn_dcn/src/model_utils/moxing_adapter.py +++ b/research/cv/faster_rcnn_dcn/src/model_utils/moxing_adapter.py @@ -14,38 +14,73 @@ # ============================================================================ """Moxing adapter for ModelArts""" -import os import functools +import os + from mindspore import context from mindspore.profiler import Profiler + from .config import config _global_sync_count = 0 + def get_device_id(): + """ + Get device ID + + Returns: + Device ID + """ device_id = os.getenv('DEVICE_ID', '0') return int(device_id) def get_device_num(): + """ + Get number of devices + + Returns: + Number of devices + """ device_num = os.getenv('RANK_SIZE', '1') return int(device_num) def get_rank_id(): + """ + Get ID of current device + + Returns: + Current device ID + """ global_rank_id = os.getenv('RANK_ID', '0') return int(global_rank_id) def get_job_id(): + """ + Get job id + + Returns: + Job ID + """ job_id = os.getenv('JOB_ID') job_id = job_id if job_id != "" else "default" return job_id + def sync_data(from_path, to_path): """ Download data from remote obs to local directory if the first url is remote url and the second one is local path Upload data from local directory to remote obs in contrast. + + Args: + from_path: Source path + to_path: Destination path + + Returns: + """ import moxing as mox import time @@ -75,7 +110,14 @@ def sync_data(from_path, to_path): def moxing_wrapper(pre_process=None, post_process=None): """ - Moxing wrapper to download dataset and upload outputs. 
+ Moxing wrapper to download dataset and upload outputs + + Args: + pre_process: Preprocessing function + post_process: Postprocessing function + + Returns: + Moxing wrapper """ def wrapper(run_func): @functools.wraps(run_func) diff --git a/research/cv/faster_rcnn_dcn/src/network_define.py b/research/cv/faster_rcnn_dcn/src/network_define.py index 2160a9512742cbfc2c51ef1c909d63b76f4271f5..062351be93329761d46d42a637bbe94e899cce62 100644 --- a/research/cv/faster_rcnn_dcn/src/network_define.py +++ b/research/cv/faster_rcnn_dcn/src/network_define.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -15,14 +15,15 @@ """FasterRcnn-DCN training network wrapper.""" import time + import numpy as np -import mindspore.nn as nn +from mindspore import ParameterTuple +from mindspore import nn from mindspore.common.tensor import Tensor -from mindspore.ops import functional as F +from mindspore.nn.wrap.grad_reducer import DistributedGradReducer from mindspore.ops import composite as C -from mindspore import ParameterTuple +from mindspore.ops import functional as F from mindspore.train.callback import Callback -from mindspore.nn.wrap.grad_reducer import DistributedGradReducer time_stamp_init = False time_stamp_first = 0 @@ -56,7 +57,14 @@ class LossCallBack(Callback): time_stamp_init = True def step_end(self, run_context): - """step_end""" + """ + Event on the end of step + Args: + run_context: Data related to the finished step + + Returns: + + """ cb_params = run_context.original_args() loss = cb_params.net_outputs.asnumpy() cur_step_in_epoch = (cb_params.cur_step_num - 1) % cb_params.batch_num + 1 @@ -83,6 +91,7 @@ class LossCallBack(Callback): class LossNet(nn.Cell): """FasterRcnn loss method""" def construct(self, x1, x2, x3, x4, x5, x6): + """Forward pass throw network""" return x1 + x2 @@ -100,6 +109,7 @@ class WithLossCell(nn.Cell): self._loss_fn = loss_fn def construct(self, x, img_shape, gt_bboxe, gt_label, gt_num): + """Forward pass throw network""" loss1, loss2, loss3, loss4, loss5, loss6 = self._backbone(x, img_shape, gt_bboxe, gt_label, gt_num) return self._loss_fn(loss1, loss2, loss3, loss4, loss5, loss6) @@ -143,6 +153,7 @@ class TrainOneStepCell(nn.Cell): self.grad_reducer = DistributedGradReducer(optimizer.parameters, mean, degree) def construct(self, x, img_shape, gt_bboxe, gt_label, gt_num): + """Forward pass throw network""" weights = self.weights loss = self.network(x, img_shape, gt_bboxe, gt_label, gt_num) grads = self.grad(self.network, weights)(x, img_shape, gt_bboxe, gt_label, gt_num, self.sens) diff --git a/research/cv/faster_rcnn_dcn/src/util.py b/research/cv/faster_rcnn_dcn/src/util.py index fed172d537c9205bc4c0075a117153753638c711..075e6125a6dc85b27043c50897cce8feb1d9e199 100644 --- a/research/cv/faster_rcnn_dcn/src/util.py +++ b/research/cv/faster_rcnn_dcn/src/util.py @@ -1,4 +1,4 @@ -# Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
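`network_define.py` above wraps the detector in a `WithLossCell` and a custom `TrainOneStepCell` that takes gradients with a sensitivity value and applies them through the optimizer. The toy sketch below shows only that wrapping pattern with a stand-in network; all names, shapes, and the simplified update are illustrative, and the real cells additionally handle loss scaling and `DistributedGradReducer` for multi-device training.

```python
import numpy as np
from mindspore import ParameterTuple, Tensor, nn, ops


class ToyWithLoss(nn.Cell):
    """Stand-in for WithLossCell: backbone and loss fused into one Cell returning a scalar."""
    def __init__(self):
        super().__init__()
        self.dense = nn.Dense(4, 1)

    def construct(self, x, label):
        return ((self.dense(x) - label) ** 2).mean()


class ToyTrainOneStep(nn.Cell):
    """Minimal sketch of the TrainOneStepCell pattern used in network_define.py."""
    def __init__(self, network, optimizer, sens=1.0):
        super().__init__(auto_prefix=False)
        self.network = network
        self.weights = ParameterTuple(network.trainable_params())
        self.optimizer = optimizer
        self.grad = ops.GradOperation(get_by_list=True, sens_param=True)
        # gradient sensitivity matching the scalar loss
        self.sens = Tensor(np.array(sens, dtype=np.float32))

    def construct(self, x, label):
        loss = self.network(x, label)
        grads = self.grad(self.network, self.weights)(x, label, self.sens)
        self.optimizer(grads)
        return loss


net = ToyWithLoss()
step = ToyTrainOneStep(net, nn.SGD(net.trainable_params(), learning_rate=0.01))
x = Tensor(np.random.randn(8, 4).astype(np.float32))
label = Tensor(np.zeros((8, 1), np.float32))
print(step(x, label))  # scalar loss; repeated calls perform SGD updates
```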
@@ -15,6 +15,7 @@ """coco eval for FasterRcnn-DCN""" import json + import numpy as np from pycocotools.coco import COCO from pycocotools.cocoeval import COCOeval @@ -35,8 +36,21 @@ summary_init = { 'Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ]': _init_value, } + def coco_eval(result_files, result_types, coco, max_dets=(100, 300, 1000), single_result=False): - """coco eval for fasterrcnn""" + """ + Coco evaluation of FasterRCNN + + Args: + result_files: Files to save results + result_types: Result types + coco: COCO dataset path + max_dets: Maximum depth + single_result: Flag to obtain single result + + Returns: + Dict with metrics summary + """ anns = json.load(open(result_files['bbox'])) if not anns: return summary_init @@ -101,14 +115,25 @@ def coco_eval(result_files, result_types, coco, max_dets=(100, 300, 1000), singl return summary_metrics + def xyxy2xywh(bbox): + """ + Transform bounding box format + + Args: + bbox: Bounding box in xyxy format + + Returns: + Bounding box in xywh format + """ _bbox = bbox.tolist() return [ _bbox[0], _bbox[1], _bbox[2] - _bbox[0] + 1, _bbox[3] - _bbox[1] + 1, - ] + ] + def bbox2result_1image(bboxes, labels, num_classes): """Convert detection results to a list of numpy arrays. @@ -127,11 +152,21 @@ def bbox2result_1image(bboxes, labels, num_classes): result = [bboxes[labels == i, :] for i in range(num_classes - 1)] return result + def proposal2json(dataset, results): - """convert proposal to json mode""" + """ + Convert proposal to json mode + + Args: + dataset: Dataset object + results: Results of the network + + Returns: + Predicts stored in the JSON format + """ img_ids = dataset.getImgIds() json_results = [] - dataset_len = dataset.get_dataset_size()*2 + dataset_len = dataset.get_dataset_size() * 2 for idx in range(dataset_len): img_id = img_ids[idx] bboxes = results[idx] @@ -144,8 +179,18 @@ def proposal2json(dataset, results): json_results.append(data) return json_results + def det2json(dataset, results): - """convert det to json mode""" + """ + Convert detections to json mode + + Args: + dataset: Dataset object + results: Results of the network + + Returns: + Predicts stored in the JSON format + """ cat_ids = dataset.getCatIds() img_ids = dataset.getImgIds() json_results = [] @@ -165,8 +210,19 @@ def det2json(dataset, results): json_results.append(data) return json_results + def segm2json(dataset, results): - """convert segm to json mode""" + """ + Convert segmentations to json mode + + Args: + dataset: Dataset object + results: Results of the network + + Returns: + Predicts stored in the JSON format for bounding box; + Predicts stored in the JSON format for segmentation; + """ bbox_json_results = [] segm_json_results = [] for idx in range(len(dataset)): @@ -199,8 +255,19 @@ def segm2json(dataset, results): segm_json_results.append(data) return bbox_json_results, segm_json_results + def results2json(dataset, results, out_file): - """convert result convert to json mode""" + """ + Convert results to json mode + + Args: + dataset: Dataset object + results: Results of the network + out_file: Output file + + Returns: + JSON dict with results + """ result_files = dict() if isinstance(results[0], list): json_results = det2json(dataset, results) diff --git a/research/cv/faster_rcnn_dcn/train.py b/research/cv/faster_rcnn_dcn/train.py index 6a41d5189f4f3fe0039c00059f0c23ac326b4879..11fb1b771fd466086639f560c054abfe0b01f05a 100644 --- a/research/cv/faster_rcnn_dcn/train.py +++ b/research/cv/faster_rcnn_dcn/train.py @@ -1,4 +1,4 @@ -# 
Copyright 2021 Huawei Technologies Co., Ltd +# Copyright 2022 Huawei Technologies Co., Ltd # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -16,37 +16,45 @@ import os import time -import numpy as np -import mindspore.common.dtype as mstype +import numpy as np from mindspore import context, Tensor, Parameter +from mindspore.common import dtype as mstype +from mindspore.common import set_seed from mindspore.communication.management import init, get_rank, get_group_size -from mindspore.train.callback import CheckpointConfig, ModelCheckpoint, TimeMonitor -from mindspore.train import Model from mindspore.context import ParallelMode -from mindspore.train.serialization import load_checkpoint, load_param_into_net from mindspore.nn import SGD -from mindspore.common import set_seed +from mindspore.train import Model +from mindspore.train.callback import CheckpointConfig, ModelCheckpoint, TimeMonitor +from mindspore.train.serialization import load_checkpoint, load_param_into_net from src.FasterRcnn.faster_rcnn_resnet import Faster_Rcnn_Resnet -from src.network_define import LossCallBack, WithLossCell, TrainOneStepCell, LossNet from src.dataset import data_to_mindrecord_byte_image, create_fasterrcnn_dataset from src.lr_schedule import dynamic_lr from src.model_utils.config import config +from src.model_utils.device_adapter import get_device_id, get_device_num, get_rank_id from src.model_utils.moxing_adapter import moxing_wrapper -from src.model_utils.device_adapter import get_device_id +from src.network_define import LossCallBack, WithLossCell, TrainOneStepCell, LossNet set_seed(1) context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target, device_id=get_device_id()) if config.device_target == "GPU": - context.set_context(enable_graph_kernel=True) + context.set_context(enable_graph_kernel=False) if config.run_distribute: - init() - rank = get_rank() - device_num = get_group_size() - context.set_auto_parallel_context(device_num=device_num, parallel_mode=ParallelMode.DATA_PARALLEL, - gradients_mean=True) + if config.device_target == "Ascend": + rank = get_rank_id() + device_num = get_device_num() + context.set_auto_parallel_context(device_num=device_num, parallel_mode=ParallelMode.DATA_PARALLEL, + gradients_mean=True) + init() + else: + init("nccl") + context.reset_auto_parallel_context() + rank = get_rank() + device_num = get_group_size() + context.set_auto_parallel_context(device_num=device_num, parallel_mode=ParallelMode.DATA_PARALLEL, + gradients_mean=True) else: rank = 0 device_num = 1 @@ -59,6 +67,7 @@ if config.save_on_master: else: config.save_checkpoint = 1 + def train_fasterRcnn_(): """ train_fasterrcnn_ """ print("Start create dataset!") @@ -109,7 +118,9 @@ def train_fasterRcnn_(): return dataset_size, dataset + def modelarts_pre_process(): + """Prepare everything for modelarts""" config.save_checkpoint_path = config.output_path config.pre_trained = os.path.join(config.load_path, config.pre_trained) @@ -183,5 +194,6 @@ def train_fasterRcnn(): print("Start training") model.train(config.epoch_size, dataset, callbacks=cb, dataset_sink_mode=True) + if __name__ == '__main__': train_fasterRcnn()
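The schedule built by `dynamic_lr` in `lr_schedule.py` combines a linear warmup with a half-cosine decay, and `train.py` passes the resulting list of per-step values to `SGD`. The plain-Python sketch below shows how the two pieces fit together; `base_lr`, the warmup length, the total step count, and the warmup starting value are illustrative assumptions rather than values from `default_config.yaml`.

```python
import math


def linear_warmup(step, warmup_steps, base_lr, init_lr):
    # mirrors linear_warmup_learning_rate: ramp linearly from init_lr up to base_lr
    return init_lr + (base_lr - init_lr) / warmup_steps * step


def cosine_decay(step, base_lr, warmup_steps, decay_steps):
    # mirrors a_cosine_learning_rate: half-cosine from base_lr down toward zero
    progress = (step - warmup_steps) / decay_steps
    return (1 + math.cos(progress * math.pi)) / 2 * base_lr


def build_lr_schedule(base_lr=0.02, warmup_steps=500, total_steps=5000):
    lrs = []
    for step in range(total_steps):
        if step < warmup_steps:
            lrs.append(linear_warmup(step, warmup_steps, base_lr, base_lr * 1e-3))
        else:
            lrs.append(cosine_decay(step, base_lr, warmup_steps, total_steps - warmup_steps))
    return lrs


schedule = build_lr_schedule()
print(schedule[0], schedule[499], schedule[2750], schedule[-1])
# ~2e-05 (start of warmup), ~0.02 (end of warmup), ~0.01 (midpoint of decay), ~0 (final step)
```

Plotting the returned list shows the ramp up to `base_lr` over the warmup steps followed by a smooth decay toward zero over the remaining steps, which is the shape the training script relies on.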