diff --git a/research/cv/centernet_resnet50_v1/README.md b/research/cv/centernet_resnet50_v1/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..c4d02443f4ff1d692d31b50f71d1825e8f384147
--- /dev/null
+++ b/research/cv/centernet_resnet50_v1/README.md
@@ -0,0 +1,610 @@
+# Contents
+
+- [CenterNet Description](#CenterNet-description)
+- [Model Architecture](#model-architecture)
+- [Dataset](#dataset)
+- [Environment Requirements](#environment-requirements)
+- [Quick Start](#quick-start)
+- [Script Description](#script-description)
+    - [Script and Sample Code](#script-and-sample-code)
+    - [Script Parameters](#script-parameters)
+    - [Training Process](#training-process)
+        - [Distributed Training](#distributed-training)
+    - [Testing Process](#testing-process)
+        - [Testing and Evaluation](#testing-and-evaluation)
+    - [Inference Process](#inference-process)
+        - [Convert](#convert)
+        - [Infer on Ascend310](#infer-on-Ascend310)
+        - [Result](#result)
+- [Model Description](#model-description)
+    - [Performance](#performance)
+        - [Training Performance On Ascend 910](#training-performance-on-ascend-910)
+        - [Inference Performance On Ascend 910](#inference-performance-on-ascend-910)
+        - [Inference Performance On Ascend 310](#inference-performance-on-ascend-310)
+- [ModelZoo Homepage](#modelzoo-homepage)
+
+# [CenterNet Description](#contents)
+
+CenterNet is a novel, practical anchor-free method for object detection, 3D detection, and pose estimation, which identifies objects as axis-aligned boxes in an image. The detector uses keypoint estimation to find center points and regresses to all other object properties, such as size, 3D location, orientation, and even pose. In essence, it is a one-stage method that simultaneously predicts center locations and bboxes with real-time speed and higher accuracy than corresponding bounding-box-based detectors.
+We support training and evaluation on Ascend 910.
+
+[Paper](https://arxiv.org/pdf/1904.07850.pdf): Objects as Points. 2019.
+Xingyi Zhou (UT Austin), Dequan Wang (UC Berkeley) and Philipp Krahenbuhl (UT Austin)
+
+# [Model Architecture](#contents)
+
+On top of the ResNet backbone, the channels of the three upsampling layers are set to 256, 128, and 64, respectively, to save computation. One 3 × 3 deformable convolutional layer is added before each up-convolution, with channel 256, 128, and 64, respectively. The up-convolutional kernels are initialized as bilinear interpolation.
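+
+For illustration only, a minimal MindSpore sketch of such an upsampling head is shown below. It is a simplified approximation, not the repository implementation: the 3 × 3 deformable convolutions are replaced by plain convolutions, the bilinear kernel initialization is omitted, and the class and attribute names are invented for this example.
+
+```python
+import mindspore.nn as nn
+
+class UpsampleHead(nn.Cell):
+    """Toy stand-in for the CenterNet upsampling head (plain conv instead of DCN)."""
+    def __init__(self, in_channels=2048, channels=(256, 128, 64)):
+        super().__init__()
+        layers = []
+        for out_channels in channels:
+            # 3 x 3 convolution standing in for the deformable convolution
+            layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=3, pad_mode='same'))
+            layers.append(nn.BatchNorm2d(out_channels))
+            layers.append(nn.ReLU())
+            # 4 x 4 transposed convolution doubling the spatial resolution
+            layers.append(nn.Conv2dTranspose(out_channels, out_channels, kernel_size=4,
+                                             stride=2, pad_mode='same'))
+            layers.append(nn.BatchNorm2d(out_channels))
+            layers.append(nn.ReLU())
+            in_channels = out_channels
+        self.deconv_layers = nn.SequentialCell(layers)
+
+    def construct(self, x):
+        return self.deconv_layers(x)
+```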
+
+# [Dataset](#contents)
+
+Note that you can run the scripts based on the dataset mentioned in the original paper or a dataset widely used in the relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below.
+
+Dataset used: [COCO2017](https://cocodataset.org/)
+
+- Dataset size: 26G
+    - Train: 19G, 118000 images
+    - Val: 0.8G, 5000 images
+    - Test: 6.3G, 40000 images
+    - Annotations: 808M, instances, captions, etc.
+- Data format: image and json files
+
+- Note: Data will be processed in dataset.py
+
+- The directory structure is as follows; the names of directories and files are user defined:
+
+    ```path
+    .
+    ├── dataset
+        ├── centernet
+            ├── annotations
+            │   ├─ train.json
+            │   └─ val.json
+            └─ images
+                ├─ train
+                │   └─ images
+                │       ├─ class1_image_folder
+                │       ├─ ...
+                │       └─ classn_image_folder
+                ├─ val
+                │   └─ images
+                │       ├─ class1_image_folder
+                │       ├─ ...
+                │       └─ classn_image_folder
+                └─ test
+                    └─ images
+                        ├─ class1_image_folder
+                        ├─ ...
+                        └─ classn_image_folder
+    ```
+
+# [Environment Requirements](#contents)
+
+- Hardware (Ascend)
+    - Prepare hardware environment with Ascend processor.
+- Framework
+    - [MindSpore](https://www.mindspore.cn/install/en)
+- For more information, please check the resources below:
+    - [MindSpore tutorials](https://www.mindspore.cn/tutorial/zh-CN/r1.2/index.html)
+    - [MindSpore Python API](https://www.mindspore.cn/doc/api_python/zh-CN/r1.2/index.html)
+- Download the dataset COCO2017.
+- We use COCO2017 as the training dataset in this example by default, and you can also use your own datasets.
+
+    1. If the coco dataset is used. **Select dataset to coco when running the script.**
+       Install Cython and pycocotools, and you can also install mmcv to process data.
+
+        ```pip
+        pip install Cython
+
+        pip install pycocotools
+
+        pip install mmcv==0.2.14
+        ```
+
+        And change the COCO_ROOT and other settings you need in `config.py`. The directory structure is as follows:
+
+        ```path
+        .
+        └─cocodataset
+          ├─annotations
+          │  ├─instance_train2017.json
+          │  └─instance_val2017.json
+          ├─val2017
+          └─train2017
+
+        ```
+
+    2. If your own dataset is used. **Select dataset to other when running the script.**
+       Organize the dataset information in the same format as COCO.
+
+# [Quick Start](#contents)
+
+- running on local
+
+    After installing MindSpore via the official website, you can start training and evaluation as follows:
+
+    Note:
+    1. The first run of training will generate the mindrecord files, which will take a long time.
+    2. MINDRECORD_DATASET_PATH is the mindrecord dataset directory.
+    3. For `train.py`, LOAD_CHECKPOINT_PATH is the pretrained checkpoint file path; if there is none, just set it to "".
+    4. For `eval.py`, LOAD_CHECKPOINT_PATH is the checkpoint to be evaluated.
+    5. RUN_MODE supports validation and testing, set it to "val"/"test".
+
+    ```shell
+    # create dataset in mindrecord format
+    bash scripts/convert_dataset_to_mindrecord.sh [COCO_DATASET_DIR] [MINDRECORD_DATASET_DIR]
+
+    # standalone training on Ascend
+    bash scripts/run_standalone_train_ascend.sh [DEVICE_ID] [MINDRECORD_DATASET_PATH] [LOAD_CHECKPOINT_PATH](optional)
+
+    # distributed training on Ascend
+    bash scripts/run_distributed_train_ascend.sh [MINDRECORD_DATASET_PATH] [RANK_TABLE_FILE] [LOAD_CHECKPOINT_PATH](optional)
+
+    # eval on Ascend
+    bash scripts/run_standalone_eval_ascend.sh [DEVICE_ID] [RUN_MODE] [DATA_DIR] [LOAD_CHECKPOINT_PATH]
+    ```
+
+- running on ModelArts
+
+    If you want to run in ModelArts, please check the official documentation of ModelArts, and you can start training as follows
+
+    - Creating mindrecord dataset with single cards on ModelArts
+
+    ```text
+    # (1) Upload the code folder to S3 bucket.
+    # (2) Upload the COCO2017 dataset to S3 bucket.
+    # (3) Click "create task" on the website UI interface.
+    # (4) Set the code directory to "/{path}/centernet_resnet50" on the website UI interface.
+    # (5) Set the startup file to "/{path}/centernet_resnet50/dataset.py" on the website UI interface.
+    # (6) Perform a or b.
+    #     a. set parameters in /{path}/centernet_resnet50/default_config.yaml.
+    #         1. Set "enable_modelarts: True"
+    #     b. add parameters on the website UI interface.
+    #         1. Add "enable_modelarts=True"
+    # (7) Check the "data storage location" on the website UI interface and set the "Dataset path" path.
+    # (8) Set the "Output file path" and "Job log path" to your path on the website UI interface.
+    # (9) Under the item "resource pool selection", select the specification of single cards.
+    # (10) Create your job.
+    ```
+
+    - Training with single cards on ModelArts
+
+    ```text
+    # (1) Upload the code folder to S3 bucket.
+    # (2) Click "create task" on the website UI interface.
+    # (3) Set the code directory to "/{path}/centernet_resnet50" on the website UI interface.
+    # (4) Set the startup file to "/{path}/centernet_resnet50/train.py" on the website UI interface.
+    # (5) Perform a or b.
+    #     a. set parameters in /{path}/centernet_resnet50/default_config.yaml.
+    #         1. Set "enable_modelarts: True"
+    #         2. Set "epoch_size: 330"
+    #         3. Set "distribute: 'true'"
+    #         4. Set "save_checkpoint_path: ./checkpoints"
+    #     b. add parameters on the website UI interface.
+    #         1. Add "enable_modelarts=True"
+    #         2. Add "epoch_size=330"
+    #         3. Add "distribute=true"
+    #         4. Add "save_checkpoint_path=./checkpoints"
+    # (6) Upload the mindrecord dataset to S3 bucket.
+    # (7) Check the "data storage location" on the website UI interface and set the "Dataset path" path.
+    # (8) Set the "Output file path" and "Job log path" to your path on the website UI interface.
+    # (9) Under the item "resource pool selection", select the specification of single cards.
+    # (10) Create your job.
+    ```
+
+    - Evaluating with a single card on ModelArts
+
+    ```text
+    # (1) Upload the code folder to S3 bucket.
+    # (2) Git clone https://github.com/xingyizhou/CenterNet.git locally, and put the folder 'CenterNet' under the folder 'centernet' on the S3 bucket.
+    # (3) Click "create task" on the website UI interface.
+    # (4) Set the code directory to "/{path}/centernet_resnet50" on the website UI interface.
+    # (5) Set the startup file to "/{path}/centernet_resnet50/eval.py" on the website UI interface.
+    # (6) Perform a or b.
+    #     a. set parameters in /{path}/centernet_resnet50/default_config.yaml.
+    #         1. Set "enable_modelarts: True"
+    #         2. Set "run_mode: 'val'"
+    #         3. Set "load_checkpoint_path: '/cache/checkpoint_path/model.ckpt'" in the yaml file.
+    #         4. Set "checkpoint_url: /The path of checkpoint in S3/" in the yaml file.
+    #     b. add parameters on the website UI interface.
+    #         1. Add "enable_modelarts=True"
+    #         2. Add "run_mode=val"
+    #         3. Add "load_checkpoint_path='/cache/checkpoint_path/model.ckpt'" on the website UI interface.
+    #         4. Add "checkpoint_url=/The path of checkpoint in S3/" on the website UI interface.
+    # (7) Upload the dataset (not mindrecord format) to S3 bucket.
+    # (8) Check the "data storage location" on the website UI interface and set the "Dataset path" path.
+    # (9) Set the "Output file path" and "Job log path" to your path on the website UI interface.
+    # (10) Under the item "resource pool selection", select the specification of a single card.
+    # (11) Create your job.
+    ```
+
+# [Script Description](#contents)
+
+## [Script and Sample Code](#contents)
+
+```path
+.
+├── cv
+    ├── centernet_resnet50
+        ├── train.py                     // training scripts
+        ├── eval.py                      // testing and evaluation outputs
+        ├── export.py                    // convert mindspore model to mindir model
+        ├── README.md                    // descriptions about centernet_resnet50
+        ├── default_config.yaml          // parameter configuration
+        ├── ascend310_infer              // application for 310 inference
+        ├── preprocess.py                // preprocess scripts
+        ├── postprocess.py               // postprocess scripts
+        ├── scripts
+        │   ├── ascend_distributed_launcher
+        │   │    ├── __init__.py
+        │   │    ├── hyper_parameter_config.ini          // hyper parameters for distributed training
+        │   │    ├── get_distribute_train_cmd.py         // script for distributed training
+        │   │    ├── README.md
+        │   ├── convert_dataset_to_mindrecord.sh         // shell script for converting coco type dataset to mindrecord
+        │   ├── run_standalone_train_ascend.sh           // shell script for standalone training on ascend
+        │   ├── run_infer_310.sh                         // shell script for 310 inference on ascend
+        │   ├── run_distributed_train_ascend.sh          // shell script for distributed training on ascend
+        │   ├── run_standalone_eval_ascend.sh            // shell script for standalone evaluation on ascend
+        └── src
+            ├── model_utils
+            │   ├── config.py            // parsing parameter configuration file of "*.yaml"
+            │   ├── device_adapter.py    // local or ModelArts training
+            │   ├── local_adapter.py     // get related environment variables on local
+            │   └── moxing_adapter.py    // get related environment variables and transfer data on ModelArts
+            ├── __init__.py
+            ├── centernet_det.py         // centernet networks, training entry
+            ├── dataset.py               // generate dataloader and data processing entry
+            ├── decode.py                // decode the head features
+            ├── resnet50.py              // resnet50 backbone
+            ├── image.py                 // image preprocess functions
+            ├── post_process.py          // post-process functions after decode in inference
+            ├── utils.py                 // auxiliary functions for train, to log and preload
+            └── visual.py                // visualization of image, bbox, score and keypoints
+```
+
+## [Script Parameters](#contents)
+
+### Create MindRecord type dataset
+
+```text
+usage: dataset.py  [--coco_data_dir COCO_DATA_DIR]
+                   [--mindrecord_dir MINDRECORD_DIR]
+                   [--mindrecord_prefix MINDRECORD_PREFIX]
+
+options:
+    --coco_data_dir               path to coco dataset directory: PATH, default is ""
+    --mindrecord_dir              path to mindrecord dataset directory: PATH, default is ""
+    --mindrecord_prefix           prefix of MindRecord dataset filename: STR, default is "coco_det.train.mind"
+```
+
+### Training
+
+```text
+usage: train.py  [--device_target DEVICE_TARGET] [--distribute DISTRIBUTE]
+                 [--need_profiler NEED_PROFILER] [--profiler_path PROFILER_PATH]
+                 [--epoch_size EPOCH_SIZE] [--train_steps TRAIN_STEPS] [--device_id DEVICE_ID]
+                 [--device_num DEVICE_NUM] [--do_shuffle DO_SHUFFLE]
+                 [--enable_data_sink ENABLE_DATA_SINK] [--data_sink_steps N]
+                 [--enable_save_ckpt ENABLE_SAVE_CKPT]
+                 [--save_checkpoint_path SAVE_CHECKPOINT_PATH]
+                 [--load_checkpoint_path LOAD_CHECKPOINT_PATH]
+                 [--save_checkpoint_steps N] [--save_checkpoint_num N]
+                 [--mindrecord_dir MINDRECORD_DIR]
+                 [--mindrecord_prefix MINDRECORD_PREFIX]
+                 [--save_result_dir SAVE_RESULT_DIR]
+
+options:
+    --device_target               device where the code will be implemented: "Ascend"
+    --distribute                  training by several devices: "true"(training by more than 1 device) | "false", default is "true"
+    --need_profiler               whether to use the profiling tools: "true" | "false", default is "false"
+    --profiler_path               path to save the profiling results: PATH, default is ""
+    --epoch_size                  epoch size: N,
default is 1 + --train_steps training Steps: N, default is -1 + --device_id device id: N, default is 0 + --device_num number of used devices: N, default is 1 + --do_shuffle enable shuffle: "true" | "false", default is "true" + --enable_lossscale enable lossscale: "true" | "false", default is "true" + --enable_data_sink enable data sink: "true" | "false", default is "true" + --data_sink_steps set data sink steps: N, default is 1 + --enable_save_ckpt enable save checkpoint: "true" | "false", default is "true" + --save_checkpoint_path path to save checkpoint files: PATH, default is "" + --load_checkpoint_path path to load checkpoint files: PATH, default is "" + --save_checkpoint_steps steps for saving checkpoint files: N, default is 1000 + --save_checkpoint_num number for saving checkpoint files: N, default is 1 + --mindrecord_dir path to mindrecord dataset directory: PATH, default is "" + --mindrecord_prefix prefix of MindRecord dataset filename: STR, default is "coco_det.train.mind" + --save_result_dir path to save the visualization results: PATH, default is "" +``` + +### Evaluation + +```text +usage: eval.py [--device_target DEVICE_TARGET] [--device_id N] + [--load_checkpoint_path LOAD_CHECKPOINT_PATH] + [--data_dir DATA_DIR] [--run_mode RUN_MODE] + [--visual_image VISUAL_IMAGE] + [--enable_eval ENABLE_EVAL] [--save_result_dir SAVE_RESULT_DIR] +options: + --device_target device where the code will be implemented: "Ascend" + --device_id device id to run task, default is 0 + --load_checkpoint_path initial checkpoint (usually from a pre-trained CenterNet model): PATH, default is "" + --data_dir validation or test dataset dir: PATH, default is "" + --run_mode inference mode: "val" | "test", default is "val" + --visual_image whether visualize the image and annotation info: "true" | "false", default is "false" + --save_result_dir path to save the visualization and inference results: PATH, default is "" +``` + +### Options and Parameters + +Parameters for training and evaluation can be set in file `config.py`. + +#### Options + +```text +train_config. + batch_size: 32 // batch size of input dataset: N, default is 32 + loss_scale_value: 1024 // initial value of loss scale: N, default is 1024 + optimizer: 'Adam' // optimizer used in the network: Adam, default is Adam + lr_schedule: 'MultiDecay' // schedules to get the learning rate +``` + +```text +config for evaluation. 
+    SOFT_NMS: True                  // nms after decode: True | False, default is True
+    keep_res: True                  // keep original or fix resolution: True | False, default is True
+    multi_scales: [1.0]             // use multi-scales of image: List, default is [1.0]
+    K: 100                          // number of bboxes to be computed by TopK, default is 100
+    score_thresh: 0.3               // threshold of score when visualizing image and annotation info, default is 0.3
+```
+
+#### Parameters
+
+```text
+Parameters for dataset (Training/Evaluation):
+    num_classes                   number of categories: N, default is 80
+    max_objs                      maximum number of objects labeled in each image, default is 128
+    input_res_train               train input resolution, default is [512, 512]
+    output_res                    output resolution, default is [128, 128]
+    input_res_test                test input resolution, default is [680, 680]
+    rand_crop                     whether to crop the image randomly during data augmentation: True | False, default is True
+    shift                         maximum value of image shift during data augmentation: N, default is 0.1
+    scale                         maximum value of image scale times during data augmentation: N, default is 0.4
+    aug_rot                       probability of image rotation during data augmentation: N, default is 0.0
+    rotate                        maximum value of rotation angle during data augmentation: N, default is 0.0
+    flip_prop                     probability of image flip during data augmentation: N, default is 0.5
+    color_aug                     color augmentation of RGB image, default is True
+    coco_classes                  name of categories in COCO2017
+    mean                          mean value of RGB image
+    std                           variance of RGB image
+    eig_vec                       eigenvectors of RGB image
+    eig_val                       eigenvalues of RGB image
+
+Parameters for network (Training/Evaluation):
+    num_stacks                    the number of stacked resnet networks, default is 1
+    down_ratio                    the ratio of input to output resolution during training, default is 4
+    head_conv                     the channel number of the intermediate convolution in the output heads, default is 64
+    block_class                   residual block setting of the backbone, default is [3, 4, 6, 3]
+    dense_hp                      whether to apply weighted pose regression near center point: True | False, default is True
+    dense_wh                      apply weighted regression near center or just apply regression on center point
+    cat_spec_wh                   category specific bounding box size
+    reg_offset                    regress local offset or not: True | False, default is True
+    hm_weight                     loss weight for keypoint heatmaps: N, default is 1.0
+    off_weight                    loss weight for keypoint local offsets: N, default is 1
+    wh_weight                     loss weight for bounding box size: N, default is 0.1
+    mse_loss                      use mse loss or focal loss to train keypoint heatmaps: True | False, default is False
+    reg_loss                      l1 or smooth l1 for regression loss: 'l1' | 'sl1', default is 'l1'
+
+Parameters for optimizer and learning rate:
+    Adam:
+        weight_decay              weight decay: Q
+        decay_filter              lambda expression to specify which params will be decayed
+
+    PolyDecay:
+        learning_rate             initial value of learning rate: Q
+        end_learning_rate         final value of learning rate: Q
+        power                     learning rate decay factor
+        eps                       normalization parameter
+        warmup_steps              number of warmup steps
+
+    MultiDecay:
+        learning_rate             initial value of learning rate: Q
+        eps                       normalization parameter
+        warmup_steps              number of warmup steps
+        multi_epochs              list of epoch numbers after which the lr will be decayed
+        factor                    learning rate decay factor
+```
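+
+For clarity, the `MultiDecay` schedule listed above warms the learning rate up over `warmup_steps` and then divides it by `factor` after each epoch in `multi_epochs`. A minimal sketch of such a schedule is shown below; the function name and the linear warmup are assumptions for illustration and may differ from the actual implementation in `src/`.
+
+```python
+def multi_decay_lr(base_lr, total_steps, steps_per_epoch, warmup_steps, multi_epochs, factor):
+    """Return a per-step learning rate list: linear warmup, then step decay."""
+    lrs = []
+    for step in range(total_steps):
+        if step < warmup_steps:
+            lr = base_lr * (step + 1) / warmup_steps          # linear warmup (assumed)
+        else:
+            epoch = step // steps_per_epoch
+            num_decays = sum(1 for e in multi_epochs if epoch >= e)
+            lr = base_lr / (factor ** num_decays)             # divide by factor at each milestone
+        lrs.append(lr)
+    return lrs
+
+# e.g. 330 epochs at 458 steps per epoch (151140 steps total), milestones at epochs 290 and 320
+lr_each_step = multi_decay_lr(4.8e-4, 330 * 458, 458, 2000, [290, 320], 10)
+```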
+
+## [Training Process](#contents)
+
+Before your first training, the coco type dataset needs to be converted to mindrecord files to improve performance on the host.
+
+```shell
+bash scripts/convert_dataset_to_mindrecord.sh /path/coco_dataset_dir /path/mindrecord_dataset_dir
+```
+
+The command above will run in the background; after conversion, the mindrecord files will be located in the path you specified.
+
+### Distributed Training
+
+#### Running on Ascend
+
+```shell
+bash scripts/run_distributed_train_ascend.sh /path/mindrecord_dataset /path/hccl.json /path/load_ckpt(optional)
+```
+
+The command above will run in the background; you can view training logs in LOG*/training_log.txt and LOG*/ms_log/. After training finishes, you will get some checkpoint files under the LOG*/ckpt_0 folder by default. The loss values will be displayed as follows:
+
+```text
+# grep "epoch" training_log.txt
+epoch: 328, current epoch percent: 1.000, step: 150682, outputs are (Tensor(shape=[], dtype=Float32, value= 1.71943), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 1024))
+epoch time: 236204.566 ms, per step time: 515.730 ms
+epoch: 329, current epoch percent: 1.000, step: 151140, outputs are (Tensor(shape=[], dtype=Float32, value= 1.53505), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 1024))
+epoch time: 235430.151 ms, per step time: 514.040 ms
+...
+```
+
+## [Testing Process](#contents)
+
+### Testing and Evaluation
+
+```shell
+# Evaluation based on the validation dataset will be done automatically, while for the test or test-dev dataset, the accuracy should be uploaded to the CodaLab official website (https://competitions.codalab.org).
+# On Ascend
+bash scripts/run_standalone_eval_ascend.sh device_id val(or test) /path/coco_dataset /path/load_ckpt
+```
+
+You can see the mAP results below:
+
+```log
+overall performance on coco2017 validation dataset
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.302
+ Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.479
+ Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.320
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.114
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.328
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.461
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.272
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.428
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.446
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.200
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.488
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.663
+```
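+
+`eval.py` writes the predictions to `{run_mode}_pred_eval.json` and, for the validation set, runs the COCO evaluation automatically. If you want to re-run the evaluation on a saved prediction file yourself, a minimal sketch with pycocotools looks like the following (paths are placeholders):
+
+```python
+from pycocotools.coco import COCO
+from pycocotools.cocoeval import COCOeval
+
+# ground-truth annotations and the prediction file written by eval.py
+coco_gt = COCO("/path/coco_dataset/annotations/instances_val2017.json")
+coco_dt = coco_gt.loadRes("./val/val_pred_eval.json")
+
+coco_eval = COCOeval(coco_gt, coco_dt, "bbox")
+coco_eval.evaluate()
+coco_eval.accumulate()
+coco_eval.summarize()   # prints an AP/AR table like the one shown above
+```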
+
+## [Inference Process](#contents)
+
+### Convert
+
+If you want to infer the network on Ascend 310, you should convert the model to MINDIR:
+
+- Export on local
+
+    ```text
+    python export.py --device_id [DEVICE_ID] --export_format MINDIR --export_load_ckpt [CKPT_FILE_PATH] --export_name [EXPORT_FILE_NAME]
+    ```
+
+- Export on ModelArts (If you want to run in ModelArts, please check the official documentation of [ModelArts](https://support.huaweicloud.com/modelarts/), and you can start as follows)
+
+    ```text
+    # (1) Upload the code folder to S3 bucket.
+    # (2) Click "create training task" on the website UI interface.
+    # (3) Set the code directory to "/{path}/centernet_resnet50" on the website UI interface.
+    # (4) Set the startup file to "/{path}/centernet_resnet50/export.py" on the website UI interface.
+    # (5) Perform a or b.
+    #     a. set parameters in /{path}/centernet_resnet50/default_config.yaml.
+    #         1. Set "enable_modelarts: True"
+    #         2. Set "export_load_ckpt: ./{path}/*.ckpt" ('export_load_ckpt' indicates the path of the weight file to be exported relative to the file `export.py`, and the weight file must be included in the code directory.)
+    #         3. Set "export_name: centernet_resnet50"
+    #         4. Set "export_format: MINDIR"
+    #     b. add parameters on the website UI interface.
+    #         1. Add "enable_modelarts=True"
+    #         2. Add "export_load_ckpt=./{path}/*.ckpt" ('export_load_ckpt' indicates the path of the weight file to be exported relative to the file `export.py`, and the weight file must be included in the code directory.)
+    #         3. Add "export_name=centernet_resnet50"
+    #         4. Add "export_format=MINDIR"
+    # (6) Check the "data storage location" on the website UI interface and set the "Dataset path" path (this step is useless, but necessary).
+    # (7) Set the "Output file path" and "Job log path" to your path on the website UI interface.
+    # (8) Under the item "resource pool selection", select the specification of a single card.
+    # (9) Create your job.
+    # You will see centernet.mindir under {Output file path}.
+    ```
+
+### Infer on Ascend310
+
+Before performing inference, the mindir file must be exported by the export.py script. We only provide an example of inference using the MINDIR model. Currently, batch_size can only be set to 1.
+
+    ```shell
+    # Ascend310 inference
+    bash run_infer_310.sh [MINDIR_PATH] [DATASET_PATH] [PREPROCESS_IMAGES] [DEVICE_ID]
+    ```
+
+- `PREPROCESS_IMAGES`: whether preprocessing is needed or not; its value must be in `[y, n]`
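+
+Each preprocessed image produces one binary output file containing the top-K detections, which `postprocess.py` reads back as an array of shape `(1, 100, 6)`. As a rough illustration, such a file could be inspected as follows; the interpretation of the six columns as `[x1, y1, x2, y2, score, class]` is an assumption based on the decode step, and the coordinates are still in the network output space before post-processing maps them back to the original image:
+
+```python
+import numpy as np
+
+# one raw output file written by the 310 inference application (path is a placeholder)
+dets = np.fromfile("./result_Files/eval2017_image_123_0.bin", dtype=np.float32).reshape((1, 100, 6))
+
+# assumed column layout: 4 bbox values, confidence score, class id
+for box in dets[0]:
+    x1, y1, x2, y2, score, cls = box
+    if score > 0.3:
+        print(f"class {int(cls)}: score {score:.2f}, bbox ({x1:.1f}, {y1:.1f}, {x2:.1f}, {y2:.1f})")
+```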
+
+### Result
+
+The inference result is saved in the current path, and you can find results like the following in the acc.log file. Since the input images have a fixed shape on Ascend 310, all accuracy values will be lower than those on Ascend 910.
+
+```log
+ #acc.log
+ =============coco2017 310 infer result=========
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.295
+ Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.485
+ Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.303
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.143
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.343
+ Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.393
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.262
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.426
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.448
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.244
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.498
+ Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.600
+```
+
+# [Model Description](#contents)
+
+## [Performance](#contents)
+
+### Training Performance On Ascend 910
+
+CenterNet on 118K images (the annotation and data format must be the same as coco)
+
+| Parameters             | CenterNet_ResNet50                                           |
+| ---------------------- | ------------------------------------------------------------ |
+| Resource               | Ascend 910; CPU 2.60GHz, 192 cores; Memory, 755G             |
+| uploaded Date          | 11/01/2021 (month/day/year)                                  |
+| MindSpore Version      | 1.3.0                                                        |
+| Dataset                | COCO2017                                                     |
+| Training Parameters    | 8p, epoch=330, steps=151140, batch_size = 32, lr=4.8e-4      |
+| Optimizer              | Adam                                                         |
+| Loss Function          | Focal Loss, L1 Loss, RegLoss                                 |
+| outputs                | detections                                                   |
+| Loss                   | 1.5-2.0                                                      |
+| Speed                  | 8p 40 img/s                                                  |
+| Total time: training   | 8p: 18 h                                                     |
+| Total time: evaluation | keep res: test 1h, val 0.7h; fix res: test 40min, val 8min   |
+| Checkpoint             | 392MB (.ckpt file)                                           |
+| Scripts                | [centernet_resnet50 script](https://gitee.com/mindspore/models/tree/master/research/cv/centernet_resnet50) |
+
+### Inference Performance On Ascend 910
+
+CenterNet on validation (5K images) and test-dev (40K images)
+
+| Parameters           | CenterNet_ResNet50                                           |
+| -------------------- | ------------------------------------------------------------ |
+| Resource             | Ascend 910; CPU 2.60GHz, 192 cores; Memory, 755G             |
+| uploaded Date        | 11/01/2021 (month/day/year)                                  |
+| MindSpore Version    | 1.3.0                                                        |
+| Dataset              | COCO2017                                                     |
+| batch_size           | 1                                                            |
+| outputs              | mAP                                                          |
+| Accuracy(validation) | MAP: 30.2%, AP50: 47.9%, AP75: 32.0%, Medium: 32.8%, Large: 39.3% |
+
+### Inference Performance On Ascend 310
+
+CenterNet on validation (5K images)
+
+| Parameters           | CenterNet_ResNet50                                           |
+| -------------------- | ------------------------------------------------------------ |
+| Resource             | Ascend 310; CentOS 3.10                                      |
+| uploaded Date        | 8/31/2021 (month/day/year)                                   |
+| MindSpore Version    | 1.3.0                                                        |
+| Dataset              | COCO2017                                                     |
+| batch_size           | 1                                                            |
+| outputs              | mAP                                                          |
+| Accuracy(validation) | MAP: 29.5%, AP50: 48.5%, AP75: 30.3%, Medium: 49.8%, Large: 60.0% |
+
+# [Description of Random Situation](#contents)
+
+In run_distributed_train_ascend.sh, we set do_shuffle to True to shuffle the dataset by default.
+In train.py, we set a random seed to make sure that each node has the same initial weights in distributed training.
+
+# [ModelZoo Homepage](#contents)
+
+ Please check the official [homepage](https://gitee.com/mindspore/models).
+
+# FAQ
+
+First refer to the [ModelZoo FAQ](https://gitee.com/mindspore/models#FAQ) to find some common public questions.
+
+- **Q: What should I do if a memory overflow occurs when using PYNATIVE_MODE?** **A**: A memory overflow is usually caused by PYNATIVE_MODE requiring more memory. Setting the batch size to 31 reduces memory consumption and allows the network to be trained.
diff --git a/research/cv/centernet_resnet50_v1/ascend310_infer/CMakeLists.txt b/research/cv/centernet_resnet50_v1/ascend310_infer/CMakeLists.txt
new file mode 100644
index 0000000000000000000000000000000000000000..16eb6dafecacb9829f2f7513554a63e7ee710118
--- /dev/null
+++ b/research/cv/centernet_resnet50_v1/ascend310_infer/CMakeLists.txt
@@ -0,0 +1,15 @@
+cmake_minimum_required(VERSION 3.14.1)
+project(Ascend310Infer)
+add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0)
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -g -std=c++17 -Werror -Wall -fPIE -Wl,--allow-shlib-undefined")
+set(PROJECT_SRC_ROOT ${CMAKE_CURRENT_LIST_DIR}/)
+option(MINDSPORE_PATH "mindspore install path" "")
+include_directories(${MINDSPORE_PATH})
+include_directories(${MINDSPORE_PATH}/include)
+include_directories(${PROJECT_SRC_ROOT})
+find_library(MS_LIB libmindspore.so ${MINDSPORE_PATH}/lib)
+file(GLOB_RECURSE MD_LIB ${MINDSPORE_PATH}/_c_dataengine*)
+
+add_executable(main src/main.cc src/utils.cc)
+target_link_libraries(main ${MS_LIB} ${MD_LIB} gflags)
+
diff --git a/research/cv/centernet_resnet50_v1/ascend310_infer/build.sh b/research/cv/centernet_resnet50_v1/ascend310_infer/build.sh
new file mode 100644
index 0000000000000000000000000000000000000000..285514e19f2a1878a7bf8f0eed3c99fbc73868c4
--- /dev/null
+++ b/research/cv/centernet_resnet50_v1/ascend310_infer/build.sh
@@ -0,0 +1,29 @@
+#!/bin/bash
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+if [ -d out ]; then
+    rm -rf out
+fi
+
+mkdir out
+cd out || exit
+
+if [ -f "Makefile" ]; then
+    make clean
+fi
+
+cmake .. \
+    -DMINDSPORE_PATH="`pip3.7 show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath`"
+make
diff --git a/research/cv/centernet_resnet50_v1/ascend310_infer/inc/utils.h b/research/cv/centernet_resnet50_v1/ascend310_infer/inc/utils.h
new file mode 100644
index 0000000000000000000000000000000000000000..efebe03a8c1179f5a1f9d5f7ee07e0352a9937c6
--- /dev/null
+++ b/research/cv/centernet_resnet50_v1/ascend310_infer/inc/utils.h
@@ -0,0 +1,32 @@
+/**
+ * Copyright 2021 Huawei Technologies Co., Ltd
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */ + +#ifndef MINDSPORE_INFERENCE_UTILS_H_ +#define MINDSPORE_INFERENCE_UTILS_H_ + +#include <sys/stat.h> +#include <dirent.h> +#include <vector> +#include <string> +#include <memory> +#include "include/api/types.h" + +std::vector<std::string> GetAllFiles(std::string_view dirName); +DIR *OpenDir(std::string_view dirName); +std::string RealPath(std::string_view path); +mindspore::MSTensor ReadFileToTensor(const std::string &file); +int WriteResult(const std::string& imageFile, const std::vector<mindspore::MSTensor> &outputs); +#endif diff --git a/research/cv/centernet_resnet50_v1/ascend310_infer/src/main.cc b/research/cv/centernet_resnet50_v1/ascend310_infer/src/main.cc new file mode 100644 index 0000000000000000000000000000000000000000..4ede398e21dc35c9e79cfad635f39495b744a361 --- /dev/null +++ b/research/cv/centernet_resnet50_v1/ascend310_infer/src/main.cc @@ -0,0 +1,134 @@ +/** + * Copyright 2021 Huawei Technologies Co., Ltd + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +#include <sys/time.h> +#include <gflags/gflags.h> +#include <dirent.h> +#include <iostream> +#include <string> +#include <algorithm> +#include <iosfwd> +#include <vector> +#include <fstream> +#include <sstream> + +#include "include/api/model.h" +#include "include/api/context.h" +#include "include/api/types.h" +#include "include/api/serialization.h" +#include "include/dataset/execute.h" +#include "include/dataset/vision.h" +#include "inc/utils.h" + +using mindspore::Context; +using mindspore::Serialization; +using mindspore::Model; +using mindspore::Status; +using mindspore::MSTensor; +using mindspore::dataset::Execute; +using mindspore::ModelType; +using mindspore::GraphCell; +using mindspore::kSuccess; + +DEFINE_string(mindir_path, "", "mindir path"); +DEFINE_string(input0_path, ".", "input0 path"); +DEFINE_int32(device_id, 0, "device id"); + +int main(int argc, char **argv) { + gflags::ParseCommandLineFlags(&argc, &argv, true); + if (RealPath(FLAGS_mindir_path).empty()) { + std::cout << "Invalid mindir" << std::endl; + return 1; + } + + auto context = std::make_shared<Context>(); + auto ascend310 = std::make_shared<mindspore::Ascend310DeviceInfo>(); + ascend310->SetDeviceID(FLAGS_device_id); + ascend310->SetPrecisionMode("allow_fp32_to_fp16"); + ascend310->SetOpSelectImplMode("high_precision"); + ascend310->SetBufferOptimizeMode("off_optimize"); + context->MutableDeviceInfo().push_back(ascend310); + mindspore::Graph graph; + Serialization::Load(FLAGS_mindir_path, ModelType::kMindIR, &graph); + + Model model; + Status ret = model.Build(GraphCell(graph), context); + if (ret != kSuccess) { + std::cout << "ERROR: Build failed." << std::endl; + return 1; + } + + std::vector<MSTensor> model_inputs = model.GetInputs(); + if (model_inputs.empty()) { + std::cout << "Invalid model, inputs is empty." << std::endl; + return 1; + } + + auto input0_files = GetAllFiles(FLAGS_input0_path); + + if (input0_files.empty()) { + std::cout << "ERROR: input data empty." 
<< std::endl; + return 1; + } + + std::map<double, double> costTime_map; + size_t size = input0_files.size(); + + for (size_t i = 0; i < size; ++i) { + struct timeval start = {0}; + struct timeval end = {0}; + double startTimeMs; + double endTimeMs; + std::vector<MSTensor> inputs; + std::vector<MSTensor> outputs; + std::cout << "Start predict input files:" << input0_files[i] << std::endl; + + auto input0 = ReadFileToTensor(input0_files[i]); + + inputs.emplace_back(model_inputs[0].Name(), model_inputs[0].DataType(), model_inputs[0].Shape(), + input0.Data().get(), input0.DataSize()); + + gettimeofday(&start, nullptr); + ret = model.Predict(inputs, &outputs); + gettimeofday(&end, nullptr); + if (ret != kSuccess) { + std::cout << "Predict " << input0_files[i] << " failed." << std::endl; + return 1; + } + startTimeMs = (1.0 * start.tv_sec * 1000000 + start.tv_usec) / 1000; + endTimeMs = (1.0 * end.tv_sec * 1000000 + end.tv_usec) / 1000; + costTime_map.insert(std::pair<double, double>(startTimeMs, endTimeMs)); + WriteResult(input0_files[i], outputs); + } + double average = 0.0; + int inferCount = 0; + + for (auto iter = costTime_map.begin(); iter != costTime_map.end(); iter++) { + double diff = 0.0; + diff = iter->second - iter->first; + average += diff; + inferCount++; + } + average = average / inferCount; + std::stringstream timeCost; + timeCost << "NN inference cost average time: " << average << " ms of infer_count " << inferCount << std::endl; + std::cout << "NN inference cost average time: " << average << "ms of infer_count " << inferCount << std::endl; + std::string fileName = "./time_Result" + std::string("/test_perform_static.txt"); + std::ofstream fileStream(fileName.c_str(), std::ios::trunc); + fileStream << timeCost.str(); + fileStream.close(); + costTime_map.clear(); + return 0; +} diff --git a/research/cv/centernet_resnet50_v1/ascend310_infer/src/utils.cc b/research/cv/centernet_resnet50_v1/ascend310_infer/src/utils.cc new file mode 100644 index 0000000000000000000000000000000000000000..3c8b4a52c965bf9d9dfcd2a268dd1c336256ed27 --- /dev/null +++ b/research/cv/centernet_resnet50_v1/ascend310_infer/src/utils.cc @@ -0,0 +1,128 @@ +/** + * Copyright 2021 Huawei Technologies Co., Ltd + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include <fstream> +#include <algorithm> +#include <iostream> +#include "inc/utils.h" + +using mindspore::MSTensor; +using mindspore::DataType; + +std::vector<std::string> GetAllFiles(std::string_view dirName) { + struct dirent *filename; + DIR *dir = OpenDir(dirName); + if (dir == nullptr) { + return {}; + } + std::vector<std::string> res; + while ((filename = readdir(dir)) != nullptr) { + std::string dName = std::string(filename->d_name); + if (dName == "." || dName == ".." 
|| filename->d_type != DT_REG) { + continue; + } + res.emplace_back(std::string(dirName) + "/" + filename->d_name); + } + std::sort(res.begin(), res.end()); + for (auto &f : res) { + std::cout << "image file: " << f << std::endl; + } + return res; +} + +int WriteResult(const std::string& imageFile, const std::vector<MSTensor> &outputs) { + std::string homePath = "./result_Files"; + for (size_t i = 0; i < outputs.size(); ++i) { + size_t outputSize; + std::shared_ptr<const void> netOutput; + netOutput = outputs[i].Data(); + outputSize = outputs[i].DataSize(); + int pos = imageFile.rfind('/'); + std::string fileName(imageFile, pos + 1); + fileName.replace(fileName.find('.'), fileName.size() - fileName.find('.'), '_' + std::to_string(i) + ".bin"); + std::string outFileName = homePath + "/" + fileName; + FILE * outputFile = fopen(outFileName.c_str(), "wb"); + fwrite(netOutput.get(), outputSize, sizeof(char), outputFile); + fclose(outputFile); + outputFile = nullptr; + } + return 0; +} + +mindspore::MSTensor ReadFileToTensor(const std::string &file) { + if (file.empty()) { + std::cout << "Pointer file is nullptr" << std::endl; + return mindspore::MSTensor(); + } + + std::ifstream ifs(file); + if (!ifs.good()) { + std::cout << "File: " << file << " is not exist" << std::endl; + return mindspore::MSTensor(); + } + + if (!ifs.is_open()) { + std::cout << "File: " << file << "open failed" << std::endl; + return mindspore::MSTensor(); + } + + ifs.seekg(0, std::ios::end); + size_t size = ifs.tellg(); + mindspore::MSTensor buffer(file, mindspore::DataType::kNumberTypeUInt8, {static_cast<int64_t>(size)}, nullptr, size); + + ifs.seekg(0, std::ios::beg); + ifs.read(reinterpret_cast<char *>(buffer.MutableData()), size); + ifs.close(); + + return buffer; +} + +DIR *OpenDir(std::string_view dirName) { + if (dirName.empty()) { + std::cout << " dirName is null ! " << std::endl; + return nullptr; + } + std::string realPath = RealPath(dirName); + struct stat s; + lstat(realPath.c_str(), &s); + if (!S_ISDIR(s.st_mode)) { + std::cout << "dirName is not a valid directory !" 
<< std::endl; + return nullptr; + } + DIR *dir; + dir = opendir(realPath.c_str()); + if (dir == nullptr) { + std::cout << "Can not open dir " << dirName << std::endl; + return nullptr; + } + std::cout << "Successfully opened the dir " << dirName << std::endl; + return dir; +} + +std::string RealPath(std::string_view path) { + char realPathMem[PATH_MAX] = {0}; + char *realPathRet = nullptr; + realPathRet = realpath(path.data(), realPathMem); + + if (realPathRet == nullptr) { + std::cout << "File: " << path << " is not exist."; + return ""; + } + + std::string realPath(realPathMem); + std::cout << path << " realpath is: " << realPath << std::endl; + return realPath; +} diff --git a/research/cv/centernet_resnet50_v1/default_config.yaml b/research/cv/centernet_resnet50_v1/default_config.yaml new file mode 100644 index 0000000000000000000000000000000000000000..3aaea0595ac90d8aaa40a7c62f00f103d290f04d --- /dev/null +++ b/research/cv/centernet_resnet50_v1/default_config.yaml @@ -0,0 +1,274 @@ +# Builtin Configurations(DO NOT CHANGE THESE CONFIGURATIONS unless you know exactly what you are doing) +enable_modelarts: False +# Url for modelarts +data_url: "" +train_url: "" +checkpoint_url: "" +# Path for local +data_path: "/cache/data" +output_path: "/cache/train" +load_path: "/cache/checkpoint_path" +device_target: "Ascend" +enable_profiling: False + +# ============================================================================== +# prepare *.mindrecord* data +coco_data_dir: "" +mindrecord_dir: "" # also used by train.py +mindrecord_prefix: "coco_det.train.mind" + +# train related +save_result_dir: "" +device_id: 0 +device_num: 1 + +distribute: 'false' +need_profiler: "false" +profiler_path: "./profiler" +epoch_size: 1 +train_steps: -1 +enable_save_ckpt: "true" +do_shuffle: "true" +enable_data_sink: "true" +data_sink_steps: -1 +save_checkpoint_path: "" +load_checkpoint_path: "" +save_checkpoint_steps: 458 +save_checkpoint_num: 1 + +# val related +data_dir: "" +run_mode: "test" +enable_eval: "true" +visual_image: "false" + +# export related +export_load_ckpt: '' +export_format: '' +export_name: '' + +# 310 infer +val_data_dir: '' +predict_dir: '' +result_path: '' +label_path: '' +meta_path: '' +save_path: '' + +dataset_config: + num_classes: 80 + max_objs: 128 + input_res_train: [512, 512] + output_res: [128, 128] + input_res_test: [680, 680] + rand_crop: True + shift: 0.1 + scale: 0.4 + down_ratio: 4 + aug_rot: 0.0 + rotate: 0 + flip_prop: 0.5 + color_aug: True + coco_classes: ['background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', + 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', + 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', + 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', + 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', + 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', + 'kite', 'baseball bat', 'baseball glove', 'skateboard', + 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', + 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', + 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', + 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', + 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', + 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', + 'refrigerator', 'book', 'clock', 'vase', 'scissors', + 'teddy bear', 'hair drier', 'toothbrush'] + mean: np.array([0.40789654, 0.44719302, 0.47026115], dtype=np.float32) + std: np.array([0.28863828, 0.27408164, 0.27809835], 
dtype=np.float32) + eig_val: np.array([0.2141788, 0.01817699, 0.00341571], dtype=np.float32) + eig_vec: np.array([[-0.58752847, -0.69563484, 0.41340352], + [-0.5832747, 0.00994535, -0.81221408], + [-0.56089297, 0.71832671, 0.41158938]], dtype=np.float32) + +net_config: + num_stacks: 1 + down_ratio: 4 + head_conv: 64 + num_classes: 80 + block_class: [3, 4, 6, 3] + dense_wh: False + norm_wh: False + cat_spec_wh: False + reg_offset: True + hm_weight: 1 + off_weight: 1 + wh_weight: 0.1 + mse_loss: False + reg_loss: 'l1' + +train_config: + batch_size: 32 + loss_scale_value: 1024 + optimizer: 'Adam' + lr_schedule: 'MultiDecay' + Adam: + weight_decay: 0.0 + decay_filter: "lambda x: x.name.endswith('.bias') or x.name.endswith('.beta') or x.name.endswith('.gamma')" + PolyDecay: + learning_rate: 0.0005 # 5e-4 + end_learning_rate: 0.0000005 # 5e-7 + power: 5.0 + eps: 0.0000001 # 1e-7 + warmup_steps: 2000 + MultiDecay: + learning_rate: 0.00048 # 4.8e-4 + eps: 0.0000001 # 1e-7 + warmup_steps: 2000 + multi_epochs: [290, 320] + factor: 10 + +eval_config: + SOFT_NMS: True + keep_res: True + multi_scales: [1.0] + K: 100 + score_thresh: 0.3 + valid_ids: [ + 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, + 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, + 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, + 37, 38, 39, 40, 41, 42, 43, 44, 46, 47, + 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, + 58, 59, 60, 61, 62, 63, 64, 65, 67, 70, + 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, + 82, 84, 85, 86, 87, 88, 89, 90] + color_list: [0.000, 0.800, 1.000, + 0.850, 0.325, 0.098, + 0.929, 0.694, 0.125, + 0.494, 0.184, 0.556, + 0.466, 0.674, 0.188, + 0.301, 0.745, 0.933, + 0.635, 0.078, 0.184, + 0.300, 0.300, 0.300, + 0.600, 0.600, 0.600, + 1.000, 0.000, 0.000, + 1.000, 0.500, 0.000, + 0.749, 0.749, 0.000, + 0.000, 1.000, 0.000, + 0.000, 0.000, 1.000, + 0.667, 0.000, 1.000, + 0.333, 0.333, 0.000, + 0.333, 0.667, 0.333, + 0.333, 1.000, 0.000, + 0.667, 0.333, 0.000, + 0.667, 0.667, 0.000, + 0.667, 1.000, 0.000, + 1.000, 0.333, 0.000, + 1.000, 0.667, 0.000, + 1.000, 1.000, 0.000, + 0.000, 0.333, 0.500, + 0.000, 0.667, 0.500, + 0.000, 1.000, 0.500, + 0.333, 0.000, 0.500, + 0.333, 0.333, 0.500, + 0.333, 0.667, 0.500, + 0.333, 1.000, 0.500, + 0.667, 0.000, 0.500, + 0.667, 0.333, 0.500, + 0.667, 0.667, 0.500, + 0.667, 1.000, 0.500, + 1.000, 0.000, 0.500, + 1.000, 0.333, 0.500, + 1.000, 0.667, 0.500, + 1.000, 1.000, 0.500, + 0.000, 0.333, 1.000, + 0.000, 0.667, 1.000, + 0.000, 1.000, 1.000, + 0.333, 0.000, 1.000, + 0.333, 0.333, 1.000, + 0.333, 0.667, 1.000, + 0.333, 1.000, 1.000, + 0.667, 0.000, 1.000, + 0.667, 0.333, 1.000, + 0.667, 0.667, 1.000, + 0.667, 1.000, 1.000, + 1.000, 0.000, 1.000, + 1.000, 0.333, 1.000, + 1.000, 0.667, 1.000, + 0.167, 0.800, 0.000, + 0.333, 0.000, 0.000, + 0.500, 0.000, 0.000, + 0.667, 0.000, 0.000, + 0.833, 0.000, 0.000, + 1.000, 0.000, 0.000, + 0.000, 0.667, 0.400, + 0.000, 0.333, 0.000, + 0.000, 0.500, 0.000, + 0.000, 0.667, 0.000, + 0.000, 0.833, 0.000, + 0.000, 1.000, 0.000, + 0.000, 0.000, 0.167, + 0.000, 0.000, 0.333, + 0.000, 0.000, 0.500, + 0.000, 0.000, 0.667, + 0.000, 0.000, 0.833, + 0.000, 0.000, 1.000, + 0.000, 0.200, 0.800, + 0.143, 0.143, 0.543, + 0.286, 0.286, 0.286, + 0.429, 0.429, 0.429, + 0.571, 0.571, 0.571, + 0.714, 0.714, 0.714, + 0.857, 0.857, 0.857, + 0.000, 0.447, 0.741, + 0.50, 0.5, 0] + +export_config: + input_res: dataset_config.input_res_test + ckpt_file: "./ckpt_file.ckpt" + export_format: "MINDIR" + export_name: "CenterNet_ObjectDetection" + +--- +# Help description for each configuration 
+enable_modelarts: "Whether training on modelarts, default: False" +data_url: "Url for modelarts" +train_url: "Url for modelarts" +data_path: "The location of the input data." +output_path: "The location of the output file." +device_target: "Running platform, default is Ascend." +enable_profiling: 'Whether enable profiling while training, default: False' + +distribute: "Run distribute, default is false." +need_profiler: "Profiling to parsing runtime info, default is false." +profiler_path: "The path to save profiling data" +epoch_size: "Epoch size, default is 1." +train_steps: "Training Steps, default is -1, i.e. run all steps according to epoch number." +device_id: "Device id, default is 0." +device_num: "Use device nums, default is 1." +enable_save_ckpt: "Enable save checkpoint, default is true." +do_shuffle: "Enable shuffle for dataset, default is true." +enable_data_sink: "Enable data sink, default is true." +data_sink_steps: "Sink steps for each epoch, default is 1." +save_checkpoint_path: "Save checkpoint path" +load_checkpoint_path: "Load checkpoint file path" +save_checkpoint_steps: "Save checkpoint steps, default is 1000." +save_checkpoint_num: "Save checkpoint numbers, default is 1." +mindrecord_dir: "Mindrecord dataset files directory" +mindrecord_prefix: "Prefix of MindRecord dataset filename." +visual_image: "Visulize the ground truth and predicted image" +save_result_dir: "The path to save the predict results" + +data_dir: "Dataset directory, the absolute image path is joined by the data_dir, and the relative path in anno_path" +run_mode: "test or validation, default is test." +enable_eval: "Whether evaluate accuracy after prediction" + +--- +device_target: ['Ascend'] +distribute: ["true", "false"] +need_profiler: ["true", "false"] +enable_save_ckpt: ["true", "false"] +do_shuffle: ["true", "false"] +enable_data_sink: ["true", "false"] +export_format: ["MINDIR"] diff --git a/research/cv/centernet_resnet50_v1/eval.py b/research/cv/centernet_resnet50_v1/eval.py index fa02c63b9e3a4cbafb589c38ceddb93785cf0858..f956456642845e55c36752442c1e69426b625276 100644 --- a/research/cv/centernet_resnet50_v1/eval.py +++ b/research/cv/centernet_resnet50_v1/eval.py @@ -20,7 +20,6 @@ import os import time import copy import json -import argparse import cv2 from pycocotools.coco import COCO from pycocotools.cocoeval import COCOeval @@ -31,54 +30,62 @@ import mindspore.log as logger from src import COCOHP, CenterNetDetEval from src import convert_eval_format, post_process, merge_outputs from src import visual_image -from src.config import dataset_config, net_config, eval_config +from src.model_utils.config import config, dataset_config, net_config, eval_config +from src.model_utils.moxing_adapter import moxing_wrapper +from src.model_utils.device_adapter import get_device_id _current_dir = os.path.dirname(os.path.realpath(__file__)) -parser = argparse.ArgumentParser(description='CenterNet evaluation') -parser.add_argument('--device_target', type=str, default='Ascend', choices=['Ascend', 'CPU'], - help='device where the code will be implemented. 
(Default: Ascend)') -parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.") -parser.add_argument("--load_checkpoint_path", type=str, default="", help="Load checkpoint file path") -parser.add_argument("--data_dir", type=str, default="", help="Dataset directory, " - "the absolute image path is joined by the data_dir " - "and the relative path in anno_path") -parser.add_argument("--run_mode", type=str, default="val", help="test or validation, default is validation.") -parser.add_argument("--visual_image", type=str, default="false", help="Visulize the ground truth and predicted image") -parser.add_argument("--enable_eval", type=str, default="true", help="Whether evaluate accuracy after prediction") -parser.add_argument("--save_result_dir", type=str, default="", help="The path to save the predict results") - -args_opt = parser.parse_args() - +def modelarts_pre_process(): + """modelarts pre process function.""" + try: + from nms import soft_nms + print('soft_nms_attributes: {}'.format(soft_nms.__dir__())) + except ImportError: + print('NMS not installed! trying installing...\n') + cur_path = os.path.dirname(os.path.abspath(__file__)) + os.system('cd {}/CenterNet/src/lib/external/ && make && python setup.py install && cd - '.format(cur_path)) + try: + from nms import soft_nms + print('soft_nms_attributes: {}'.format(soft_nms.__dir__())) + except ImportError: + print('Installing failed! check if the folder "./CenterNet" exists.') + else: + print('Install nms successfully') + config.data_dir = config.data_path + config.load_checkpoint_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), config.load_checkpoint_path) + + +@moxing_wrapper(pre_process=modelarts_pre_process) def predict(): - """ + ''' Predict function - """ - context.set_context(mode=context.GRAPH_MODE, device_target=args_opt.device_target) - if args_opt.device_target == "Ascend": - context.set_context(device_id=args_opt.device_id) - enable_nms_fp16 = True - else: + ''' + context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target) + if config.device_target == "Ascend": + context.set_context(device_id=get_device_id()) enable_nms_fp16 = False + else: + enable_nms_fp16 = True - logger.info("Begin creating {} dataset".format(args_opt.run_mode)) - coco = COCOHP(dataset_config, run_mode=args_opt.run_mode, net_opt=net_config, - enable_visual_image=(args_opt.visual_image == "true"), save_path=args_opt.save_result_dir,) - coco.init(args_opt.data_dir, keep_res=eval_config.keep_res) + logger.info("Begin creating {} dataset".format(config.run_mode)) + coco = COCOHP(dataset_config, run_mode=config.run_mode, net_opt=net_config, + enable_visual_image=config.visual_image, save_path=config.save_result_dir,) + coco.init(config.data_dir, keep_res=eval_config.keep_res) dataset = coco.create_eval_dataset() net_for_eval = CenterNetDetEval(net_config, eval_config.K, enable_nms_fp16) net_for_eval.set_train(False) - param_dict = load_checkpoint(args_opt.load_checkpoint_path) + param_dict = load_checkpoint(config.load_checkpoint_path) load_param_into_net(net_for_eval, param_dict) # save results - save_path = os.path.join(args_opt.save_result_dir, args_opt.run_mode) + save_path = os.path.join(config.save_result_dir, config.run_mode) if not os.path.exists(save_path): os.makedirs(save_path) - if args_opt.visual_image == "true": + if config.visual_image == "true": save_pred_image_path = os.path.join(save_path, "pred_image") if not os.path.exists(save_pred_image_path): os.makedirs(save_pred_image_path) 
@@ -120,10 +127,10 @@ def predict(): pred_annos["images"].append(image_info) for image_anno in pred_json["annotations"]: pred_annos["annotations"].append(image_anno) - if args_opt.visual_image == "true": + if config.visual_image == "true": img_file = os.path.join(coco.image_path, gt_image_info[0]['file_name']) gt_image = cv2.imread(img_file) - if args_opt.run_mode != "test": + if config.run_mode != "test": annos = coco.coco.loadAnns(coco.anns[image_id]) visual_image(copy.deepcopy(gt_image), annos, save_gt_image_path, score_threshold=eval_config.score_thresh) @@ -131,15 +138,15 @@ def predict(): visual_image(gt_image, anno, save_pred_image_path, score_threshold=eval_config.score_thresh) # save results - save_path = os.path.join(args_opt.save_result_dir, args_opt.run_mode) + save_path = os.path.join(config.save_result_dir, config.run_mode) if not os.path.exists(save_path): os.makedirs(save_path) - pred_anno_file = os.path.join(save_path, '{}_pred_result.json').format(args_opt.run_mode) + pred_anno_file = os.path.join(save_path, '{}_pred_result.json').format(config.run_mode) json.dump(pred_annos, open(pred_anno_file, 'w')) - pred_res_file = os.path.join(save_path, '{}_pred_eval.json').format(args_opt.run_mode) + pred_res_file = os.path.join(save_path, '{}_pred_eval.json').format(config.run_mode) json.dump(pred_annos["annotations"], open(pred_res_file, 'w')) - if args_opt.run_mode != "test" and args_opt.enable_eval: + if config.run_mode != "test" and config.enable_eval: run_eval(coco.annot_path, pred_res_file) diff --git a/research/cv/centernet_resnet50_v1/export.py b/research/cv/centernet_resnet50_v1/export.py index 27aa8dc025ae1cf63b2e3b8cb940f5637618b956..599c5a1ee393afb9e6a00bef9f121413e374e88b 100644 --- a/research/cv/centernet_resnet50_v1/export.py +++ b/research/cv/centernet_resnet50_v1/export.py @@ -16,21 +16,27 @@ Export CenterNet mindir model. 
""" -import argparse +import os import numpy as np +import mindspore from mindspore import context, Tensor from mindspore.train.serialization import load_checkpoint, load_param_into_net, export from src import CenterNetDetEval -from src.config import net_config, eval_config, export_config +from src.model_utils.config import config, net_config, eval_config, export_config +from src.model_utils.moxing_adapter import moxing_wrapper -parser = argparse.ArgumentParser(description='centernet export') -parser.add_argument("--device_id", type=int, default=0, help="Device id") -args = parser.parse_args() -context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", device_id=args.device_id) +def modelarts_pre_process(): + '''modelarts pre process function.''' + export_config.ckpt_file = os.path.join(os.path.dirname(os.path.abspath(__file__)), export_config.ckpt_file) + export_config.export_name = os.path.join(config.output_path, export_config.export_name) -if __name__ == '__main__': + +@moxing_wrapper(pre_process=modelarts_pre_process) +def run_export(): + '''export function''' + context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", device_id=config.device_id) net = CenterNetDetEval(net_config, eval_config.K) net.set_train(False) @@ -38,7 +44,10 @@ if __name__ == '__main__': load_param_into_net(net, param_dict) net.set_train(False) - input_shape = [1, 3, export_config.input_res[0], export_config.input_res[1]] - input_data = Tensor(np.random.uniform(-1.0, 1.0, size=input_shape).astype(np.float32)) + input_data = Tensor(np.zeros([1, 3, export_config.input_res[0], export_config.input_res[1]]), mindspore.float32) export(net, input_data, file_name=export_config.export_name, file_format=export_config.export_format) + + +if __name__ == '__main__': + run_export() diff --git a/research/cv/centernet_resnet50_v1/postprocess.py b/research/cv/centernet_resnet50_v1/postprocess.py new file mode 100644 index 0000000000000000000000000000000000000000..c9fdd0c5ad673eaec091773d6b06c11a12b64a55 --- /dev/null +++ b/research/cv/centernet_resnet50_v1/postprocess.py @@ -0,0 +1,64 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""post process for 310 inference""" +import os +import json +import numpy as np +import pycocotools.coco as coco +from pycocotools.cocoeval import COCOeval +from src.model_utils.config import config, dataset_config, eval_config +from src import convert_eval_format, post_process, merge_outputs + + +def cal_acc(result_path, label_path, meta_path, save_path): + """calculate inference accuracy""" + name_list = np.load(os.path.join(meta_path, "name_list.npy"), allow_pickle=True) + meta_list = np.load(os.path.join(meta_path, "meta_list.npy"), allow_pickle=True) + + label_infor = coco.COCO(label_path) + pred_annos = {"images": [], "annotations": []} + for num, image_id in enumerate(name_list): + meta = meta_list[num] + pre_image = np.fromfile(os.path.join(result_path) + "/eval2017_image_" + str(image_id) + "_0.bin", + dtype=np.float32).reshape((1, 100, 6)) + detections = [] + for scale in eval_config.multi_scales: + dets = post_process(pre_image, meta, scale, dataset_config.num_classes) + detections.append(dets) + detections = merge_outputs(detections, dataset_config.num_classes, eval_config.SOFT_NMS) + pred_json = convert_eval_format(detections, image_id, eval_config.valid_ids) + label_infor.loadImgs([image_id]) + for image_info in pred_json["images"]: + pred_annos["images"].append(image_info) + for image_anno in pred_json["annotations"]: + pred_annos["annotations"].append(image_anno) + + if not os.path.exists(save_path): + os.makedirs(save_path) + pred_anno_file = os.path.join(save_path, '{}_pred_result.json').format(config.run_mode) + json.dump(pred_annos, open(pred_anno_file, 'w')) + pred_res_file = os.path.join(save_path, '{}_pred_eval.json').format(config.run_mode) + json.dump(pred_annos["annotations"], open(pred_res_file, 'w')) + + coco_anno = coco.COCO(label_path) + coco_dets = coco_anno.loadRes(pred_res_file) + coco_eval = COCOeval(coco_anno, coco_dets, "bbox") + coco_eval.evaluate() + coco_eval.accumulate() + coco_eval.summarize() + + +if __name__ == '__main__': + cal_acc(config.result_path, config.label_path, config.meta_path, config.save_path) diff --git a/research/cv/centernet_resnet50_v1/preprocess.py b/research/cv/centernet_resnet50_v1/preprocess.py new file mode 100644 index 0000000000000000000000000000000000000000..9d384b8bab5ddbd0e740383a4b533cd77b13815c --- /dev/null +++ b/research/cv/centernet_resnet50_v1/preprocess.py @@ -0,0 +1,56 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""pre process for 310 inference""" +import os +import numpy as np +from src.model_utils.config import config, dataset_config, eval_config, net_config +from src.dataset import COCOHP + + +def preprocess(dataset_path, preprocess_path): + """preprocess input images""" + meta_path = os.path.join(preprocess_path, "meta/meta") + result_path = os.path.join(preprocess_path, "data") + if not os.path.exists(meta_path): + os.makedirs(os.path.join(preprocess_path, "meta/meta")) + if not os.path.exists(result_path): + os.makedirs(os.path.join(preprocess_path, "data")) + coco = COCOHP(dataset_config, run_mode="val", net_opt=net_config) + coco.init(dataset_path, keep_res=False) + dataset = coco.create_eval_dataset() + name_list = [] + meta_list = [] + i = 0 + for data in dataset.create_dict_iterator(num_epochs=1): + img_id = data['image_id'].asnumpy().reshape((-1))[0] + image = data['image'].asnumpy() + for scale in eval_config.multi_scales: + image_preprocess, meta = coco.pre_process_for_test(image, img_id, scale) + evl_file_name = "eval2017_image" + "_" + str(img_id) + ".bin" + evl_file_path = result_path + "/" + evl_file_name + image_preprocess.tofile(evl_file_path) + meta_file_path = os.path.join(preprocess_path + "/meta/meta", str(img_id) + ".txt") + with open(meta_file_path, 'w+') as f: + f.write(str(meta)) + name_list.append(img_id) + meta_list.append(meta) + i += 1 + print(f"preprocess: no.[{i}], img_name:{img_id}") + np.save(os.path.join(preprocess_path + "/meta", "name_list.npy"), np.array(name_list)) + np.save(os.path.join(preprocess_path + "/meta", "meta_list.npy"), np.array(meta_list)) + + +if __name__ == '__main__': + preprocess(config.val_data_dir, config.predict_dir) diff --git a/research/cv/centernet_resnet50_v1/readme.md b/research/cv/centernet_resnet50_v1/readme.md deleted file mode 100644 index 4c2eb1e167af6f256df9958017b864ceedc9fb63..0000000000000000000000000000000000000000 --- a/research/cv/centernet_resnet50_v1/readme.md +++ /dev/null @@ -1,445 +0,0 @@ -# Contents - -- [CenterNet Description](#CenterNet-description) -- [Model Architecture](#model-architecture) -- [Dataset](#dataset) -- [Environment Requirements](#environment-requirements) -- [Quick Start](#quick-start) -- [Script Description](#script-description) - - [Script and Sample Code](#script-and-sample-code) - - [Script Parameters](#script-parameters) - - [Training Process](#training-process) - - [Training](#training) - - [Distributed Training](#distributed-training) - - [Testing Process](#testing-process) - - [Testing and Evaluation](#testing-and-evaluation) -- [Model Description](#model-description) - - [Performance](#performance) - - [Training Performance](#training-performance) - - [Inference Performance](#inference-performance) -- [ModelZoo Homepage](#modelzoo-homepage) - -# [CenterNet Description](#contents) - -CenterNet is a novel practical anchor-free method for object detection, 3D detection, and pose estimation, which detect identifies objects as axis-aligned boxes in an image. The detector uses keypoint estimation to find center points and regresses to all other object properties, such as size, 3D location, orientation, and even pose. In nature, it's a one-stage method to simultaneously predict center location and bboxes with real-time speed and higher accuracy than corresponding bounding box based detectors. -We support training and evaluation on Ascend910. - -[Paper](https://arxiv.org/pdf/1904.07850.pdf): Objects as Points. 2019. 
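For readers tracing the detection heads through this patch (the `hm`, `wh` and `reg` outputs built in `src/centernet_det.py` and decoded in `src/decode.py`), the following is a rough NumPy sketch of the center-point decoding, not the MindSpore implementation: `decode_sketch` is an illustrative name, `k=100` and `down_ratio=4` mirror the configuration defaults, and the affine mapping back to the original image resolution that `post_process` applies afterwards is omitted.

```python
# Illustrative NumPy sketch of CenterNet decoding (not the MindSpore code in src/decode.py):
# turn the 'hm', 'wh' and 'reg' heads of one image into boxes, scores and class ids.
import numpy as np
from scipy.ndimage import maximum_filter


def decode_sketch(hm, wh, reg, k=100, down_ratio=4):
    """hm: (num_classes, H, W) sigmoid heatmaps; wh, reg: (2, H, W) size and offset heads."""
    num_classes, height, width = hm.shape
    # keep only local maxima in each 3x3 window (the max-pool "NMS" trick)
    keep = hm == maximum_filter(hm, size=(1, 3, 3))
    scores = np.where(keep, hm, 0.0).reshape(-1)
    top = np.argsort(scores)[::-1][:k]                   # top-K peaks over all classes
    cls_ids, ys, xs = np.unravel_index(top, (num_classes, height, width))
    cx = xs + reg[0, ys, xs]                             # refine centers with the offset head
    cy = ys + reg[1, ys, xs]
    bw, bh = wh[0, ys, xs], wh[1, ys, xs]                # box sizes from the wh head
    boxes = np.stack([cx - bw / 2, cy - bh / 2,
                      cx + bw / 2, cy + bh / 2], axis=1) * down_ratio
    return boxes, scores[top], cls_ids
```

In the MindSpore code the same idea is expressed with a 3x3 max-pool based NMS and a TopK gather, as in the `NMS` and `GatherTopK` cells of the `src/decode.py` hunk below.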
-Xingyi Zhou(UT Austin) and Dequan Wang(UC Berkeley) and Philipp Krahenbuhl(UT Austin) - -# [Model Architecture](#contents) - -In the current model. We first change the channels of the three upsampling layers to 256, 128, 64, respectively, to save computation. We then add one 3 脳 3 deformable convolutional layer before each up-convolution with channel 256, 128, 64, respectively. The up-convolutional kernels are initialized as bilinear interpolation. - -# [Dataset](#contents) - -Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below. - -Dataset used: [COCO2017](https://cocodataset.org/) - -- Dataset size锛�26G - - Train锛�19G锛�118000 images - - Val锛�0.8G锛�5000 images - - Test: 6.3G, 40000 images - - Annotations锛�808M锛宨nstances锛宑aptions etc -- Data format锛歩mage and json files - -- Note锛欴ata will be processed in dataset.py - -- The directory structure is as follows, name of directory and file is user defined: - - ```path - . - 鈹溾攢鈹€ dataset - 鈹溾攢鈹€ centernet - 鈹溾攢鈹€ annotations - 鈹� 鈹溾攢 train.json - 鈹� 鈹斺攢 val.json - 鈹斺攢 images - 鈹溾攢 train - 鈹� 鈹斺攢images - 鈹� 鈹溾攢class1_image_folder - 鈹� 鈹溾攢 ... - 鈹� 鈹斺攢classn_image_folder - 鈹斺攢 val - 鈹� 鈹斺攢images - 鈹� 鈹溾攢class1_image_folder - 鈹� 鈹溾攢 ... - 鈹� 鈹斺攢classn_image_folder - 鈹斺攢 test - 鈹斺攢images - 鈹溾攢class1_image_folder - 鈹溾攢 ... - 鈹斺攢classn_image_folder - ``` - -# [Environment Requirements](#contents) - -- Hardware锛圓scend锛� - - Prepare hardware environment with Ascend processor. -- Framework - - [MindSpore](https://www.mindspore.cn/install/en) -- For more information, please check the resources below锛� - - [MindSpore tutorials](https://www.mindspore.cn/tutorials/en/master/index.html) - - [MindSpore Python API](https://www.mindspore.cn/docs/api/en/master/index.html) -- Download the dataset COCO2017. -- We use COCO2017 as training dataset in this example by default, and you can also use your own datasets. - - 1. If coco dataset is used. **Select dataset to coco when run script.** - Install Cython and pycocotool, and you can also install mmcv to process data. - - ```pip - pip install Cython - - pip install pycocotools - - pip install mmcv==0.2.14 - ``` - - And change the COCO_ROOT and other settings you need in `config.py`. The directory structure is as follows: - - ```path - . - 鈹斺攢cocodataset - 鈹溾攢annotations - 鈹溾攢instance_train2017.json - 鈹斺攢instance_val2017.json - 鈹溾攢val2017 - 鈹斺攢train2017 - - ``` - - 2. If your own dataset is used. **Select dataset to other when run script.** - Organize the dataset information the same format as COCO. - -# [Quick Start](#contents) - -After installing MindSpore via the official website, you can start training and evaluation as follows: - -Note: - -1. the first run of training will generate the mindrecord file, which will take a long time. -2. MINDRECORD_DATASET_PATH is the mindrecord dataset directory. -3. LOAD_CHECKPOINT_PATH is the pretrained checkpoint file directory, if no just set -4. 
RUN_MODE support validation and testing, set to be "val"/"test" - -```shell -# create dataset in mindrecord format -bash scripts/convert_dataset_to_mindrecord.sh [COCO_DATASET_DIR] [MINDRECORD_DATASET_DIR] - -# standalone training on Ascend -bash scripts/run_standalone_train_ascend.sh [DEVICE_ID] [MINDRECORD_DATASET_PATH] [LOAD_CHECKPOINT_PATH](optional) - -# distributed training on Ascend -bash scripts/run_distributed_train_ascend.sh [MINDRECORD_DATASET_PATH] [RANK_TABLE_FILE] [LOAD_CHECKPOINT_PATH](optional) - -# eval on Ascend -bash scripts/run_standalone_eval_ascend.sh [DEVICE_ID] [RUN_MODE] [DATA_DIR] [LOAD_CHECKPOINT_PATH] -``` - -# [Script Description](#contents) - -## [Script and Sample Code](#contents) - -```path -. -鈹溾攢鈹€ cv - 鈹溾攢鈹€ centernet_resnet50_v1 - 鈹溾攢鈹€ train.py // training scripts - 鈹溾攢鈹€ eval.py // testing and evaluation outputs - 鈹溾攢鈹€ export.py // export CenterNet mindir model - 鈹溾攢鈹€ README.md // descriptions about CenterNet - 鈹溾攢鈹€ scripts - 鈹� 鈹溾攢鈹€ ascend_distributed_launcher - 鈹� 鈹� 鈹溾攢鈹€__init__.py - 鈹� 鈹� 鈹溾攢鈹€hyper_parameter_config.ini // hyper parameter for distributed training - 鈹� 鈹� 鈹溾攢鈹€get_distribute_train_cmd.py // script for distributed training - 鈹� 鈹� 鈹溾攢鈹€README.md - 鈹� 鈹溾攢鈹€convert_dataset_to_mindrecord.sh // shell script for converting coco type dataset to mindrecord - 鈹� 鈹溾攢鈹€run_standalone_train_ascend.sh // shell script for standalone training on ascend - 鈹� 鈹溾攢鈹€run_distributed_train_ascend.sh // shell script for distributed training on ascend - 鈹� 鈹溾攢鈹€run_standalone_eval_ascend.sh // shell script for standalone evaluation on ascend - 鈹斺攢鈹€ src - 鈹溾攢鈹€__init__.py - 鈹溾攢鈹€centernet_det.py // centernet networks, training entry - 鈹溾攢鈹€dataset.py // generate dataloader and data processing entry - 鈹溾攢鈹€config.py // centernet unique configs - 鈹溾攢鈹€decode.py // decode the head features - 鈹溾攢鈹€resnet50.py // resnet50 backbone - 鈹溾攢鈹€utils.py // auxiliary functions for train, to log and preload - 鈹溾攢鈹€image.py // image preprocess functions - 鈹溾攢鈹€post_process.py // post-process functions after decode in inference - 鈹斺攢鈹€visual.py // visualization image, bbox, score and keypoints -``` - -## [Script Parameters](#contents) - -### Create MindRecord type dataset - -```text -usage: dataset.py [--coco_data_dir COCO_DATA_DIR] - [--mindrecord_dir MINDRECORD_DIR] - [--mindrecord_prefix MINDRECORD_PREFIX] - -options: - --coco_data_dir path to coco dataset directory: PATH, default is "" - --mindrecord_dir path to mindrecord dataset directory: PATH, default is "" - --mindrecord_prefix prefix of MindRecord dataset filename: STR, default is "coco_det.train.mind" -``` - -### Training - -```text -usage: train.py [--device_target DEVICE_TARGET] [--distribute DISTRIBUTE] - [--need_profiler NEED_PROFILER] [--profiler_path PROFILER_PATH] - [--epoch_size EPOCH_SIZE] [--train_steps TRAIN_STEPS] [device_id DEVICE_ID] - [--device_num DEVICE_NUM] [--do_shuffle DO_SHUFFLE] - [--enable_data_sink ENABLE_DATA_SINK] [--data_sink_steps N] - [--enable_save_ckpt ENABLE_SAVE_CKPT] - [--save_checkpoint_path SAVE_CHECKPOINT_PATH] - [--load_checkpoint_path LOAD_CHECKPOINT_PATH] - [--save_checkpoint_steps N] [--save_checkpoint_num N] - [--mindrecord_dir MINDRECORD_DIR] - [--mindrecord_prefix MINDRECORD_PREFIX] - [--save_result_dir SAVE_RESULT_DIR] - -options: - --device_target device where the code will be implemented: "Ascend" | "CPU", default is "Ascend" - --distribute training by several devices: "true"(training by more than 1 device) | "false", default is "true" - --need profiler 
whether to use the profiling tools: "true" | "false", default is "false" - --profiler_path path to save the profiling results: PATH, default is "" - --epoch_size epoch size: N, default is 1 - --train_steps training Steps: N, default is -1 - --device_id device id: N, default is 0 - --device_num number of used devices: N, default is 1 - --do_shuffle enable shuffle: "true" | "false", default is "true" - --enable_lossscale enable lossscale: "true" | "false", default is "true" - --enable_data_sink enable data sink: "true" | "false", default is "true" - --data_sink_steps set data sink steps: N, default is 1 - --enable_save_ckpt enable save checkpoint: "true" | "false", default is "true" - --save_checkpoint_path path to save checkpoint files: PATH, default is "" - --load_checkpoint_path path to load checkpoint files: PATH, default is "" - --save_checkpoint_steps steps for saving checkpoint files: N, default is 1000 - --save_checkpoint_num number for saving checkpoint files: N, default is 1 - --mindrecord_dir path to mindrecord dataset directory: PATH, default is "" - --mindrecord_prefix prefix of MindRecord dataset filename: STR, default is "coco_det.train.mind" - --save_result_dir path to save the visualization results: PATH, default is "" -``` - -### Evaluation - -```text -usage: eval.py [--device_target DEVICE_TARGET] [--device_id N] - [--load_checkpoint_path LOAD_CHECKPOINT_PATH] - [--data_dir DATA_DIR] [--run_mode RUN_MODE] - [--visual_image VISUAL_IMAGE] - [--enable_eval ENABLE_EVAL] [--save_result_dir SAVE_RESULT_DIR] -options: - --device_target device where the code will be implemented: "Ascend" | "CPU", default is "Ascend" - --device_id device id to run task, default is 0 - --load_checkpoint_path initial checkpoint (usually from a pre-trained CenterNet model): PATH, default is "" - --data_dir validation or test dataset dir: PATH, default is "" - --run_mode inference mode: "val" | "test", default is "val" - --visual_image whether visualize the image and annotation info: "true" | "false", default is "false" - --save_result_dir path to save the visualization and inference results: PATH, default is "" -``` - -### Options and Parameters - -Parameters for training and evaluation can be set in file `config.py`. - -#### Options - -```text -config for training. - batch_size batch size of input dataset: N, default is 16 - loss_scale_value initial value of loss scale: N, default is 1024 - optimizer optimizer used in the network: Adam, default is Adam - lr_schedule schedules to get the learning rate -``` - -```text -config for evaluation. 
- SOFT_NMS nms after decode: True | False, default is True - keep_res keep original or fix resolution: True | False, default is True - multi_scales use multi-scales of image: List, default is [1.0] - pad pad size when keep original resolution, default is 31 - K number of bboxes to be computed by TopK, default is 100 - score_thresh threshold of score when visualize image and annotation info,default is 0.4 -``` - -#### Parameters - -```text -Parameters for dataset (Training/Evaluation): - num_classes number of categories: N, default is 80 - max_objs maximum numbers of objects labeled in each image,default is 128 - input_res input resolution, default is [512, 512] - output_res output resolution, default is [128, 128] - rand_crop whether crop image in random during data augmenation: True | False, default is True - shift maximum value of image shift during data augmenation: N, default is 0.1 - scale maximum value of image scale times during data augmenation: N, default is 0.4 - aug_rot properbility of image rotation during data augmenation: N, default is 0.0 - rotate maximum value of rotation angle during data augmentation: N, default is 0.0 - flip_prop properbility of image flip during data augmenation: N, default is 0.5 - color_aug color augmentation of RGB image, default is True - coco_classes name of categories in COCO2017 - coco_class_name2id ID corresponding to the categories in COCO2017 - mean mean value of RGB image - std variance of RGB image - eig_vec eigenvectors of RGB image - eig_val eigenvalues of RGB image - -Parameters for network (Training/Evaluation): - down_ratio the ratio of input and output resolution during training,default is 4 - last_level the last level in final upsampling, default is 6 - heads the number of heatmap,width and height,offset, default is {'hm': 80, 'wh': 2, 'reg': 2} - resnet_block block number of resnet network - resnet_in_channels in channel size for each layer - resnet_out_channels out channel size for each layer - dense_hp whether apply weighted pose regression near center point: True | False, default is True - dense_wh apply weighted regression near center or just apply regression on center point - cat_spec_wh category specific bounding box size - reg_offset regress local offset or not: True | False, default is True - hm_weight loss weight for keypoint heatmaps: N, default is 1.0 - off_weight loss weight for keypoint local offsets: N, default is 1 - wh_weight loss weight for bounding box size: N, default is 0.1 - mse_loss use mse loss or focal loss to train keypoint heatmaps: True | False, default is False - reg_loss l1 or smooth l1 for regression loss: 'l1' | 'sl1', default is 'l1' - -Parameters for optimizer and learning rate: - Adam: - weight_decay weight decay: Q,default is 0.0 - decay_filer lamda expression to specify which param will be decayed - - PolyDecay: - learning_rate initial value of learning rate: Q,default is 2.4e-4 - end_learning_rate final value of learning rate: Q,default is 2.4e-7 - power learning rate decay factor,default is 5.0 - eps normalization parameter,default is 1e-7 - warmup_steps number of warmup_steps,default is 2000 - - MultiDecay: - learning_rate initial value of learning rate: Q,default is 2.4e-4 - eps normalization parameter,default is 1e-7 - warmup_steps number of warmup_steps,default is 2000 - multi_epochs list of epoch numbers after which the lr will be decayed,default is [300, 330] - factor learning rate decay factor,default is 10 -``` - -## [Training Process](#contents) - -Before your first training, convert 
coco type dataset to mindrecord files is needed to improve performance on host. - -```bash -bash scripts/convert_dataset_to_mindrecord.sh /path/coco_dataset_dir /path/mindrecord_dataset_dir -``` - -The command above will run in the background, after converting mindrecord files will be located in path specified by yourself. - -### Distributed Training - -#### Running on Ascend - -```bash -bash scripts/run_distributed_train_ascend.sh /path/mindrecord_dataset /path/hccl.json /path/load_ckpt(optional) -``` - -The command above will run in the background, you can view training logs in LOG*/training_log.txt and LOG*/ms_log/. After training finished, you will get some checkpoint files under the LOG*/ckpt_0 folder by default. The loss value will be displayed as follows: - -```bash -# grep "epoch" training_log.txt -epoch: 318, current epoch percent: 1.000, step: 292204, outputs are (Tensor(shape=[], dtype=Float32, value= 1.96386), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 1024)) -epoch time: 297763.174 ms, per step time: 325.069 ms -epoch: 319, current epoch percent: 1.000, step: 293120, outputs are (Tensor(shape=[], dtype=Float32, value= 1.86382), Tensor(shape=[], dtype=Bool, value= False), Tensor(shape=[], dtype=Float32, value= 1024)) -... -``` - -## [Testing Process](#contents) - -### Testing and Evaluation - -```bash -# Evaluation base on validation dataset will be done automatically, while for test or test-dev dataset, the accuracy should be upload to the CodaLab official website(https://competitions.codalab.org). -# On Ascend -bash scripts/run_standalone_eval_ascend.sh device_id val(or test) /path/coco_dataset /path/load_ckpt - -# On CPU -bash scripts/run_standalone_eval_cpu.sh val(or test) /path/coco_dataset /path/load_ckpt -``` - -you can see the MAP result below as below: - -```log -overall performance on coco2017 validation dataset - Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.293 - Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.453 - Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.310 - Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.108 - Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.332 - Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.447 - Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.276 - Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.437 - Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.455 - Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.203 - Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.500 - Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.676 -``` - -## [Convert Process](#contents) - -### Convert - -If you want to infer the network on Ascend 310, you should convert the model to MINDIR. What you need to do before is to specify the `ckpt_file` that needs to be converted in the `export_config` section of the `src/config.py` file. 
- -```python -python export.py [DEVICE_ID] -``` - -# [Model Description](#contents) - -## [Performance](#contents) - -### Training Performance On Ascend - -CenterNet on 11.8K images(The annotation and data format must be the same as coco) - -| Parameters | CenterNet | -| ---------------------- | ------------------------------------------------------------ | -| Resource | Ascend 910; CPU 2.60GHz, 192cores; Memory 755G; OS Euler2.8 | -| uploaded Date | 7/2/2021 (month/day/year) | -| MindSpore Version | 1.2.0 | -| Dataset | 11.8K images | -| Training Parameters | 8p, epoch=320, steps=293120, batch_size = 16, lr=2.4e-4 | -| Optimizer | Adam | -| Loss Function | Focal Loss, L1 Loss, RegLoss | -| outputs | detections | -| Loss | 1.6~3.6 | -| Speed | 8p: 40 img/s | -| Total time: training | 8p: 25 h | -| Total time: evaluation | keep res: test 1h, val 0.25h; fix res: test 40 min, val 8 min | -| Checkpoint | 375MB (.ckpt file) | -| Scripts | <https://gitee.com/mindspore/models/tree/master/research/cv/centernet> | - -### Inference Performance On Ascend - -CenterNet on validation(5K images) and test-dev(40K images) - -| Parameters | CenterNet | -| -------------------- | ------------------------------------------------------------ | -| Resource | Ascend 910; CPU 2.60GHz, 192cores; Memory 755G; OS Euler2.8 | -| uploaded Date | 7/2/2021 (month/day/year) | -| MindSpore Version | 1.1.0 | -| Dataset | 5K images(val), 40K images(test-dev) | -| batch_size | 1 | -| outputs | boxes and keypoints position and scores | -| Accuracy(validation) | MAP: 29.3%, AP50: 45.3%, AP75: 31.0%, Medium: 33.2%, Large: 44.7% | - -# [Description of Random Situation](#contents) - -In run_distributed_train_ascend.sh, we set do_shuffle to True to shuffle the dataset by default. -In train.py, we set a random seed to make sure that each node has the same initial weight in distribute training. - -# [ModelZoo Homepage](#contents) - - Please check the official [homepage](https://gitee.com/mindspore/models). diff --git a/research/cv/centernet_resnet50_v1/requirements.txt b/research/cv/centernet_resnet50_v1/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..be275cf58149e73c458e8a3f5f9207064e278c45 --- /dev/null +++ b/research/cv/centernet_resnet50_v1/requirements.txt @@ -0,0 +1,4 @@ +opencv-python +numpy +pycocotools +Cython \ No newline at end of file diff --git a/research/cv/centernet_resnet50_v1/scripts/ascend_distributed_launcher/get_distribute_train_cmd.py b/research/cv/centernet_resnet50_v1/scripts/ascend_distributed_launcher/get_distribute_train_cmd.py index 3736cc44b41425eaee2aa09cb462064ce2bd6bce..f906c1fa37c1502bb3a496bff566fed7b45d4803 100644 --- a/research/cv/centernet_resnet50_v1/scripts/ascend_distributed_launcher/get_distribute_train_cmd.py +++ b/research/cv/centernet_resnet50_v1/scripts/ascend_distributed_launcher/get_distribute_train_cmd.py @@ -57,11 +57,9 @@ def append_cmd(cmd, s): cmd += "\n" return cmd - def append_cmd_env(cmd, key, value): return append_cmd(cmd, "export " + str(key) + "=" + str(value)) - def distribute_train(): """ distribute pretrain scripts. 
The number of Ascend accelerators can be automatically allocated @@ -154,7 +152,7 @@ def distribute_train(): run_cmd += " --mindrecord_dir=" + mindrecord_dir run_cmd += " --load_checkpoint_path=" + load_checkpoint_path run_cmd += ' --device_id=' + str(device_id) + ' --device_num=' \ - + str(rank_size) + ' >./training_log.txt 2>&1 &' + + str(rank_size) + ' >./training_log.txt 2>&1 &' cmd = append_cmd(cmd, run_cmd) cmd = append_cmd(cmd, "cd -") @@ -163,6 +161,5 @@ def distribute_train(): with open(args.cmd_file, "w") as f: f.write(cmd) - if __name__ == "__main__": distribute_train() diff --git a/research/cv/centernet_resnet50_v1/scripts/ascend_distributed_launcher/hyper_parameter_config.ini b/research/cv/centernet_resnet50_v1/scripts/ascend_distributed_launcher/hyper_parameter_config.ini index b24733db50e69d2cccec1b22d3e25ff935120945..ac184100cc8956f780fc96c7cadc8bbfe08c7fe2 100644 --- a/research/cv/centernet_resnet50_v1/scripts/ascend_distributed_launcher/hyper_parameter_config.ini +++ b/research/cv/centernet_resnet50_v1/scripts/ascend_distributed_launcher/hyper_parameter_config.ini @@ -1,13 +1,13 @@ [config] distribute=true -epoch_size=320 +epoch_size=330 enable_save_ckpt=true do_shuffle=true enable_data_sink=true data_sink_steps=-1 save_checkpoint_path=./ -save_checkpoint_steps=4580 -save_checkpoint_num=30 +save_checkpoint_steps=458 +save_checkpoint_num=1 mindrecord_prefix="coco_det.train.mind" need_profiler=false -profiler_path=./profiler \ No newline at end of file +profiler_path=./profiler diff --git a/research/cv/centernet_resnet50_v1/scripts/run_infer_310.sh b/research/cv/centernet_resnet50_v1/scripts/run_infer_310.sh new file mode 100644 index 0000000000000000000000000000000000000000..3432bdadbd48fcc5433199206c8d9fc11ad094c6 --- /dev/null +++ b/research/cv/centernet_resnet50_v1/scripts/run_infer_310.sh @@ -0,0 +1,145 @@ +#!/bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +if [[ $# -lt 3 || $# -gt 4 ]]; then + echo "Usage: bash run_infer_310.sh [MINDIR_PATH] [DATASET_PATH] [NEED_PREPROCESS] [DEVICE_ID] + NEED_PREPROCESS means weather need preprocess or not, it's value is 'y' or 'n'. 
+ DEVICE_ID is optional; it can be set by the environment variable device_id, otherwise the value is zero" +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} +model=$(get_real_path $1) +dataset_path=$(get_real_path $2) + +if [ "$3" == "y" ] || [ "$3" == "n" ];then + need_preprocess=$3 +else + echo "whether preprocessing is needed or not, its value must be in [y, n]" + exit 1 +fi + +device_id=0 +if [ $# == 4 ]; then + device_id=$4 +fi + +echo "mindir name: "$model +echo "dataset path: "$dataset_path +echo "need preprocess: "$need_preprocess +echo "device id: "$device_id + +export ASCEND_HOME=/usr/local/Ascend/ +if [ -d ${ASCEND_HOME}/ascend-toolkit ]; then + export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/atc/bin:$PATH + export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/ascend-toolkit/latest/atc/lib64:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH + export TBE_IMPL_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe + export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:${TBE_IMPL_PATH}:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/python/site-packages:$PYTHONPATH + export ASCEND_OPP_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp +else + export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/atc/ccec_compiler/bin:$ASCEND_HOME/atc/bin:$PATH + export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/atc/lib64:$ASCEND_HOME/acllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH + export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:$ASCEND_HOME/atc/python/site-packages:$PYTHONPATH + export ASCEND_OPP_PATH=$ASCEND_HOME/opp +fi + +function preprocess_data() +{ + if [ -d preprocess ]; then + rm -rf ./preprocess + fi + mkdir preprocess + python3.7 ../preprocess.py --val_data_dir=$dataset_path --predict_dir=./preprocess/ >& preprocess.log +} + +function compile_app() +{ + cd ../ascend310_infer || exit + bash build.sh &> build.log +} + +function infer() +{ + cd - || exit + if [ -d result_Files ]; then + rm -rf ./result_Files + fi + if [ -d time_Result ]; then + rm -rf ./time_Result + fi + mkdir result_Files + mkdir time_Result + + ../ascend310_infer/out/main --mindir_path=$model --input0_path=./preprocess/data --device_id=$device_id &> infer.log + +} + +# install nms module from third party +if python -c "import nms" > /dev/null 2>&1 +then + echo "NMS module already exists, no need to reinstall." +else + if [ -d './CenterNet' ] + then + echo "NMS module was not found, but has already been downloaded" + else + echo "NMS module was not found, install it now..." + git clone https://github.com/xingyizhou/CenterNet.git + fi + cd CenterNet/src/lib/external/ || exit + make + python setup.py install + cd - || exit + rm -rf CenterNet +fi + +function cal_ap() +{ + if [ -d acc ]; then + rm -rf ./acc + fi + mkdir acc + python3.7 ../postprocess.py --result_path=./result_Files --label_path=$dataset_path/annotations/instances_val2017.json --meta_path=./preprocess/meta --save_path=./acc &> acc.log +} + +if [ $need_preprocess == "y" ]; then + preprocess_data + if [ $? -ne 0 ]; then + echo "preprocess dataset failed" + exit 1 + fi +fi +compile_app +if [ $?
-ne 0 ]; then + echo "compile app code failed" + exit 1 +fi +infer +if [ $? -ne 0 ]; then + echo " execute inference failed" + exit 1 +fi +cal_ap +if [ $? -ne 0 ]; then + echo "calculate accuracy failed" + exit 1 +fi diff --git a/research/cv/centernet_resnet50_v1/scripts/run_standalone_eval_ascend.sh b/research/cv/centernet_resnet50_v1/scripts/run_standalone_eval_ascend.sh index 870455c863cc6d98d466344a4519a6f09d06cef9..eee33cbda8c7ee1a03fad045556052989af66df5 100644 --- a/research/cv/centernet_resnet50_v1/scripts/run_standalone_eval_ascend.sh +++ b/research/cv/centernet_resnet50_v1/scripts/run_standalone_eval_ascend.sh @@ -29,14 +29,20 @@ PROJECT_DIR=$(cd "$(dirname "$0")" || exit; pwd) CUR_DIR=`pwd` export GLOG_log_dir=${CUR_DIR}/ms_log export GLOG_logtostderr=0 +export DEVICE_ID=$DEVICE_ID # install nms module from third party if python -c "import nms" > /dev/null 2>&1 then echo "NMS module already exits, no need reinstall." else - echo "NMS module was not found, install it now..." - git clone https://github.com/xingyizhou/CenterNet.git + if [ -f './CenterNet' ] + then + echo "NMS module was not found, but has been downloaded" + else + echo "NMS module was not found, install it now..." + git clone https://github.com/xingyizhou/CenterNet.git + fi cd CenterNet/src/lib/external/ || exit make python setup.py install @@ -50,6 +56,6 @@ python ${PROJECT_DIR}/../eval.py \ --load_checkpoint_path=$LOAD_CHECKPOINT_PATH \ --data_dir=$DATA_DIR \ --run_mode=$RUN_MODE \ - --visual_image=false \ + --visual_image=true \ --enable_eval=true \ --save_result_dir=./ > eval_log.txt 2>&1 & diff --git a/research/cv/centernet_resnet50_v1/scripts/run_standalone_train_ascend.sh b/research/cv/centernet_resnet50_v1/scripts/run_standalone_train_ascend.sh index b08c62daff89ebd7846380cf544ff14fb37dcd24..31f661252bbc6b26c57981b9384e9ae83909b085 100644 --- a/research/cv/centernet_resnet50_v1/scripts/run_standalone_train_ascend.sh +++ b/research/cv/centernet_resnet50_v1/scripts/run_standalone_train_ascend.sh @@ -35,6 +35,7 @@ PROJECT_DIR=$(cd "$(dirname "$0")" || exit; pwd) CUR_DIR=`pwd` export GLOG_log_dir=${CUR_DIR}/ms_log export GLOG_logtostderr=0 +export DEVICE_ID=$DEVICE_ID python ${PROJECT_DIR}/../train.py \ --distribute=false \ @@ -45,10 +46,11 @@ python ${PROJECT_DIR}/../train.py \ --do_shuffle=true \ --enable_data_sink=true \ --data_sink_steps=-1 \ - --epoch_size=130 \ + --epoch_size=330 \ --load_checkpoint_path=$LOAD_CHECKPOINT_PATH \ - --save_checkpoint_steps=4580 \ + --save_checkpoint_steps=3664 \ --save_checkpoint_num=1 \ --mindrecord_dir=$MINDRECORD_DIR \ --mindrecord_prefix="coco_det.train.mind" \ + --visual_image=false \ --save_result_dir="" > training_log.txt 2>&1 & \ No newline at end of file diff --git a/research/cv/centernet_resnet50_v1/src/__init__.py b/research/cv/centernet_resnet50_v1/src/__init__.py index 812250479eff1a12a14ea5ad7324877cd81e5b3c..e5374f358378facc0aede0aeafe47f4026c400b8 100644 --- a/research/cv/centernet_resnet50_v1/src/__init__.py +++ b/research/cv/centernet_resnet50_v1/src/__init__.py @@ -15,7 +15,7 @@ """CenterNet Init.""" from src.dataset import COCOHP -from .centernet_det import GatherDetectionFeatureCell, CenterNetLossCell, \ +from .centernet_det import GatherDetectionFeatureCell, CenterNetLossCell,\ CenterNetWithLossScaleCell, CenterNetWithoutLossScaleCell, CenterNetDetEval from .visual import visual_allimages, visual_image from .decode import DetectionDecode diff --git a/research/cv/centernet_resnet50_v1/src/centernet_det.py 
b/research/cv/centernet_resnet50_v1/src/centernet_det.py index 8425faeeb748b66e188bb022da5484d6abe1c112..bc23ddb2fbd8b09c2b7f4e7966e5f278806e9492 100644 --- a/research/cv/centernet_resnet50_v1/src/centernet_det.py +++ b/research/cv/centernet_resnet50_v1/src/centernet_det.py @@ -16,6 +16,7 @@ CenterNet for training and evaluation """ + import mindspore.nn as nn import mindspore.ops as ops from mindspore import context @@ -28,32 +29,28 @@ from mindspore.nn.wrap.grad_reducer import DistributedGradReducer from src.utils import Sigmoid, GradScale from src.utils import FocalLoss, RegLoss from src.decode import DetectionDecode -from src.config import dataset_config as data_cfg -from src.resnet50 import ResNetFea, ResidualBlock +from src.resnet50 import Bottleneck, ResNet50, weights_init +from .model_utils.config import dataset_config as data_cfg + +BN_MOMENTUM = 0.9 def _generate_feature(cin, cout, kernel_size, head_name, head_conv=0): """ - Generate feature extraction function of each target head + Generate ResNet feature extraction function of each target head """ fc = None - if head_conv > 0: - if 'hm' in head_name: - conv2d = nn.Conv2d(head_conv, cout, kernel_size=kernel_size, has_bias=True, bias_init=Constant(-2.19)) - else: - conv2d = nn.Conv2d(head_conv, cout, kernel_size=kernel_size, has_bias=True) - fc = nn.SequentialCell([nn.Conv2d(cin, head_conv, kernel_size=3, has_bias=True), nn.ReLU(), conv2d]) + if 'hm' in head_name: + conv2d = nn.Conv2d(head_conv, cout, kernel_size=kernel_size, has_bias=True, bias_init=Constant(-2.19)) else: - if 'hm' in head_name: - fc = nn.Conv2d(cin, cout, kernel_size=kernel_size, has_bias=True, bias_init=Constant(-2.19)) - else: - fc = nn.Conv2d(cin, cout, kernel_size=kernel_size, has_bias=True) + conv2d = nn.Conv2d(head_conv, cout, kernel_size=kernel_size, has_bias=True) + fc = nn.SequentialCell([nn.Conv2d(cin, head_conv, kernel_size=3, has_bias=True), nn.ReLU(), conv2d]) return fc class GatherDetectionFeatureCell(nn.Cell): """ - Gather features of multi-pose estimation. + Gather ResNet features of multi-pose estimation. Args: net_config: The config info of CenterNet network. @@ -61,19 +58,17 @@ class GatherDetectionFeatureCell(nn.Cell): Returns: Tuple of Tensors, the target head of multi-person pose. """ - def __init__(self, net_config): super(GatherDetectionFeatureCell, self).__init__() + self.block_class = Bottleneck + self.layers = net_config.block_class heads = {'hm': data_cfg.num_classes, 'wh': 2} if net_config.reg_offset: heads.update({'reg': 2}) head_conv = net_config.head_conv + self.resnet50 = ResNet50(self.block_class, self.layers, heads, head_conv) - self.resnet50 = ResNetFea(ResidualBlock, - net_config.resnet_block, - net_config.resnet_in_channels, - net_config.resnet_out_channels, - net_config.resnet_strides) + weights_init(self.resnet50) self.hm_fn = _generate_feature(cin=64, cout=heads['hm'], kernel_size=1, head_name='hm', head_conv=head_conv) self.wh_fn = _generate_feature(cin=64, cout=heads['wh'], kernel_size=1, @@ -89,6 +84,7 @@ class GatherDetectionFeatureCell(nn.Cell): output = self.resnet50(image) feature = () out = {} + out['hm'] = self.hm_fn(output) out['wh'] = self.wh_fn(output) @@ -110,43 +106,42 @@ class CenterNetLossCell(nn.Cell): Returns: Tensor, total loss. 
""" - def __init__(self, net_config): super(CenterNetLossCell, self).__init__() self.network = GatherDetectionFeatureCell(net_config) - self.net_config = net_config + self.num_stacks = net_config.num_stacks self.reduce_sum = ops.ReduceSum() self.Sigmoid = Sigmoid() self.FocalLoss = FocalLoss() self.crit = nn.MSELoss() if net_config.mse_loss else self.FocalLoss self.crit_reg = RegLoss(net_config.reg_loss) self.crit_wh = RegLoss(net_config.reg_loss) + self.num_stacks = net_config.num_stacks self.wh_weight = net_config.wh_weight self.hm_weight = net_config.hm_weight self.off_weight = net_config.off_weight self.reg_offset = net_config.reg_offset self.not_enable_mse_loss = not net_config.mse_loss - self.Print = ops.Print() def construct(self, image, hm, reg_mask, ind, wh, reg): """Defines the computation performed.""" hm_loss, wh_loss, off_loss = 0, 0, 0 feature = self.network(image) - output = feature[0] - if self.not_enable_mse_loss: - output_hm = self.Sigmoid(output['hm']) - else: - output_hm = output['hm'] - hm_loss += self.crit(output_hm, hm) + for s in range(self.num_stacks): + output = feature[s] + if self.not_enable_mse_loss: + output_hm = self.Sigmoid(output['hm']) + else: + output_hm = output['hm'] + hm_loss += self.crit(output_hm, hm) / self.num_stacks - output_wh = output['wh'] - wh_loss += self.crit_reg(output_wh, reg_mask, ind, wh) - - if self.reg_offset and self.off_weight > 0: - output_reg = output['reg'] - off_loss += self.crit_reg(output_reg, reg_mask, ind, reg) + output_wh = output['wh'] + wh_loss += self.crit_reg(output_wh, reg_mask, ind, wh) / self.num_stacks + if self.reg_offset and self.off_weight > 0: + output_reg = output['reg'] + off_loss += self.crit_reg(output_reg, reg_mask, ind, reg) / self.num_stacks total_loss = (self.hm_weight * hm_loss + self.wh_weight * wh_loss + self.off_weight * off_loss) return total_loss @@ -160,7 +155,6 @@ class ImagePreProcess(nn.Cell): Returns: Tensor, normlized images and the format were converted to be NCHW """ - def __init__(self): super(ImagePreProcess, self).__init__() self.transpose = ops.Transpose() @@ -170,7 +164,6 @@ class ImagePreProcess(nn.Cell): self.cast = ops.Cast() def construct(self, image): - """Defines the computation performed.""" image = self.cast(image, mstype.float32) image = (image - self.mean) / self.std image = self.transpose(image, self.perm_list) @@ -191,7 +184,6 @@ class CenterNetWithoutLossScaleCell(nn.Cell): Returns: Tuple of Tensors, the loss, overflow flag and scaling sens of the network. """ - def __init__(self, network, optimizer): super(CenterNetWithoutLossScaleCell, self).__init__(auto_prefix=False) self.image = ImagePreProcess() @@ -208,8 +200,9 @@ class CenterNetWithoutLossScaleCell(nn.Cell): weights = self.weights loss = self.network(image, hm, reg_mask, ind, wh, reg) grads = self.grad(self.network, weights)(image, hm, reg_mask, ind, wh, reg) - self.optimizer(grads) - return loss + succ = self.optimizer(grads) + ret = loss + return ops.depend(ret, succ) class CenterNetWithLossScaleCell(nn.Cell): @@ -227,7 +220,6 @@ class CenterNetWithLossScaleCell(nn.Cell): Returns: Tuple of Tensors, the loss, overflow flag and scaling sens of the network. 
""" - def __init__(self, network, optimizer, sens=1): super(CenterNetWithLossScaleCell, self).__init__(auto_prefix=False) self.image = ImagePreProcess() @@ -278,9 +270,12 @@ class CenterNetWithLossScaleCell(nn.Cell): else: cond = self.less_equal(self.base, flag_sum) overflow = cond - if not overflow: - self.optimizer(grads) - return (loss, cond, scaling_sens) + if overflow: + succ = False + else: + succ = self.optimizer(grads) + ret = (loss, cond, scaling_sens) + return ops.depend(ret, succ) class CenterNetDetEval(nn.Cell): @@ -290,13 +285,12 @@ class CenterNetDetEval(nn.Cell): Args: net_config: The config info of CenterNet network. K(number): Max number of output objects. Default: 100. - enable_nms_fp16(bool): Use float16 data for max_pool, adaption for CPU. Default: True. + enable_nms_fp16(bool): Use float16 data for max_pool, adaption for CPU. Default: False. Returns: Tensor, detection of images(bboxes, score, keypoints and category id of each objects) """ - - def __init__(self, net_config, K=100, enable_nms_fp16=True): + def __init__(self, net_config, K=100, enable_nms_fp16=False): super(CenterNetDetEval, self).__init__() self.network = GatherDetectionFeatureCell(net_config) self.decode = DetectionDecode(net_config, K, enable_nms_fp16) diff --git a/research/cv/centernet_resnet50_v1/src/config.py b/research/cv/centernet_resnet50_v1/src/config.py deleted file mode 100644 index 4902ef1c4365439c06428aef22c9e8d253011c69..0000000000000000000000000000000000000000 --- a/research/cv/centernet_resnet50_v1/src/config.py +++ /dev/null @@ -1,225 +0,0 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# ============================================================================ -""" -network config setting, will be used in dataset.py, train.py, eval.py -""" - -import numpy as np -from easydict import EasyDict as edict - - -dataset_config = edict({ - "num_classes": 80, - 'max_objs': 128, - 'input_res': [512, 512], - 'output_res': [128, 128], - 'rand_crop': True, - 'shift': 0.1, - 'scale': 0.4, - 'down_ratio': 4, - 'aug_rot': 0.0, - 'rotate': 0, - 'flip_prop': 0.5, - 'color_aug': True, - 'coco_classes': ('background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', - 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', - 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', - 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', - 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', - 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', - 'kite', 'baseball bat', 'baseball glove', 'skateboard', - 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', - 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', - 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', - 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', - 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', - 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', - 'refrigerator', 'book', 'clock', 'vase', 'scissors', - 'teddy bear', 'hair drier', 'toothbrush'), - 'coco_class_name2id': { - 'person': 1, 'bicycle': 2, 'car': 3, 'motorcycle': 4, 'airplane': 5, - 'bus': 6, 'train': 7, 'truck': 8, 'boat': 9, 'traffic light': 10, 'fire hydrant': 11, - 'stop sign': 13, 'parking meter': 14, 'bench': 15, 'bird': 16, 'cat': 17, 'dog': 18, 'horse': 19, - 'sheep': 20, 'cow': 21, 'elephant': 22, 'bear': 23, 'zebra': 24, 'giraffe': 25, 'backpack': 27, - 'umbrella': 28, 'handbag': 31, 'tie': 32, 'suitcase': 33, 'frisbee': 34, 'skis': 35, - 'snowboard': 36, 'sports ball': 37, 'kite': 38, 'baseball bat': 39, 'baseball glove': 40, - 'skateboard': 41, 'surfboard': 42, 'tennis racket': 43, 'bottle': 44, 'wine glass': 46, - 'cup': 47, 'fork': 48, 'knife': 49, 'spoon': 50, 'bowl': 51, 'banana': 52, 'apple': 53, 'sandwich': 54, - 'orange': 55, 'broccoli': 56, 'carrot': 57, 'hot dog': 58, 'pizza': 59, 'donut': 60, 'cake': 61, - 'chair': 62, 'couch': 63, 'potted plant': 64, 'bed': 65, 'dining table': 67, 'toilet': 70, 'tv': 72, - 'laptop': 73, 'mouse': 74, 'remote': 75, 'keyboard': 76, 'cell phone': 77, 'microwave': 78, - 'oven': 79, 'toaster': 80, 'sink': 81, 'refrigerator': 82, 'book': 84, 'clock': 85, 'vase': 86, - 'scissors': 87, 'teddy bear': 88, 'hair drier': 89, 'toothbrush': 90}, - 'mean': np.array([0.40789654, 0.44719302, 0.47026115], dtype=np.float32), - 'std': np.array([0.28863828, 0.27408164, 0.27809835], dtype=np.float32), - 'eig_val': np.array([0.2141788, 0.01817699, 0.00341571], dtype=np.float32), - 'eig_vec': np.array([[-0.58752847, -0.69563484, 0.41340352], - [-0.5832747, 0.00994535, -0.81221408], - [-0.56089297, 0.71832671, 0.41158938]], dtype=np.float32), -}) - - -net_config = edict({ - 'down_ratio': 4, - 'last_level': 6, - 'head_conv': 64, - 'heads': {'hm': 80, 'wh': 2, 'reg': 2}, - 'resnet_block': [3, 4, 6, 3], - 'resnet_in_channels': [64, 256, 512, 1024], - 'resnet_out_channels': [256, 512, 1024, 2048], - 'resnet_strides': [1, 2, 2, 2], - 'dense_wh': False, - 'norm_wh': False, - 'cat_spec_wh': False, - 'reg_offset': True, - 'hm_weight': 1, - 'off_weight': 1, - 'wh_weight': 0.1, - 'mse_loss': False, - 'reg_loss': 'l1', -}) - - -train_config = edict({ - 
'batch_size': 16, - 'loss_scale_value': 1024, - 'optimizer': 'Adam', - 'lr_schedule': 'MultiDecay', - 'Adam': edict({ - 'weight_decay': 0.0, - 'decay_filter': lambda x: x.name.endswith('.bias') or x.name.endswith('.beta') or x.name.endswith('.gamma'), - }), - 'PolyDecay': edict({ - 'learning_rate': 2.4e-4, - 'end_learning_rate': 2.4e-7, - 'power': 5.0, - 'eps': 1e-7, - 'warmup_steps': 2000, - }), - 'MultiDecay': edict({ - 'learning_rate': 2.4e-4, - 'eps': 1e-7, - 'warmup_steps': 2000, - 'multi_epochs': [300, 320], - 'factor': 10, - }) -}) - - -eval_config = edict({ - 'SOFT_NMS': False, - 'keep_res': False, - 'multi_scales': [1.0], - 'pad': 127, - 'K': 100, - 'score_thresh': 0.3, - 'valid_ids': [ - 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, - 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, - 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, - 37, 38, 39, 40, 41, 42, 43, 44, 46, 47, - 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, - 58, 59, 60, 61, 62, 63, 64, 65, 67, 70, - 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, - 82, 84, 85, 86, 87, 88, 89, 90], - 'color_list': [ - 0.000, 0.800, 1.000, - 0.850, 0.325, 0.098, - 0.929, 0.694, 0.125, - 0.494, 0.184, 0.556, - 0.466, 0.674, 0.188, - 0.301, 0.745, 0.933, - 0.635, 0.078, 0.184, - 0.300, 0.300, 0.300, - 0.600, 0.600, 0.600, - 1.000, 0.000, 0.000, - 1.000, 0.500, 0.000, - 0.749, 0.749, 0.000, - 0.000, 1.000, 0.000, - 0.000, 0.000, 1.000, - 0.667, 0.000, 1.000, - 0.333, 0.333, 0.000, - 0.333, 0.667, 0.333, - 0.333, 1.000, 0.000, - 0.667, 0.333, 0.000, - 0.667, 0.667, 0.000, - 0.667, 1.000, 0.000, - 1.000, 0.333, 0.000, - 1.000, 0.667, 0.000, - 1.000, 1.000, 0.000, - 0.000, 0.333, 0.500, - 0.000, 0.667, 0.500, - 0.000, 1.000, 0.500, - 0.333, 0.000, 0.500, - 0.333, 0.333, 0.500, - 0.333, 0.667, 0.500, - 0.333, 1.000, 0.500, - 0.667, 0.000, 0.500, - 0.667, 0.333, 0.500, - 0.667, 0.667, 0.500, - 0.667, 1.000, 0.500, - 1.000, 0.000, 0.500, - 1.000, 0.333, 0.500, - 1.000, 0.667, 0.500, - 1.000, 1.000, 0.500, - 0.000, 0.333, 1.000, - 0.000, 0.667, 1.000, - 0.000, 1.000, 1.000, - 0.333, 0.000, 1.000, - 0.333, 0.333, 1.000, - 0.333, 0.667, 1.000, - 0.333, 1.000, 1.000, - 0.667, 0.000, 1.000, - 0.667, 0.333, 1.000, - 0.667, 0.667, 1.000, - 0.667, 1.000, 1.000, - 1.000, 0.000, 1.000, - 1.000, 0.333, 1.000, - 1.000, 0.667, 1.000, - 0.167, 0.800, 0.000, - 0.333, 0.000, 0.000, - 0.500, 0.000, 0.000, - 0.667, 0.000, 0.000, - 0.833, 0.000, 0.000, - 1.000, 0.000, 0.000, - 0.000, 0.667, 0.400, - 0.000, 0.333, 0.000, - 0.000, 0.500, 0.000, - 0.000, 0.667, 0.000, - 0.000, 0.833, 0.000, - 0.000, 1.000, 0.000, - 0.000, 0.000, 0.167, - 0.000, 0.000, 0.333, - 0.000, 0.000, 0.500, - 0.000, 0.000, 0.667, - 0.000, 0.000, 0.833, - 0.000, 0.000, 1.000, - 0.000, 0.200, 0.800, - 0.143, 0.143, 0.543, - 0.286, 0.286, 0.286, - 0.429, 0.429, 0.429, - 0.571, 0.571, 0.571, - 0.714, 0.714, 0.714, - 0.857, 0.857, 0.857, - 0.000, 0.447, 0.741, - 0.50, 0.5, 0], -}) - -export_config = edict({ - 'input_res': dataset_config.input_res, - 'ckpt_file': "./ckpt_file.ckpt", - 'export_format': "MINDIR", - 'export_name': "CenterNet_ObjectDetection", -}) diff --git a/research/cv/centernet_resnet50_v1/src/dataset.py b/research/cv/centernet_resnet50_v1/src/dataset.py index 8316913c37290e88a3a06bcdcbbda1d4d622cd6c..b6ab333cac67525e42b607fde76fc3f08d895dad 100644 --- a/research/cv/centernet_resnet50_v1/src/dataset.py +++ b/research/cv/centernet_resnet50_v1/src/dataset.py @@ -17,17 +17,30 @@ Data operations, will be used in train.py """ import os +import sys import math -import argparse import cv2 import numpy as np import 
pycocotools.coco as coco import mindspore.dataset as ds from mindspore import log as logger from mindspore.mindrecord import FileWriter -from src.image import color_aug, get_affine_transform, affine_transform -from src.image import gaussian_radius, draw_umich_gaussian, draw_msra_gaussian, draw_dense_reg -from src.visual import visual_image + +try: + from src.model_utils.config import config, dataset_config + from src.model_utils.moxing_adapter import moxing_wrapper + from src.image import color_aug, get_affine_transform, affine_transform + from src.image import gaussian_radius, draw_umich_gaussian, draw_msra_gaussian, draw_dense_reg + from src.visual import visual_image +except ImportError as import_error: + print('Import Error: {}, trying append path/centernet_resnet50/src/../'.format(import_error)) + sys.path.append(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))) + from src.model_utils.config import config, dataset_config + from src.model_utils.moxing_adapter import moxing_wrapper + from src.image import color_aug, get_affine_transform, affine_transform + from src.image import gaussian_radius, draw_umich_gaussian, draw_msra_gaussian, draw_dense_reg + from src.visual import visual_image + _current_dir = os.path.dirname(os.path.realpath(__file__)) cv2.setNumThreads(0) @@ -47,7 +60,6 @@ class COCOHP(ds.Dataset): Returns: Prepocessed training or testing dataset for CenterNet network. """ - def __init__(self, data_opt, run_mode="train", net_opt=None, enable_visual_image=False, save_path=None): self._data_rng = np.random.RandomState(123) self.data_opt = data_opt @@ -195,7 +207,7 @@ class COCOHP(ds.Dataset): c = np.array([new_width // 2, new_height // 2], dtype=np.float32) s = np.array([inp_width, inp_height], dtype=np.float32) else: - inp_height, inp_width = self.data_opt.input_res[0], self.data_opt.input_res[1] + inp_height, inp_width = self.data_opt.input_res_test[0], self.data_opt.input_res_test[1] c = np.array([new_width / 2., new_height / 2.], dtype=np.float32) s = max(height, width) * 1.0 @@ -253,7 +265,7 @@ class COCOHP(ds.Dataset): width = img.shape[1] c = np.array([img.shape[1] / 2., img.shape[0] / 2.], dtype=np.float32) s = max(height, width) * 1.0 - input_h, input_w = self.data_opt.input_res[0], self.data_opt.input_res[1] + input_h, input_w = self.data_opt.input_res_train[0], self.data_opt.input_res_train[1] rot = 0 flipped = False @@ -379,7 +391,6 @@ class COCOHP(ds.Dataset): """create testing dataset based on coco format""" def generator(): - """create generator""" for i in range(self.num_samples): yield self.__getitem__(i) @@ -389,16 +400,19 @@ class COCOHP(ds.Dataset): return data_set -if __name__ == '__main__': - # Convert coco2017 dataset to mindrecord to improve performance on host - from src.config import dataset_config - - parser = argparse.ArgumentParser(description='CenterNet MindRecord dataset') - parser.add_argument("--coco_data_dir", type=str, default="", help="Coco dataset directory.") - parser.add_argument("--mindrecord_dir", type=str, default="", help="MindRecord dataset dir.") - parser.add_argument("--mindrecord_prefix", type=str, default="coco_det.train.mind", - help="Prefix of MindRecord dataset filename.") - args_opt = parser.parse_args() +def modelarts_pre_process(): + """modelarts pre process function.""" + config.coco_data_dir = config.data_path + config.mindrecord_dir = config.output_path + + +@moxing_wrapper(pre_process=modelarts_pre_process) +def coco2mindrecord(): + """Convert coco2017 dataset to mindrecord""" dsc = COCOHP(dataset_config, 
run_mode="train") - dsc.init(args_opt.coco_data_dir) - dsc.transfer_coco_to_mindrecord(args_opt.mindrecord_dir, args_opt.mindrecord_prefix, shard_num=8) + dsc.init(config.coco_data_dir) + dsc.transfer_coco_to_mindrecord(config.mindrecord_dir, config.mindrecord_prefix, shard_num=8) + + +if __name__ == '__main__': + coco2mindrecord() diff --git a/research/cv/centernet_resnet50_v1/src/decode.py b/research/cv/centernet_resnet50_v1/src/decode.py index ffc0cc3b49e1fdd5e14aaac5ac716a3041dbb2ac..2283d9394754d8ac7cc99ed91c68f2a6c8b1022a 100644 --- a/research/cv/centernet_resnet50_v1/src/decode.py +++ b/research/cv/centernet_resnet50_v1/src/decode.py @@ -16,8 +16,10 @@ Decode from heads for evaluation """ -import mindspore.ops as ops +import mindspore as ms import mindspore.nn as nn +import mindspore.ops as ops +from mindspore.ops import operations as P from mindspore.common import dtype as mstype from .utils import GatherFeature, TransposeGatherFeature @@ -28,19 +30,19 @@ class NMS(nn.Cell): Args: kernel(int): Maxpooling kernel size. Default: 3. - enable_nms_fp16(bool): Use float16 data for max_pool, adaption for CPU. Default: True. + enable_nms_fp16(bool): Use float16 data for max_pool, adaption for CPU. Default: False. Returns: Tensor, heatmap after non-maximum suppression. """ - - def __init__(self, kernel=3, enable_nms_fp16=True): + def __init__(self, kernel=3, enable_nms_fp16=False): super(NMS, self).__init__() - self.pad = (kernel - 1) // 2 self.cast = ops.Cast() self.dtype = ops.DType() self.equal = ops.Equal() - self.max_pool = nn.MaxPool2d(kernel, stride=1, pad_mode="same") + self.Abs = P.Abs() + self.max_pool_ = nn.MaxPool2d(kernel, stride=1, pad_mode="same") + self.max_pool = P.MaxPoolWithArgmax(kernel_size=kernel, strides=1, pad_mode='same') self.enable_fp16 = enable_nms_fp16 def construct(self, heat): @@ -48,13 +50,19 @@ class NMS(nn.Cell): dtype = self.dtype(heat) if self.enable_fp16: heat = self.cast(heat, mstype.float16) - heat_max = self.max_pool(heat) + heat_max = self.max_pool_(heat) keep = self.equal(heat, heat_max) keep = self.cast(keep, dtype) heat = self.cast(heat, dtype) else: - heat_max = self.max_pool(heat) - keep = self.equal(heat, heat_max) + heat_max, _ = self.max_pool(heat) + error = self.cast((heat - heat_max), mstype.float32) + abs_error = self.Abs(error) + abs_out = self.Abs(heat) + error = abs_error / (abs_out + 1e-12) + keep = P.Select()(P.LessEqual()(error, 1e-3), + P.Fill()(ms.float32, P.Shape()(error), 1.0), + P.Fill()(ms.float32, P.Shape()(error), 0.0)) heat = heat * keep return heat @@ -68,7 +76,6 @@ class GatherTopK(nn.Cell): Returns: Tuple of Tensors, top_k scores, indexes, category ids, and the indexes in height and width direcction. 
""" - def __init__(self): super(GatherTopK, self).__init__() self.shape = ops.Shape() @@ -77,7 +84,8 @@ class GatherTopK(nn.Cell): self.cast = ops.Cast() self.dtype = ops.DType() self.gather_feat = GatherFeature() - self.mod = ops.Mod() + # The ops.Mod() operator will produce errors on the Ascend 310 + self.mod = P.FloorMod() self.div = ops.Div() def construct(self, scores, K=40): @@ -97,7 +105,7 @@ class GatherTopK(nn.Cell): topk_ys = self.cast(self.reshape(topk_ys, (b, K)), self.dtype(scores)) topk_xs = self.gather_feat(self.reshape(topk_xs, (b, -1, 1)), topk_ind) topk_xs = self.cast(self.reshape(topk_xs, (b, K)), self.dtype(scores)) - return [topk_score, topk_inds, topk_clses, topk_ys, topk_xs] + return topk_score, topk_inds, topk_clses, topk_ys, topk_xs class DetectionDecode(nn.Cell): @@ -112,8 +120,7 @@ class DetectionDecode(nn.Cell): Returns: Tensor, multi-objects detections. """ - - def __init__(self, net_config, K=100, enable_nms_fp16=True): + def __init__(self, net_config, K=100, enable_nms_fp16=False): super(DetectionDecode, self).__init__() self.K = K self.nms = NMS(enable_nms_fp16=enable_nms_fp16) diff --git a/research/cv/centernet_resnet50_v1/src/image.py b/research/cv/centernet_resnet50_v1/src/image.py index 89c39991720cc21b9305054e07034af7fbac9e14..252c7ddc392b683432e519d06a0f4c6886fc1b0f 100644 --- a/research/cv/centernet_resnet50_v1/src/image.py +++ b/research/cv/centernet_resnet50_v1/src/image.py @@ -123,20 +123,20 @@ def gaussian_radius(det_size, min_overlap=0.7): b2 = 2 * (height + width) c2 = (1 - min_overlap) * width * height sq2 = np.sqrt(b2 ** 2 - 4 * a2 * c2) - r2 = (b2 + sq2) / (2 * a2) + r2 = (b2 + sq2) / 2 a3 = 4 * min_overlap b3 = -2 * min_overlap * (height + width) c3 = (min_overlap - 1) * width * height sq3 = np.sqrt(b3 ** 2 - 4 * a3 * c3) - r3 = (b3 + sq3) / (2 * a3) + r3 = (b3 + sq3) / 2 return math.ceil(min(r1, r2, r3)) def gaussian2D(shape, sigma=1): """2D gaussian function""" m, n = [(ss - 1.) / 2. 
for ss in shape] - y, x = np.ogrid[-m:m + 1, -n:n + 1] + y, x = np.ogrid[-m:m+1, -n:n+1] h = np.exp(-(x * x + y * y) / (2 * sigma * sigma)) h[h < np.finfo(h.dtype).eps * h.max()] = 0 @@ -156,8 +156,9 @@ def draw_umich_gaussian(heatmap, center, radius, k=1): top, bottom = min(y, radius), min(height - y, radius + 1) masked_heatmap = heatmap[y - top:y + bottom, x - left:x + right] - masked_gaussian = gaussian[radius - top:radius + bottom, radius - left:radius + right] - if min(masked_gaussian.shape) > 0 and min(masked_heatmap.shape) > 0: + masked_gaussian = gaussian[radius - top:radius + + bottom, radius - left:radius + right] + if min(masked_gaussian.shape) > 0 and min(masked_heatmap.shape) > 0: # TODO debug np.maximum(masked_heatmap, masked_gaussian * k, out=masked_heatmap) return heatmap @@ -169,9 +170,9 @@ def draw_dense_reg(regmap, heatmap, center, value, radius, is_offset=False): value = np.array(value, dtype=np.float32) value = value.reshape((-1, 1, 1)) dim = value.shape[0] - reg = np.ones((dim, diameter * 2 + 1, diameter * 2 + 1), dtype=np.float32) * value + reg = np.ones((dim, diameter*2+1, diameter*2+1), dtype=np.float32) * value if is_offset and dim == 2: - delta = np.arange(diameter * 2 + 1) - radius + delta = np.arange(diameter*2+1) - radius reg[0] = reg[0] - delta.reshape(1, -1) reg[1] = reg[1] - delta.reshape(-1, 1) @@ -184,11 +185,14 @@ def draw_dense_reg(regmap, heatmap, center, value, radius, is_offset=False): masked_heatmap = heatmap[y - top:y + bottom, x - left:x + right] masked_regmap = regmap[:, y - top:y + bottom, x - left:x + right] - masked_gaussian = gaussian[radius - top:radius + bottom, radius - left:radius + right] - masked_reg = reg[:, radius - top:radius + bottom, radius - left:radius + right] + masked_gaussian = gaussian[radius - top:radius + bottom, + radius - left:radius + right] + masked_reg = reg[:, radius - top:radius + bottom, + radius - left:radius + right] if min(masked_gaussian.shape) > 0 and min(masked_heatmap.shape) > 0: # TODO debug - idx = (masked_gaussian >= masked_heatmap).reshape(1, masked_gaussian.shape[0], masked_gaussian.shape[1]) - masked_regmap = (1 - idx) * masked_regmap + idx * masked_reg + idx = (masked_gaussian >= masked_heatmap).reshape( + 1, masked_gaussian.shape[0], masked_gaussian.shape[1]) + masked_regmap = (1-idx) * masked_regmap + idx * masked_reg regmap[:, y - top:y + bottom, x - left:x + right] = masked_regmap return regmap diff --git a/research/cv/centernet_resnet50_v1/src/model_utils/config.py b/research/cv/centernet_resnet50_v1/src/model_utils/config.py new file mode 100644 index 0000000000000000000000000000000000000000..da416322d08837910d9fb3cc70f50f5eb17ae3f6 --- /dev/null +++ b/research/cv/centernet_resnet50_v1/src/model_utils/config.py @@ -0,0 +1,157 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +"""Parse arguments""" + +import os +import ast +import argparse +from pprint import pprint, pformat +import yaml +import numpy as np + + +class Config: + """ + Configuration namespace. Convert dictionary to members. + """ + def __init__(self, cfg_dict): + for k, v in cfg_dict.items(): + if isinstance(v, str) and (v[:9] == 'np.array(' and v[-17:] == 'dtype=np.float32)'): + v = np.array(ast.literal_eval(v[9:v.rfind(']') + 1]), dtype=np.float32) + if isinstance(v, (list, tuple)): + setattr(self, k, [Config(x) if isinstance(x, dict) else x for x in v]) + else: + setattr(self, k, Config(v) if isinstance(v, dict) else v) + + def __str__(self): + return pformat(self.__dict__) + + def __repr__(self): + return self.__str__() + + +def parse_cli_to_yaml(parser, cfg, helper=None, choices=None, cfg_path="default_config.yaml"): + """ + Parse command line arguments to the configuration according to the default yaml. + + Args: + parser: Parent parser. + cfg: Base configuration. + helper: Helper description. + cfg_path: Path to the default yaml config. + """ + parser = argparse.ArgumentParser(description="[REPLACE THIS at config.py]", + parents=[parser]) + helper = {} if helper is None else helper + choices = {} if choices is None else choices + for item in cfg: + if not isinstance(cfg[item], list) and not isinstance(cfg[item], dict): + help_description = helper[item] if item in helper else "Please reference to {}".format(cfg_path) + choice = choices[item] if item in choices else None + if isinstance(cfg[item], bool): + parser.add_argument("--" + item, type=ast.literal_eval, default=cfg[item], choices=choice, + help=help_description) + else: + parser.add_argument("--" + item, type=type(cfg[item]), default=cfg[item], choices=choice, + help=help_description) + args = parser.parse_args() + return args + + +def parse_yaml(yaml_path): + """ + Parse the yaml config file. + + Args: + yaml_path: Path to the yaml config. + """ + with open(yaml_path, 'r') as fin: + try: + cfgs = yaml.load_all(fin.read(), Loader=yaml.FullLoader) + cfgs = [x for x in cfgs] + if len(cfgs) == 1: + cfg_helper = {} + cfg = cfgs[0] + cfg_choices = {} + elif len(cfgs) == 2: + cfg, cfg_helper = cfgs + cfg_choices = {} + elif len(cfgs) == 3: + cfg, cfg_helper, cfg_choices = cfgs + else: + raise ValueError("At most 3 docs (config, description for help, choices) are supported in config yaml") + print(cfg_helper) + except: + raise ValueError("Failed to parse yaml") + return cfg, cfg_helper, cfg_choices + + +def merge(args, cfg): + """ + Merge the base config from yaml file and command line arguments. + + Args: + args: Command line arguments. + cfg: Base configuration. + """ + args_var = vars(args) + for item in args_var: + cfg[item] = args_var[item] + return cfg + + +def extra_operations(cfg): + """ + Do extra work on Config object. + + Args: + cfg: Object after instantiation of class 'Config'. + """ + cfg.train_config.Adam.decay_filter = lambda x: x.name.endswith('.bias') or x.name.endswith('.beta') or x.name.endswith('.gamma') + cfg.export_config.input_res = cfg.dataset_config.input_res_test + if cfg.export_load_ckpt: + cfg.export_config.ckpt_file = cfg.export_load_ckpt + if cfg.export_name: + cfg.export_config.export_name = cfg.export_name + if cfg.export_format: + cfg.export_config.export_format = cfg.export_format + + + +def get_config(): + """ + Get Config according to the yaml file and cli arguments. 
+ """ + parser = argparse.ArgumentParser(description="default name", add_help=False) + current_dir = os.path.dirname(os.path.abspath(__file__)) + parser.add_argument("--config_path", type=str, default=os.path.join(current_dir, "../../default_config.yaml"), + help="Config file path") + path_args, _ = parser.parse_known_args() + default, helper, choices = parse_yaml(path_args.config_path) + pprint(default) + args = parse_cli_to_yaml(parser=parser, cfg=default, helper=helper, choices=choices, cfg_path=path_args.config_path) + final_config = merge(args, default) + config_obj = Config(final_config) + extra_operations(config_obj) + return config_obj + + +config = get_config() +dataset_config = config.dataset_config +net_config = config.net_config +train_config = config.train_config +eval_config = config.eval_config +export_config = config.export_config diff --git a/research/cv/centernet_resnet50_v1/src/model_utils/device_adapter.py b/research/cv/centernet_resnet50_v1/src/model_utils/device_adapter.py new file mode 100644 index 0000000000000000000000000000000000000000..9c3d21d5e47c22617170887df9da97beff668495 --- /dev/null +++ b/research/cv/centernet_resnet50_v1/src/model_utils/device_adapter.py @@ -0,0 +1,27 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +"""Device adapter for ModelArts""" + +from src.model_utils.config import config + +if config.enable_modelarts: + from src.model_utils.moxing_adapter import get_device_id, get_device_num, get_rank_id, get_job_id +else: + from src.model_utils.local_adapter import get_device_id, get_device_num, get_rank_id, get_job_id + +__all__ = [ + "get_device_id", "get_device_num", "get_rank_id", "get_job_id" +] diff --git a/research/cv/centernet_resnet50_v1/src/model_utils/local_adapter.py b/research/cv/centernet_resnet50_v1/src/model_utils/local_adapter.py new file mode 100644 index 0000000000000000000000000000000000000000..769fa6dc78e59eb66dbc8e6773accdc1d08b649e --- /dev/null +++ b/research/cv/centernet_resnet50_v1/src/model_utils/local_adapter.py @@ -0,0 +1,36 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +"""Local adapter""" + +import os + +def get_device_id(): + device_id = os.getenv('DEVICE_ID', '0') + return int(device_id) + + +def get_device_num(): + device_num = os.getenv('RANK_SIZE', '1') + return int(device_num) + + +def get_rank_id(): + global_rank_id = os.getenv('RANK_ID', '0') + return int(global_rank_id) + + +def get_job_id(): + return "Local Job" diff --git a/research/cv/centernet_resnet50_v1/src/model_utils/moxing_adapter.py b/research/cv/centernet_resnet50_v1/src/model_utils/moxing_adapter.py new file mode 100644 index 0000000000000000000000000000000000000000..09cb0f0cf0fb88ba809d5ba9a40432b644d789b3 --- /dev/null +++ b/research/cv/centernet_resnet50_v1/src/model_utils/moxing_adapter.py @@ -0,0 +1,123 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +"""Moxing adapter for ModelArts""" + +import os +import functools +from mindspore import context +from mindspore.profiler import Profiler +from src.model_utils.config import config + +_global_sync_count = 0 + +def get_device_id(): + device_id = os.getenv('DEVICE_ID', '0') + return int(device_id) + + +def get_device_num(): + device_num = os.getenv('RANK_SIZE', '1') + return int(device_num) + + +def get_rank_id(): + global_rank_id = os.getenv('RANK_ID', '0') + return int(global_rank_id) + + +def get_job_id(): + job_id = os.getenv('JOB_ID') + job_id = job_id if job_id != "" else "default" + return job_id + +def sync_data(from_path, to_path): + """ + Download data from remote obs to local directory if the first url is remote url and the second one is local path + Upload data from local directory to remote obs in contrast. + """ + import moxing as mox + import time + global _global_sync_count + sync_lock = "/tmp/copy_sync.lock" + str(_global_sync_count) + _global_sync_count += 1 + + # Each server contains 8 devices as most. + if get_device_id() % min(get_device_num(), 8) == 0 and not os.path.exists(sync_lock): + print("from path: ", from_path) + print("to path: ", to_path) + mox.file.copy_parallel(from_path, to_path) + print("===finish data synchronization===") + try: + os.mknod(sync_lock) + # print("os.mknod({}) success".format(sync_lock)) + except IOError: + pass + print("===save flag===") + + while True: + if os.path.exists(sync_lock): + break + time.sleep(1) + + print("Finish sync data from {} to {}.".format(from_path, to_path)) + + +def moxing_wrapper(pre_process=None, post_process=None): + """ + Moxing wrapper to download dataset and upload outputs. 
+ """ + def wrapper(run_func): + @functools.wraps(run_func) + def wrapped_func(*args, **kwargs): + # Download data from data_url + if config.enable_modelarts: + if config.data_url: + sync_data(config.data_url, config.data_path) + print("Dataset downloaded: ", os.listdir(config.data_path)) + if config.checkpoint_url: + sync_data(config.checkpoint_url, config.load_path) + print("Preload downloaded: ", os.listdir(config.load_path)) + if config.train_url: + sync_data(config.train_url, config.output_path) + print("Workspace downloaded: ", os.listdir(config.output_path)) + + context.set_context(save_graphs_path=os.path.join(config.output_path, str(get_rank_id()))) + config.device_num = get_device_num() + config.device_id = get_device_id() + if not os.path.exists(config.output_path): + os.makedirs(config.output_path) + + if pre_process: + pre_process() + + if config.enable_profiling: + profiler = Profiler() + + run_func(*args, **kwargs) + + if config.enable_profiling: + profiler.analyse() + + # Upload data to train_url + if config.enable_modelarts: + if post_process: + post_process() + + if config.train_url: + print("Start to copy output directory") + sync_data(config.output_path, config.train_url) + return wrapped_func + return wrapper diff --git a/research/cv/centernet_resnet50_v1/src/post_process.py b/research/cv/centernet_resnet50_v1/src/post_process.py index 2879decb3a0d15700b5108d5c4917d17b7b145c9..22546cb3d38a7afb263c99bb63d41b5c54de331a 100644 --- a/research/cv/centernet_resnet50_v1/src/post_process.py +++ b/research/cv/centernet_resnet50_v1/src/post_process.py @@ -19,12 +19,6 @@ import numpy as np from .image import get_affine_transform, affine_transform, transform_preds from .visual import coco_box_to_bbox -try: - from nms import soft_nms -except ImportError: - print('NMS not installed! Do \n cd $CenterNet_ROOT/scripts/ \n' - 'and see run_standalone_eval.sh for more details to install it\n') - def post_process(dets, meta, scale, num_classes): """rescale detection to original scale""" @@ -58,7 +52,12 @@ def merge_outputs(detections, num_classes, SOFT_NMS=True): results[j] = np.concatenate( [detection[j] for detection in detections], axis=0).astype(np.float32) if SOFT_NMS: - soft_nms(results[j], Nt=0.5, threshold=0.01, method=2) + try: + from nms import soft_nms + except ImportError: + print('NMS not installed! Do \n cd $CenterNet_ROOT/scripts/ \n' + 'and see run_standalone_eval.sh for more details to install it\n') + soft_nms(results[j], Nt=0.5, threshold=0.001, method=2) scores = np.hstack( [results[j][:, 4] for j in range(1, num_classes + 1)]) diff --git a/research/cv/centernet_resnet50_v1/src/resnet50.py b/research/cv/centernet_resnet50_v1/src/resnet50.py index f6a1cbddf9763a2de5660736e7c5f082fdafc176..3db5e945888450967738ca6509ff7dc936257541 100644 --- a/research/cv/centernet_resnet50_v1/src/resnet50.py +++ b/research/cv/centernet_resnet50_v1/src/resnet50.py @@ -12,47 +12,46 @@ # See the License for the specific language governing permissions and # limitations under the License. # ============================================================================ -"""Resnet50 backbone.""" +""" +ResNet50 backbone +""" +import math import mindspore.nn as nn +from mindspore import Parameter +from mindspore.common.initializer import initializer +from mindspore.common.initializer import HeUniform BN_MOMENTUM = 0.9 -class ResidualBlock(nn.Cell): +class Bottleneck(nn.Cell): """ - ResNet V1 residual block definition. + ResNet basic block. Args: - in_channel (int) - Input channel. 
- out_channel (int) - Output channel. - stride (int) - Stride size for the initial convolutional layer. Default: 1. - downsample (func) - the downsample in block. Default: None. + cin(int): Input channel. + cout(int): Output channel. + stride(int): Stride size for the initial convolutional layer. Default:1. + downsample(Cell): Downsample convolution block. Default:None. Returns: Tensor, output tensor. - Examples: - >>> ResidualBlock(64, 256, stride=2, downsample=None) """ expansion = 4 - def __init__(self, - in_channel, - out_channel, - stride=1, - downsample=None): - super(ResidualBlock, self).__init__() - self.stride = stride - channel = out_channel // self.expansion - self.conv1 = nn.Conv2d(in_channel, channel, kernel_size=1, has_bias=False) - self.bn1 = nn.BatchNorm2d(channel, momentum=BN_MOMENTUM) - self.conv2 = nn.Conv2d(channel, channel, kernel_size=3, stride=stride, + def __init__(self, cin, cout, stride=1, downsample=None): + super(Bottleneck, self).__init__() + self.conv1 = nn.Conv2d(cin, cout, kernel_size=1, has_bias=False) + self.bn1 = nn.BatchNorm2d(cout, momentum=BN_MOMENTUM) + self.conv2 = nn.Conv2d(cout, cout, kernel_size=3, stride=stride, pad_mode='pad', padding=1, has_bias=False) - self.bn2 = nn.BatchNorm2d(channel, momentum=BN_MOMENTUM) - self.conv3 = nn.Conv2d(channel, out_channel, kernel_size=1, has_bias=False) - self.bn3 = nn.BatchNorm2d(out_channel) + self.bn2 = nn.BatchNorm2d(cout, momentum=BN_MOMENTUM) + self.conv3 = nn.Conv2d(cout, cout * self.expansion, kernel_size=1, has_bias=False) + self.bn3 = nn.BatchNorm2d(cout * self.expansion) self.relu = nn.ReLU() self.downsample = downsample + self.stride = stride def construct(self, x): """Defines the computation performed.""" @@ -78,110 +77,83 @@ class ResidualBlock(nn.Cell): return out -class ResNetFea(nn.Cell): +class ResNet50(nn.Cell): """ ResNet architecture. Args: block (Cell): Block for network. - layer_nums (list): Numbers of block in different layers. - in_channels (list): Input channel in each layer. - out_channels (list): Output channel in each layer. - strides (list): Stride size in each layer. + layer (list): Numbers of block in different layers. + heads (dict): The number of heatmap,width and height,offset. + head_conv(int): Input convolution dimension. + Returns: Tensor, output tensor. 
- Examples: - >>> ResNetFea(ResidualBlock, - >>> [3, 4, 6, 3], - >>> [64, 256, 512, 1024], - >>> [256, 512, 1024, 2048], - >>> [1, 2, 2, 2]) """ - - def __init__(self, - block, - layer_nums, - in_channels, - out_channels, - strides): - self.cin = 64 - super(ResNetFea, self).__init__() + def __init__(self, block, layers, heads, head_conv): + self.cin = head_conv + self.heads = heads + super(ResNet50, self).__init__() self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, pad_mode='pad', padding=3, has_bias=False) self.bn1 = nn.BatchNorm2d(64, momentum=BN_MOMENTUM) self.relu = nn.ReLU() self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode='same') - - self.layer1 = self._make_layer(block, - layer_nums[0], - in_channel=in_channels[0], - out_channel=out_channels[0], - stride=strides[0]) - - self.layer2 = self._make_layer(block, - layer_nums[1], - in_channel=in_channels[1], - out_channel=out_channels[1], - stride=strides[1]) - - self.layer3 = self._make_layer(block, - layer_nums[2], - in_channel=in_channels[2], - out_channel=out_channels[2], - stride=strides[2]) - - self.layer4 = self._make_layer(block, - layer_nums[3], - in_channel=in_channels[3], - out_channel=out_channels[3], - stride=strides[3]) - self.cin = out_channels[3] + self.layer1 = self._make_layer(block, 64, layers[0]) + self.layer2 = self._make_layer(block, 128, layers[1], stride=2) + self.layer3 = self._make_layer(block, 256, layers[2], stride=2) + self.layer4 = self._make_layer(block, 512, layers[3], stride=2) self.deconv_layers = self._make_deconv_layer( num_layers=3, num_filters=[256, 128, 64], num_kernels=[4, 4, 4], ) - def _make_layer(self, block, layer_num, in_channel, out_channel, stride=1): + def _make_layer(self, block, cout, blocks, stride=1): """ Make stage network of ResNet. Args: block (Cell): Resnet block. - layer_num (int): Layer number. - in_channel (int): Input channel. - out_channel (int): Output channel. + cout (int): Output channel. + blocks(int): Layer number. stride (int): Stride size for the first convolutional layer. + Returns: SequentialCell, the output layer. - - Examples: - >>> _make_layer(ResidualBlock, 3, 128, 256, 2) """ downsample = None - if stride != 1 or in_channel != out_channel: + if stride != 1 or self.cin != cout * block.expansion: downsample = nn.SequentialCell( - nn.Conv2d(in_channel, out_channel, + nn.Conv2d(self.cin, cout * block.expansion, kernel_size=1, stride=stride, has_bias=False), - nn.BatchNorm2d(out_channel, momentum=BN_MOMENTUM), + nn.BatchNorm2d(cout * block.expansion, momentum=BN_MOMENTUM), ) layers = [] - layers.append(block(in_channel, out_channel, stride, downsample)) - for _ in range(1, layer_num): - layers.append(block(out_channel, out_channel)) + layers.append(block(self.cin, cout, stride, downsample)) + self.cin = cout * block.expansion + for _ in range(1, blocks): + layers.append(block(self.cin, cout)) return nn.SequentialCell(*layers) def _make_deconv_layer(self, num_layers, num_filters, num_kernels): """ - Deconvolution for upsampling + Make deconvolution network of ResNet. + + Args: + num_layer(int): Layer number. + num_filters (list): Convolution dimension. + num_kernels (list): The size of convolution kernel . + + Returns: + SequentialCell, the output layer. 
""" layers = [] for i in range(num_layers): kernel = num_kernels[i] cout = num_filters[i] - up = nn.Conv2dTranspose(in_channels=self.cin, out_channels=cout, kernel_size=kernel, stride=2, pad_mode='pad', padding=1) @@ -206,3 +178,11 @@ class ResNetFea(nn.Cell): x = self.deconv_layers(x) return x + + +def weights_init(net): + """Initialize the weight.""" + for _, cell in net.cells_and_names(): + if isinstance(cell, nn.Conv2d): + cell.weight = Parameter(initializer(HeUniform(negative_slope=math.sqrt(5)), + cell.weight.shape, cell.weight.dtype), name=cell.weight.name) diff --git a/research/cv/centernet_resnet50_v1/src/utils.py b/research/cv/centernet_resnet50_v1/src/utils.py index 2216953d1b8763aa5b44858c0a371264d79a7db5..e64b96ddaaecb3697fbeb840f19293c55a3c6f28 100644 --- a/research/cv/centernet_resnet50_v1/src/utils.py +++ b/research/cv/centernet_resnet50_v1/src/utils.py @@ -22,53 +22,10 @@ import numpy as np import mindspore.nn as nn import mindspore.ops as ops from mindspore import dtype as mstype -from mindspore.common.initializer import initializer -from mindspore.common.parameter import Parameter from mindspore.common.tensor import Tensor from mindspore.nn.learning_rate_schedule import LearningRateSchedule, PolynomialDecayLR, WarmUpLR from mindspore.train.callback import Callback -clip_grad = ops.MultitypeFuncGraph("clip_grad") - - -@clip_grad.register("Number", "Tensor") -def _clip_grad(clip_value, grad): - """ - Clip gradients. - - Inputs: - clip_value (float): Specifies how much to clip. - grad (tuple[Tensor]): Gradients. - - Outputs: - tuple[Tensor], clipped gradients. - """ - dt = ops.dtype(grad) - new_grad = nn.ClipByNorm()(grad, ops.cast(ops.tuple_to_array((clip_value,)), dt)) - return new_grad - - -class ClipByNorm(nn.Cell): - """ - Clip grads by gradient norm - - Args: - clip_norm(float): The target norm of graident clip. Default: 1.0 - - Returns: - Tuple of Tensors, gradients after clip. - """ - - def __init__(self, clip_norm=1.0): - super(ClipByNorm, self).__init__() - self.hyper_map = ops.HyperMap() - self.clip_norm = clip_norm - - def construct(self, grads): - grads = self.hyper_map(ops.partial(clip_grad, self.clip_norm), grads) - return grads - - reciprocal = ops.Reciprocal() grad_scale = ops.MultitypeFuncGraph("grad_scale") @@ -87,7 +44,6 @@ class GradScale(nn.Cell): Returns: Tuple of Tensors, gradients after rescale. """ - def __init__(self): super(GradScale, self).__init__() self.hyper_map = ops.HyperMap() @@ -97,39 +53,17 @@ class GradScale(nn.Cell): return grads -class ClipByValue(nn.Cell): - """ - Clip tensor by value - - Args: None - - Returns: - Tensor, output after clip. - """ - - def __init__(self): - super(ClipByValue, self).__init__() - self.min = ops.Minimum() - self.max = ops.Maximum() - - def construct(self, x, clip_value_min, clip_value_max): - x_min = self.min(x, clip_value_max) - x_max = self.max(x_min, clip_value_min) - return x_max - - class GatherFeature(nn.Cell): """ Gather feature at specified position Args: - enable_cpu_gather (bool): Use cpu operator GatherD to gather feature or not, adaption for CPU. Default: True. + enable_cpu_gather (bool): Use cpu operator GatherD to gather feature or not, adaption for CPU. Default: False. 
Returns: Tensor, feature at spectified position """ - - def __init__(self, enable_cpu_gather=True): + def __init__(self, enable_cpu_gather=False): super(GatherFeature, self).__init__() self.tile = ops.Tile() self.shape = ops.Shape() @@ -175,7 +109,6 @@ class TransposeGatherFeature(nn.Cell): Returns: Tensor, feature at spectified position """ - def __init__(self): super(TransposeGatherFeature, self).__init__() self.shape = ops.Shape() @@ -203,7 +136,6 @@ class Sigmoid(nn.Cell): Returns: Tensor, feature after sigmoid and clip. """ - def __init__(self): super(Sigmoid, self).__init__() self.cast = ops.Cast() @@ -211,7 +143,7 @@ class Sigmoid(nn.Cell): self.sigmoid = nn.Sigmoid() self.clip_by_value = ops.clip_by_value - def construct(self, x, min_value=1e-4, max_value=1 - 1e-4): + def construct(self, x, min_value=1e-4, max_value=1-1e-4): x = self.sigmoid(x) dt = self.dtype(x) x = self.clip_by_value(x, self.cast(ops.tuple_to_array((min_value,)), dt), @@ -230,7 +162,6 @@ class FocalLoss(nn.Cell): Returns: Tensor, focal loss. """ - def __init__(self, alpha=2, beta=4): super(FocalLoss, self).__init__() self.alpha = alpha @@ -264,174 +195,7 @@ class FocalLoss(nn.Cell): return loss -class GHMCLoss(nn.Cell): - """ - Warpper for gradient harmonizing loss for classification. - - Args: - bins(int): Number of bins. Default: 10. - momentum(float): Momentum for moving gradient density. Default: 0.0. - - Returns: - Tensor, GHM loss for classification. - """ - - def __init__(self, bins=10, momentum=0.0): - super(GHMCLoss, self).__init__() - self.bins = bins - self.momentum = momentum - edges_left = np.array([float(x) / bins for x in range(bins)], dtype=np.float32) - self.edges_left = Tensor(edges_left.reshape((bins, 1, 1, 1, 1))) - edges_right = np.array([float(x) / bins for x in range(1, bins + 1)], dtype=np.float32) - edges_right[-1] += 1e-4 - self.edges_right = Tensor(edges_right.reshape((bins, 1, 1, 1, 1))) - - if momentum >= 0: - self.acc_sum = Parameter(initializer(0, [bins], mstype.float32)) - - self.abs = ops.Abs() - self.log = ops.Log() - self.cast = ops.Cast() - self.select = ops.Select() - self.reshape = ops.Reshape() - self.reduce_sum = ops.ReduceSum() - self.max = ops.Maximum() - self.less = ops.Less() - self.equal = ops.Equal() - self.greater = ops.Greater() - self.logical_and = ops.LogicalAnd() - self.greater_equal = ops.GreaterEqual() - self.zeros_like = ops.ZerosLike() - self.expand_dims = ops.ExpandDims() - - def construct(self, out, target): - """GHM loss for classification""" - g = self.abs(out - target) - g = self.expand_dims(g, 0) # (1, b, c, h, w) - - pos_inds = self.cast(self.equal(target, 1.0), mstype.float32) - tot = self.max(self.reduce_sum(pos_inds, ()), 1.0) - - # (bin, b, c, h, w) - inds_mask = self.logical_and(self.greater_equal(g, self.edges_left), self.less(g, self.edges_right)) - zero_matrix = self.cast(self.zeros_like(inds_mask), mstype.float32) - inds = self.cast(inds_mask, mstype.float32) - # (bins,) - num_in_bin = self.reduce_sum(inds, (1, 2, 3, 4)) - valid_bins = self.greater(num_in_bin, 0) - num_valid_bin = self.reduce_sum(self.cast(valid_bins, mstype.float32), ()) - - if self.momentum > 0: - self.acc_sum = self.select(valid_bins, - self.momentum * self.acc_sum + (1 - self.momentum) * num_in_bin, - self.acc_sum) - acc_sum = self.acc_sum - acc_sum = self.reshape(acc_sum, (self.bins, 1, 1, 1, 1)) - acc_sum = acc_sum + zero_matrix - weights = self.select(self.equal(inds, 1), tot / acc_sum, zero_matrix) - # (b, c, h, w) - weights = self.reduce_sum(weights, 0) - else: - 
num_in_bin = self.reshape(num_in_bin, (self.bins, 1, 1, 1, 1)) - num_in_bin = num_in_bin + zero_matrix - weights = self.select(self.equal(inds, 1), tot / num_in_bin, zero_matrix) - # (b, c, h, w) - weights = self.reduce_sum(weights, 0) - - weights = weights / num_valid_bin - - ghmc_loss = (target - 1.0) * self.log(1.0 - out) - target * self.log(out) - ghmc_loss = self.reduce_sum(ghmc_loss * weights, ()) / tot - return ghmc_loss - - -class GHMRLoss(nn.Cell): - """ - Warpper for gradient harmonizing loss for regression. - - Args: - bins(int): Number of bins. Default: 10. - momentum(float): Momentum for moving gradient density. Default: 0.0. - mu(float): Super parameter for smoothed l1 loss. Default: 0.02. - - Returns: - Tensor, GHM loss for regression. - """ - - def __init__(self, bins=10, momentum=0.0, mu=0.02): - super(GHMRLoss, self).__init__() - self.bins = bins - self.momentum = momentum - self.mu = mu - edges_left = np.array([float(x) / bins for x in range(bins)], dtype=np.float32) - self.edges_left = Tensor(edges_left.reshape((bins, 1, 1, 1, 1))) - edges_right = np.array([float(x) / bins for x in range(1, bins + 1)], dtype=np.float32) - edges_right[-1] += 1e-4 - self.edges_right = Tensor(edges_right.reshape((bins, 1, 1, 1, 1))) - - if momentum >= 0: - self.acc_sum = Parameter(initializer(0, [bins], mstype.float32)) - - self.abs = ops.Abs() - self.sqrt = ops.Sqrt() - self.cast = ops.Cast() - self.select = ops.Select() - self.reshape = ops.Reshape() - self.reduce_sum = ops.ReduceSum() - self.max = ops.Maximum() - self.less = ops.Less() - self.equal = ops.Equal() - self.greater = ops.Greater() - self.logical_and = ops.LogicalAnd() - self.greater_equal = ops.GreaterEqual() - self.zeros_like = ops.ZerosLike() - self.expand_dims = ops.ExpandDims() - - def construct(self, out, target): - """GHM loss for regression""" - # ASL1 loss - diff = out - target - # gradient length - g = self.abs(diff / self.sqrt(self.mu * self.mu + diff * diff)) - g = self.expand_dims(g, 0) # (1, b, c, h, w) - - pos_inds = self.cast(self.equal(target, 1.0), mstype.float32) - tot = self.max(self.reduce_sum(pos_inds, ()), 1.0) - - # (bin, b, c, h, w) - inds_mask = self.logical_and(self.greater_equal(g, self.edges_left), self.less(g, self.edges_right)) - zero_matrix = self.cast(self.zeros_like(inds_mask), mstype.float32) - inds = self.cast(inds_mask, mstype.float32) - # (bins,) - num_in_bin = self.reduce_sum(inds, (1, 2, 3, 4)) - valid_bins = self.greater(num_in_bin, 0) - num_valid_bin = self.reduce_sum(self.cast(valid_bins, mstype.float32), ()) - - if self.momentum > 0: - self.acc_sum = self.select(valid_bins, - self.momentum * self.acc_sum + (1 - self.momentum) * num_in_bin, - self.acc_sum) - acc_sum = self.acc_sum - acc_sum = self.reshape(acc_sum, (self.bins, 1, 1, 1, 1)) - acc_sum = acc_sum + zero_matrix - weights = self.select(self.equal(inds, 1), tot / acc_sum, zero_matrix) - # (b, c, h, w) - weights = self.reduce_sum(weights, 0) - else: - num_in_bin = self.reshape(num_in_bin, (self.bins, 1, 1, 1, 1)) - num_in_bin = num_in_bin + zero_matrix - weights = self.select(self.equal(inds, 1), tot / num_in_bin, zero_matrix) - # (b, c, h, w) - weights = self.reduce_sum(weights, 0) - - weights = weights / num_valid_bin - - ghmr_loss = self.sqrt(diff * diff + self.mu * self.mu) - self.mu - ghmr_loss = self.reduce_sum(ghmr_loss * weights, ()) / tot - return ghmr_loss - - -class RegLoss(nn.Cell): # reg_l1_loss +class RegLoss(nn.Cell): #reg_l1_loss """ Warpper for regression loss. 
@@ -441,7 +205,6 @@ class RegLoss(nn.Cell): # reg_l1_loss Returns: Tensor, regression loss. """ - def __init__(self, mode='l1'): super(RegLoss, self).__init__() self.reduce_sum = ops.ReduceSum() @@ -468,32 +231,6 @@ class RegLoss(nn.Cell): # reg_l1_loss return regr_loss -class RegWeightedL1Loss(nn.Cell): - """ - Warpper for weighted regression loss. - - Args: None - - Returns: - Tensor, regression loss. - """ - - def __init__(self): - super(RegWeightedL1Loss, self).__init__() - self.reduce_sum = ops.ReduceSum() - self.gather_feature = TransposeGatherFeature() - self.cast = ops.Cast() - self.l1_loss = nn.L1Loss(reduction='sum') - - def construct(self, output, mask, ind, target): - pred = self.gather_feature(output, ind) - mask = self.cast(mask, mstype.float32) - num = self.reduce_sum(mask, ()) - loss = self.l1_loss(pred * mask, target * mask) - loss = loss / (num + 1e-4) - return loss - - class LossCallBack(Callback): """ Monitor the loss in training. @@ -554,7 +291,6 @@ class CenterNetPolynomialDecayLR(LearningRateSchedule): Returns: Tensor, learning rate in time. """ - def __init__(self, learning_rate, end_learning_rate, warmup_steps, decay_steps, power): super(CenterNetPolynomialDecayLR, self).__init__() self.warmup_flag = False @@ -593,7 +329,6 @@ class CenterNetMultiEpochsDecayLR(LearningRateSchedule): Returns: Tensor, learning rate in time. """ - def __init__(self, learning_rate, warmup_steps, multi_epochs, steps_per_epoch, factor=10): super(CenterNetMultiEpochsDecayLR, self).__init__() self.warmup_flag = False @@ -632,7 +367,6 @@ class MultiEpochsDecayLR(LearningRateSchedule): Returns: Tensor, learning rate. """ - def __init__(self, learning_rate, multi_epochs, steps_per_epoch, factor=10): super(MultiEpochsDecayLR, self).__init__() if not isinstance(multi_epochs, (list, tuple)): diff --git a/research/cv/centernet_resnet50_v1/src/visual.py b/research/cv/centernet_resnet50_v1/src/visual.py index ab583faf1a1f8da5b47888a44f4b725f58b1c2b4..315fdb62462b3ec46d5a0adc69c2aeb61fe5cae6 100644 --- a/research/cv/centernet_resnet50_v1/src/visual.py +++ b/research/cv/centernet_resnet50_v1/src/visual.py @@ -22,11 +22,28 @@ import random import cv2 import numpy as np import pycocotools.coco as COCO -from .config import dataset_config as data_cfg -from .config import eval_config as eval_cfg +from .model_utils.config import eval_config as eval_cfg from .image import get_affine_transform, affine_transform +coco_class_name2id = {'person': 1, 'bicycle': 2, 'car': 3, 'motorcycle': 4, 'airplane': 5, + 'bus': 6, 'train': 7, 'truck': 8, 'boat': 9, 'traffic light': 10, + 'fire hydrant': 11, 'stop sign': 13, 'parking meter': 14, 'bench': 15, + 'bird': 16, 'cat': 17, 'dog': 18, 'horse': 19, 'sheep': 20, 'cow': 21, + 'elephant': 22, 'bear': 23, 'zebra': 24, 'giraffe': 25, 'backpack': 27, + 'umbrella': 28, 'handbag': 31, 'tie': 32, 'suitcase': 33, 'frisbee': 34, + 'skis': 35, 'snowboard': 36, 'sports ball': 37, 'kite': 38, 'baseball bat': 39, + 'baseball glove': 40, 'skateboard': 41, 'surfboard': 42, 'tennis racket': 43, + 'bottle': 44, 'wine glass': 46, 'cup': 47, 'fork': 48, 'knife': 49, 'spoon': 50, + 'bowl': 51, 'banana': 52, 'apple': 53, 'sandwich': 54, 'orange': 55, 'broccoli': 56, + 'carrot': 57, 'hot dog': 58, 'pizza': 59, 'donut': 60, 'cake': 61, 'chair': 62, + 'couch': 63, 'potted plant': 64, 'bed': 65, 'dining table': 67, 'toilet': 70, + 'tv': 72, 'laptop': 73, 'mouse': 74, 'remote': 75, 'keyboard': 76, 'cell phone': 77, + 'microwave': 78, 'oven': 79, 'toaster': 80, 'sink': 81, 'refrigerator': 82, + 
'book': 84, 'clock': 85, 'vase': 86, 'scissors': 87, 'teddy bear': 88, + 'hair drier': 89, 'toothbrush': 90} + + def coco_box_to_bbox(box): """convert height/width to position coordinates""" bbox = np.array([box[0], box[1], box[0] + box[2], box[1] + box[3]], dtype=np.float32) @@ -68,8 +85,8 @@ def merge_pred(ann_path, mode="val", name="merged_annotations"): if "json" in file_name: data_files.append(os.path.join(ann_path, file_name)) pred = {"images": [], "annotations": []} - for f in data_files: - anno = json.load(open(f, 'r')) + for file in data_files: + anno = json.load(open(file, 'r')) if "images" in anno: for img in anno["images"]: pred["images"].append(img) @@ -123,7 +140,7 @@ def visual_image(img, annos, save_path, ratio=None, height=None, width=None, nam num_objects = len(annos) name_list = [] id_list = [] - for class_name, class_id in data_cfg.coco_class_name2id.items(): + for class_name, class_id in coco_class_name2id.items(): name_list.append(class_name) id_list.append(class_id) diff --git a/research/cv/centernet_resnet50_v1/train.py b/research/cv/centernet_resnet50_v1/train.py index b4bff409073009a922c2b31fcd0be14286e085c2..8391d982c8678610bc586701826ee91947020cd1 100644 --- a/research/cv/centernet_resnet50_v1/train.py +++ b/research/cv/centernet_resnet50_v1/train.py @@ -17,7 +17,6 @@ Train CenterNet and get network model files(.ckpt) """ import os -import argparse import mindspore.communication.management as D from mindspore.communication.management import get_rank from mindspore import context @@ -29,55 +28,22 @@ from mindspore.nn.optim import Adam from mindspore import log as logger from mindspore.common import set_seed from mindspore.profiler import Profiler + from src.dataset import COCOHP from src.centernet_det import CenterNetLossCell, CenterNetWithLossScaleCell from src.centernet_det import CenterNetWithoutLossScaleCell from src.utils import LossCallBack, CenterNetPolynomialDecayLR, CenterNetMultiEpochsDecayLR -from src.config import dataset_config, net_config, train_config +from src.model_utils.config import config, dataset_config, net_config, train_config +from src.model_utils.moxing_adapter import moxing_wrapper +from src.model_utils.device_adapter import get_device_id, get_rank_id, get_device_num -_current_dir = os.path.dirname(os.path.realpath(__file__)) -parser = argparse.ArgumentParser(description='CenterNet training') -parser.add_argument('--device_target', type=str, default='Ascend', choices=['Ascend', 'CPU'], - help='device where the code will be implemented. (Default: Ascend)') -parser.add_argument("--distribute", type=str, default="true", choices=["true", "false"], - help="Run distribute, default is true.") -parser.add_argument("--need_profiler", type=str, default="false", choices=["true", "false"], - help="Profiling to parsing runtime info, default is false.") -parser.add_argument("--profiler_path", type=str, default=" ", help="The path to save profiling data") -parser.add_argument("--epoch_size", type=int, default="1", help="Epoch size, default is 1.") -parser.add_argument("--train_steps", type=int, default=-1, help="Training Steps, default is -1," - "i.e. 
run all steps according to epoch number.") -parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.") -parser.add_argument("--device_num", type=int, default=1, help="Use device nums, default is 1.") -parser.add_argument("--enable_save_ckpt", type=str, default="true", choices=["true", "false"], - help="Enable save checkpoint, default is true.") -parser.add_argument("--do_shuffle", type=str, default="true", choices=["true", "false"], - help="Enable shuffle for dataset, default is true.") -parser.add_argument("--enable_data_sink", type=str, default="true", choices=["true", "false"], - help="Enable data sink, default is true.") -parser.add_argument("--data_sink_steps", type=int, default="-1", help="Sink steps for each epoch, default is -1.") -parser.add_argument("--save_checkpoint_path", type=str, default="", help="Save checkpoint path") -parser.add_argument("--load_checkpoint_path", type=str, default="", help="Load checkpoint file path") -parser.add_argument("--save_checkpoint_steps", type=int, default=1000, help="Save checkpoint steps, default is 1000.") -parser.add_argument("--save_checkpoint_num", type=int, default=1, help="Save checkpoint numbers, default is 1.") -parser.add_argument("--mindrecord_dir", type=str, default="", help="Mindrecord dataset files directory") -parser.add_argument("--mindrecord_prefix", type=str, default="coco_det.train.mind", - help="Prefix of MindRecord dataset filename.") -parser.add_argument("--save_result_dir", type=str, default="", help="The path to save the predict results") - -args_opt = parser.parse_args() +_current_dir = os.path.dirname(os.path.realpath(__file__)) def _set_parallel_all_reduce_split(): """set centernet all_reduce fusion split""" - if net_config.last_level == 5: - context.set_auto_parallel_context(all_reduce_fusion_config=[16, 56, 96, 136, 175]) - elif net_config.last_level == 6: - context.set_auto_parallel_context(all_reduce_fusion_config=[18, 59, 100, 141, 182]) - else: - raise ValueError("The total num of allreduced grads for last level = {} is unknown," - "please re-split after known the true value".format(net_config.last_level)) + context.set_auto_parallel_context(all_reduce_fusion_config=[18, 59, 100, 141, 182]) def _get_params_groups(network, optimizer): @@ -101,7 +67,7 @@ def _get_optimizer(network, dataset_size): lr_schedule = CenterNetPolynomialDecayLR(learning_rate=train_config.PolyDecay.learning_rate, end_learning_rate=train_config.PolyDecay.end_learning_rate, warmup_steps=train_config.PolyDecay.warmup_steps, - decay_steps=args_opt.train_steps, + decay_steps=config.train_steps, power=train_config.PolyDecay.power) optimizer = Adam(group_params, learning_rate=lr_schedule, eps=train_config.PolyDecay.eps, loss_scale=1.0) elif train_config.lr_schedule == "MultiDecay": @@ -109,7 +75,7 @@ def _get_optimizer(network, dataset_size): if not isinstance(multi_epochs, (list, tuple)): raise TypeError("multi_epochs must be list or tuple.") if not multi_epochs: - multi_epochs = [args_opt.epoch_size] + multi_epochs = [config.epoch_size] lr_schedule = CenterNetMultiEpochsDecayLR(learning_rate=train_config.MultiDecay.learning_rate, warmup_steps=train_config.MultiDecay.warmup_steps, multi_epochs=multi_epochs, @@ -125,77 +91,85 @@ def _get_optimizer(network, dataset_size): return optimizer +def modelarts_pre_process(): + """modelarts pre process function.""" + config.mindrecord_dir = config.data_path + config.save_checkpoint_path = os.path.join(config.output_path, config.save_checkpoint_path) + + 
+@moxing_wrapper(pre_process=modelarts_pre_process) def train(): """training CenterNet""" - context.set_context(mode=context.GRAPH_MODE, device_target=args_opt.device_target) + context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target) context.set_context(reserve_class_name_in_scope=False) context.set_context(save_graphs=False) - ckpt_save_dir = args_opt.save_checkpoint_path + ckpt_save_dir = config.save_checkpoint_path rank = 0 device_num = 1 num_workers = 8 - if args_opt.device_target == "Ascend": - context.set_context(device_id=args_opt.device_id) - if args_opt.distribute == "true": + if config.device_target == "Ascend": + + context.set_context(device_id=get_device_id()) + if config.distribute == "true": D.init() - device_num = args_opt.device_num - rank = args_opt.device_id % device_num - ckpt_save_dir = args_opt.save_checkpoint_path + 'ckpt_' + str(get_rank()) + '/' + device_num = get_device_num() + rank = get_rank_id() + ckpt_save_dir = config.save_checkpoint_path + 'ckpt_' + str(get_rank()) + '/' context.reset_auto_parallel_context() context.set_auto_parallel_context(parallel_mode=ParallelMode.DATA_PARALLEL, gradients_mean=True, device_num=device_num) # _set_parallel_all_reduce_split() else: - args_opt.distribute = "false" - args_opt.need_profiler = "false" - args_opt.enable_data_sink = "false" + config.distribute = "false" + config.need_profiler = "false" + config.enable_data_sink = "false" # Start create dataset! # mindrecord files will be generated at args_opt.mindrecord_dir such as centernet.mindrecord0, 1, ... file_num. logger.info("Begin creating dataset for CenterNet") - coco = COCOHP(dataset_config, run_mode="train", net_opt=net_config, save_path=args_opt.save_result_dir) - dataset = coco.create_train_dataset(args_opt.mindrecord_dir, args_opt.mindrecord_prefix, + coco = COCOHP(dataset_config, run_mode="train", net_opt=net_config, save_path=config.save_result_dir) + dataset = coco.create_train_dataset(config.mindrecord_dir, config.mindrecord_prefix, batch_size=train_config.batch_size, device_num=device_num, rank=rank, - num_parallel_workers=num_workers, do_shuffle=args_opt.do_shuffle == 'true') + num_parallel_workers=num_workers, do_shuffle=config.do_shuffle == 'true') dataset_size = dataset.get_dataset_size() logger.info("Create dataset done!") net_with_loss = CenterNetLossCell(net_config) - args_opt.train_steps = args_opt.epoch_size * dataset_size - logger.info("train steps: {}".format(args_opt.train_steps)) + config.train_steps = config.epoch_size * dataset_size + logger.info("train steps: {}".format(config.train_steps)) optimizer = _get_optimizer(net_with_loss, dataset_size) - enable_static_time = args_opt.device_target == "CPU" - callback = [TimeMonitor(args_opt.data_sink_steps), LossCallBack(dataset_size, enable_static_time)] - if args_opt.enable_save_ckpt == "true" and args_opt.device_id % min(8, device_num) == 0: - config_ck = CheckpointConfig(save_checkpoint_steps=args_opt.save_checkpoint_steps, - keep_checkpoint_max=args_opt.save_checkpoint_num) + enable_static_time = config.device_target == "CPU" + callback = [TimeMonitor(config.data_sink_steps), LossCallBack(dataset_size, enable_static_time)] + if config.enable_save_ckpt == "true" and get_device_id() % min(8, device_num) == 0: + config_ck = CheckpointConfig(save_checkpoint_steps=config.save_checkpoint_steps, + keep_checkpoint_max=config.save_checkpoint_num) ckpoint_cb = ModelCheckpoint(prefix='checkpoint_centernet', directory=None if ckpt_save_dir == "" else ckpt_save_dir, config=config_ck) 
callback.append(ckpoint_cb) - if args_opt.load_checkpoint_path: - param_dict = load_checkpoint(args_opt.load_checkpoint_path) + if config.load_checkpoint_path: + param_dict = load_checkpoint(config.load_checkpoint_path) load_param_into_net(net_with_loss, param_dict) - if args_opt.device_target == "Ascend": + if config.device_target == "Ascend": net_with_grads = CenterNetWithLossScaleCell(net_with_loss, optimizer=optimizer, sens=train_config.loss_scale_value) else: net_with_grads = CenterNetWithoutLossScaleCell(net_with_loss, optimizer=optimizer) model = Model(net_with_grads) - model.train(args_opt.epoch_size, dataset, callbacks=callback, - dataset_sink_mode=(args_opt.enable_data_sink == "true"), sink_size=args_opt.data_sink_steps) + model.train(config.epoch_size, dataset, callbacks=callback, + dataset_sink_mode=(config.enable_data_sink == "true"), sink_size=config.data_sink_steps) if __name__ == '__main__': - if args_opt.need_profiler == "true": - profiler = Profiler(output_path=args_opt.profiler_path) + if config.need_profiler == "true": + profiler = Profiler(output_path=config.profiler_path) set_seed(317) train() - if args_opt.need_profiler == "true": + if config.need_profiler == "true": profiler.analyse()
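
Note on the configuration flow introduced by this patch: `dataset.py` and `train.py` no longer build their own `argparse` parsers; they read every option from the shared `config` object created in `src/model_utils/config.py`, which parses `default_config.yaml`, merges command-line overrides, and exposes nested YAML sections (`dataset_config`, `net_config`, `train_config`, `eval_config`, `export_config`) as attribute-style objects. The sketch below is illustrative only, not part of the patch: it reproduces a simplified version of that dict-to-attribute conversion (the real `Config` class also handles `np.array(...)` strings and lists of dicts), and the YAML keys shown are examples rather than the actual contents of `default_config.yaml`.

```python
# Minimal sketch of the attribute-style Config namespace from src/model_utils/config.py.
# Simplified: the real class also converts "np.array(...)" strings and nested lists.
import yaml


class Config:
    """Recursively convert a dict into attribute access."""
    def __init__(self, cfg_dict):
        for k, v in cfg_dict.items():
            setattr(self, k, Config(v) if isinstance(v, dict) else v)


raw = yaml.safe_load("""
epoch_size: 330
mindrecord_prefix: coco_det.train.mind
dataset_config:
  num_classes: 80
  input_res_train: [512, 512]
""")

config = Config(raw)
print(config.epoch_size)                      # 330
print(config.dataset_config.input_res_train)  # [512, 512]
```

A second sketch, also illustrative only: the tolerance-based "keep" mask that `NMS.construct()` in `src/decode.py` now uses when `enable_nms_fp16` is False. Instead of requiring exact equality between the heatmap and its max-pooled copy, a location is kept as a peak when the relative difference is at most 1e-3, which presumably tolerates small numeric differences introduced by the max-pool on Ascend 310. NumPy stands in for the MindSpore operators here; `keep_mask` is a hypothetical helper name, not a function from the patch.

```python
import numpy as np

def keep_mask(heat, heat_max, rel_tol=1e-3, eps=1e-12):
    """Return 1.0 where heat is (numerically) equal to its local maximum, else 0.0."""
    error = np.abs(heat - heat_max) / (np.abs(heat) + eps)
    return (error <= rel_tol).astype(np.float32)
```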