Unverified Commit efa6920a authored by i-robot's avatar i-robot Committed by Gitee

!2061 AlphaPose GPU training

Merge pull request !2061 from Alexander Melekhin/alphapose-gpu
parents 830a6601 628c784b
# Contents
<!-- TOC -->
- [Alphapose Description](#alphapose-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Features](#features)
- [mixed precision](#mixed-precision)
- [Environmental requirements](#environmental-requirements)
- [Quick start](#quick-start)
- [Script description](#script-description)
- [Scripts and sample code](#scripts-and-sample-code)
- [Script parameters](#script-parameters)
- [Training process](#training-process)
- [Evaluation process](#evaluation-process)
- [310 Inference Process](#310-inference-process)
- [Model description](#model-description)
- [Performance](#performance)
- [Evaluation performance](#evaluation-performance)
- [Inference performance](#inference-performance)
- [Random Seed Description](#random-seed-description)
- [ModelZoo Homepage](#modelzoo-homepage)
<!-- /TOC -->
# Alphapose Description
## Overview
AlphaPose was proposed by Cewu Lu's team at Shanghai Jiao Tong University. The authors introduced the Regional Multi-Person Pose Estimation (RMPE) framework, which combines a Symmetric Spatial Transformer Network (SSTN), parametric pose Non-Maximum Suppression (p-NMS), and a Pose-Guided Proposals Generator (PGPG) to solve the problem of multi-person pose estimation in complex scenes.
For details of the AlphaPose network, please refer to [Paper 1](https://arxiv.org/pdf/1612.00137.pdf). The MindSpore implementation of the AlphaPose network is based on the PyTorch version released by Cewu Lu's team at Shanghai Jiao Tong University; for details, see <https://github.com/MVIG-SJTU/AlphaPose>.
## Paper
1. [Paper](https://arxiv.org/pdf/1612.00137.pdf): Fang H. S., Xie S., Tai Y. W., et al. RMPE: Regional Multi-Person Pose Estimation
# Model Architecture
The overall network architecture of AlphaPose is as follows:
[Link](https://arxiv.org/abs/1612.00137)
# Dataset
Dataset used: [COCO2017](https://cocodataset.org/#download)
- Dataset size:
- Training set: 19.56G, 118,287 images
- Test set: 825MB, 5,000 images
- Data format: JPG file
- Note: Data is processed in src/dataset.py
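A typical directory layout expected by the training and evaluation scripts, inferred from the default configuration (file names follow `default_config.yaml`; the detector bbox file is only needed when ground-truth boxes are not used):
```text
COCO2017/                                           # DATASET_ROOT
 ├── train2017/                                     # training images (DATASET_TRAIN_SET)
 ├── val2017/                                       # validation images (DATASET_TEST_SET)
 ├── annotations/
 │   ├── person_keypoints_train2017.json            # DATASET_TRAIN_JSON
 │   └── person_keypoints_val2017.json              # DATASET_TEST_JSON
 └── COCO_val2017_detections_AP_H_56_person.json    # TEST_COCO_BBOX_FILE (person detections)
```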
# Features
## mixed precision
Training with [mixed precision](https://www.mindspore.cn/docs/programming_guide/en/r1.6/enable_mixed_precision.html) uses both single-precision and half-precision data to speed up deep neural network training while preserving the accuracy achievable with pure single-precision training. Mixed precision increases computational throughput and reduces memory usage, which makes it possible to train larger models or use larger batch sizes on specific hardware.
Taking the FP16 operator as an example, if the input data type is FP32, the MindSpore backend automatically reduces the precision to process the data. You can open the INFO log and search for "reduce precision" to view operators whose precision was reduced.
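The snippet below is a minimal sketch (not part of this repository) of how mixed precision is enabled: `train.py` passes `amp_level="O2"` when building the `Model`; the tiny network and loss here are placeholders only.
```python
import mindspore.nn as nn
from mindspore import Model

# Placeholder network, loss and optimizer, only to keep the snippet self-contained.
net = nn.SequentialCell([nn.Conv2d(3, 17, 3, pad_mode='same')])
loss = nn.MSELoss()
optimizer = nn.Adam(net.trainable_params(), learning_rate=0.001)

# amp_level="O2" keeps most of the network in float16 while BatchNorm and the loss
# stay in float32, matching the Model(..., amp_level="O2") call in train.py.
model = Model(net, loss_fn=loss, optimizer=optimizer, amp_level="O2")
```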
# Environmental requirements
- Hardware (Ascend/GPU)
    - Prepare the Ascend or GPU processor to set up the hardware environment.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For details, see the following resources:
- [MindSpore Tutorial](https://www.mindspore.cn/tutorials/zh-CN/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/master/index.html)
# Quick start
After installing MindSpore through the official website, you can follow the steps below for training and evaluation:
- Pre-trained models
The AlphaPose model uses a ResNet-50 network trained on ImageNet as its backbone. You can run the ResNet training script in the [official model zoo](https://gitee.com/mindspore/models/tree/master/official/cv/resnet) to obtain the model weight file, or download a trained checkpoint from [here](https://download.mindspore.cn/model_zoo/r1.3/resnet50_ascend_v130_imagenet2012_official_cv_bs32_acc77.06/). The pre-trained file should be named resnet50.ckpt.
- Dataset preparation
The AlphaPose network model uses the COCO2017 dataset for training and inference. The dataset can be downloaded from the [official website](https://cocodataset.org/).
- Configuration
Set the desired configuration in ```default_config.yaml``` or create a new one. Every top-level key in the YAML file can also be overridden on the command line (for example, `--TRAIN_BATCH_SIZE 64`).
- Ascend processor environment to run
```bash
# Distributed training
bash scripts/run_distribute_train.sh [RANK_TABLE]
# Stand-alone training
bash scripts/run_standalone_train.sh [DEVICE_ID]
# Run the evaluation example
bash scripts/run_eval.sh [DEVICE_TARGET] [CONFIG] [CKPT_PATH] [DATASET]
# run demo
bash scripts/run_demo.sh
```
- GPU environment to run
```bash
# Distributed training
bash scripts/run_distribute_train_gpu.sh [DEVICE_NUM] [VISIBLE_DEVICES(0,1,2,3,4,5,6,7)] [config_file] [dataset_dir] [pretrained_backbone]
# Stand-alone training
bash scripts/run_standalone_train_gpu.sh [config_file] [dataset_dir] [pretrained_backbone]
# Run the evaluation example
bash scripts/run_eval.sh [DEVICE_TARGET] [CONFIG] [CKPT_PATH] [DATASET]
```
# Script description
## Scripts and sample code
```text
└── AlphaPose
    ├── README.md
    ├── scripts
    │   ├── run_distribute_train.sh      # Start Ascend distributed training (8 cards)
    │   ├── run_distribute_train_gpu.sh  # Start GPU distributed training (8 cards)
    │   ├── run_demo.sh                  # Start the demo (single card)
    │   ├── run_eval.sh                  # Start evaluation (Ascend or GPU)
    │   ├── run_standalone_train.sh      # Start Ascend stand-alone training (single card)
    │   └── run_standalone_train_gpu.sh  # Start GPU stand-alone training (single card)
    ├── src
    │   ├── utils
    │   │   ├── coco.py                  # COCO dataset evaluation tools
    │   │   ├── fn.py                    # Draw human poses based on keypoints
    │   │   ├── inference.py             # Keypoint prediction from heatmaps
    │   │   ├── nms.py                   # NMS
    │   │   └── transforms.py            # Image processing transforms
    │   ├── config.py                    # Parameter configuration
    │   ├── dataset.py                   # Data preprocessing
    │   ├── DUC.py                       # DUC network block
    │   ├── FastPose.py                  # FastPose network definition
    │   ├── network_with_loss.py         # Loss function definition
    │   ├── SE_module.py                 # SE network block
    │   └── SE_Resnet.py                 # ResNet-50 network block
    ├── demo.py                          # Demo
    ├── data_to_bin.py                   # Convert dataset images to binary
    ├── default_config.yaml              # Default configuration file
    ├── requirements.txt                 # pip requirements
    ├── export.py                        # Convert a ckpt model file to MINDIR
    ├── postprocess.py                   # 310 inference post-processing (accuracy calculation)
    ├── eval.py                          # Evaluate the network
    └── train.py                         # Train the network
```
## Script parameters
Configure relevant parameters in ```default_config.yaml```.
- Configure model related parameters:
```python
MODEL_INIT_WEIGHTS = True # Initialize model weights
MODEL_PRETRAINED = 'resnet50.ckpt' # pretrained model
MODEL_NUM_JOINTS = 17 # number of key points
MODEL_IMAGE_SIZE = [192, 256] # image size
```
- Configure network related parameters:
```python
NETWORK_NUM_LAYERS = 50 # Resnet backbone network layers
NETWORK_DECONV_WITH_BIAS = False # network deconvolution bias
NETWORK_NUM_DECONV_LAYERS = 3 # The number of network deconvolution layers
NETWORK_NUM_DECONV_FILTERS = [256, 256, 256] # Deconvolution layer filter size
NETWORK_NUM_DECONV_KERNELS = [4, 4, 4] # Deconvolution layer kernel size
NETWORK_FINAL_CONV_KERNEL = 1 # Final convolutional layer kernel size
NETWORK_HEATMAP_SIZE = [48, 64] # Heatmap size
```
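For illustration, the sketch below shows how a Gaussian training target of size NETWORK_HEATMAP_SIZE with NETWORK_SIGMA could be generated for one joint; it is a simplified stand-in for the target generation in `src/dataset.py`, not a copy of it.
```python
import numpy as np

def gaussian_target(joint_xy, heatmap_size=(48, 64), sigma=2):
    """A 2-D Gaussian centred on one joint, in heatmap coordinates (x, y)."""
    width, height = heatmap_size
    xs = np.arange(width, dtype=np.float32)
    ys = np.arange(height, dtype=np.float32)[:, None]
    x0, y0 = joint_xy
    return np.exp(-((xs - x0) ** 2 + (ys - y0) ** 2) / (2.0 * sigma ** 2))

heatmap = gaussian_target((24, 32))  # peak value 1.0 at heatmap position x=24, y=32
```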
- Configure training related parameters:
```python
TRAIN_SHUFFLE = True # training data in random order
TRAIN_BATCH_SIZE = 64 # training batch size
DATASET_FLIP = True # The dataset is randomly flipped
DATASET_SCALE_FACTOR = 0.3 # dataset random scale factor
DATASET_ROT_FACTOR = 40 # Dataset random rotation factor
TRAIN_BEGIN_EPOCH = 0 # starting epoch
TRAIN_END_EPOCH = 270 # final epoch
TRAIN_LR = 0.001 # initial learning rate
TRAIN_LR_FACTOR = 0.1 # Learning rate reduction factor
```
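The schedule built by `get_lr` in `train.py` appears to be a per-step piecewise-constant schedule: the rate starts at TRAIN_LR and is multiplied by TRAIN_LR_FACTOR at every epoch listed in TRAIN_LR_STEP. A rough equivalent (an assumed sketch, not a copy of the repository code):
```python
import numpy as np

def step_lr(begin_epoch=0, end_epoch=270, steps_per_epoch=292,
            lr_init=0.001, factor=0.1, drop_epochs=(170, 200)):
    """Per-step learning rates with a step decay at each epoch in drop_epochs."""
    rates = []
    for epoch in range(begin_epoch, end_epoch):
        scale = factor ** sum(epoch >= e for e in drop_epochs)
        rates.extend([lr_init * scale] * steps_per_epoch)
    return np.array(rates, dtype=np.float32)

lr = step_lr()  # 0.001 until epoch 170, then 0.0001, then 0.00001 from epoch 200
```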
- Configure test related parameters:
```python
TEST_BATCH_SIZE = 32 # test batch size
TEST_FLIP_TEST = True # flip validation
TEST_USE_GT_BBOX = False # Use gt boxes
```
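When TEST_FLIP_TEST is enabled, evaluation also runs the network on the horizontally flipped image and averages the two heatmap predictions (see `eval.py`). A simplified NumPy sketch of that logic, with `model_fn` standing in for the network:
```python
import numpy as np

# Left/right joint pairs that swap under a horizontal flip (same pairs as src/dataset.py).
FLIP_PAIRS = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]

def flip_test(model_fn, images):
    """Average NKHW heatmaps predicted for an NCHW batch and its mirror image."""
    output = model_fn(images)
    flipped = model_fn(images[:, :, :, ::-1])[:, :, :, ::-1]  # predict on mirror, mirror back
    for left, right in FLIP_PAIRS:                            # swap left/right joint channels
        flipped[:, [left, right]] = flipped[:, [right, left]]
    flipped[:, :, :, 1:] = flipped[:, :, :, :-1]              # TEST_SHIFT_HEATMAP alignment shift
    return (output + flipped) * 0.5
```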
- Configure nms related parameters:
```python
TEST_OKS_THRE = 0.9 # OKS threshold
TEST_IN_VIS_THRE = 0.2 # keypoint visibility threshold
TEST_BBOX_THRE = 1.0 # candidate box threshold
TEST_IMAGE_THRE = 0.0 # image threshold
TEST_NMS_THRE = 1.0 # nms threshold
```
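TEST_OKS_THRE is the threshold used for pose-level non-maximum suppression based on Object Keypoint Similarity (OKS). A simplified sketch of OKS-based greedy NMS is shown below; the per-keypoint constants follow the COCO keypoint evaluation protocol, while the actual implementation lives in `src/utils/nms.py`.
```python
import numpy as np

# COCO per-keypoint falloff constants (nose, eyes, ears, shoulders, ..., ankles).
KPT_SIGMAS = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72,
                       .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0

def oks(kpts_a, kpts_b, area, vis):
    """Object Keypoint Similarity between two (17, 2) keypoint arrays."""
    d2 = np.sum((kpts_a - kpts_b) ** 2, axis=1)
    variances = (2.0 * KPT_SIGMAS) ** 2
    ks = np.exp(-d2 / (2.0 * variances * (area + np.spacing(1))))
    visible = vis > 0
    return ks[visible].mean() if visible.any() else 0.0

def oks_nms(poses, scores, areas, vis, thre=0.9):
    """Greedily keep high-scoring poses; drop poses whose OKS with a kept pose exceeds thre."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        sims = np.array([oks(poses[best], poses[j], areas[j], vis[j]) for j in order[1:]])
        order = order[1:][sims <= thre]
    return keep
```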
- Configure demo related parameters:
```python
detect_image = "images/1.jpg" # Detect pictures
yolo_image_size = [416, 416] # yolo network input image size
yolo_ckpt = "yolo/yolo.ckpt" # yolo network weight
fast_pose_ckpt = "fastpose.ckpt" # fastpose network weights
yolo_threshold = 0.1 # bbox threshold
```
## Training process
### Usage
#### Ascend processor environment running
```bash
# Distributed training
bash scripts/run_distribute_train.sh [RANK_TABLE]
# Stand-alone training
bash scripts/run_standalone_train.sh [DEVICE_ID]
# Run the evaluation example
bash scripts/run_eval.sh [DEVICE_TARGET] [CONFIG] [CKPT_PATH] [DATASET]
```
#### GPU environment
```bash
# Distributed training
bash scripts/run_distribute_train_gpu.sh [DEVICE_NUM] [VISIBLE_DEVICES(0,1,2,3,4,5,6,7)] [config_file] [dataset_dir] [pretrained_backbone]
# Stand-alone training
bash scripts/run_standalone_train_gpu.sh [config_file] [dataset_dir] [pretrained_backbone]
# Run the evaluation example
bash scripts/run_eval.sh [DEVICE_TARGET] [CONFIG] [CKPT_PATH] [DATASET]
```
### Result
- Train Alphapose with COCO2017 dataset
```text
Distributed training results (8P)
epoch:1 step:292, loss is 0.001391
epoch:2 step:292, loss is 0.001326
epoch:3 step:292, loss is 0.001001
epoch:4 step:292, loss is 0.0007763
epoch:5 step:292, loss is 0.0006757
...
epoch:268 step:292, loss is 0.0002837
epoch:269 step:292, loss is 0.0002367
epoch:270 step:292, loss is 0.0002532
```
## Evaluation process
### Usage
#### Ascend processor environment running
The corresponding model can be evaluated by changing the "TEST_MODEL_FILE" parameter in the config file.
```bash
# evaluate
bash scripts/run_eval.sh [DEVICE_TARGET] [CONFIG] [CKPT_PATH] [DATASET]
```
#### GPU environment
```bash
# Run the evaluation example
bash scripts/run_eval.sh [DEVICE_TARGET] [CONFIG] [CKPT_PATH] [DATASET]
```
### Result
AlphaPose is evaluated on val2017 from the COCO2017 dataset folder; the results are as follows:
```text
coco eval results saved to /cache/train_output/multi_train_poseresnet_v5_2-140_2340/keypoints_results.pkl
AP: 0.723
```
## 310 Inference Process
### Usage
#### Export model
```bash
# export model
python export.py --ckpt_url [ckpt_url] --device_target [device_target] --device_id [device_id] --file_name [file_name] --file_format [file_format]
```
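The command above wraps `export.py`; in essence the script does the following (a condensed sketch of the code shown later in this commit, with the checkpoint path as a placeholder):
```python
import numpy as np
import mindspore.common.dtype as ms
from mindspore import Tensor, context, export, load_checkpoint, load_param_into_net
from src.FastPose import createModel

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
net = createModel()
load_param_into_net(net, load_checkpoint("FastPose.ckpt"))  # placeholder checkpoint path
# The network expects NCHW input at the 256x192 resolution used for training.
dummy_input = Tensor(np.ones([1, 3, 256, 192]), ms.float32)
export(net, dummy_input, file_name="simple_baselines", file_format="MINDIR")
```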
#### Ascend310 processor environment running
```bash
# 310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [NEED_PREPROCESS] [DEVICE_ID]
```
#### Acquire accuracy
```bash
# Acquire accuracy
more acc.log
```
### Result
```text
AP: 0.723
```
# Model description
## Performance
### Evaluation performance
#### Performance parameters on coco2017
| parameter | Ascend | GPU |
| -------------------------- | ------------------------------------------ | ------------------ |
| model version | ResNet50 | ResNet50 |
| resource | Ascend 910; CPU 2.60 GHz, 192 cores; memory: 755 GB | 8p RTX 3090 24GB |
| upload date | 2020-12-16 | 2022-02-16 |
| MindSpore version | 1.3 | 1.6 |
| data set | coco2017 | coco2017 |
| training parameters | epoch=270, steps=2336, batch_size = 64, lr=0.001 | epoch=270, batch_size = 128, lr=0.001 |
| optimizer | Adam | Adam |
| loss function | Mean Squared Error | Mean Squared Error |
| output | heatmap | heatmap |
| loss | 0.00025 | 0.00026 |
| speed | 1p: 138.9 ms/step; 8p: 147.28 ms/step | 8p: 441 ms/step |
| total duration | 1p: 24h 22m 36s; 8p: 3h 13m 31s | 8p: 04h 48m 00s |
| parameter(M) | 13.0 | 13.0 |
| Fine-tune checkpoints | 389.64 MB (.ckpt file) | 338 MB (.ckpt) |
| inference model | 57.26 MB (.om file), 112.76 MB (.MINDIR file) | - |
### Inference performance
#### Performance parameters on coco2017
| parameter | Ascend | GPU |
| ------------------- | ----------------------- | ------------ |
| model version | ResNet50 | ResNet50 |
| resource | Ascend 910 | RTX 3090 24 GB |
| upload date | 2020-12-16 | 2022-02-16 |
| MindSpore Version | 1.3 | 1.6 |
| data set | coco2017 | coco2017 |
| batch_size | 32 | 32 |
| output | heatmap | heatmap |
| accuracy | 1p: 72.3%; 8p: 72.5% | 72.2 % |
| inference model | 389.64 MB (.ckpt file) | 338 MB (.ckpt) |
# Random Seed Description
Random seeds are set from the configuration: `src/dataset.py` calls `ds.config.set_seed` with DATASET_SEED, while `train.py` and `eval.py` call `set_seed` with TRAIN_SEED and EVAL_SEED respectively; the initial network weights are set when the model is created in `src/FastPose.py`.
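For reference, the entry scripts set these seeds through `mindspore.common.set_seed`, for example:
```python
from mindspore.common import set_seed
from src.config import config

set_seed(config.TRAIN_SEED)  # train.py; eval.py uses config.EVAL_SEED instead
```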
# ModelZoo Homepage
Please visit the official website [homepage](https://gitee.com/mindspore/models).
# general
DEVICE_ID: 0
DEVICE_TARGET: ''
VERSION: 'commit'
TRAIN_SEED: 1
EVAL_SEED: 1
DATASET_SEED: 1
RUN_DISTRIBUTE: False
SUMMARY_DIR: './summary'
# modelarts
MODELARTS_IS_MODEL_ARTS: False
MODELARTS_DATA_URL: ''
MODELARTS_TRAIN_URL: ''
MODELARTS_CACHE_INPUT: '/cache/data_tzh/'
MODELARTS_CACHE_OUTPUT: '/cache/train_out/'
# model network parameters
MODEL_IS_TRAINED: False # Initially True
MODEL_INIT_WEIGHTS: True
MODEL_PRETRAINED: 'resnet50.ckpt'
MODEL_NUM_JOINTS: 17
MODEL_IMAGE_SIZE: [192, 256] # input image size
# network
NETWORK_NUM_LAYERS: 50
NETWORK_DECONV_WITH_BIAS: False
NETWORK_NUM_DECONV_LAYERS: 3
NETWORK_NUM_DECONV_FILTERS: [256, 256, 256]
NETWORK_NUM_DECONV_KERNELS: [4, 4, 4]
NETWORK_FINAL_CONV_KERNEL: 1
NETWORK_REVERSE: True
NETWORK_TARGET_TYPE: 'gaussian'
NETWORK_HEATMAP_SIZE: [48, 64]
NETWORK_SIGMA: 2
# loss
LOSS_USE_TARGET_WEIGHT: True
# dataset
DATASET_TYPE: 'COCO'
DATASET_ROOT: ''
DATASET_TRAIN_SET: 'train2017'
DATASET_TRAIN_JSON: 'annotations/person_keypoints_train2017.json'
DATASET_TEST_SET: 'val2017'
DATASET_TEST_JSON: 'annotations/person_keypoints_val2017.json'
# training data augmentation
DATASET_FLIP: True
DATASET_SCALE_FACTOR: 0.3
DATASET_ROT_FACTOR: 40
# train
TRAIN_SHUFFLE: True
TRAIN_BATCH_SIZE: 128 # 128 in original paper
TRAIN_BEGIN_EPOCH: 0
TRAIN_END_EPOCH: 270 # 140 in original paper
TRAIN_LR: 0.001
TRAIN_LR_FACTOR: 0.1
TRAIN_LR_STEP: [170, 200]
TRAIN_NUM_PARALLEL_WORKERS: 6
TRAIN_SAVE_CKPT: True
TRAIN_nClasses: 17
TRAIN_CKPT_PATH: "./"
# valid
TEST_device_target: "Ascend"
TEST_device_id: 0
TEST_BATCH_SIZE: 32
TEST_FLIP_TEST: True
TEST_POST_PROCESS: True
TEST_SHIFT_HEATMAP: True
TEST_USE_GT_BBOX: False
TEST_NUM_PARALLEL_WORKERS: 2
TEST_MODEL_FILE: "FastPose.ckpt"
TEST_COCO_BBOX_FILE: '/COCO_BBOX_FILE/COCO_val2017_detections_AP_H_56_person.json'
TEST_OUTPUT_DIR: 'results/'
# export
file_name: 'simple_baselines'
ckpt_url: 'FastPose.ckpt'
file_format: 'MINDIR'
device_target: 'CPU'
device_id: 0
# demo
detect_image: "demo.jpg"
yolo_image_size: [416, 416]
yolo_ckpt: "yolov3.ckpt"
fast_pose_ckpt: "FastPose.ckpt"
# eval
checkpoint_path: ''
# confidence under ignore_threshold means no object when training
yolo_threshold: 0.1
save_bbox_image: True
result_path: "demo_result/"
# nms
TEST_OKS_THRE: 0.9
TEST_IN_VIS_THRE: 0.2
TEST_BBOX_THRE: 1.0
TEST_IMAGE_THRE: 0.0
TEST_NMS_THRE: 1.0
# 310 infer-related
INFER_PRE_RESULT_PATH: '_/preprocess_Result'
INFER_POST_RESULT_PATH: '_/result_Files'
@@ -66,15 +66,15 @@ def inference(bboxes):
'''
inference
'''
image_width = config.MODEL.IMAGE_SIZE[0]
image_height = config.MODEL.IMAGE_SIZE[1]
image_width = config.MODEL_IMAGE_SIZE[0]
image_height = config.MODEL_IMAGE_SIZE[1]
aspect_ratio = image_width * 1.0 / image_height
scales, centers = bbox2sc(bboxes, aspect_ratio)
model = createModel()
ckpt_name = config.fast_pose_ckpt
print('loading model fastpose_ckpt from {}'.format(ckpt_name))
load_param_into_net(model, load_checkpoint(ckpt_name))
image_size = np.array(config.MODEL.IMAGE_SIZE, dtype=np.int32)
image_size = np.array(config.MODEL_IMAGE_SIZE, dtype=np.int32)
data_numpy = cv.imread(config.detect_image, cv.IMREAD_COLOR | cv.IMREAD_IGNORE_ORIENTATION)
@@ -82,7 +82,7 @@ def inference(bboxes):
inputs = []
bbox_num = bboxes.shape[0]
image_size = np.array(config.MODEL.IMAGE_SIZE, dtype=np.int32)
image_size = np.array(config.MODEL_IMAGE_SIZE, dtype=np.int32)
for i in range(bbox_num):
s, c = scales[i], centers[i]
trans = get_affine_transform(c, s, 0, image_size, inv=0)
@@ -91,12 +91,12 @@ def inference(bboxes):
inputs.append(image_data)
inputs = np.array(inputs, dtype=np.float32)
output = model(Tensor(inputs, float32)).asnumpy()
if config.TEST.FLIP_TEST:
if config.TEST_FLIP_TEST:
inputs_flipped = Tensor(inputs[:, :, :, ::-1], float32)
output_flipped = model(inputs_flipped)
output_flipped = flip_back(output_flipped.asnumpy(), flip_pairs)
if config.TEST.SHIFT_HEATMAP:
if config.TEST_SHIFT_HEATMAP:
output_flipped[:, :, :, 1:] = \
output_flipped.copy()[:, :, :, 0:-1]
@@ -127,7 +127,7 @@ def DataWrite(result):
def main():
context.set_context(mode=context.GRAPH_MODE,
device_target="Ascend", save_graphs=False, device_id=5)
device_target=config.DEVICE_TARGET, save_graphs=False)
bboxes = detect_bbox()
pose_preds, pose_scores = inference(bboxes)
......
@@ -17,7 +17,6 @@ This file evaluates the model used.
'''
from __future__ import division
import argparse
import os
import time
import numpy as np
@@ -34,25 +34,20 @@ from src.utils.coco import evaluate
from src.utils.transforms import flip_back
from src.utils.inference import get_final_preds
if config.MODELARTS.IS_MODEL_ARTS:
if config.MODELARTS_IS_MODEL_ARTS:
import moxing as mox
set_seed(config.GENERAL.EVAL_SEED)
device_id = int(os.getenv('DEVICE_ID'))
set_seed(config.EVAL_SEED)
device_id = int(os.getenv('DEVICE_ID', '0'))
def parse_args():
parser = argparse.ArgumentParser(description='Evaluate')
parser.add_argument('--checkpoint_path', type=str, default=None, help='Checkpoint file path')
args = parser.parse_args()
return args
def validate(cfg, val_dataset, model, output_dir, ann_path):
'''
validate
'''
model.set_train(False)
num_samples = val_dataset.get_dataset_size() * cfg.TEST.BATCH_SIZE
all_preds = np.zeros((num_samples, cfg.MODEL.NUM_JOINTS, 3),
num_samples = val_dataset.get_dataset_size() * cfg.TEST_BATCH_SIZE
all_preds = np.zeros((num_samples, cfg.MODEL_NUM_JOINTS, 3),
dtype=np.float32)
all_boxes = np.zeros((num_samples, 2))
image_id = []
@@ -62,12 +56,12 @@ def validate(cfg, val_dataset, model, output_dir, ann_path):
for item in val_dataset.create_dict_iterator():
inputs = item['image'].asnumpy()
output = model(Tensor(inputs, float32)).asnumpy()
if cfg.TEST.FLIP_TEST:
if cfg.TEST_FLIP_TEST:
inputs_flipped = Tensor(inputs[:, :, :, ::-1], float32)
output_flipped = model(inputs_flipped)
output_flipped = flip_back(output_flipped.asnumpy(), flip_pairs)
if cfg.TEST.SHIFT_HEATMAP:
if cfg.TEST_SHIFT_HEATMAP:
output_flipped[:, :, :, 1:] = \
output_flipped.copy()[:, :, :, 0:-1]
@@ -98,25 +92,25 @@ def validate(cfg, val_dataset, model, output_dir, ann_path):
def main():
args = parse_args()
context.set_context(mode=context.GRAPH_MODE,
device_target=config.TEST.device_target,
device_id=config.TEST.device_id)
device_target=config.TEST_device_target,
device_id=config.TEST_device_id)
if config.MODELARTS.IS_MODEL_ARTS:
mox.file.copy_parallel(src_url=args.data_url, dst_url=config.MODELARTS.CACHE_INPUT)
if config.MODELARTS_IS_MODEL_ARTS:
mox.file.copy_parallel(src_url=config.MODELARTS_DATA_URL,
dst_url=config.MODELARTS_CACHE_INPUT)
model = createModel()
if config.MODELARTS.IS_MODEL_ARTS:
ckpt_name = config.MODELARTS.CACHE_INPUT
if config.MODELARTS_IS_MODEL_ARTS:
ckpt_name = config.MODELARTS_CACHE_INPUT
else:
ckpt_name = config.DATASET.ROOT
ckpt_name = ckpt_name + config.TEST.MODEL_FILE
ckpt_name = ''
ckpt_name = ckpt_name + config.TEST_MODEL_FILE
if args.checkpoint_path is not None:
param_dict = load_checkpoint(args.checkpoint_path)
print("load checkpoint from [{}].".format(args.checkpoint_path))
if config.checkpoint_path != '':
param_dict = load_checkpoint(config.checkpoint_path)
print("load checkpoint from [{}].".format(config.checkpoint_path))
else:
param_dict = load_checkpoint(ckpt_name)
print("load checkpoint from [{}].".format(ckpt_name))
@@ -125,25 +119,26 @@ def main():
valid_dataset = CreateDatasetCoco(
train_mode=False,
num_parallel_workers=config.TEST.NUM_PARALLEL_WORKERS,
num_parallel_workers=config.TEST_NUM_PARALLEL_WORKERS,
)
ckpt_name = ckpt_name.split('/')
ckpt_name = ckpt_name[len(ckpt_name) - 1]
ckpt_name = ckpt_name.split('.')[0]
if config.MODELARTS.IS_MODEL_ARTS:
output_dir = config.MODELARTS.CACHE_OUTPUT
ann_path = config.MODELARTS.CACHE_INPUT
if config.MODELARTS_IS_MODEL_ARTS:
output_dir = config.MODELARTS_CACHE_OUTPUT
ann_path = config.MODELARTS_CACHE_INPUT
else:
output_dir = config.TEST.OUTPUT_DIR
ann_path = config.DATASET.ROOT
output_dir = config.TEST_OUTPUT_DIR
ann_path = config.DATASET_ROOT
output_dir = output_dir + ckpt_name
ann_path = ann_path + config.DATASET.TEST_JSON
ann_path = os.path.join(ann_path, config.DATASET_TEST_JSON)
validate(config, valid_dataset, model, output_dir, ann_path)
if config.MODELARTS.IS_MODEL_ARTS:
mox.file.copy_parallel(src_url=config.MODELARTS.CACHE_OUTPUT, dst_url=args.train_url)
if config.MODELARTS_IS_MODEL_ARTS:
mox.file.copy_parallel(src_url=config.MODELARTS_CACHE_OUTPUT,
dst_url=config.MODELARTS_TRAIN_URL)
if __name__ == '__main__':
main()
@@ -16,34 +16,23 @@
##############export checkpoint file into air, onnx, mindir models#################
python export.py
"""
import argparse
import numpy as np
import mindspore.common.dtype as ms
from mindspore import Tensor, load_checkpoint, load_param_into_net, export, context
from src.FastPose import createModel
parser = argparse.ArgumentParser(description='simple_baselines')
parser.add_argument("--device_target", type=str, choices=["Ascend", "GPU", "CPU"], default="CPU",
help="device target")
parser.add_argument("--device_id", type=int, default=0, help="Device id")
parser.add_argument("--ckpt_url",
default="FastPose.ckpt",
help="Checkpoint file path.")
parser.add_argument("--file_name", type=str,
default="simple_baselines", help="output file name.")
parser.add_argument('--file_format', type=str, choices=["MINDIR", "AIR"],
default='MINDIR', help='file format')
args = parser.parse_args()
from src.config import config
context.set_context(mode=context.GRAPH_MODE,
device_target=args.device_target,
device_id=args.device_id)
device_target=config.device_target,
device_id=config.device_id)
if __name__ == '__main__':
net = createModel()
# assert cfg.checkpoint_dir is not None, "cfg.checkpoint_dir is None."
param_dict = load_checkpoint(args.ckpt_url)
param_dict = load_checkpoint(config.ckpt_url)
load_param_into_net(net, param_dict)
input_arr = Tensor(np.ones([1, 3, 256, 192]), ms.float32)
export(net, input_arr, file_name=args.file_name,
file_format=args.file_format)
export(net, input_arr, file_name=config.file_name,
file_format=config.file_format)
@@ -58,11 +58,11 @@ def get_acc(cfg, result_path, npy_path):
for i in range(num_samples):
f1 = os.path.join(result_path, str(i)+"_0.bin")
output = np.fromfile(f1, np.float32).reshape(out_shape)
if cfg.TEST.FLIP_TEST:
if cfg.TEST_FLIP_TEST:
f2 = os.path.join(result_path, "flipped"+str(i)+"_0.bin")
output_flipped = np.fromfile(f2, np.float32).reshape(out_shape)
output_flipped = flip_back(output_flipped, flip_pairs)
if cfg.TEST.SHIFT_HEATMAP:
if cfg.TEST_SHIFT_HEATMAP:
output_flipped[:, :, :, 1:] = \
output_flipped.copy()[:, :, :, 0:-1]
@@ -83,7 +83,7 @@ def get_acc(cfg, result_path, npy_path):
idx += num_images
output_dir = "result/"
ann_path = config.DATASET.ROOT + config.DATASET.TEST_JSON
ann_path = config.DATASET_ROOT + config.DATASET_TEST_JSON
_, perf_indicator = evaluate(cfg, all_preds[:idx], output_dir, all_boxes[:idx], image_id, ann_path)
print("AP:", perf_indicator)
return perf_indicator
......
opencv-python
pycocotools
easydict
PyYAML
@@ -16,8 +16,8 @@
echo "========================================================================"
echo "Please run the script as: "
echo "bash run.sh RANK_TABLE"
echo "For example: bash run_distribute.sh RANK_TABLE"
echo "bash run_distribute_train.sh RANK_TABLE"
echo "For example: bash run_distribute_train.sh RANK_TABLE"
echo "It is better to use the absolute path."
echo "========================================================================"
set -e
@@ -50,6 +50,7 @@ do
cd src
mkdir utils
cd ../../../
cp ./default_config.yaml ./distribute_train/device$i
cp ./train.py ./distribute_train/device$i
cp ./src/*.py ./distribute_train/device$i/src
cp ./src/utils/*.py ./distribute_train/device$i/src/utils
@@ -58,7 +59,7 @@ do
export RANK_ID=$i
echo "start training for device $i"
env > env$i.log
python train.py --is_model_arts False --run_distribute True > train$i.log 2>&1 &
python train.py --DEVICE_TARGET Ascend --MODELARTS_IS_MODEL_ARTS False --RUN_DISTRIBUTE True > train$i.log 2>&1 &
echo "$i finish"
cd ../
done
......
#!/bin/bash
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 5 ]; then
echo "Usage:
bash run_distribute_train_gpu.sh [DEVICE_NUM] [VISIBLE_DEVICES(0,1,2,3,4,5,6,7)] [config_file] [dataset_dir] [pretrained_backbone]
"
exit 1
fi
get_real_path() {
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
BASE_PATH=$(cd ./"`dirname $0`" || exit; pwd)
CONFIG=$(get_real_path $3)
echo "CONFIG: "$CONFIG
DATASET=$(get_real_path $4)
echo "DATASET: "$DATASET
BACKBONE=$(get_real_path $5)
echo "BACKBONE: "$BACKBONE
if [ ! -f $CONFIG ]
then
echo "error: config=$CONFIG is not a file."
exit 1
fi
if [ ! -d $DATASET ]
then
echo "error: dataset_root=$DATASET is not a directory."
exit 1
fi
if [ ! -f $BACKBONE ]
then
echo "error: pretrained_backbone=$BACKBONE is not a file."
exit 1
fi
if [ -d "$BASE_PATH/../train_parallel" ];
then
rm -rf $BASE_PATH/../train_parallel
fi
mkdir $BASE_PATH/../train_parallel
cd $BASE_PATH/../train_parallel || exit
export CUDA_VISIBLE_DEVICES="$2"
export PYTHONPATH=${BASE_PATH}:$PYTHONPATH
echo "start training on multiple GPUs"
env > env.log
echo
mpirun -n $1 --allow-run-as-root --output-filename log_output --merge-stderr-to-stdout \
python -u ${BASE_PATH}/../train.py --DEVICE_TARGET GPU --RUN_DISTRIBUTE True \
--config_path $CONFIG --DATASET_ROOT $DATASET --MODEL_PRETRAINED $BACKBONE &> train.log &
@@ -13,6 +13,62 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
CKPT_PATH=$1
export DEVICE_ID=$2
python eval.py --checkpoint_path $CKPT_PATH > eval_log$2.txt 2>&1 &
if [ $# != 4 ]; then
echo "Usage:
bash run_eval.sh [DEVICE_TARGET] [CONFIG] [CKPT_PATH] [DATASET]
"
exit 1
fi
get_real_path() {
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
BASE_PATH=$(cd ./"`dirname $0`" || exit; pwd)
DEVICE_TARGET=$1
CONFIG=$(get_real_path $2)
echo "CONFIG: "$CONFIG
CKPT_PATH=$(get_real_path $3)
echo "CKPT_PATH: "$CKPT_PATH
DATASET=$(get_real_path $4)
echo "CKPT_PATH: "$DATASET
if [ ! -f $CONFIG ]
then
echo "error: config=$CONFIG is not a file."
exit 1
fi
if [ ! -d $DATASET ]
then
echo "error: dataset_root=$DATASET is not a directory."
exit 1
fi
if [ ! -f $CKPT_PATH ]
then
echo "error: CKPT_PATH=$CKPT_PATH is not a file."
exit 1
fi
if [ -d "$BASE_PATH/../eval" ];
then
rm -rf $BASE_PATH/../eval
fi
mkdir $BASE_PATH/../eval
cd $BASE_PATH/../eval || exit
export PYTHONPATH=${BASE_PATH}:$PYTHONPATH
echo "start eval"
env > env.log
echo
python $BASE_PATH/../eval.py --TEST_device_target $DEVICE_TARGET --config_path $CONFIG --checkpoint_path $CKPT_PATH --MODEL_PRETRAINED $CKPT_PATH --DATASET_ROOT $DATASET &> eval.log &
@@ -21,5 +21,5 @@ echo "It is better to use the absolute path."
echo "========================================================================"
echo "start training for device $DEVICE_ID"
export DEVICE_ID=$1
python -u ../train.py --device_id ${DEVICE_ID} > train${DEVICE_ID}.log 2>&1 &
echo "finish"
\ No newline at end of file
python -u ../train.py --DEVICE_TARGET Ascend --DEVICE_ID ${DEVICE_ID} > train${DEVICE_ID}.log 2>&1 &
echo "finish"
#!/bin/bash
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 3 ]; then
echo "Usage:
bash run_standalone_train_gpu.sh [config_file] [dataset_dir] [pretrained_backbone]
"
exit 1
fi
get_real_path() {
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
BASE_PATH=$(cd ./"`dirname $0`" || exit; pwd)
CONFIG=$(get_real_path $1)
echo "CONFIG: "$CONFIG
DATASET=$(get_real_path $2)
echo "DATASET: "$DATASET
BACKBONE=$(get_real_path $3)
echo "BACKBONE: "$BACKBONE
if [ ! -f $CONFIG ]
then
echo "error: config=$CONFIG is not a file."
exit 1
fi
if [ ! -d $DATASET ]
then
echo "error: dataset_root=$DATASET is not a directory."
exit 1
fi
if [ ! -f $BACKBONE ]
then
echo "error: pretrained_backbone=$BACKBONE is not a file."
exit 1
fi
if [ -d "$BASE_PATH/../train" ];
then
rm -rf $BASE_PATH/../train
fi
mkdir $BASE_PATH/../train
cd $BASE_PATH/../train || exit
export PYTHONPATH=${BASE_PATH}:$PYTHONPATH
echo "start training on single GPU"
env > env.log
echo
python -u ${BASE_PATH}/../train.py --DEVICE_TARGET GPU --config_path $CONFIG \
--DATASET_ROOT $DATASET --MODEL_PRETRAINED $BACKBONE &> train.log &
@@ -15,17 +15,21 @@
'''
Alphapose network
'''
import os
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import load_checkpoint, load_param_into_net
from src.DUC import DUC
from src.SE_Resnet import SEResnet
from src.config import config
if config.MODELARTS.IS_MODEL_ARTS:
pretrained = config.MODELARTS.CACHE_INPUT + config.MODEL.PRETRAINED
if config.MODELARTS_IS_MODEL_ARTS:
pretrained = os.path.join(config.MODELARTS_CACHE_INPUT, config.MODEL_PRETRAINED)
else:
pretrained = config.TRAIN.CKPT_PATH + config.MODEL.PRETRAINED
pretrained = os.path.join(config.MODEL_PRETRAINED)
def createModel():
'''
@@ -49,7 +53,7 @@ class FastPose_SE(nn.Cell):
self.duc2 = DUC(256, 512, upscale_factor=2)
self.conv_out = nn.Conv2d(
self.conv_dim, config.TRAIN.nClasses, kernel_size=3, stride=1, pad_mode='pad', padding=1, has_bias=True)
self.conv_dim, config.TRAIN_nClasses, kernel_size=3, stride=1, pad_mode='pad', padding=1, has_bias=True)
def construct(self, x):
'''
construct
......
@@ -12,127 +12,116 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
'''
config
'''
from easydict import EasyDict as edict
config = edict()
# general
config.GENERAL = edict()
config.GENERAL.VERSION = 'commit'
config.GENERAL.TRAIN_SEED = 1
config.GENERAL.EVAL_SEED = 1
config.GENERAL.DATASET_SEED = 1
config.GENERAL.RUN_DISTRIBUTE = True
# model arts
config.MODELARTS = edict()
config.MODELARTS.IS_MODEL_ARTS = False
config.MODELARTS.CACHE_INPUT = '/cache/data_tzh/'
config.MODELARTS.CACHE_OUTPUT = '/cache/train_out/'
# model 网络参数
config.MODEL = edict()
config.MODEL.IS_TRAINED = False # 初始是True
config.MODEL.INIT_WEIGHTS = True
config.MODEL.PRETRAINED = 'resnet50.ckpt'
config.MODEL.NUM_JOINTS = 17
config.MODEL.IMAGE_SIZE = [192, 256] # 输入图像大小
#config.MODEL.IMAGE_SIZE = [256,320]
# network
config.NETWORK = edict()
config.NETWORK.NUM_LAYERS = 50
config.NETWORK.DECONV_WITH_BIAS = False
config.NETWORK.NUM_DECONV_LAYERS = 3
config.NETWORK.NUM_DECONV_FILTERS = [256, 256, 256]
config.NETWORK.NUM_DECONV_KERNELS = [4, 4, 4]
config.NETWORK.FINAL_CONV_KERNEL = 1
config.NETWORK.REVERSE = True
config.NETWORK.TARGET_TYPE = 'gaussian'
config.NETWORK.HEATMAP_SIZE = [48, 64]
#config.NETWORK.HEATMAP_SIZE = [64, 80]
config.NETWORK.SIGMA = 2
# loss
config.LOSS = edict()
config.LOSS.USE_TARGET_WEIGHT = True
# dataset
config.DATASET = edict()
config.DATASET.TYPE = 'COCO'
config.DATASET.ROOT = '/COCO2017/'
config.DATASET.TRAIN_SET = 'images'
config.DATASET.TRAIN_JSON = 'annotations/person_keypoints_train2017.json'
config.DATASET.TEST_SET = 'images'
config.DATASET.TEST_JSON = 'annotations/person_keypoints_val2017.json'
# training data augmentation
config.DATASET.FLIP = True
config.DATASET.SCALE_FACTOR = 0.3
config.DATASET.ROT_FACTOR = 40
# train
config.TRAIN = edict()
config.TRAIN.SHUFFLE = True
config.TRAIN.BATCH_SIZE = 64
config.TRAIN.BEGIN_EPOCH = 0
config.TRAIN.END_EPOCH = 270
config.TRAIN.LR = 0.001
config.TRAIN.LR_FACTOR = 0.1
config.TRAIN.LR_STEP = [90, 120]
config.TRAIN.NUM_PARALLEL_WORKERS = 8
config.TRAIN.SAVE_CKPT = True
config.TRAIN.nClasses = 17
config.TRAIN.CKPT_PATH = "/CKPT_PATH/"
# valid
config.TEST = edict()
config.TEST.device_target = "Ascend"
config.TEST.device_id = 7
config.TEST.BATCH_SIZE = 32
config.TEST.FLIP_TEST = True
config.TEST.POST_PROCESS = True
config.TEST.SHIFT_HEATMAP = True
config.TEST.USE_GT_BBOX = False
config.TEST.NUM_PARALLEL_WORKERS = 2
config.TEST.MODEL_FILE = "FastPose.ckpt"
config.TEST.COCO_BBOX_FILE = '/COCO_BBOX_FILE/COCO_val2017_detections_AP_H_56_person.json'
config.TEST.OUTPUT_DIR = 'results/'
# demo
config.detect_image = "demo.jpg"
config.yolo_image_size = [416, 416]
config.yolo_ckpt = "yolov3.ckpt"
config.fast_pose_ckpt = "FastPose.ckpt"
# confidence under ignore_threshold means no object when training
config.yolo_threshold = 0.1
config.save_bbox_image = True
config.result_path = "demo_result/"
# nms
config.TEST.OKS_THRE = 0.9
config.TEST.IN_VIS_THRE = 0.2
config.TEST.BBOX_THRE = 1.0
config.TEST.IMAGE_THRE = 0.0
config.TEST.NMS_THRE = 1.0
# 310 infer-related
config.INFER = edict()
config.INFER.PRE_RESULT_PATH = './preprocess_Result'
config.INFER.POST_RESULT_PATH = './result_Files'
# Help description for each configuration
config.enable_modelarts = "Whether training on modelarts, default: False"
config.data_url = "Url for modelarts"
config.train_url = "Url for modelarts"
config.data_path = "The location of the input data."
config.output_path = "The location of the output file."
config.device_target = "Running platform, choose from Ascend, GPU or CPU, and default is Ascend."
config.enable_profiling = 'Whether enable profiling while training, default: False'
# Parameters that can be modified at the terminal
config.ckpt_save_dir = "ckpt path to save"
config.batch_size = "training batch size"
config.run_distribute = "Run distribute, default is false."
"""Parse arguments"""
import os
import ast
import argparse
from pprint import pformat
import yaml
class Config:
"""
Configuration namespace. Convert dictionary to members.
"""
def __init__(self, cfg_dict):
for k, v in cfg_dict.items():
if isinstance(v, (list, tuple)):
setattr(self, k, [Config(x) if isinstance(x, dict) else x for x in v])
else:
setattr(self, k, Config(v) if isinstance(v, dict) else v)
def __str__(self):
return pformat(self.__dict__)
def __repr__(self):
return self.__str__()
def parse_cli_to_yaml(parser, cfg, helper=None, choices=None, cfg_path="default_config.yaml"):
"""
Parse command line arguments to the configuration according to the default yaml.
Args:
parser: Parent parser.
cfg: Base configuration.
helper: Helper description.
cfg_path: Path to the default yaml config.
"""
parser = argparse.ArgumentParser(description="[REPLACE THIS at config.py]",
parents=[parser])
helper = {} if helper is None else helper
choices = {} if choices is None else choices
for item in cfg:
if not isinstance(cfg[item], list) and not isinstance(cfg[item], dict):
help_description = helper[item] if item in helper else "Please reference to {}".format(cfg_path)
choice = choices[item] if item in choices else None
if isinstance(cfg[item], bool):
parser.add_argument("--" + item, type=ast.literal_eval, default=cfg[item], choices=choice,
help=help_description)
else:
parser.add_argument("--" + item, type=type(cfg[item]), default=cfg[item], choices=choice,
help=help_description)
args = parser.parse_args()
return args
def parse_yaml(yaml_path):
"""
Parse the yaml config file.
Args:
yaml_path: Path to the yaml config.
"""
with open(yaml_path, 'r') as fin:
try:
cfgs = yaml.load_all(fin.read(), Loader=yaml.FullLoader)
cfgs = [x for x in cfgs]
if len(cfgs) == 1:
cfg_helper = {}
cfg = cfgs[0]
cfg_choices = {}
elif len(cfgs) == 2:
cfg, cfg_helper = cfgs
cfg_choices = {}
elif len(cfgs) == 3:
cfg, cfg_helper, cfg_choices = cfgs
else:
raise ValueError("At most 3 docs (config, description for help, choices) are supported in config yaml")
print(cfg_helper)
except:
raise ValueError("Failed to parse yaml")
return cfg, cfg_helper, cfg_choices
def merge(args, cfg):
"""
Merge the base config from yaml file and command line arguments.
Args:
args: Command line arguments.
cfg: Base configuration.
"""
args_var = vars(args)
for item in args_var:
cfg[item] = args_var[item]
return cfg
def get_config():
"""
Get Config according to the yaml file and cli arguments.
"""
parser = argparse.ArgumentParser(description="default name", add_help=False)
current_dir = os.path.dirname(os.path.abspath(__file__))
parser.add_argument("--config_path", type=str, default=os.path.join(current_dir, "../default_config.yaml"),
help="Config file path")
path_args, _ = parser.parse_known_args()
default, helper, choices = parse_yaml(path_args.config_path)
args = parse_cli_to_yaml(parser=parser, cfg=default, helper=helper, choices=choices, cfg_path=path_args.config_path)
default = Config(merge(args, default))
return default
config = get_config()
@@ -30,7 +30,7 @@ import mindspore.dataset.vision as C
from src.utils.transforms import fliplr_joints, get_affine_transform, affine_transform
from src.config import config
ds.config.set_seed(config.GENERAL.DATASET_SEED) # Set Random Seed
ds.config.set_seed(config.DATASET_SEED) # Set Random Seed
flip_pairs = [[1, 2], [3, 4], [5, 6], [7, 8],
[9, 10], [11, 12], [13, 14], [15, 16]]
@@ -39,17 +39,17 @@ class CocoDatasetGenerator:
About the specific operations of coco2017 data set processing
'''
def __init__(self, cfg, is_train=False):
self.image_thre = cfg.TEST.IMAGE_THRE
self.image_size = np.array(cfg.MODEL.IMAGE_SIZE, dtype=np.int32)
self.image_width = cfg.MODEL.IMAGE_SIZE[0]
self.image_height = cfg.MODEL.IMAGE_SIZE[1]
self.image_thre = cfg.TEST_IMAGE_THRE
self.image_size = np.array(cfg.MODEL_IMAGE_SIZE, dtype=np.int32)
self.image_width = cfg.MODEL_IMAGE_SIZE[0]
self.image_height = cfg.MODEL_IMAGE_SIZE[1]
self.aspect_ratio = self.image_width * 1.0 / self.image_height
self.heatmap_size = np.array(cfg.NETWORK.HEATMAP_SIZE, dtype=np.int32)
self.sigma = cfg.NETWORK.SIGMA
self.target_type = cfg.NETWORK.TARGET_TYPE
self.scale_factor = cfg.DATASET.SCALE_FACTOR
self.rotation_factor = cfg.DATASET.ROT_FACTOR
self.flip = cfg.DATASET.FLIP
self.heatmap_size = np.array(cfg.NETWORK_HEATMAP_SIZE, dtype=np.int32)
self.sigma = cfg.NETWORK_SIGMA
self.target_type = cfg.NETWORK_TARGET_TYPE
self.scale_factor = cfg.DATASET_SCALE_FACTOR
self.rotation_factor = cfg.DATASET_ROT_FACTOR
self.flip = cfg.DATASET_FLIP
self.db = []
self.is_train = is_train
self.flip_pairs = [[1, 2], [3, 4], [5, 6], [7, 8],
@@ -310,34 +310,34 @@ def CreateDatasetCoco(rank=0,
'''
CreateDatasetCoco
'''
per_batch_size = config.TRAIN.BATCH_SIZE if train_mode else config.TEST.BATCH_SIZE
per_batch_size = config.TRAIN_BATCH_SIZE if train_mode else config.TEST_BATCH_SIZE
image_path = ''
ann_file = ''
bbox_file = ''
if config.MODELARTS.IS_MODEL_ARTS:
image_path = config.MODELARTS.CACHE_INPUT
ann_file = config.MODELARTS.CACHE_INPUT
bbox_file = config.MODELARTS.CACHE_INPUT
if config.MODELARTS_IS_MODEL_ARTS:
image_path = config.MODELARTS_CACHE_INPUT
ann_file = config.MODELARTS_CACHE_INPUT
bbox_file = config.MODELARTS_CACHE_INPUT
else:
image_path = config.DATASET.ROOT
ann_file = config.DATASET.ROOT
bbox_file = config.DATASET.ROOT
image_path = config.DATASET_ROOT
ann_file = config.DATASET_ROOT
bbox_file = config.DATASET_ROOT
if train_mode:
image_path = image_path + config.DATASET.TRAIN_SET
ann_file = ann_file + config.DATASET.TRAIN_JSON
image_path = os.path.join(image_path, config.DATASET_TRAIN_SET)
ann_file = os.path.join(ann_file, config.DATASET_TRAIN_JSON)
else:
image_path = image_path + config.DATASET.TEST_SET
ann_file = ann_file + config.DATASET.TEST_JSON
bbox_file = bbox_file + config.TEST.COCO_BBOX_FILE
image_path = os.path.join(image_path, config.DATASET_TEST_SET)
ann_file = os.path.join(ann_file, config.DATASET_TEST_JSON)
bbox_file = os.path.join(bbox_file, config.TEST_COCO_BBOX_FILE)
print('loading dataset from {}'.format(image_path))
shuffle = shuffle if shuffle is not None else train_mode
dataset_generator = CocoDatasetGenerator(config, is_train=train_mode)
if not train_mode and config.TEST.USE_GT_BBOX:
if not train_mode and config.TEST_USE_GT_BBOX:
print('loading bbox file from {}'.format(bbox_file))
dataset_generator.load_detect_dataset(image_path, ann_file, bbox_file)
else:
......
@@ -103,9 +103,9 @@ def evaluate(cfg, preds, output_dir, all_boxes, img_id, ann_path):
})
# rescoring and oks nms
num_joints = cfg.MODEL.NUM_JOINTS
in_vis_thre = cfg.TEST.IN_VIS_THRE
oks_thre = cfg.TEST.OKS_THRE
num_joints = cfg.MODEL_NUM_JOINTS
in_vis_thre = cfg.TEST_IN_VIS_THRE
oks_thre = cfg.TEST_OKS_THRE
oks_nmsed_kpts = {}
for img, items in img_kpts_dict.items():
for item in items:
@@ -126,10 +126,10 @@ def evaluate(cfg, preds, output_dir, all_boxes, img_id, ann_path):
oks_nmsed_kpts[img] = [items[kep] for kep in keep]
# evaluate and save
image_set = cfg.DATASET.TEST_SET
image_set = cfg.DATASET_TEST_SET
_write_coco_keypoint_results(oks_nmsed_kpts, num_joints, res_file)
if 'test' not in image_set and has_coco:
ann_path = ann_path if ann_path else os.path.join(cfg.DATASET.ROOT, 'annotations',
ann_path = ann_path if ann_path else os.path.join(cfg.DATASET_ROOT, 'annotations',
'person_keypoints_' + image_set + '.json')
info_str = _do_python_keypoint_eval(res_file, output_dir, ann_path)
name_value = OrderedDict(info_str)
......
@@ -18,24 +18,21 @@ train
from __future__ import division
import os
import ast
import argparse
import numpy as np
from mindspore import context, Tensor
from mindspore.context import ParallelMode
from mindspore.communication.management import init
from mindspore.communication.management import init, get_rank, get_group_size
from mindspore.train import Model
from mindspore.train.callback import TimeMonitor, LossMonitor, ModelCheckpoint, CheckpointConfig
from mindspore.train.callback import TimeMonitor, LossMonitor, ModelCheckpoint,\
CheckpointConfig, SummaryCollector
from mindspore.nn.optim import Adam
from mindspore.common import set_seed
from src.dataset import CreateDatasetCoco
from src.config import config
from src.network_with_loss import JointsMSELoss, PoseResNetWithLoss
from src.FastPose import createModel
if config.MODELARTS.IS_MODEL_ARTS:
import moxing as mox
set_seed(config.GENERAL.TRAIN_SEED)
def get_lr(begin_epoch,
@@ -61,104 +58,102 @@ def get_lr(begin_epoch,
return learning_rate
def parse_args():
'''
parse_args
'''
parser = argparse.ArgumentParser(description="Simpleposenet training")
parser.add_argument('--data_url', required=False,
default=None, help='Location of data.')
parser.add_argument('--train_url', required=False,
default=None, help='Location of training outputs.')
parser.add_argument('--device_id', required=False, default=0,
type=int, help='Location of training outputs.')
parser.add_argument('--run_distribute', type=ast.literal_eval,
default=False, help='Location of training outputs.')
parser.add_argument('--is_model_arts', type=ast.literal_eval,
default=False, help='Location of training outputs.')
args = parser.parse_args()
return args
def main():
print("loading parse...")
args = parse_args()
device_id = args.device_id
config.GENERAL.RUN_DISTRIBUTE = args.run_distribute
config.MODELARTS.IS_MODEL_ARTS = args.is_model_arts
if config.GENERAL.RUN_DISTRIBUTE or config.MODELARTS.IS_MODEL_ARTS:
device_id = int(os.getenv('DEVICE_ID'))
context.set_context(mode=context.GRAPH_MODE,
device_target="Ascend",
save_graphs=False,
device_id=device_id)
if config.GENERAL.RUN_DISTRIBUTE:
init()
rank = int(os.getenv('DEVICE_ID'))
device_num = int(os.getenv('RANK_SIZE'))
context.set_auto_parallel_context(device_num=device_num,
parallel_mode=ParallelMode.DATA_PARALLEL,
gradients_mean=True)
device_target=config.DEVICE_TARGET,
save_graphs=False)
if config.DEVICE_TARGET == "Ascend":
device_id = int(os.getenv('DEVICE_ID', '0'))
context.set_context(device_id=device_id)
if config.RUN_DISTRIBUTE:
if config.DEVICE_TARGET == 'Ascend':
init()
rank = get_rank()
device_num = get_group_size()
context.set_auto_parallel_context(device_num=device_num,
parallel_mode=ParallelMode.DATA_PARALLEL,
gradients_mean=True,
parameter_broadcast=True)
elif config.DEVICE_TARGET == 'GPU':
init("nccl")
rank = get_rank()
device_num = get_group_size()
context.reset_auto_parallel_context()
context.set_auto_parallel_context(device_num=device_num,
parallel_mode=ParallelMode.DATA_PARALLEL,
gradients_mean=True)
else:
raise NotImplementedError("Only GPU and Ascend training supported")
else:
rank = 0
device_num = 1
if config.MODELARTS.IS_MODEL_ARTS:
mox.file.copy_parallel(src_url=args.data_url,
dst_url=config.MODELARTS.CACHE_INPUT)
if config.MODELARTS_IS_MODEL_ARTS:
mox.file.copy_parallel(src_url=config.MODELARTS_DATA_URL,
dst_url=config.MODELARTS_CACHE_INPUT)
print(f"Running on {config.DEVICE_TARGET}, device num: {device_num}, rank: {rank}")
dataset = CreateDatasetCoco(rank=rank,
group_size=device_num,
train_mode=True,
num_parallel_workers=config.TRAIN.NUM_PARALLEL_WORKERS,
num_parallel_workers=config.TRAIN_NUM_PARALLEL_WORKERS,
)
m = createModel()
loss = JointsMSELoss(config.LOSS.USE_TARGET_WEIGHT)
loss = JointsMSELoss(config.LOSS_USE_TARGET_WEIGHT)
net_with_loss = PoseResNetWithLoss(m, loss)
dataset_size = dataset.get_dataset_size()
lr = Tensor(get_lr(config.TRAIN.BEGIN_EPOCH,
config.TRAIN.END_EPOCH,
print(f"Dataset size = {dataset_size}")
lr = Tensor(get_lr(config.TRAIN_BEGIN_EPOCH,
config.TRAIN_END_EPOCH,
dataset_size,
lr_init=config.TRAIN.LR,
factor=config.TRAIN.LR_FACTOR,
epoch_number_to_drop=config.TRAIN.LR_STEP))
lr_init=config.TRAIN_LR,
factor=config.TRAIN_LR_FACTOR,
epoch_number_to_drop=config.TRAIN_LR_STEP))
optim = Adam(m.trainable_params(), learning_rate=lr)
time_cb = TimeMonitor(data_size=dataset_size)
loss_cb = LossMonitor()
cb = [time_cb, loss_cb]
if config.TRAIN.SAVE_CKPT:
summary_cb = SummaryCollector(os.path.join(config.SUMMARY_DIR, f'rank_{rank}'))
cb = [time_cb, loss_cb, summary_cb]
if config.TRAIN_SAVE_CKPT:
config_ck = CheckpointConfig(
save_checkpoint_steps=dataset_size, keep_checkpoint_max=2)
prefix = ''
if config.GENERAL.RUN_DISTRIBUTE:
if config.RUN_DISTRIBUTE:
prefix = 'multi_' + 'train_fastpose_' + \
config.GENERAL.VERSION + '_' + os.getenv('DEVICE_ID')
config.VERSION + '_' + str(rank)
else:
prefix = 'single_' + 'train_fastpose_' + config.GENERAL.VERSION
prefix = 'single_' + 'train_fastpose_' + config.VERSION
directory = ''
if config.MODELARTS.IS_MODEL_ARTS:
directory = config.MODELARTS.CACHE_OUTPUT + \
'device_' + os.getenv('DEVICE_ID')
elif config.GENERAL.RUN_DISTRIBUTE:
directory = config.TRAIN.CKPT_PATH + \
'device_' + os.getenv('DEVICE_ID')
if config.MODELARTS_IS_MODEL_ARTS:
directory = config.MODELARTS_CACHE_OUTPUT + \
'device_' + str(rank)
elif config.RUN_DISTRIBUTE:
directory = config.TRAIN_CKPT_PATH + \
'device_' + str(rank)
else:
directory = config.TRAIN.CKPT_PATH + 'device'
directory = config.TRAIN_CKPT_PATH + 'device'
ckpoint_cb = ModelCheckpoint(
prefix=prefix, directory=directory, config=config_ck)
cb.append(ckpoint_cb)
model = Model(net_with_loss, optimizer=optim, amp_level="O2")
epoch_size = config.TRAIN.END_EPOCH - config.TRAIN.BEGIN_EPOCH
epoch_size = config.TRAIN_END_EPOCH - config.TRAIN_BEGIN_EPOCH
print("************ Start training now ************")
print('start training, epoch size = %d' % epoch_size)
model.train(epoch_size, dataset, callbacks=cb)
print(f'start training, {epoch_size} epochs, {dataset_size} steps per epoch')
model.train(epoch_size, dataset, callbacks=cb, dataset_sink_mode=True)
if config.MODELARTS.IS_MODEL_ARTS:
if config.MODELARTS_IS_MODEL_ARTS:
mox.file.copy_parallel(
src_url=config.MODELARTS.CACHE_OUTPUT, dst_url=args.train_url)
src_url=config.MODELARTS_CACHE_OUTPUT, dst_url=config.MODELARTS_TRAIN_URL)
if __name__ == '__main__':
if config.MODELARTS_IS_MODEL_ARTS:
import moxing as mox
set_seed(config.TRAIN_SEED)
main()