diff --git a/research/cv/RefineDet/README_CN.md b/research/cv/RefineDet/README_CN.md
new file mode 100644
index 0000000000000000000000000000000000000000..07bd5813ee7ac53b6461aced419b384247c3b439
--- /dev/null
+++ b/research/cv/RefineDet/README_CN.md
@@ -0,0 +1,445 @@
+# Contents
+
+<!-- TOC -->
+
+- [Contents](#contents)
+- [RefineDet Description](#refinedet-description)
+- [Model Architecture](#model-architecture)
+- [Dataset](#dataset)
+- [Environment Requirements](#environment-requirements)
+- [Quick Start](#quick-start)
+- [Script Description](#script-description)
+    - [Script and Sample Code](#script-and-sample-code)
+    - [Script Parameters](#script-parameters)
+    - [Training Process](#training-process)
+        - [Training on Ascend](#training-on-ascend)
+    <!--        - [Training on GPU](#training-on-gpu)-->
+    - [Evaluation Process](#evaluation-process)
+        - [Evaluation on Ascend](#evaluation-on-ascend)
+
+<!--
+
+        - [Evaluation on GPU](#evaluation-on-gpu)
+    - [Inference Process](#inference-process)
+        - [Export MindIR](#export-mindir)
+        - [Inference on Ascend 310](#inference-on-ascend-310)
+        - [Result](#result)
+- [Model Description](#model-description)
+    - [Performance](#performance)
+        - [Evaluation Performance](#evaluation-performance)
+        - [Inference Performance](#inference-performance)
+- [Description of Random Situation](#description-of-random-situation)
+
+-->
+<!-- /TOC -->
+
+# RefineDet Description
+
+RefineDet is an object detection model proposed at CVPR 2018. It combines the advantages of one-stage methods (faster) and two-stage methods (more accurate) while overcoming their drawbacks. The Anchor Refinement Module (ARM) first performs a coarse regression on the initially generated anchor boxes, Transfer Connection Blocks (TCB) then fuse multi-scale features, and finally an SSD-like regression and classification head produces the detections, which greatly improves both the speed and the accuracy of object detection.
+
+[Paper](https://arxiv.org/pdf/1711.06897.pdf):   S. Zhang, L. Wen, X. Bian, Z. Lei and S. Z. Li, "Single-shot refinement neural network for object detection", Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., pp. 4203-4212, Jun. 2018.
+
+# Model Architecture
+
+RefineDet consists of three parts: the ARM, which pre-regresses the anchor boxes; the ODM, which detects the objects; and the TCBs, which connect the two.
+![refinedet_structure](https://github.com/sfzhang15/RefineDet/raw/master/refinedet_structure.jpg)
+
+The structure of RefineDet (figure taken from the original paper).
+
+The feature extraction part uses VGG-16 as the backbone.
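+
+The following NumPy sketch is only a schematic of the ARM -> TCB -> ODM data flow described above: the zero-filled prediction heads and the single shared channel count of 256 are simplifications, while the strides (8/16/32/64 on a 320x320 input), 3 anchors per location and 81 COCO classes come from the default config. The real network, with its convolutional heads, is defined in src/refinedet.py.
+
+```python
+import numpy as np
+
+def tcb_fuse(shallow, deeper):
+    # Transfer Connection Block (simplified): upsample the deeper feature map by 2x
+    # (nearest neighbour) and add it to the shallower one, as in the top-down path.
+    up = deeper.repeat(2, axis=2).repeat(2, axis=3)
+    return shallow + up[:, :, :shallow.shape[2], :shallow.shape[3]]
+
+# four backbone feature maps (batch=1, 256 channels, strides 8/16/32/64 on a 320x320 input)
+feats = [np.random.randn(1, 256, 320 // s, 320 // s).astype(np.float32) for s in (8, 16, 32, 64)]
+num_anchors, num_classes = 3, 81
+
+# ARM: per-anchor box refinement (4 values) and binary objectness (2 values) at every level
+arm_loc = [np.zeros((1, 4 * num_anchors, f.shape[2], f.shape[3]), np.float32) for f in feats]
+arm_conf = [np.zeros((1, 2 * num_anchors, f.shape[2], f.shape[3]), np.float32) for f in feats]
+
+# TCB: build the ODM feature maps top-down, starting from the deepest level
+odm_feats = [feats[-1]]
+for f in reversed(feats[:-1]):
+    odm_feats.append(tcb_fuse(f, odm_feats[-1]))
+odm_feats.reverse()
+
+# ODM: class scores and offsets relative to the anchors already refined by the ARM
+odm_loc = [np.zeros((1, 4 * num_anchors, f.shape[2], f.shape[3]), np.float32) for f in odm_feats]
+odm_cls = [np.zeros((1, num_classes * num_anchors, f.shape[2], f.shape[3]), np.float32) for f in odm_feats]
+print([f.shape for f in odm_feats])
+```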
+
+# Dataset
+
+Dataset used: [COCO2017](<http://images.cocodataset.org/>)
+
+- Dataset size: 19 GB
+    - Training set: 18 GB, 118,000 images
+    - Validation set: 1 GB, 5,000 images
+    - Annotations: 241 MB, instances, captions, person_keypoints, etc.
+- Data format: images and JSON files
+    - Note: the data is processed in dataset.py.
+
+# Environment Requirements
+
+- Install [MindSpore](https://www.mindspore.cn/install).
+
+- Download the COCO2017 dataset.
+
+- This example uses COCO2017 as the training dataset by default; you can also use your own dataset.
+
+    1. If you use the COCO dataset, **select dataset `coco` when running the scripts.**
+        Install Cython, pycocotools and opencv for data processing.
+
+        ```shell
+        pip install Cython
+
+        pip install pycocotools
+
+        pip install opencv-python
+        ```
+
+        Then change `COCO_ROOT` and any other settings you need in `config.py`. The directory structure is as follows:
+
+        ```text
+        .
+        └─cocodataset
+          ├─annotations
+            ├─instances_train2017.json
+            └─instances_val2017.json
+          ├─val2017
+          └─train2017
+
+        ```
+
+    2. If you use your own dataset, **select dataset `other` when running the scripts.**
+        Organize the dataset information into a TXT file, one line per image, as follows:
+
+        ```text
+        train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2
+
+        ```
+
+        Each line is a space-separated image annotation: the first column is the relative path of the image, and the remaining columns are boxes and class IDs in the format [xmin,ymin,xmax,ymax,class] (see the parsing sketch below). Images are read from the path obtained by joining `IMAGE_DIR` (the dataset directory) with the relative paths listed in `ANNO_PATH` (the path of the TXT file). Set `IMAGE_DIR` and `ANNO_PATH` in `config.py`.
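+
+        The minimal Python sketch below (using the example line above) shows how such a line splits into an image path and its boxes; it only illustrates the format and is not code from this repository.
+
+        ```python
+        line = "train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2"
+        image_rel_path, *box_fields = line.strip().split(" ")
+        # each remaining field becomes [xmin, ymin, xmax, ymax, class]
+        boxes = [list(map(int, field.split(","))) for field in box_fields]
+        print(image_rel_path, boxes)
+        ```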
+
+# Quick Start
+
+After installing MindSpore from the official website, you can start training and evaluation as follows:
+
+- Running on Ascend
+
+```shell script
+# standalone training on Ascend
+python train.py --device_id=0 --epoch_size=500 --dataset=coco
+# or
+bash run_standardalone_train.sh [DEVICE_ID] [EPOCH_SIZE] [LR] [DATASET]
+# example
+bash run_standardalone_train.sh 0 500 0.05 coco
+```
+
+```shell script
+# distributed training on Ascend
+bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE]
+# example
+bash run_distribute_train.sh 8 500 0.05 coco ./hccl_rank_table_8p.json
+```
+
+To train on ModelArts, run train_modelarts.py; except for data_url and train_url, its arguments are the same as those for standalone training.
+
+```shell script
+# run eval on Ascend
+bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
+# example
+bash run_eval.sh coco ./ckpt/refinedet.ckpt 0
+# or run eval.py directly, for example
+python eval.py --dataset=coco --device_id=0 --checkpoint_path=./ckpt/refinedet.ckpt
+```
+
+<!---
+
+- Running on GPU
+
+```shell script
+# distributed training on GPU
+sh run_distribute_train_gpu.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET]
+```
+
+```shell script
+# run eval on GPU
+sh run_eval_gpu.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
+```
+
+-->
+
+# Script Description
+
+## Script and Sample Code
+
+```text
+.
+└─ cv
+  └─ RefineDet
+    ├─ README.md                      ## RefineDet description
+    ├─ scripts
+      ├─ run_distribute_train.sh      ## shell script for distributed training on Ascend
+      ├─ run_distribute_train_gpu.sh  ## shell script for distributed training on GPU
+      ├─ run_standardalone_train.sh   ## shell script for standalone training on Ascend
+      ├─ run_eval.sh                  ## shell script for evaluation on Ascend
+      └─ run_eval_gpu.sh              ## shell script for evaluation on GPU
+    ├─ src
+      ├─ anchor_generator.py          ## script that generates the initial default (anchor) boxes
+      ├─ box_utils.py                 ## bbox processing script
+      ├─ config.py                    ## overall config file
+      ├─ dataset.py                   ## script that processes and builds the dataset
+      ├─ eval_utils.py                ## script with the evaluation functions
+      ├─ init_params.py               ## script that initializes the network parameters
+      ├─ __init__.py
+      ├─ l2norm.py                    ## script implementing L2 Normalization
+      ├─ lr_schedule.py               ## script implementing the dynamic learning rate
+      ├─ multibox.py                  ## script implementing multibox regression
+      ├─ refinedet_loss_cell.py       ## script implementing the loss function
+      ├─ refinedet.py                 ## script that defines the whole network
+      ├─ resnet101_for_refinedet.py   ## ResNet-101 backbone implementation
+      └─ vgg16_for_refinedet.py       ## VGG-16 backbone implementation
+    ├─ eval.py                        ## evaluation script
+    ├─ train.py                       ## training script
+    └─ train_modelarts.py             ## script for training on the ModelArts cloud environment
+```
+
+## Script Parameters
+
+  ```text
+  Major parameters in train.py and config.py are as follows:
+
+    "device_num": 1                            # number of devices used
+    "lr": 0.05                                 # initial learning rate
+    "dataset": coco                            # dataset name
+    "epoch_size": 500                          # epoch size
+    "batch_size": 32                           # batch size of the input tensor
+    "pre_trained": None                        # path of the pretrained checkpoint file
+    "pre_trained_epoch_size": 0                # number of pretrained epochs
+    "save_checkpoint_epochs": 10               # epoch interval between two checkpoints; by default a checkpoint is saved every 10 epochs
+    "loss_scale": 1024                         # loss scale
+
+    "class_num": 81                            # number of dataset classes
+    "image_shape": [320, 320]                  # image height and width used as the model input
+    "mindrecord_dir": "/data/MindRecord"       # MindRecord path
+    "coco_root": "/data/coco2017"              # COCO2017 dataset path
+    "voc_root": ""                             # VOC original dataset path
+    "image_dir": ""                            # image path of other datasets; ignored when coco or voc is used
+    "anno_path": ""                            # annotation path of other datasets; ignored when coco or voc is used
+
+  ```
+
+## Training Process
+
+Run `train.py` to train the model. If `mindrecord_dir` is empty, [MindRecord](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/convert_dataset.html) files are generated from `coco_root` (the COCO dataset) or from `image_dir` and `anno_path` (your own dataset). **Note that if mindrecord_dir is not empty, the files in mindrecord_dir are used instead of the original images.**
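+
+The run_*_train.sh scripts create the MindRecord files first by calling `python train.py --only_create_dataset=True --dataset=coco`. If you prefer to do it from Python, the minimal sketch below is based on the `create_mindrecord` call used in eval.py; the file prefix and the `is_training`/`file_num` values here are assumptions, not values taken from this repository.
+
+```python
+from src.config import get_config
+from src.dataset import create_mindrecord
+
+# build the config for the default VGG16-320 model on COCO and write the MindRecord files
+config = get_config("refinedet_vgg16_320", "coco")
+mindrecord_file = create_mindrecord(config, "coco", "refinedet.mindrecord", True, file_num=8)
+print("MindRecord written to:", mindrecord_file)
+```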
+
+### Training on Ascend
+
+- Distributed training
+
+```shell script
+    bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)
+```
+
+This script requires five or seven parameters.
+
+- `DEVICE_NUM`: number of devices for distributed training.
+- `EPOCH_SIZE`: number of epochs for distributed training.
+- `LR`: initial learning rate for distributed training.
+- `DATASET`: dataset mode for distributed training.
+- `RANK_TABLE_FILE`: path of [rank_table.json](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools). It is better to use an absolute path.
+- `PRE_TRAINED`: path of the pretrained checkpoint file. It is better to use an absolute path.
+- `PRE_TRAINED_EPOCH_SIZE`: number of pretrained epochs.
+
+    The training results are saved in the current path, in a folder whose name starts with "LOG". You can find the checkpoint files and results in this folder, as shown below.
+
+```text
+epoch: 1 step: 458, loss is 3.1681802
+epoch time: 228752.4654865265, per step time: 499.4595316299705
+epoch: 2 step: 458, loss is 2.8847265
+epoch time: 38912.93382644653, per step time: 84.96273761232868
+epoch: 3 step: 458, loss is 2.8398118
+epoch time: 38769.184827804565, per step time: 84.64887516987896
+...
+
+epoch: 498 step: 458, loss is 0.70908034
+epoch time: 38771.079778671265, per step time: 84.65301261718616
+epoch: 499 step: 458, loss is 0.7974688
+epoch time: 38787.413120269775, per step time: 84.68867493508685
+epoch: 500 step: 458, loss is 0.5548882
+epoch time: 39064.8467540741, per step time: 85.29442522723602
+```
+
+<!---
+### Training on GPU
+
+- Distributed training
+
+```shell script
+    sh run_distribute_train_gpu.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)
+```
+
+This script requires four or six parameters.
+
+- `DEVICE_NUM`: number of devices for distributed training.
+- `EPOCH_SIZE`: number of epochs for distributed training.
+- `LR`: initial learning rate for distributed training.
+- `DATASET`: dataset mode for distributed training.
+- `PRE_TRAINED`: path of the pretrained checkpoint file. It is better to use an absolute path.
+- `PRE_TRAINED_EPOCH_SIZE`: number of pretrained epochs.
+
+    The training results are saved in the current path, in a folder whose name starts with "LOG". You can find the checkpoint files and results in this folder, as shown below.
+
+```text
+epoch: 1 step: 1, loss is 420.11783
+epoch: 1 step: 2, loss is 434.11032
+epoch: 1 step: 3, loss is 476.802
+...
+epoch: 1 step: 458, loss is 3.1283689
+epoch time: 150753.701, per step time: 329.157
+...
+
+```
+
+-->
+
+## Evaluation Process
+
+### Evaluation on Ascend
+
+```shell script
+bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
+```
+
+This script requires three parameters.
+
+- `DATASET`: dataset mode for evaluation.
+- `CHECKPOINT_PATH`: absolute path of the checkpoint file.
+- `DEVICE_ID`: device ID for evaluation.
+
+> Checkpoints can be produced during training.
+
+The inference results are saved in the example path, in a folder whose name starts with "eval". You can find results like the following in the log.
+
+```text
+Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.238
+Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.400
+Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.240
+Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.039
+Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.198
+Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.438
+Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.250
+Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.389
+Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.424
+Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.122
+Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.434
+Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.697
+
+========================================
+
+mAP: 0.23808886505483504
+```
+
+<!--
+### Evaluation on GPU
+
+```shell script
+sh run_eval_gpu.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
+```
+
+This script requires three parameters.
+
+- `DATASET`: dataset mode for evaluation.
+- `CHECKPOINT_PATH`: absolute path of the checkpoint file.
+- `DEVICE_ID`: device ID for evaluation.
+
+> Checkpoints can be produced during training.
+
+The inference results are saved in the example path, in a folder whose name starts with "eval". You can find results like the following in the log.
+
+```text
+Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.224
+Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.375
+Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.228
+Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.034
+Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.189
+Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.407
+Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.243
+Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.382
+Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.417
+Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.120
+Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.425
+Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.686
+
+========================================
+
+mAP: 0.2244936111705981
+```
+
+## Inference Process
+
+### [Export MindIR](#contents)
+
+```shell
+python export.py --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT]
+```
+
+The parameter ckpt_file is required, and
+`FILE_FORMAT` must be chosen from ["AIR", "MINDIR"].
+
+### Inference on Ascend 310
+
+Before running inference, the MindIR file must be exported with the `export.py` script. The following shows an example of running inference with a MindIR model.
+Currently only inference with batch_size 1 is supported. The accuracy computation needs more than 70 GB of memory, otherwise the process will be killed by the system for exceeding the memory limit.
+
+```shell
+# Ascend310 inference
+bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DVPP] [DEVICE_ID]
+```
+
+- `DVPP` is required and must be chosen from ["DVPP", "CPU"], case-insensitive. Note that the image size used for ssd_vgg16 inference is [300, 300]; since the DVPP hardware requires the width to be divisible by 16 and the height by 2, this network needs to preprocess the images with CPU operators.
+- `DEVICE_ID` is optional, with a default value of 0.
+
+### Result
+
+Inference results are saved in the current path where the script is executed, and you can find the following accuracy results in acc.log.
+
+```bash
+Average Precision (AP) @[ IoU=0.50:0.95 | area= all   | maxDets=100 ] = 0.339
+Average Precision (AP) @[ IoU=0.50      | area= all   | maxDets=100 ] = 0.521
+Average Precision (AP) @[ IoU=0.75      | area= all   | maxDets=100 ] = 0.370
+Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.168
+Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.386
+Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.461
+Average Recall    (AR) @[ IoU=0.50:0.95 | area= all   | maxDets=  1 ] = 0.310
+Average Recall    (AR) @[ IoU=0.50:0.95 | area= all   | maxDets= 10 ] = 0.481
+Average Recall    (AR) @[ IoU=0.50:0.95 | area= all   | maxDets=100 ] = 0.515
+Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.293
+Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.659
+mAP: 0.33880018942412393
+```
+
+# Model Description
+
+## Performance
+
+### Evaluation Performance
+
+| Parameters                 | Ascend                                                        | GPU                                                           |
+| -------------------------- | -------------------------------------------------------------| -------------------------------------------------------------|
+| Model Version              | RefineDet                                                     | RefineDet                                                     |
+| Resource                   | Ascend 910; CPU 2.60GHz, 192 cores; memory 755 GB             | NV SMX2 V100-16G                                              |
+| Uploaded Date              | 2021-06-01                                                    | 2021-09-24                                                    |
+| MindSpore Version          | 0.3.0-alpha                                                   | 1.0.0                                                         |
+| Dataset                    | COCO2017                                                      | COCO2017                                                      |
+| Training Parameters        | epoch = 500, batch_size = 32                                  | epoch = 800, batch_size = 32                                  |
+| Optimizer                  | Momentum                                                      | Momentum                                                      |
+| Loss Function              | Sigmoid cross entropy, SmoothL1Loss                           | Sigmoid cross entropy, SmoothL1Loss                           |
+| Speed                      | 8 pcs: 90 ms/step                                             | 8 pcs: 121 ms/step                                            |
+| Total Time                 | 8 pcs: 4.81 hours                                             | 8 pcs: 12.31 hours                                            |
+| Parameters (M)             | 34                                                            | 34                                                            |
+| Scripts                    | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/ssd | https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/cv/ssd |
+
+### Inference Performance
+
+| Parameters          | Ascend                      | GPU                         |
+| ------------------- | ----------------------------| ----------------------------|
+| Model Version       | RefineDet                   | RefineDet                   |
+| Resource            | Ascend 910                  | GPU                         |
+| Uploaded Date       | 2021-06-01                  | 2021-09-24                  |
+| MindSpore Version   | 0.3.0-alpha                 | 1.0.0                       |
+| Dataset             | COCO2017                    | COCO2017                    |
+| batch_size          | 1                           | 1                           |
+| Outputs             | mAP                         | mAP                         |
+| Accuracy            | IoU=0.50: 23.8%             | IoU=0.50: 22.4%             |
+| Model for inference | 34M (.ckpt file)            | 34M (.ckpt file)            |
+
+# Description of Random Situation
+
+A seed is set inside the `create_dataset` function in dataset.py, and the random seed in train.py is also used.
+
+# ModelZoo Homepage
+
+ Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).
+
+ -->
\ No newline at end of file
diff --git a/research/cv/RefineDet/eval.py b/research/cv/RefineDet/eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..b0af49138b7bef74f202768a8d1065b1fa31b66b
--- /dev/null
+++ b/research/cv/RefineDet/eval.py
@@ -0,0 +1,118 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Evaluation for RefineDet"""
+
+import os
+import argparse
+import time
+import numpy as np
+from mindspore import context, Tensor
+from mindspore.train.serialization import load_checkpoint, load_param_into_net
+from src.eval_utils import coco_metrics
+from src.eval_utils import voc_metrics
+from src.box_utils import box_init
+from src.config import get_config
+from src.dataset import create_refinedet_dataset, create_mindrecord
+from src.refinedet import refinedet_vgg16, refinedet_resnet101, RefineDetInferWithDecoder
+
+def refinedet_eval(net_config, dataset_path, ckpt_path, anno_json, net_metrics):
+    """RefineDet evaluation."""
+    batch_size = 1
+    ds = create_refinedet_dataset(net_config, dataset_path, batch_size=batch_size, repeat_num=1,
+                                  is_training=False, use_multiprocessing=False)
+    if net_config.model == "refinedet_vgg16":
+        net = refinedet_vgg16(net_config, is_training=False)
+    elif net_config.model == "refinedet_resnet101":
+        net = refinedet_resnet101(net_config, is_training=False)
+    else:
+        raise ValueError(f'config.model: {net_config.model} is not supported')
+    default_boxes = box_init(net_config)
+    net = RefineDetInferWithDecoder(net, Tensor(default_boxes), net_config)
+
+    print("Load Checkpoint!")
+    param_dict = load_checkpoint(ckpt_path)
+    net.init_parameters_data()
+    load_param_into_net(net, param_dict)
+
+    net.set_train(False)
+    i = batch_size
+    total = ds.get_dataset_size() * batch_size
+    start = time.time()
+    pred_data = []
+    print("\n========================================\n")
+    print("total images num: ", total)
+    print("Processing, please wait a moment.")
+    for data in ds.create_dict_iterator(output_numpy=True, num_epochs=1):
+        img_id = data['img_id']
+        img_np = data['image']
+        image_shape = data['image_shape']
+
+        output = net(Tensor(img_np))
+        for batch_idx in range(img_np.shape[0]):
+            pred_data.append({"boxes": output[0].asnumpy()[batch_idx],
+                              "box_scores": output[1].asnumpy()[batch_idx],
+                              "img_id": int(np.squeeze(img_id[batch_idx])),
+                              "image_shape": image_shape[batch_idx]})
+        percent = round(i / total * 100., 2)
+
+        print(f'    {str(percent)} [{i}/{total}]', end='\r')
+        i += batch_size
+    cost_time = int((time.time() - start) * 1000)
+    print(f'    100% [{total}/{total}] cost {cost_time} ms')
+    mAP = net_metrics(pred_data, anno_json, net_config)
+    print("\n========================================\n")
+    print(f"mAP: {mAP}")
+
+def get_eval_args():
+    """Get args for eval"""
+    parser = argparse.ArgumentParser(description='RefineDet evaluation')
+    parser.add_argument("--using_mode", type=str, default="refinedet_vgg16_320",
+                        choices=("refinedet_vgg16_320", "refinedet_vgg16_512",
+                                 "refinedet_resnet101_320", "refinedet_resnet101_512"),
+                        help="using mode, same as training.")
+    parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.")
+    parser.add_argument("--dataset", type=str, default="coco", help="Dataset, default is coco.")
+    parser.add_argument("--checkpoint_path", type=str, required=True, help="Checkpoint file path.")
+    parser.add_argument("--run_platform", type=str, default="Ascend", choices=("Ascend", "GPU", "CPU"),
+                        help="run platform, support Ascend ,GPU and CPU.")
+    parser.add_argument('--debug', type=str, default="0", choices=["0", "1"],
+                        help="Active the debug mode. Under debug mode, the network would be run as PyNative mode.")
+    return parser.parse_args()
+
+if __name__ == '__main__':
+    args_opt = get_eval_args()
+    config = get_config(args_opt.using_mode, args_opt.dataset)
+    box_init(config)
+    if args_opt.dataset == "coco":
+        json_path = os.path.join(config.coco_root, config.instances_set.format(config.val_data_type))
+    elif args_opt.dataset[:3] == "voc":
+        json_path = os.path.join(config.voc_root, config.voc_json)
+    else:
+        json_path = config.instances_set
+
+    if args_opt.debug == "1":
+        network_mode = context.PYNATIVE_MODE
+    else:
+        network_mode = context.GRAPH_MODE
+
+    context.set_context(mode=network_mode, device_target=args_opt.run_platform, device_id=args_opt.device_id)
+
+    mindrecord_file = create_mindrecord(config, args_opt.dataset,
+                                        "refinedet_eval.mindrecord", False,
+                                        file_num=1)
+
+    print("Start Eval!")
+    metrics = coco_metrics if args_opt.dataset == 'coco' else voc_metrics
+    refinedet_eval(config, mindrecord_file, args_opt.checkpoint_path, json_path, metrics)
diff --git a/research/cv/RefineDet/export.py b/research/cv/RefineDet/export.py
new file mode 100644
index 0000000000000000000000000000000000000000..1ee443be25ff8f76c7b385911632ce48733067da
--- /dev/null
+++ b/research/cv/RefineDet/export.py
@@ -0,0 +1,63 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Export mindir or air model for refinedet"""
+import argparse
+import numpy as np
+
+import mindspore
+from mindspore import context, Tensor
+from mindspore.train.serialization import load_checkpoint, load_param_into_net, export
+from src.refinedet import refinedet_vgg16, refinedet_resnet101, RefineDetInferWithDecoder
+from src.config import get_config
+from src.box_utils import box_init
+
+parser = argparse.ArgumentParser(description='RefineDet export')
+parser.add_argument("--device_id", type=int, default=0, help="Device id")
+parser.add_argument("--batch_size", type=int, default=1, help="batch size")
+parser.add_argument("--ckpt_file", type=str, required=True, help="Checkpoint file path.")
+parser.add_argument("--dataset", type=str, default="coco", help="Dataset, default is coco.")
+parser.add_argument("--using_mode", type=str, default="refinedet_vgg16_320",
+                    choices=("refinedet_vgg16_320", "refinedet_vgg16_512",
+                             "refinedet_resnet101_320", "refinedet_resnet101_512"),
+                    help="using mode, same as training.")
+parser.add_argument("--file_name", type=str, default="refinedet", help="output file name.")
+parser.add_argument('--file_format', type=str, choices=["AIR", "MINDIR", "ONNX"], default='MINDIR', help='file format')
+parser.add_argument("--device_target", type=str, choices=["Ascend", "GPU", "CPU"], default="Ascend",
+                    help="device target")
+args = parser.parse_args()
+
+context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target)
+if args.device_target == "Ascend":
+    context.set_context(device_id=args.device_id)
+
+if __name__ == '__main__':
+    config = get_config(args.using_mode, args.dataset)
+    default_boxes = box_init(config)
+    if config.model == "refinedet_vgg16":
+        net = refinedet_vgg16(config=config)
+    elif config.model == "refinedet_resnet101":
+        net = refinedet_resnet101(config=config)
+    else:
+        raise ValueError(f'config.model: {config.model} is not supported')
+    net = RefineDetInferWithDecoder(net, Tensor(default_boxes), config)
+
+    param_dict = load_checkpoint(args.ckpt_file)
+    net.init_parameters_data()
+    load_param_into_net(net, param_dict)
+    net.set_train(False)
+
+    input_shp = [args.batch_size, 3] + config.img_shape
+    input_array = Tensor(np.random.uniform(-1.0, 1.0, size=input_shp), mindspore.float32)
+    export(net, input_array, file_name=args.file_name, file_format=args.file_format)
diff --git a/research/cv/RefineDet/scripts/run_distribute_train.sh b/research/cv/RefineDet/scripts/run_distribute_train.sh
new file mode 100644
index 0000000000000000000000000000000000000000..3da21418c37cd248318979de2aed7355b360f528
--- /dev/null
+++ b/research/cv/RefineDet/scripts/run_distribute_train.sh
@@ -0,0 +1,83 @@
+#!/bin/bash
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+
+echo "=============================================================================================================="
+echo "Please run the script as: "
+echo "sh run_distribute_train.sh DEVICE_NUM EPOCH_SIZE LR DATASET RANK_TABLE_FILE PRE_TRAINED PRE_TRAINED_EPOCH_SIZE"
+echo "for example: sh run_distribute_train.sh 8 500 0.2 coco /data/hccl.json /opt/ssd-300.ckpt(optional) 200(optional)"
+echo "It is better to use absolute path."
+echo "================================================================================================================="
+
+if [ $# != 5 ] && [ $# != 7 ]
+then
+    echo "Usage: sh run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] \
+[RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)"
+    exit 1
+fi
+
+# Before starting distributed training, first create the mindrecord files.
+BASE_PATH=$(cd "`dirname $0`" || exit; pwd)
+cd $BASE_PATH/../ || exit
+python train.py --only_create_dataset=True --dataset=$4
+
+echo "After running the script, the network runs in the background. The log will be generated in LOGx/log.txt"
+
+export RANK_SIZE=$1
+EPOCH_SIZE=$2
+LR=$3
+DATASET=$4
+PRE_TRAINED=$6
+PRE_TRAINED_EPOCH_SIZE=$7
+export RANK_TABLE_FILE=$5
+
+for((i=0;i<RANK_SIZE;i++))
+do
+    export DEVICE_ID=$i
+    rm -rf LOG$i
+    mkdir -p ./LOG$i/data
+    cp ./*.py ./LOG$i
+    cp -r ./src ./LOG$i
+    ln -s $BASE_PATH/../data/MindRecord/ ./LOG$i/data/MindRecord
+    cd ./LOG$i || exit
+    export RANK_ID=$i
+    echo "start training for rank $i, device $DEVICE_ID"
+    env > env.log
+    if [ $# == 5 ]
+    then
+        python train.py  \
+        --distribute=True  \
+        --lr=$LR \
+        --dataset=$DATASET \
+        --device_num=$RANK_SIZE  \
+        --device_id=$DEVICE_ID  \
+        --epoch_size=$EPOCH_SIZE > log.txt 2>&1 &
+    fi
+
+    if [ $# == 7 ]
+    then
+        python train.py  \
+        --distribute=True  \
+        --lr=$LR \
+        --dataset=$DATASET \
+        --device_num=$RANK_SIZE  \
+        --device_id=$DEVICE_ID  \
+        --pre_trained=$PRE_TRAINED \
+        --pre_trained_epoch_size=$PRE_TRAINED_EPOCH_SIZE \
+        --epoch_size=$EPOCH_SIZE > log.txt 2>&1 &
+    fi
+
+    cd ../
+done
diff --git a/research/cv/RefineDet/scripts/run_eval.sh b/research/cv/RefineDet/scripts/run_eval.sh
new file mode 100644
index 0000000000000000000000000000000000000000..77054ad87f642ad9de43b22634450784ee691a32
--- /dev/null
+++ b/research/cv/RefineDet/scripts/run_eval.sh
@@ -0,0 +1,65 @@
+#!/bin/bash
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+
+if [ $# != 3 ]
+then
+    echo "Usage: sh run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]"
+exit 1
+fi
+
+get_real_path(){
+  if [ "${1:0:1}" == "/" ]; then
+    echo "$1"
+  else
+    echo "$(realpath -m $PWD/$1)"
+  fi
+}
+
+DATASET=$1
+CHECKPOINT_PATH=$(get_real_path $2)
+echo $DATASET
+echo $CHECKPOINT_PATH
+
+if [ ! -f $CHECKPOINT_PATH ]
+then
+    echo "error: CHECKPOINT_PATH=$PATH2 is not a file"
+exit 1
+fi
+
+export DEVICE_NUM=1
+export DEVICE_ID=$3
+export RANK_SIZE=$DEVICE_NUM
+export RANK_ID=0
+
+BASE_PATH=$(cd "`dirname $0`" || exit; pwd)
+cd $BASE_PATH/../ || exit
+
+if [ -d "eval$3" ];
+then
+    rm -rf ./eval$3
+fi
+
+mkdir ./eval$3
+cp ./*.py ./eval$3
+cp -r ./src ./eval$3
+cd ./eval$3 || exit
+env > env.log
+echo "start inferring for device $DEVICE_ID"
+python eval.py \
+    --dataset=$DATASET \
+    --checkpoint_path=$CHECKPOINT_PATH \
+    --device_id=$3 > log.txt 2>&1 &
+cd ..
diff --git a/research/cv/RefineDet/scripts/run_standardalone_train.sh b/research/cv/RefineDet/scripts/run_standardalone_train.sh
new file mode 100644
index 0000000000000000000000000000000000000000..1219066672bf02beec2aeff1eb09c6a43a71768f
--- /dev/null
+++ b/research/cv/RefineDet/scripts/run_standardalone_train.sh
@@ -0,0 +1,73 @@
+#!/bin/bash
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+
+echo "=============================================================================================================="
+echo "Please run the script as: "
+echo "sh run_distribute_train.sh DEVICE_ID EPOCH_SIZE LR DATASET PRE_TRAINED PRE_TRAINED_EPOCH_SIZE"
+echo "for example: sh run_distribute_train.sh 0 500 0.2 coco /opt/ssd-300.ckpt(optional) 200(optional)"
+echo "It is better to use absolute path."
+echo "================================================================================================================="
+
+if [ $# != 4 ] && [ $# != 6 ]
+then
+    echo "Usage: sh run_distribute_train.sh [DEVICE_ID] [EPOCH_SIZE] [LR] [DATASET] \
+    [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)"
+    exit 1
+fi
+
+# Before starting training, first create the mindrecord files.
+BASE_PATH=$(cd "`dirname $0`" || exit; pwd)
+cd $BASE_PATH/../ || exit
+python train.py --only_create_dataset=True --dataset=$4
+
+echo "After running the script, the network runs in the background. The log will be generated in LOGx/log.txt"
+DEVICE_ID=$1
+EPOCH_SIZE=$2
+LR=$3
+DATASET=$4
+PRE_TRAINED=$5
+PRE_TRAINED_EPOCH_SIZE=$6
+
+export DEVICE_ID=$DEVICE_ID
+rm -rf LOG$DEVICE_ID
+mkdir ./LOG$DEVICE_ID
+cp ./*.py ./LOG$DEVICE_ID
+cp -r ./src ./LOG$DEVICE_ID
+cd ./LOG$DEVICE_ID || exit
+
+echo "start training with device $DEVICE_ID"
+env > env.log
+if [ $# == 4 ]
+then
+    python train.py  \
+    --lr=$LR \
+    --dataset=$DATASET \
+    --device_id=$DEVICE_ID  \
+    --epoch_size=$EPOCH_SIZE > log.txt 2>&1 &
+fi
+
+if [ $# == 6 ]
+then
+    python train.py  \
+    --lr=$LR \
+    --dataset=$DATASET \
+    --device_id=$DEVICE_ID  \
+    --pre_trained=$PRE_TRAINED \
+    --pre_trained_epoch_size=$PRE_TRAINED_EPOCH_SIZE \
+    --epoch_size=$EPOCH_SIZE > log.txt 2>&1 &
+fi
+
+cd ../
diff --git a/research/cv/RefineDet/src/__init__.py b/research/cv/RefineDet/src/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/research/cv/RefineDet/src/anchor_generator.py b/research/cv/RefineDet/src/anchor_generator.py
new file mode 100644
index 0000000000000000000000000000000000000000..63e0e402f29e711b2a0d3875368b8d0f887c312e
--- /dev/null
+++ b/research/cv/RefineDet/src/anchor_generator.py
@@ -0,0 +1,93 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Anchor Generator"""
+
+import numpy as np
+
+
+class GridAnchorGenerator:
+    """
+    Anchor Generator
+    """
+    def __init__(self, image_shape, scale, scales_per_octave, aspect_ratios):
+        super(GridAnchorGenerator, self).__init__()
+        self.scale = scale
+        self.scales_per_octave = scales_per_octave
+        self.aspect_ratios = aspect_ratios
+        self.image_shape = image_shape
+
+
+    def generate(self, step):
+        """generate default anchors"""
+        scales = np.array([2 ** (float(scale) / self.scales_per_octave)
+                           for scale in range(self.scales_per_octave)]).astype(np.float32)
+        aspects = np.array(list(self.aspect_ratios)).astype(np.float32)
+
+        scales_grid, aspect_ratios_grid = np.meshgrid(scales, aspects)
+        scales_grid = scales_grid.reshape([-1])
+        aspect_ratios_grid = aspect_ratios_grid.reshape([-1])
+
+        feature_size = [self.image_shape[0] / step, self.image_shape[1] / step]
+        grid_height, grid_width = feature_size
+
+        base_size = np.array([self.scale * step, self.scale * step]).astype(np.float32)
+        anchor_offset = step / 2.0
+
+        ratio_sqrt = np.sqrt(aspect_ratios_grid)
+        heights = scales_grid / ratio_sqrt * base_size[0]
+        widths = scales_grid * ratio_sqrt * base_size[1]
+
+        y_centers = np.arange(grid_height).astype(np.float32)
+        y_centers = y_centers * step + anchor_offset
+        x_centers = np.arange(grid_width).astype(np.float32)
+        x_centers = x_centers * step + anchor_offset
+        x_centers, y_centers = np.meshgrid(x_centers, y_centers)
+
+        x_centers_shape = x_centers.shape
+        y_centers_shape = y_centers.shape
+
+        widths_grid, x_centers_grid = np.meshgrid(widths, x_centers.reshape([-1]))
+        heights_grid, y_centers_grid = np.meshgrid(heights, y_centers.reshape([-1]))
+
+        x_centers_grid = x_centers_grid.reshape(*x_centers_shape, -1)
+        y_centers_grid = y_centers_grid.reshape(*y_centers_shape, -1)
+        widths_grid = widths_grid.reshape(-1, *x_centers_shape)
+        heights_grid = heights_grid.reshape(-1, *y_centers_shape)
+
+
+        bbox_centers = np.stack([y_centers_grid, x_centers_grid], axis=3)
+        bbox_sizes = np.stack([heights_grid, widths_grid], axis=3)
+        bbox_centers = bbox_centers.reshape([-1, 2])
+        bbox_sizes = bbox_sizes.reshape([-1, 2])
+        bbox_corners = np.concatenate([bbox_centers - 0.5 * bbox_sizes, bbox_centers + 0.5 * bbox_sizes], axis=1)
+        self.bbox_corners = bbox_corners / np.array([*self.image_shape, *self.image_shape]).astype(np.float32)
+        self.bbox_centers = np.concatenate([bbox_centers, bbox_sizes], axis=1)
+        self.bbox_centers = self.bbox_centers / np.array([*self.image_shape, *self.image_shape]).astype(np.float32)
+
+        print(self.bbox_centers.shape)
+        return self.bbox_centers, self.bbox_corners
+
+    def generate_multi_levels(self, steps):
+        """generate multi levels anchors"""
+        bbox_centers_list = []
+        bbox_corners_list = []
+        for step in steps:
+            bbox_centers, bbox_corners = self.generate(step)
+            bbox_centers_list.append(bbox_centers)
+            bbox_corners_list.append(bbox_corners)
+
+        self.bbox_centers = np.concatenate(bbox_centers_list, axis=0)
+        self.bbox_corners = np.concatenate(bbox_corners_list, axis=0)
+        return self.bbox_centers, self.bbox_corners
diff --git a/research/cv/RefineDet/src/box_utils.py b/research/cv/RefineDet/src/box_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..341cc0566783859b1c0c19264b5fd77173032903
--- /dev/null
+++ b/research/cv/RefineDet/src/box_utils.py
@@ -0,0 +1,172 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Bbox utils"""
+
+import math
+import itertools as it
+import numpy as np
+from .anchor_generator import GridAnchorGenerator
+
+is_init = False
+
+class GeneratDefaultBoxes():
+    """
+    Generate default boxes for SSD, following the order of (W, H, anchor_sizes).
+    `self.default_boxes` has a shape of [anchor_sizes, H, W, 4], the last dimension is [y, x, h, w].
+    `self.default_boxes_tlbr` has the same shape as `self.default_boxes`, the last dimension is [y1, x1, y2, x2].
+    """
+    def __init__(self, config):
+        fk = config.img_shape[0] / np.array(config.steps)
+        scale_rate = (config.max_scale - config.min_scale) / (len(config.num_default) - 1)
+        scales = [config.min_scale + scale_rate * i for i in range(len(config.num_default))] + [1.0]
+        self.default_boxes = []
+        for idex, feature_size in enumerate(config.feature_size):
+            sk1 = scales[idex]
+            if idex == 0 and not config.aspect_ratios[idex]:
+                w, h = sk1 * math.sqrt(2), sk1 / math.sqrt(2)
+                all_sizes = [(0.1, 0.1), (w, h), (h, w)]
+            else:
+                all_sizes = [(sk1, sk1)]
+                for aspect_ratio in config.aspect_ratios[idex]:
+                    w, h = sk1 * math.sqrt(aspect_ratio), sk1 / math.sqrt(aspect_ratio)
+                    all_sizes.append((w, h))
+                    all_sizes.append((h, w))
+
+            assert len(all_sizes) == config.num_default[idex]
+
+            for i, j in it.product(range(feature_size), repeat=2):
+                for w, h in all_sizes:
+                    cx, cy = (j + 0.5) / fk[idex], (i + 0.5) / fk[idex]
+                    self.default_boxes.append([cy, cx, h, w])
+
+        def to_tlbr(cy, cx, h, w):
+            return cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2
+
+        # For IoU calculation
+        self.default_boxes_tlbr = np.array(tuple(to_tlbr(*i) for i in self.default_boxes), dtype='float32')
+        self.default_boxes = np.array(self.default_boxes, dtype='float32')
+
+default_boxes = matching_threshold = vol_anchors = y1 = x1 = y2 = x2 = None
+# module-level copy of the config passed to box_init(); needed by refinedet_bboxes_decode
+box_config = None
+
+def box_init(config):
+    """init default boxes"""
+    global is_init, default_boxes, matching_threshold, vol_anchors, y1, x1, y2, x2, box_config
+    if is_init:
+        return default_boxes
+    is_init = True
+    box_config = config
+    if 'use_anchor_generator' in config and config.use_anchor_generator:
+        generator = GridAnchorGenerator(config.img_shape, 4, 2, [1.0, 2.0, 0.5])
+        default_boxes, default_boxes_tlbr = generator.generate_multi_levels(config.steps)
+    else:
+        default_boxes_tlbr = GeneratDefaultBoxes(config).default_boxes_tlbr
+        default_boxes = GeneratDefaultBoxes(config).default_boxes
+    y1, x1, y2, x2 = np.split(default_boxes_tlbr[:, :4], 4, axis=-1)
+    vol_anchors = (x2 - x1) * (y2 - y1)
+    matching_threshold = config.match_threshold
+    return default_boxes
+
+def refinedet_bboxes_encode(config, boxes):
+    """
+    Labels anchors with ground truth inputs.
+
+    Args:
+        boxes: ground truth with shape [N, 5]; each row stores [ymin, xmin, ymax, xmax, cls].
+
+    Returns:
+        gt_loc: location ground truth with shape [num_anchors, 4].
+        gt_label: class ground truth with shape [num_anchors, 1].
+        num_matched_boxes: number of positives in an image.
+    """
+    box_init(config)
+    def jaccard_with_anchors(bbox):
+        """Compute jaccard score a box and the anchors."""
+        # Intersection bbox and volume.
+        ymin = np.maximum(y1, bbox[0])
+        xmin = np.maximum(x1, bbox[1])
+        ymax = np.minimum(y2, bbox[2])
+        xmax = np.minimum(x2, bbox[3])
+        w = np.maximum(xmax - xmin, 0.)
+        h = np.maximum(ymax - ymin, 0.)
+
+        # Volumes.
+        inter_vol = h * w
+        union_vol = vol_anchors + (bbox[2] - bbox[0]) * (bbox[3] - bbox[1]) - inter_vol
+        jaccard = inter_vol / union_vol
+        return np.squeeze(jaccard)
+
+    pre_scores = np.zeros((config.num_ssd_boxes), dtype=np.float32)
+    t_boxes = np.zeros((config.num_ssd_boxes, 4), dtype=np.float32)
+    t_label = np.zeros((config.num_ssd_boxes), dtype=np.int64)
+    for bbox in boxes:
+        label = int(bbox[4])
+        scores = jaccard_with_anchors(bbox)
+        idx = np.argmax(scores)
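+        # force the anchor with the highest IoU to be matched to this gt box,
+        # so every ground truth box gets at least one anchor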
+        scores[idx] = 2.0
+        mask = (scores > matching_threshold)
+        mask = mask & (scores > pre_scores)
+        pre_scores = np.maximum(pre_scores, scores * mask)
+        t_label = mask * label + (1 - mask) * t_label
+        for i in range(4):
+            t_boxes[:, i] = mask * bbox[i] + (1 - mask) * t_boxes[:, i]
+
+    index = np.nonzero(t_label)
+
+    # Transform from tlbr corners to center form [cy, cx, h, w].
+    bboxes = np.zeros((config.num_ssd_boxes, 4), dtype=np.float32)
+    bboxes[:, [0, 1]] = (t_boxes[:, [0, 1]] + t_boxes[:, [2, 3]]) / 2
+    bboxes[:, [2, 3]] = t_boxes[:, [2, 3]] - t_boxes[:, [0, 1]]
+
+    # Encode features.
+    bboxes_t = bboxes[index]
+    default_boxes_t = default_boxes[index]
+    bboxes_t[:, :2] = (bboxes_t[:, :2] - default_boxes_t[:, :2]) / (default_boxes_t[:, 2:] * config.prior_scaling[0])
+    tmp = np.maximum(bboxes_t[:, 2:4] / default_boxes_t[:, 2:4], 0.000001)
+    bboxes_t[:, 2:4] = np.log(tmp) / config.prior_scaling[1]
+    bboxes[index] = bboxes_t
+
+    num_match = np.array([len(np.nonzero(t_label)[0])], dtype=np.int32)
+    return bboxes, t_label.astype(np.int32), num_match
+
+
+def refinedet_bboxes_decode(boxes):
+    """Decode predict boxes to [y, x, h, w]"""
+    boxes_t = boxes.copy()
+    default_boxes_t = default_boxes.copy()
+    boxes_t[:, :2] = boxes_t[:, :2] * box_config.prior_scaling[0] * default_boxes_t[:, 2:] + default_boxes_t[:, :2]
+    boxes_t[:, 2:4] = np.exp(boxes_t[:, 2:4] * box_config.prior_scaling[1]) * default_boxes_t[:, 2:4]
+
+    bboxes = np.zeros((len(boxes_t), 4), dtype=np.float32)
+
+    bboxes[:, [0, 1]] = boxes_t[:, [0, 1]] - boxes_t[:, [2, 3]] / 2
+    bboxes[:, [2, 3]] = boxes_t[:, [0, 1]] + boxes_t[:, [2, 3]] / 2
+
+    return np.clip(bboxes, 0, 1)
+
+
+def intersect(box_a, box_b):
+    """Compute the intersect of two sets of boxes."""
+    max_yx = np.minimum(box_a[:, 2:4], box_b[2:4])
+    min_yx = np.maximum(box_a[:, :2], box_b[:2])
+    inter = np.clip((max_yx - min_yx), a_min=0, a_max=np.inf)
+    return inter[:, 0] * inter[:, 1]
+
+
+def jaccard_numpy(box_a, box_b):
+    """Compute the jaccard overlap of two sets of boxes."""
+    inter = intersect(box_a, box_b)
+    area_a = ((box_a[:, 2] - box_a[:, 0]) * (box_a[:, 3] - box_a[:, 1]))
+    area_b = ((box_b[2] - box_b[0]) * (box_b[3] - box_b[1]))
+    union = area_a + area_b - inter
+    return inter / union
diff --git a/research/cv/RefineDet/src/config.py b/research/cv/RefineDet/src/config.py
new file mode 100644
index 0000000000000000000000000000000000000000..3f7a3d3452a524a3afe3e475b0b87175b4501a5a
--- /dev/null
+++ b/research/cv/RefineDet/src/config.py
@@ -0,0 +1,59 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Config parameters for RefineDet models."""
+
+from .config_vgg16 import config_320 as config_vgg16_320
+from .config_vgg16 import config_512 as config_vgg16_512
+from .config_resnet101 import config_320 as config_resnet_320
+from .config_resnet101 import config_512 as config_resnet_512
+
+config = None
+
+config_map = {
+    "refinedet_vgg16_320": config_vgg16_320,
+    "refinedet_vgg16_512": config_vgg16_512,
+    "refinedet_resnet101_320": config_resnet_320,
+    "refinedet_resnet101_512": config_resnet_512,
+}
+
+def get_config(using_model="refinedet_vgg16_320", using_dataset="voc_test"):
+    """init config according to args"""
+    global config
+    if config is not None:
+        return config
+    config = config_map[using_model]
+    if using_dataset == "voc0712":
+        config.voc_root = config.voc0712_root
+        config.num_classes = config.voc_num_classes
+        config.classes = config.voc_classes
+    elif using_dataset == "voc0712plus":
+        config.voc_root = config.voc0712plus_root
+        config.num_classes = config.voc_num_classes
+        config.classes = config.voc_classes
+    elif using_dataset == "voc_test":
+        config.voc_root = config.voc_test_root
+        config.num_classes = config.voc_num_classes
+        config.classes = config.voc_classes
+    elif using_dataset == "coco":
+        config.num_classes = config.coco_num_classes
+        config.classes = config.coco_classes
+    # calculate the boxes number
+    if config.num_ssd_boxes == -1:
+        num = 0
+        h, w = config.img_shape
+        for i in range(len(config.steps)):
+            num += (h // config.steps[i]) * (w // config.steps[i]) * config.num_default[i]
+        config.num_ssd_boxes = num
+    return config
diff --git a/research/cv/RefineDet/src/config_resnet101.py b/research/cv/RefineDet/src/config_resnet101.py
new file mode 100644
index 0000000000000000000000000000000000000000..53d15193fcf0959d0f8cb1df68736344b5df2588
--- /dev/null
+++ b/research/cv/RefineDet/src/config_resnet101.py
@@ -0,0 +1,172 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Basic config parameters for RefineDet models."""
+from easydict import EasyDict as ed
+
+config_320 = ed({
+    "model": "refinedet_resnet101",
+    "img_shape": [320, 320],
+    "num_ssd_boxes": -1,
+    "match_threshold": 0.5,
+    "nms_threshold": 0.6,
+    "min_score": 0.1,
+    "max_boxes": 100,
+
+    # learning rate settings
+    "lr_init": 0.001,
+    "lr_end_rate": 0.001,
+    "warmup_epochs": 2,
+    "momentum": 0.9,
+    "weight_decay": 1.5e-4,
+
+    # network
+    # vgg16 config
+    "num_default": [3, 3, 3, 3],
+    "extra_arm_channels": [512, 1024, 2048, 512],
+    "extra_odm_channels": [256, 256, 256, 256],
+    "L2normalizations": [10, 8, -1, -1],
+    "arm_source": ["b4", "b5", "fc7", "b6_2"], # four source layers, last one is the end of backbone
+
+    # box utils config
+    "feature_size": [40, 20, 10, 5],
+    "min_scale": 0.2,
+    "max_scale": 0.95,
+    "aspect_ratios": [(), (2,), (2,), (2,)],
+    "steps": (8, 16, 32, 64),
+    "prior_scaling": (0.1, 0.2),
+    "gamma": 2.0,
+    "alpha": 0.75,
+
+    # `mindrecord_dir` and `coco_root` are better to use absolute path.
+    "feature_extractor_base_param": "",
+    "pretrain_vgg_bn": False,
+    "checkpoint_filter_list": ['multi_loc_layers', 'multi_cls_layers'],
+    "mindrecord_dir": "./data/MindRecord",
+    "coco_root": "./data/COCO2017",
+    "train_data_type": "train2017",
+    # The annotation.json position of voc validation dataset.
+    "voc_json": "annotations/voc_instances_val.json",
+    # voc original dataset.
+    "voc_root": "",
+    "voc_test_root": "./data/voc_test",
+    "voc0712_root": "./data/VOC0712",
+    "voc0712plus_root": "./data/VOC0712Plus",
+    # if coco or voc used, `image_dir` and `anno_path` are useless.
+    "image_dir": "",
+    "anno_path": "",
+    "val_data_type": "val2017",
+    "instances_set": "annotations/instances_{}.json",
+    "voc_classes": ('background', 'aeroplane', 'bicycle', 'bird',
+                    'boat', 'bottle', 'bus', 'car', 'cat', 'chair',
+                    'cow', 'diningtable', 'dog', 'horse', 'motorbike',
+                    'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'),
+    "voc_num_classes": 21,
+    "coco_classes": ('background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
+                     'train', 'truck', 'boat', 'traffic light', 'fire hydrant',
+                     'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',
+                     'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
+                     'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
+                     'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
+                     'kite', 'baseball bat', 'baseball glove', 'skateboard',
+                     'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
+                     'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
+                     'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
+                     'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
+                     'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
+                     'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink',
+                     'refrigerator', 'book', 'clock', 'vase', 'scissors',
+                     'teddy bear', 'hair drier', 'toothbrush'),
+    "coco_num_classes": 81,
+    "classes": (),
+    "num_classes": ()
+})
+
+config_512 = ed({
+    "model": "refinedet_resnet101",
+    "img_shape": [512, 512],
+    "num_ssd_boxes": -1,
+    "match_threshold": 0.5,
+    "nms_threshold": 0.6,
+    "min_score": 0.1,
+    "max_boxes": 100,
+
+    # learning rate settings
+    "lr_init": 0.001,
+    "lr_end_rate": 0.001,
+    "warmup_epochs": 2,
+    "momentum": 0.9,
+    "weight_decay": 1.5e-4,
+
+    # network
+    # vgg16 config
+    "num_default": [3, 3, 3, 3],
+    "extra_arm_channels": [512, 1024, 2048, 512],
+    "extra_odm_channels": [256, 256, 256, 256],
+    "L2normalizations": [10, 8, -1, -1],
+    "arm_source": ["b4", "b5", "fc7", "b6_2"], # four source layers, last one is the end of backbone
+
+    # box utils config
+    "feature_size": [64, 32, 16, 8],
+    "min_scale": 0.2,
+    "max_scale": 0.95,
+    "aspect_ratios": [(), (2,), (2,), (2,)],
+    "steps": (8, 16, 32, 64),
+    "prior_scaling": (0.1, 0.2),
+    "gamma": 2.0,
+    "alpha": 0.75,
+
+    # `mindrecord_dir` and `coco_root` are better to use absolute path.
+    "feature_extractor_base_param": "",
+    "pretrain_vgg_bn": False,
+    "checkpoint_filter_list": ['multi_loc_layers', 'multi_cls_layers'],
+    "mindrecord_dir": "./data/MindRecord",
+    "coco_root": "./data/COCO2017",
+    "train_data_type": "train2017",
+    # The annotation.json position of voc validation dataset.
+    "voc_json": "annotations/voc_instances_val.json",
+    # voc original dataset.
+    "voc_root": "",
+    "voc_test_root": "./data/voc_test",
+    "voc0712_root": "./data/VOC0712",
+    "voc0712plus_root": "./data/VOC0712Plus",
+    # if coco or voc used, `image_dir` and `anno_path` are useless.
+    "image_dir": "",
+    "anno_path": "",
+    "val_data_type": "val2017",
+    "instances_set": "annotations/instances_{}.json",
+    "voc_classes": ('background', 'aeroplane', 'bicycle', 'bird',
+                    'boat', 'bottle', 'bus', 'car', 'cat', 'chair',
+                    'cow', 'diningtable', 'dog', 'horse', 'motorbike',
+                    'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'),
+    "voc_num_classes": 21,
+    "coco_classes": ('background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
+                     'train', 'truck', 'boat', 'traffic light', 'fire hydrant',
+                     'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',
+                     'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
+                     'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
+                     'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
+                     'kite', 'baseball bat', 'baseball glove', 'skateboard',
+                     'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
+                     'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
+                     'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
+                     'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
+                     'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
+                     'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink',
+                     'refrigerator', 'book', 'clock', 'vase', 'scissors',
+                     'teddy bear', 'hair drier', 'toothbrush'),
+    "coco_num_classes": 81,
+    "classes": (),
+    "num_classes": ()
+})
diff --git a/research/cv/RefineDet/src/config_vgg16.py b/research/cv/RefineDet/src/config_vgg16.py
new file mode 100644
index 0000000000000000000000000000000000000000..76bf06556ef256976495ddbd40b014e75bce5cb2
--- /dev/null
+++ b/research/cv/RefineDet/src/config_vgg16.py
@@ -0,0 +1,174 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Basic config parameters for RefineDet models."""
+from easydict import EasyDict as ed
+
+config_320 = ed({
+    "model": "refinedet_vgg16",
+    "img_shape": [320, 320],
+    "num_ssd_boxes": -1,
+    "match_threshold": 0.5,
+    "nms_threshold": 0.6,
+    "min_score": 0.1,
+    "max_boxes": 100,
+    "objectness_thre": 0.01,
+
+    # learning rate settings
+    "lr_init": 0.001,
+    "lr_end_rate": 0.001,
+    "warmup_epochs": 2,
+    "momentum": 0.9,
+    "weight_decay": 1.5e-4,
+
+    # network
+    # vgg16 config
+    "num_default": [3, 3, 3, 3],
+    "extra_arm_channels": [512, 512, 1024, 512],
+    "extra_odm_channels": [256, 256, 256, 256],
+    "L2normalizations": [10, 8, -1, -1],
+    "arm_source": ["b4", "b5", "fc7", "b6_2"], # four source layers
+
+    # box utils config
+    "feature_size": [40, 20, 10, 5],
+    "min_scale": 0.2,
+    "max_scale": 0.95,
+    "aspect_ratios": [(), (2,), (2,), (2,)],
+    "steps": (8, 16, 32, 64),
+    "prior_scaling": (0.1, 0.2),
+    "gamma": 2.0,
+    "alpha": 0.75,
+
+    # `mindrecord_dir` and `coco_root` are best given as absolute paths.
+    "feature_extractor_base_param": "",
+    "pretrain_vgg_bn": False,
+    "checkpoint_filter_list": ['multi_loc_layers', 'multi_cls_layers'],
+    "mindrecord_dir": "./data/MindRecord",
+    "coco_root": "./data/COCO2017",
+    "train_data_type": "train2017",
+    # The annotation.json position of voc validation dataset.
+    "voc_json": "annotations/voc_instances_val.json",
+    # voc original dataset.
+    "voc_root": "",
+    "voc_test_root": "./data/voc_test",
+    "voc0712_root": "./data/VOC0712",
+    "voc0712plus_root": "./data/VOC0712Plus",
+    # if coco or voc used, `image_dir` and `anno_path` are useless.
+    "image_dir": "",
+    "anno_path": "",
+    "val_data_type": "val2017",
+    "instances_set": "annotations/instances_{}.json",
+    "voc_classes": ('background', 'aeroplane', 'bicycle', 'bird',
+                    'boat', 'bottle', 'bus', 'car', 'cat', 'chair',
+                    'cow', 'diningtable', 'dog', 'horse', 'motorbike',
+                    'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'),
+    "voc_num_classes": 21,
+    "coco_classes": ('background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
+                     'train', 'truck', 'boat', 'traffic light', 'fire hydrant',
+                     'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',
+                     'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
+                     'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
+                     'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
+                     'kite', 'baseball bat', 'baseball glove', 'skateboard',
+                     'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
+                     'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
+                     'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
+                     'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
+                     'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
+                     'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink',
+                     'refrigerator', 'book', 'clock', 'vase', 'scissors',
+                     'teddy bear', 'hair drier', 'toothbrush'),
+    "coco_num_classes": 81,
+    "classes": (),
+    "num_classes": ()
+})
+
+config_512 = ed({
+    "model": "refinedet_vgg16",
+    "img_shape": [512, 512],
+    "num_ssd_boxes": -1,
+    "match_threshold": 0.5,
+    "nms_threshold": 0.6,
+    "min_score": 0.1,
+    "max_boxes": 100,
+    "objectness_thre": 0.01,
+
+    # learning rate settings
+    "lr_init": 0.001,
+    "lr_end_rate": 0.001,
+    "warmup_epochs": 2,
+    "momentum": 0.9,
+    "weight_decay": 1.5e-4,
+
+    # network
+    # vgg16 config
+    "num_default": [3, 3, 3, 3],
+    "extra_arm_channels": [512, 512, 1024, 512],
+    "extra_odm_channels": [256, 256, 256, 256],
+    "L2normalizations": [10, 8, -1, -1],
+    "arm_source": ["b4", "b5", "fc7", "b6_2"], # four source layers, last one is the end of backbone
+
+    # box utils config
+    "feature_size": [64, 32, 16, 8],
+    "min_scale": 0.2,
+    "max_scale": 0.95,
+    "aspect_ratios": [(), (2,), (2,), (2,)],
+    "steps": (8, 16, 32, 64),
+    "prior_scaling": (0.1, 0.2),
+    "gamma": 2.0,
+    "alpha": 0.75,
+
+    # `mindrecord_dir` and `coco_root` are best given as absolute paths.
+    "feature_extractor_base_param": "",
+    "pretrain_vgg_bn": False,
+    "checkpoint_filter_list": ['multi_loc_layers', 'multi_cls_layers'],
+    "mindrecord_dir": "./data/MindRecord",
+    "coco_root": "./data/COCO2017",
+    "train_data_type": "train2017",
+    # The annotation.json position of voc validation dataset.
+    "voc_json": "annotations/voc_instances_val.json",
+    # voc original dataset.
+    "voc_root": "",
+    "voc_test_root": "./data/voc_test",
+    "voc0712_root": "./data/VOC0712",
+    "voc0712plus_root": "./data/VOC0712Plus",
+    # if coco or voc used, `image_dir` and `anno_path` are useless.
+    "image_dir": "",
+    "anno_path": "",
+    "val_data_type": "val2017",
+    "instances_set": "annotations/instances_{}.json",
+    "voc_classes": ('background', 'aeroplane', 'bicycle', 'bird',
+                    'boat', 'bottle', 'bus', 'car', 'cat', 'chair',
+                    'cow', 'diningtable', 'dog', 'horse', 'motorbike',
+                    'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'),
+    "voc_num_classes": 21,
+    "coco_classes": ('background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
+                     'train', 'truck', 'boat', 'traffic light', 'fire hydrant',
+                     'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',
+                     'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
+                     'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
+                     'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
+                     'kite', 'baseball bat', 'baseball glove', 'skateboard',
+                     'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
+                     'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
+                     'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
+                     'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
+                     'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
+                     'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink',
+                     'refrigerator', 'book', 'clock', 'vase', 'scissors',
+                     'teddy bear', 'hair drier', 'toothbrush'),
+    "coco_num_classes": 81,
+    "classes": (),
+    "num_classes": ()
+})
diff --git a/research/cv/RefineDet/src/dataset.py b/research/cv/RefineDet/src/dataset.py
new file mode 100644
index 0000000000000000000000000000000000000000..9a400aa3f9ac24198c8d6c4ecd2c90950b93791b
--- /dev/null
+++ b/research/cv/RefineDet/src/dataset.py
@@ -0,0 +1,475 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Create RefineDet dataset"""
+
+from __future__ import division
+
+import os
+import json
+import xml.etree.ElementTree as et
+import numpy as np
+import cv2
+
+import mindspore.dataset as de
+import mindspore.dataset.vision.c_transforms as C
+from mindspore.mindrecord import FileWriter
+from .box_utils import jaccard_numpy, refinedet_bboxes_encode, box_init
+
+def _rand(a=0., b=1.):
+    """Generate random."""
+    return np.random.rand() * (b - a) + a
+
+
+def get_imageId_from_fileName(filename, id_iter):
+    """Get imageID from fileName if fileName is int, else return id_iter."""
+    filename = os.path.splitext(filename)[0]
+    if filename.isdigit():
+        return int(filename)
+    return id_iter
+
+
+def random_sample_crop(image, boxes):
+    """Random Crop the image and boxes"""
+    height, width, _ = image.shape
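+    # Randomly choose a minimum-IoU constraint for the crop; None means return the image and boxes unchanged.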
+    min_iou = np.random.choice([None, 0.1, 0.3, 0.5, 0.7, 0.9])
+
+    if min_iou is None:
+        return image, boxes
+
+    # at most 50 trials
+    for _ in range(50):
+        image_t = image
+
+        w = _rand(0.3, 1.0) * width
+        h = _rand(0.3, 1.0) * height
+
+        # aspect ratio constraint b/t .5 & 2
+        if h / w < 0.5 or h / w > 2:
+            continue
+
+        left = _rand() * (width - w)
+        top = _rand() * (height - h)
+
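+        # crop rectangle in [ymin, xmin, ymax, xmax] order, matching the annotation box layout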
+        rect = np.array([int(top), int(left), int(top + h), int(left + w)])
+        overlap = jaccard_numpy(boxes, rect)
+
+        # drop boxes that have no overlap with the crop
+        drop_mask = overlap > 0
+        if not drop_mask.any():
+            continue
+
+        if overlap[drop_mask].min() < min_iou and overlap[drop_mask].max() > (min_iou + 0.2):
+            continue
+
+        image_t = image_t[rect[0]:rect[2], rect[1]:rect[3], :]
+
+        centers = (boxes[:, :2] + boxes[:, 2:4]) / 2.0
+
+        m1 = (rect[0] < centers[:, 0]) * (rect[1] < centers[:, 1])
+        m2 = (rect[2] > centers[:, 0]) * (rect[3] > centers[:, 1])
+
+        # keep boxes whose centers fall inside the crop (both m1 and m2 true) and that overlap it
+        mask = m1 * m2 * drop_mask
+
+        # have any valid boxes?  try again if not
+        if not mask.any():
+            continue
+
+        # take only matching gt boxes
+        boxes_t = boxes[mask, :].copy()
+
+        boxes_t[:, :2] = np.maximum(boxes_t[:, :2], rect[:2])
+        boxes_t[:, :2] -= rect[:2]
+        boxes_t[:, 2:4] = np.minimum(boxes_t[:, 2:4], rect[2:4])
+        boxes_t[:, 2:4] -= rect[:2]
+
+        return image_t, boxes_t
+    return image, boxes
+
+
+def preprocess_fn(config, img_id, image, box, is_training):
+    """Preprocess function for dataset."""
+    cv2.setNumThreads(2)
+
+    def _infer_data(image, input_shape):
+        img_h, img_w, _ = image.shape
+        input_h, input_w = input_shape
+
+        image = cv2.resize(image, (input_w, input_h))
+
+        # When the image is single-channel (grayscale), expand it to three channels
+        if len(image.shape) == 2:
+            image = np.expand_dims(image, axis=-1)
+            image = np.concatenate([image, image, image], axis=-1)
+
+        return img_id, image, np.array((img_h, img_w), np.float32)
+
+    def _data_aug(image, box, is_training, image_size=(300, 300)):
+        """Data augmentation function."""
+        ih, iw, _ = image.shape
+        h, w = image_size
+
+        if not is_training:
+            return _infer_data(image, image_size)
+
+        # Random crop
+        box = box.astype(np.float32)
+        image, box = random_sample_crop(image, box)
+        ih, iw, _ = image.shape
+
+        # Resize image
+        image = cv2.resize(image, (w, h))
+
+        # Flip image or not
+        flip = _rand() < .5
+        if flip:
+            image = cv2.flip(image, 1, dst=None)
+
+        # When the image is single-channel (grayscale), expand it to three channels
+        if len(image.shape) == 2:
+            image = np.expand_dims(image, axis=-1)
+            image = np.concatenate([image, image, image], axis=-1)
+
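+        # boxes are [ymin, xmin, ymax, xmax, class]: normalize y by image height and x by image width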
+        box[:, [0, 2]] = box[:, [0, 2]] / ih
+        box[:, [1, 3]] = box[:, [1, 3]] / iw
+
+        if flip:
+            box[:, [1, 3]] = 1 - box[:, [3, 1]]
+
+        box, label, num_match = refinedet_bboxes_encode(config, box)
+        return image, box, label, num_match
+
+    return _data_aug(image, box, is_training, image_size=config.img_shape)
+
+
+def create_voc_label(config, is_training):
+    """Get image path and annotation from VOC."""
+    print("Create VOC label")
+    voc_root = config.voc_root
+    cls_map = {name: i for i, name in enumerate(config.classes)}
+    sub_dir = 'train' if is_training else 'eval'
+    voc_dir = os.path.join(voc_root, sub_dir)
+    if not os.path.isdir(voc_dir):
+        raise ValueError(f'Cannot find {sub_dir} dataset path.')
+
+    image_dir = anno_dir = voc_dir
+    if os.path.isdir(os.path.join(voc_dir, 'Images')):
+        image_dir = os.path.join(voc_dir, 'Images')
+    if os.path.isdir(os.path.join(voc_dir, 'Annotations')):
+        anno_dir = os.path.join(voc_dir, 'Annotations')
+
+    if not is_training:
+        json_file = os.path.join(config.voc_root, config.voc_json)
+        file_dir = os.path.split(json_file)[0]
+        if not os.path.isdir(file_dir):
+            os.makedirs(file_dir)
+        json_dict = {"images": [], "type": "instances", "annotations": [],
+                     "categories": []}
+        bnd_id = 1
+
+    image_files_dict = {}
+    image_anno_dict = {}
+    images = []
+    id_iter = 0
+    for anno_file in os.listdir(anno_dir):
+        if not anno_file.endswith('xml'):
+            continue
+        tree = et.parse(os.path.join(anno_dir, anno_file))
+        root_node = tree.getroot()
+        folder = root_node.find('folder').text
+        file_name = root_node.find('filename').text
+        img_id = get_imageId_from_fileName(file_name, id_iter)
+        id_iter += 1
+        image_path = os.path.join(image_dir, folder + '_' + file_name)
+        if not os.path.isfile(image_path):
+            print(f'Cannot find image {file_name} according to annotations.')
+            continue
+
+        labels = []
+        for obj in root_node.iter('object'):  # difficult-object filtering is disabled (see commented lines below)
+            cls_name = obj.find('name').text
+            #difficult = int(obj.find('difficult').text)
+            #if difficult > 0:
+            #    continue
+            if cls_name not in cls_map:
+                print(f'Label "{cls_name}" not in "{config.classes}"')
+                continue
+            bnd_box = obj.find('bndbox')
+            x_min = int(float(bnd_box.find('xmin').text)) - 1
+            y_min = int(float(bnd_box.find('ymin').text)) - 1
+            x_max = int(float(bnd_box.find('xmax').text)) - 1
+            y_max = int(float(bnd_box.find('ymax').text)) - 1
+            labels.append([y_min, x_min, y_max, x_max, cls_map[cls_name]])
+
+            if not is_training:
+                o_width = abs(x_max - x_min)
+                o_height = abs(y_max - y_min)
+                ann = {'area': o_width * o_height, 'iscrowd': 0, 'image_id': \
+                    img_id, 'bbox': [x_min, y_min, o_width, o_height], \
+                       'category_id': cls_map[cls_name], 'id': bnd_id, \
+                       'ignore': 0, \
+                       'segmentation': []}
+                json_dict['annotations'].append(ann)
+                bnd_id = bnd_id + 1
+
+        if labels:
+            images.append(img_id)
+            image_files_dict[img_id] = image_path
+            image_anno_dict[img_id] = np.array(labels)
+
+        if not is_training:
+            size = root_node.find("size")
+            width = int(size.find('width').text)
+            height = int(size.find('height').text)
+            image = {'file_name': file_name, 'height': height, 'width': width,
+                     'id': img_id}
+            json_dict['images'].append(image)
+
+    if not is_training:
+        for cls_name, cid in cls_map.items():
+            cat = {'supercategory': 'none', 'id': cid, 'name': cls_name}
+            json_dict['categories'].append(cat)
+        json_fp = open(json_file, 'w')
+        json_str = json.dumps(json_dict)
+        json_fp.write(json_str)
+        json_fp.close()
+
+    return images, image_files_dict, image_anno_dict
+
+
+def create_coco_label(config, is_training):
+    """Get image path and annotation from COCO."""
+    print("Create COCO label")
+    from pycocotools.coco import COCO
+
+    coco_root = config.coco_root
+    data_type = config.val_data_type
+    if is_training:
+        data_type = config.train_data_type
+
+    # Classes need to train or test.
+    train_cls = config.classes
+    train_cls_dict = {}
+    for i, cls in enumerate(train_cls):
+        train_cls_dict[cls] = i
+
+    anno_json = os.path.join(coco_root, config.instances_set.format(data_type))
+
+    coco = COCO(anno_json)
+    classs_dict = {}
+    cat_ids = coco.loadCats(coco.getCatIds())
+    for cat in cat_ids:
+        classs_dict[cat["id"]] = cat["name"]
+
+    image_ids = coco.getImgIds()
+    images = []
+    image_path_dict = {}
+    image_anno_dict = {}
+
+    for img_id in image_ids:
+        image_info = coco.loadImgs(img_id)
+        file_name = image_info[0]["file_name"]
+        anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=None)
+        anno = coco.loadAnns(anno_ids)
+        image_path = os.path.join(coco_root, data_type, file_name)
+        annos = []
+        iscrowd = False
+        for label in anno:
+            bbox = label["bbox"]
+            class_name = classs_dict[label["category_id"]]
+            iscrowd = iscrowd or label["iscrowd"]
+            if class_name in train_cls:
+                x_min, x_max = bbox[0], bbox[0] + bbox[2]
+                y_min, y_max = bbox[1], bbox[1] + bbox[3]
+                annos.append(list(map(round, [y_min, x_min, y_max, x_max])) + [train_cls_dict[class_name]])
+
+        if not is_training and iscrowd:
+            continue
+        if len(annos) >= 1:
+            images.append(img_id)
+            image_path_dict[img_id] = image_path
+            image_anno_dict[img_id] = np.array(annos)
+
+    return images, image_path_dict, image_anno_dict
+
+
+def anno_parser(annos_str):
+    """Parse annotation from string to list."""
+    annos = []
+    for anno_str in annos_str:
+        anno = list(map(int, anno_str.strip().split(',')))
+        annos.append(anno)
+    return annos
+
+
+def filter_valid_data(image_dir, anno_path, dataset='coco'):
+    """Filter valid image file, which both in image_dir and anno_path."""
+    images = []
+    image_path_dict = {}
+    image_anno_dict = {}
+    if not os.path.isdir(image_dir):
+        raise RuntimeError("Path given is not valid.")
+    if not os.path.isfile(anno_path):
+        raise RuntimeError("Annotation file is not valid.")
+
+    with open(anno_path, "rb") as f:
+        lines = f.readlines()
+    if dataset == 'coco':
+        for img_id, line in enumerate(lines):
+            line_str = line.decode("utf-8").strip()
+            line_split = str(line_str).split(' ')
+            file_name = line_split[0]
+            image_path = os.path.join(image_dir, file_name)
+            if os.path.isfile(image_path):
+                images.append(img_id)
+                image_path_dict[img_id] = image_path
+                image_anno_dict[img_id] = anno_parser(line_split[1:])
+    else:
+        for line in lines:
+            line_str = line.decode("utf-8").strip()
+            line_split = str(line_str).split(' ')
+            file_name = line_split[0]
+            image_path = os.path.join(image_dir, file_name)
+            if os.path.isfile(image_path):
+                img_id = int(os.path.basename(file_name).split('.')[0])
+                images.append(img_id)
+                image_path_dict[img_id] = image_path
+                image_anno_dict[img_id] = anno_parser(line_split[1:])
+
+    return images, image_path_dict, image_anno_dict
+
+
+def voc_data_to_mindrecord(config, mindrecord_dir, is_training, prefix="refinedet.mindrecord", file_num=8):
+    """Create MindRecord file by image_dir and anno_path."""
+    mindrecord_path = os.path.join(mindrecord_dir, prefix)
+    writer = FileWriter(mindrecord_path, file_num)
+    images, image_path_dict, image_anno_dict = create_voc_label(config, is_training)
+    print("create voc label finished")
+    data_json = {
+        "img_id": {"type": "int32", "shape": [1]},
+        "image": {"type": "bytes"},
+        "annotation": {"type": "int32", "shape": [-1, 5]},
+    }
+    writer.add_schema(data_json, "data_json")
+
+    for img_id in images:
+        image_path = image_path_dict[img_id]
+        with open(image_path, 'rb') as f:
+            img = f.read()
+        annos = np.array(image_anno_dict[img_id], dtype=np.int32)
+        img_id = np.array([img_id], dtype=np.int32)
+        row = {"img_id": img_id, "image": img, "annotation": annos}
+        writer.write_raw_data([row])
+    writer.commit()
+
+
+def data_to_mindrecord_byte_image(config, dataset="coco", is_training=True, prefix="refinedet.mindrecord", file_num=8):
+    """Create MindRecord file."""
+    mindrecord_dir = config.mindrecord_dir
+    mindrecord_path = os.path.join(mindrecord_dir, prefix)
+    writer = FileWriter(mindrecord_path, file_num)
+    if dataset == "coco":
+        images, image_path_dict, image_anno_dict = create_coco_label(config, is_training)
+    else:
+        images, image_path_dict, image_anno_dict = filter_valid_data(config.image_dir,
+                                                                     config.anno_path, dataset=dataset)
+
+    data_json = {
+        "img_id": {"type": "int64", "shape": [1]},
+        "image": {"type": "bytes"},
+        "annotation": {"type": "int32", "shape": [-1, 5]},
+    }
+    writer.add_schema(data_json, "data_json")
+
+    for img_id in images:
+        image_path = image_path_dict[img_id]
+        with open(image_path, 'rb') as f:
+            img = f.read()
+        annos = np.array(image_anno_dict[img_id], dtype=np.int32)
+        img_id = np.array([img_id], dtype=np.int64)
+        row = {"img_id": img_id, "image": img, "annotation": annos}
+        writer.write_raw_data([row])
+    writer.commit()
+
+def create_refinedet_dataset(config, mindrecord_file, batch_size=32, repeat_num=10, device_num=1, rank=0,
+                             is_training=True, num_parallel_workers=6, use_multiprocessing=True):
+    """Create RefineDet dataset with MindDataset."""
+    # initialize box_utils first, because the config cannot be changed once the pipeline is running
+    box_init(config)
+    print("loading dataset to minddataset...")
+    ds = de.MindDataset(mindrecord_file, columns_list=["img_id", "image", "annotation"], num_shards=device_num,
+                        shard_id=rank, num_parallel_workers=num_parallel_workers, shuffle=is_training)
+    decode = C.Decode()
+    ds = ds.map(operations=decode, input_columns=["image"])
+    change_swap_op = C.HWC2CHW()
+    normalize_op = C.Normalize(mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
+                               std=[0.229 * 255, 0.224 * 255, 0.225 * 255])
+    color_adjust_op = C.RandomColorAdjust(brightness=0.4, contrast=0.4, saturation=0.4)
+    compose_map_func = (lambda img_id, image, annotation: preprocess_fn(config, img_id, image, annotation, is_training))
+    if is_training:
+        output_columns = ["image", "box", "label", "num_match"]
+        trans = [color_adjust_op, normalize_op, change_swap_op]
+    else:
+        output_columns = ["img_id", "image", "image_shape"]
+        trans = [normalize_op, change_swap_op]
+    ds = ds.map(operations=compose_map_func, input_columns=["img_id", "image", "annotation"],
+                output_columns=output_columns, column_order=output_columns,
+                python_multiprocessing=use_multiprocessing,
+                num_parallel_workers=num_parallel_workers)
+    ds = ds.map(operations=trans, input_columns=["image"], python_multiprocessing=use_multiprocessing,
+                num_parallel_workers=num_parallel_workers)
+    ds = ds.batch(batch_size, drop_remainder=True)
+    ds = ds.repeat(repeat_num)
+    return ds
+
+
+def create_mindrecord(config, dataset="coco", prefix="refinedet.mindrecord",
+                      is_training=True, file_num=8):
+    """create mindrecord file"""
+    print("Start create dataset!")
+
+    # MindRecord files are generated in config.mindrecord_dir,
+    # with names refinedet.mindrecord0, refinedet.mindrecord1, ..., up to file_num files.
+
+    mindrecord_dir = config.mindrecord_dir
+    num_suffix = "0" if file_num > 1 else ""
+    mindrecord_file = os.path.join(mindrecord_dir, prefix + num_suffix)
+    if not os.path.exists(mindrecord_file):
+        if not os.path.isdir(mindrecord_dir):
+            os.makedirs(mindrecord_dir)
+        if dataset == "coco":
+            if os.path.isdir(config.coco_root):
+                print("Create Mindrecord.")
+                data_to_mindrecord_byte_image(config, "coco", is_training, prefix,
+                                              file_num=file_num)
+                print("Create Mindrecord Done, at {}".format(mindrecord_dir))
+            else:
+                print("coco_root not exits.")
+        elif dataset[:3] == "voc":
+            if os.path.isdir(config.voc_root):
+                print("Create Mindrecord.")
+                voc_data_to_mindrecord(config, mindrecord_dir, is_training, prefix, file_num=file_num)
+                print("Create Mindrecord Done, at {}".format(mindrecord_dir))
+            else:
+                print("voc_root not exits.")
+        else:
+            if os.path.isdir(config.image_dir) and os.path.exists(config.anno_path):
+                print("Create Mindrecord.")
+                data_to_mindrecord_byte_image(config, "other", is_training, prefix,
+                                              file_num=file_num)
+                print("Create Mindrecord Done, at {}".format(mindrecord_dir))
+            else:
+                print("image_dir or anno_path not exits.")
+    return mindrecord_file
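+
+
+# Example usage (hypothetical config and paths): build the MindRecord file first, then the dataset.
+#   mindrecord_file = create_mindrecord(config, dataset="coco", prefix="refinedet.mindrecord", is_training=True)
+#   ds = create_refinedet_dataset(config, mindrecord_file, batch_size=32, repeat_num=1, is_training=True)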
diff --git a/research/cv/RefineDet/src/eval_utils.py b/research/cv/RefineDet/src/eval_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..69fa941a7a1ecc7cd969d49e155d44185f247e7d
--- /dev/null
+++ b/research/cv/RefineDet/src/eval_utils.py
@@ -0,0 +1,359 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""eval metrics utils"""
+
+import json
+import xml.etree.ElementTree as et
+import os
+import numpy as np
+
+def apply_nms(all_boxes, all_scores, thres, max_boxes):
+    """Apply NMS to bboxes."""
+    y1 = all_boxes[:, 0]
+    x1 = all_boxes[:, 1]
+    y2 = all_boxes[:, 2]
+    x2 = all_boxes[:, 3]
+    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
+
+    order = all_scores.argsort()[::-1]
+    keep = []
+
+    while order.size > 0:
+        i = order[0]
+        keep.append(i)
+
+        if len(keep) >= max_boxes:
+            break
+
+        xx1 = np.maximum(x1[i], x1[order[1:]])
+        yy1 = np.maximum(y1[i], y1[order[1:]])
+        xx2 = np.minimum(x2[i], x2[order[1:]])
+        yy2 = np.minimum(y2[i], y2[order[1:]])
+
+        w = np.maximum(0.0, xx2 - xx1 + 1)
+        h = np.maximum(0.0, yy2 - yy1 + 1)
+        inter = w * h
+
+        ovr = inter / (areas[i] + areas[order[1:]] - inter)
+
+        inds = np.where(ovr <= thres)[0]
+
+        order = order[inds + 1]
+    return keep
+
+
+def coco_metrics(pred_data, anno_json, config):
+    """Calculate mAP of predicted bboxes."""
+    from pycocotools.coco import COCO
+    from pycocotools.cocoeval import COCOeval
+    num_classes = config.num_classes
+
+    #Classes need to train or test.
+    val_cls = config.classes
+    val_cls_dict = {}
+    for i, cls in enumerate(val_cls):
+        val_cls_dict[i] = cls
+    coco_gt = COCO(anno_json)
+    classs_dict = {}
+    cat_ids = coco_gt.loadCats(coco_gt.getCatIds())
+    for cat in cat_ids:
+        classs_dict[cat["name"]] = cat["id"]
+
+    predictions = []
+    img_ids = []
+
+    for sample in pred_data:
+        pred_boxes = sample['boxes']
+        box_scores = sample['box_scores']
+        img_id = sample['img_id']
+        h, w = sample['image_shape']
+
+        final_boxes = []
+        final_label = []
+        final_score = []
+        img_ids.append(img_id)
+
+        for c in range(1, num_classes):
+            class_box_scores = box_scores[:, c]
+            score_mask = class_box_scores > config.min_score
+            class_box_scores = class_box_scores[score_mask]
+            class_boxes = pred_boxes[score_mask] * [h, w, h, w]
+
+            if score_mask.any():
+                nms_index = apply_nms(class_boxes, class_box_scores, config.nms_threshold, config.max_boxes)
+                class_boxes = class_boxes[nms_index]
+                class_box_scores = class_box_scores[nms_index]
+
+                final_boxes += class_boxes.tolist()
+                final_score += class_box_scores.tolist()
+                final_label += [classs_dict[val_cls_dict[c]]] * len(class_box_scores)
+
+        for loc, label, score in zip(final_boxes, final_label, final_score):
+            res = {}
+            res['image_id'] = img_id
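+            # loc is [ymin, xmin, ymax, xmax]; COCO expects bbox as [x, y, width, height]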
+            res['bbox'] = [loc[1], loc[0], loc[3] - loc[1], loc[2] - loc[0]]
+            res['score'] = score
+            res['category_id'] = label
+            predictions.append(res)
+    if not os.path.exists('./eval_out'):
+        os.makedirs('./eval_out')
+    with open('./eval_out/predictions.json', 'w') as f:
+        json.dump(predictions, f)
+
+    coco_dt = coco_gt.loadRes('./eval_out/predictions.json')
+    E = COCOeval(coco_gt, coco_dt, iouType='bbox')
+    E.params.imgIds = img_ids
+    E.evaluate()
+    E.accumulate()
+    E.summarize()
+    return E.stats[0]
+
+
+def parse_rec(filename):
+    """ Parse a PASCAL VOC xml file """
+    tree = et.parse(filename)
+    objects = []
+    for obj in tree.findall('object'):
+        obj_struct = {}
+        obj_struct['name'] = obj.find('name').text
+        obj_struct['pose'] = obj.find('pose').text
+        obj_struct['truncated'] = int(obj.find('truncated').text)
+        obj_struct['difficult'] = int(obj.find('difficult').text)
+        bbox = obj.find('bndbox')
+        obj_struct['bbox'] = [int(bbox.find('xmin').text) - 1,
+                              int(bbox.find('ymin').text) - 1,
+                              int(bbox.find('xmax').text) - 1,
+                              int(bbox.find('ymax').text) - 1]
+        objects.append(obj_struct)
+
+    return objects
+
+def voc_metrics(pred_data, annojson, config, use_07=True):
+    """calc voc ap"""
+    # The PASCAL VOC metric changed in 2010
+    use_07_metric = use_07
+    print('VOC07 metric? ' + ('Yes' if use_07_metric else 'No'))
+    aps = voc_eval(pred_data, config, ovthresh=0.5, use_07_metric=use_07_metric)
+    print('Mean AP = {:.4f}'.format(np.mean(aps)))
+    print('~~~~~~~~')
+    print('Results:')
+    for ap in aps:
+        print('{:.3f}'.format(ap))
+    print('{:.3f}'.format(np.mean(aps)))
+    print('~~~~~~~~')
+    print('')
+    print('--------------------------------------------------------------')
+    print('Results computed with the **unofficial** Python eval code.')
+    print('Results should be very close to the official MATLAB eval code.')
+    print('--------------------------------------------------------------')
+    return np.mean(aps)
+
+
+def voc_ap(rec, prec, use_07_metric=True):
+    """ ap = voc_ap(rec, prec, [use_07_metric])
+    Compute VOC AP given precision and recall.
+    If use_07_metric is true, uses the
+    VOC 07 11 point method (default:True).
+    """
+    if use_07_metric:
+        # 11 point metric
+        ap = 0.
+        for t in np.arange(0., 1.1, 0.1):
+            if np.sum(rec >= t) == 0:
+                p = 0
+            else:
+                p = np.max(prec[rec >= t])
+            ap = ap + p / 11.
+    else:
+        # correct AP calculation
+        # first append sentinel values at the end
+        mrec = np.concatenate(([0.], rec, [1.]))
+        mpre = np.concatenate(([0.], prec, [0.]))
+
+        # compute the precision envelope
+        for i in range(mpre.size - 1, 0, -1):
+            mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
+
+        # to calculate area under PR curve, look for points
+        # where X axis (recall) changes value
+        i = np.where(mrec[1:] != mrec[:-1])[0]
+
+        # and sum (\Delta recall) * prec
+        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
+    return ap
+
+
+def voc_pred_process(pred_data, val_cls, recs, config, imagenames):
+    """Process prediction data for VOC evaluation."""
+    num_classes = config.num_classes
+    cls_img_ids = {}
+    cls_bboxes = {}
+    cls_scores = {}
+    classes = {}
+    cls_npos = {}
+    for cls in val_cls:
+        if cls == 'background':
+            continue
+        class_recs = {}
+        npos = 0
+        for imagename in imagenames:
+            R = [obj for obj in recs[imagename] if obj['name'] == cls]
+            bbox = np.array([x['bbox'] for x in R])
+            difficult = np.array([x['difficult'] for x in R]).astype(bool)
+            det = [False] * len(R)
+            npos = npos + sum(~difficult)
+            class_recs[imagename] = {'bbox': bbox,
+                                     'difficult': difficult,
+                                     'det': det}
+        cls_npos[cls] = npos
+        classes[cls] = class_recs
+        cls_img_ids[cls] = []
+        cls_bboxes[cls] = []
+        cls_scores[cls] = []
+
+    for sample in pred_data:
+        pred_boxes = sample['boxes']
+        box_scores = sample['box_scores']
+        img_id = sample['img_id']
+        h, w = sample['image_shape']
+
+        final_boxes = []
+        final_label = []
+        final_score = []
+
+        for c in range(1, num_classes):
+            class_box_scores = box_scores[:, c]
+            score_mask = class_box_scores > config.min_score
+            class_box_scores = class_box_scores[score_mask]
+            class_boxes = pred_boxes[score_mask] * [h, w, h, w]
+
+            if score_mask.any():
+                nms_index = apply_nms(class_boxes, class_box_scores, config.nms_threshold, config.max_boxes)
+                class_boxes = class_boxes[nms_index]
+                class_box_scores = class_box_scores[nms_index]
+
+                final_boxes += class_boxes.tolist()
+                final_score += class_box_scores.tolist()
+                final_label += [c] * len(class_box_scores)
+
+        for loc, label, score in zip(final_boxes, final_label, final_score):
+            cls_img_ids[val_cls[label]].append(img_id)
+            cls_bboxes[val_cls[label]].append([loc[1], loc[0], loc[3], loc[2]])
+            cls_scores[val_cls[label]].append(score)
+    return classes, cls_img_ids, cls_bboxes, cls_scores, cls_npos
+
+def voc_eval(pred_data, config, ovthresh=0.5, use_07_metric=False):
+    """VOC metric utils"""
+    # first load gt
+    # load annots
+    print("Create VOC label")
+    val_cls = config.classes
+    voc_root = config.voc_root
+    sub_dir = 'eval'
+    voc_dir = os.path.join(voc_root, sub_dir)
+    if not os.path.isdir(voc_dir):
+        raise ValueError(f'Cannot find {sub_dir} dataset path.')
+
+    image_dir = anno_dir = voc_dir
+    if os.path.isdir(os.path.join(voc_dir, 'Images')):
+        image_dir = os.path.join(voc_dir, 'Images')
+    if os.path.isdir(os.path.join(voc_dir, 'Annotations')):
+        anno_dir = os.path.join(voc_dir, 'Annotations')
+    print("finding dir ", image_dir, anno_dir)
+    imagenames = []
+    image_paths = []
+    for anno_file in os.listdir(anno_dir):
+        if not anno_file.endswith('xml'):
+            continue
+        tree = et.parse(os.path.join(anno_dir, anno_file))
+        root_node = tree.getroot()
+        file_name = root_node.find('filename').text
+        imagenames.append(int(file_name[:-4]))
+        image_paths.append(os.path.join(anno_dir, anno_file))
+
+    recs = {}
+    for i, imagename in enumerate(imagenames):
+        recs[imagename] = parse_rec(image_paths[i])
+
+    # extract gt objects per class and process the predictions
+    classes, cls_img_ids, cls_bboxes, cls_scores, cls_npos = voc_pred_process(
+        pred_data, val_cls, recs, config, imagenames)
+    aps = []
+    for cls in val_cls:
+        if cls == 'background':
+            continue
+        npos = cls_npos[cls]
+        class_recs = classes[cls]
+        image_ids = cls_img_ids[cls]
+        confidence = np.array(cls_scores[cls])
+        BB = np.array(cls_bboxes[cls])
+        # sort by confidence
+        sorted_ind = np.argsort(-confidence)
+        #sorted_scores = np.sort(-confidence)
+        BB = BB[sorted_ind, :]
+        image_ids = [image_ids[x] for x in sorted_ind]
+
+        # go down dets and mark TPs and FPs
+        nd = len(image_ids)
+        tp = np.zeros(nd)
+        fp = np.zeros(nd)
+        for d in range(nd):
+            R = class_recs[image_ids[d]]
+            bb = BB[d, :].astype(float)
+            ovmax = -np.inf
+            BBGT = R['bbox'].astype(float)
+            if BBGT.size > 0:
+                # compute overlaps
+                # intersection
+                ixmin = np.maximum(BBGT[:, 0], bb[0])
+                iymin = np.maximum(BBGT[:, 1], bb[1])
+                ixmax = np.minimum(BBGT[:, 2], bb[2])
+                iymax = np.minimum(BBGT[:, 3], bb[3])
+                iw = np.maximum(ixmax - ixmin, 0.)
+                ih = np.maximum(iymax - iymin, 0.)
+                inters = iw * ih
+                uni = ((bb[2] - bb[0]) * (bb[3] - bb[1]) +
+                       (BBGT[:, 2] - BBGT[:, 0]) * (BBGT[:, 3] - BBGT[:, 1]) - inters)
+                overlaps = inters / uni
+                ovmax = np.max(overlaps)
+                jmax = np.argmax(overlaps)
+
+            if ovmax > ovthresh:
+                if not R['difficult'][jmax]:
+                    if not R['det'][jmax]:
+                        tp[d] = 1.
+                        R['det'][jmax] = 1
+                    else:
+                        fp[d] = 1.
+            else:
+                fp[d] = 1.
+
+        # compute precision recall
+        fp = np.cumsum(fp)
+        tp = np.cumsum(tp)
+        #print(npos, nd, fp[-1], tp[-1])
+        rec = tp / float(npos)
+        # avoid divide by zero in case the first detection matches a difficult
+        # ground truth
+        prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
+        ap = voc_ap(rec, prec, use_07_metric)
+        aps.append(ap)
+    return np.array(aps)
diff --git a/research/cv/RefineDet/src/init_params.py b/research/cv/RefineDet/src/init_params.py
new file mode 100644
index 0000000000000000000000000000000000000000..64833e798657d6c11a49f80f62be9f78646174bc
--- /dev/null
+++ b/research/cv/RefineDet/src/init_params.py
@@ -0,0 +1,50 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Parameters utils"""
+
+from mindspore.common.initializer import initializer, TruncatedNormal
+
+def init_net_param(network, initialize_mode='TruncatedNormal'):
+    """Init the parameters in net."""
+    params = network.trainable_params()
+    for p in params:
+        if 'beta' not in p.name and 'gamma' not in p.name and 'bias' not in p.name:
+            if initialize_mode == 'TruncatedNormal':
+                p.set_data(initializer(TruncatedNormal(0.02), p.data.shape, p.data.dtype))
+            else:
+                p.set_data(initializer(initialize_mode, p.data.shape, p.data.dtype))
+
+
+def load_backbone_params(network, param_dict):
+    """Init the parameters from pre-train model, default is mobilenetv2."""
+    for _, param in network.parameters_and_names():
+        param_name = param.name.replace('network.backbone.', '')
+        name_split = param_name.split('.')
+        if 'features_1' in param_name:
+            param_name = param_name.replace('features_1', 'features')
+        if 'features_2' in param_name:
+            param_name = '.'.join(['features', str(int(name_split[1]) + 14)] + name_split[2:])
+        if param_name in param_dict:
+            param.set_data(param_dict[param_name].data)
+
+
+def filter_checkpoint_parameter_by_list(param_dict, filter_list):
+    """remove useless parameters according to filter_list"""
+    for key in list(param_dict.keys()):
+        for name in filter_list:
+            if name in key:
+                print("Delete parameter from checkpoint: ", key)
+                del param_dict[key]
+                break
diff --git a/research/cv/RefineDet/src/l2norm.py b/research/cv/RefineDet/src/l2norm.py
new file mode 100644
index 0000000000000000000000000000000000000000..46ab5c5082ff70c1775a02e01e6b386bc7765e82
--- /dev/null
+++ b/research/cv/RefineDet/src/l2norm.py
@@ -0,0 +1,38 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""L2Normalization for RefineDet"""
+
+import mindspore as ms
+import mindspore.nn as nn
+from mindspore import Tensor
+from mindspore.ops import operations as P
+from mindspore.common.initializer import Constant
+
+class L2Norm(nn.Cell):
+    """L2 Normalization for refinedet"""
+    def __init__(self, n_channels, scale):
+        super(L2Norm, self).__init__()
+        self.n_channels = n_channels
+        self.gamma = scale
+        self.eps = 1e-10
+        self.weight = ms.Parameter(Tensor(shape=self.n_channels, dtype=ms.float32, init=Constant(self.gamma)))
+        self.norm = P.L2Normalize(axis=1, epsilon=self.eps)
+        self.expand_dims = P.ExpandDims()
+
+    def construct(self, x):
+        """construct network"""
+        x = self.norm(x)
+        out = self.expand_dims(self.expand_dims(self.expand_dims(self.weight, 0), 2), 3).expand_as(x) * x
+        return out
diff --git a/research/cv/RefineDet/src/lr_schedule.py b/research/cv/RefineDet/src/lr_schedule.py
new file mode 100644
index 0000000000000000000000000000000000000000..893ccbe2300076d03dd78b04896c4bf4b6e66a42
--- /dev/null
+++ b/research/cv/RefineDet/src/lr_schedule.py
@@ -0,0 +1,55 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Learning rate schedule"""
+
+import math
+import numpy as np
+
+
+def get_lr(global_step, lr_init, lr_end, lr_max, warmup_epochs, total_epochs, steps_per_epoch):
+    """
+    generate learning rate array
+
+    Args:
+       global_step(int): current global step; the returned schedule starts from this step
+       lr_init(float): init learning rate
+       lr_end(float): end learning rate
+       lr_max(float): max learning rate
+       warmup_epochs(float): number of warmup epochs
+       total_epochs(int): total epoch of training
+       steps_per_epoch(int): steps of one epoch
+
+    Returns:
+       np.array, learning rate array
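+
+    Examples (hypothetical values):
+       >>> lr = get_lr(0, 0.001, 0.00005, 0.05, 2, 500, 458)
+       >>> lr.shape
+       (229000,)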
+    """
+    lr_each_step = []
+    total_steps = steps_per_epoch * total_epochs
+    warmup_steps = steps_per_epoch * warmup_epochs
+    for i in range(total_steps):
+        if i < warmup_steps:
+            lr = lr_init + (lr_max - lr_init) * i / warmup_steps
+        else:
+            lr = lr_end + \
+                 (lr_max - lr_end) * \
+                 (1. + math.cos(math.pi * (i - warmup_steps) / (total_steps - warmup_steps))) / 2.
+        if lr < 0.0:
+            lr = 0.0
+        lr_each_step.append(lr)
+
+    current_step = global_step
+    lr_each_step = np.array(lr_each_step).astype(np.float32)
+    learning_rate = lr_each_step[current_step:]
+
+    return learning_rate
diff --git a/research/cv/RefineDet/src/multibox.py b/research/cv/RefineDet/src/multibox.py
new file mode 100644
index 0000000000000000000000000000000000000000..fcf82f28e8bc1797ab03d0479065e44effb5a395
--- /dev/null
+++ b/research/cv/RefineDet/src/multibox.py
@@ -0,0 +1,99 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Multibox layers from SSD"""
+
+import mindspore.nn as nn
+from mindspore.ops import operations as P
+from mindspore.ops import functional as F
+
+
+def _make_divisible(v, divisor, min_value=None):
+    """nsures that all layers have a channel number that is divisible by 8."""
+    if min_value is None:
+        min_value = divisor
+    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
+    # Make sure that round down does not go down by more than 10%.
+    if new_v < 0.9 * v:
+        new_v += divisor
+    return new_v
+
+
+def _bn(channel):
+    return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.97,
+                          gamma_init=1, beta_init=0, moving_mean_init=0, moving_var_init=1)
+
+
+def _last_conv2d(in_channels, out_channels, kernel_size=3, stride=1, pad_mode='same', pad=0):
+    depthwise_conv = nn.Conv2d(in_channels=in_channels, out_channels=in_channels, kernel_size=kernel_size,
+                               stride=stride, pad_mode=pad_mode, padding=pad, group=in_channels)
+    conv = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, pad_mode="same", has_bias=True)
+    return nn.SequentialCell([depthwise_conv, _bn(in_channels), nn.ReLU6(), conv])
+
+class FlattenConcat(nn.Cell):
+    """
+    Concatenate predictions into a single tensor.
+
+    Args:
+        num_ssd_boxes (int): The number of boxes.
+
+    Returns:
+        Tensor, flatten predictions.
+    """
+    def __init__(self, config):
+        super(FlattenConcat, self).__init__()
+        self.num_ssd_boxes = config.num_ssd_boxes
+        self.concat = P.Concat(axis=1)
+        self.transpose = P.Transpose()
+    def construct(self, inputs):
+        """construct network"""
+        output = ()
+        batch_size = F.shape(inputs[0])[0]
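+        # NCHW -> NHWC so that per-location predictions stay contiguous, then flatten each feature map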
+        for x in inputs:
+            x = self.transpose(x, (0, 2, 3, 1))
+            output += (F.reshape(x, (batch_size, -1)),)
+        res = self.concat(output)
+        return F.reshape(res, (batch_size, self.num_ssd_boxes, -1))
+
+
+class MultiBox(nn.Cell):
+    """
+    Multibox conv layers. Each multibox layer contains class conf scores and localization predictions.
+    """
+    def __init__(self, config, num_classes, out_channels):
+        super(MultiBox, self).__init__()
+        num_default = config.num_default
+
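+        # each default box gets 4 localization offsets and num_classes confidence scores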
+        loc_layers = []
+        cls_layers = []
+        for k, out_channel in enumerate(out_channels):
+            loc_layers += [_last_conv2d(out_channel, 4 * num_default[k],
+                                        kernel_size=3, stride=1, pad_mode='same', pad=0)]
+            cls_layers += [_last_conv2d(out_channel, num_classes * num_default[k],
+                                        kernel_size=3, stride=1, pad_mode='same', pad=0)]
+
+        self.multi_loc_layers = nn.layer.CellList(loc_layers)
+        self.multi_cls_layers = nn.layer.CellList(cls_layers)
+        self.flatten_concat = FlattenConcat(config)
+
+    def construct(self, inputs):
+        """construct network"""
+        loc_outputs = ()
+        cls_outputs = ()
+        for i in range(len(self.multi_loc_layers)):
+            loc_outputs += (self.multi_loc_layers[i](inputs[i]),)
+            cls_outputs += (self.multi_cls_layers[i](inputs[i]),)
+        return self.flatten_concat(loc_outputs), self.flatten_concat(cls_outputs)
diff --git a/research/cv/RefineDet/src/refinedet.py b/research/cv/RefineDet/src/refinedet.py
new file mode 100644
index 0000000000000000000000000000000000000000..e536a367f7045bbe91615d5789cdb65bd0fb02f1
--- /dev/null
+++ b/research/cv/RefineDet/src/refinedet.py
@@ -0,0 +1,224 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""RefineDet network structure"""
+
+import mindspore as ms
+import mindspore.nn as nn
+from mindspore.ops import operations as P
+from mindspore.ops import functional as F
+
+from .vgg16_for_refinedet import vgg16
+from .resnet101_for_refinedet import resnet
+from .multibox import MultiBox
+from .l2norm import L2Norm
+
+def _make_conv_layer(channels, use_bn=False, use_relu=True, kernel_size=3, padding=0):
+    """make convolution layer for refinedet"""
+    in_channels = channels[0]
+    layers = []
+    for out_channels in channels[1:]:
+        layers.append(nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
+                                kernel_size=kernel_size, pad_mode="pad", padding=padding))
+        if use_bn:
+            layers.append(nn.BatchNorm2d(out_channels))
+        if use_relu:
+            layers.append(nn.ReLU())
+        in_channels = out_channels
+    return layers
+
+def _make_deconv_layer(channels, use_bn=False, use_relu=True, kernel_size=3, padding=0, stride=1):
+    """make deconvolution layer for TCB"""
+    in_channels = channels[0]
+    layers = []
+    for out_channels in channels[1:]:
+        layers.append(nn.Conv2dTranspose(in_channels=in_channels, out_channels=out_channels,
+                                         kernel_size=kernel_size, pad_mode="pad", padding=padding, stride=stride))
+        if use_bn:
+            layers.append(nn.BatchNorm2d(out_channels))
+        if use_relu:
+            layers.append(nn.ReLU())
+        in_channels = out_channels
+    return nn.SequentialCell(layers)
+
+class TCB(nn.Cell):
+    """TCB block for transport features from ARM to ODM"""
+    def __init__(self, arm_source_num, in_channels, normalization, use_bn=False):
+        super(TCB, self).__init__()
+        self.layers = []
+        self.t_num = arm_source_num
+        self.add = P.Add()
+        for idx in range(self.t_num):
+            self.layers.append([])
+            if normalization:
+                if normalization[idx] != -1:
+                    self.layers[idx] += [L2Norm(in_channels[idx], normalization[idx])]
+
+            self.layers[idx] += _make_conv_layer([in_channels[idx], 256], use_bn=use_bn, padding=1)
+            if idx + 1 == self.t_num:
+                self.layers[idx] += [nn.SequentialCell(_make_conv_layer([256, 256, 256], use_bn=use_bn, padding=1))]
+            else:
+                self.layers[idx] += _make_conv_layer([256, 256], use_bn=use_bn, use_relu=False, padding=1)
+                self.layers[idx] += [nn.SequentialCell(_make_conv_layer([256, 256, 256], use_bn=use_bn, padding=1))]
+        self.tcb0 = nn.SequentialCell(self.layers[0][:-1])
+        self.deconv0 = _make_deconv_layer([256, 256], use_bn=use_bn, kernel_size=2, stride=2)
+        self.p0 = self.layers[0][-1]
+        self.tcb1 = nn.SequentialCell(self.layers[1][:-1])
+        self.deconv1 = _make_deconv_layer([256, 256], use_bn=use_bn, kernel_size=2, stride=2)
+        self.p1 = self.layers[1][-1]
+        self.tcb2 = nn.SequentialCell(self.layers[2][:-1])
+        self.deconv2 = _make_deconv_layer([256, 256], use_bn=use_bn, kernel_size=2, stride=2)
+        self.p2 = self.layers[2][-1]
+        self.tcb3 = nn.SequentialCell(self.layers[3][:-1])
+        self.p3 = self.layers[3][-1]
+
+    def construct(self, x):
+        """construct network"""
+        outputs = ()
+        tmp = x[3]
+        tmp = self.tcb3(tmp)
+        tmp = self.p3(tmp)
+        outputs += (tmp,)
+        tmp = x[2]
+        tmp = self.tcb2(tmp)
+        tmp = self.add(tmp, self.deconv2(outputs[0]))
+        tmp = self.p2(tmp)
+        outputs = (tmp,) + outputs
+        tmp = x[1]
+        tmp = self.tcb1(tmp)
+        tmp = self.add(tmp, self.deconv1(outputs[0]))
+        tmp = self.p1(tmp)
+        outputs = (tmp,) + outputs
+        tmp = x[0]
+        tmp = self.tcb0(tmp)
+        tmp = self.add(tmp, self.deconv0(outputs[0]))
+        tmp = self.p0(tmp)
+        outputs = (tmp,) + outputs
+        return outputs
+
+class ARM(nn.Cell):
+    """anchor refined module"""
+    def __init__(self, backbone, config, is_training=True):
+        super(ARM, self).__init__()
+        self.layer = []
+        self.layers = {}
+        self.backbone = backbone
+        self.multi_box = MultiBox(config, 2, config.extra_arm_channels)
+        self.is_training = is_training
+        if not is_training:
+            self.activation = P.Sigmoid()
+
+    def construct(self, x):
+        """construct network"""
+        outputs = self.backbone(x)
+        multi_feature = outputs
+        pred_loc, pred_label = self.multi_box(multi_feature)
+        if not self.is_training:
+            pred_label = self.activation(pred_label)
+        pred_loc = F.cast(pred_loc, ms.float32)
+        pred_label = F.cast(pred_label, ms.float32)
+        return outputs, pred_loc, pred_label
+
+class ODM(nn.Cell):
+    """object detecion module"""
+    def __init__(self, config, is_training=True):
+        super(ODM, self).__init__()
+        self.layer = []
+        self.layers = {}
+        self.multi_box = MultiBox(config, config.num_classes, config.extra_odm_channels)
+        self.is_training = is_training
+        if not is_training:
+            self.activation = P.Sigmoid()
+
+    def construct(self, x):
+        """construct network"""
+        outputs = x
+        multi_feature = outputs
+        pred_loc, pred_label = self.multi_box(multi_feature)
+        if not self.is_training:
+            pred_label = self.activation(pred_label)
+        pred_loc = F.cast(pred_loc, ms.float32)
+        pred_label = F.cast(pred_label, ms.float32)
+        return pred_loc, pred_label
+
+class RefineDet(nn.Cell):
+    """refinedet network"""
+    def __init__(self, backbone, config, is_training=True):
+        super(RefineDet, self).__init__()
+        self.backbone = backbone
+        self.is_training = is_training
+        self.arm = ARM(backbone, config, is_training)
+        self.odm = ODM(config, is_training)
+        self.tcb = TCB(len(config.arm_source), config.extra_arm_channels, config.L2normalizations)
+
+    def construct(self, x):
+        """construct network"""
+        arm_out, arm_pre_loc, arm_pre_label = self.arm(x)
+        tcb_out = self.tcb(arm_out)
+        odm_pre_loc, odm_pre_label = self.odm(tcb_out)
+        return arm_pre_loc, arm_pre_label, odm_pre_loc, odm_pre_label, arm_out
+
+def refinedet_vgg16(config, is_training=True):
+    """return refinedet with vgg16"""
+    return RefineDet(backbone=vgg16(), config=config, is_training=is_training)
+
+
+def refinedet_resnet101(config, is_training=True):
+    """return refinedet with resnet101"""
+    return RefineDet(backbone=resnet(), config=config, is_training=is_training)
+
+class RefineDetInferWithDecoder(nn.Cell):
+    """
+    RefineDet inference wrapper that decodes the predicted bbox locations
+    (serving as the detection layer found in other implementations).
+    Args:
+        network (Cell): the RefineDet inference network without the bbox decoder.
+        default_boxes (Tensor): the default_boxes from anchor generator
+        config (dict): network config
+    Returns:
+        Tensor, the locations for bbox after decoder representing (y0,x0,y1,x1)
+        Tensor, the prediction labels.
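+
+    Examples:
+        >>> # illustrative only; assumes `default_boxes` comes from the anchor generator of
+        >>> # this repository and `img` is an already preprocessed image batch (Tensor)
+        >>> net = refinedet_vgg16(config, is_training=False)
+        >>> infer_net = RefineDetInferWithDecoder(net, default_boxes, config)
+        >>> boxes, labels = infer_net(img)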
+    """
+    def __init__(self, network, default_boxes, config):
+        super(RefineDetInferWithDecoder, self).__init__()
+        self.network = network
+        self.default_boxes = default_boxes
+        self.prior_scaling_xy = config.prior_scaling[0]
+        self.prior_scaling_wh = config.prior_scaling[1]
+        self.objectness_thre = config.objectness_thre
+        self.softmax1 = nn.Softmax()
+        self.softmax2 = nn.Softmax()
+
+    def construct(self, x):
+        """construct network"""
+        _, arm_label, odm_loc, odm_label, _ = self.network(x)
+
+        arm_label = self.softmax1(arm_label)
+        pred_loc = odm_loc
+        pred_label = self.softmax2(odm_label)
+        arm_object_conf = arm_label[:, :, 1:]
+        object_mask = F.cast(arm_object_conf > self.objectness_thre, ms.float32)
+        pred_label = pred_label * object_mask.expand_as(pred_label)
+
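+        # decode the SSD-style offsets against the default (prior) boxes:
+        #   center = loc_xy * prior_scaling_xy * prior_wh + prior_xy
+        #   size   = exp(loc_wh * prior_scaling_wh) * prior_wh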
+        default_bbox_xy = self.default_boxes[..., :2]
+        default_bbox_wh = self.default_boxes[..., 2:]
+        pred_xy = pred_loc[..., :2] * self.prior_scaling_xy * default_bbox_wh + default_bbox_xy
+        pred_wh = P.Exp()(pred_loc[..., 2:] * self.prior_scaling_wh) * default_bbox_wh
+
+        pred_xy_0 = pred_xy - pred_wh / 2.0
+        pred_xy_1 = pred_xy + pred_wh / 2.0
+        pred_xy = P.Concat(-1)((pred_xy_0, pred_xy_1))
+        pred_xy = P.Maximum()(pred_xy, 0)
+        pred_xy = P.Minimum()(pred_xy, 1)
+        return pred_xy, pred_label
diff --git a/research/cv/RefineDet/src/refinedet_loss_cell.py b/research/cv/RefineDet/src/refinedet_loss_cell.py
new file mode 100644
index 0000000000000000000000000000000000000000..dfc0e7d59f553c07423c184bdfac836f5ace0309
--- /dev/null
+++ b/research/cv/RefineDet/src/refinedet_loss_cell.py
@@ -0,0 +1,185 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""RefineDet loss cell and training wrapper"""
+
+import mindspore as ms
+import mindspore.nn as nn
+from mindspore import context, Tensor
+from mindspore.context import ParallelMode
+from mindspore.parallel._auto_parallel_context import auto_parallel_context
+from mindspore.communication.management import get_group_size
+from mindspore.ops import operations as P
+from mindspore.ops import functional as F
+from mindspore.ops import composite as C
+
+
+
+class SigmoidFocalClassificationLoss(nn.Cell):
+    """"
+    Sigmoid focal-loss for classification.
+
+    Args:
+        gamma (float): Hyper-parameter to balance the easy and hard examples. Default: 2.0
+        alpha (float): Hyper-parameter to balance the positive and negative example. Default: 0.25
+
+    Returns:
+        Tensor, the focal loss.
+    """
+    def __init__(self, gamma=2.0, alpha=0.25):
+        super(SigmoidFocalClassificationLoss, self).__init__()
+        self.sigmoid_cross_entropy = P.SigmoidCrossEntropyWithLogits()
+        self.sigmoid = P.Sigmoid()
+        self.pow = P.Pow()
+        self.onehot = P.OneHot()
+        self.on_value = Tensor(1.0, ms.float32)
+        self.off_value = Tensor(0.0, ms.float32)
+        self.gamma = gamma
+        self.alpha = alpha
+
+    def construct(self, logits, label):
+        """construct network"""
+        label = self.onehot(label, F.shape(logits)[-1], self.on_value, self.off_value)
+        sigmoid_cross_entropy = self.sigmoid_cross_entropy(logits, label)
+        sigmoid = self.sigmoid(logits)
+        label = F.cast(label, ms.float32)
+        p_t = label * sigmoid + (1 - label) * (1 - sigmoid)
+        modulating_factor = self.pow(1 - p_t, self.gamma)
+        alpha_weight_factor = label * self.alpha + (1 - label) * (1 - self.alpha)
+        focal_loss = modulating_factor * alpha_weight_factor * sigmoid_cross_entropy
+        return focal_loss
+
+
+class MultiBoxLoss(nn.Cell):
+    """"
+    Provide multibox loss through network.
+
+    Args:
+        config (dict): RefineDet config.
+
+    Returns:
+        Tensor, the loss of the network.
+    """
+    def __init__(self, config):
+        super(MultiBoxLoss, self).__init__()
+        self.less = P.Less()
+        self.tile = P.Tile()
+        self.reduce_sum = P.ReduceSum()
+        self.expand_dims = P.ExpandDims()
+        self.class_loss = SigmoidFocalClassificationLoss(config.gamma, config.alpha)
+        self.loc_loss = nn.SmoothL1Loss()
+        self.softmax = nn.Softmax(axis=2)
+
+    def construct(self, x, gt_loc, gt_label, num_matched_boxes, arm_label=None, theta=0.01, use_hard=0):
+        """construct network"""
+        pred_loc, pred_label = x
+        mask = F.cast(self.less(0, gt_label), ms.float32)
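+        # negative anchor filtering: when ARM scores are provided and use_hard is enabled,
+        # anchors whose ARM objectness does not exceed theta are removed from the mask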
+        if arm_label is not None:
+            p = self.softmax(arm_label)
+            hard_negative = F.cast(p[:, :, 1] > theta, ms.float32)
+            mask = (1 - use_hard) * mask + use_hard * mask * hard_negative
+        num_matched_boxes = self.reduce_sum(F.cast(num_matched_boxes, ms.float32))
+
+        # Localization Loss
+        mask_loc = self.tile(self.expand_dims(mask, -1), (1, 1, 4))
+        smooth_l1 = self.loc_loss(pred_loc, gt_loc) * mask_loc
+        loss_loc = self.reduce_sum(self.reduce_sum(smooth_l1, -1), -1)
+
+        # Classification Loss
+        loss_cls = self.class_loss(pred_label, gt_label)
+        loss_cls = self.reduce_sum(loss_cls, (1, 2))
+
+        return self.reduce_sum((loss_cls + loss_loc) / num_matched_boxes)
+
+
+class RefineDetLossCell(nn.Cell):
+    """"
+    Provide RefineDet training loss through network.
+
+    Args:
+        network (Cell): The training network.
+        config (dict): RefineDet config.
+
+    Returns:
+        Tensor, the loss of the network.
+    """
+    def __init__(self, network, config):
+        super(RefineDetLossCell, self).__init__()
+        self.multiboxloss = MultiBoxLoss(config)
+        self.network = network
+
+    def construct(self, x, gt_loc, gt_label, num_matched_boxes):
+        """construct network"""
+        arm_pre_loc, arm_pre_label, odm_pre_loc, odm_pre_label, _ = self.network(x)
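+        # total loss = ARM loss on the coarse predictions + ODM loss on the refined
+        # predictions, with the ARM scores passed on for negative anchor filtering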
+        arm_loss = self.multiboxloss((arm_pre_loc, arm_pre_label), gt_loc, gt_label, num_matched_boxes)
+        odm_loss = self.multiboxloss((odm_pre_loc, odm_pre_label), gt_loc, gt_label, num_matched_boxes, arm_pre_label)
+        return arm_loss + odm_loss
+
+
+grad_scale = C.MultitypeFuncGraph("grad_scale")
+@grad_scale.register("Tensor", "Tensor")
+def tensor_grad_scale(scale, grad):
+    return grad * P.Reciprocal()(scale)
+
+
+class TrainingWrapper(nn.Cell):
+    """
+    Encapsulation class of RefineDet network training.
+
+    Append an optimizer to the training network. After that, the construct
+    function can be called to create the backward graph.
+
+    Args:
+        network (Cell): The training network. Note that loss function should have been added.
+        optimizer (Optimizer): Optimizer for updating the weights.
+        sens (Number): The adjust parameter. Default: 1.0.
+        use_global_norm (bool): Whether to apply gradient clipping by global norm before the optimizer step. Default: False.
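+
+    Examples:
+        >>> # illustrative wiring, following the usage in train.py
+        >>> loss_net = RefineDetLossCell(refinedet_vgg16(config), config)
+        >>> opt = nn.Momentum(loss_net.trainable_params(), learning_rate=0.05, momentum=0.9)
+        >>> train_net = TrainingWrapper(loss_net, opt, sens=1024.0)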
+    """
+    def __init__(self, network, optimizer, sens=1.0, use_global_norm=False):
+        super(TrainingWrapper, self).__init__(auto_prefix=False)
+        self.network = network
+        self.network.set_grad()
+        self.weights = ms.ParameterTuple(network.trainable_params())
+        self.optimizer = optimizer
+        self.grad = C.GradOperation(get_by_list=True, sens_param=True)
+        self.sens = sens
+        self.reducer_flag = False
+        self.grad_reducer = None
+        self.use_global_norm = use_global_norm
+        self.parallel_mode = context.get_auto_parallel_context("parallel_mode")
+        if self.parallel_mode in [ParallelMode.DATA_PARALLEL, ParallelMode.HYBRID_PARALLEL]:
+            self.reducer_flag = True
+        if self.reducer_flag:
+            mean = context.get_auto_parallel_context("gradients_mean")
+            if auto_parallel_context().get_device_num_is_set():
+                degree = context.get_auto_parallel_context("device_num")
+            else:
+                degree = get_group_size()
+            self.grad_reducer = nn.DistributedGradReducer(optimizer.parameters, mean, degree)
+        self.hyper_map = C.HyperMap()
+
+    def construct(self, *args):
+        """construct network"""
+        weights = self.weights
+        loss = self.network(*args)
+        sens = P.Fill()(P.DType()(loss), P.Shape()(loss), self.sens)
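+        # back-propagate with a constant sensitivity equal to the loss scale; the resulting
+        # gradients are (optionally) all-reduced across devices before the optimizer step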
+        grads = self.grad(self.network, weights)(*args, sens)
+        if self.reducer_flag:
+            # apply grad reducer on grads
+            grads = self.grad_reducer(grads)
+        if self.use_global_norm:
+            grads = self.hyper_map(F.partial(grad_scale, F.scalar_to_array(self.sens)), grads)
+            grads = C.clip_by_global_norm(grads)
+        return F.depend(loss, self.optimizer(grads))
diff --git a/research/cv/RefineDet/src/resnet101_for_refinedet.py b/research/cv/RefineDet/src/resnet101_for_refinedet.py
new file mode 100644
index 0000000000000000000000000000000000000000..0ee3b6497efa7d372297f520b08e8f476910201a
--- /dev/null
+++ b/research/cv/RefineDet/src/resnet101_for_refinedet.py
@@ -0,0 +1,241 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""ResNet101 for RefineDet"""
+import mindspore.nn as nn
+from mindspore.ops import operations as P
+
+
+def _conv3x3(in_channel, out_channel, stride=1):
+    return nn.Conv2d(in_channel, out_channel,
+                     kernel_size=3, stride=stride, padding=0, pad_mode='same')
+
+
+def _conv1x1(in_channel, out_channel, stride=1):
+    return nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=stride, padding=0, pad_mode='same')
+
+
+def _conv7x7(in_channel, out_channel, stride=1):
+    return nn.Conv2d(in_channel, out_channel, kernel_size=7, stride=stride, padding=0, pad_mode='same')
+
+
+def _bn(channel):
+    return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.997,
+                          gamma_init=1, beta_init=0, moving_mean_init=0, moving_var_init=1)
+
+
+def _bn_last(channel):
+    return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.997,
+                          gamma_init=0, beta_init=0, moving_mean_init=0, moving_var_init=1)
+
+class ResidualBlock(nn.Cell):
+    """
+    ResNet V1 residual block definition.
+
+    Args:
+        in_channel (int): Input channel.
+        out_channel (int): Output channel.
+        stride (int): Stride size for the first convolutional layer. Default: 1.
+
+    Returns:
+        Tensor, output tensor.
+
+    Examples:
+        >>> ResidualBlock(3, 256, stride=2)
+    """
+
+    def __init__(self,
+                 in_channel,
+                 out_channel,
+                 stride=1, expansion=4):
+        super(ResidualBlock, self).__init__()
+        self.expansion = expansion
+        self.stride = stride
+        channel = out_channel // self.expansion
+        self.conv1 = _conv1x1(in_channel, channel, stride=1)
+        self.bn1 = _bn(channel)
+        self.conv2 = _conv3x3(channel, channel, stride=stride)
+        self.bn2 = _bn(channel)
+
+        self.conv3 = _conv1x1(channel, out_channel, stride=1)
+        self.bn3 = _bn_last(out_channel)
+        self.relu = nn.ReLU()
+
+        self.down_sample = False
+
+        if stride != 1 or in_channel != out_channel:
+            self.down_sample = True
+        self.down_sample_layer = None
+
+        if self.down_sample:
+            self.down_sample_layer = nn.SequentialCell([_conv1x1(in_channel, out_channel, stride), _bn(out_channel)])
+        self.add = P.Add()
+
+    def construct(self, x):
+        """construct network"""
+        identity = x
+        out = self.conv1(x)
+        out = self.bn1(out)
+        out = self.relu(out)
+        out = self.conv2(out)
+        out = self.bn2(out)
+        out = self.relu(out)
+        out = self.conv3(out)
+        out = self.bn3(out)
+
+        if self.down_sample:
+            identity = self.down_sample_layer(identity)
+
+        out = self.add(out, identity)
+        out = self.relu(out)
+
+        return out
+
+
+class ResNet(nn.Cell):
+    """
+    ResNet architecture.
+
+    Args:
+        block (Cell): Block for network.
+        layer_nums (list): Numbers of blocks in each layer.
+        in_channels (list): Input channel in each layer.
+        out_channels (list): Output channel in each layer.
+        strides (list):  Stride size in each layer.
+    Returns:
+        Tensor, output tensor.
+
+    Examples:
+        >>> ResNet(ResidualBlock,
+        >>>        [3, 4, 6, 3],
+        >>>        [64, 256, 512, 1024],
+        >>>        [256, 512, 1024, 2048],
+        >>>        [1, 2, 2, 2]
+    """
+
+    def __init__(self,
+                 block,
+                 layer_nums,
+                 in_channels,
+                 out_channels,
+                 strides):
+        super(ResNet, self).__init__()
+
+        if not len(layer_nums) == len(in_channels) == len(out_channels) == 4:
+            raise ValueError("the lengths of layer_nums, in_channels and out_channels must all be 4!")
+        self.conv1 = _conv7x7(3, 64, stride=2)
+        self.bn1 = _bn(64)
+        self.relu = P.ReLU()
+        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="same")
+        self.layer1 = self._make_layer(block,
+                                       layer_nums[0],
+                                       in_channel=in_channels[0],
+                                       out_channel=out_channels[0],
+                                       stride=strides[0])
+        self.layer2 = self._make_layer(block,
+                                       layer_nums[1],
+                                       in_channel=in_channels[1],
+                                       out_channel=out_channels[1],
+                                       stride=strides[1])
+        self.layer3 = self._make_layer(block,
+                                       layer_nums[2],
+                                       in_channel=in_channels[2],
+                                       out_channel=out_channels[2],
+                                       stride=strides[2])
+        self.layer4 = self._make_layer(block,
+                                       layer_nums[3],
+                                       in_channel=in_channels[3],
+                                       out_channel=out_channels[3],
+                                       stride=strides[3])
+
+    def _make_layer(self, block, layer_num, in_channel, out_channel, stride):
+        """
+        Make stage network of ResNet.
+
+        Args:
+            block (Cell): Resnet block.
+            layer_num (int): Layer number.
+            in_channel (int): Input channel.
+            out_channel (int): Output channel.
+            stride (int): Stride size for the first convolutional layer.
+        Returns:
+            SequentialCell, the output layer.
+
+        Examples:
+            >>> _make_layer(ResidualBlock, 3, 128, 256, 2)
+        """
+        layers = []
+
+        resnet_block = block(in_channel, out_channel, stride=stride)
+        layers.append(resnet_block)
+        for _ in range(1, layer_num):
+            resnet_block = block(out_channel, out_channel, stride=1)
+            layers.append(resnet_block)
+        return nn.SequentialCell(layers)
+
+    def construct(self, x):
+        """construct network"""
+        x = self.conv1(x)
+        x = self.bn1(x)
+        x = self.relu(x)
+        c1 = self.maxpool(x)
+
+        c2 = self.layer1(c1)
+        c3 = self.layer2(c2)
+        c4 = self.layer3(c3)
+        c5 = self.layer4(c4)
+        return c1, c2, c3, c4, c5
+
+
+def resnet50():
+    """
+    Get ResNet50 neural network.
+
+    Returns:
+        Cell, cell instance of ResNet50 neural network.
+
+    Examples:
+        >>> net = resnet50()
+    """
+    return ResNet(ResidualBlock,
+                  [3, 4, 6, 3],
+                  [64, 256, 512, 1024],
+                  [256, 512, 1024, 2048],
+                  [1, 2, 2, 2])
+
+def resnet101():
+    """
+    Get ResNet101 neural network.
+    """
+    return ResNet(ResidualBlock,
+                  [3, 4, 23, 3],
+                  [64, 256, 512, 1024],
+                  [256, 512, 1024, 2048],
+                  [1, 2, 2, 2])
+
+class ResNet101_for_RefineDet(nn.Cell):
+    """build up resnet101"""
+    def __init__(self):
+        super(ResNet101_for_RefineDet, self).__init__()
+        self.base = resnet101()
+        self.extra = ResidualBlock(2048, 512, 2, 16)
+
+    def construct(self, x):
+        """construct network"""
+        _, _, c3, c4, c5 = self.base(x)
+        c6 = self.extra(c5)
+        return c3, c4, c5, c6
+
+def resnet():
+    return ResNet101_for_RefineDet()
diff --git a/research/cv/RefineDet/src/vgg16_for_refinedet.py b/research/cv/RefineDet/src/vgg16_for_refinedet.py
new file mode 100644
index 0000000000000000000000000000000000000000..cb3b505b935f9f8fb8e53abc59153669818e7b1f
--- /dev/null
+++ b/research/cv/RefineDet/src/vgg16_for_refinedet.py
@@ -0,0 +1,80 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""VGG16 backbone for RefineDet"""
+
+import mindspore.nn as nn
+
+def _make_conv_layer(channels, use_bn=False, kernel_size=3, stride=1, padding=0):
+    """make convolution layers for vgg16"""
+    in_channels = channels[0]
+    layers = []
+    for out_channels in channels[1:]:
+        layers.append(nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
+                                kernel_size=kernel_size, stride=stride, pad_mode="pad", padding=padding))
+        if use_bn:
+            layers.append(nn.BatchNorm2d(out_channels))
+        layers.append(nn.ReLU())
+        in_channels = out_channels
+    return nn.SequentialCell(layers)
+
+class VGG16_for_RefineDet(nn.Cell):
+    """
+    VGG-16 network body, reference to caffe model_libs
+    """
+    def __init__(self):
+        super(VGG16_for_RefineDet, self).__init__()
+        self.b1 = _make_conv_layer([3, 64, 64], padding=1)
+        self.b2 = _make_conv_layer([64, 128, 128], padding=1)
+        self.b3 = _make_conv_layer([128, 256, 256, 256], padding=1)
+        self.b4 = _make_conv_layer([256, 512, 512, 512], padding=1)
+        self.b5 = _make_conv_layer([512, 512, 512, 512], padding=1)
+        self.m1 = nn.MaxPool2d(kernel_size=2, stride=2, pad_mode="same")
+        self.m2 = nn.MaxPool2d(kernel_size=2, stride=2, pad_mode="same")
+        self.m3 = nn.MaxPool2d(kernel_size=2, stride=2, pad_mode="same")
+        self.m4 = nn.MaxPool2d(kernel_size=2, stride=2, pad_mode="same")
+        self.m5 = nn.MaxPool2d(kernel_size=2, stride=2, pad_mode="same")
+        self.fc6 = nn.Conv2d(in_channels=512, out_channels=1024, pad_mode="pad", padding=3, kernel_size=3, dilation=3)
+        self.relu6 = nn.ReLU()
+        self.fc7 = nn.Conv2d(in_channels=1024, out_channels=1024, kernel_size=1)
+        self.relu7 = nn.ReLU()
+        self.b6_1 = _make_conv_layer([1024, 256], kernel_size=1)
+        self.b6_2 = _make_conv_layer([256, 512], stride=2, padding=1)
+
+    def construct(self, x):
+        """construct network"""
+        outputs = ()
+        x = self.b1(x)
+        x = self.m1(x)
+        x = self.b2(x)
+        x = self.m2(x)
+        x = self.b3(x)
+        x = self.m3(x)
+        x = self.b4(x)
+        outputs += (x,)
+        x = self.m4(x)
+        x = self.b5(x)
+        outputs += (x,)
+        x = self.m5(x)
+        x = self.fc6(x)
+        x = self.relu6(x)
+        x = self.fc7(x)
+        outputs += (x,)
+        x = self.relu7(x)
+        x = self.b6_1(x)
+        x = self.b6_2(x)
+        return outputs + (x,)
+
+def vgg16():
+    return VGG16_for_RefineDet()
diff --git a/research/cv/RefineDet/train.py b/research/cv/RefineDet/train.py
new file mode 100644
index 0000000000000000000000000000000000000000..56bb8b60f97eb59b841c699a3d1b942f199b8532
--- /dev/null
+++ b/research/cv/RefineDet/train.py
@@ -0,0 +1,205 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+"""Train RefineDet and get checkpoint files."""
+
+import argparse
+import ast
+import os
+import mindspore.nn as nn
+from mindspore import context, Tensor
+from mindspore.communication.management import init, get_rank
+from mindspore.train.callback import CheckpointConfig, ModelCheckpoint, LossMonitor, TimeMonitor
+from mindspore.train import Model
+from mindspore.context import ParallelMode
+from mindspore.train.serialization import load_checkpoint, load_param_into_net
+from mindspore.common import set_seed, dtype
+from src.config import get_config
+from src.dataset import create_refinedet_dataset, create_mindrecord
+from src.lr_schedule import get_lr
+from src.init_params import init_net_param
+from src.refinedet import refinedet_vgg16, refinedet_resnet101
+from src.refinedet_loss_cell import RefineDetLossCell, TrainingWrapper
+
+set_seed(1)
+
+def get_args():
+    """get args for train"""
+    parser = argparse.ArgumentParser(description="RefineDet training script")
+    parser.add_argument("--using_mode", type=str, default="refinedet_vgg16_320",
+                        choices=("refinedet_vgg16_320", "refinedet_vgg16_512",
+                                 "refinedet_resnet101_320", "refinedet_resnet101_512"),
+                        help="which network you want to train, we present four networks: "
+                             "using vgg16 as backbone with 320x320 image size"
+                             "using vgg16 as backbone with 512x512 image size"
+                             "using resnet101 as backbone with 320x320 image size"
+                             "using resnet101 as backbone with 512x512 image size")
+    parser.add_argument("--run_online", type=ast.literal_eval, default=False,
+                        help="Run on Modelarts platform, need data_url, train_url if true, default is False.")
+    parser.add_argument("--data_url", type=str,
+                        help="using for OBS file system")
+    parser.add_argument("--train_url", type=str,
+                        help="using for OBS file system")
+    parser.add_argument("--pre_trained_url", type=str, default=None, help="Pretrained Checkpoint file url for OBS.")
+    parser.add_argument("--run_platform", type=str, default="Ascend", choices=("Ascend", "GPU", "CPU"),
+                        help="run platform, support Ascend, GPU and CPU.")
+    parser.add_argument("--only_create_dataset", type=ast.literal_eval, default=False,
+                        help="If set it true, only create Mindrecord, default is False.")
+    parser.add_argument("--distribute", type=ast.literal_eval, default=False,
+                        help="Run distribute, default is False.")
+    parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.")
+    parser.add_argument("--device_num", type=int, default=1, help="Use device nums, default is 1.")
+    parser.add_argument("--lr", type=float, default=0.05, help="Learning rate, default is 0.05.")
+    parser.add_argument("--mode", type=str, default="sink", help="Run sink mode or not, default is sink.")
+    parser.add_argument("--dataset", type=str, default="coco",
+                        help="Dataset, default is coco."
+                             "Now we have coco, voc0712, voc0712plus")
+    parser.add_argument("--epoch_size", type=int, default=500, help="Epoch size, default is 500.")
+    parser.add_argument("--batch_size", type=int, default=32, help="Batch size, default is 32.")
+    parser.add_argument("--pre_trained", type=str, default=None, help="Pretrained Checkpoint file path.")
+    parser.add_argument("--pre_trained_epoch_size", type=int, default=0, help="Pretrained epoch size.")
+    parser.add_argument("--save_checkpoint_epochs", type=int, default=10, help="Save checkpoint epochs, default is 10.")
+    parser.add_argument("--loss_scale", type=int, default=1024, help="Loss scale, default is 1024.")
+    parser.add_argument("--filter_weight", type=ast.literal_eval, default=False,
+                        help="Filter head weight parameters, default is False.")
+    parser.add_argument('--debug', type=str, default="0", choices=["0", "1", "2", "3"],
+                        help="Active the debug mode. 0 for no debug mode,"
+                             "Under debug mode 1, the network would be run in PyNative mode,"
+                             "Under debug mode 2, all ascend log would be print on stdout,"
+                             "Under debug mode 3, all ascend log would be print on stdout."
+                        "And network will run in PyNative mode.")
+    parser.add_argument("--check_point", type=str, default="./ckpt",
+                        help="The directory path to save check point files")
+    args_opt = parser.parse_args()
+    return args_opt
+
+def refinedet_model_build(config, args_opt):
+    """build refinedet network"""
+    if config.model == "refinedet_vgg16":
+        refinedet = refinedet_vgg16(config=config)
+        init_net_param(refinedet)
+    elif config.model == "refinedet_resnet101":
+        refinedet = refinedet_resnet101(config=config)
+        init_net_param(refinedet)
+    else:
+        raise ValueError(f'config.model: {config.model} is not supported')
+    return refinedet
+
+def train_main(args_opt):
+    """main code for train refinedet"""
+    rank = 0
+    device_num = 1
+    # config with args
+    config = get_config(args_opt.using_mode, args_opt.dataset)
+
+    # run mode config
+    if args_opt.debug == "1" or args_opt.debug == "3":
+        network_mode = context.PYNATIVE_MODE
+    else:
+        network_mode = context.GRAPH_MODE
+
+    # set run platform
+    if args_opt.run_platform == "CPU":
+        context.set_context(mode=network_mode, device_target="CPU")
+    else:
+        context.set_context(mode=network_mode, device_target=args_opt.run_platform, device_id=args_opt.device_id)
+        if args_opt.distribute:
+            device_num = args_opt.device_num
+            context.reset_auto_parallel_context()
+            context.set_auto_parallel_context(parallel_mode=ParallelMode.DATA_PARALLEL, gradients_mean=True,
+                                              device_num=device_num)
+            init()
+            rank = get_rank()
+
+    mindrecord_file = create_mindrecord(config, args_opt.dataset, "refinedet.mindrecord", True)
+
+    if args_opt.only_create_dataset:
+        return
+
+    loss_scale = float(args_opt.loss_scale)
+    if args_opt.run_platform == "CPU":
+        loss_scale = 1.0
+
+    # When creating MindDataset, use the first mindrecord file, such as
+    # refinedet.mindrecord0.
+    use_multiprocessing = (args_opt.run_platform != "CPU")
+    dataset = create_refinedet_dataset(config, mindrecord_file, repeat_num=1, batch_size=args_opt.batch_size,
+                                       device_num=device_num, rank=rank, use_multiprocessing=use_multiprocessing)
+
+    dataset_size = dataset.get_dataset_size()
+    print(f"Create dataset done! dataset size is {dataset_size}")
+    refinedet = refinedet_model_build(config, args_opt)
+    if ("use_float16" in config and config.use_float16) or args_opt.run_platform == "GPU":
+        refinedet.to_float(dtype.float16)
+    net = RefineDetLossCell(refinedet, config)
+
+    # checkpoint
+    ckpt_config = CheckpointConfig(save_checkpoint_steps=dataset_size * args_opt.save_checkpoint_epochs)
+    ckpt_prefix = args_opt.check_point + '/ckpt_'
+    save_ckpt_path = ckpt_prefix + str(rank) + '/'
+    ckpoint_cb = ModelCheckpoint(prefix="refinedet", directory=save_ckpt_path, config=ckpt_config)
+
+    if args_opt.pre_trained:
+        param_dict = load_checkpoint(args_opt.pre_trained)
+        load_param_into_net(net, param_dict, True)
+
+    lr = Tensor(get_lr(global_step=args_opt.pre_trained_epoch_size * dataset_size,
+                       lr_init=config.lr_init, lr_end=config.lr_end_rate * args_opt.lr, lr_max=args_opt.lr,
+                       warmup_epochs=config.warmup_epochs,
+                       total_epochs=args_opt.epoch_size,
+                       steps_per_epoch=dataset_size))
+
+    if "use_global_norm" in config and config.use_global_norm:
+        opt = nn.Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), lr,
+                          config.momentum, config.weight_decay, 1.0)
+        net = TrainingWrapper(net, opt, loss_scale, True)
+    else:
+        opt = nn.Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), lr,
+                          config.momentum, config.weight_decay, loss_scale)
+        net = TrainingWrapper(net, opt, loss_scale)
+
+
+    callback = [TimeMonitor(data_size=dataset_size), LossMonitor(), ckpoint_cb]
+    model = Model(net)
+    dataset_sink_mode = False
+    if args_opt.mode == "sink" and args_opt.run_platform != "CPU":
+        print("In sink mode, one epoch return a loss.")
+        dataset_sink_mode = True
+    print("Start train RefineDet, the first epoch will be slower because of the graph compilation.")
+    model.train(args_opt.epoch_size, dataset, callbacks=callback, dataset_sink_mode=dataset_sink_mode)
+
+def main():
+    args_opt = get_args()
+    # copy files if online
+    if args_opt.run_online:
+        import moxing as mox
+        args_opt.device_id = int(os.getenv('DEVICE_ID'))
+        args_opt.device_num = int(os.getenv('RANK_SIZE'))
+        dir_root = os.getcwd()
+        data_root = os.path.join(dir_root, "data")
+        ckpt_root = os.path.join(dir_root, args_opt.check_point)
+        mox.file.copy_parallel(args_opt.data_url, data_root)
+        if args_opt.pre_trained:
+            mox.file.copy_parallel(args_opt.pre_trained_url, args_opt.pre_trained)
+    # print log to stdout
+    if args_opt.debug == "2" or args_opt.debug == "3":
+        os.environ["SLOG_PRINT_TO_STDOUT"] = "1"
+        os.environ["ASCEND_SLOG_PRINT_TO_STDOUT"] = "1"
+        os.environ["ASCEND_GLOBAL_LOG_LEVEL"] = "1"
+    train_main(args_opt)
+    if args_opt.run_online:
+        mox.file.copy_parallel(ckpt_root, args_opt.train_url)
+
+if __name__ == '__main__':
+    main()