Commit 1d8fe960 authored by i-robot, committed by Gitee
!311 [Ascend Zhongzhi] vit_base network

Merge pull request !311 from yexijoe/vit
parents 87e899b8 f8585c86
# Contents

<!-- TOC -->

- [Contents](#contents)
- [vit_base Description](#vit_base-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Features](#features)
    - [Mixed Precision](#mixed-precision)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Training](#training)
        - [Distributed Training](#distributed-training)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
    - [Export Process](#export-process)
        - [Export](#export)
    - [Inference Process](#inference-process)
        - [Inference](#inference)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Evaluation Performance](#evaluation-performance)
            - [vit_base on CIFAR-10](#vit_base-on-cifar-10)
        - [Inference Performance](#inference-performance)
            - [vit_base on CIFAR-10](#vit_base-on-cifar-10-1)
- [ModelZoo Homepage](#modelzoo-homepage)

<!-- /TOC -->
# vit_base Description

The Transformer architecture has been widely adopted in natural language processing. The authors of this model show that the reliance on CNNs in computer vision is not necessary: a pure Vision Transformer (ViT) applied directly to sequences of image patches performs image classification with accuracy comparable to state-of-the-art convolutional networks.

[Paper](https://arxiv.org/abs/2010.11929): Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., & Houlsby, N. (2020). An image is worth 16x16 words: Transformers for image recognition at scale.
# Model Architecture

The overall network architecture of vit_base is described in the [paper](https://arxiv.org/abs/2010.11929).
# Dataset

Dataset used: [CIFAR-10](http://www.cs.toronto.edu/~kriz/cifar.html)

- Dataset size: 175 MB, 60,000 color images in 10 classes
    - Training set: 146 MB, 50,000 images
    - Test set: 29 MB, 10,000 images
- Data format: binary files
    - Note: data is processed in src/dataset.py (a usage sketch follows below).
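Training and evaluation read the data through the `create_dataset_cifar10` helper defined in src/dataset.py (included later in this commit); a minimal usage sketch, assuming the default paths from src/config.py:

```python
# Minimal sketch of how train.py and eval.py build their data pipelines.
from src.config import cifar10_cfg
from src.dataset import create_dataset_cifar10

# Training pipeline: shuffled, augmented, batched by cifar10_cfg.batch_size (32).
train_set = create_dataset_cifar10(cifar10_cfg.data_path, 1, 1, True)
# Evaluation pipeline: no augmentation, batch size 1.
eval_set = create_dataset_cifar10(cifar10_cfg.val_data_path, 1, 1, False)
print(train_set.get_dataset_size(), eval_set.get_dataset_size())
```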
# Features

## Mixed Precision

The [mixed precision](https://www.mindspore.cn/docs/programming_guide/zh-CN/r1.3/enable_mixed_precision.html) training method uses both single-precision and half-precision data to speed up the training of deep neural networks while preserving the accuracy achievable with pure single-precision training. Mixed precision raises computing speed and lowers memory usage, and it also makes it possible to train larger models or use larger batch sizes on specific hardware.
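In this repository, mixed precision is enabled through the `amp_level` argument of `Model` in train.py; a minimal sketch with a stand-in network:

```python
# Minimal sketch mirroring train.py: amp_level="O3" runs the network in float16
# (keep_batchnorm_fp32=False keeps no layers in float32, as train.py does on Ascend).
import mindspore.nn as nn
from mindspore.train.model import Model

net = nn.Dense(10, 10)  # stand-in for VisionTransformer
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
opt = nn.Momentum(net.trainable_params(), learning_rate=0.013, momentum=0.9)
model = Model(net, loss_fn=loss, optimizer=opt, metrics={'acc'},
              amp_level="O3", keep_batchnorm_fp32=False)
```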
# Environment Requirements

- Hardware (Ascend)
    - Set up the hardware environment with Ascend processors.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For details, see the following resources:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorials/zh-CN/r1.3/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/r1.3/index.html)
# Quick Start

After installing MindSpore via the official website, you can follow the steps below for training and evaluation. In particular, before training you need to download the official [ViT-B_16](https://console.cloud.google.com/storage/vit_models/) model pre-trained on ImageNet21k, convert it into a ckpt model supported by MindSpore, name it "cifar10_pre_checkpoint_based_imagenet21k.ckpt", and place it in the same directory as the training and test sets (a conversion sketch follows the command block):

- Running in the Ascend processor environment
```bash
# run the training example
python train.py --device_id=0 --dataset_name='cifar10' > train.log 2>&1 &
OR
bash ./scripts/run_standalone_train_ascend.sh [DEVICE_ID] [DATASET_NAME]

# run the distributed training example
bash ./scripts/run_distribution_train_ascend.sh [RANK_TABLE] [DEVICE_NUM] [DEVICE_START] [DATASET_NAME]

# run the evaluation example
python eval.py --checkpoint_path [CKPT_PATH] > ./eval.log 2>&1 &
OR
bash ./scripts/run_standalone_eval_ascend.sh [CKPT_PATH]

# run the inference example
bash run_infer_310.sh ../vit_base.mindir Cifar10 /home/dataset/cifar-10-verify-bin/ 0
```
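The npz-to-ckpt conversion script is not part of this repository. Below is a hypothetical sketch of the conversion, assuming the official JAX checkpoint file `ViT-B_16.npz`; the parameter-name mapping onto src/modeling_ms.py is an assumption and has to be adapted by hand:

```python
# Hypothetical conversion sketch (not included in this repo): load the official
# npz weights and re-save them as a MindSpore checkpoint.
import numpy as np
from mindspore import Tensor, save_checkpoint

jax_weights = np.load("ViT-B_16.npz")
param_list = []
for name, value in jax_weights.items():
    # Assumed renaming scheme; the real mapping to the parameter names of
    # src/modeling_ms.py (query/key/value Dense layers, embeddings, ...) must be verified.
    ms_name = name.replace("/", ".")
    param_list.append({"name": ms_name, "data": Tensor(value)})
save_checkpoint(param_list, "cifar10_pre_checkpoint_based_imagenet21k.ckpt")
```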
For distributed training, an hccl configuration file in JSON format needs to be created in advance.
Please follow the instructions at the link below:
<https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools>
- Training on ModelArts (if you want to run on ModelArts, refer to the [modelarts](https://support.huaweicloud.com/modelarts/) documentation)
- Multi-device training of the cifar10 dataset on ModelArts (a sketch of the data-copy step follows this block)

```python
# (1) Set the AI engine to MindSpore on the web page
# (2) Set "ckpt_url=obs://path/pre_ckpt/" on the web page (the pre-trained model is named "cifar10_pre_checkpoint_based_imagenet21k.ckpt")
#     Set "modelarts=True" on the web page
#     Set the other parameters on the web page
# (3) Upload your dataset to an S3 bucket
# (4) Set your code path to "/path/vit_base" on the web page
# (5) Set the startup file to "train.py" on the web page
# (6) Set the "training dataset" (e.g. /dataset/cifar10/cifar-10-batches-bin/), "training output file path", "job log path", etc. on the web page
# (7) Create the training job
```
# Script Description

## Script and Sample Code

```bash
├── models
    ├── README.md                              // description of all models
    ├── vit_base
        ├── README_CN.md                       // vit_base description
        ├── ascend310_infer                    // source code for Ascend 310 inference
        ├── scripts
        │   ├── run_distribution_train_ascend.sh   // shell script for distributed training on Ascend
        │   ├── run_infer_310.sh                   // shell script for inference on Ascend
        │   ├── run_standalone_eval_ascend.sh      // shell script for evaluation on Ascend
        │   ├── run_standalone_train_ascend.sh     // shell script for single-device training on Ascend
        ├── src
        │   ├── config.py                      // parameter configuration
        │   ├── dataset.py                     // dataset creation
        │   ├── modeling_ms.py                 // vit_base architecture
        │   ├── net_config.py                  // architecture parameter configuration
        ├── eval.py                            // evaluation script
        ├── export.py                          // export checkpoint files to air/mindir
        ├── postprocess.py                     // post-processing script for 310 inference
        ├── preprocess.py                      // pre-processing script for 310 inference
        ├── train.py                           // training script
```
## Script Parameters

Training and evaluation parameters can both be configured in config.py.

- Configuration for vit_base on the CIFAR-10 dataset.

```python
'name': 'cifar10'         # dataset name
'pre_trained': True       # whether training starts from the pre-trained model
'num_classes': 10         # number of dataset classes
'lr_init': 0.013          # initial learning rate (two-device parallel training)
'batch_size': 32          # training batch size
'epoch_size': 60          # total number of training epochs
'momentum': 0.9           # momentum
'weight_decay': 1e-4      # weight decay value
'image_height': 224       # image height fed into the model
'image_width': 224        # image width fed into the model
'data_path': '/dataset/cifar10/cifar-10-batches-bin/'     # absolute path of the training set
'val_data_path': '/dataset/cifar10/cifar-10-verify-bin/'  # absolute path of the evaluation set
'device_target': 'Ascend' # target device
'device_id': 0            # device ID for training or evaluation; can be ignored for distributed training
'keep_checkpoint_max': 2  # keep at most 2 ckpt files
'checkpoint_path': '/dataset/cifar10_pre_checkpoint_based_imagenet21k.ckpt'  # absolute path of the pre-trained model
# optimizer and lr related
'lr_scheduler': 'cosine_annealing'
'T_max': 50
```
For more configuration details, see the script `config.py`; the sketch below shows how the other scripts read these values.
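Since the configuration is an `easydict`, scripts read the entries as attributes; a minimal sketch:

```python
# Minimal sketch: config values are attributes of the cifar10_cfg edict.
from src.config import cifar10_cfg as cfg

print(cfg.batch_size)     # 32
print(cfg.lr_scheduler)   # 'cosine_annealing'
print(cfg.val_data_path)  # '/dataset/cifar10/cifar-10-verify-bin/'
```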
## Training Process

### Training

- Running in the Ascend processor environment
```bash
python train.py --device_id=0 --dataset_name='cifar10' > train.log 2>&1 &
OR
bash ./scripts/run_standalone_train_ascend.sh [DEVICE_ID] [DATASET_NAME]
```
The python command above runs in the background; you can view the results through the generated train.log file.
After training, you can find the loss values (and, every two epochs, the accuracy reported by the EvalCallBack in train.py) in the default script folder:
```bash
Load pre_trained ckpt: ./cifar10_pre_checkpoint_based_imagenet21k.ckpt
epoch: 1 step: 1562, loss is 0.12886986
epoch time: 289458.121 ms, per step time: 185.312 ms
epoch: 2 step: 1562, loss is 0.15596801
epoch time: 245404.168 ms, per step time: 157.109 ms
{'acc': 0.9240785256410257}
epoch: 3 step: 1562, loss is 0.06133139
epoch time: 244538.410 ms, per step time: 156.555 ms
epoch: 4 step: 1562, loss is 0.28615832
epoch time: 245382.652 ms, per step time: 157.095 ms
{'acc': 0.9597355769230769}
```
### Distributed Training

- Running in the Ascend processor environment

```bash
bash ./scripts/run_distribution_train_ascend.sh [RANK_TABLE] [DEVICE_NUM] [DEVICE_START] [DATASET_NAME]
```

The shell script above runs distributed training in the background. After training, the loss values are obtained as follows (see the initialization sketch after the log):
```bash
Load pre_trained ckpt: ./cifar10_pre_checkpoint_based_imagenet21k.ckpt
epoch: 1 step: 781, loss is 0.015172593
epoch time: 195952.289 ms, per step time: 250.899 ms
epoch: 2 step: 781, loss is 0.06709316
epoch time: 135894.327 ms, per step time: 174.000 ms
{'acc': 0.9853766025641025}
epoch: 3 step: 781, loss is 0.050968178
epoch time: 135056.020 ms, per step time: 172.927 ms
epoch: 4 step: 781, loss is 0.01949552
epoch time: 136084.816 ms, per step time: 174.244 ms
{'acc': 0.9854767628205128}
```
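Inside each train_parallel$i working directory, the launched train.py performs the HCCL initialization itself; condensed from train.py, assuming a two-device job started by the script above:

```python
# Condensed from train.py: data-parallel setup when RANK_SIZE > 1.
from mindspore import context
from mindspore.communication.management import init
from mindspore.context import ParallelMode

context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
init(backend_name='hccl')  # requires RANK_TABLE_FILE/DEVICE_ID/RANK_ID from the script
context.reset_auto_parallel_context()
context.set_auto_parallel_context(device_num=2, parallel_mode=ParallelMode.DATA_PARALLEL,
                                  gradients_mean=True)
```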
## Evaluation Process

### Evaluation

- Evaluating the CIFAR-10 dataset in the Ascend environment
```bash
python eval.py --checkpoint_path [CKPT_PATH] > ./eval.log 2>&1 &
OR
bash ./scripts/run_standalone_eval_ascend.sh [CKPT_PATH]
```
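Under the hood, eval.py rebuilds the network, loads the checkpoint, and calls `Model.eval`; a condensed sketch (the checkpoint path is an example):

```python
# Condensed from eval.py: restore a trained checkpoint and measure accuracy.
import mindspore.nn as nn
from mindspore.train.model import Model
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from src.config import cifar10_cfg as cfg
from src.dataset import create_dataset_cifar10
from src.modeling_ms import VisionTransformer
import src.net_config as configs

net = VisionTransformer(configs.get_b16_config, num_classes=cfg.num_classes)
load_param_into_net(net, load_checkpoint("train_vit_cifar10.ckpt"))  # example path
net.set_train(False)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
model = Model(net, loss_fn=loss, metrics={'acc'})
print(model.eval(create_dataset_cifar10(cfg.val_data_path, 1, 1, False)))
```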
## Export Process

### Export

Export the checkpoint file into a mindir-format model (a condensed sketch of what export.py does follows the command).
```shell
python export.py --ckpt_file [CKPT_FILE]
```
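Condensed from export.py (included later in this commit), with batch size 1 and a 224x224 input; the checkpoint path is an example:

```python
# Condensed from export.py: rebuild the network and export it as MINDIR.
import numpy as np
from mindspore import Tensor, export, load_checkpoint, load_param_into_net
from src.modeling_ms import VisionTransformer
import src.net_config as configs

net = VisionTransformer(configs.get_b16_config, num_classes=10)
load_param_into_net(net, load_checkpoint("train_vit_cifar10.ckpt"))  # example path
input_arr = Tensor(np.zeros([1, 3, 224, 224], np.float32))
export(net, input_arr, file_name="vit_base", file_format="MINDIR")
```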
## Inference Process

### Inference

Before running inference, the model must be exported. mindir models can be exported in any environment, while air models can only be exported on Ascend 910. The following shows an example of running inference with a mindir model.

- Inference with the CIFAR-10 dataset on Ascend 310

  The inference command is shown below, where 'MINDIR_PATH' is the mindir file path; 'DATASET' is the name of the inference dataset, which is 'Cifar10' here; 'DATA_PATH' is the inference dataset path; 'DEVICE_ID' is optional and defaults to 0.
```shell
# Ascend310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATASET] [DATA_PATH] [DEVICE_ID]
```
The inference accuracy results are saved in the scripts directory; a classification accuracy result similar to the following can be found in the acc.log file. The inference performance results are saved in scripts/time_Result; a performance result similar to the following can be found in the test_perform_static.txt file (a condensed sketch of the accuracy computation follows the log).
```shell
after allreduce eval: top1_correct=9854, tot=10000, acc=98.54%
NN inference cost average time: 52.2274 ms of infer_count 10000
```
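The accuracy line is produced by postprocess.py, which compares each result bin against its label bin; a condensed single-sample sketch (the file names follow the pattern written by preprocess.py and the 310 executable):

```python
# Condensed from postprocess.py: top-1 accuracy from the 310 inference outputs.
import numpy as np

result = np.fromfile("./result_Files/vit_base_cifar10_1_0_0.bin", dtype=np.float32).reshape(1, 10)
label = np.fromfile("./preprocess_Result/label/vit_base_cifar10_1_0.bin", dtype=np.int32)
correct = int(np.equal(np.argmax(result, -1), label).sum())  # 1 if the top-1 class matches
```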
# Model Description

## Performance

### Evaluation Performance

#### vit_base on CIFAR-10

| Parameters | Ascend |
| -------------------------- | ----------------------------------------------------------- |
| Model version | vit_base |
| Resources | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB; Red Hat 8.3.1-5 |
| Upload date | 2021-10-26 |
| MindSpore version | 1.3.0 |
| Dataset | CIFAR-10 |
| Training parameters | epoch=60, batch_size=32, lr_init=0.013 (for two-device parallel training) |
| Optimizer | Momentum |
| Loss function | Softmax cross entropy |
| Output | probability |
| Classification accuracy | two devices: 98.99% |
| Speed | single device: 157 ms/step; two devices: 174 ms/step |
| Total time | two devices: 2.48 hours / 60 epochs |
### Inference Performance

#### vit_base on CIFAR-10

| Parameters | Ascend |
| -------------------------- | ----------------------------------------------------------- |
| Model version | vit_base |
| Resources | Ascend 310 |
| Upload date | 2021-10-26 |
| MindSpore version | 1.3.0 |
| Dataset | CIFAR-10 |
| Classification accuracy | 98.54% |
| Speed | NN inference cost average time: 52.2274 ms of infer_count 10000 |
# ModelZoo Homepage

Please check out the official [homepage](https://gitee.com/mindspore/models).
cmake_minimum_required(VERSION 3.14.1)
project(Ascend310Infer)
add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O0 -g -std=c++17 -Werror -Wall -fPIE -Wl,--allow-shlib-undefined")
set(PROJECT_SRC_ROOT ${CMAKE_CURRENT_LIST_DIR}/)
option(MINDSPORE_PATH "mindspore install path" "")
include_directories(${MINDSPORE_PATH})
include_directories(${MINDSPORE_PATH}/include)
include_directories(${PROJECT_SRC_ROOT})
find_library(MS_LIB libmindspore.so ${MINDSPORE_PATH}/lib)
file(GLOB_RECURSE MD_LIB ${MINDSPORE_PATH}/_c_dataengine*)
find_package(gflags REQUIRED)
add_executable(main src/main.cc src/utils.cc)
target_link_libraries(main ${MS_LIB} ${MD_LIB} gflags)
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ ! -d out ]; then
mkdir out
fi
cd out || exit
cmake .. \
-DMINDSPORE_PATH="`pip show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath`"
make
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef MINDSPORE_INFERENCE_UTILS_H_
#define MINDSPORE_INFERENCE_UTILS_H_
#include <sys/stat.h>
#include <dirent.h>
#include <vector>
#include <string>
#include <memory>
#include "include/api/types.h"
std::vector<std::string> GetAllFiles(std::string_view dirName);
DIR *OpenDir(std::string_view dirName);
std::string RealPath(std::string_view path);
mindspore::MSTensor ReadFileToTensor(const std::string &file);
int WriteResult(const std::string& imageFile, const std::vector<mindspore::MSTensor> &outputs);
#endif
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <sys/time.h>
#include <gflags/gflags.h>
#include <dirent.h>
#include <iostream>
#include <string>
#include <algorithm>
#include <iosfwd>
#include <vector>
#include <fstream>
#include <sstream>
#include "../inc/utils.h"
#include "include/dataset/execute.h"
#include "include/dataset/transforms.h"
#include "include/dataset/vision.h"
#include "include/dataset/vision_ascend.h"
#include "include/api/types.h"
#include "include/api/model.h"
#include "include/api/serialization.h"
#include "include/api/context.h"
using mindspore::Serialization;
using mindspore::Model;
using mindspore::Context;
using mindspore::Status;
using mindspore::ModelType;
using mindspore::Graph;
using mindspore::GraphCell;
using mindspore::kSuccess;
using mindspore::MSTensor;
using mindspore::DataType;
using mindspore::dataset::Execute;
using mindspore::dataset::TensorTransform;
using mindspore::dataset::vision::Decode;
using mindspore::dataset::vision::Resize;
using mindspore::dataset::vision::CenterCrop;
using mindspore::dataset::vision::Normalize;
using mindspore::dataset::vision::HWC2CHW;
using mindspore::dataset::transforms::TypeCast;
DEFINE_string(model_path, "", "model path");
DEFINE_string(dataset, "Cifar10", "dataset: ImageNet or Cifar10");
DEFINE_string(dataset_path, ".", "dataset path");
DEFINE_int32(device_id, 0, "device id");
int main(int argc, char **argv) {
gflags::ParseCommandLineFlags(&argc, &argv, true);
if (RealPath(FLAGS_model_path).empty()) {
std::cout << "Invalid model" << std::endl;
return 1;
}
std::transform(FLAGS_dataset.begin(), FLAGS_dataset.end(), FLAGS_dataset.begin(), ::tolower);
auto context = std::make_shared<Context>();
auto ascend310_info = std::make_shared<mindspore::Ascend310DeviceInfo>();
ascend310_info->SetDeviceID(FLAGS_device_id);
context->MutableDeviceInfo().push_back(ascend310_info);
Graph graph;
Status ret = Serialization::Load(FLAGS_model_path, ModelType::kMindIR, &graph);
if (ret != kSuccess) {
std::cout << "Load model failed." << std::endl;
return 1;
}
Model model;
ret = model.Build(GraphCell(graph), context);
if (ret != kSuccess) {
std::cout << "ERROR: Build failed." << std::endl;
return 1;
}
std::vector<MSTensor> modelInputs = model.GetInputs();
auto all_files = GetAllFiles(FLAGS_dataset_path);
if (all_files.empty()) {
std::cout << "ERROR: no input data." << std::endl;
return 1;
}
auto decode = Decode();
auto resizeImageNet = Resize({256});
auto centerCrop = CenterCrop({224});
auto normalizeImageNet = Normalize({123.675, 116.28, 103.53}, {58.395, 57.12, 57.375});
auto hwc2chw = HWC2CHW();
mindspore::dataset::Execute transformImageNet({decode, resizeImageNet, centerCrop, normalizeImageNet, hwc2chw});
std::map<double, double> costTime_map;
size_t size = all_files.size();
for (size_t i = 0; i < size; ++i) {
struct timeval start;
struct timeval end;
double startTime_ms;
double endTime_ms;
std::vector<MSTensor> inputs;
std::vector<MSTensor> outputs;
std::cout << "Start predict input files:" << all_files[i] << std::endl;
mindspore::MSTensor image = ReadFileToTensor(all_files[i]);
if (FLAGS_dataset.compare("imagenet") == 0) {
transformImageNet(image, &image);
}
inputs.emplace_back(modelInputs[0].Name(), modelInputs[0].DataType(), modelInputs[0].Shape(),
image.Data().get(), image.DataSize());
gettimeofday(&start, NULL);
model.Predict(inputs, &outputs);
gettimeofday(&end, NULL);
startTime_ms = (1.0 * start.tv_sec * 1000000 + start.tv_usec) / 1000;
endTime_ms = (1.0 * end.tv_sec * 1000000 + end.tv_usec) / 1000;
costTime_map.insert(std::pair<double, double>(startTime_ms, endTime_ms));
int ret_ = WriteResult(all_files[i], outputs);
if (ret_ != kSuccess) {
std::cout << "write result failed." << std::endl;
return 1;
}
}
double average = 0.0;
int infer_cnt = 0;
for (auto iter = costTime_map.begin(); iter != costTime_map.end(); iter++) {
double diff = 0.0;
diff = iter->second - iter->first;
average += diff;
infer_cnt++;
}
average = average / infer_cnt;
std::stringstream timeCost;
timeCost << "NN inference cost average time: "<< average << " ms of infer_count " << infer_cnt << std::endl;
std::cout << "NN inference cost average time: "<< average << "ms of infer_count " << infer_cnt << std::endl;
std::string file_name = "./time_Result" + std::string("/test_perform_static.txt");
std::ofstream file_stream(file_name.c_str(), std::ios::trunc);
file_stream << timeCost.str();
file_stream.close();
costTime_map.clear();
return 0;
}
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "inc/utils.h"
#include <fstream>
#include <algorithm>
#include <iostream>
using mindspore::MSTensor;
using mindspore::DataType;
std::vector<std::string> GetAllFiles(std::string_view dirName) {
struct dirent *filename;
DIR *dir = OpenDir(dirName);
if (dir == nullptr) {
return {};
}
std::vector<std::string> dirs;
std::vector<std::string> files;
while ((filename = readdir(dir)) != nullptr) {
std::string dName = std::string(filename->d_name);
if (dName == "." || dName == "..") {
continue;
} else if (filename->d_type == DT_DIR) {
dirs.emplace_back(std::string(dirName) + "/" + filename->d_name);
} else if (filename->d_type == DT_REG) {
files.emplace_back(std::string(dirName) + "/" + filename->d_name);
} else {
continue;
}
}
for (auto d : dirs) {
dir = OpenDir(d);
while ((filename = readdir(dir)) != nullptr) {
std::string dName = std::string(filename->d_name);
if (dName == "." || dName == ".." || filename->d_type != DT_REG) {
continue;
}
files.emplace_back(std::string(d) + "/" + filename->d_name);
}
}
std::sort(files.begin(), files.end());
for (auto &f : files) {
std::cout << "image file: " << f << std::endl;
}
return files;
}
int WriteResult(const std::string& imageFile, const std::vector<MSTensor> &outputs) {
std::string homePath = "./result_Files";
for (size_t i = 0; i < outputs.size(); ++i) {
size_t outputSize;
std::shared_ptr<const void> netOutput;
netOutput = outputs[i].Data();
outputSize = outputs[i].DataSize();
int pos = imageFile.rfind('/');
std::string fileName(imageFile, pos + 1);
fileName.replace(fileName.find('.'), fileName.size() - fileName.find('.'), '_' + std::to_string(i) + ".bin");
std::string outFileName = homePath + "/" + fileName;
FILE * outputFile = fopen(outFileName.c_str(), "wb");
if (outputFile == nullptr) {
std::cout << "open result file" << outFileName << "failed" << std::endl;
return -1;
}
size_t size = fwrite(netOutput.get(), sizeof(char), outputSize, outputFile);
if (size != outputSize) {
fclose(outputFile);
outputFile = nullptr;
std::cout << "writer result file" << outFileName << "failed write size[" << size <<
"] is smaller than output size[" << outputSize << "], maybe the disk is full" << std::endl;
return -1;
}
fclose(outputFile);
outputFile = nullptr;
}
return 0;
}
mindspore::MSTensor ReadFileToTensor(const std::string &file) {
if (file.empty()) {
std::cout << "Pointer file is nullptr" << std::endl;
return mindspore::MSTensor();
}
std::ifstream ifs(file);
if (!ifs.good()) {
std::cout << "File: " << file << " is not exist" << std::endl;
return mindspore::MSTensor();
}
if (!ifs.is_open()) {
std::cout << "File: " << file << "open failed" << std::endl;
return mindspore::MSTensor();
}
ifs.seekg(0, std::ios::end);
size_t size = ifs.tellg();
mindspore::MSTensor buffer(file, mindspore::DataType::kNumberTypeUInt8, {static_cast<int64_t>(size)}, nullptr, size);
ifs.seekg(0, std::ios::beg);
ifs.read(reinterpret_cast<char *>(buffer.MutableData()), size);
ifs.close();
return buffer;
}
DIR *OpenDir(std::string_view dirName) {
if (dirName.empty()) {
std::cout << " dirName is null ! " << std::endl;
return nullptr;
}
std::string realPath = RealPath(dirName);
struct stat s;
lstat(realPath.c_str(), &s);
if (!S_ISDIR(s.st_mode)) {
std::cout << "dirName is not a valid directory !" << std::endl;
return nullptr;
}
DIR *dir;
dir = opendir(realPath.c_str());
if (dir == nullptr) {
std::cout << "Can not open dir " << dirName << std::endl;
return nullptr;
}
std::cout << "Successfully opened the dir " << dirName << std::endl;
return dir;
}
std::string RealPath(std::string_view path) {
char realPathMem[PATH_MAX] = {0};
char *realPathRet = nullptr;
realPathRet = realpath(path.data(), realPathMem);
if (realPathRet == nullptr) {
std::cout << "File: " << path << " is not exist.";
return "";
}
std::string realPath(realPathMem);
std::cout << path << " realpath is: " << realPath << std::endl;
return realPath;
}
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
Process the test set with the .ckpt model in turn.
"""
import argparse
import mindspore.nn as nn
from mindspore import context
from mindspore.train.model import Model
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from mindspore.common import set_seed
from mindspore import Tensor
from mindspore.common import dtype as mstype
from mindspore.nn.loss.loss import LossBase
from mindspore.ops import functional as F
from mindspore.ops import operations as P
from src.config import cifar10_cfg
from src.dataset import create_dataset_cifar10
from src.modeling_ms import VisionTransformer
import src.net_config as configs
set_seed(1)
parser = argparse.ArgumentParser(description='vit_base')
parser.add_argument('--dataset_name', type=str, default='cifar10', choices=['cifar10'],
help='dataset name.')
parser.add_argument('--sub_type', type=str, default='ViT-B_16',
choices=['ViT-B_16', 'ViT-B_32', 'ViT-L_16', 'ViT-L_32', 'ViT-H_14', 'testing'])
parser.add_argument('--checkpoint_path', type=str, default='./ckpt_0', help='Checkpoint file path')
parser.add_argument('--id', type=int, default=0, help='Device id')
args_opt = parser.parse_args()
class CrossEntropySmooth(LossBase):
"""CrossEntropy"""
def __init__(self, sparse=True, reduction='mean', smooth_factor=0., num_classes=1000):
super(CrossEntropySmooth, self).__init__()
self.onehot = P.OneHot()
self.sparse = sparse
self.on_value = Tensor(1.0 - smooth_factor, mstype.float32)
self.off_value = Tensor(1.0 * smooth_factor / (num_classes - 1), mstype.float32)
self.ce = nn.SoftmaxCrossEntropyWithLogits(reduction=reduction)
def construct(self, logit, label):
if self.sparse:
label = self.onehot(label, F.shape(logit)[1], self.on_value, self.off_value)
loss_ = self.ce(logit, label)
return loss_
if __name__ == '__main__':
CONFIGS = {'ViT-B_16': configs.get_b16_config,
'ViT-B_32': configs.get_b32_config,
'ViT-L_16': configs.get_l16_config,
'ViT-L_32': configs.get_l32_config,
'ViT-H_14': configs.get_h14_config,
'R50-ViT-B_16': configs.get_r50_b16_config,
'testing': configs.get_testing}
context.set_context(mode=context.GRAPH_MODE, device_target='Ascend', device_id=args_opt.id)
if args_opt.dataset_name == "cifar10":
cfg = cifar10_cfg
net = VisionTransformer(CONFIGS[args_opt.sub_type], num_classes=cfg.num_classes)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
opt = nn.Momentum(net.trainable_params(), 0.01, cfg.momentum, weight_decay=cfg.weight_decay)
        dataset = create_dataset_cifar10(cfg.val_data_path, 1, 1, False)  # device_num=1, training=False: no augmentation at evaluation time
param_dict = load_checkpoint(args_opt.checkpoint_path)
print("load checkpoint from [{}].".format(args_opt.checkpoint_path))
load_param_into_net(net, param_dict)
net.set_train(False)
model = Model(net, loss_fn=loss, optimizer=opt, metrics={'acc'})
else:
raise ValueError("dataset is not support.")
acc = model.eval(dataset)
print(f"model's accuracy is {acc}")
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
##############export checkpoint file into air, onnx or mindir model#################
python export.py
"""
import argparse
import numpy as np
from mindspore import Tensor, load_checkpoint, load_param_into_net, export, context
from src.modeling_ms import VisionTransformer
import src.net_config as configs
parser = argparse.ArgumentParser(description='vit_base export')
parser.add_argument("--device_id", type=int, default=0, help="Device id")
parser.add_argument('--sub_type', type=str, default='ViT-B_16',
choices=['ViT-B_16', 'ViT-B_32', 'ViT-L_16', 'ViT-L_32', 'ViT-H_14', 'testing'])
parser.add_argument("--batch_size", type=int, default=1, help="batch size")
parser.add_argument("--ckpt_file", type=str, required=True, help="Checkpoint file path.")
parser.add_argument("--file_name", type=str, default="vit_base", help="output file name.")
parser.add_argument('--width', type=int, default=224, help='input width')
parser.add_argument('--height', type=int, default=224, help='input height')
parser.add_argument("--file_format", type=str, choices=["AIR", "ONNX", "MINDIR"], default="MINDIR", help="file format")
parser.add_argument("--device_target", type=str, default="Ascend",
choices=["Ascend", "GPU", "CPU"], help="device target(default: Ascend)")
args = parser.parse_args()
context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target)
if args.device_target == "Ascend":
context.set_context(device_id=args.device_id)
if __name__ == '__main__':
CONFIGS = {'ViT-B_16': configs.get_b16_config,
'ViT-B_32': configs.get_b32_config,
'ViT-L_16': configs.get_l16_config,
'ViT-L_32': configs.get_l32_config,
'ViT-H_14': configs.get_h14_config,
'R50-ViT-B_16': configs.get_r50_b16_config,
'testing': configs.get_testing}
net = VisionTransformer(CONFIGS[args.sub_type], num_classes=10)
assert args.ckpt_file is not None, "checkpoint_path is None."
param_dict = load_checkpoint(args.ckpt_file)
load_param_into_net(net, param_dict)
input_arr = Tensor(np.zeros([args.batch_size, 3, args.height, args.width], np.float32))
export(net, input_arr, file_name=args.file_name, file_format=args.file_format)
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""postprocess for 310 inference"""
import os
import argparse
import numpy as np
parser = argparse.ArgumentParser(description="postprocess")
parser.add_argument("--result_path", type=str, required=True, help="result files path.")
parser.add_argument("--label_file", type=str, required=True, help="label file path.")
args = parser.parse_args()
if __name__ == '__main__':
img_tot = 0
top1_correct = 0
result_shape = (1, 10)
files = os.listdir(args.result_path)
for file in files:
full_file_path = os.path.join(args.result_path, file)
if os.path.isfile(full_file_path):
result = np.fromfile(full_file_path, dtype=np.float32).reshape(result_shape)
label_path = os.path.join(args.label_file, file.split(".bin")[0][:-2] + ".bin")
gt_classes = np.fromfile(label_path, dtype=np.int32)
top1_output = np.argmax(result, (-1))
t1_correct = np.equal(top1_output, gt_classes).sum()
top1_correct += t1_correct
img_tot += 1
acc1 = 100.0 * top1_correct / img_tot
print('after allreduce eval: top1_correct={}, tot={}, acc={:.2f}%'.format(top1_correct, img_tot, acc1))
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""preprocess"""
import os
import argparse
from src.dataset import create_dataset_cifar10
parser = argparse.ArgumentParser('preprocess')
parser.add_argument('--data_path', type=str, default='', help='eval data dir')
args = parser.parse_args()
if __name__ == "__main__":
dataset = create_dataset_cifar10(args.data_path, 1, 1, False)
img_path = os.path.join('./preprocess_Result/', "img_data")
label_path = os.path.join('./preprocess_Result/', "label")
os.makedirs(img_path)
os.makedirs(label_path)
batch_size = 1
for idx, data in enumerate(dataset.create_dict_iterator(output_numpy=True, num_epochs=1)):
img_data = data["image"]
img_label = data["label"]
file_name = "vit_base_cifar10_" + str(batch_size) + "_" + str(idx) + ".bin"
img_file_path = os.path.join(img_path, file_name)
img_data.tofile(img_file_path)
label_file_path = os.path.join(label_path, file_name)
img_label.tofile(label_file_path)
print("=" * 20, "export bin files finished", "=" * 20)
easydict
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [[ $# -ne 4 ]]; then
echo "Usage: bash ./scripts/run_distribution_train_ascend.sh [RANK_TABLE] [DEVICE_NUM] [DEVICE_START] [DATASET_NAME]"
exit 1
fi
ulimit -u unlimited
export DEVICE_NUM=$2
export RANK_SIZE=$2
RANK_TABLE_FILE=$(realpath $1)
export RANK_TABLE_FILE
echo "RANK_TABLE_FILE=${RANK_TABLE_FILE}"
device_start=$3
for((i=0; i<${DEVICE_NUM}; i++))
do
export DEVICE_ID=$((device_start + i))
export RANK_ID=$i
rm -rf ./train_parallel$i
mkdir ./train_parallel$i
cp -r ./src ./train_parallel$i
cp ./train.py ./train_parallel$i
echo "start training for rank $RANK_ID, device $DEVICE_ID"
cd ./train_parallel$i ||exit
env > env.log
python train.py --device_id=$DEVICE_ID --dataset_name=$4 > log 2>&1 &
cd ..
done
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [[ $# -lt 3 || $# -gt 4 ]]; then
echo "Usage: bash run_infer_310.sh [MINDIR_PATH] [DATASET] [DATA_PATH] [DEVICE_ID]
DEVICE_ID is optional, default value is zero"
exit 1
fi
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
typeset -l dataset
model=$(get_real_path $1)
dataset=$2
data_path=$(get_real_path $3)
device_id=0
if [ $# == 4 ]; then
device_id=$4
fi
echo $model
echo $dataset
echo $data_path
echo $device_id
export ASCEND_HOME=/usr/local/Ascend/
if [ -d ${ASCEND_HOME}/ascend-toolkit ]; then
export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/atc/bin:$PATH
export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/ascend-toolkit/latest/atc/lib64:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH
export TBE_IMPL_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe
export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:${TBE_IMPL_PATH}:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/python/site-packages:$PYTHONPATH
export ASCEND_OPP_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp
else
export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/atc/ccec_compiler/bin:$ASCEND_HOME/atc/bin:$PATH
export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/atc/lib64:$ASCEND_HOME/acllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH
export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:$ASCEND_HOME/atc/python/site-packages:$PYTHONPATH
export ASCEND_OPP_PATH=$ASCEND_HOME/opp
fi
function compile_app()
{
cd ../ascend310_infer || exit
if [ -f "Makefile" ]; then
make clean
fi
sh build.sh &> build.log
if [ $? -ne 0 ]; then
echo "compile app code failed"
exit 1
fi
cd - || exit
}
function preprocess_data()
{
if [ -d preprocess_Result ]; then
rm -rf ./preprocess_Result
fi
mkdir preprocess_Result
python3.7 ../preprocess.py --data_path=$data_path #--output_path=./preprocess_Result
}
function infer()
{
if [ -d result_Files ]; then
rm -rf ./result_Files
fi
if [ -d time_Result ]; then
rm -rf ./time_Result
fi
mkdir result_Files
mkdir time_Result
../ascend310_infer/out/main --model_path=$model --dataset=$dataset --dataset_path=$data_path --device_id=$device_id &> infer.log
if [ $? -ne 0 ]; then
echo "execute inference failed"
exit 1
fi
}
function cal_acc()
{
if [ "x${dataset}" == "xcifar10" ] || [ "x${dataset}" == "xCifar10" ]; then
python ../postprocess.py --label_file=./preprocess_Result/label --result_path=result_Files &> acc.log
fi
if [ $? -ne 0 ]; then
echo "calculate accuracy failed"
exit 1
fi
}
if [ "x${dataset}" == "xcifar10" ] || [ "x${dataset}" == "xCifar10" ]; then
preprocess_data
data_path=./preprocess_Result/img_data
fi
compile_app
infer
cal_acc
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
echo "Usage: bash ./scripts/run_standalone_eval_ascend.sh [CKPT_PATH]"
export CKPT=$1
python eval.py --checkpoint_path $CKPT > ./eval.log 2>&1 &
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
echo "Usage: bash ./scripts/run_standalone_train_ascend.sh [DEVICE_ID] [DATASET_NAME]"
export DEVICE_ID=$1
python train.py --device_id=$DEVICE_ID --dataset_name=$2 > train.log 2>&1 &
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
network config setting, will be used in main.py
"""
from easydict import EasyDict as edict
cifar10_cfg = edict({
'name': 'cifar10',
'pre_trained': True, # False
'num_classes': 10,
'lr_init': 0.013, # 2P
'batch_size': 32,
'epoch_size': 60,
'momentum': 0.9,
'weight_decay': 1e-4,
'image_height': 224,
'image_width': 224,
'data_path': '/dataset/cifar10/cifar-10-batches-bin/',
'val_data_path': '/dataset/cifar10/cifar-10-verify-bin/',
'device_target': 'Ascend',
'device_id': 0,
'keep_checkpoint_max': 2,
'checkpoint_path': '/dataset/cifar10_pre_checkpoint_based_imagenet21k.ckpt',
'onnx_filename': 'vit_base',
'air_filename': 'vit_base',
# optimizer and lr related
'lr_scheduler': 'cosine_annealing',
'lr_epochs': [30, 60, 90, 120],
'lr_gamma': 0.3,
'eta_min': 0.0,
'T_max': 50,
'warmup_epochs': 0,
# loss related
'is_dynamic_loss_scale': 0,
'loss_scale': 1024,
'label_smooth_factor': 0.1,
'use_label_smooth': True,
})
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
Data operations, will be used in train.py and eval.py
"""
import os
import mindspore.common.dtype as mstype
import mindspore.dataset as ds
import mindspore.dataset.transforms.c_transforms as C
import mindspore.dataset.vision.c_transforms as vision
from src.config import cifar10_cfg
def create_dataset_cifar10(data_home, repeat_num=1, device_num=1, training=True):
"""Data operations."""
if device_num > 1:
rank_size, rank_id = _get_rank_info()
data_set = ds.Cifar10Dataset(data_home, num_shards=rank_size, shard_id=rank_id, shuffle=True)
else:
data_set = ds.Cifar10Dataset(data_home, shuffle=False)
resize_height = cifar10_cfg.image_height
resize_width = cifar10_cfg.image_width
# define map operations
random_crop_op = vision.RandomCrop((32, 32), (4, 4, 4, 4)) # padding_mode default CONSTANT
random_horizontal_op = vision.RandomHorizontalFlip()
resize_op = vision.Resize((resize_height, resize_width)) # interpolation default BILINEAR
rescale_op = vision.Rescale(1.0 / 255.0, 0.0)
normalize_op = vision.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
changeswap_op = vision.HWC2CHW()
type_cast_op = C.TypeCast(mstype.int32)
c_trans = []
if training:
c_trans = [random_crop_op, random_horizontal_op]
c_trans += [resize_op, rescale_op, normalize_op, changeswap_op]
# apply map operations on images
data_set = data_set.map(operations=type_cast_op, input_columns="label")
data_set = data_set.map(operations=c_trans, input_columns="image")
# apply batch operations
if training:
data_set = data_set.batch(batch_size=cifar10_cfg.batch_size, drop_remainder=True)
else:
data_set = data_set.batch(batch_size=1, drop_remainder=True)
# apply repeat operations
data_set = data_set.repeat(repeat_num)
return data_set
def _get_rank_info():
"""
get rank size and rank id
"""
rank_size = int(os.environ.get("RANK_SIZE", 1))
if rank_size > 1:
from mindspore.communication.management import get_rank, get_group_size
rank_size = get_group_size()
rank_id = get_rank()
else:
rank_size = rank_id = None
return rank_size, rank_id
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
model.
"""
import copy
import mindspore
from mindspore import Parameter, Tensor
import mindspore.nn as nn
import mindspore.ops.operations as P
def swish(x):
return x * P.Sigmoid()(x)
ACT2FN = {"gelu": nn.GELU(), "relu": P.ReLU(), "swish": swish}
class Attention(nn.Cell):
"""Attention"""
def __init__(self, config):
super(Attention, self).__init__()
self.num_attention_heads = config.transformer_num_heads
self.attention_head_size = int(config.hidden_size / self.num_attention_heads)
self.attention_head_size2 = Tensor(config.hidden_size / self.num_attention_heads, mindspore.float32)
self.all_head_size = self.num_attention_heads * self.attention_head_size
self.query = nn.Dense(config.hidden_size, self.all_head_size)
self.key = nn.Dense(config.hidden_size, self.all_head_size)
self.value = nn.Dense(config.hidden_size, self.all_head_size)
self.out = nn.Dense(config.hidden_size, config.hidden_size)
        # NOTE: nn.Dropout in MindSpore 1.x takes keep_prob, so the
        # "*_dropout_rate" values in net_config.py are keep probabilities.
        self.attn_dropout = nn.Dropout(config.transformer_attention_dropout_rate)
        self.proj_dropout = nn.Dropout(config.transformer_attention_dropout_rate)
self.softmax = nn.Softmax(axis=-1)
def transpose_for_scores(self, x):
"""transpose_for_scores"""
new_x_shape = P.Shape()(x)[:-1] + (self.num_attention_heads, self.attention_head_size)
x = P.Reshape()(x, new_x_shape)
return P.Transpose()(x, (0, 2, 1, 3,))
def construct(self, hidden_states):
"""construct"""
mixed_query_layer = self.query(hidden_states)
mixed_key_layer = self.key(hidden_states)
mixed_value_layer = self.value(hidden_states)
query_layer = self.transpose_for_scores(mixed_query_layer)
key_layer = self.transpose_for_scores(mixed_key_layer)
value_layer = self.transpose_for_scores(mixed_value_layer)
attention_scores = mindspore.ops.matmul(query_layer, P.Transpose()(key_layer, (0, 1, 3, 2)))
attention_scores = attention_scores / P.Sqrt()(self.attention_head_size2)
attention_probs = self.softmax(attention_scores)
attention_probs = self.attn_dropout(attention_probs)
context_layer = mindspore.ops.matmul(attention_probs, value_layer)
context_layer = P.Transpose()(context_layer, (0, 2, 1, 3))
new_context_layer_shape = P.Shape()(context_layer)[:-2] + (self.all_head_size,)
context_layer = P.Reshape()(context_layer, new_context_layer_shape)
attention_output = self.out(context_layer)
attention_output = self.proj_dropout(attention_output)
return attention_output
class Mlp(nn.Cell):
"""Mlp"""
def __init__(self, config):
super(Mlp, self).__init__()
self.fc1 = nn.Dense(config.hidden_size, config.transformer_mlp_dim,
weight_init='XavierUniform', bias_init='Normal')
self.fc2 = nn.Dense(config.transformer_mlp_dim, config.hidden_size,
weight_init='XavierUniform', bias_init='Normal')
self.act_fn = ACT2FN["gelu"]
self.dropout = nn.Dropout(config.transformer_dropout_rate)
def construct(self, x):
"""construct"""
x = self.fc1(x)
x = self.act_fn(x)
x = self.dropout(x)
x = self.fc2(x)
x = self.dropout(x)
return x
class Embeddings(nn.Cell):
"""Construct the embeddings from patch, position embeddings."""
def __init__(self, config, img_size, in_channels=3):
super(Embeddings, self).__init__()
self.hybrid = None
if config.patches_grid is not None:
grid_size = config.patches_grid
patch_size = (img_size[0] // 16 // grid_size[0], img_size[1] // 16 // grid_size[1])
n_patches = (img_size[0] // 16) * (img_size[1] // 16)
self.hybrid = True
else:
patch_size = config.patches_size
n_patches = (img_size[0] // patch_size) * (img_size[1] // patch_size)
self.hybrid = False
        if self.hybrid:
            # NOTE: the hybrid (ResNet + ViT) backbone is referenced here, but
            # ResNetV2 is not defined in this file, so only the pure
            # patch-embedding configurations are usable.
            self.hybrid_model = ResNetV2(block_units=config.resnet.num_layers,
                                         width_factor=config.resnet.width_factor)
            in_channels = self.hybrid_model.width * 16
self.patch_embeddings = nn.Conv2d(in_channels=in_channels,
out_channels=config.hidden_size,
kernel_size=patch_size,
stride=patch_size, has_bias=True)
self.position_embeddings = Parameter(P.Zeros()((1, n_patches+1, config.hidden_size), mindspore.float32),
name="q1", requires_grad=True)
self.cls_token = Parameter(P.Zeros()((1, 1, config.hidden_size), mindspore.float32), name="q2",
requires_grad=True)
self.dropout = nn.Dropout(config.transformer_dropout_rate)
def construct(self, x):
"""construct"""
B = x.shape[0]
cls_tokens = P.BroadcastTo((B, self.cls_token.shape[1], self.cls_token.shape[2]))(self.cls_token)
if self.hybrid:
x = self.hybrid_model(x)
x = self.patch_embeddings(x)
x = P.Reshape()(x, (x.shape[0], x.shape[1], x.shape[2] * x.shape[3]))
x = P.Transpose()(x, (0, 2, 1))
x = P.Concat(1)((cls_tokens, x))
embeddings = x + self.position_embeddings
embeddings = self.dropout(embeddings)
return embeddings
class Block(nn.Cell):
"""Block"""
def __init__(self, config):
super(Block, self).__init__()
self.hidden_size = config.hidden_size
self.attention_norm = nn.LayerNorm([config.hidden_size], epsilon=1e-6)
self.ffn_norm = nn.LayerNorm([config.hidden_size], epsilon=1e-6)
self.ffn = Mlp(config)
self.attn = Attention(config)
def construct(self, x):
"""construct"""
h = x
x = self.attention_norm(x)
x = self.attn(x)
x = x + h
h = x
x = self.ffn_norm(x)
x = self.ffn(x)
x = x + h
return x
class Encoder(nn.Cell):
"""Encoder"""
def __init__(self, config):
super(Encoder, self).__init__()
self.layer = nn.CellList([])
self.encoder_norm = nn.LayerNorm([config.hidden_size], epsilon=1e-6)
for _ in range(config.transformer_num_layers):
layer = Block(config)
self.layer.append(copy.deepcopy(layer))
def construct(self, hidden_states):
"""construct"""
for layer_block in self.layer:
hidden_states = layer_block(hidden_states)
encoded = self.encoder_norm(hidden_states)
return encoded
class Transformer(nn.Cell):
"""Transformer"""
def __init__(self, config, img_size):
super(Transformer, self).__init__()
self.embeddings = Embeddings(config, img_size=img_size)
self.encoder = Encoder(config)
def construct(self, input_ids):
"""construct"""
embedding_output = self.embeddings(input_ids)
encoded = self.encoder(embedding_output)
return encoded
class VisionTransformer(nn.Cell):
"""VisionTransformer"""
def __init__(self, config, img_size=(224, 224), num_classes=21843):
super(VisionTransformer, self).__init__()
self.num_classes = num_classes
self.classifier = config.classifier
self.transformer = Transformer(config, img_size)
self.head = nn.Dense(config.hidden_size, num_classes)
def construct(self, x, labels=None):
"""construct"""
x = self.transformer(x)
logits = self.head(x[:, 0])
return logits
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
Configurations.
"""
from easydict import EasyDict as edict
# Returns a minimal configuration for testing.
get_testing = edict({
'patches_grid': None,
'patches_size': 16,
'hidden_size': 1,
'transformer_mlp_dim': 1,
'transformer_num_heads': 1,
'transformer_num_layers': 1,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 0.9,
'classifier': 'token',
'representation_size': None,
})
# Returns the ViT-B/16 configuration.
get_b16_config = edict({
'patches_grid': None,
'patches_size': 16,
'hidden_size': 768,
'transformer_mlp_dim': 3072,
'transformer_num_heads': 12,
'transformer_num_layers': 12,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 1.0, # 0.9
'classifier': 'token',
'representation_size': None,
})
# Returns the Resnet50 + ViT-B/16 configuration.
get_r50_b16_config = edict({
'patches_grid': 14,
'resnet_num_layers': (3, 4, 9),
'resnet_width_factor': 1,
})
# Returns the ViT-B/32 configuration.
get_b32_config = edict({
'patches_grid': None,
'patches_size': 32,
'hidden_size': 768,
'transformer_mlp_dim': 3072,
'transformer_num_heads': 12,
'transformer_num_layers': 12,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 0.9,
'classifier': 'token',
'representation_size': None,
})
# Returns the ViT-L/16 configuration.
get_l16_config = edict({
'patches_grid': None,
'patches_size': 16,
'hidden_size': 1024,
'transformer_mlp_dim': 4096,
'transformer_num_heads': 16,
'transformer_num_layers': 24,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 0.9,
'classifier': 'token',
'representation_size': None,
})
# Returns the ViT-L/32 configuration.
get_l32_config = edict({
'patches_grid': None,
'patches_size': 32,
'hidden_size': 1024,
'transformer_mlp_dim': 4096,
'transformer_num_heads': 16,
'transformer_num_layers': 24,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 0.9,
'classifier': 'token',
'representation_size': None,
})
# Returns the ViT-H/14 configuration.
get_h14_config = edict({
'patches_grid': None,
'patches_size': 14,
'hidden_size': 1280,
'transformer_mlp_dim': 5120,
'transformer_num_heads': 16,
'transformer_num_layers': 32,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 0.9,
'classifier': 'token',
'representation_size': None,
})
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
Train.
"""
import argparse
import os
import math
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor
from mindspore import context
from mindspore.communication.management import init
from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor, TimeMonitor
from mindspore.train.model import Model
from mindspore.context import ParallelMode
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from mindspore.common import set_seed
from mindspore.train.callback import Callback
from src.config import cifar10_cfg
from src.dataset import create_dataset_cifar10
from src.modeling_ms import VisionTransformer
import src.net_config as configs
set_seed(2)
def lr_steps_imagenet(_cfg, steps_per_epoch):
"""lr step for imagenet"""
if _cfg.lr_scheduler == 'cosine_annealing':
_lr = warmup_cosine_annealing_lr(_cfg.lr_init,
steps_per_epoch,
_cfg.warmup_epochs,
_cfg.epoch_size,
_cfg.T_max,
_cfg.eta_min)
else:
raise NotImplementedError(_cfg.lr_scheduler)
return _lr
def linear_warmup_lr(current_step, warmup_steps, base_lr, init_lr):
lr_inc = (float(base_lr) - float(init_lr)) / float(warmup_steps)
lr1 = float(init_lr) + lr_inc * current_step
return lr1
def warmup_cosine_annealing_lr(lr5, steps_per_epoch, warmup_epochs, max_epoch, T_max, eta_min=0):
""" warmup cosine annealing lr"""
base_lr = lr5
warmup_init_lr = 0
total_steps = int(max_epoch * steps_per_epoch)
warmup_steps = int(warmup_epochs * steps_per_epoch)
lr_each_step = []
for i in range(total_steps):
last_epoch = i // steps_per_epoch
if i < warmup_steps:
lr5 = linear_warmup_lr(i + 1, warmup_steps, base_lr, warmup_init_lr)
else:
lr5 = eta_min + (base_lr - eta_min) * (1. + math.cos(math.pi * last_epoch / T_max)) / 2
lr_each_step.append(lr5)
return np.array(lr_each_step).astype(np.float32)
class EvalCallBack(Callback):
"""EvalCallBack"""
    def __init__(self, model0, eval_dataset, eval_per_epoch, epoch_per_eval0):
        super(EvalCallBack, self).__init__()
        self.model = model0
        self.eval_dataset = eval_dataset
        self.eval_per_epoch = eval_per_epoch
        self.epoch_per_eval = epoch_per_eval0
def epoch_end(self, run_context):
"""epoch_end"""
cb_param = run_context.original_args()
cur_epoch = cb_param.cur_epoch_num
if cur_epoch % self.eval_per_epoch == 0:
acc = self.model.eval(self.eval_dataset)
self.epoch_per_eval["epoch"].append(cur_epoch)
self.epoch_per_eval["acc"].append(acc)
print(acc)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Classification')
parser.add_argument('--dataset_name', type=str, default='cifar10', choices=['cifar10'],
help='dataset name.')
parser.add_argument('--sub_type', type=str, default='ViT-B_16',
choices=['ViT-B_16', 'ViT-B_32', 'ViT-L_16', 'ViT-L_32', 'ViT-H_14', 'testing'])
parser.add_argument('--device_id', type=int, default=None, help='device id of GPU or Ascend. (Default: None)')
parser.add_argument('--device_start', type=int, default=0, help='start device id. (Default: 0)')
parser.add_argument('--data_url', default=None, help='Location of data.')
parser.add_argument('--train_url', default=None, help='Location of training outputs.')
parser.add_argument('--ckpt_url', default=None, help='Location of ckpt.')
parser.add_argument('--modelarts', default=False, help='Use ModelArts or not.')
args_opt = parser.parse_args()
if args_opt.modelarts:
import moxing as mox
local_data_path = '/cache/data'
local_ckpt_path = '/cache/data/pre_ckpt'
if args_opt.dataset_name == "cifar10":
cfg = cifar10_cfg
else:
raise ValueError("Unsupported dataset.")
# set context
device_target = cfg.device_target
context.set_context(mode=context.GRAPH_MODE, device_target=cfg.device_target)
device_num = int(os.getenv('RANK_SIZE', '1'))
if device_target == "Ascend":
device_id = int(os.getenv('DEVICE_ID', '0'))
if args_opt.device_id is not None:
context.set_context(device_id=args_opt.device_id)
else:
context.set_context(device_id=cfg.device_id)
if device_num > 1:
if args_opt.modelarts:
context.set_context(device_id=int(os.getenv('DEVICE_ID')))
init(backend_name='hccl')
context.reset_auto_parallel_context()
context.set_auto_parallel_context(device_num=device_num, parallel_mode=ParallelMode.DATA_PARALLEL,
gradients_mean=True)
if args_opt.modelarts:
local_data_path = os.path.join(local_data_path, str(device_id))
else:
raise ValueError("Unsupported platform.")
if args_opt.modelarts:
mox.file.copy_parallel(src_url=args_opt.data_url, dst_url=local_data_path)
if args_opt.dataset_name == "cifar10":
if args_opt.modelarts:
dataset = create_dataset_cifar10(local_data_path, 1, device_num)
else:
dataset = create_dataset_cifar10(cfg.data_path, 1, device_num)
else:
raise ValueError("Unsupported dataset.")
batch_num = dataset.get_dataset_size()
CONFIGS = {'ViT-B_16': configs.get_b16_config,
'ViT-B_32': configs.get_b32_config,
'ViT-L_16': configs.get_l16_config,
'ViT-L_32': configs.get_l32_config,
'ViT-H_14': configs.get_h14_config,
'R50-ViT-B_16': configs.get_r50_b16_config,
'testing': configs.get_testing}
net = VisionTransformer(CONFIGS[args_opt.sub_type], num_classes=cfg.num_classes)
if args_opt.modelarts:
mox.file.copy_parallel(src_url=args_opt.ckpt_url, dst_url=local_ckpt_path)
if cfg.pre_trained:
if args_opt.modelarts:
param_dict = load_checkpoint(os.path.join(local_ckpt_path, "cifar10_pre_checkpoint_based_imagenet21k.ckpt"))
else:
param_dict = load_checkpoint(cfg.checkpoint_path)
load_param_into_net(net, param_dict)
print("Load pre_trained ckpt: {}".format(cfg.checkpoint_path))
loss_scale_manager = None
if args_opt.dataset_name == 'cifar10':
lr = lr_steps_imagenet(cfg, batch_num)
opt = nn.Momentum(params=net.trainable_params(),
learning_rate=Tensor(lr),
momentum=cfg.momentum,
weight_decay=cfg.weight_decay)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
model = Model(net, loss_fn=loss, optimizer=opt, metrics={'acc'},
amp_level="O3", keep_batchnorm_fp32=False, loss_scale_manager=loss_scale_manager)
config_ck = CheckpointConfig(save_checkpoint_steps=batch_num * 2, keep_checkpoint_max=cfg.keep_checkpoint_max)
time_cb = TimeMonitor(data_size=batch_num)
ckpt_save_dir = "./ckpt/"
ckpoint_cb = ModelCheckpoint(prefix="train_vit_" + args_opt.dataset_name, directory=ckpt_save_dir,
config=config_ck)
loss_cb = LossMonitor()
if args_opt.modelarts:
cbs = [time_cb, ModelCheckpoint(prefix="train_vit_" + args_opt.dataset_name, config=config_ck), loss_cb]
else:
epoch_per_eval = {"epoch": [], "acc": []}
        eval_cb = EvalCallBack(model, create_dataset_cifar10(cfg.val_data_path, 1, 1, False), 2, epoch_per_eval)
cbs = [time_cb, ckpoint_cb, loss_cb, eval_cb]
if device_num > 1 and device_id != args_opt.device_start:
cbs = [time_cb, loss_cb]
model.train(cfg.epoch_size, dataset, callbacks=cbs)
print("train success")