Commit 1d8fe960 authored by i-robot, committed by Gitee
!311 [Ascend Zhongzhi] vit_base network

Merge pull request !311 from yexijoe/vit
parents 87e899b8 f8585c86
# Contents

<!-- TOC -->

- [Contents](#contents)
- [vit_base Description](#vit_base-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Features](#features)
    - [Mixed Precision](#mixed-precision)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Training Process](#training-process)
        - [Training](#training)
        - [Distributed Training](#distributed-training)
    - [Evaluation Process](#evaluation-process)
        - [Evaluation](#evaluation)
    - [Export Process](#export-process)
        - [Export](#export)
    - [Inference Process](#inference-process)
        - [Inference](#inference)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Evaluation Performance](#evaluation-performance)
            - [vit_base on CIFAR-10](#vit_base-on-cifar-10)
        - [Inference Performance](#inference-performance)
            - [vit_base on CIFAR-10](#vit_base-on-cifar-10-1)
- [ModelZoo Homepage](#modelzoo-homepage)

<!-- /TOC -->
# vit_base Description

The Transformer architecture has been widely adopted in natural language processing. The authors of this model show that the reliance on CNNs in computer vision is not necessary: a pure Vision Transformer (ViT) applied directly to sequences of image patches performs image classification with accuracy comparable to state-of-the-art convolutional networks.

[Paper](https://arxiv.org/abs/2010.11929): Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., & Houlsby, N. (2020). An image is worth 16x16 words: Transformers for image recognition at scale.
# Model Architecture

The overall network architecture of vit_base is described in the [paper](https://arxiv.org/abs/2010.11929).
# Dataset

Dataset used: [CIFAR-10](http://www.cs.toronto.edu/~kriz/cifar.html)

- Dataset size: 175 MB, 60,000 color images in 10 classes
    - Training set: 146 MB, 50,000 images
    - Test set: 29 MB, 10,000 images
- Data format: binary files
    - Note: data is processed in src/dataset.py (a usage sketch follows below).
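Training and evaluation read the data through the `create_dataset_cifar10` helper defined in src/dataset.py (included later in this commit); a minimal usage sketch, assuming the default paths from src/config.py:

```python
# Minimal sketch of how train.py and eval.py build their data pipelines.
from src.config import cifar10_cfg
from src.dataset import create_dataset_cifar10

# Training pipeline: shuffled, augmented, batched by cifar10_cfg.batch_size (32).
train_set = create_dataset_cifar10(cifar10_cfg.data_path, 1, 1, True)
# Evaluation pipeline: no augmentation, batch size 1.
eval_set = create_dataset_cifar10(cifar10_cfg.val_data_path, 1, 1, False)
print(train_set.get_dataset_size(), eval_set.get_dataset_size())
```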
# Features

## Mixed Precision

The [mixed precision](https://www.mindspore.cn/docs/programming_guide/zh-CN/r1.3/enable_mixed_precision.html) training method uses both single-precision and half-precision data to speed up the training of deep neural networks while preserving the accuracy achievable with pure single-precision training. Mixed precision raises computing speed and lowers memory usage, and it also makes it possible to train larger models or use larger batch sizes on specific hardware.
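In this repository, mixed precision is enabled through the `amp_level` argument of `Model` in train.py; a minimal sketch with a stand-in network:

```python
# Minimal sketch mirroring train.py: amp_level="O3" runs the network in float16
# (keep_batchnorm_fp32=False keeps no layers in float32, as train.py does on Ascend).
import mindspore.nn as nn
from mindspore.train.model import Model

net = nn.Dense(10, 10)  # stand-in for VisionTransformer
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
opt = nn.Momentum(net.trainable_params(), learning_rate=0.013, momentum=0.9)
model = Model(net, loss_fn=loss, optimizer=opt, metrics={'acc'},
              amp_level="O3", keep_batchnorm_fp32=False)
```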
# Environment Requirements

- Hardware (Ascend)
    - Set up the hardware environment with Ascend processors.
- Framework
    - [MindSpore](https://www.mindspore.cn/install/en)
- For details, see the following resources:
    - [MindSpore Tutorials](https://www.mindspore.cn/tutorials/zh-CN/r1.3/index.html)
    - [MindSpore Python API](https://www.mindspore.cn/docs/api/zh-CN/r1.3/index.html)
# Quick Start

After installing MindSpore via the official website, you can follow the steps below for training and evaluation. In particular, before training you need to download the official [ViT-B_16](https://console.cloud.google.com/storage/vit_models/) model pre-trained on ImageNet21k, convert it into a ckpt model supported by MindSpore, name it "cifar10_pre_checkpoint_based_imagenet21k.ckpt", and place it in the same directory as the training and test sets (a conversion sketch follows the command block):

- Running in the Ascend processor environment
```bash
# run the training example
python train.py --device_id=0 --dataset_name='cifar10' > train.log 2>&1 &
OR
bash ./scripts/run_standalone_train_ascend.sh [DEVICE_ID] [DATASET_NAME]

# run the distributed training example
bash ./scripts/run_distribution_train_ascend.sh [RANK_TABLE] [DEVICE_NUM] [DEVICE_START] [DATASET_NAME]

# run the evaluation example
python eval.py --checkpoint_path [CKPT_PATH] > ./eval.log 2>&1 &
OR
bash ./scripts/run_standalone_eval_ascend.sh [CKPT_PATH]

# run the inference example
bash run_infer_310.sh ../vit_base.mindir Cifar10 /home/dataset/cifar-10-verify-bin/ 0
```
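The npz-to-ckpt conversion script is not part of this repository. Below is a hypothetical sketch of the conversion, assuming the official JAX checkpoint file `ViT-B_16.npz`; the parameter-name mapping onto src/modeling_ms.py is an assumption and has to be adapted by hand:

```python
# Hypothetical conversion sketch (not included in this repo): load the official
# npz weights and re-save them as a MindSpore checkpoint.
import numpy as np
from mindspore import Tensor, save_checkpoint

jax_weights = np.load("ViT-B_16.npz")
param_list = []
for name, value in jax_weights.items():
    # Assumed renaming scheme; the real mapping to the parameter names of
    # src/modeling_ms.py (query/key/value Dense layers, embeddings, ...) must be verified.
    ms_name = name.replace("/", ".")
    param_list.append({"name": ms_name, "data": Tensor(value)})
save_checkpoint(param_list, "cifar10_pre_checkpoint_based_imagenet21k.ckpt")
```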
For distributed training, an hccl configuration file in JSON format needs to be created in advance.
Please follow the instructions at the link below:
<https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools>
- Training on ModelArts (if you want to run on ModelArts, refer to the [modelarts](https://support.huaweicloud.com/modelarts/) documentation)
- Multi-device training of the cifar10 dataset on ModelArts (a sketch of the data-copy step follows this block)

```python
# (1) Set the AI engine to MindSpore on the web page
# (2) Set "ckpt_url=obs://path/pre_ckpt/" on the web page (the pre-trained model is named "cifar10_pre_checkpoint_based_imagenet21k.ckpt")
#     Set "modelarts=True" on the web page
#     Set the other parameters on the web page
# (3) Upload your dataset to an S3 bucket
# (4) Set your code path to "/path/vit_base" on the web page
# (5) Set the startup file to "train.py" on the web page
# (6) Set the "training dataset" (e.g. /dataset/cifar10/cifar-10-batches-bin/), "training output file path", "job log path", etc. on the web page
# (7) Create the training job
```
# Script Description

## Script and Sample Code

```bash
├── models
    ├── README.md                              // description of all models
    ├── vit_base
        ├── README_CN.md                       // vit_base description
        ├── ascend310_infer                    // source code for Ascend 310 inference
        ├── scripts
        │   ├── run_distribution_train_ascend.sh   // shell script for distributed training on Ascend
        │   ├── run_infer_310.sh                   // shell script for inference on Ascend
        │   ├── run_standalone_eval_ascend.sh      // shell script for evaluation on Ascend
        │   ├── run_standalone_train_ascend.sh     // shell script for single-device training on Ascend
        ├── src
        │   ├── config.py                      // parameter configuration
        │   ├── dataset.py                     // dataset creation
        │   ├── modeling_ms.py                 // vit_base architecture
        │   ├── net_config.py                  // architecture parameter configuration
        ├── eval.py                            // evaluation script
        ├── export.py                          // export checkpoint files to air/mindir
        ├── postprocess.py                     // post-processing script for 310 inference
        ├── preprocess.py                      // pre-processing script for 310 inference
        ├── train.py                           // training script
```
## Script Parameters

Training and evaluation parameters can both be configured in config.py.

- Configuration for vit_base on the CIFAR-10 dataset.

```python
'name': 'cifar10'         # dataset name
'pre_trained': True       # whether training starts from the pre-trained model
'num_classes': 10         # number of dataset classes
'lr_init': 0.013          # initial learning rate (two-device parallel training)
'batch_size': 32          # training batch size
'epoch_size': 60          # total number of training epochs
'momentum': 0.9           # momentum
'weight_decay': 1e-4      # weight decay value
'image_height': 224       # image height fed into the model
'image_width': 224        # image width fed into the model
'data_path': '/dataset/cifar10/cifar-10-batches-bin/'     # absolute path of the training set
'val_data_path': '/dataset/cifar10/cifar-10-verify-bin/'  # absolute path of the evaluation set
'device_target': 'Ascend' # target device
'device_id': 0            # device ID for training or evaluation; can be ignored for distributed training
'keep_checkpoint_max': 2  # keep at most 2 ckpt files
'checkpoint_path': '/dataset/cifar10_pre_checkpoint_based_imagenet21k.ckpt'  # absolute path of the pre-trained model
# optimizer and lr related
'lr_scheduler': 'cosine_annealing'
'T_max': 50
```
For more configuration details, see the script `config.py`; the sketch below shows how the other scripts read these values.
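Since the configuration is an `easydict`, scripts read the entries as attributes; a minimal sketch:

```python
# Minimal sketch: config values are attributes of the cifar10_cfg edict.
from src.config import cifar10_cfg as cfg

print(cfg.batch_size)     # 32
print(cfg.lr_scheduler)   # 'cosine_annealing'
print(cfg.val_data_path)  # '/dataset/cifar10/cifar-10-verify-bin/'
```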
## Training Process

### Training

- Running in the Ascend processor environment
```bash
python train.py --device_id=0 --dataset_name='cifar10' > train.log 2>&1 &
OR
bash ./scripts/run_standalone_train_ascend.sh [DEVICE_ID] [DATASET_NAME]
```
The python command above runs in the background; you can view the results through the generated train.log file.
After training, you can find the loss values (and, every two epochs, the accuracy reported by the EvalCallBack in train.py) in the default script folder:
```bash
Load pre_trained ckpt: ./cifar10_pre_checkpoint_based_imagenet21k.ckpt
epoch: 1 step: 1562, loss is 0.12886986
epoch time: 289458.121 ms, per step time: 185.312 ms
epoch: 2 step: 1562, loss is 0.15596801
epoch time: 245404.168 ms, per step time: 157.109 ms
{'acc': 0.9240785256410257}
epoch: 3 step: 1562, loss is 0.06133139
epoch time: 244538.410 ms, per step time: 156.555 ms
epoch: 4 step: 1562, loss is 0.28615832
epoch time: 245382.652 ms, per step time: 157.095 ms
{'acc': 0.9597355769230769}
```
### Distributed Training

- Running in the Ascend processor environment

```bash
bash ./scripts/run_distribution_train_ascend.sh [RANK_TABLE] [DEVICE_NUM] [DEVICE_START] [DATASET_NAME]
```

The shell script above runs distributed training in the background. After training, the loss values are obtained as follows (see the initialization sketch after the log):
```bash
Load pre_trained ckpt: ./cifar10_pre_checkpoint_based_imagenet21k.ckpt
epoch: 1 step: 781, loss is 0.015172593
epoch time: 195952.289 ms, per step time: 250.899 ms
epoch: 2 step: 781, loss is 0.06709316
epoch time: 135894.327 ms, per step time: 174.000 ms
{'acc': 0.9853766025641025}
epoch: 3 step: 781, loss is 0.050968178
epoch time: 135056.020 ms, per step time: 172.927 ms
epoch: 4 step: 781, loss is 0.01949552
epoch time: 136084.816 ms, per step time: 174.244 ms
{'acc': 0.9854767628205128}
```
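Inside each train_parallel$i working directory, the launched train.py performs the HCCL initialization itself; condensed from train.py, assuming a two-device job started by the script above:

```python
# Condensed from train.py: data-parallel setup when RANK_SIZE > 1.
from mindspore import context
from mindspore.communication.management import init
from mindspore.context import ParallelMode

context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
init(backend_name='hccl')  # requires RANK_TABLE_FILE/DEVICE_ID/RANK_ID from the script
context.reset_auto_parallel_context()
context.set_auto_parallel_context(device_num=2, parallel_mode=ParallelMode.DATA_PARALLEL,
                                  gradients_mean=True)
```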
## Evaluation Process

### Evaluation

- Evaluating the CIFAR-10 dataset in the Ascend environment
```bash
python eval.py --checkpoint_path [CKPT_PATH] > ./eval.log 2>&1 &
OR
bash ./scripts/run_standalone_eval_ascend.sh [CKPT_PATH]
```
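Under the hood, eval.py rebuilds the network, loads the checkpoint, and calls `Model.eval`; a condensed sketch (the checkpoint path is an example):

```python
# Condensed from eval.py: restore a trained checkpoint and measure accuracy.
import mindspore.nn as nn
from mindspore.train.model import Model
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from src.config import cifar10_cfg as cfg
from src.dataset import create_dataset_cifar10
from src.modeling_ms import VisionTransformer
import src.net_config as configs

net = VisionTransformer(configs.get_b16_config, num_classes=cfg.num_classes)
load_param_into_net(net, load_checkpoint("train_vit_cifar10.ckpt"))  # example path
net.set_train(False)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
model = Model(net, loss_fn=loss, metrics={'acc'})
print(model.eval(create_dataset_cifar10(cfg.val_data_path, 1, 1, False)))
```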
## Export Process

### Export

Export the checkpoint file into a mindir-format model (a condensed sketch of what export.py does follows the command).
```shell
python export.py --ckpt_file [CKPT_FILE]
```
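Condensed from export.py (included later in this commit), with batch size 1 and a 224x224 input; the checkpoint path is an example:

```python
# Condensed from export.py: rebuild the network and export it as MINDIR.
import numpy as np
from mindspore import Tensor, export, load_checkpoint, load_param_into_net
from src.modeling_ms import VisionTransformer
import src.net_config as configs

net = VisionTransformer(configs.get_b16_config, num_classes=10)
load_param_into_net(net, load_checkpoint("train_vit_cifar10.ckpt"))  # example path
input_arr = Tensor(np.zeros([1, 3, 224, 224], np.float32))
export(net, input_arr, file_name="vit_base", file_format="MINDIR")
```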
## Inference Process

### Inference

Before running inference, the model must be exported. mindir models can be exported in any environment, while air models can only be exported on Ascend 910. The following shows an example of running inference with a mindir model.

- Inference with the CIFAR-10 dataset on Ascend 310

  The inference command is shown below, where 'MINDIR_PATH' is the mindir file path; 'DATASET' is the name of the inference dataset, which is 'Cifar10' here; 'DATA_PATH' is the inference dataset path; 'DEVICE_ID' is optional and defaults to 0.
```shell
# Ascend310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATASET] [DATA_PATH] [DEVICE_ID]
```
The inference accuracy results are saved in the scripts directory; a classification accuracy result similar to the following can be found in the acc.log file. The inference performance results are saved in scripts/time_Result; a performance result similar to the following can be found in the test_perform_static.txt file (a condensed sketch of the accuracy computation follows the log).
```shell
after allreduce eval: top1_correct=9854, tot=10000, acc=98.54%
NN inference cost average time: 52.2274 ms of infer_count 10000
```
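The accuracy line is produced by postprocess.py, which compares each result bin against its label bin; a condensed single-sample sketch (the file names follow the pattern written by preprocess.py and the 310 executable):

```python
# Condensed from postprocess.py: top-1 accuracy from the 310 inference outputs.
import numpy as np

result = np.fromfile("./result_Files/vit_base_cifar10_1_0_0.bin", dtype=np.float32).reshape(1, 10)
label = np.fromfile("./preprocess_Result/label/vit_base_cifar10_1_0.bin", dtype=np.int32)
correct = int(np.equal(np.argmax(result, -1), label).sum())  # 1 if the top-1 class matches
```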
# Model Description

## Performance

### Evaluation Performance

#### vit_base on CIFAR-10

| Parameters | Ascend |
| -------------------------- | ----------------------------------------------------------- |
| Model version | vit_base |
| Resources | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB; Red Hat 8.3.1-5 |
| Upload date | 2021-10-26 |
| MindSpore version | 1.3.0 |
| Dataset | CIFAR-10 |
| Training parameters | epoch=60, batch_size=32, lr_init=0.013 (for two-device parallel training) |
| Optimizer | Momentum |
| Loss function | Softmax cross entropy |
| Output | probability |
| Classification accuracy | two devices: 98.99% |
| Speed | single device: 157 ms/step; two devices: 174 ms/step |
| Total time | two devices: 2.48 hours / 60 epochs |
### Inference Performance

#### vit_base on CIFAR-10

| Parameters | Ascend |
| -------------------------- | ----------------------------------------------------------- |
| Model version | vit_base |
| Resources | Ascend 310 |
| Upload date | 2021-10-26 |
| MindSpore version | 1.3.0 |
| Dataset | CIFAR-10 |
| Classification accuracy | 98.54% |
| Speed | NN inference cost average time: 52.2274 ms of infer_count 10000 |
# ModelZoo Homepage

Please check out the official [homepage](https://gitee.com/mindspore/models).
cmake_minimum_required(VERSION 3.14.1)
project(Ascend310Infer)
add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O0 -g -std=c++17 -Werror -Wall -fPIE -Wl,--allow-shlib-undefined")
set(PROJECT_SRC_ROOT ${CMAKE_CURRENT_LIST_DIR}/)
option(MINDSPORE_PATH "mindspore install path" "")
include_directories(${MINDSPORE_PATH})
include_directories(${MINDSPORE_PATH}/include)
include_directories(${PROJECT_SRC_ROOT})
find_library(MS_LIB libmindspore.so ${MINDSPORE_PATH}/lib)
file(GLOB_RECURSE MD_LIB ${MINDSPORE_PATH}/_c_dataengine*)
find_package(gflags REQUIRED)
add_executable(main src/main.cc src/utils.cc)
target_link_libraries(main ${MS_LIB} ${MD_LIB} gflags)
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ ! -d out ]; then
mkdir out
fi
cd out || exit
cmake .. \
-DMINDSPORE_PATH="`pip show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath`"
make
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef MINDSPORE_INFERENCE_UTILS_H_
#define MINDSPORE_INFERENCE_UTILS_H_
#include <sys/stat.h>
#include <dirent.h>
#include <vector>
#include <string>
#include <memory>
#include "include/api/types.h"
std::vector<std::string> GetAllFiles(std::string_view dirName);
DIR *OpenDir(std::string_view dirName);
std::string RealPath(std::string_view path);
mindspore::MSTensor ReadFileToTensor(const std::string &file);
int WriteResult(const std::string& imageFile, const std::vector<mindspore::MSTensor> &outputs);
#endif
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <sys/time.h>
#include <gflags/gflags.h>
#include <dirent.h>
#include <iostream>
#include <string>
#include <algorithm>
#include <iosfwd>
#include <vector>
#include <fstream>
#include <sstream>
#include "../inc/utils.h"
#include "include/dataset/execute.h"
#include "include/dataset/transforms.h"
#include "include/dataset/vision.h"
#include "include/dataset/vision_ascend.h"
#include "include/api/types.h"
#include "include/api/model.h"
#include "include/api/serialization.h"
#include "include/api/context.h"
using mindspore::Serialization;
using mindspore::Model;
using mindspore::Context;
using mindspore::Status;
using mindspore::ModelType;
using mindspore::Graph;
using mindspore::GraphCell;
using mindspore::kSuccess;
using mindspore::MSTensor;
using mindspore::DataType;
using mindspore::dataset::Execute;
using mindspore::dataset::TensorTransform;
using mindspore::dataset::vision::Decode;
using mindspore::dataset::vision::Resize;
using mindspore::dataset::vision::CenterCrop;
using mindspore::dataset::vision::Normalize;
using mindspore::dataset::vision::HWC2CHW;
using mindspore::dataset::transforms::TypeCast;
DEFINE_string(model_path, "", "model path");
DEFINE_string(dataset, "Cifar10", "dataset: ImageNet or Cifar10");
DEFINE_string(dataset_path, ".", "dataset path");
DEFINE_int32(device_id, 0, "device id");
int main(int argc, char **argv) {
gflags::ParseCommandLineFlags(&argc, &argv, true);
if (RealPath(FLAGS_model_path).empty()) {
std::cout << "Invalid model" << std::endl;
return 1;
}
std::transform(FLAGS_dataset.begin(), FLAGS_dataset.end(), FLAGS_dataset.begin(), ::tolower);
auto context = std::make_shared<Context>();
auto ascend310_info = std::make_shared<mindspore::Ascend310DeviceInfo>();
ascend310_info->SetDeviceID(FLAGS_device_id);
context->MutableDeviceInfo().push_back(ascend310_info);
Graph graph;
Status ret = Serialization::Load(FLAGS_model_path, ModelType::kMindIR, &graph);
if (ret != kSuccess) {
std::cout << "Load model failed." << std::endl;
return 1;
}
Model model;
ret = model.Build(GraphCell(graph), context);
if (ret != kSuccess) {
std::cout << "ERROR: Build failed." << std::endl;
return 1;
}
std::vector<MSTensor> modelInputs = model.GetInputs();
auto all_files = GetAllFiles(FLAGS_dataset_path);
if (all_files.empty()) {
std::cout << "ERROR: no input data." << std::endl;
return 1;
}
auto decode = Decode();
auto resizeImageNet = Resize({256});
auto centerCrop = CenterCrop({224});
auto normalizeImageNet = Normalize({123.675, 116.28, 103.53}, {58.395, 57.12, 57.375});
auto hwc2chw = HWC2CHW();
mindspore::dataset::Execute transformImageNet({decode, resizeImageNet, centerCrop, normalizeImageNet, hwc2chw});
std::map<double, double> costTime_map;
size_t size = all_files.size();
for (size_t i = 0; i < size; ++i) {
struct timeval start;
struct timeval end;
double startTime_ms;
double endTime_ms;
std::vector<MSTensor> inputs;
std::vector<MSTensor> outputs;
std::cout << "Start predict input files:" << all_files[i] << std::endl;
mindspore::MSTensor image = ReadFileToTensor(all_files[i]);
if (FLAGS_dataset.compare("imagenet") == 0) {
transformImageNet(image, &image);
}
inputs.emplace_back(modelInputs[0].Name(), modelInputs[0].DataType(), modelInputs[0].Shape(),
image.Data().get(), image.DataSize());
gettimeofday(&start, NULL);
model.Predict(inputs, &outputs);
gettimeofday(&end, NULL);
startTime_ms = (1.0 * start.tv_sec * 1000000 + start.tv_usec) / 1000;
endTime_ms = (1.0 * end.tv_sec * 1000000 + end.tv_usec) / 1000;
costTime_map.insert(std::pair<double, double>(startTime_ms, endTime_ms));
int ret_ = WriteResult(all_files[i], outputs);
if (ret_ != kSuccess) {
std::cout << "write result failed." << std::endl;
return 1;
}
}
double average = 0.0;
int infer_cnt = 0;
for (auto iter = costTime_map.begin(); iter != costTime_map.end(); iter++) {
double diff = 0.0;
diff = iter->second - iter->first;
average += diff;
infer_cnt++;
}
average = average / infer_cnt;
std::stringstream timeCost;
timeCost << "NN inference cost average time: "<< average << " ms of infer_count " << infer_cnt << std::endl;
std::cout << "NN inference cost average time: "<< average << "ms of infer_count " << infer_cnt << std::endl;
std::string file_name = "./time_Result" + std::string("/test_perform_static.txt");
std::ofstream file_stream(file_name.c_str(), std::ios::trunc);
file_stream << timeCost.str();
file_stream.close();
costTime_map.clear();
return 0;
}
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "inc/utils.h"
#include <fstream>
#include <algorithm>
#include <iostream>
using mindspore::MSTensor;
using mindspore::DataType;
std::vector<std::string> GetAllFiles(std::string_view dirName) {
struct dirent *filename;
DIR *dir = OpenDir(dirName);
if (dir == nullptr) {
return {};
}
std::vector<std::string> dirs;
std::vector<std::string> files;
while ((filename = readdir(dir)) != nullptr) {
std::string dName = std::string(filename->d_name);
if (dName == "." || dName == "..") {
continue;
} else if (filename->d_type == DT_DIR) {
dirs.emplace_back(std::string(dirName) + "/" + filename->d_name);
} else if (filename->d_type == DT_REG) {
files.emplace_back(std::string(dirName) + "/" + filename->d_name);
} else {
continue;
}
}
for (auto d : dirs) {
dir = OpenDir(d);
while ((filename = readdir(dir)) != nullptr) {
std::string dName = std::string(filename->d_name);
if (dName == "." || dName == ".." || filename->d_type != DT_REG) {
continue;
}
files.emplace_back(std::string(d) + "/" + filename->d_name);
}
}
std::sort(files.begin(), files.end());
for (auto &f : files) {
std::cout << "image file: " << f << std::endl;
}
return files;
}
int WriteResult(const std::string& imageFile, const std::vector<MSTensor> &outputs) {
std::string homePath = "./result_Files";
for (size_t i = 0; i < outputs.size(); ++i) {
size_t outputSize;
std::shared_ptr<const void> netOutput;
netOutput = outputs[i].Data();
outputSize = outputs[i].DataSize();
int pos = imageFile.rfind('/');
std::string fileName(imageFile, pos + 1);
fileName.replace(fileName.find('.'), fileName.size() - fileName.find('.'), '_' + std::to_string(i) + ".bin");
std::string outFileName = homePath + "/" + fileName;
FILE * outputFile = fopen(outFileName.c_str(), "wb");
if (outputFile == nullptr) {
std::cout << "open result file" << outFileName << "failed" << std::endl;
return -1;
}
size_t size = fwrite(netOutput.get(), sizeof(char), outputSize, outputFile);
if (size != outputSize) {
fclose(outputFile);
outputFile = nullptr;
std::cout << "writer result file" << outFileName << "failed write size[" << size <<
"] is smaller than output size[" << outputSize << "], maybe the disk is full" << std::endl;
return -1;
}
fclose(outputFile);
outputFile = nullptr;
}
return 0;
}
mindspore::MSTensor ReadFileToTensor(const std::string &file) {
if (file.empty()) {
std::cout << "Pointer file is nullptr" << std::endl;
return mindspore::MSTensor();
}
std::ifstream ifs(file);
if (!ifs.good()) {
std::cout << "File: " << file << " is not exist" << std::endl;
return mindspore::MSTensor();
}
if (!ifs.is_open()) {
std::cout << "File: " << file << "open failed" << std::endl;
return mindspore::MSTensor();
}
ifs.seekg(0, std::ios::end);
size_t size = ifs.tellg();
mindspore::MSTensor buffer(file, mindspore::DataType::kNumberTypeUInt8, {static_cast<int64_t>(size)}, nullptr, size);
ifs.seekg(0, std::ios::beg);
ifs.read(reinterpret_cast<char *>(buffer.MutableData()), size);
ifs.close();
return buffer;
}
DIR *OpenDir(std::string_view dirName) {
if (dirName.empty()) {
std::cout << " dirName is null ! " << std::endl;
return nullptr;
}
std::string realPath = RealPath(dirName);
struct stat s;
lstat(realPath.c_str(), &s);
if (!S_ISDIR(s.st_mode)) {
std::cout << "dirName is not a valid directory !" << std::endl;
return nullptr;
}
DIR *dir;
dir = opendir(realPath.c_str());
if (dir == nullptr) {
std::cout << "Can not open dir " << dirName << std::endl;
return nullptr;
}
std::cout << "Successfully opened the dir " << dirName << std::endl;
return dir;
}
std::string RealPath(std::string_view path) {
char realPathMem[PATH_MAX] = {0};
char *realPathRet = nullptr;
realPathRet = realpath(path.data(), realPathMem);
if (realPathRet == nullptr) {
std::cout << "File: " << path << " is not exist.";
return "";
}
std::string realPath(realPathMem);
std::cout << path << " realpath is: " << realPath << std::endl;
return realPath;
}
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
Process the test set with the .ckpt model in turn.
"""
import argparse
import mindspore.nn as nn
from mindspore import context
from mindspore.train.model import Model
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from mindspore.common import set_seed
from mindspore import Tensor
from mindspore.common import dtype as mstype
from mindspore.nn.loss.loss import LossBase
from mindspore.ops import functional as F
from mindspore.ops import operations as P
from src.config import cifar10_cfg
from src.dataset import create_dataset_cifar10
from src.modeling_ms import VisionTransformer
import src.net_config as configs
set_seed(1)
parser = argparse.ArgumentParser(description='vit_base')
parser.add_argument('--dataset_name', type=str, default='cifar10', choices=['cifar10'],
help='dataset name.')
parser.add_argument('--sub_type', type=str, default='ViT-B_16',
choices=['ViT-B_16', 'ViT-B_32', 'ViT-L_16', 'ViT-L_32', 'ViT-H_14', 'testing'])
parser.add_argument('--checkpoint_path', type=str, default='./ckpt_0', help='Checkpoint file path')
parser.add_argument('--id', type=int, default=0, help='Device id')
args_opt = parser.parse_args()
class CrossEntropySmooth(LossBase):
"""CrossEntropy"""
def __init__(self, sparse=True, reduction='mean', smooth_factor=0., num_classes=1000):
super(CrossEntropySmooth, self).__init__()
self.onehot = P.OneHot()
self.sparse = sparse
self.on_value = Tensor(1.0 - smooth_factor, mstype.float32)
self.off_value = Tensor(1.0 * smooth_factor / (num_classes - 1), mstype.float32)
self.ce = nn.SoftmaxCrossEntropyWithLogits(reduction=reduction)
def construct(self, logit, label):
if self.sparse:
label = self.onehot(label, F.shape(logit)[1], self.on_value, self.off_value)
loss_ = self.ce(logit, label)
return loss_
if __name__ == '__main__':
CONFIGS = {'ViT-B_16': configs.get_b16_config,
'ViT-B_32': configs.get_b32_config,
'ViT-L_16': configs.get_l16_config,
'ViT-L_32': configs.get_l32_config,
'ViT-H_14': configs.get_h14_config,
'R50-ViT-B_16': configs.get_r50_b16_config,
'testing': configs.get_testing}
context.set_context(mode=context.GRAPH_MODE, device_target='Ascend', device_id=args_opt.id)
if args_opt.dataset_name == "cifar10":
cfg = cifar10_cfg
net = VisionTransformer(CONFIGS[args_opt.sub_type], num_classes=cfg.num_classes)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
opt = nn.Momentum(net.trainable_params(), 0.01, cfg.momentum, weight_decay=cfg.weight_decay)
        dataset = create_dataset_cifar10(cfg.val_data_path, 1, 1, False)  # device_num=1, training=False: no augmentation at evaluation time
param_dict = load_checkpoint(args_opt.checkpoint_path)
print("load checkpoint from [{}].".format(args_opt.checkpoint_path))
load_param_into_net(net, param_dict)
net.set_train(False)
model = Model(net, loss_fn=loss, optimizer=opt, metrics={'acc'})
else:
raise ValueError("dataset is not support.")
acc = model.eval(dataset)
print(f"model's accuracy is {acc}")
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
##############export checkpoint file into air, onnx or mindir model#################
python export.py
"""
import argparse
import numpy as np
from mindspore import Tensor, load_checkpoint, load_param_into_net, export, context
from src.modeling_ms import VisionTransformer
import src.net_config as configs
parser = argparse.ArgumentParser(description='vit_base export')
parser.add_argument("--device_id", type=int, default=0, help="Device id")
parser.add_argument('--sub_type', type=str, default='ViT-B_16',
choices=['ViT-B_16', 'ViT-B_32', 'ViT-L_16', 'ViT-L_32', 'ViT-H_14', 'testing'])
parser.add_argument("--batch_size", type=int, default=1, help="batch size")
parser.add_argument("--ckpt_file", type=str, required=True, help="Checkpoint file path.")
parser.add_argument("--file_name", type=str, default="vit_base", help="output file name.")
parser.add_argument('--width', type=int, default=224, help='input width')
parser.add_argument('--height', type=int, default=224, help='input height')
parser.add_argument("--file_format", type=str, choices=["AIR", "ONNX", "MINDIR"], default="MINDIR", help="file format")
parser.add_argument("--device_target", type=str, default="Ascend",
choices=["Ascend", "GPU", "CPU"], help="device target(default: Ascend)")
args = parser.parse_args()
context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target)
if args.device_target == "Ascend":
context.set_context(device_id=args.device_id)
if __name__ == '__main__':
CONFIGS = {'ViT-B_16': configs.get_b16_config,
'ViT-B_32': configs.get_b32_config,
'ViT-L_16': configs.get_l16_config,
'ViT-L_32': configs.get_l32_config,
'ViT-H_14': configs.get_h14_config,
'R50-ViT-B_16': configs.get_r50_b16_config,
'testing': configs.get_testing}
net = VisionTransformer(CONFIGS[args.sub_type], num_classes=10)
assert args.ckpt_file is not None, "checkpoint_path is None."
param_dict = load_checkpoint(args.ckpt_file)
load_param_into_net(net, param_dict)
input_arr = Tensor(np.zeros([args.batch_size, 3, args.height, args.width], np.float32))
export(net, input_arr, file_name=args.file_name, file_format=args.file_format)
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""postprocess for 310 inference"""
import os
import argparse
import numpy as np
parser = argparse.ArgumentParser(description="postprocess")
parser.add_argument("--result_path", type=str, required=True, help="result files path.")
parser.add_argument("--label_file", type=str, required=True, help="label file path.")
args = parser.parse_args()
if __name__ == '__main__':
img_tot = 0
top1_correct = 0
result_shape = (1, 10)
files = os.listdir(args.result_path)
for file in files:
full_file_path = os.path.join(args.result_path, file)
if os.path.isfile(full_file_path):
result = np.fromfile(full_file_path, dtype=np.float32).reshape(result_shape)
label_path = os.path.join(args.label_file, file.split(".bin")[0][:-2] + ".bin")
gt_classes = np.fromfile(label_path, dtype=np.int32)
top1_output = np.argmax(result, (-1))
t1_correct = np.equal(top1_output, gt_classes).sum()
top1_correct += t1_correct
img_tot += 1
acc1 = 100.0 * top1_correct / img_tot
print('after allreduce eval: top1_correct={}, tot={}, acc={:.2f}%'.format(top1_correct, img_tot, acc1))
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""preprocess"""
import os
import argparse
from src.dataset import create_dataset_cifar10
parser = argparse.ArgumentParser('preprocess')
parser.add_argument('--data_path', type=str, default='', help='eval data dir')
args = parser.parse_args()
if __name__ == "__main__":
dataset = create_dataset_cifar10(args.data_path, 1, 1, False)
img_path = os.path.join('./preprocess_Result/', "img_data")
label_path = os.path.join('./preprocess_Result/', "label")
os.makedirs(img_path)
os.makedirs(label_path)
batch_size = 1
for idx, data in enumerate(dataset.create_dict_iterator(output_numpy=True, num_epochs=1)):
img_data = data["image"]
img_label = data["label"]
file_name = "vit_base_cifar10_" + str(batch_size) + "_" + str(idx) + ".bin"
img_file_path = os.path.join(img_path, file_name)
img_data.tofile(img_file_path)
label_file_path = os.path.join(label_path, file_name)
img_label.tofile(label_file_path)
print("=" * 20, "export bin files finished", "=" * 20)
easydict
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [[ $# -ne 4 ]]; then
echo "Usage: bash ./scripts/run_distribution_train_ascend.sh [RANK_TABLE] [DEVICE_NUM] [DEVICE_START] [DATASET_NAME]"
exit 1
fi
ulimit -u unlimited
export DEVICE_NUM=$2
export RANK_SIZE=$2
RANK_TABLE_FILE=$(realpath $1)
export RANK_TABLE_FILE
echo "RANK_TABLE_FILE=${RANK_TABLE_FILE}"
device_start=$3
for((i=0; i<${DEVICE_NUM}; i++))
do
export DEVICE_ID=$((device_start + i))
export RANK_ID=$i
rm -rf ./train_parallel$i
mkdir ./train_parallel$i
cp -r ./src ./train_parallel$i
cp ./train.py ./train_parallel$i
echo "start training for rank $RANK_ID, device $DEVICE_ID"
cd ./train_parallel$i ||exit
env > env.log
python train.py --device_id=$DEVICE_ID --dataset_name=$4 > log 2>&1 &
cd ..
done
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [[ $# -lt 3 || $# -gt 4 ]]; then
echo "Usage: bash run_infer_310.sh [MINDIR_PATH] [DATASET] [DATA_PATH] [DEVICE_ID]
DEVICE_ID is optional, default value is zero"
exit 1
fi
get_real_path(){
if [ "${1:0:1}" == "/" ]; then
echo "$1"
else
echo "$(realpath -m $PWD/$1)"
fi
}
typeset -l dataset
model=$(get_real_path $1)
dataset=$2
data_path=$(get_real_path $3)
device_id=0
if [ $# == 4 ]; then
device_id=$4
fi
echo $model
echo $dataset
echo $data_path
echo $device_id
export ASCEND_HOME=/usr/local/Ascend/
if [ -d ${ASCEND_HOME}/ascend-toolkit ]; then
export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/atc/bin:$PATH
export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/ascend-toolkit/latest/atc/lib64:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH
export TBE_IMPL_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe
export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:${TBE_IMPL_PATH}:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/python/site-packages:$PYTHONPATH
export ASCEND_OPP_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp
else
export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/atc/ccec_compiler/bin:$ASCEND_HOME/atc/bin:$PATH
export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/atc/lib64:$ASCEND_HOME/acllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH
export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:$ASCEND_HOME/atc/python/site-packages:$PYTHONPATH
export ASCEND_OPP_PATH=$ASCEND_HOME/opp
fi
function compile_app()
{
cd ../ascend310_infer || exit
if [ -f "Makefile" ]; then
make clean
fi
sh build.sh &> build.log
if [ $? -ne 0 ]; then
echo "compile app code failed"
exit 1
fi
cd - || exit
}
function preprocess_data()
{
if [ -d preprocess_Result ]; then
rm -rf ./preprocess_Result
fi
mkdir preprocess_Result
python3.7 ../preprocess.py --data_path=$data_path #--output_path=./preprocess_Result
}
function infer()
{
if [ -d result_Files ]; then
rm -rf ./result_Files
fi
if [ -d time_Result ]; then
rm -rf ./time_Result
fi
mkdir result_Files
mkdir time_Result
../ascend310_infer/out/main --model_path=$model --dataset=$dataset --dataset_path=$data_path --device_id=$device_id &> infer.log
if [ $? -ne 0 ]; then
echo "execute inference failed"
exit 1
fi
}
function cal_acc()
{
if [ "x${dataset}" == "xcifar10" ] || [ "x${dataset}" == "xCifar10" ]; then
python ../postprocess.py --label_file=./preprocess_Result/label --result_path=result_Files &> acc.log
fi
if [ $? -ne 0 ]; then
echo "calculate accuracy failed"
exit 1
fi
}
if [ "x${dataset}" == "xcifar10" ] || [ "x${dataset}" == "xCifar10" ]; then
preprocess_data
data_path=./preprocess_Result/img_data
fi
compile_app
infer
cal_acc
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
echo "Usage: bash ./scripts/run_standalone_eval_ascend.sh [CKPT_PATH]"
export CKPT=$1
python eval.py --checkpoint_path $CKPT > ./eval.log 2>&1 &
#!/bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
echo "Usage: bash ./scripts/run_standalone_train_ascend.sh [DEVICE_ID] [DATASET_NAME]"
export DEVICE_ID=$1
python train.py --device_id=$DEVICE_ID --dataset_name=$2 > train.log 2>&1 &
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
network config setting, will be used in main.py
"""
from easydict import EasyDict as edict
cifar10_cfg = edict({
'name': 'cifar10',
'pre_trained': True, # False
'num_classes': 10,
'lr_init': 0.013, # 2P
'batch_size': 32,
'epoch_size': 60,
'momentum': 0.9,
'weight_decay': 1e-4,
'image_height': 224,
'image_width': 224,
'data_path': '/dataset/cifar10/cifar-10-batches-bin/',
'val_data_path': '/dataset/cifar10/cifar-10-verify-bin/',
'device_target': 'Ascend',
'device_id': 0,
'keep_checkpoint_max': 2,
'checkpoint_path': '/dataset/cifar10_pre_checkpoint_based_imagenet21k.ckpt',
'onnx_filename': 'vit_base',
'air_filename': 'vit_base',
# optimizer and lr related
'lr_scheduler': 'cosine_annealing',
'lr_epochs': [30, 60, 90, 120],
'lr_gamma': 0.3,
'eta_min': 0.0,
'T_max': 50,
'warmup_epochs': 0,
# loss related
'is_dynamic_loss_scale': 0,
'loss_scale': 1024,
'label_smooth_factor': 0.1,
'use_label_smooth': True,
})
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
Data operations, will be used in train.py and eval.py
"""
import os
import mindspore.common.dtype as mstype
import mindspore.dataset as ds
import mindspore.dataset.transforms.c_transforms as C
import mindspore.dataset.vision.c_transforms as vision
from src.config import cifar10_cfg
def create_dataset_cifar10(data_home, repeat_num=1, device_num=1, training=True):
"""Data operations."""
if device_num > 1:
rank_size, rank_id = _get_rank_info()
data_set = ds.Cifar10Dataset(data_home, num_shards=rank_size, shard_id=rank_id, shuffle=True)
else:
data_set = ds.Cifar10Dataset(data_home, shuffle=False)
resize_height = cifar10_cfg.image_height
resize_width = cifar10_cfg.image_width
# define map operations
random_crop_op = vision.RandomCrop((32, 32), (4, 4, 4, 4)) # padding_mode default CONSTANT
random_horizontal_op = vision.RandomHorizontalFlip()
resize_op = vision.Resize((resize_height, resize_width)) # interpolation default BILINEAR
rescale_op = vision.Rescale(1.0 / 255.0, 0.0)
normalize_op = vision.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
changeswap_op = vision.HWC2CHW()
type_cast_op = C.TypeCast(mstype.int32)
c_trans = []
if training:
c_trans = [random_crop_op, random_horizontal_op]
c_trans += [resize_op, rescale_op, normalize_op, changeswap_op]
# apply map operations on images
data_set = data_set.map(operations=type_cast_op, input_columns="label")
data_set = data_set.map(operations=c_trans, input_columns="image")
# apply batch operations
if training:
data_set = data_set.batch(batch_size=cifar10_cfg.batch_size, drop_remainder=True)
else:
data_set = data_set.batch(batch_size=1, drop_remainder=True)
# apply repeat operations
data_set = data_set.repeat(repeat_num)
return data_set
def _get_rank_info():
"""
get rank size and rank id
"""
rank_size = int(os.environ.get("RANK_SIZE", 1))
if rank_size > 1:
from mindspore.communication.management import get_rank, get_group_size
rank_size = get_group_size()
rank_id = get_rank()
else:
rank_size = rank_id = None
return rank_size, rank_id
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
model.
"""
import copy
import mindspore
from mindspore import Parameter, Tensor
import mindspore.nn as nn
import mindspore.ops.operations as P
def swish(x):
return x * P.Sigmoid()(x)
ACT2FN = {"gelu": nn.GELU(), "relu": P.ReLU(), "swish": swish}
class Attention(nn.Cell):
"""Attention"""
def __init__(self, config):
super(Attention, self).__init__()
self.num_attention_heads = config.transformer_num_heads
self.attention_head_size = int(config.hidden_size / self.num_attention_heads)
self.attention_head_size2 = Tensor(config.hidden_size / self.num_attention_heads, mindspore.float32)
self.all_head_size = self.num_attention_heads * self.attention_head_size
self.query = nn.Dense(config.hidden_size, self.all_head_size)
self.key = nn.Dense(config.hidden_size, self.all_head_size)
self.value = nn.Dense(config.hidden_size, self.all_head_size)
self.out = nn.Dense(config.hidden_size, config.hidden_size)
        # NOTE: nn.Dropout in MindSpore 1.x takes keep_prob, so the
        # "*_dropout_rate" values in net_config.py are keep probabilities.
        self.attn_dropout = nn.Dropout(config.transformer_attention_dropout_rate)
        self.proj_dropout = nn.Dropout(config.transformer_attention_dropout_rate)
self.softmax = nn.Softmax(axis=-1)
def transpose_for_scores(self, x):
"""transpose_for_scores"""
new_x_shape = P.Shape()(x)[:-1] + (self.num_attention_heads, self.attention_head_size)
x = P.Reshape()(x, new_x_shape)
return P.Transpose()(x, (0, 2, 1, 3,))
def construct(self, hidden_states):
"""construct"""
mixed_query_layer = self.query(hidden_states)
mixed_key_layer = self.key(hidden_states)
mixed_value_layer = self.value(hidden_states)
query_layer = self.transpose_for_scores(mixed_query_layer)
key_layer = self.transpose_for_scores(mixed_key_layer)
value_layer = self.transpose_for_scores(mixed_value_layer)
attention_scores = mindspore.ops.matmul(query_layer, P.Transpose()(key_layer, (0, 1, 3, 2)))
attention_scores = attention_scores / P.Sqrt()(self.attention_head_size2)
attention_probs = self.softmax(attention_scores)
attention_probs = self.attn_dropout(attention_probs)
context_layer = mindspore.ops.matmul(attention_probs, value_layer)
context_layer = P.Transpose()(context_layer, (0, 2, 1, 3))
new_context_layer_shape = P.Shape()(context_layer)[:-2] + (self.all_head_size,)
context_layer = P.Reshape()(context_layer, new_context_layer_shape)
attention_output = self.out(context_layer)
attention_output = self.proj_dropout(attention_output)
return attention_output
class Mlp(nn.Cell):
"""Mlp"""
def __init__(self, config):
super(Mlp, self).__init__()
self.fc1 = nn.Dense(config.hidden_size, config.transformer_mlp_dim,
weight_init='XavierUniform', bias_init='Normal')
self.fc2 = nn.Dense(config.transformer_mlp_dim, config.hidden_size,
weight_init='XavierUniform', bias_init='Normal')
self.act_fn = ACT2FN["gelu"]
self.dropout = nn.Dropout(config.transformer_dropout_rate)
def construct(self, x):
"""construct"""
x = self.fc1(x)
x = self.act_fn(x)
x = self.dropout(x)
x = self.fc2(x)
x = self.dropout(x)
return x
class Embeddings(nn.Cell):
"""Construct the embeddings from patch, position embeddings."""
def __init__(self, config, img_size, in_channels=3):
super(Embeddings, self).__init__()
self.hybrid = None
if config.patches_grid is not None:
grid_size = config.patches_grid
patch_size = (img_size[0] // 16 // grid_size[0], img_size[1] // 16 // grid_size[1])
n_patches = (img_size[0] // 16) * (img_size[1] // 16)
self.hybrid = True
else:
patch_size = config.patches_size
n_patches = (img_size[0] // patch_size) * (img_size[1] // patch_size)
self.hybrid = False
        if self.hybrid:
            # NOTE: the hybrid (ResNet + ViT) backbone is referenced here, but
            # ResNetV2 is not defined in this file, so only the pure
            # patch-embedding configurations are usable.
            self.hybrid_model = ResNetV2(block_units=config.resnet.num_layers,
                                         width_factor=config.resnet.width_factor)
            in_channels = self.hybrid_model.width * 16
self.patch_embeddings = nn.Conv2d(in_channels=in_channels,
out_channels=config.hidden_size,
kernel_size=patch_size,
stride=patch_size, has_bias=True)
self.position_embeddings = Parameter(P.Zeros()((1, n_patches+1, config.hidden_size), mindspore.float32),
name="q1", requires_grad=True)
self.cls_token = Parameter(P.Zeros()((1, 1, config.hidden_size), mindspore.float32), name="q2",
requires_grad=True)
self.dropout = nn.Dropout(config.transformer_dropout_rate)
def construct(self, x):
"""construct"""
B = x.shape[0]
cls_tokens = P.BroadcastTo((B, self.cls_token.shape[1], self.cls_token.shape[2]))(self.cls_token)
if self.hybrid:
x = self.hybrid_model(x)
x = self.patch_embeddings(x)
x = P.Reshape()(x, (x.shape[0], x.shape[1], x.shape[2] * x.shape[3]))
x = P.Transpose()(x, (0, 2, 1))
x = P.Concat(1)((cls_tokens, x))
embeddings = x + self.position_embeddings
embeddings = self.dropout(embeddings)
return embeddings
class Block(nn.Cell):
"""Block"""
def __init__(self, config):
super(Block, self).__init__()
self.hidden_size = config.hidden_size
self.attention_norm = nn.LayerNorm([config.hidden_size], epsilon=1e-6)
self.ffn_norm = nn.LayerNorm([config.hidden_size], epsilon=1e-6)
self.ffn = Mlp(config)
self.attn = Attention(config)
def construct(self, x):
"""construct"""
h = x
x = self.attention_norm(x)
x = self.attn(x)
x = x + h
h = x
x = self.ffn_norm(x)
x = self.ffn(x)
x = x + h
return x
class Encoder(nn.Cell):
"""Encoder"""
def __init__(self, config):
super(Encoder, self).__init__()
self.layer = nn.CellList([])
self.encoder_norm = nn.LayerNorm([config.hidden_size], epsilon=1e-6)
for _ in range(config.transformer_num_layers):
layer = Block(config)
self.layer.append(copy.deepcopy(layer))
def construct(self, hidden_states):
"""construct"""
for layer_block in self.layer:
hidden_states = layer_block(hidden_states)
encoded = self.encoder_norm(hidden_states)
return encoded
class Transformer(nn.Cell):
"""Transformer"""
def __init__(self, config, img_size):
super(Transformer, self).__init__()
self.embeddings = Embeddings(config, img_size=img_size)
self.encoder = Encoder(config)
def construct(self, input_ids):
"""construct"""
embedding_output = self.embeddings(input_ids)
encoded = self.encoder(embedding_output)
return encoded
class VisionTransformer(nn.Cell):
"""VisionTransformer"""
def __init__(self, config, img_size=(224, 224), num_classes=21843):
super(VisionTransformer, self).__init__()
self.num_classes = num_classes
self.classifier = config.classifier
self.transformer = Transformer(config, img_size)
self.head = nn.Dense(config.hidden_size, num_classes)
def construct(self, x, labels=None):
"""construct"""
x = self.transformer(x)
logits = self.head(x[:, 0])
return logits
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
Configurations.
"""
from easydict import EasyDict as edict
# Returns a minimal configuration for testing.
get_testing = edict({
'patches_grid': None,
'patches_size': 16,
'hidden_size': 1,
'transformer_mlp_dim': 1,
'transformer_num_heads': 1,
'transformer_num_layers': 1,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 0.9,
'classifier': 'token',
'representation_size': None,
})
# Returns the ViT-B/16 configuration.
get_b16_config = edict({
'patches_grid': None,
'patches_size': 16,
'hidden_size': 768,
'transformer_mlp_dim': 3072,
'transformer_num_heads': 12,
'transformer_num_layers': 12,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 1.0, # 0.9
'classifier': 'token',
'representation_size': None,
})
# Returns the Resnet50 + ViT-B/16 configuration.
get_r50_b16_config = edict({
'patches_grid': 14,
'resnet_num_layers': (3, 4, 9),
'resnet_width_factor': 1,
})
# Returns the ViT-B/32 configuration.
get_b32_config = edict({
'patches_grid': None,
'patches_size': 32,
'hidden_size': 768,
'transformer_mlp_dim': 3072,
'transformer_num_heads': 12,
'transformer_num_layers': 12,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 0.9,
'classifier': 'token',
'representation_size': None,
})
# Returns the ViT-L/16 configuration.
get_l16_config = edict({
'patches_grid': None,
'patches_size': 16,
'hidden_size': 1024,
'transformer_mlp_dim': 4096,
'transformer_num_heads': 16,
'transformer_num_layers': 24,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 0.9,
'classifier': 'token',
'representation_size': None,
})
# Returns the ViT-L/32 configuration.
get_l32_config = edict({
'patches_grid': None,
'patches_size': 32,
'hidden_size': 1024,
'transformer_mlp_dim': 4096,
'transformer_num_heads': 16,
'transformer_num_layers': 24,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 0.9,
'classifier': 'token',
'representation_size': None,
})
# Returns the ViT-H/14 configuration.
get_h14_config = edict({
'patches_grid': None,
'patches_size': 14,
'hidden_size': 1280,
'transformer_mlp_dim': 5120,
'transformer_num_heads': 16,
'transformer_num_layers': 32,
'transformer_attention_dropout_rate': 1.0,
'transformer_dropout_rate': 0.9,
'classifier': 'token',
'representation_size': None,
})
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
Train.
"""
import argparse
import os
import math
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor
from mindspore import context
from mindspore.communication.management import init
from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor, TimeMonitor
from mindspore.train.model import Model
from mindspore.context import ParallelMode
from mindspore.train.serialization import load_checkpoint, load_param_into_net
from mindspore.common import set_seed
from mindspore.train.callback import Callback
from src.config import cifar10_cfg
from src.dataset import create_dataset_cifar10
from src.modeling_ms import VisionTransformer
import src.net_config as configs
set_seed(2)
def lr_steps_imagenet(_cfg, steps_per_epoch):
"""lr step for imagenet"""
if _cfg.lr_scheduler == 'cosine_annealing':
_lr = warmup_cosine_annealing_lr(_cfg.lr_init,
steps_per_epoch,
_cfg.warmup_epochs,
_cfg.epoch_size,
_cfg.T_max,
_cfg.eta_min)
else:
raise NotImplementedError(_cfg.lr_scheduler)
return _lr
def linear_warmup_lr(current_step, warmup_steps, base_lr, init_lr):
lr_inc = (float(base_lr) - float(init_lr)) / float(warmup_steps)
lr1 = float(init_lr) + lr_inc * current_step
return lr1
def warmup_cosine_annealing_lr(lr5, steps_per_epoch, warmup_epochs, max_epoch, T_max, eta_min=0):
""" warmup cosine annealing lr"""
base_lr = lr5
warmup_init_lr = 0
total_steps = int(max_epoch * steps_per_epoch)
warmup_steps = int(warmup_epochs * steps_per_epoch)
lr_each_step = []
for i in range(total_steps):
last_epoch = i // steps_per_epoch
if i < warmup_steps:
lr5 = linear_warmup_lr(i + 1, warmup_steps, base_lr, warmup_init_lr)
else:
lr5 = eta_min + (base_lr - eta_min) * (1. + math.cos(math.pi * last_epoch / T_max)) / 2
lr_each_step.append(lr5)
return np.array(lr_each_step).astype(np.float32)
class EvalCallBack(Callback):
"""EvalCallBack"""
    def __init__(self, model0, eval_dataset, eval_per_epoch, epoch_per_eval0):
        super(EvalCallBack, self).__init__()
        self.model = model0
        self.eval_dataset = eval_dataset
        self.eval_per_epoch = eval_per_epoch
        self.epoch_per_eval = epoch_per_eval0
def epoch_end(self, run_context):
"""epoch_end"""
cb_param = run_context.original_args()
cur_epoch = cb_param.cur_epoch_num
if cur_epoch % self.eval_per_epoch == 0:
acc = self.model.eval(self.eval_dataset)
self.epoch_per_eval["epoch"].append(cur_epoch)
self.epoch_per_eval["acc"].append(acc)
print(acc)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Classification')
parser.add_argument('--dataset_name', type=str, default='cifar10', choices=['cifar10'],
help='dataset name.')
parser.add_argument('--sub_type', type=str, default='ViT-B_16',
choices=['ViT-B_16', 'ViT-B_32', 'ViT-L_16', 'ViT-L_32', 'ViT-H_14', 'testing'])
parser.add_argument('--device_id', type=int, default=None, help='device id of GPU or Ascend. (Default: None)')
parser.add_argument('--device_start', type=int, default=0, help='start device id. (Default: 0)')
parser.add_argument('--data_url', default=None, help='Location of data.')
parser.add_argument('--train_url', default=None, help='Location of training outputs.')
parser.add_argument('--ckpt_url', default=None, help='Location of ckpt.')
parser.add_argument('--modelarts', default=False, help='Use ModelArts or not.')
args_opt = parser.parse_args()
if args_opt.modelarts:
import moxing as mox
local_data_path = '/cache/data'
local_ckpt_path = '/cache/data/pre_ckpt'
if args_opt.dataset_name == "cifar10":
cfg = cifar10_cfg
else:
raise ValueError("Unsupported dataset.")
# set context
device_target = cfg.device_target
context.set_context(mode=context.GRAPH_MODE, device_target=cfg.device_target)
device_num = int(os.getenv('RANK_SIZE', '1'))
if device_target == "Ascend":
device_id = int(os.getenv('DEVICE_ID', '0'))
if args_opt.device_id is not None:
context.set_context(device_id=args_opt.device_id)
else:
context.set_context(device_id=cfg.device_id)
if device_num > 1:
if args_opt.modelarts:
context.set_context(device_id=int(os.getenv('DEVICE_ID')))
init(backend_name='hccl')
context.reset_auto_parallel_context()
context.set_auto_parallel_context(device_num=device_num, parallel_mode=ParallelMode.DATA_PARALLEL,
gradients_mean=True)
if args_opt.modelarts:
local_data_path = os.path.join(local_data_path, str(device_id))
else:
raise ValueError("Unsupported platform.")
if args_opt.modelarts:
mox.file.copy_parallel(src_url=args_opt.data_url, dst_url=local_data_path)
if args_opt.dataset_name == "cifar10":
if args_opt.modelarts:
dataset = create_dataset_cifar10(local_data_path, 1, device_num)
else:
dataset = create_dataset_cifar10(cfg.data_path, 1, device_num)
else:
raise ValueError("Unsupported dataset.")
batch_num = dataset.get_dataset_size()
CONFIGS = {'ViT-B_16': configs.get_b16_config,
'ViT-B_32': configs.get_b32_config,
'ViT-L_16': configs.get_l16_config,
'ViT-L_32': configs.get_l32_config,
'ViT-H_14': configs.get_h14_config,
'R50-ViT-B_16': configs.get_r50_b16_config,
'testing': configs.get_testing}
net = VisionTransformer(CONFIGS[args_opt.sub_type], num_classes=cfg.num_classes)
if args_opt.modelarts:
mox.file.copy_parallel(src_url=args_opt.ckpt_url, dst_url=local_ckpt_path)
if cfg.pre_trained:
if args_opt.modelarts:
param_dict = load_checkpoint(os.path.join(local_ckpt_path, "cifar10_pre_checkpoint_based_imagenet21k.ckpt"))
else:
param_dict = load_checkpoint(cfg.checkpoint_path)
load_param_into_net(net, param_dict)
print("Load pre_trained ckpt: {}".format(cfg.checkpoint_path))
loss_scale_manager = None
if args_opt.dataset_name == 'cifar10':
lr = lr_steps_imagenet(cfg, batch_num)
opt = nn.Momentum(params=net.trainable_params(),
learning_rate=Tensor(lr),
momentum=cfg.momentum,
weight_decay=cfg.weight_decay)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
model = Model(net, loss_fn=loss, optimizer=opt, metrics={'acc'},
amp_level="O3", keep_batchnorm_fp32=False, loss_scale_manager=loss_scale_manager)
config_ck = CheckpointConfig(save_checkpoint_steps=batch_num * 2, keep_checkpoint_max=cfg.keep_checkpoint_max)
time_cb = TimeMonitor(data_size=batch_num)
ckpt_save_dir = "./ckpt/"
ckpoint_cb = ModelCheckpoint(prefix="train_vit_" + args_opt.dataset_name, directory=ckpt_save_dir,
config=config_ck)
loss_cb = LossMonitor()
if args_opt.modelarts:
cbs = [time_cb, ModelCheckpoint(prefix="train_vit_" + args_opt.dataset_name, config=config_ck), loss_cb]
else:
epoch_per_eval = {"epoch": [], "acc": []}
        eval_cb = EvalCallBack(model, create_dataset_cifar10(cfg.val_data_path, 1, 1, False), 2, epoch_per_eval)
cbs = [time_cb, ckpoint_cb, loss_cb, eval_cb]
if device_num > 1 and device_id != args_opt.device_start:
cbs = [time_cb, loss_cb]
model.train(cfg.epoch_size, dataset, callbacks=cbs)
print("train success")