Commit e32ce331 authored by i-robot, committed by Gitee
!113 add new model erfnet to model_zoo

Merge pull request !113 from gpf/master
parents 2119aed8 8ec08fe9
Showing
with 2112 additions and 0 deletions
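# Contents
<!-- TOC -->
- [Contents](#contents)
- [ERFNet Description](#erfnet-description)
    - [Overview](#overview)
    - [Papers](#papers)
    - [About Accuracy](#about-accuracy)
- [Environment](#environment)
- [Dataset](#dataset)
- [Script Description](#script-description)
- [Training](#training)
    - [Single-Device Training](#single-device-training)
    - [Multi-Device Training](#multi-device-training)
- [Evaluation](#evaluation)
    - [Evaluating a Single ckpt](#evaluating-a-single-ckpt)
- [Inference](#inference)
    - [Inference with a ckpt File](#inference-with-a-ckpt-file)
    - [Inference on Ascend 310](#inference-on-ascend-310)
<!-- /TOC -->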
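# ERFNet Description
## Overview
ERFNet can be seen as another variation on the ResNet architecture. It introduces factorized residual layers built entirely from 1D (asymmetric) convolutions, which reduces the parameter count and improves speed. ERFNet also improves on ENet: it removes the long-range connections between encoder layers and decoder layers, and every downsampling block consists of a max pooling branch and a convolution branch run in parallel.
This project reproduces ERFNet with MindSpore [[paper]](http://www.robesafe.uah.es/personal/eduardo.romera/pdfs/Romera17iv.pdf).
It is ported from the original author's PyTorch implementation of ERFNet [[HERE](https://github.com/Eromera/erfnet_pytorch)].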
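To make the parameter saving of the factorized layers concrete, the sketch below counts the weights of one standard 3x3 convolution against a 3x1 + 1x3 pair. The channel count of 128 and the helper name `conv_weights` are illustrative assumptions, not values taken from this repository, and biases are ignored.

```python
def conv_weights(in_ch, out_ch, kernel_h, kernel_w):
    """Number of weights in a 2D convolution kernel (bias ignored)."""
    return in_ch * out_ch * kernel_h * kernel_w


channels = 128  # assumed layer width, for illustration only

# one standard 3x3 convolution
full_3x3 = conv_weights(channels, channels, 3, 3)
# the factorized replacement: a 3x1 convolution followed by a 1x3 convolution
factorized = conv_weights(channels, channels, 3, 1) + conv_weights(channels, channels, 1, 3)

print("3x3 conv weights:  ", full_3x3)
print("3x1 + 1x3 weights: ", factorized)
print("saving: {:.0%}".format(1 - factorized / full_3x3))
```

Whatever the channel count, the factorized pair needs 6 weights per input/output channel pair instead of 9, so one third of the weights is saved.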
## Papers
1. [Paper](http://www.robesafe.uah.es/personal/eduardo.romera/pdfs/Romera17tits.pdf): E. Romera, J. M. Alvarez, L. M. Bergasa and R. Arroyo. "ERFNet: Efficient Residual Factorized ConvNet for Real-time Semantic Segmentation"
2. [Paper](https://arxiv.org/abs/1606.02147): A. Paszke, A. Chaurasia, S. Kim, and E. Culurciello. "ENet: A deep neural network architecture for real-time semantic segmentation."
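## About Accuracy
| (Val IOU/Test IOU) | [erfnet_pytorch](https://github.com/Eromera/erfnet_pytorch) | [Paper](http://www.robesafe.uah.es/personal/eduardo.romera/pdfs/Romera17iv.pdf) |
|-|-|-|
| **512 x 1024** | **72.1/69.8** | * |
| **1024 x 2048** | * | **70.0/68.0** |
[erfnet_pytorch](https://github.com/Eromera/erfnet_pytorch) is the author's PyTorch implementation of ERFNet.
The table above lists the results claimed in its README and the results claimed in the paper.
The input image size used for training and testing affects accuracy. All images in the Cityscapes dataset are 2048x1024, and the paper and the PyTorch implementation handle the image size differently.
The paper states that images and labels are downsampled by a factor of 2 (to 1024x512) for training. At test time, inference runs at 1024x512, and the prediction is then interpolated to 2048x1024 before the IoU is computed against the label. In the PyTorch implementation, both training and testing run at the downsampled 1024x512 resolution. In our own test, the PyTorch implementation reaches 70.7% IoU on the val set.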
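The two evaluation protocols described above can be sketched with NumPy on a toy example. This is only an illustration under stated assumptions: nearest-neighbour upsampling stands in for the interpolation mentioned in the paper, IoU is computed on a single toy image rather than accumulated over the dataset as `src/iouEval.py` presumably does, and the helper names `upsample_nearest_2x` and `mean_iou` are hypothetical.

```python
import numpy as np


def upsample_nearest_2x(pred):
    """Repeat every pixel of an (H, W) class map twice along both axes."""
    return pred.repeat(2, axis=0).repeat(2, axis=1)


def mean_iou(pred, label, num_class):
    """Mean IoU over the classes that occur in the prediction or the label."""
    ious = []
    for c in range(num_class):
        pred_c = pred == c
        label_c = label == c
        union = np.logical_or(pred_c, label_c).sum()
        if union == 0:
            continue  # class absent from both maps
        inter = np.logical_and(pred_c, label_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))


# toy data: a 4x8 "full resolution" label and a 2x4 "half resolution" prediction
label_full = np.zeros((4, 8), dtype=np.int64)
label_full[:, 4:] = 1
label_half = label_full[::2, ::2]
pred_half = np.array([[0, 0, 0, 1],
                      [0, 0, 1, 1]])

# paper-style protocol: upsample the prediction, compare with the full-size label
iou_paper_style = mean_iou(upsample_nearest_2x(pred_half), label_full, num_class=2)
# PyTorch-implementation-style protocol: compare at the downsampled resolution
iou_pytorch_style = mean_iou(pred_half, label_half, num_class=2)
print(iou_paper_style, iou_pytorch_style)
```

In this toy case the two numbers coincide because the label loses no detail at half resolution. On real images the downsampled label discards fine structure, so the two protocols can give different scores.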
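# Environment
Ascend
# Dataset
[**The Cityscapes dataset**](https://www.cityscapes-dataset.com/):
In the label files downloaded directly from the official website, pixels are divided into more than 30 classes. For training they need to be grouped into 20 classes, so the labels have to be preprocessed. For convenience, you can download the already processed data directly.
Link: https://pan.baidu.com/s/1jH9GUDX4grcEoDNLsWPKGw (extraction code: aChQ).
After downloading, you get the following directory:
```sh
└── cityscapes
    ├── gtFine .................................. ground truth
    └── leftImg8bit ............................. training, test and validation sets
```
Run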
```sh
python build_mrdata.py \
--dataset_path /path/to/cityscapes/ \
--subset train \
--output_name train.mindrecord
```
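The script looks for the training set under the dataset root /path/to/cityscapes/ and generates the mindrecord file at the path given by output_name.
Then move the mindrecord file into the data folder under the project root so that the relative paths used in the scripts can find it.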
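The move step can be scripted. The sketch below moves the generated file and every sibling file that shares its name prefix into `data/`. The prefix match is there because, to my knowledge, MindSpore's `FileWriter` also writes an index file next to the data file (for example `train.mindrecord.db`); treat that as an assumption and check what the script actually produced. The helper name `move_mindrecord` is hypothetical.

```python
import glob
import os
import shutil


def move_mindrecord(output_name, data_dir="data"):
    """Move output_name and all files sharing its prefix into data_dir."""
    os.makedirs(data_dir, exist_ok=True)
    moved = []
    # glob.escape guards against special characters in the file name
    for path in sorted(glob.glob(glob.escape(output_name) + "*")):
        target = os.path.join(data_dir, os.path.basename(path))
        shutil.move(path, target)
        moved.append(target)
    return moved


# usage, from the project root:
# move_mindrecord("train.mindrecord")
```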
# Script Description
```sh
├── ascend310_infer
│   ├── inc
│   │   └── utils.h                  // utils header file
│   └── src
│       ├── CMakeLists.txt           // CMake build file
│       ├── main.cc                  // inference code
│       ├── build.sh                 // build script
│       └── utils.cc                 // utils implementation
├── eval.py                          // evaluation script
├── export.py                        // script that exports the model file
├── README_CN.md                     // description file
├── requirements.txt                 // Python dependencies
├── scripts
│   ├── run_infer_310.sh             // Ascend 310 inference script
│   ├── run_distribute_train.sh      // multi-device training script
│   └── run_standalone_train.sh      // single-device training script
├── src
│   ├── build_mrdata.py              // generates the mindrecord dataset
│   ├── config.py                    // configuration parameters
│   ├── dataset.py                   // dataset script
│   ├── infer.py                     // inference script
│   ├── iouEval.py                   // metric computation script
│   ├── model.py                     // model script
│   ├── eval310.py                   // Ascend 310 evaluation script
│   ├── show.py                      // result visualization script
│   └── util.py                      // utility functions
└── train.py                         // training script
```
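# Training
Before training, generate the mindrecord data file and place it in the data folder under the project root, then launch the script.
## Single-Device Training
To train on a single device, go to the project root and run
```sh
nohup bash scripts/run_standalone_train.sh /home/name/cityscapes 0 &
```
Here /home/name/cityscapes is the location of the dataset and the 0 after it is the device_id.
A log_single_device folder is created under the project root, and ./log_single_device/log_stage*.txt are the program logs. Run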
```sh
tail -f log_single_device/log_stage*.txt
```
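to view the training status.
## Multi-Device Training
For example, to train on 4 devices, go to the project root and run
```sh
nohup bash scripts/run_distribute_train.sh /home/name/cityscapes 4 0,1,2,3 /home/name/rank_table_4pcs.json &
```
Here /home/name/cityscapes is the location of the dataset, the 4 after it is the rank_size, 0,1,2,3 specifies the device IDs, and /home/name/rank_table_4pcs.json is the location of the parallel-training configuration file. Training on a different number of devices works the same way.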
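The launch script indexes the device list once per rank, so the number of comma-separated device IDs should equal rank_size. A small pre-flight check, sketched here with a hypothetical helper `parse_device_ids` that is not part of the repository, can catch a mismatch before any process is started:

```python
def parse_device_ids(device_arg, rank_size):
    """Parse a comma-separated device list and check it against rank_size."""
    ids = [int(item) for item in device_arg.split(",")]
    if len(ids) != rank_size:
        raise ValueError("expected {} device ids, got {}".format(rank_size, len(ids)))
    if len(set(ids)) != len(ids):
        raise ValueError("device ids must be unique: {}".format(ids))
    return ids


print(parse_device_ids("0,1,2,3", 4))
```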
A log folder is created under the project root, and ./log/log0/log.txt is the program log. Run
```sh
tail -f log/log0/log.txt
```
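to view the training status.
# Evaluation
After training, the script calls the evaluation code. For each ckpt file it writes a file named after the ckpt with the suffix .metric.txt, which contains the test accuracy.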
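When evaluation runs in distributed mode, `eval.py` splits the checkpoints that do not yet have a metric file across the ranks, using a ceiling division and a slice. The sketch below mirrors that slicing in isolation; the function name `shard_checkpoints` and the example file names are hypothetical.

```python
import math


def shard_checkpoints(ckpt_paths, rank_id, rank_size):
    """Return the contiguous slice of checkpoints assigned to one rank."""
    per_rank = math.ceil(len(ckpt_paths) / rank_size)
    return ckpt_paths[rank_id * per_rank: rank_id * per_rank + per_rank]


ckpts = ["ERFNet-{}.ckpt".format(i) for i in range(5)]  # made-up names
for rank in range(4):
    print(rank, shard_checkpoints(ckpts, rank, 4))
```

With 5 checkpoints and 4 ranks, each rank gets at most 2 files, and the last rank gets none. The split is simple but not perfectly balanced.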
## Evaluating a Single ckpt
Run
```sh
python eval.py \
--data_path /path/cityscapes \
--run_distribute false \
--encode false \
--model_root_path /path/ERFNet/ERFNet.ckpt \
--device_id 1
```
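data_path is the dataset root directory and model_root_path is the path of the ckpt file.
After evaluation, a file with the suffix .metric.txt is generated in the same directory as the ckpt file. It contains the test scores.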
```txt
mean_iou 0.7090318296884867
mean_loss 0.296806449357143
iou_class tensor([0.9742, 0.8046, 0.9048, 0.4574, 0.5067, 0.6105, 0.6239, 0.7221, 0.9134,
0.5903, 0.9352, 0.7633, 0.5624, 0.9231, 0.6211, 0.7897, 0.6471, 0.4148,
0.7069], dtype=torch.float64)
```
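To compare several checkpoints, the scalar entries of such a metric file can be read back with a few lines of Python. This is a minimal sketch, assuming the file keeps the `key value` layout shown above; it deliberately skips the multi-line `iou_class` tensor dump, and the helper name `parse_metric_text` is hypothetical.

```python
def parse_metric_text(text):
    """Collect 'key value' lines whose value parses as a float; skip the rest."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue
        key, value = parts
        try:
            metrics[key] = float(value)
        except ValueError:
            continue  # e.g. the iou_class tensor dump or the model path line
    return metrics


sample = """mean_iou 0.7090318296884867
mean_loss 0.296806449357143
iou_class tensor([0.9742, 0.8046, 0.9048],
 dtype=torch.float64)"""
print(parse_metric_text(sample))
```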
# Inference
## Inference with a ckpt File
Run
```sh
python src/infer.py \
--data_path /path/to/imgs \
--model_path /path/to/ERFNet.ckpt \
--output_path /output/path \
--device_id 3
```
The script reads the images under /path/to/imgs, runs inference with the /path/to/ERFNet.ckpt model, and writes the visualized results to /output/path.
## Inference on Ascend 310
The trained ckpt file must first be converted into a mindir model file that can run inference directly on the 310:
```sh
python export.py --model_path /path/to/net.ckpt
```
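This produces the ERFNet.mindir file in the current directory. Then, from the project root (the script enters ascend310_infer/src itself to build the executable), run
```sh
bash scripts/run_infer_310.sh /path/to/net.mindir /path/to/images /path/to/result /path/to/label 0
```
Here /path/to/images is the folder of validation images. Because the original dataset sorts the images under cityscapes/leftImg8bit/val/ by the city where they were taken, they must first be gathered into a single folder before they can be used for inference.
For example: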
```sh
cp /path/to/cityscapes/leftImg8bit/val/frankfurt/* /path/to/images/
cp /path/to/cityscapes/leftImg8bit/val/lindau/* /path/to/images/
cp /path/to/cityscapes/leftImg8bit/val/munster/* /path/to/images/
```
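The validation ground truth must likewise be gathered under /path/to/label/. Of the remaining arguments, /path/to/net.mindir is the path of the mindir file, /path/to/result is the output path for the inference results (the folder must be created beforehand), and 0 is the device_id.
The inference results are written to the /path/to/result/ folder, and metric.txt, which contains the accuracy, is generated in the current directory.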
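The `cp` commands above can be generalized to every city folder with a short Python helper. This is a sketch under assumptions: `flatten_city_folders` is a hypothetical name, file names are assumed to be unique across cities (Cityscapes prefixes them with the city name), and for the ground truth you still need to decide which of the gtFine files `src/eval310.py` expects, which is not shown here.

```python
import os
import shutil


def flatten_city_folders(split_dir, out_dir):
    """Copy every file from split_dir/<city>/ into the single folder out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    copied = 0
    for city in sorted(os.listdir(split_dir)):
        city_dir = os.path.join(split_dir, city)
        if not os.path.isdir(city_dir):
            continue
        for name in sorted(os.listdir(city_dir)):
            src = os.path.join(city_dir, name)
            if os.path.isfile(src):
                shutil.copy(src, os.path.join(out_dir, name))
                copied += 1
    return copied


# usage:
# flatten_city_folders("/path/to/cityscapes/leftImg8bit/val", "/path/to/images")
```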
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <sys/stat.h>
#include <sys/time.h>
#include <dirent.h>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <memory>
#include <string>
#include <string_view>
#include <sstream>
#include <vector>
#include "include/api/context.h"
#include "include/api/model.h"
#include "include/api/serialization.h"
#include "include/dataset/execute.h"
#include "include/dataset/vision.h"
namespace ms = mindspore;
namespace ds = mindspore::dataset;
int WriteResult(const std::string& imageFile, const std::vector<ms::MSTensor> &outputs, const std::string &res_path);
void print_shape(const ms::MSTensor &buffer);
std::vector<std::string> GetAllFiles(std::string_view dir_name);
DIR *OpenDir(std::string_view dir_name);
std::string RealPath(std::string_view path);
ms::MSTensor ReadFile(const std::string &file);
cmake_minimum_required(VERSION 3.14.1)
project(ERFNet)
add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O0 -g -std=c++17 -Werror -Wall -fPIE -Wl,--allow-shlib-undefined")
set(PROJECT_SRC_ROOT ${CMAKE_CURRENT_LIST_DIR}/)
option(MINDSPORE_PATH "mindspore install path" "")
include_directories(${MINDSPORE_PATH})
include_directories(${MINDSPORE_PATH}/include)
include_directories(${PROJECT_SRC_ROOT}/../)
find_library(MS_LIB libmindspore.so ${MINDSPORE_PATH}/lib)
file(GLOB_RECURSE MD_LIB ${MINDSPORE_PATH}/_c_dataengine*)
add_executable(erfnet main.cc utils.cc)
target_link_libraries(erfnet ${MS_LIB} ${MD_LIB})
#! /bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
mkdir -p build
cd build
# CMakeLists.txt lives in the parent directory, so configure with ".."
cmake .. -DMINDSPORE_PATH="`pip3.7 show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath`"
make
cd ..
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <sys/stat.h>
#include <sys/time.h>
#include <dirent.h>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <map>
#include <memory>
#include <string>
#include <sstream>
#include <vector>
#include "include/api/context.h"
#include "include/api/model.h"
#include "include/api/serialization.h"
#include "include/dataset/execute.h"
#include "include/dataset/vision.h"
#include "inc/utils.h"
namespace ms = mindspore;
namespace ds = mindspore::dataset;
int main(int argc, char **argv) {
  if (argc != 5) {
    std::cout << "example: ./erfnet /path/to/model /path/to/image /path/to/result device_id" << std::endl;
    return -1;
  }
  std::cout << "model_path:" << argv[1] << std::endl;
  std::cout << "image_path:" << argv[2] << std::endl;
  std::cout << "result_path:" << argv[3] << std::endl;
  std::cout << "device_id:" << argv[4] << std::endl;
  // parse the whole argument so that multi-digit device ids also work
  int device_id = std::stoi(argv[4]);
  std::string res_path = argv[3];
  // set context
  auto context = std::make_shared<ms::Context>();
  auto ascend310_info = std::make_shared<ms::Ascend310DeviceInfo>();
  ascend310_info->SetDeviceID(device_id);
  context->MutableDeviceInfo().push_back(ascend310_info);
  // define model
  ms::Graph graph;
  ms::Status ret = ms::Serialization::Load(argv[1], ms::ModelType::kMindIR, &graph);
  if (ret != ms::kSuccess) {
    std::cout << "Load model failed." << std::endl;
    return 1;
  }
  ms::Model erfnet;
  // build model
  ret = erfnet.Build(ms::GraphCell(graph), context);
  if (ret != ms::kSuccess) {
    std::cout << "Build model failed." << std::endl;
    return 1;
  }
  // get model info
  std::vector<ms::MSTensor> model_inputs = erfnet.GetInputs();
  if (model_inputs.empty()) {
    std::cout << "Invalid model, inputs is empty." << std::endl;
    return 1;
  }
  // define transforms
  std::shared_ptr<ds::TensorTransform> decode(new ds::vision::Decode());
  std::shared_ptr<ds::TensorTransform> resize(new ds::vision::Resize({512, 1024}));
  std::shared_ptr<ds::TensorTransform> normalize(new ds::vision::Normalize({0, 0, 0},
                                                                           {255, 255, 255}));
  std::shared_ptr<ds::TensorTransform> hwc2chw(new ds::vision::HWC2CHW());
  // define preprocessor
  ds::Execute preprocessor({decode, resize, normalize, hwc2chw});
  // maps the start time of each inference to its end time, both in ms
  std::map<double, double> costTime_map;
  std::vector<std::string> images = GetAllFiles(argv[2]);
  for (const auto &image_file : images) {
    struct timeval start = {0};
    struct timeval end = {0};
    double startTime_ms;
    double endTime_ms;
    // prepare input
    std::vector<ms::MSTensor> outputs;
    std::vector<ms::MSTensor> inputs;
    // read image file and preprocess
    auto image = ReadFile(image_file);
    ret = preprocessor(image, &image);
    if (ret != ms::kSuccess) {
      std::cout << "Image preprocess failed." << std::endl;
      return 1;
    }
    inputs.emplace_back(model_inputs[0].Name(), model_inputs[0].DataType(), model_inputs[0].Shape(),
                        image.Data().get(), image.DataSize());
    // infer
    gettimeofday(&start, NULL);
    ret = erfnet.Predict(inputs, &outputs);
    gettimeofday(&end, NULL);
    if (ret != ms::kSuccess) {
      std::cout << "Predict model failed." << std::endl;
      return 1;
    }
    // print infer result
    std::cout << "Image: " << image_file << std::endl;
    WriteResult(image_file, outputs, res_path);
    startTime_ms = (1.0 * start.tv_sec * 1000000 + start.tv_usec) / 1000;
    endTime_ms = (1.0 * end.tv_sec * 1000000 + end.tv_usec) / 1000;
    costTime_map.insert(std::pair<double, double>(startTime_ms, endTime_ms));
  }
  double average = 0.0;
  int infer_cnt = 0;
  for (auto iter = costTime_map.begin(); iter != costTime_map.end(); iter++) {
    double diff = iter->second - iter->first;
    average += diff;
    infer_cnt++;
  }
  // avoid dividing by zero when the image folder is empty
  if (infer_cnt > 0) {
    average = average / infer_cnt;
  }
  std::stringstream timeCost;
  timeCost << "NN inference cost average time: " << average << " ms of infer_count " << infer_cnt << std::endl;
  std::cout << "NN inference cost average time: " << average << " ms of infer_count " << infer_cnt << std::endl;
  // the ./time_Result directory must already exist, otherwise nothing is written
  std::string file_name = "./time_Result" + std::string("/test_perform_static.txt");
  std::ofstream file_stream(file_name.c_str(), std::ios::trunc);
  file_stream << timeCost.str();
  file_stream.close();
  costTime_map.clear();
  return 0;
}
/**
* Copyright 2021 Huawei Technologies Co., Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "inc/utils.h"
#include <climits>
#include <cstdio>
#include <cstdlib>

void print_shape(const ms::MSTensor &buffer) {
  for (const auto &i : buffer.Shape()) {
    std::cout << i << " ";
  }
  std::cout << std::endl;
}

int WriteResult(const std::string& imageFile, const std::vector<ms::MSTensor> &outputs, const std::string &res_path) {
  for (size_t i = 0; i < outputs.size(); ++i) {
    std::shared_ptr<const void> netOutput = outputs[i].Data();
    size_t outputSize = outputs[i].DataSize();
    // strip the directory part; npos + 1 wraps to 0, which keeps the whole name
    size_t pos = imageFile.rfind('/');
    std::string fileName(imageFile, pos + 1);
    // replace everything from the first '.' with "_<i>.bin"
    size_t dot = fileName.find('.');
    if (dot != std::string::npos) {
      fileName.erase(dot);
    }
    fileName += '_' + std::to_string(i) + ".bin";
    std::string outFileName = res_path + "/" + fileName;
    FILE *outputFile = fopen(outFileName.c_str(), "wb");
    if (outputFile == nullptr) {
      std::cout << "Can not open " << outFileName << std::endl;
      return -1;
    }
    fwrite(netOutput.get(), outputSize, sizeof(char), outputFile);
    fclose(outputFile);
  }
  return 0;
}
std::vector<std::string> GetAllFiles(std::string_view dir_name) {
  struct dirent *filename;
  DIR *dir = OpenDir(dir_name);
  if (dir == nullptr) {
    return {};
  }
  // read all regular files in the directory
  std::vector<std::string> res;
  while ((filename = readdir(dir)) != nullptr) {
    std::string d_name = std::string(filename->d_name);
    // skip "." and ".." and anything that is not a regular file
    if (d_name == "." || d_name == ".." || filename->d_type != DT_REG) {
      continue;
    }
    res.emplace_back(std::string(dir_name) + "/" + filename->d_name);
  }
  // release the directory handle
  closedir(dir);
  std::sort(res.begin(), res.end());
  return res;
}

DIR *OpenDir(std::string_view dir_name) {
  // check the parameter
  if (dir_name.empty()) {
    std::cout << "dir_name is empty!" << std::endl;
    return nullptr;
  }
  std::string real_path = RealPath(dir_name);
  // check that dir_name is a valid directory; lstat fails if the path is missing
  struct stat s;
  if (lstat(real_path.c_str(), &s) != 0 || !S_ISDIR(s.st_mode)) {
    std::cout << "dir_name is not a valid directory!" << std::endl;
    return nullptr;
  }
  DIR *dir = opendir(real_path.c_str());
  if (dir == nullptr) {
    std::cout << "Can not open dir " << dir_name << std::endl;
    return nullptr;
  }
  return dir;
}
std::string RealPath(std::string_view path) {
  char real_path_mem[PATH_MAX] = {0};
  // std::string_view is not guaranteed to be null-terminated, so copy it first
  std::string path_str(path);
  char *real_path_ret = realpath(path_str.c_str(), real_path_mem);
  if (real_path_ret == nullptr) {
    std::cout << "File: " << path << " does not exist." << std::endl;
    return "";
  }
  return std::string(real_path_mem);
}

ms::MSTensor ReadFile(const std::string &file) {
  if (file.empty()) {
    std::cout << "File name is empty" << std::endl;
    return ms::MSTensor();
  }
  // read the raw bytes of the image file
  std::ifstream ifs(file, std::ios::binary);
  if (!ifs.good()) {
    std::cout << "File: " << file << " does not exist" << std::endl;
    return ms::MSTensor();
  }
  if (!ifs.is_open()) {
    std::cout << "File: " << file << " open failed" << std::endl;
    return ms::MSTensor();
  }
  ifs.seekg(0, std::ios::end);
  size_t size = ifs.tellg();
  ms::MSTensor buffer(file, ms::DataType::kNumberTypeUInt8, {static_cast<int64_t>(size)}, nullptr, size);
  ifs.seekg(0, std::ios::beg);
  ifs.read(reinterpret_cast<char *>(buffer.MutableData()), size);
  ifs.close();
  return buffer;
}
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import os
import math
from argparse import ArgumentParser
import numpy as np
import torch
from mindspore import Tensor
import mindspore.common.dtype as mstype
import mindspore.nn as nn
import mindspore.numpy as mnp
import mindspore.ops as ops
from mindspore.ops import operations as P
from mindspore.train.serialization import load_param_into_net, load_checkpoint
from mindspore import context
from src.iouEval import iouEval_1
from src.util import getCityLossWeight, getBool, seed_seed
from src.model import ERFNet, Encoder_pred
from src.dataset import getCityScapesDataLoader_GeneratorDataset
# Equivalent of PyTorch NLLLoss + log_softmax
class SoftmaxCrossEntropyLoss(nn.Cell):
    def __init__(self, num_cls, weight):
        super(SoftmaxCrossEntropyLoss, self).__init__()
        self.one_hot = P.OneHot(axis=-1)
        self.on_value = Tensor(1.0, mstype.float32)
        self.off_value = Tensor(0.0, mstype.float32)
        self.cast = P.Cast()
        self.ce = nn.SoftmaxCrossEntropyWithLogits()
        self.not_equal = P.NotEqual()
        self.num_cls = num_cls
        self.mul = P.Mul()
        self.sum = P.ReduceSum(False)
        self.div = P.RealDiv()
        self.transpose = P.Transpose()
        self.reshape = P.Reshape()
        self.unsqueeze = ops.ExpandDims()
        self.get_size = ops.Size()
        self.exp = ops.Exp()
        self.pow = ops.Pow()
        self.weight = weight
        # a tuple weight is read as (gamma, alpha) and switches on the focal loss
        if isinstance(self.weight, tuple):
            self.use_focal = True
            self.gamma = self.weight[0]
            self.alpha = self.weight[1]
        else:
            self.use_focal = False

    def construct(self, pred, labels):
        # flatten labels to (N*H*W,) and predictions to (N*H*W, num_cls)
        labels = self.cast(labels, mstype.int32)
        labels = self.reshape(labels, (-1,))
        pred = self.transpose(pred, (0, 2, 3, 1))
        pred = self.reshape(pred, (-1, self.num_cls))
        one_hot_labels = self.one_hot(labels, self.num_cls, self.on_value, self.off_value)
        pred = self.cast(pred, mstype.float32)
        num = self.get_size(labels)
        if self.use_focal:
            loss = self.ce(pred, one_hot_labels)
            factor = self.pow(1 - self.exp(-loss), self.gamma) * self.alpha
            loss = self.div(self.sum(factor * loss), num)
            return loss
        if self.weight is not None:
            # pick the class weight of every pixel and take the weighted mean
            weight = mnp.copy(self.weight)
            weight = self.cast(weight, mstype.float32)
            weight = self.unsqueeze(weight, 0)
            expand = ops.BroadcastTo(pred.shape)
            weight = expand(weight)
            weight_masked = weight[mnp.arange(num), labels]
            loss = self.ce(pred, one_hot_labels)
            loss = self.div(self.sum(loss * weight_masked), self.sum(weight_masked))
        else:
            loss = self.ce(pred, one_hot_labels)
            loss = self.div(self.sum(loss), num)
        return loss
def IOU_1(network_trained, dataloader, num_class, enc):
    ioueval = iouEval_1(num_class)
    loss = SoftmaxCrossEntropyLoss(num_class, getCityLossWeight(enc))
    loss_list = []
    network_new = network_trained
    network_new.set_train(False)
    for index, (images, labels) in enumerate(dataloader):
        preds = network_new(images)
        l = loss(preds, labels)
        # convert the scalar loss tensor to a Python float
        loss_list.append(float(l.asnumpy()))
        print("step {}/{}: loss: ".format(index+1, dataloader.get_dataset_size()), l)
        # iouEval_1 works on torch tensors of shape (N, 1, H, W)
        preds = torch.Tensor(preds.asnumpy().argmax(axis=1).astype(np.int32)).unsqueeze(1).long()
        labels = torch.Tensor(labels.asnumpy().astype(np.int32)).unsqueeze(1).long()
        ioueval.addBatch(preds, labels)
    mean_iou, iou_class = ioueval.getIoU()
    mean_iou = mean_iou.item()
    mean_loss = sum(loss_list) / len(loss_list)
    return mean_iou, mean_loss, iou_class
def evalNetwork(network, eval_dataloader, ckptPath, encode_1, num_class=20, weight_init="XavierUniform"):
    # load model checkpoint; without a valid one there is nothing to evaluate
    if ckptPath is None:
        print("no model checkpoint!")
        return
    if not os.path.exists(ckptPath):
        print("not exist {}".format(ckptPath))
        return
    print("load model checkpoint {}!".format(ckptPath))
    param_dict = load_checkpoint(ckptPath)
    load_param_into_net(network, param_dict)
    mean_iou, mean_loss, iou_class = IOU_1(network, eval_dataloader, num_class, encode_1)
    # write the metrics next to the checkpoint
    with open(ckptPath + ".metric.txt", "w") as file:
        print("model path", ckptPath, file=file)
        print("mean_iou", mean_iou, file=file)
        print("mean_loss", mean_loss, file=file)
        print("iou_class", iou_class, file=file)
def listCKPTPath(model_root_path_1, enc):
    # collect checkpoints that do not have a metric file yet
    paths_1 = []
    names = os.listdir(model_root_path_1)
    for name in names:
        if name.endswith(".ckpt") and name+".metric.txt" not in names:
            if enc and name.startswith("Encoder"):
                ckpt_path = os.path.join(model_root_path_1, name)
                paths_1.append(ckpt_path)
            elif not enc and name.startswith("ERFNet"):
                ckpt_path = os.path.join(model_root_path_1, name)
                paths_1.append(ckpt_path)
    return paths_1
if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument('--data_path', type=str)
    parser.add_argument('--run_distribute', type=str)
    parser.add_argument('--encode', type=str)
    parser.add_argument('--model_root_path', type=str)
    parser.add_argument('--device_id', type=int)
    config = parser.parse_args()
    model_root_path = config.model_root_path
    encode_ = getBool(config.encode)
    device_id = config.device_id
    CityScapesRoot = config.data_path
    run_distribute = getBool(config.run_distribute)
    seed_seed()
    context.set_context(mode=context.GRAPH_MODE)
    context.set_context(device_target="Ascend")
    context.set_context(device_id=device_id)
    context.set_context(save_graphs=False)
    eval_dataloader_1 = getCityScapesDataLoader_GeneratorDataset(CityScapesRoot, "val", 6,
                                                                 encode_, 512, False, False)
    weight_init_1 = "XavierUniform"
    if encode_:
        network_1 = Encoder_pred(stage=1, num_class=20, weight_init=weight_init_1,
                                 run_distribute=False, train=False)
    else:
        network_1 = ERFNet(stage=1, num_class=20, init_conv=weight_init_1, run_distribute=False,
                           train=False)
    if not run_distribute:
        # model_root_path may be a folder of checkpoints or a single ckpt file
        if os.path.isdir(model_root_path):
            paths = listCKPTPath(model_root_path, encode_)
            for path in paths:
                evalNetwork(network_1, eval_dataloader_1, path, encode_)
        else:
            evalNetwork(network_1, eval_dataloader_1, model_root_path, encode_)
    else:
        # split the checkpoints evenly across ranks
        rank_id = int(os.environ["RANK_ID"])
        rank_size = int(os.environ["RANK_SIZE"])
        ckpt_files_path = listCKPTPath(model_root_path, encode_)
        n = math.ceil(len(ckpt_files_path) / rank_size)
        ckpt_files_path = ckpt_files_path[rank_id*n : rank_id*n + n]
        for path in ckpt_files_path:
            evalNetwork(network_1, eval_dataloader_1, path, encode_)
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
from argparse import ArgumentParser
import numpy as np
from mindspore import Tensor, context, load_checkpoint, export
from src.model import ERFNet
if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument('--model_path', type=str)
    config = parser.parse_args()
    net = ERFNet(1, 20, "XavierUniform", run_distribute=False, train=False)
    context.set_context(mode=context.GRAPH_MODE)
    context.set_context(device_target="Ascend")
    context.set_context(device_id=0)
    load_checkpoint(config.model_path, net=net)
    net.set_train(False)
    # dummy input with the inference shape (N, C, H, W) = (1, 3, 512, 1024)
    input_data = Tensor(np.zeros([1, 3, 512, 1024]).astype(np.float32))
    export(net, input_data, file_name="ERFNet.mindir", file_format="MINDIR")
torch==1.8.1+cpu
torchvision==0.9.1+cpu
numpy==1.17.5
mindspore-ascend @ file:///disk0/gengdongjie/packages/mindspore_ascend-1.2.0-cp37-cp37m-linux_x86_64.whl
Pillow==8.1.0
tqdm==4.61.2
#! /bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 4 ]
then
    echo "Usage: bash scripts/run_distribute_train.sh /path/to/cityscapes RANK_SIZE DEVICE_IDS RANK_TABLE_FILE"
    echo "Example: bash scripts/run_distribute_train.sh /home/name/cityscapes 4 0,1,2,3 /home/name/rank_table_4pcs.json"
    exit 1
fi
if [ ! -d $1 ]
then
echo "error: DATASET_PATH=$1 is not a directory"
exit 1
fi
if [ ! -f $4 ]
then
echo "error: RANK_TABLE_FILE=$4 is not a file"
exit 1
fi
echo "CityScapes dataset path: $1"
echo "RANK_SIZE: $2"
echo "DEVICE_ID: $3"
echo "RANK_TABLE_FILE: $4"
# ps -aux | grep "python -u ../../train.py" | awk '{print $2}' | xargs kill -9
export HCCL_CONNECT_TIMEOUT=600
export RANK_SIZE=$2
cityscapes_path=$1
IFS="," read -r -a devices <<< "$3";
export RANK_TABLE_FILE=$4
mkdir -p ./log
cd ./log
# 1.train
for((i=0;i<RANK_SIZE;i++))
do
{
mkdir ./log$i
cd ./log$i
export RANK_ID=$i
export DEVICE_ID=${devices[i]}
echo "start training for rank $i, device $DEVICE_ID"
python -u ../../train.py \
--lr 1e-3 \
--repeat 2 \
--run_distribute true \
--save_path './' \
--mindrecord_train_data "../../data/train.mindrecord" \
--stage 1 \
--ckpt_path "" \
> log.txt 2>&1
cd ../
} &
done
wait
# 2.train
for((i=0;i<RANK_SIZE;i++))
do
{
mkdir ./log$i
cd ./log$i
export RANK_ID=$i
export DEVICE_ID=${devices[i]}
echo "start training for rank $i, device $DEVICE_ID"
python -u ../../train.py \
--lr 1e-3 \
--repeat 2 \
--run_distribute true \
--save_path './' \
--mindrecord_train_data "../../data/train.mindrecord" \
--stage 2 \
--ckpt_path "../log0/Encoder-65_496.ckpt" \
> log.txt 2>&1
cd ../
} &
done
wait
# 3.train
for((i=0;i<RANK_SIZE;i++))
do
{
mkdir ./log$i
cd ./log$i
export RANK_ID=$i
export DEVICE_ID=${devices[i]}
echo "start training for rank $i, device $DEVICE_ID"
python -u ../../train.py \
--lr 1e-3 \
--repeat 2 \
--run_distribute true \
--save_path './' \
--mindrecord_train_data "../../data/train.mindrecord" \
--stage 3 \
--ckpt_path "../log0/Encoder_1-85_496.ckpt" \
> log.txt 2>&1
cd ../
} &
done
wait
# 4.train
for((i=0;i<RANK_SIZE;i++))
do
{
mkdir ./log$i
cd ./log$i
export RANK_ID=$i
export DEVICE_ID=${devices[i]}
echo "start training for rank $i, device $DEVICE_ID"
python -u ../../train.py \
--lr 1e-3 \
--repeat 2 \
--run_distribute true \
--save_path './' \
--mindrecord_train_data "../../data/train.mindrecord" \
--stage 4 \
--ckpt_path "../log0/ERFNet-65_496.ckpt" \
> log.txt 2>&1
cd ../
} &
done
wait
# eval
cd ./log0
for((i=0;i<RANK_SIZE;i++))
do
{
export RANK_ID=$i
export DEVICE_ID=${devices[i]}
echo "start eval for rank $i, device $DEVICE_ID"
python -u ../../eval.py \
--data_path ${cityscapes_path} \
--run_distribute true \
--encode false \
--model_root_path './' \
--device_id ${devices[i]} \
> log${i}_eval.txt 2>&1 &
}
done
#! /bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 5 ]
then
echo "Usage: bash scripts/run_infer_310.sh MINDIR IMGS RES LABEL DEVICE_ID"
echo "Example: bash scripts/run_infer_310.sh /path/to/net.mindir /path/to/images /path/to/result /path/to/label 0"
exit 1
fi
if [ ! -f $1 ]
then
echo "error: mindir_path=$1 is not a file"
exit 1
fi
if [ ! -d $2 ]
then
echo "error: images_path=$2 is not a directory"
exit 1
fi
if [ ! -d $3 ]
then
echo "error: result_path=$3 is not a directory"
exit 1
fi
if [ ! -d $4 ]
then
echo "error: label_path=$4 is not a directory"
exit 1
fi
echo "model mindir: $1"
echo "images path: $2"
echo "result path: $3"
echo "label path: $4"
echo "device id: $5"
cd ascend310_infer/src
bash build.sh
./build/erfnet $1 $2 $3 $5
cd ../..
python src/eval310.py --res_path $3 --label_path $4
#! /bin/bash
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
if [ $# != 2 ]
then
echo "Usage: bash scripts/run_standalone_train.sh /path/to/cityscapes DEVICE_ID"
echo "Example: bash scripts/run_standalone_train.sh /home/name/cityscapes 0"
exit 1
fi
if [ ! -d $1 ]
then
echo "error: DATASET_PATH=$1 is not a directory"
exit 1
fi
echo "CityScapes dataset path: $1"
echo "DEVICE_ID: $2"
ps -aux | grep "python -u ../train.py" | awk '{print $2}' | xargs kill -9
mkdir -p ./log_single_device
cd ./log_single_device
cityscapes_path=$1
export RANK_SIZE=1
export DEVICE_ID=$2
python -u ../train.py \
--lr 5e-4 \
--repeat 1 \
--run_distribute false \
--save_path './' \
--mindrecord_train_data "../data/train.mindrecord" \
--stage 1 \
--ckpt_path "" \
> log_stage1.txt 2>&1
python -u ../train.py \
--lr 5e-4 \
--repeat 1 \
--run_distribute false \
--save_path './' \
--mindrecord_train_data "../data/train.mindrecord" \
--stage 2 \
--ckpt_path "./Encoder-65_496.ckpt" \
> log_stage2.txt 2>&1
python -u ../train.py \
--lr 5e-4 \
--repeat 1 \
--run_distribute false \
--save_path './' \
--mindrecord_train_data "../data/train.mindrecord" \
--stage 3 \
--ckpt_path "./Encoder_1-85_496.ckpt" \
> log_stage3.txt 2>&1
python -u ../train.py \
--lr 5e-4 \
--repeat 1 \
--run_distribute false \
--save_path './' \
--mindrecord_train_data "../data/train.mindrecord" \
--stage 4 \
--ckpt_path "./ERFNet-65_496.ckpt" \
> log_stage4.txt 2>&1
python -u ../eval.py \
--data_path ${cityscapes_path} \
--run_distribute false \
--encode false \
--model_root_path './' \
--device_id ${DEVICE_ID} \
> log_eval.txt 2>&1 &
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
from argparse import ArgumentParser
from tqdm import tqdm
from mindspore.mindrecord import FileWriter
from dataset import cityscapes_datapath
# example:
# python build_mrdata.py \
# --dataset_path /path/to/cityscapes/ \
# --subset train \
# --output_name train.mindrecord
if __name__ == '__main__':
    parser = ArgumentParser()
    parser.add_argument('--dataset_path', type=str)
    parser.add_argument('--subset', type=str)
    parser.add_argument('--output_name', type=str)
    config = parser.parse_args()
    output_name = config.output_name
    subset = config.subset
    dataset_path = config.dataset_path
    if subset not in ("train", "val"):
        raise RuntimeError('subset should be "train" or "val"')
    dataPathLoader = cityscapes_datapath(dataset_path, subset)
    writer = FileWriter(file_name=output_name)
    seg_schema = {"file_name": {"type": "string"}, "label": {"type": "bytes"}, "data": {"type": "bytes"}}
    writer.add_schema(seg_schema, "seg_schema")
    data_list = []
    cnt = 0
    for img_path, label_path in tqdm(dataPathLoader):
        # Store the raw encoded image and label bytes; decoding happens at load time.
        sample_ = {"file_name": img_path.split('/')[-1]}
        with open(img_path, 'rb') as f:
            sample_['data'] = f.read()
        with open(label_path, 'rb') as f:
            sample_['label'] = f.read()
        data_list.append(sample_)
        cnt += 1
        # Flush to the MindRecord file every 100 samples to bound memory use.
        if cnt % 100 == 0:
            writer.write_raw_data(data_list)
            print('number of samples written:', cnt)
            data_list = []
    # Write the remaining samples that did not fill a whole batch of 100.
    if data_list:
        writer.write_raw_data(data_list)
    writer.commit()
    print('number of samples written:', cnt)
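The conversion loop above buffers samples and flushes them to the writer every 100 items, with one last flush for the remainder. A minimal sketch of that buffering pattern follows. `ListWriter` and `write_in_chunks` are hypothetical names used only for illustration; `ListWriter` stands in for MindSpore's `FileWriter` so the sketch runs without MindSpore installed.

```python
class ListWriter:
    """Hypothetical stand-in for FileWriter: records each flushed batch."""

    def __init__(self):
        self.batches = []

    def write_raw_data(self, rows):
        # Copy the rows so later changes to the buffer cannot affect them.
        self.batches.append(list(rows))


def write_in_chunks(samples, writer, chunk_size=100):
    """Buffer samples and flush every chunk_size items, plus the remainder."""
    buffer = []
    count = 0
    for sample in samples:
        buffer.append(sample)
        count += 1
        if count % chunk_size == 0:
            writer.write_raw_data(buffer)
            buffer = []
    if buffer:  # final partial chunk
        writer.write_raw_data(buffer)
    return count


writer = ListWriter()
total = write_in_chunks(range(250), writer)
print(total, [len(batch) for batch in writer.batches])
```

Without the final `if buffer:` flush, the last 50 of the 250 samples in this example would be silently dropped.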
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import os
from argparse import ArgumentParser
from mindspore import context
from mindspore.common.initializer import XavierUniform
from src.util import getBool, seed_seed, getLR
parser = ArgumentParser()
parser.add_argument('--lr', type=float)
parser.add_argument('--run_distribute', type=str)
parser.add_argument('--save_path', type=str)
parser.add_argument('--repeat', type=int)
parser.add_argument('--mindrecord_train_data', type=str)
parser.add_argument('--stage', type=int)
parser.add_argument('--ckpt_path', type=str)
config = parser.parse_args()
max_lr = config.lr
run_distribute = getBool(config.run_distribute)
global_size = int(os.environ["RANK_SIZE"])
repeat = config.repeat
stage = config.stage
ckpt_path = config.ckpt_path
save_path = config.save_path
context.set_context(mode=context.GRAPH_MODE)
context.set_context(device_target="Ascend")
context.set_context(device_id=int(os.environ["DEVICE_ID"]))
context.set_context(save_graphs=False)
seed_seed() # init random seed
weight_init = XavierUniform() # weight init
ms_train_data = config.mindrecord_train_data
num_class = 20
# train config
class TrainConfig_1:
def __init__(self):
self.subset = "train"
self.num_class = 20
self.train_img_size = 512
self.epoch_num_save = 1
self.epoch = 65
self.encode = True
self.attach_decoder = False
self.lr = getLR(max_lr, 0, 150, 496, \
run_distribute=run_distribute, global_size=global_size, repeat=repeat)
class TrainConfig_2:
def __init__(self):
self.subset = "train"
self.num_class = 20
self.train_img_size = 512
self.epoch_num_save = 1
self.epoch = 85
self.encode = True
self.attach_decoder = False
self.lr = getLR(max_lr, 65, 150, 496, \
run_distribute=run_distribute, global_size=global_size, repeat=repeat)
class TrainConfig_3:
def __init__(self):
self.subset = "train"
self.num_class = 20
self.train_img_size = 512
self.epoch_num_save = 1
self.epoch = 65
self.encode = False
self.attach_decoder = True
self.lr = getLR(max_lr, 0, 150, 496, \
run_distribute=run_distribute, global_size=global_size, repeat=repeat)
class TrainConfig_4:
def __init__(self):
self.subset = "train"
self.num_class = 20
self.train_img_size = 512
self.epoch_num_save = 1
self.epoch = 85
self.encode = False
self.attach_decoder = False
self.lr = getLR(max_lr, 65, 150, 496, \
run_distribute=run_distribute, global_size=global_size, repeat=repeat)
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import os
import random
from io import BytesIO
import numpy as np
from PIL import Image, ImageFilter
from torchvision.transforms import Resize
import mindspore.dataset as ds
EXTENSIONS = ['.jpg', '.png']
class MyGaussianBlur(ImageFilter.Filter):
def __init__(self, radius=2, bounds=None):
self.radius = radius
self.bounds = bounds
def filter(self, image):
if self.bounds:
clips = image.crop(self.bounds).gaussian_blur(self.radius)
image.paste(clips, self.bounds)
return image
return image.gaussian_blur(self.radius)
def load_image(file):
return Image.open(file)
def is_image(filename):
return any(filename.endswith(ext) for ext in EXTENSIONS)
def is_label(filename):
return filename.endswith("_labelTrainIds.png")
def image_path(root, basename, extension):
return os.path.join(root, f'{basename}{extension}')
def image_path_city(root, name):
return os.path.join(root, f'{name}')
def image_basename(filename):
return os.path.basename(os.path.splitext(filename)[0])
class MyCoTransform:
def __init__(self, stage, enc, augment, height, if_from_mindrecord=False):
self.enc = enc
self.augment = augment
self.height = height
self.if_from_mindrecord = if_from_mindrecord
if stage not in (1, 2, 3, 4):
raise RuntimeError("stage should be 1, 2, 3 or 4")
self.stage = stage
if self.stage == 1:
self.ratio = 1.2
else:
self.ratio = 1.3
def process_one(self, image, target, height):
if self.augment:
# GaussianBlur
image = image.filter(MyGaussianBlur(radius=random.random()))
if random.random() > 0.5: # random crop
if self.stage == 1:
ratio = self.ratio # fixed ratio, used in stage 1
else:
ratio = random.random() * (self.ratio - 1) + 1
w = int(2048 / ratio)
h = int(1024 / ratio)
x = int(random.random()*(2048-w))
y = int(random.random()*(1024-h))
box = (x, y, x+w, y+h)
image = image.crop(box)
target = target.crop(box)
image = Resize(height, Image.BILINEAR)(image)
target = Resize(height, Image.NEAREST)(target)
# Random hflip
if random.random() < 0.5:
image = image.transpose(Image.FLIP_LEFT_RIGHT)
target = target.transpose(Image.FLIP_LEFT_RIGHT)
else:
image = Resize(height, Image.BILINEAR)(image)
target = Resize(height, Image.NEAREST)(target)
image = np.array(image).astype(np.float32) / 255
image = image.transpose(2, 0, 1)
target = Resize(int(height/8), Image.NEAREST)(target) if self.enc else target
target = np.array(target).astype(np.uint32)
target[target == 255] = 19
return image, target
def process_one_infer(self, image, height):
image = Resize(height, Image.BILINEAR)(image)
image = np.array(image).astype(np.float32) / 255
image = image.transpose(2, 0, 1)
return image
def __call__(self, image, target=None):
if self.if_from_mindrecord:
image = Image.open(BytesIO(image))
target = Image.open(BytesIO(target))
if target is None:
image = self.process_one_infer(image, self.height)
return image
image, target = self.process_one(image, target, self.height)
return image, target
class cityscapes:
def __init__(self, root, subset, enc, aug, height):
self.images_root = os.path.join(root, 'leftImg8bit/')
self.labels_root = os.path.join(root, 'gtFine/')
self.images_root += subset
self.labels_root += subset
self.filenames = [os.path.join(dp, f) for dp, dn, fn in \
os.walk(os.path.expanduser(self.images_root)) for f in fn if is_image(f)]
self.filenames.sort()
self.filenamesGt = [os.path.join(dp, f) for dp, dn, fn in \
os.walk(os.path.expanduser(self.labels_root)) for f in fn if is_label(f)]
self.filenamesGt.sort()
self.transform = MyCoTransform(1, enc, aug, height)
def __getitem__(self, index):
filename = self.filenames[index]
filenameGt = self.filenamesGt[index]
with open(image_path_city(self.images_root, filename), 'rb') as f:
image = load_image(f).convert('RGB')
with open(image_path_city(self.labels_root, filenameGt), 'rb') as f:
label = load_image(f).convert('P')
image, label = self.transform(image, label)
return image, label
def __len__(self):
return len(self.filenames)
class cityscapes_datapath:
def __init__(self, root, subset):
self.images_root = os.path.join(root, 'leftImg8bit/')
self.labels_root = os.path.join(root, 'gtFine/')
self.images_root += subset
self.labels_root += subset
self.filenames = [os.path.join(dp, f) for dp, dn, fn in \
os.walk(os.path.expanduser(self.images_root)) for f in fn if is_image(f)]
self.filenames.sort()
self.filenamesGt = [os.path.join(dp, f) for dp, dn, fn in \
os.walk(os.path.expanduser(self.labels_root)) for f in fn if is_label(f)]
self.filenamesGt.sort()
def __getitem__(self, index):
filename = self.filenames[index]
filenameGt = self.filenamesGt[index]
return filename, filenameGt
def __len__(self):
return len(self.filenames)
def getCityScapesDataLoader_GeneratorDataset(CityScapesRoot, subset, batch_size, \
enc, height, shuffle, aug, rank_id=0, global_size=1, repeat=1):
dataset = cityscapes(CityScapesRoot, subset, enc, aug, height)
dataloader = ds.GeneratorDataset(dataset, column_names=["images", "labels"], \
num_parallel_workers=8, shuffle=shuffle, shard_id=rank_id, \
num_shards=global_size, python_multiprocessing=True)
if shuffle:
dataloader = dataloader.shuffle(batch_size*10)
dataloader = dataloader.batch(batch_size, drop_remainder=False)
if repeat > 1:
dataloader = dataloader.repeat(repeat)
return dataloader
def getCityScapesDataLoader_mindrecordDataset(stage, data_path, batch_size, enc, height, \
shuffle, aug, rank_id=0, global_size=1, repeat=1):
dataloader = ds.MindDataset(data_path, columns_list=["data", "label"], \
num_parallel_workers=8, shuffle=shuffle, shard_id=rank_id, num_shards=global_size)
transform = MyCoTransform(stage, enc, aug, height, if_from_mindrecord=True)
dataloader = dataloader.map(operations=transform, \
input_columns=["data", "label"], output_columns=["data", "label"], \
num_parallel_workers=8, python_multiprocessing=True)
if shuffle:
dataloader = dataloader.shuffle(batch_size*10)
dataloader = dataloader.batch(batch_size, drop_remainder=False)
if repeat > 1:
dataloader = dataloader.repeat(repeat)
return dataloader
class InferDataSet:
def __init__(self, img_path, height):
self.imgs_path = [os.path.join(img_path, img) for img in os.listdir(img_path)]
self.transform = MyCoTransform(1, False, False, height)
def __getitem__(self, index):
with open(self.imgs_path[index], 'rb') as f:
image = load_image(f).convert('RGB')
image = self.transform(image)
return (image,)
def __len__(self):
return len(self.imgs_path)
def getInferDataLoader_fromfile(img_path, batch_size, height):
shuffle = False
global_size = 1
rank_id = 0
dataset = InferDataSet(img_path, height)
dataloader = ds.GeneratorDataset(dataset, column_names=["images"], \
num_parallel_workers=8, shuffle=shuffle, shard_id=rank_id, \
num_shards=global_size, python_multiprocessing=True)
dataloader = dataloader.batch(batch_size, drop_remainder=False)
return dataloader
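The random-crop augmentation in `MyCoTransform.process_one` shrinks the 2048x1024 frame by a ratio (fixed at 1.2 in stage 1, drawn from [1, 1.3) in later stages) and places the window at a random offset before resizing. The sketch below isolates just that geometry. The helper names `random_crop_box` and `sample_ratio` are hypothetical, and an explicit `random.Random` instance is passed in so the behaviour is reproducible.

```python
import random


def sample_ratio(stage, rng):
    """Stage 1 uses a fixed 1.2; other stages draw uniformly from [1, 1.3)."""
    if stage == 1:
        return 1.2
    return rng.random() * (1.3 - 1) + 1


def random_crop_box(ratio, rng, full_w=2048, full_h=1024):
    """Return (left, upper, right, lower) for a crop shrunk by `ratio`."""
    w = int(full_w / ratio)
    h = int(full_h / ratio)
    x = int(rng.random() * (full_w - w))
    y = int(rng.random() * (full_h - h))
    return (x, y, x + w, y + h)


rng = random.Random(0)
box = random_crop_box(sample_ratio(2, rng), rng)
print(box)
```

Because width and height are divided by the same ratio, the crop keeps the 2:1 aspect ratio (up to integer truncation), so the subsequent resize to a fixed height yields consistently shaped batches.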
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import os
from argparse import ArgumentParser
import numpy as np
import torch
from torchvision.transforms import Resize
from PIL import Image
EXTENSIONS = ['.jpg', '.png']  # referenced by is_image below
def load_image(fileName):
return Image.open(fileName)
def is_image(filename):
return any(filename.endswith(ext) for ext in EXTENSIONS)
def is_label(filename):
return filename.endswith("_labelTrainIds.png")
def image_path(root, basename, extension):
return os.path.join(root, f'{basename}{extension}')
def image_path_city(root, name):
return os.path.join(root, f'{name}')
def image_basename(filename):
return os.path.basename(os.path.splitext(filename)[0])
class iouEval_1:
def __init__(self, nClasses, ignoreIndex=19):
self.nClasses = nClasses
self.ignoreIndex = ignoreIndex if nClasses > ignoreIndex else -1
self.reset()
def reset(self):
classes = self.nClasses if self.ignoreIndex == -1 else self.nClasses-1
self.tp = torch.zeros(classes).double()
self.fp = torch.zeros(classes).double()
self.fn = torch.zeros(classes).double()
def addBatch(self, x, y): #x=preds, y=targets
#sizes should be "batch_size x nClasses x H x W"
if (x.is_cuda or y.is_cuda):
x = x.cuda()
y = y.cuda()
if x.size(1) == 1:
x_onehot = torch.zeros(x.size(0), self.nClasses, x.size(2), x.size(3))
if x.is_cuda:
x_onehot = x_onehot.cuda()
x_onehot.scatter_(1, x, 1).float()
else:
x_onehot = x.float()
if y.size(1) == 1:
y_onehot = torch.zeros(y.size(0), self.nClasses, y.size(2), y.size(3))
if y.is_cuda:
y_onehot = y_onehot.cuda()
y_onehot.scatter_(1, y, 1).float()
else:
y_onehot = y.float()
if self.ignoreIndex != -1:
ignores = y_onehot[:, self.ignoreIndex].unsqueeze(1)
x_onehot = x_onehot[:, :self.ignoreIndex]
y_onehot = y_onehot[:, :self.ignoreIndex]
else:
ignores = 0
tpmult = x_onehot * y_onehot
tp = torch.sum(
torch.sum(torch.sum(tpmult, dim=0, keepdim=True), dim=2, keepdim=True),
dim=3, keepdim=True
).squeeze()
fpmult = x_onehot * (1-y_onehot-ignores)
fp = torch.sum(torch.sum(torch.sum(fpmult, dim=0, \
keepdim=True), dim=2, keepdim=True), dim=3, keepdim=True).squeeze()
fnmult = (1-x_onehot) * (y_onehot)
fn = torch.sum(torch.sum(torch.sum(fnmult, dim=0, \
keepdim=True), dim=2, keepdim=True), dim=3, keepdim=True).squeeze()
self.tp += tp.double().cpu()
self.fp += fp.double().cpu()
self.fn += fn.double().cpu()
def getIoU(self):
num = self.tp
den = self.tp + self.fp + self.fn + 1e-15
iou = num / den
return torch.mean(iou), iou
class cityscapes_datapath:
def __init__(self, labels_path):
self.labels_path = labels_path
self.filenamesGt = [os.path.join(dp, f) for dp, dn, fn in \
os.walk(os.path.expanduser(self.labels_path)) for f in fn if is_label(f)]
self.filenamesGt.sort()
def __getitem__(self, index):
filenameGt = self.filenamesGt[index]
return filenameGt
def __len__(self):
return len(self.filenamesGt)
# example:
# python eval.py --res_path \
# --label_path
if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument('--res_path', type=str)
    parser.add_argument('--label_path', type=str)
    config = parser.parse_args()
    res_path = config.res_path
    label_path = config.label_path
    # Map each image key to its ground-truth label file.
    # The suffix is sliced off because str.rstrip strips a character set, not a suffix.
    suffix = "_gtFine_labelTrainIds.png"
    gt = {}
    for i in list(cityscapes_datapath(label_path)):
        gt[os.path.basename(i)[:-len(suffix)]] = i
    metrics = iouEval_1(nClasses=20)
    for i, bin_name in enumerate(os.listdir(res_path)):
        print(i)
        file_name_sof = os.path.join(res_path, bin_name)
        key = bin_name.split("_leftImg8bit_0.bin")[0]
        with open(gt[key], 'rb') as f:
            target = load_image(f).convert('P')
            target = Resize(512, Image.NEAREST)(target)
            target = np.array(target).astype(np.uint32)
            target[target == 255] = 19  # map the "ignore" label to class 19
            target = target.reshape(512, 1024)
            target = target[np.newaxis, :, :]
        # Each .bin file holds the raw float32 network output for one image.
        softmax_out = np.fromfile(file_name_sof, np.float32)
        softmax_out = softmax_out.reshape(1, 20, 512, 1024)
        preds = torch.Tensor(softmax_out.argmax(axis=1).astype(np.int32)).unsqueeze(1).long()
        labels = torch.Tensor(target.astype(np.int32)).unsqueeze(1).long()
        metrics.addBatch(preds, labels)
    mean_iou, iou_class = metrics.getIoU()
    mean_iou = mean_iou.item()
    with open("metric.txt", "w") as file:
        print("mean_iou: ", mean_iou, file=file)
        print("iou_class: ", iou_class, file=file)
    print("mean_iou: ", mean_iou)
    print("iou_class: ", iou_class)
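The evaluation script derives its lookup keys by removing the `_gtFine_labelTrainIds.png` suffix from each label file name. One point worth illustrating: `str.rstrip` takes a set of characters, not a suffix, so it only gives the intended result when the remaining name ends in a character outside that set (Cityscapes names end in digits). The sketch below contrasts the two behaviours. `strip_suffix` and the `berlin` file name are hypothetical, chosen only for the demonstration.

```python
SUFFIX = "_gtFine_labelTrainIds.png"


def strip_suffix(name, suffix=SUFFIX):
    """Remove `suffix` from the end of `name` only if it is really there."""
    return name[:-len(suffix)] if name.endswith(suffix) else name


city_style = "frankfurt_000000_000294" + SUFFIX
letters_only = "berlin" + SUFFIX  # hypothetical name ending in letters

# The trailing digit is not in the character set, so stripping stops there.
print(city_style.rstrip(SUFFIX))
# Every letter of "berlin" is in the set, so rstrip removes the whole name.
print(repr(letters_only.rstrip(SUFFIX)))
# Slicing off the suffix behaves the same for both names.
print(strip_suffix(letters_only))
```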
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import os
from argparse import ArgumentParser
import cv2
import numpy as np
from mindspore import context
from mindspore.train.serialization import load_param_into_net, load_checkpoint
from dataset import getInferDataLoader_fromfile
from show import Colorize_cityscapes
from util import seed_seed
from model import ERFNet
def infer(network_1, eval_dataloader, ckptPath, output_path_1):
colorize = Colorize_cityscapes()
# load model checkpoint
if ckptPath is None:
print("no model checkpoint!")
elif not os.path.exists(ckptPath):
print("not exist {}".format(ckptPath))
else:
print("load model checkpoint {}!".format(ckptPath))
param_dict = load_checkpoint(ckptPath)
load_param_into_net(network_1, param_dict)
for index, images in enumerate(eval_dataloader):
images = images[0]
preds = network_1(images)
preds = np.argmax(preds.asnumpy(), axis=1).astype(np.uint8)
for i, pred in enumerate(preds):
colorized_pred = colorize(pred)
cv2.imwrite(os.path.join(output_path_1, str(index)+"_"+str(i)+".jpg"), \
colorized_pred)
# example:
# python infer.py \
# --data_path /path/to/cityscapes \
# --model_path /path/to/ERFNet.ckpt \
# --output_path /path/to/output \
# --device_id 0 > log_infer.txt
if __name__ == "__main__":
parser = ArgumentParser()
parser.add_argument('--data_path', type=str)
parser.add_argument('--model_path', type=str)
parser.add_argument('--output_path', type=str)
parser.add_argument('--device_id', type=int)
config = parser.parse_args()
model_path = config.model_path
device_id = config.device_id
data_path = config.data_path
output_path = config.output_path
os.makedirs(output_path, exist_ok=True)
seed_seed()
context.set_context(mode=context.GRAPH_MODE)
context.set_context(device_target="Ascend")
context.set_context(device_id=device_id)
context.set_context(save_graphs=False)
dataloader = getInferDataLoader_fromfile(data_path, 32, 512)
weight_init = "XavierUniform"
network = ERFNet(1, 20, weight_init, run_distribute=False, train=False)
network.set_train(False)
infer(network, dataloader, model_path, output_path)
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import torch
class iouEval_1:
def __init__(self, nClasses, ignoreIndex=19):
self.nClasses = nClasses
self.ignoreIndex = ignoreIndex if nClasses > ignoreIndex else -1
self.reset()
def reset(self):
classes = self.nClasses if self.ignoreIndex == -1 else self.nClasses-1
self.tp = torch.zeros(classes).double()
self.fp = torch.zeros(classes).double()
self.fn = torch.zeros(classes).double()
def addBatch(self, x, y): #x=preds, y=targets
#sizes should be "batch_size x nClasses x H x W"
if (x.is_cuda or y.is_cuda):
x = x.cuda()
y = y.cuda()
if x.size(1) == 1:
x_onehot = torch.zeros(x.size(0), self.nClasses, x.size(2), x.size(3))
if x.is_cuda:
x_onehot = x_onehot.cuda()
x_onehot.scatter_(1, x, 1).float()
else:
x_onehot = x.float()
if y.size(1) == 1:
y_onehot = torch.zeros(y.size(0), self.nClasses, y.size(2), y.size(3))
if y.is_cuda:
y_onehot = y_onehot.cuda()
y_onehot.scatter_(1, y, 1).float()
else:
y_onehot = y.float()
if self.ignoreIndex != -1:
ignores = y_onehot[:, self.ignoreIndex].unsqueeze(1)
x_onehot = x_onehot[:, :self.ignoreIndex]
y_onehot = y_onehot[:, :self.ignoreIndex]
else:
ignores = 0
tpmult = x_onehot * y_onehot
tp = torch.sum(
torch.sum(torch.sum(tpmult, dim=0, keepdim=True), dim=2, keepdim=True),
dim=3, keepdim=True
).squeeze()
fpmult = x_onehot * (1-y_onehot-ignores)
fp = torch.sum(torch.sum(torch.sum(fpmult, dim=0, \
keepdim=True), dim=2, keepdim=True), dim=3, keepdim=True).squeeze()
fnmult = (1-x_onehot) * (y_onehot)
fn = torch.sum(torch.sum(torch.sum(fnmult, dim=0, \
keepdim=True), dim=2, keepdim=True), dim=3, keepdim=True).squeeze()
self.tp += tp.double().cpu()
self.fp += fp.double().cpu()
self.fn += fn.double().cpu()
def getIoU(self):
num = self.tp
den = self.tp + self.fp + self.fn + 1e-15
iou = num / den
return torch.mean(iou), iou
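`iouEval_1` accumulates true positives, false positives and false negatives per class, drops the ignore class (index 19), and reports `tp / (tp + fp + fn)`. As a rough cross-check of that bookkeeping, here is a dependency-free sketch over flat label lists. `per_class_iou` is a hypothetical helper, not part of the repository, and it is meant to mirror the counting rules rather than replace the tensor implementation.

```python
def per_class_iou(preds, targets, num_classes=20, ignore_index=19):
    """Accumulate TP/FP/FN over flat label lists and return per-class IoU."""
    kept = [c for c in range(num_classes) if c != ignore_index]
    tp = {c: 0 for c in kept}
    fp = {c: 0 for c in kept}
    fn = {c: 0 for c in kept}
    for p, t in zip(preds, targets):
        if t == ignore_index:
            continue  # ignored pixels count toward neither FP nor FN
        if p == t:
            tp[p] += 1
        else:
            if p != ignore_index:
                fp[p] += 1  # predicted class p where the target was something else
            fn[t] += 1      # target class t was missed
    # The small epsilon avoids division by zero for classes that never occur.
    return {c: tp[c] / (tp[c] + fp[c] + fn[c] + 1e-15) for c in kept}


preds = [0, 0, 1, 1, 19, 0]
targets = [0, 1, 1, 19, 0, 0]
iou = per_class_iou(preds, targets)
mean_iou = sum(iou.values()) / len(iou)
print(iou[0], iou[1], mean_iou)
```

Classes that never appear contribute an IoU of 0 to the mean, which matches taking `torch.mean` over all 19 kept classes.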
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
from mindspore import ops, nn
class DownsamplerBlock(nn.Cell):
    def __init__(self, in_feature_num, out_feature_num, weight_init):
        super(DownsamplerBlock, self).__init__()
        self.conv = nn.Conv2d(in_feature_num, out_feature_num-in_feature_num,
                              3, stride=2, padding=(1, 1, 1, 1), has_bias=True,
                              weight_init=weight_init, pad_mode="pad")
        self.pool = nn.MaxPool2d(2, 2)
        self.bn = nn.BatchNorm2d(out_feature_num, eps=1e-3)
        self.cat = ops.Concat(axis=1)
        self.relu = nn.ReLU()

    def construct(self, x):
        a = self.conv(x)
        b = self.pool(x)
        output = self.cat((a, b))
        output = self.bn(output)
        output = self.relu(output)
        return output
class non_bottleneck_1d(nn.Cell):
def __init__(self, chann, dropprob, dilated, weight_init):
super(non_bottleneck_1d, self).__init__()
self.dropprob = dropprob
self.conv3x1_1 = nn.Conv2d(chann, chann, (3, 1), stride=1, \
padding=(1, 1, 0, 0), pad_mode='pad', has_bias=True, \
weight_init=weight_init)
self.conv1x3_1 = nn.Conv2d(chann, chann, (1, 3), stride=1, \
padding=(0, 0, 1, 1), pad_mode='pad', has_bias=True, \
weight_init=weight_init)
self.conv3x1_2 = nn.Conv2d(chann, chann, (3, 1), stride=1, \
padding=(dilated, dilated, 0, 0), pad_mode='pad', \
has_bias=True, dilation=(dilated, 1), weight_init=weight_init)
self.conv1x3_2 = nn.Conv2d(chann, chann, (1, 3), stride=1, \
padding=(0, 0, dilated, dilated), pad_mode='pad', \
has_bias=True, dilation=(1, dilated), weight_init=weight_init)
self.bn1 = nn.BatchNorm2d(chann, eps=1e-03)
self.bn2 = nn.BatchNorm2d(chann, eps=1e-03)
if dropprob > 0:
self.dropout = ops.Dropout(keep_prob=1-dropprob)
self.relu = nn.ReLU()
self.dilated = dilated
def construct(self, x):
output = self.conv3x1_1(x)
output = self.relu(output)
output = self.conv1x3_1(output)
output = self.bn1(output)
output = self.relu(output)
output = self.conv3x1_2(output)
output = self.relu(output)
output = self.conv1x3_2(output)
output = self.bn2(output)
if self.dropprob > 0:
output, _ = self.dropout(output)
return self.relu(output + x)
class Encoder(nn.Cell):
def __init__(self, stage, weight_init, run_distribute, train=True):
super(Encoder, self).__init__()
if train:
if run_distribute:
if stage == 3:
drop_prob = [0.03, 0.2]
elif stage in (1, 2, 4):
drop_prob = [0.03, 0.3]
else:
raise RuntimeError("stage should be 1, 2, 3, or 4.")
else:
drop_prob = [0.03, 0.2]
else:
drop_prob = [.0, .0]
self.layers = nn.CellList()
self.down1 = DownsamplerBlock(3, 16, weight_init)
self.down2 = DownsamplerBlock(16, 64, weight_init)
self.bottleneck1 = non_bottleneck_1d(64, drop_prob[0], 1, weight_init)
self.bottleneck2 = non_bottleneck_1d(64, drop_prob[0], 1, weight_init)
self.bottleneck3 = non_bottleneck_1d(64, drop_prob[0], 1, weight_init)
self.bottleneck4 = non_bottleneck_1d(64, drop_prob[0], 1, weight_init)
self.bottleneck5 = non_bottleneck_1d(64, drop_prob[0], 1, weight_init)
self.down3 = DownsamplerBlock(64, 128, weight_init)
self.bottleneck6 = non_bottleneck_1d(128, drop_prob[1], 2, weight_init)
self.bottleneck7 = non_bottleneck_1d(128, drop_prob[1], 4, weight_init)
self.bottleneck8 = non_bottleneck_1d(128, drop_prob[1], 8, weight_init)
self.bottleneck9 = non_bottleneck_1d(128, drop_prob[1], 16, weight_init)
self.bottleneck10 = non_bottleneck_1d(128, drop_prob[1], 2, weight_init)
self.bottleneck11 = non_bottleneck_1d(128, drop_prob[1], 4, weight_init)
self.bottleneck12 = non_bottleneck_1d(128, drop_prob[1], 8, weight_init)
self.bottleneck13 = non_bottleneck_1d(128, drop_prob[1], 16, weight_init)
def construct(self, x):
x = self.down1(x)
x = self.down2(x)
x = self.bottleneck1(x)
x = self.bottleneck2(x)
x = self.bottleneck3(x)
x = self.bottleneck4(x)
x = self.bottleneck5(x)
x = self.down3(x)
x = self.bottleneck6(x)
x = self.bottleneck7(x)
x = self.bottleneck8(x)
x = self.bottleneck9(x)
x = self.bottleneck10(x)
x = self.bottleneck11(x)
x = self.bottleneck12(x)
x = self.bottleneck13(x)
return x
class UpsamplerBlock(nn.Cell):
def __init__(self, in_feature_num, out_feature_num, weight_init):
super(UpsamplerBlock, self).__init__()
self.conv = nn.Conv2dTranspose(in_feature_num, out_feature_num, 3, \
stride=2, has_bias=True, weight_init=weight_init)
self.bn = nn.BatchNorm2d(out_feature_num, eps=1e-03)
self.relu = nn.ReLU()
def construct(self, x):
x = self.conv(x)
x = self.bn(x)
x = self.relu(x)
return x
class Decoder(nn.Cell):
def __init__(self, num_classes, weight_init):
super(Decoder, self).__init__()
self.up1 = UpsamplerBlock(128, 64, weight_init)
self.bottleneck1 = non_bottleneck_1d(64, 0, 1, weight_init)
self.bottleneck2 = non_bottleneck_1d(64, 0, 1, weight_init)
self.up2 = UpsamplerBlock(64, 16, weight_init)
self.bottleneck3 = non_bottleneck_1d(16, 0, 1, weight_init)
self.bottleneck4 = non_bottleneck_1d(16, 0, 1, weight_init)
self.pred = nn.Conv2dTranspose(16, num_classes, 2, stride=2, has_bias=True, \
weight_init=weight_init)
def construct(self, x):
x = self.up1(x)
x = self.bottleneck1(x)
x = self.bottleneck2(x)
x = self.up2(x)
x = self.bottleneck3(x)
x = self.bottleneck4(x)
x = self.pred(x)
return x
class Encoder_pred(nn.Cell):
def __init__(self, stage, num_class, weight_init, run_distribute, train=True):
super(Encoder_pred, self).__init__()
self.encoder = Encoder(stage, weight_init, run_distribute, train)
self.pred = nn.Conv2d(128, num_class, 1, stride=1, pad_mode='valid', \
has_bias=True, weight_init=weight_init)
def construct(self, x):
x = self.encoder(x)
x = self.pred(x)
return x
class ERFNet(nn.Cell):
def __init__(self, stage, num_class, init_conv, run_distribute, train=True):
super(ERFNet, self).__init__()
self.encoder = Encoder(stage, init_conv, run_distribute, train)
self.decoder = Decoder(num_class, init_conv)
def construct(self, x):
x1 = self.encoder(x)
x2 = self.decoder(x1)
return x2
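Two properties of the model above can be checked with plain arithmetic: how much the factorized 3x1 + 1x3 pair in `non_bottleneck_1d` saves over a single 3x3 convolution, and why encoder-only training resizes labels to `height/8` (three stride-2 `DownsamplerBlock`s). The sketch below assumes only the standard convolution parameter count and output-size formula. The function names are hypothetical, and batch-norm parameters are left out of the count.

```python
def conv_params(c_in, c_out, kh, kw, bias=True):
    """Weights plus optional biases of a 2D convolution."""
    return c_in * c_out * kh * kw + (c_out if bias else 0)


def factorized_vs_full(channels):
    """Compare one 3x3 conv with a 3x1 conv followed by a 1x3 conv."""
    full = conv_params(channels, channels, 3, 3)
    factorized = (conv_params(channels, channels, 3, 1)
                  + conv_params(channels, channels, 1, 3))
    return full, factorized


def downsampled_size(size, kernel=3, stride=2, pad=1):
    """Output length of the strided conv in DownsamplerBlock along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


full, fact = factorized_vs_full(128)
print(full, fact)

h, w = 512, 1024
for _ in range(3):  # down1, down2, down3
    h, w = downsampled_size(h), downsampled_size(w)
print(h, w)
```

For 128 channels the factorized pair needs about two thirds of the parameters of the full 3x3 convolution, and a 512x1024 input leaves the encoder at 64x128, one eighth of the input resolution on each axis.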
# Copyright 2021 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
import numpy as np
def colormap_cityscapes():
cmap = np.zeros([20, 3]).astype(np.uint8)
cmap[0, :] = np.array([128, 64, 128])
cmap[1, :] = np.array([244, 35, 232])
cmap[2, :] = np.array([70, 70, 70])
cmap[3, :] = np.array([102, 102, 156])
cmap[4, :] = np.array([190, 153, 153])
cmap[5, :] = np.array([153, 153, 153])
cmap[6, :] = np.array([250, 170, 30])
cmap[7, :] = np.array([220, 220, 0])
cmap[8, :] = np.array([107, 142, 35])
cmap[9, :] = np.array([152, 251, 152])
cmap[10, :] = np.array([70, 130, 180])
cmap[11, :] = np.array([220, 20, 60])
cmap[12, :] = np.array([255, 0, 0])
cmap[13, :] = np.array([0, 0, 142])
cmap[14, :] = np.array([0, 0, 70])
cmap[15, :] = np.array([0, 60, 100])
cmap[16, :] = np.array([0, 80, 100])
cmap[17, :] = np.array([0, 0, 230])
cmap[18, :] = np.array([119, 11, 32])
cmap[19, :] = np.array([0, 0, 0])
return cmap
class Colorize_cityscapes:
    def __init__(self):
        self.cmap = colormap_cityscapes()

    def __call__(self, gray_image):
        size = gray_image.shape
        color_image = np.zeros((size[0], size[1], 3)).astype(np.uint8)
        for label in range(20):
            mask = label == gray_image
            color_image[mask] = self.cmap[label]
        return color_image