diff --git a/research/cv/flownet2/README.md b/research/cv/flownet2/README.md new file mode 100644 index 0000000000000000000000000000000000000000..34ebe2b864b4e1d9019e30248da860190cfd0178 --- /dev/null +++ b/research/cv/flownet2/README.md @@ -0,0 +1,426 @@ +# Contents + +- [Contents](#contents) +- [Flownet2 Description](#flownet2-description) +- [Model Architecture](#model-architecture) +- [Dataset](#dataset) +- [Environment Requirements](#environment-requirements) +- [Quick Start](#quick-start) +- [Script Description](#script-description) + - [Script and Sample Code](#script-and-sample-code) + - [Script Parameters](#script-parameters) + - [Training Process](#training-process) + - [Training](#training) + - [Distributed Training](#distributed-training) + - [Evaluation Process](#evaluation-process) + - [Evaluation](#evaluation) +- [Model Description](#model-description) + - [Performance](#performance) + - [Training Performance](#training-performance) + - [FlowNet2 train on FlyingChairs](#flownet2-train-on-flyingchairs) + - [Inference Performance](#inference-performance) + - [FLowNet2 infer on MpiSintelClean](#flownet2-infer-on-mpisintelclean) +- [ModelZoo Homepage](#modelzoo-homepage) + +# [Flownet2 Description](#contents) + +FlowNet2.0, a deep network proposed in 2017, which performs end-to-end learning on optical flow data. +It is optimized based on the FlowNet network , The large improvements in quality and +speed are caused by three major contributions: first, it +focus on the training data and show that the schedule of +presenting data during training is very important. Second, +it develop a stacked architecture that includes warping +of the second image with intermediate optical flow. Third, +it elaborate on small displacements by introducing a subnetwork specializing on small motions. + +Compared with the FLownet network, FlowNet 2.0 is only +marginally slower than the original FlowNet but decreases +the estimation error by more than 50%. + +[FlowNet2 paper](https://arxiv.org/abs/1612.01925 ):Eddy Ilg, Nikolaus Mayer, Tonmoy Saikia, Margret Keuper, Alexey Dosovitskiy, Thomas Brox + +[FlowNet paper](https://arxiv.org/abs/1504.06852 ):Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox + +# [Model Architecture](#contents) + +The FlowNet2 network is stacked by multiple flownet sub-modules. After the output of the previous network is processed by warp, it is used as the input of the second network. + +The model structure is flowNet2CSS and FlowNet2SD two sub-networks fuse the output through the FlownetFusion network, and the entire large network structure formed is FLowNet2 +The FlowNet2CSS network is a stack of FLowNet2C and two FLowNet2S. 
The specific structure can be further understood according to the paper + +This source code provides the following model structure, which can be configured and used in the yaml file: + +- FlowNet2S +- FlowNet2C +- FlowNet2CS +- FlowNet2CSS +- FlowNet2SD +- FlowNet2 + +# [Dataset](#contents) + +Dataset used: [FlyingChairs](https://lmb.informatik.uni-freiburg.de/resources/datasets/FlyingChairs.en.html) + +- Dataset size:31GB,22,872 pairs 512*384 colorful images +- Data format:PPM + - Note:Data will be processed in src/dataset.py +- you can download here [dataset package](https://lmb.informatik.uni-freiburg.de/data/FlyingChairs/FlyingChairs.zip) + +Dataset used: [ChairsSDHom](https://lmb.informatik.uni-freiburg.de/resources/datasets/FlyingChairs.en.html) + +- Dataset size: 51GB, 21,668 pairs 512*384 colorful images + - Train:20,965 pairs image + - Test: 703 pairs image +- Data format:PNG + - Note: Data will be processed in src/dataset.py +- you can download here [dataset package](https://lmb.informatik.uni-freiburg.de/data/FlowNet2/ChairsSDHom/ChairsSDHom.tar.gz) + +Dataset used: [MpiSintel](http://sintel.cs.washington.edu) + +- Dataset size: 536M, 1024 x 436 colorful images in 23 classes + - MpiSintelClean:1150 images + - MpiSintelFinal: 1150 images +- Data format:PNG + - Note: Data will be processed in src/dataset.py +- you can download here [dataset package](http://files.is.tue.mpg.de/sintel/MPI-Sintel-complete.zip) + +# [Environment Requirements](#contents) + +- Hardware(Ascend/GPU/CPU) + - Prepare hardware environment with Ascend/GPU/CPU processor. +- Framework + - [MindSpore](https://www.mindspore.cn/install/en) +- For more information, please check the resources below: + - [MindSpore Tutorials](https://www.mindspore.cn/tutorials/en/master/index.html) + - [MindSpore Python API](https://www.mindspore.cn/docs/api/en/master/index.html) + +# [Quick Start](#contents) + +After installing MindSpore via the official website, you can start training and evaluation as follows: + +- download pretrained parameter + + FlowNet2 [620MB](https://drive.google.com/file/d/1hF8vS6YeHkx3j2pfCeQqqZGwA_PJq_Da/view?usp=sharing) + + FlowNet2-C [149MB](https://drive.google.com/file/d/1BFT6b7KgKJC8rA59RmOVAXRM_S7aSfKE/view?usp=sharing) + + FlowNet2-CS [297MB](https://drive.google.com/file/d/1iBJ1_o7PloaINpa8m7u_7TsLCX0Dt_jS/view?usp=sharing) + + FlowNet2-CSS [445MB](https://drive.google.com/file/d/157zuzVf4YMN6ABAQgZc8rRmR5cgWzSu8/view?usp=sharing) + + FlowNet2-CSS-ft-sd [445MB](https://drive.google.com/file/d/1R5xafCIzJCXc8ia4TGfC65irmTNiMg6u/view?usp=sharing) + + FlowNet2-S [148MB](https://drive.google.com/file/d/1V61dZjFomwlynwlYklJHC-TLfdFom3Lg/view?usp=sharing) + + FlowNet2-SD [173MB](https://drive.google.com/file/d/1QW03eyYG_vD-dT-Mx4wopYvtPu_msTKn/view?usp=sharing) + +- convert pretrained parameter (from pytorch pretrained parameter to mindspore pretained parameter,so the env should both installed torch and mindspore) + convert pytorch pretrained parameter to mindspore pretrained parameter + the pytorch pretrained parameter are supposed to be downloaded by above link + + ```text + bash scripts/run_ckpt_convert.sh [PYTORCH_FILE_PATH] [MINDSPORE_FILE_PATH] + # example: + bash scripts/run_ckpt_convert.sh /path/to/FlowNet2_checkpoint.pth.tar /path/to/flownet2.ckpt + ``` + +- compile custom operation Correlation and Resample2d + after execution,you can check the whether generate correlation.so and resample2d.so under path src/submodels/custom_ops/ + + ```text + bash scripts/run_compile_custom_ops.sh + ``` + +- 
config pretrained parameter path in yaml file + + ```text + pre_trained: # whether use pretrained parameter file 1 or 0 + pre_trained_ckpt_path: # pretrained checkpoint file path + # 实例: + pre_trained: 1 + pre_trained_ckpt_path: /path/checkpoint/flownet2.ckpt + ``` + +- config dataset name and path in yaml file + + ```text + train_data: [DATASET_NAME] # Name of dataset, 'FlyingChairs' or 'MpiSintelFinal' or 'MpiSintelClean' + train_data_path:[DATASET_PATH] # path of dataset + # example: + train_data: FlyingChairs + train_data_path: /path/to/FlyingChairs_release/data + ``` + +- running on GPU + + For running on GPU, please change `device_target` from `Ascend` to `GPU` in configuration file default_config.yaml + + ```python + # run training example + export CUDA_VISIBLE_DEVICES=0 + python train.py > train.log 2>&1 & + + # run distributed training example + bash scripts/run_train_gpu.sh 8 0,1,2,3,4,5,6,7 + + # run evaluation example + python eval.py --eval_checkpoint_path=[EVAL_CHECKPOINT_PATH] > eval.log 2>&1 & + OR + bash scripts/run_eval_gpu.sh [MpiSintelClean/MpiSintelFinal] [DATA_PATH] [MODEL_NAME] [CKPT_PATH] [DEVICE_ID] + ``` + +We use FlyingChairs dataset by default. Your can also pass `$dataset_type` to the scripts so that select different datasets. For more details, please refer the specify script. + +# [Script Description](#contents) + +## [Script and Sample Code](#contents) + +```text +├── model_zoo + ├── README.md // descriptions about all the models + ├── flownet2 + ├── README.md // descriptions about flownet2 + ├── scripts + │ ├── run_ckpt_convert.sh // shell script for converting pytorch ckpt file to pickle file on GPU + │ ├── run_compile_custom_ops.sh // shell script for compile ops + │ ├── run_eval_gpu.sh // shell script for eval on GPU + │ └── run_train_gpu.sh // shell script for training on GPU + ├── src + │ ├── dataset.py // creating dataset + │ ├── eval_callback.py // eval callback when training + │ ├── metric.py // metric to calculate mean error + │ ├── model_utils + │ │ ├── ckpt_convert.py // convert pytorch ckpt file to pickle file + │ │ ├── config.py // parameter configuration + │ │ ├── device_adapter.py // device adapter + │ │ ├── local_adapter.py // local adapter + │ │ ├── moxing_adapter.py // moxing adapter + │ │ ├── frame_utils.py // utils to read files of dataset + │ │ └── tools.py // tools to match class with paratmeter from config + │ ├── models.py // FlowNet2/FlowNet2CSS/FlowNet2CS/FlowNet2C/FlowNet2S/FlowNet2SD model + │ └── submodels + │ ├── custom_ops + │ │ ├── correlation.cu // cuda file for operation correlation + │ │ ├── resample2d.cu // cuda file for operation resample2d + │ │ └── custom_ops.py // definition of correlation and resample2d + │ ├── FlowNetC.py // FlowNetC model + │ ├── FlowNetFusion.py // FlowNetFusion model + │ ├── FlowNetS.py // FlowNetS model + │ ├── FlowNetSD.py // FlowNetSD model + │ └── submodules.py // submodules used in flownet model + ├── default_config.yaml // parameter configuration + ├── requirements.txt // requirements configuration + ├── eval.py // evaluation script + └── train.py // training script +``` + +## [Script Parameters](#contents) + +Parameters for both training and evaluation can be set in config.py + +- config for FLowNet2 + + ```text + # ============================================================================== + # Device + device_target: "GPU" + device_id: 0 + + # Dataset Setup + crop_type: Random # Type of cropping operation (Random and Center) + crop_size: [384, 512] # (Height, Width) of image when 
training + eval_size: [256, 256] # (Height, Width) of image when eval + + # Experiment Setup + model: "FlowNet2" # Name of model to be loaded + rgb_max: 255 # rgb channel used + batchNorm: False # boolean switch to whether add batchnorm before conv + lr: 1e-6 # Learning rate + num_parallel_workers: 2 # Number of CPU worker used to load data + max_rowsize: 2 # Number of max rowsize used to load data + batch_size: 2 # Numbers of image pairs in a mini-batch + epoch_size: 20 # Total number of epochs + pre_trained: 1 # Load pretrained network + pre_trained_ckpt_path: "/path/flownet2.ckpt" # Pretrained ckpt path + seed: 1 # Seed for reproducibility + is_dynamicLoss_scale: 0 # Using dynamicLoss scale or fix scale + scale: 1024 # Fix scale value + weight_decay: 0.00001 # Weight decay + train_data: "FlyingChairs" # Train Dataset name + train_data_path: "/path/ds/FlyingChairs_release/data" # Train Dataset path + + # Train Setup + run_distribute: 1 # Distributed training or not + is_save_on_master: 1 # Only save ckpt on master device + save_checkpoint: 1 # Is save ckpt while training + save_ckpt_interval: 1 # Saving ckpt interval + keep_checkpoint_max: 5 # Max ckpt file number + save_checkpoint_path: "/path/ckpt/" # Ckpt save path + + # eval Setup + eval_data: "MpiSintelClean" # Eval Dataset name + eval_data_path: "/home/shm/ds/training" # Eval Dataset path + eval_checkpoint_path: "/path/flownet2.ckpt" # Ckpt path used to eval + run_evalCallback: 1 # Is run evalCallBack while training + eval_start_epoch: 1 # EvalCallback start epoch + eval_interval: 5 # EvalCallback running interval + save_best_ckpt: 1 # Is save best ckpt + + # Export Setup + mindir_file_name: "Flownet2" # Save file path + file_format: "MINDIR" # Save file format + + # Modelarts Setup + enable_modelarts: 0 # Is training on modelarts + ``` + +For more configuration details, please refer the script `config.py`. + +## [Training Process](#contents) + +### Training + +- running on GPU + + ```python + export CUDA_VISIBLE_DEVICES=0 + python train.py > train.log 2>&1 & + ``` + + ```bash + bash scripts/run_train_gpu.sh 1 0 + ``` + + The python command above will run in the background, you can view the results through the file `train.log`. + + After training, you'll get some checkpoint files under the folder `${save_checkpoint_path}/ckpt_0/` by default. + +- train.log for flyingchairs + +```text +epoch: 1 step: 2859, loss is 1.0592992305755615 +epoch time: 2454542.145 ms, per step time: 858.532 ms +epoch: 2 step: 2859, loss is 1.074428915977478 +epoch time: 2416319.469 ms, per step time: 845.162 ms +epoch: 3 step: 2859, loss is 0.6141664981842041 +epoch time: 2412936.084 ms, per step time: 843.979 ms +``` + +- train.log for MpiSintel + +```text +epoch: 1 step: 131, loss is 0.3894098699092865 +epoch time: 114087.253 ms, per step time: 870.895 ms +epoch: 2 step: 131, loss is 1.822862982749939 +epoch time: 93423.045 ms, per step time: 713.153 ms +epoch: 3 step: 131, loss is 0.06125941127538681 +epoch time: 93837.971 ms, per step time: 716.320 ms +``` + +### Distributed Training + +- running on GPU + + ```bash + bash scripts/run_train_gpu.sh 8 0,1,2,3,4,5,6,7 + ``` + + The above shell script will run distribute training in the background. You can view the results through the file `train.log`. 
+ +- train.log for flyingchairs + +```text +epoch: 1 step: 358, loss is 1.1717915534973145 +epoch: 1 step: 358, loss is 0.6347103118896484 +epoch: 1 step: 358, loss is 1.4680955410003662 +epoch: 1 step: 358, loss is 1.7656424045562744 +epoch: 1 step: 358, loss is 1.1760812997817993 +epoch: 1 step: 358, loss is 0.8203185200691223 +epoch: 1 step: 358, loss is 2.2942874431610107 +epoch: 1 step: 358, loss is 1.3205347061157227 +epoch time: 858929.203 ms, per step time: 2399.244 ms +epoch time: 859414.930 ms, per step time: 2400.600 ms +epoch time: 859515.190 ms, per step time: 2400.880 ms +epoch time: 859614.460 ms, per step time: 2401.158 ms +epoch time: 859695.493 ms, per step time: 2401.384 ms +epoch time: 859799.146 ms, per step time: 2401.674 ms +epoch time: 859995.238 ms, per step time: 2402.221 ms +epoch time: 860035.718 ms, per step time: 2402.334 ms +``` + +## [Evaluation Process](#contents) + +### Evaluation + +- evaluation on MpiSintelClean dataset when running on GPU + + Before running the command below, please check the checkpoint path used for evaluation. Please set the checkpoint path to be the absolute full path, e.g., "path/flownet2/ckpt/flownet2-125_390.ckpt". + + ```python + python eval.py --eval_data=[DATASET_NAME] --eval_data_path=[DATASET_PATH]/ + --model=[MODEL_NAME] --eval_checkpoint_path=[CHECKPOINT_PATH] > eval.log 2>&1 & + ``` + + The above python command will run in the background. You can view the results through the file "eval.log". The accuracy of the test dataset will be as follows: + + ```bash + # grep "mean error: " eval.log + flownet2 mean error: {'flownetEPE': 2.112366} + ``` + + OR, + + ```bash + bash scripts/run_eval_gpu.sh [MpiSintelClean/MpiSintelFinal] [DATA_PATH] [MODEL_NAME] [CKPT_PATH] [DEVICE_ID] + ``` + + The above python command will run in the background. You can view the results through the file "eval/eval.log". The accuracy of the test dataset will be as follows: + + ```text + # grep "mean error: " eval.log + flownet2 mean error: {'flownetEPE': 2.112366} + ``` + +# [Model Description](#contents) + +## [Performance](#contents) + +### Training Performance + +#### FlowNet2 train on FlyingChairs + +| Parameters | GPU | +|----------------------------|---------------------------------------------------------------------------------------------------| +| Model Version | Inception V1 | +| Resource | NV SMX2 V100-32G | +| uploaded Date | 04/05/2021 (month/day/year) | +| MindSpore Version | 1.7.0 | +| Dataset | FlyingChairs | +| Training Parameters | epoch=50, steps=2800, batch_size=8, lr=1e-6 | +| Optimizer | Adam | +| Loss Function | L1loss | +| outputs | flow | | +| Speed | 1pc: 152 ms/step; 8pcs: 171 ms/step | +| Total time | 8pcs: 8.8 hours | +| Parameters | 162,518,834 | +| Checkpoint for Fine tuning | 260M (.ckpt file) | +| Scripts | [flownet2 script](https://gitee.com/mindspore/models/tree/master/research/cv/flownet2) | + +### Inference Performance + +#### FlowNet2 infer on MpiSintelClean + +| Parameters | GPU | +|-------------------|-----------------------------| +| Model Version | Inception V1 | +| Resource | NV SMX2 V100-32G | +| Uploaded Date | 04/05/2022 (month/day/year) | +| MindSpore Version | 1.7.0 | +| Dataset | MpiSintelClean | +| batch_size | 8 | +| outputs | flow | +| Mean Error | 2.10 | + +# [ModelZoo Homepage](#contents) + + Please check the official [homepage](https://gitee.com/mindspore/models). 
diff --git a/research/cv/flownet2/default_config.yaml b/research/cv/flownet2/default_config.yaml new file mode 100644 index 0000000000000000000000000000000000000000..6a21bdd7d65681c0b2e0bb4ba31afe84272e9c8a --- /dev/null +++ b/research/cv/flownet2/default_config.yaml @@ -0,0 +1,48 @@ +# ============================================================================== +# Device +device_target: "GPU" +device_id: 0 + +# Dataset Setup +crop_type: Random # Type of cropping operation (Random and Center) when training +crop_size: [384, 512] # (Height, Width) of image when training +eval_size: [256, 256] # (Height, Width) of image when eval + +# Experiment Setup +model: "FlowNet2" # Name of model to be loaded +rgb_max: 255 # rgb channel used +batchNorm: False # boolean switch to whether add batchnorm before conv +lr: 0.0000001 # Learning rate +num_parallel_workers: 2 # Number of CPU worker used to load data +max_rowsize: 2 # Number of max rowsize used to load data +batch_size: 8 # Numbers of image pairs in a mini-batch +epoch_size: 20 # Total number of epochs +pre_trained: 1 # Load pretrained network +pre_trained_ckpt_path: "/path/flownet2.ckpt" # Pretrained ckpt path +seed: 1 # Seed for reproducibility +is_dynamicLoss_scale: 0 # Using dynamicLoss scale or fix scale +scale: 1024 # Fix scale value +weight_decay: 0.00001 # Weight decay +train_data: "FlyingChairs" # Train Dataset name +train_data_path: "/path/ds/FlyingChairs_release/data" # Train Dataset path + +# Train Setup +run_distribute: 0 # Distributed training or not +is_save_on_master: 1 # Only save ckpt on master device +save_checkpoint: 1 # Is save ckpt while training +save_ckpt_interval: 1 # Saving ckpt interval +keep_checkpoint_max: 5 # Max ckpt file number +save_checkpoint_path: "/path/ckpt/" # Ckpt save path + +# eval Setup +eval_data: "MpiSintelClean" # Eval Dataset name +eval_data_path: "/path/ds/training" # Eval Dataset path +eval_checkpoint_path: "/path/flownet2.ckpt" # Ckpt path used to eval +run_evalCallback: 1 # Is run evalCallBack while training +eval_start_epoch: 1 # EvalCallback start epoch +eval_interval: 1 # EvalCallback running interval +save_best_ckpt: 1 # Is save best ckpt + +# Export Setup +mindir_file_name: "Flownet2" # Save file path +file_format: "MINDIR" # Save file format diff --git a/research/cv/flownet2/eval.py b/research/cv/flownet2/eval.py new file mode 100644 index 0000000000000000000000000000000000000000..0034d6a24effeac63647cbcee346a3f79bbeb363 --- /dev/null +++ b/research/cv/flownet2/eval.py @@ -0,0 +1,64 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +import mindspore.nn as nn +import mindspore.dataset as ds +from mindspore import context +from mindspore.common import set_seed +from mindspore.context import ParallelMode +from mindspore.train.model import Model +from mindspore.train.serialization import load_checkpoint, load_param_into_net + + +import src.dataset as datasets +import src.models as models +from src.metric import FlowNetEPE +import src.model_utils.tools as tools +from src.model_utils.config import config + +def run_eval(): + set_seed(config.seed) + context.set_context(mode=context.GRAPH_MODE, device_target=config.device_target, save_graphs=False) + context.set_auto_parallel_context(parallel_mode=ParallelMode.STAND_ALONE, gradients_mean=True, device_num=1) + ds.config.set_enable_shared_mem(False) + # load dataset by config param + config.eval_dataset_class = tools.module_to_dict(datasets)[config.eval_data] + flownet_eval_gen = config.eval_dataset_class("Center", config.crop_size, config.eval_size, + config.eval_data_path) + eval_dataset = ds.GeneratorDataset(flownet_eval_gen, ["images", "flow"] + , num_parallel_workers=config.num_parallel_workers, + max_rowsize=config.max_rowsize) + eval_dataset = eval_dataset.batch(config.batch_size) + + # load model by config param + config.model_class = tools.module_to_dict(models)[config.model] + net = config.model_class(config.rgb_max, config.batchNorm) + + loss = nn.L1Loss() + + param_dict = load_checkpoint(config.eval_checkpoint_path) + print("load checkpoint from [{}].".format(config.eval_checkpoint_path)) + load_param_into_net(net, param_dict) + net.set_train(False) + + model = Model(net, loss_fn=loss, metrics={'flownetEPE': FlowNetEPE()}) + + mean_error = model.eval(eval_dataset, dataset_sink_mode=False) + + print("flownet2 mean error: ", mean_error) + + +if __name__ == '__main__': + run_eval() diff --git a/research/cv/flownet2/requirements.txt b/research/cv/flownet2/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..98b973b4425dc52838a4dfb5f6a8ab6464f5dc59 --- /dev/null +++ b/research/cv/flownet2/requirements.txt @@ -0,0 +1,8 @@ +Pillow +imageio +matplotlib +decorator +numpy +pytz +PyYAML +mindspore_gpu>=1.7.0 \ No newline at end of file diff --git a/research/cv/flownet2/scripts/run_ckpt_convert.sh b/research/cv/flownet2/scripts/run_ckpt_convert.sh new file mode 100644 index 0000000000000000000000000000000000000000..67f8f40f4194fc59c5e3ea0746d2f847a992cf4c --- /dev/null +++ b/research/cv/flownet2/scripts/run_ckpt_convert.sh @@ -0,0 +1,43 @@ +#!/bin/bash +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +if [ $# -lt 1 ] || [ $# -gt 2 ]; then + echo "Usage: bash run_ckpt_convert.sh [PYTORCH_FILE_PATH] [MINDSPORE_FILE_PATH] + PYTORCH_FILE_PATH is pytorch pretrained model ckpt file path. 
+ MINDSPORE_FILE_PATH is mindspore pretrained model ckpt file path." +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} + +torch_file_path=$(get_real_path $1) + +if [ ! -f ${torch_file_path} ]; then + echo "Pytorch pretrained model ckpt file path does not exist." +exit 1 +fi + +mindspore_file_path=$(get_real_path $2) + +BASEPATH=$(cd "`dirname $0`" || exit; pwd) + +python3 ${BASEPATH}/../src/model_utils/ckpt_convert.py ${torch_file_path} ${mindspore_file_path} diff --git a/research/cv/flownet2/scripts/run_compile_custom_ops.sh b/research/cv/flownet2/scripts/run_compile_custom_ops.sh new file mode 100644 index 0000000000000000000000000000000000000000..618255a887b81f018970218863aa3061962940ea --- /dev/null +++ b/research/cv/flownet2/scripts/run_compile_custom_ops.sh @@ -0,0 +1,22 @@ +#!/bin/bash +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +BASEPATH=$(cd "`dirname $0`" || exit; pwd) + +CUSTOM_OP_PATH="${BASEPATH}/../src/submodels/custom_ops" + +nvcc --shared -Xcompiler -fPIC -o "${CUSTOM_OP_PATH}/correlation.so" "${CUSTOM_OP_PATH}/correlation.cu" +nvcc --shared -Xcompiler -fPIC -o "${CUSTOM_OP_PATH}/resample2d.so" "${CUSTOM_OP_PATH}/resample2d.cu" diff --git a/research/cv/flownet2/scripts/run_eval_gpu.sh b/research/cv/flownet2/scripts/run_eval_gpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..fc4ef3681028b7a9063ef2fddff3b1620832ef06 --- /dev/null +++ b/research/cv/flownet2/scripts/run_eval_gpu.sh @@ -0,0 +1,42 @@ +#!/bin/bash +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +if [ $# -lt 4 ] +then + echo "Usage: \ + bash run_eval_gpu.sh [MpiSintelClean/MpiSintelFinal] [DATA_PATH] [MODEL_NAME] [CKPT_PATH] [DEVICE_ID]\ + " +exit 1 +fi + + +export DATA_NAME=$1 +export DATA_PATH=$2 +export MODEL_NAME=$3 +export CKPT_PATH=$4 +export DEVICE_ID=$5 + +BASEPATH=$(cd "`dirname $0`" || exit; pwd) + +ulimit -u unlimited + +CONFIG_PATH="${BASEPATH}/../default_config.yaml" +echo "config path is : ${CONFIG_PATH}" + + +python3 eval.py --config_path=$CONFIG_PATH --eval_data=$DATA_NAME \ + --eval_data_path=$DATA_PATH --model=$MODEL_NAME --eval_checkpoint_path=$CKPT_PATH \ + --device_id=$DEVICE_ID --device_target="GPU" > eval.log 2>&1 & diff --git a/research/cv/flownet2/scripts/run_train_gpu.sh b/research/cv/flownet2/scripts/run_train_gpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..e3c64c12ae81907a9bcfd79a3fd32049fe858011 --- /dev/null +++ b/research/cv/flownet2/scripts/run_train_gpu.sh @@ -0,0 +1,49 @@ +#!/bin/bash +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +if [ $# -lt 2 ] +then + echo "Usage: \ + bash run_train_gpu.sh [DEVICE_NUM] [VISIABLE_DEVICES(0,1,2,3,4,5,6,7)]\ + " +exit 1 +fi + +if [ $1 -lt 1 ] && [ $1 -gt 8 ] +then + echo "error: DEVICE_NUM=$1 is not in (1-8)" +exit 1 +fi + +export DEVICE_NUM=$1 +export RANK_SIZE=$1 + +BASEPATH=$(cd "`dirname $0`" || exit; pwd) + +ulimit -u unlimited +export CUDA_VISIBLE_DEVICES="$2" + +CONFIG_PATH="${BASEPATH}/../default_config.yaml" +echo "config path is : ${CONFIG_PATH}" + + +if [ $1 -gt 1 ] +then + mpirun -n $1 --allow-run-as-root --output-filename log_output --merge-stderr-to-stdout \ + python3 train.py --config_path=$CONFIG_PATH --run_distribute=1 > train.log 2>&1 & +else + python3 train.py --config_path=$CONFIG_PATH --run_distribute=0 > train.log 2>&1 & +fi diff --git a/research/cv/flownet2/src/dataset.py b/research/cv/flownet2/src/dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..c1326702b2983947ca5e41138d2bf89581ea0293 --- /dev/null +++ b/research/cv/flownet2/src/dataset.py @@ -0,0 +1,302 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +import random +import math +from os.path import join +from os.path import isfile +from glob import glob +import numpy as np +from src.model_utils.frame_utils import read_gen + + +class StaticRandomCrop(): + def __init__(self, image_size, crop_size): + self.th, self.tw = crop_size + h, w = image_size + self.h1 = random.randint(0, h - self.th) + self.w1 = random.randint(0, w - self.tw) + + def __call__(self, img): + return img[self.h1:(self.h1 + self.th), self.w1:(self.w1 + self.tw), :] + + +class StaticCenterCrop(): + def __init__(self, image_size, crop_size): + self.th, self.tw = crop_size + self.h, self.w = image_size + + def __call__(self, img): + return img[(self.h - self.th) // 2:(self.h + self.th) // 2, (self.w - self.tw) // 2:(self.w + self.tw) // 2, :] + + +class DistributedSampler(): + """ + Distributed sampler + """ + + def __init__(self, dataset, rank, group_size, shuffle=True, seed=0): + self.dataset = dataset + self.rank = rank + self.group_size = group_size + self.dataset_length = len(self.dataset) + self.num_samples = int(math.ceil(self.dataset_length * 1.0 / self.group_size)) + self.total_size = self.num_samples * self.group_size + self.shuffle = shuffle + self.seed = seed + + def __iter__(self): + if self.shuffle: + self.seed = (self.seed + 1) & 0xffffffff + np.random.seed(self.seed) + indices = np.random.permutation(self.dataset_length).tolist() + else: + indices = list(range(len(self.dataset_length))) + indices += indices[:(self.total_size - len(indices))] + assert len(indices) == self.total_size + indices = indices[self.rank::self.group_size] + assert len(indices) == self.num_samples + return iter(indices) + + def __len__(self): + return self.num_samples + + +class MpiSintel: + def __init__(self, crop_type='Random', crop_size=None, eval_size=None, + root='', dstype='clean', replicates=1): + self.crop_type = crop_type + if crop_size is None: + crop_size = [384, 512] + self.crop_size = crop_size + if eval_size is None: + eval_size = [256, 256] + self.render_size = eval_size + self.replicates = replicates + + flow_root = join(root, 'flow') + image_root = join(root, dstype) + + file_list = sorted(glob(join(flow_root, '*/*.flo'))) + + self.flow_list = [] + self.image_list = [] + + for file in file_list: + if 'test' in file: + # print file + continue + + fbase = file[len(flow_root) + 1:] + fprefix = fbase[:-8] + fnum = int(fbase[-8:-4]) + + img1 = join(image_root, fprefix + "%04d" % (fnum + 0) + '.png') + img2 = join(image_root, fprefix + "%04d" % (fnum + 1) + '.png') + + if not isfile(img1) or not isfile(img2) or not isfile(file): + continue + + self.image_list += [[img1, img2]] + self.flow_list += [file] + + self.size = len(self.image_list) + + self.frame_size = read_gen(self.image_list[0][0]).shape + + if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0] % 64) or ( + self.frame_size[1] % 64): + self.render_size[0] = ((self.frame_size[0]) // 64) * 64 + self.render_size[1] = ((self.frame_size[1]) // 64) * 64 + + # args.eval_size = self.render_size + + assert len(self.image_list) == len(self.flow_list) + + def __getitem__(self, index): + + index = index % self.size + + img1 = read_gen(self.image_list[index][0]) + img2 = read_gen(self.image_list[index][1]) + + flow = read_gen(self.flow_list[index]) + + images = [img1, img2] + image_size = img1.shape[:2] + + if self.crop_type == 'Random': + cropper = StaticRandomCrop(image_size, self.crop_size) + elif self.crop_type == 
'Center': + cropper = StaticCenterCrop(image_size, self.render_size) + images = list(map(cropper, images)) + flow = cropper(flow) + + images = np.array(images).transpose(3, 0, 1, 2) + flow = flow.transpose(2, 0, 1) + + images = images.astype(np.float32) + flow = flow.astype(np.float32) + + return images, flow + + def __len__(self): + return self.size * self.replicates + + +class MpiSintelClean(MpiSintel): + def __init__(self, crop_type, crop_size, eval_size, root, replicates=1): + super(MpiSintelClean, self).__init__(crop_type=crop_type, crop_size=crop_size, eval_size=eval_size, + root=root, dstype='clean', replicates=replicates) + + +class MpiSintelFinal(MpiSintel): + def __init__(self, crop_type, crop_size, eval_size, root, replicates=1): + super(MpiSintelFinal, self).__init__(crop_type=crop_type, crop_size=crop_size, eval_size=eval_size, + root=root, dstype='final', replicates=replicates) + + +# definite a DatasetGenerator +class ChairsSDHom: + def __init__(self, crop_type, crop_size, eval_size, root='/path/to/chairssdhom/data', dstype='train', replicates=1): + self.crop_type = crop_type + self.crop_size = crop_size + self.render_size = eval_size + self.replicates = replicates + + image1 = sorted(glob(join(root, dstype, 't0/*.png'))) + image2 = sorted(glob(join(root, dstype, 't1/*.png'))) + self.flow_list = sorted(glob(join(root, dstype, 'flow/*.pfm'))) + + assert len(image1) == len(self.flow_list) + + self.image_list = [] + for i in range(len(self.flow_list)): + im1 = image1[i] + im2 = image2[i] + self.image_list += [[im1, im2]] + + assert len(self.image_list) == len(self.flow_list) + + self.size = len(self.image_list) + + self.frame_size = read_gen(self.image_list[0][0]).shape + + if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0] % 64) or ( + self.frame_size[1] % 64): + self.render_size[0] = ((self.frame_size[0]) // 64) * 64 + self.render_size[1] = ((self.frame_size[1]) // 64) * 64 + + # args.eval_size = self.render_size + + def __getitem__(self, index): + index = index % self.size + + img1 = read_gen(self.image_list[index][0]) + img2 = read_gen(self.image_list[index][1]) + + flow = read_gen(self.flow_list[index]) + flow = flow[::-1, :, :] + + images = [img1, img2] + image_size = img1.shape[:2] + if self.crop_type == 'Random': + cropper = StaticRandomCrop(image_size, self.crop_size) + elif self.crop_type == 'Center': + cropper = StaticCenterCrop(image_size, self.render_size) + images = list(map(cropper, images)) + flow = cropper(flow) + + images = np.array(images).transpose(3, 0, 1, 2) + flow = flow.transpose(2, 0, 1) + + images = images.astype(np.float32) + flow = flow.astype(np.float32) + return images, flow + + def __len__(self): + return self.size * self.replicates + + +class ChairsSDHomTrain(ChairsSDHom): + def __init__(self, crop_type, crop_size, eval_size, root='', replicates=1): + super(ChairsSDHomTrain, self).__init__(crop_type=crop_type, crop_size=crop_size, eval_size=eval_size, + root=root, dstype='train', replicates=replicates) + + +class ChairsSDHomTest(ChairsSDHom): + def __init__(self, crop_type, crop_size, eval_size, root='', replicates=1): + super(ChairsSDHomTest, self).__init__(crop_type=crop_type, crop_size=crop_size, eval_size=eval_size, root=root, + dstype='test', replicates=replicates) + + +class FlyingChairs: + def __init__(self, crop_type, crop_size, eval_size, root='/path/to/FlyingChairs_release/data', replicates=1): + self.crop_type = crop_type + self.crop_size = crop_size + self.render_size = eval_size + self.replicates = 
replicates + + images = sorted(glob(join(root, '*.ppm'))) + + self.flow_list = sorted(glob(join(root, '*.flo'))) + + assert len(images) // 2 == len(self.flow_list) + + self.image_list = [] + for i in range(len(self.flow_list)): + im1 = images[2 * i] + im2 = images[2 * i + 1] + self.image_list += [[im1, im2]] + + assert len(self.image_list) == len(self.flow_list) + + self.size = len(self.image_list) + + self.frame_size = read_gen(self.image_list[0][0]).shape + + if (self.render_size[0] < 0) or (self.render_size[1] < 0) or (self.frame_size[0] % 64) or ( + self.frame_size[1] % 64): + self.render_size[0] = ((self.frame_size[0]) // 64) * 64 + self.render_size[1] = ((self.frame_size[1]) // 64) * 64 + + # args.eval_size = self.render_size + + def __getitem__(self, index): + index = index % self.size + + img1 = read_gen(self.image_list[index][0]) + img2 = read_gen(self.image_list[index][1]) + + flow = read_gen(self.flow_list[index]) + + images = [img1, img2] + image_size = img1.shape[:2] + if self.crop_type == 'Random': + cropper = StaticRandomCrop(image_size, self.crop_size) + elif self.crop_type == 'Center': + cropper = StaticCenterCrop(image_size, self.render_size) + images = list(map(cropper, images)) + flow = cropper(flow) + + images = np.array(images).transpose(3, 0, 1, 2) + flow = flow.transpose(2, 0, 1) + + images = images.astype(np.float32) + flow = flow.astype(np.float32) + return images, flow + + def __len__(self): + return self.size * self.replicates diff --git a/research/cv/flownet2/src/eval_callback.py b/research/cv/flownet2/src/eval_callback.py new file mode 100644 index 0000000000000000000000000000000000000000..9002a9341a64e3ee78317991298ca7140dd39540 --- /dev/null +++ b/research/cv/flownet2/src/eval_callback.py @@ -0,0 +1,93 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +"""Evaluation callback when training""" +import os +import stat +import time +from mindspore import save_checkpoint +from mindspore import log as logger +from mindspore.train.callback import Callback + +class EvalCallBack(Callback): + """ + Evaluation callback when training. + + Args: + eval_function (function): evaluation function. + eval_param_dict (dict): evaluation parameters' configure dict. + interval (int): run evaluation interval, default is 1. + eval_start_epoch (int): evaluation start epoch, default is 1. + save_best_ckpt (bool): Whether to save best checkpoint, default is True. + besk_ckpt_name (str): bast checkpoint name, default is `best.ckpt`. + metrics_name (str): evaluation metrics name, default is `acc`. 
+ + Returns: + None + + Examples: + >>> EvalCallBack(eval_function, eval_param_dict) + """ + + def __init__(self, eval_function, eval_param_dict, interval=1, eval_start_epoch=1, save_best_ckpt=True, + ckpt_directory="./", besk_ckpt_name="best.ckpt", metrics_name="MeanError"): + super(EvalCallBack, self).__init__() + self.eval_param_dict = eval_param_dict + self.eval_function = eval_function + self.eval_start_epoch = eval_start_epoch + if interval < 1: + raise ValueError("interval should >= 1.") + self.interval = interval + self.save_best_ckpt = save_best_ckpt + self.best_res = 10 + self.best_epoch = 0 + if not os.path.isdir(ckpt_directory): + os.makedirs(ckpt_directory) + self.bast_ckpt_path = os.path.join(ckpt_directory, besk_ckpt_name) + self.metrics_name = metrics_name + + def remove_ckpoint_file(self, file_name): + """Remove the specified checkpoint file from this checkpoint manager and also from the directory.""" + try: + os.chmod(file_name, stat.S_IWRITE) + os.remove(file_name) + except OSError: + logger.warning("OSError, failed to remove the older ckpt file %s.", file_name) + except ValueError: + logger.warning("ValueError, failed to remove the older ckpt file %s.", file_name) + + def epoch_end(self, run_context): + """Callback when epoch end.""" + cb_params = run_context.original_args() + cur_epoch = cb_params.cur_epoch_num + if cur_epoch >= self.eval_start_epoch and (cur_epoch - self.eval_start_epoch) % self.interval == 0: + eval_start = time.time() + res = self.eval_function(self.eval_param_dict) + eval_cost = time.time() - eval_start + print("epoch: {}, {}: {}, eval_cost:{:.2f}".format(cur_epoch, self.metrics_name, res, eval_cost), + flush=True) + if res <= self.best_res: + self.best_res = res + self.best_epoch = cur_epoch + print("update best result: {}".format(res), flush=True) + if self.save_best_ckpt: + if os.path.exists(self.bast_ckpt_path): + self.remove_ckpoint_file(self.bast_ckpt_path) + save_checkpoint(cb_params.train_network, self.bast_ckpt_path) + print("update best checkpoint at: {}".format(self.bast_ckpt_path), flush=True) + + def end(self, run_context): + print("End training, the best {0} is: {1}, the best {0} epoch is {2}".format(self.metrics_name, + self.best_res, + self.best_epoch), flush=True) diff --git a/research/cv/flownet2/src/metric.py b/research/cv/flownet2/src/metric.py new file mode 100644 index 0000000000000000000000000000000000000000..2e71ead2172bfa6fe9237bad87f3975f00e7902f --- /dev/null +++ b/research/cv/flownet2/src/metric.py @@ -0,0 +1,44 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +import mindspore as ms +import mindspore.ops as ops +import mindspore.nn as nn +from mindspore.nn import rearrange_inputs +import numpy as np + +class FlowNetEPE(nn.Metric): + def __init__(self): + super(FlowNetEPE, self).__init__() + self.norm_op = nn.Norm(axis=1) + self.mean = ops.ReduceMean() + + def clear(self): + self._abs_error_sum = [] + self._samples_num = 0 + + @rearrange_inputs + def update(self, *inputs): + if len(inputs) != 2: + raise ValueError('The MAE needs 2 inputs (y_pred, y), but got {}'.format(len(inputs))) + y_pred = self._convert_data(inputs[0]) + y = self._convert_data(inputs[1]) + abs_error_sum = self.mean(self.norm_op(ms.Tensor(y) - ms.Tensor(y_pred))) + self._abs_error_sum.append(abs_error_sum.asnumpy().sum()) + self._samples_num += y.shape[0] + + def eval(self): + if self._samples_num == 0: + raise RuntimeError('The total number of samples must not be 0.') + return np.array(self._abs_error_sum).mean() diff --git a/research/cv/flownet2/src/model_utils/ckpt_convert.py b/research/cv/flownet2/src/model_utils/ckpt_convert.py new file mode 100644 index 0000000000000000000000000000000000000000..f8fb9f7689dbc6f42abf2c4af8a16780f33d52b0 --- /dev/null +++ b/research/cv/flownet2/src/model_utils/ckpt_convert.py @@ -0,0 +1,40 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +import sys +from mindspore import Tensor +from mindspore.train.serialization import save_checkpoint +import torch + +def torch_to_mindspore(torch_file_path, mindspore_file_path): + ckpt = torch.load(torch_file_path, map_location=torch.device('cpu')) + mindspore_params_list = [] + par_dict = ckpt['state_dict'] + for name in par_dict: + print(name) + param_dict = {} + parameter = par_dict[name] + print(parameter.size()) + param_dict['name'] = name + param_dict['data'] = Tensor(parameter.numpy()) + mindspore_params_list.append(param_dict) + save_checkpoint(mindspore_params_list, mindspore_file_path) + print('convert pytorch ckpt file to mindspore ckpt file success !') + + +if __name__ == '__main__': + torch_ckpt_file_path = sys.argv[1] + mindspore_ckpt_file_path = sys.argv[2] + torch_to_mindspore(torch_ckpt_file_path, mindspore_ckpt_file_path) diff --git a/research/cv/flownet2/src/model_utils/config.py b/research/cv/flownet2/src/model_utils/config.py new file mode 100644 index 0000000000000000000000000000000000000000..79cf623c9348100fb931658a785ea0408193e55d --- /dev/null +++ b/research/cv/flownet2/src/model_utils/config.py @@ -0,0 +1,127 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +"""Parse arguments""" + +import os +import ast +import argparse +from pprint import pprint, pformat +import yaml + +class Config: + """ + Configuration namespace. Convert dictionary to members. + """ + def __init__(self, cfg_dict): + for k, v in cfg_dict.items(): + if isinstance(v, (list, tuple)): + setattr(self, k, [Config(x) if isinstance(x, dict) else x for x in v]) + else: + setattr(self, k, Config(v) if isinstance(v, dict) else v) + + def __str__(self): + return pformat(self.__dict__) + + def __repr__(self): + return self.__str__() + + +def parse_cli_to_yaml(parser, cfg, helper=None, choices=None, cfg_path="default_config.yaml"): + """ + Parse command line arguments to the configuration according to the default yaml. + + Args: + parser: Parent parser. + cfg: Base configuration. + helper: Helper description. + cfg_path: Path to the default yaml config. + """ + parser = argparse.ArgumentParser(description="[REPLACE THIS at config.py]", + parents=[parser]) + helper = {} if helper is None else helper + choices = {} if choices is None else choices + for item in cfg: + if not isinstance(cfg[item], list) and not isinstance(cfg[item], dict): + help_description = helper[item] if item in helper else "Please reference to {}".format(cfg_path) + choice = choices[item] if item in choices else None + if isinstance(cfg[item], bool): + parser.add_argument("--" + item, type=ast.literal_eval, default=cfg[item], choices=choice, + help=help_description) + else: + parser.add_argument("--" + item, type=type(cfg[item]), default=cfg[item], choices=choice, + help=help_description) + args = parser.parse_args() + return args + + +def parse_yaml(yaml_path): + """ + Parse the yaml config file. + + Args: + yaml_path: Path to the yaml config. + """ + with open(yaml_path, 'r') as fin: + try: + cfgs = yaml.load_all(fin.read(), Loader=yaml.FullLoader) + cfgs = [x for x in cfgs] + if len(cfgs) == 1: + cfg_helper = {} + cfg = cfgs[0] + cfg_choices = {} + elif len(cfgs) == 2: + cfg, cfg_helper = cfgs + cfg_choices = {} + elif len(cfgs) == 3: + cfg, cfg_helper, cfg_choices = cfgs + else: + raise ValueError("At most 3 docs (config, description for help, choices) are supported in config yaml") + print(cfg_helper) + except: + raise ValueError("Failed to parse yaml") + return cfg, cfg_helper, cfg_choices + + +def merge(args, cfg): + """ + Merge the base config from yaml file and command line arguments. + + Args: + args: Command line arguments. + cfg: Base configuration. + """ + args_var = vars(args) + for item in args_var: + cfg[item] = args_var[item] + return cfg + + +def get_config(): + """ + Get Config according to the yaml file and cli arguments. 
+ """ + parser = argparse.ArgumentParser(description="default name", add_help=False) + current_dir = os.path.dirname(os.path.abspath(__file__)) + parser.add_argument("--config_path", type=str, default=os.path.join(current_dir, "../../default_config.yaml"), + help="Config file path") + path_args, _ = parser.parse_known_args() + default, helper, choices = parse_yaml(path_args.config_path) + pprint(default) + args = parse_cli_to_yaml(parser=parser, cfg=default, helper=helper, choices=choices, cfg_path=path_args.config_path) + final_config = merge(args, default) + return Config(final_config) + +config = get_config() diff --git a/research/cv/flownet2/src/model_utils/frame_utils.py b/research/cv/flownet2/src/model_utils/frame_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..33c26443a2ea209c33c297b68d31088eda1c6885 --- /dev/null +++ b/research/cv/flownet2/src/model_utils/frame_utils.py @@ -0,0 +1,76 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +import re +from os.path import splitext +import numpy as np +from imageio import imread + +def read_gen(file_name): + ext = splitext(file_name)[-1] + if ext in ('.png', '.jpeg', '.ppm', '.jpg'): + im = imread(file_name) + if im.shape[2] > 3: + return im[:, :, :3] + return im + if ext in ('.bin', '.raw'): + return np.load(file_name) + if ext == '.flo': + return readFlow(file_name).astype(np.float32) + if ext == '.pfm': + return readPFM(file_name).astype(np.float32) + return [] + +def readFlow(fn): + """ Read .flo file in Middlebury format""" + with open(fn, 'rb') as f: + magic = np.fromfile(f, np.float32, count=1) + if magic != 202021.25: + print('Magic number incorrect. 
Invalid .flo file') + return None + w = np.fromfile(f, np.int32, count=1) + h = np.fromfile(f, np.int32, count=1) + data = np.fromfile(f, np.float32, count=2*int(w)*int(h)) + return np.resize(data, (int(h), int(w), 2)) + +def readPFM(file): + file = open(file, 'rb') + header = file.readline().rstrip() + if header in ('PF', b'PF'): + color = True + elif header in ('Pf', b'Pf'): + color = False + else: + raise Exception('Not a PFM file.') + wh = bytes.decode(file.readline()) + dim_match = re.match(r'^(\d+)\s+(\d+)$', wh.strip()) + + if dim_match: + width, height = map(int, dim_match.groups()) + else: + raise Exception('Malformed PFM header.') + scale = float(file.readline().decode().rstrip()) + if scale < 0: # little-endian + endian = '<' + scale = -scale + else: + endian = '>' # big-endian + + data = np.fromfile(file, endian + 'f') + shape = (height, width, 3) if color else (height, width) + + data = np.reshape(data, shape) + data = np.flipud(data) + return data diff --git a/research/cv/flownet2/src/model_utils/local_adapter.py b/research/cv/flownet2/src/model_utils/local_adapter.py new file mode 100644 index 0000000000000000000000000000000000000000..0e7c529384330b0bf45a1a3f7c85a7244f3fdc3f --- /dev/null +++ b/research/cv/flownet2/src/model_utils/local_adapter.py @@ -0,0 +1,36 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +"""Local adapter""" + +import os + +def get_device_id(): + device_id = os.getenv('DEVICE_ID', '0') + return int(device_id) + + +def get_device_num(): + device_num = os.getenv('RANK_SIZE', '1') + return int(device_num) + + +def get_rank_id(): + global_rank_id = os.getenv('RANK_ID', '0') + return int(global_rank_id) + + +def get_job_id(): + return "Local Job" diff --git a/research/cv/flownet2/src/model_utils/moxing_adapter.py b/research/cv/flownet2/src/model_utils/moxing_adapter.py new file mode 100644 index 0000000000000000000000000000000000000000..e68e3dfeb9ebd79bab3c099a26270b969135cc04 --- /dev/null +++ b/research/cv/flownet2/src/model_utils/moxing_adapter.py @@ -0,0 +1,116 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +"""Moxing adapter for ModelArts""" + +import os +import functools +from mindspore import context +from config import config + +_global_sync_count = 0 + +def get_device_id(): + device_id = os.getenv('DEVICE_ID', '0') + return int(device_id) + + +def get_device_num(): + device_num = os.getenv('RANK_SIZE', '1') + return int(device_num) + + +def get_rank_id(): + global_rank_id = os.getenv('RANK_ID', '0') + return int(global_rank_id) + + +def get_job_id(): + job_id = os.getenv('JOB_ID') + job_id = job_id if job_id != "" else "default" + return job_id + +def sync_data(from_path, to_path): + """ + Download data from remote obs to local directory if the first url is remote url and the second one is local path + Upload data from local directory to remote obs in contrast. + """ + import moxing as mox + import time + global _global_sync_count + sync_lock = "/tmp/copy_sync.lock" + str(_global_sync_count) + _global_sync_count += 1 + + # Each server contains 8 devices as most. + if get_device_id() % min(get_device_num(), 8) == 0 and not os.path.exists(sync_lock): + print("from path: ", from_path) + print("to path: ", to_path) + mox.file.copy_parallel(from_path, to_path) + print("===finish data synchronization===") + try: + os.mknod(sync_lock) + except IOError: + pass + print("===save flag===") + + while True: + if os.path.exists(sync_lock): + break + time.sleep(1) + + print("Finish sync data from {} to {}.".format(from_path, to_path)) + + +def moxing_wrapper(pre_process=None, post_process=None): + """ + Moxing wrapper to download dataset and upload outputs. + """ + def wrapper(run_func): + @functools.wraps(run_func) + def wrapped_func(*args, **kwargs): + # Download data from data_url + if config.enable_modelarts: + if config.data_url: + sync_data(config.data_url, config.data_path) + print("Dataset downloaded: ", os.listdir(config.data_path)) + if config.checkpoint_url: + sync_data(config.checkpoint_url, config.load_path) + print("Preload downloaded: ", os.listdir(config.load_path)) + if config.train_url: + sync_data(config.train_url, config.output_path) + print("Workspace downloaded: ", os.listdir(config.output_path)) + + context.set_context(save_graphs_path=os.path.join(config.output_path, str(get_rank_id()))) + config.device_num = get_device_num() + config.device_id = get_device_id() + if not os.path.exists(config.output_path): + os.makedirs(config.output_path) + + if pre_process: + pre_process() + + # Run the main function + run_func(*args, **kwargs) + + # Upload data to train_url + if config.enable_modelarts: + if post_process: + post_process() + + if config.train_url: + print("Start to copy output directory") + sync_data(config.output_path, config.train_url) + return wrapped_func + return wrapper diff --git a/research/cv/flownet2/src/model_utils/tools.py b/research/cv/flownet2/src/model_utils/tools.py new file mode 100644 index 0000000000000000000000000000000000000000..e78cba2782326373d45b55689b364f24c3d261fa --- /dev/null +++ b/research/cv/flownet2/src/model_utils/tools.py @@ -0,0 +1,24 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +from inspect import isclass + +def module_to_dict(module, exclude=None): + if exclude is None: + exclude = [] + module_dict = {} + for x in dir(module): + if isclass(getattr(module, x)) and x not in exclude and getattr(module, x) not in exclude: + module_dict[x] = getattr(module, x) + return module_dict diff --git a/research/cv/flownet2/src/models.py b/research/cv/flownet2/src/models.py new file mode 100644 index 0000000000000000000000000000000000000000..469b655acdf0489910563919272633ab285b6447 --- /dev/null +++ b/research/cv/flownet2/src/models.py @@ -0,0 +1,452 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +import mindspore.nn as nn +import mindspore.ops as ops + +from .submodels import FlowNetC +from .submodels import FlowNetS +from .submodels import FlowNetSD +from .submodels import FlowNetFusion +from .submodels.custom_ops.custom_ops import Resample2D as Resample2d +from .submodels.submodules import ChannelNorm +from .submodels.submodules import Upsample + +Parameter_count = 162, 518, 834 + +class FlowNet2(nn.Cell): + + def __init__(self, rgb_max=255, batchNorm=False, div_flow=20.): + super(FlowNet2, self).__init__() + self.batchNorm = batchNorm + self.div_flow = div_flow + self.rgb_max = rgb_max + + self.channelnorm = ChannelNorm(axis=1) + + # First Block (FlowNetC) + self.flownetc = FlowNetC.FlowNetC(batchNorm=self.batchNorm) + + self.upsample1 = Upsample(scale_factor=4, mode='bilinear') + + self.resample1 = Resample2d() + + # Block (FlowNetS1) + self.flownets_1 = FlowNetS.FlowNetS(batchNorm=self.batchNorm) + self.upsample2 = Upsample(scale_factor=4, mode='bilinear') + self.resample2 = Resample2d() + + # Block (FlowNetS2) + self.flownets_2 = FlowNetS.FlowNetS(batchNorm=self.batchNorm) + + # Block (FlowNetSD) + self.flownets_d = FlowNetSD.FlowNetSD(batchNorm=self.batchNorm) + + self.upsample3 = Upsample(scale_factor=4, mode='nearest') + self.upsample4 = Upsample(scale_factor=4, mode='nearest') + + self.resample3 = Resample2d() + + self.resample4 = Resample2d() + + # Block (FLowNetFusion) + self.flownetfusion = FlowNetFusion.FlowNetFusion(batchNorm=self.batchNorm) + + self.concat_op = ops.Concat(1) + self.mean = ops.ReduceMean() + + for c in self.cells(): + if isinstance(c, nn.Conv2d): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + if isinstance(c, nn.Conv2dTranspose): + if c.bias_init is not None: + c.bias_init = 'Uniform' 
+ c.weight_init = 'XavierUniform' + + + def construct(self, inputs): + rgb_mean = inputs.view(inputs.shape[:2] + (-1,)).mean(axis=-1).view(inputs.shape[:2] + (1, 1, 1,)) + + x = (inputs - rgb_mean) / self.rgb_max + x1 = x[:, :, 0, :, :] + x2 = x[:, :, 1, :, :] + + x = self.concat_op((x1, x2)) + + # flownetc + flownetc_flow2 = self.flownetc(x)[0] + flownetc_flow = self.upsample1(flownetc_flow2 * self.div_flow) + + # warp img1 to img0; magnitude of diff between img0 and and warped_img1, + resampled_img1 = self.resample1(x[:, 3:, :, :], flownetc_flow) + diff_img0 = x[:, :3, :, :] - resampled_img1 + norm_diff_img0 = self.channelnorm(diff_img0) + + # concat img0, img1, img1->img0, flow, diff-mag ; + concat1 = self.concat_op((x, resampled_img1, flownetc_flow / self.div_flow, norm_diff_img0)) + + # flownets1 + flownets1_flow2 = self.flownets_1(concat1)[0] + flownets1_flow = self.upsample2(flownets1_flow2 * self.div_flow) + + # warp img1 to img0 using flownets1; magnitude of diff between img0 and and warped_img1 + resampled_img1 = self.resample2(x[:, 3:, :, :], flownets1_flow) + diff_img0 = x[:, :3, :, :] - resampled_img1 + norm_diff_img0 = self.channelnorm(diff_img0) + + # concat img0, img1, img1->img0, flow, diff-mag + concat2 = self.concat_op((x, resampled_img1, flownets1_flow / self.div_flow, norm_diff_img0)) + + # flownets2 + flownets2_flow2 = self.flownets_2(concat2)[0] + flownets2_flow = self.upsample4(flownets2_flow2 * self.div_flow) + norm_flownets2_flow = self.channelnorm(flownets2_flow) + + diff_flownets2_flow = self.resample4(x[:, 3:, :, :], flownets2_flow) + + diff_flownets2_img1 = self.channelnorm((x[:, :3, :, :] - diff_flownets2_flow)) + + # flownetsd + flownetsd_flow2 = self.flownets_d(x)[0] + flownetsd_flow = self.upsample3(flownetsd_flow2 / self.div_flow) + norm_flownetsd_flow = self.channelnorm(flownetsd_flow) + + diff_flownetsd_flow = self.resample3(x[:, 3:, :, :], flownetsd_flow) + + diff_flownetsd_img1 = self.channelnorm((x[:, :3, :, :] - diff_flownetsd_flow)) + + # concat img1 flownetsd, flownets2, norm_flownetsd, norm_flownets2, diff_flownetsd_img1, diff_flownets2_img1 + concat3 = self.concat_op( + (x[:, :3, :, :], flownetsd_flow, flownets2_flow, norm_flownetsd_flow, norm_flownets2_flow, + diff_flownetsd_img1, diff_flownets2_img1)) + flownetfusion_flow = self.flownetfusion(concat3) + + return flownetfusion_flow + + +class FlowNet2C(FlowNetC.FlowNetC): + def __init__(self, rgb_max, batchNorm=False, div_flow=20): + super(FlowNet2C, self).__init__(batchNorm=batchNorm, div_flow=div_flow) + self.rgb_max = rgb_max + self.concat_op = ops.Concat(1) + self.mean = ops.ReduceMean() + + def construct(self, inputs): + rgb_mean = self.mean(inputs.view(inputs.shape[:2] + (-1,)), -1).view(inputs.shape[:2] + (1, 1, 1,)) + + x = (inputs - rgb_mean) / self.rgb_max + x1 = x[:, :, 0, :, :] + x2 = x[:, :, 1, :, :] + + # FlownetC top input stream + out_conv1a = self.conv1(x1) + out_conv2a = self.conv2(out_conv1a) + out_conv3a = self.conv3(out_conv2a) + + # FlownetC bottom input stream + out_conv1b = self.conv1(x2) + + out_conv2b = self.conv2(out_conv1b) + out_conv3b = self.conv3(out_conv2b) + + # Merge streams + out_corr = self.corr(out_conv3a, out_conv3b) # False + out_corr = self.corr_activation(out_corr) + + # Redirect top input stream and concatenate + out_conv_redir = self.conv_redir(out_conv3a) + + in_conv3_1 = self.concat_op((out_conv_redir, out_corr)) + + # Merged conv layers + out_conv3_1 = self.conv3_1(in_conv3_1) + + out_conv4 = self.conv4_1(self.conv4(out_conv3_1)) + + out_conv5 = 
self.conv5_1(self.conv5(out_conv4)) + out_conv6 = self.conv6_1(self.conv6(out_conv5)) + + flow6 = self.predict_flow6(out_conv6) + flow6_up = self.upsampled_flow6_to_5(flow6) + out_deconv5 = self.deconv5(out_conv6) + + concat5 = self.concat_op((out_conv5, out_deconv5, flow6_up)) + + flow5 = self.predict_flow5(concat5) + flow5_up = self.upsampled_flow5_to_4(flow5) + out_deconv4 = self.deconv4(concat5) + concat4 = self.concat_op((out_conv4, out_deconv4, flow5_up)) + + flow4 = self.predict_flow4(concat4) + flow4_up = self.upsampled_flow4_to_3(flow4) + out_deconv3 = self.deconv3(concat4) + concat3 = self.concat_op((out_conv3_1, out_deconv3, flow4_up)) + + flow3 = self.predict_flow3(concat3) + flow3_up = self.upsampled_flow3_to_2(flow3) + out_deconv2 = self.deconv2(concat3) + concat2 = self.concat_op((out_conv2a, out_deconv2, flow3_up)) + + flow2 = self.predict_flow2(concat2) + + if self.training: + return flow2, flow3, flow4, flow5, flow6 + return self.upsample1(flow2 * self.div_flow) + + +class FlowNet2S(FlowNetS.FlowNetS): + def __init__(self, rgb_max=255, batchNorm=False, div_flow=20): + super(FlowNet2S, self).__init__(input_channels=6, batchNorm=batchNorm) + self.rgb_max = rgb_max + self.div_flow = div_flow + self.concat_op = ops.Concat(1) + self.mean = ops.ReduceMean() + + def construct(self, inputs): + rgb_mean = self.mean(inputs.view(inputs.shape[:2] + (-1,)), -1).view(inputs.shape[:2] + (1, 1, 1,)) + x = (inputs - rgb_mean) / self.rgb_max + x = self.concat_op((x[:, :, 0, :, :], x[:, :, 1, :, :])) + + out_conv1 = self.conv1(x) + + out_conv2 = self.conv2(out_conv1) + out_conv3 = self.conv3_1(self.conv3(out_conv2)) + out_conv4 = self.conv4_1(self.conv4(out_conv3)) + out_conv5 = self.conv5_1(self.conv5(out_conv4)) + out_conv6 = self.conv6_1(self.conv6(out_conv5)) + + flow6 = self.predict_flow6(out_conv6) + flow6_up = self.upsampled_flow6_to_5(flow6) + out_deconv5 = self.deconv5(out_conv6) + + concat5 = self.concat_op((out_conv5, out_deconv5, flow6_up)) + flow5 = self.predict_flow5(concat5) + flow5_up = self.upsampled_flow5_to_4(flow5) + out_deconv4 = self.deconv4(concat5) + + concat4 = self.concat_op((out_conv4, out_deconv4, flow5_up)) + flow4 = self.predict_flow4(concat4) + flow4_up = self.upsampled_flow4_to_3(flow4) + out_deconv3 = self.deconv3(concat4) + + concat3 = self.concat_op((out_conv3, out_deconv3, flow4_up)) + flow3 = self.predict_flow3(concat3) + flow3_up = self.upsampled_flow3_to_2(flow3) + out_deconv2 = self.deconv2(concat3) + + concat2 = self.concat_op((out_conv2, out_deconv2, flow3_up)) + flow2 = self.predict_flow2(concat2) + + if self.training: + return flow2, flow3, flow4, flow5, flow6 + return self.upsample1(flow2 * self.div_flow) + + +class FlowNet2SD(FlowNetSD.FlowNetSD): + def __init__(self, rgb_max=255, batchNorm=False, div_flow=20): + super(FlowNet2SD, self).__init__(batchNorm=batchNorm) + self.rgb_max = rgb_max + self.div_flow = div_flow + self.concat_op = ops.Concat(1) + self.mean = ops.ReduceMean() + + def construct(self, inputs): + rgb_mean = self.mean(inputs.view(inputs.shape[:2] + (-1,)), -1).view(inputs.shape[:2] + (1, 1, 1,)) + x = (inputs - rgb_mean) / self.rgb_max + x = self.concat_op((x[:, :, 0, :, :], x[:, :, 1, :, :])) + + out_conv0 = self.conv0(x) + out_conv1 = self.conv1_1(self.conv1(out_conv0)) + out_conv2 = self.conv2_1(self.conv2(out_conv1)) + + out_conv3 = self.conv3_1(self.conv3(out_conv2)) + out_conv4 = self.conv4_1(self.conv4(out_conv3)) + out_conv5 = self.conv5_1(self.conv5(out_conv4)) + out_conv6 = self.conv6_1(self.conv6(out_conv5)) + + flow6 
= self.predict_flow6(out_conv6) + flow6_up = self.upsampled_flow6_to_5(flow6) + out_deconv5 = self.deconv5(out_conv6) + + concat5 = self.concat_op((out_conv5, out_deconv5, flow6_up)) + out_interconv5 = self.inter_conv5(concat5) + flow5 = self.predict_flow5(out_interconv5) + + flow5_up = self.upsampled_flow5_to_4(flow5) + out_deconv4 = self.deconv4(concat5) + + concat4 = self.concat_op((out_conv4, out_deconv4, flow5_up)) + out_interconv4 = self.inter_conv4(concat4) + flow4 = self.predict_flow4(out_interconv4) + flow4_up = self.upsampled_flow4_to_3(flow4) + out_deconv3 = self.deconv3(concat4) + + concat3 = self.concat_op((out_conv3, out_deconv3, flow4_up)) + out_interconv3 = self.inter_conv3(concat3) + flow3 = self.predict_flow3(out_interconv3) + flow3_up = self.upsampled_flow3_to_2(flow3) + out_deconv2 = self.deconv2(concat3) + + concat2 = self.concat_op((out_conv2, out_deconv2, flow3_up)) + out_interconv2 = self.inter_conv2(concat2) + flow2 = self.predict_flow2(out_interconv2) + + if self.training: + return flow2, flow3, flow4, flow5, flow6 + return self.upsample1(flow2 * self.div_flow) + + +class FlowNet2CS(nn.Cell): + + def __init__(self, rgb_max=255, batchNorm=False, div_flow=20.): + super(FlowNet2CS, self).__init__() + self.batchNorm = batchNorm + self.div_flow = div_flow + self.rgb_max = rgb_max + + self.channelnorm = ChannelNorm(axis=1) + + # First Block (FlowNetC) + self.flownetc = FlowNetC.FlowNetC(batchNorm=self.batchNorm) + self.upsample1 = Upsample(scale_factor=4, mode='bilinear') + + self.resample1 = Resample2d() + + # Block (FlowNetS1) + self.flownets_1 = FlowNetS.FlowNetS(batchNorm=self.batchNorm) + self.upsample2 = Upsample(scale_factor=4, mode='bilinear') + + self.concat_op = ops.Concat(1) + self.mean = ops.ReduceMean() + + for c in self.cells(): + if isinstance(c, nn.Conv2d): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + if isinstance(c, nn.Conv2dTranspose): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + def construct(self, inputs): + rgb_mean = self.mean(inputs.view(inputs.shape[:2] + (-1,)), -1).view(inputs.shape[:2] + (1, 1, 1,)) + + x = (inputs - rgb_mean) / self.rgb_max + x1 = x[:, :, 0, :, :] + x2 = x[:, :, 1, :, :] + x = self.concat_op((x1, x2)) + + # flownetc + flownetc_flow2 = self.flownetc(x)[0] + flownetc_flow = self.upsample1(flownetc_flow2 * self.div_flow) + + # warp img1 to img0; magnitude of diff between img0 and and warped_img1, + resampled_img1 = self.resample1(x[:, 3:, :, :], flownetc_flow) + diff_img0 = x[:, :3, :, :] - resampled_img1 + norm_diff_img0 = self.channelnorm(diff_img0) + + # concat img0, img1, img1->img0, flow, diff-mag ; + concat1 = self.concat_op((x, resampled_img1, flownetc_flow / self.div_flow, norm_diff_img0)) + + # flownets1 + flownets1_flow2 = self.flownets_1(concat1)[0] + flownets1_flow = self.upsample2(flownets1_flow2 * self.div_flow) + + return flownets1_flow + + +class FlowNet2CSS(nn.Cell): + + def __init__(self, rgb_max=255, batchNorm=False, div_flow=20.): + super(FlowNet2CSS, self).__init__() + self.batchNorm = batchNorm + self.div_flow = div_flow + self.rgb_max = rgb_max + + self.channelnorm = ChannelNorm(axis=1) + + # First Block (FlowNetC) + self.flownetc = FlowNetC.FlowNetC(batchNorm=self.batchNorm) + self.upsample1 = Upsample(scale_factor=4, mode='bilinear') + + self.resample1 = Resample2d() + + # Block (FlowNetS1) + self.flownets_1 = FlowNetS.FlowNetS(batchNorm=self.batchNorm) + self.upsample2 = Upsample(scale_factor=4, 
mode='bilinear') + + self.resample2 = Resample2d() + + # Block (FlowNetS2) + self.flownets_2 = FlowNetS.FlowNetS(batchNorm=self.batchNorm) + self.upsample3 = Upsample(scale_factor=4, mode='nearest') + + self.concat_op = ops.Concat(1) + self.mean = ops.ReduceMean() + + for c in self.cells(): + if isinstance(c, nn.Conv2d): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + if isinstance(c, nn.Conv2dTranspose): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + + def construct(self, inputs): + rgb_mean = self.mean(inputs.view(inputs.shape[:2] + (-1,)), -1).view(inputs.shape[:2] + (1, 1, 1,)) + + x = (inputs - rgb_mean) / self.rgb_max + x1 = x[:, :, 0, :, :] + x2 = x[:, :, 1, :, :] + x = self.concat_op((x1, x2)) + + # flownetc + flownetc_flow2 = self.flownetc(x)[0] + flownetc_flow = self.upsample1(flownetc_flow2 * self.div_flow) + + # warp img1 to img0; magnitude of diff between img0 and and warped_img1, + resampled_img1 = self.resample1(x[:, 3:, :, :], flownetc_flow) + diff_img0 = x[:, :3, :, :] - resampled_img1 + norm_diff_img0 = self.channelnorm(diff_img0) + + # concat img0, img1, img1->img0, flow, diff-mag ; + concat1 = self.concat_op((x, resampled_img1, flownetc_flow / self.div_flow, norm_diff_img0)) + + # flownets1 + flownets1_flow2 = self.flownets_1(concat1)[0] + flownets1_flow = self.upsample2(flownets1_flow2 * self.div_flow) + + # warp img1 to img0 using flownets1; magnitude of diff between img0 and and warped_img1 + resampled_img1 = self.resample2(x[:, 3:, :, :], flownets1_flow) + diff_img0 = x[:, :3, :, :] - resampled_img1 + norm_diff_img0 = self.channelnorm(diff_img0) + + # concat img0, img1, img1->img0, flow, diff-mag + concat2 = self.concat_op((x, resampled_img1, flownets1_flow / self.div_flow, norm_diff_img0)) + + # flownets2 + flownets2_flow2 = self.flownets_2(concat2)[0] + flownets2_flow = self.upsample3(flownets2_flow2 * self.div_flow) + + return flownets2_flow diff --git a/research/cv/flownet2/src/submodels/FlowNetC.py b/research/cv/flownet2/src/submodels/FlowNetC.py new file mode 100644 index 0000000000000000000000000000000000000000..66f16e0df53be4d22c20d10d4c7b0357274a51ed --- /dev/null +++ b/research/cv/flownet2/src/submodels/FlowNetC.py @@ -0,0 +1,139 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +import mindspore.nn as nn +import mindspore.ops as ops +from .custom_ops.custom_ops import Correlation +from .submodules import conv +from .submodules import predict_flow +from .submodules import deconv +from .submodules import Upsample + + +Parameter_count = 39, 175, 298 + + +class FlowNetC(nn.Cell): + def __init__(self, batchNorm=True, div_flow=20): + super(FlowNetC, self).__init__() + + self.batchNorm = batchNorm + self.div_flow = div_flow + + self.conv1 = conv(self.batchNorm, 3, 64, kernel_size=7, stride=2) + self.conv2 = conv(self.batchNorm, 64, 128, kernel_size=5, stride=2) + self.conv3 = conv(self.batchNorm, 128, 256, kernel_size=5, stride=2) + self.conv_redir = conv(self.batchNorm, 256, 32, kernel_size=1, stride=1) + + self.corr = Correlation(pad_size=20, kernel_size=1, max_displacement=20, stride1=1, stride2=2) + + self.corr_activation = nn.LeakyReLU(0.1) + self.conv3_1 = conv(self.batchNorm, 473, 256) + self.conv4 = conv(self.batchNorm, 256, 512, stride=2) + self.conv4_1 = conv(self.batchNorm, 512, 512) + self.conv5 = conv(self.batchNorm, 512, 512, stride=2) + self.conv5_1 = conv(self.batchNorm, 512, 512) + self.conv6 = conv(self.batchNorm, 512, 1024, stride=2) + self.conv6_1 = conv(self.batchNorm, 1024, 1024) + + self.deconv5 = deconv(1024, 512) + self.deconv4 = deconv(1026, 256) + self.deconv3 = deconv(770, 128) + self.deconv2 = deconv(386, 64) + + self.predict_flow6 = predict_flow(1024) + self.predict_flow5 = predict_flow(1026) + self.predict_flow4 = predict_flow(770) + self.predict_flow3 = predict_flow(386) + self.predict_flow2 = predict_flow(194) + + self.upsampled_flow6_to_5 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=True) + self.upsampled_flow5_to_4 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=True) + self.upsampled_flow4_to_3 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=True) + self.upsampled_flow3_to_2 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=True) + + self.concat_op = ops.Concat(1) + + for c in self.cells(): + if isinstance(c, nn.Conv2d): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + if isinstance(c, nn.Conv2dTranspose): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + self.upsample1 = Upsample(scale_factor=4, mode='bilinear') + + def construct(self, x): + x1 = x[:, 0:3, :, :] + x2 = x[:, 3::, :, :] + + # FlownetC top input stream + out_conv1a = self.conv1(x1) + out_conv2a = self.conv2(out_conv1a) + out_conv3a = self.conv3(out_conv2a) + + # FlownetC bottom input stream + out_conv1b = self.conv1(x2) + + out_conv2b = self.conv2(out_conv1b) + out_conv3b = self.conv3(out_conv2b) + + # Merge streams + out_corr = self.corr(out_conv3a, + out_conv3b) # Correlation(pad_size=20, kernel_size=1, max_displacement=20, stride1=1, stride2=2) + out_corr = self.corr_activation(out_corr) # nn.LeakyReLU(0.1) + + # Redirect top input stream and concatenate + out_conv_redir = self.conv_redir(out_conv3a) # conv(self.batchNorm, 256, 32, kernel_size=1, stride=1) + + in_conv3_1 = self.concat_op((out_conv_redir, out_corr)) + + # Merged conv layers + out_conv3_1 = self.conv3_1(in_conv3_1) + + out_conv4 = self.conv4_1(self.conv4(out_conv3_1)) + + out_conv5 = self.conv5_1(self.conv5(out_conv4)) + out_conv6 = self.conv6_1(self.conv6(out_conv5)) + + flow6 = self.predict_flow6(out_conv6) + flow6_up = self.upsampled_flow6_to_5(flow6) + out_deconv5 =
self.deconv5(out_conv6) + + concat5 = self.concat_op((out_conv5, out_deconv5, flow6_up)) + + flow5 = self.predict_flow5(concat5) + flow5_up = self.upsampled_flow5_to_4(flow5) + out_deconv4 = self.deconv4(concat5) + concat4 = self.concat_op((out_conv4, out_deconv4, flow5_up)) + + flow4 = self.predict_flow4(concat4) + flow4_up = self.upsampled_flow4_to_3(flow4) + out_deconv3 = self.deconv3(concat4) + concat3 = self.concat_op((out_conv3_1, out_deconv3, flow4_up)) + + flow3 = self.predict_flow3(concat3) + flow3_up = self.upsampled_flow3_to_2(flow3) + out_deconv2 = self.deconv2(concat3) + concat2 = self.concat_op((out_conv2a, out_deconv2, flow3_up)) + + flow2 = self.predict_flow2(concat2) + + if self.training: + return flow2, flow3, flow4, flow5, flow6 + return flow2, None diff --git a/research/cv/flownet2/src/submodels/FlowNetFusion.py b/research/cv/flownet2/src/submodels/FlowNetFusion.py new file mode 100644 index 0000000000000000000000000000000000000000..2440eac5c9f711babad59847140d8007c3456286 --- /dev/null +++ b/research/cv/flownet2/src/submodels/FlowNetFusion.py @@ -0,0 +1,82 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +from mindspore import nn +import mindspore.ops as ops +from .submodules import conv +from .submodules import deconv +from .submodules import i_conv +from .submodules import predict_flow + +Parameter_count = 581, 226 + + +class FlowNetFusion(nn.Cell): + def __init__(self, batchNorm=True): + super(FlowNetFusion, self).__init__() + + self.batchNorm = batchNorm + self.conv0 = conv(self.batchNorm, 11, 64) + self.conv1 = conv(self.batchNorm, 64, 64, stride=2) + self.conv1_1 = conv(self.batchNorm, 64, 128) + self.conv2 = conv(self.batchNorm, 128, 128, stride=2) + self.conv2_1 = conv(self.batchNorm, 128, 128) + + self.deconv1 = deconv(128, 32) + self.deconv0 = deconv(162, 16) + + self.inter_conv1 = i_conv(self.batchNorm, 162, 32) + self.inter_conv0 = i_conv(self.batchNorm, 82, 16) + + self.predict_flow2 = predict_flow(128) + self.predict_flow1 = predict_flow(32) + self.predict_flow0 = predict_flow(16) + + self.upsampled_flow2_to_1 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=True) + self.upsampled_flow1_to_0 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=True) + + self.concat_op = ops.Concat(1) + + for c in self.cells(): + if isinstance(c, nn.Conv2d): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + if isinstance(c, nn.Conv2dTranspose): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + def construct(self, x): + out_conv0 = self.conv0(x) + out_conv1 = self.conv1_1(self.conv1(out_conv0)) + out_conv2 = self.conv2_1(self.conv2(out_conv1)) + + flow2 = self.predict_flow2(out_conv2) + flow2_up = self.upsampled_flow2_to_1(flow2) + out_deconv1 = self.deconv1(out_conv2) + + concat1 = self.concat_op((out_conv1, out_deconv1, 
flow2_up)) + out_interconv1 = self.inter_conv1(concat1) + flow1 = self.predict_flow1(out_interconv1) + flow1_up = self.upsampled_flow1_to_0(flow1) + out_deconv0 = self.deconv0(concat1) + + concat0 = self.concat_op((out_conv0, out_deconv0, flow1_up)) + out_interconv0 = self.inter_conv0(concat0) + flow0 = self.predict_flow0(out_interconv0) + + return flow0 diff --git a/research/cv/flownet2/src/submodels/FlowNetS.py b/research/cv/flownet2/src/submodels/FlowNetS.py new file mode 100644 index 0000000000000000000000000000000000000000..0ee92ae23826af80cf21e8e4ba6f781043c5fccc --- /dev/null +++ b/research/cv/flownet2/src/submodels/FlowNetS.py @@ -0,0 +1,104 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +from mindspore import nn +import mindspore.ops as ops +from .submodules import conv +from .submodules import deconv +from .submodules import predict_flow +from .submodules import Upsample + + +class FlowNetS(nn.Cell): + def __init__(self, input_channels=12, batchNorm=True): + super(FlowNetS, self).__init__() + + self.batchNorm = batchNorm + self.conv1 = conv(self.batchNorm, input_channels, 64, kernel_size=7, stride=2) + self.conv2 = conv(self.batchNorm, 64, 128, kernel_size=5, stride=2) + self.conv3 = conv(self.batchNorm, 128, 256, kernel_size=5, stride=2) + self.conv3_1 = conv(self.batchNorm, 256, 256) + self.conv4 = conv(self.batchNorm, 256, 512, stride=2) + self.conv4_1 = conv(self.batchNorm, 512, 512) + self.conv5 = conv(self.batchNorm, 512, 512, stride=2) + self.conv5_1 = conv(self.batchNorm, 512, 512) + self.conv6 = conv(self.batchNorm, 512, 1024, stride=2) + self.conv6_1 = conv(self.batchNorm, 1024, 1024) + + self.deconv5 = deconv(1024, 512) + self.deconv4 = deconv(1026, 256) + self.deconv3 = deconv(770, 128) + self.deconv2 = deconv(386, 64) + + self.predict_flow6 = predict_flow(1024) + self.predict_flow5 = predict_flow(1026) + self.predict_flow4 = predict_flow(770) + self.predict_flow3 = predict_flow(386) + self.predict_flow2 = predict_flow(194) + + self.upsampled_flow6_to_5 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=False) + self.upsampled_flow5_to_4 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=False) + self.upsampled_flow4_to_3 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=False) + self.upsampled_flow3_to_2 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=False) + + self.concat_op = ops.Concat(1) + + for c in self.cells(): + if isinstance(c, nn.Conv2d): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + if isinstance(c, nn.Conv2dTranspose): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + self.upsample1 = Upsample(scale_factor=4, mode='bilinear') + + def construct(self, x): + out_conv1 = self.conv1(x) + + out_conv2 = self.conv2(out_conv1) + out_conv3 = 
self.conv3_1(self.conv3(out_conv2)) + out_conv4 = self.conv4_1(self.conv4(out_conv3)) + out_conv5 = self.conv5_1(self.conv5(out_conv4)) + out_conv6 = self.conv6_1(self.conv6(out_conv5)) + + flow6 = self.predict_flow6(out_conv6) + flow6_up = self.upsampled_flow6_to_5(flow6) + out_deconv5 = self.deconv5(out_conv6) + + concat5 = self.concat_op((out_conv5, out_deconv5, flow6_up)) + flow5 = self.predict_flow5(concat5) + flow5_up = self.upsampled_flow5_to_4(flow5) + out_deconv4 = self.deconv4(concat5) + + concat4 = self.concat_op((out_conv4, out_deconv4, flow5_up)) + flow4 = self.predict_flow4(concat4) + flow4_up = self.upsampled_flow4_to_3(flow4) + out_deconv3 = self.deconv3(concat4) + + concat3 = self.concat_op((out_conv3, out_deconv3, flow4_up)) + flow3 = self.predict_flow3(concat3) + flow3_up = self.upsampled_flow3_to_2(flow3) + out_deconv2 = self.deconv2(concat3) + + concat2 = self.concat_op((out_conv2, out_deconv2, flow3_up)) + flow2 = self.predict_flow2(concat2) + + if self.training: + return flow2, flow3, flow4, flow5, flow6 + return flow2, None diff --git a/research/cv/flownet2/src/submodels/FlowNetSD.py b/research/cv/flownet2/src/submodels/FlowNetSD.py new file mode 100644 index 0000000000000000000000000000000000000000..ba64a6a2671009c1ffb053c2f90f3cd5e5b2cdd1 --- /dev/null +++ b/research/cv/flownet2/src/submodels/FlowNetSD.py @@ -0,0 +1,122 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +from mindspore import nn +import mindspore.ops as ops +from .submodules import conv +from .submodules import deconv +from .submodules import i_conv +from .submodules import predict_flow +from .submodules import Upsample + +Parameter_count = 45, 371, 666 + + +class FlowNetSD(nn.Cell): + def __init__(self, batchNorm=True): + super(FlowNetSD, self).__init__() + + self.batchNorm = batchNorm + self.conv0 = conv(self.batchNorm, 6, 64) + self.conv1 = conv(self.batchNorm, 64, 64, stride=2) + self.conv1_1 = conv(self.batchNorm, 64, 128) + self.conv2 = conv(self.batchNorm, 128, 128, stride=2) + self.conv2_1 = conv(self.batchNorm, 128, 128) + self.conv3 = conv(self.batchNorm, 128, 256, stride=2) + self.conv3_1 = conv(self.batchNorm, 256, 256) + self.conv4 = conv(self.batchNorm, 256, 512, stride=2) + self.conv4_1 = conv(self.batchNorm, 512, 512) + self.conv5 = conv(self.batchNorm, 512, 512, stride=2) + self.conv5_1 = conv(self.batchNorm, 512, 512) + self.conv6 = conv(self.batchNorm, 512, 1024, stride=2) + self.conv6_1 = conv(self.batchNorm, 1024, 1024) + + self.deconv5 = deconv(1024, 512) + self.deconv4 = deconv(1026, 256) + self.deconv3 = deconv(770, 128) + self.deconv2 = deconv(386, 64) + + self.inter_conv5 = i_conv(self.batchNorm, 1026, 512) + self.inter_conv4 = i_conv(self.batchNorm, 770, 256) + self.inter_conv3 = i_conv(self.batchNorm, 386, 128) + self.inter_conv2 = i_conv(self.batchNorm, 194, 64) + + self.predict_flow6 = predict_flow(1024) + self.predict_flow5 = predict_flow(512) + self.predict_flow4 = predict_flow(256) + self.predict_flow3 = predict_flow(128) + self.predict_flow2 = predict_flow(64) + + self.upsampled_flow6_to_5 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=True) + self.upsampled_flow5_to_4 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=True) + self.upsampled_flow4_to_3 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=True) + self.upsampled_flow3_to_2 = nn.Conv2dTranspose(2, 2, 4, 2, pad_mode='pad', padding=1, has_bias=True) + + self.concat_op = ops.Concat(1) + + for c in self.cells(): + if isinstance(c, nn.Conv2d): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + if isinstance(c, nn.Conv2dTranspose): + if c.bias_init is not None: + c.bias_init = 'Uniform' + c.weight_init = 'XavierUniform' + + self.upsample1 = Upsample(scale_factor=4, mode='bilinear') + + def construct(self, x): + # print(x.shape) + out_conv0 = self.conv0(x) + out_conv1 = self.conv1_1(self.conv1(out_conv0)) + out_conv2 = self.conv2_1(self.conv2(out_conv1)) + + out_conv3 = self.conv3_1(self.conv3(out_conv2)) + out_conv4 = self.conv4_1(self.conv4(out_conv3)) + out_conv5 = self.conv5_1(self.conv5(out_conv4)) + out_conv6 = self.conv6_1(self.conv6(out_conv5)) + + flow6 = self.predict_flow6(out_conv6) + flow6_up = self.upsampled_flow6_to_5(flow6) + out_deconv5 = self.deconv5(out_conv6) + + concat5 = self.concat_op((out_conv5, out_deconv5, flow6_up)) + out_interconv5 = self.inter_conv5(concat5) + flow5 = self.predict_flow5(out_interconv5) + + flow5_up = self.upsampled_flow5_to_4(flow5) + out_deconv4 = self.deconv4(concat5) + + concat4 = self.concat_op((out_conv4, out_deconv4, flow5_up)) + out_interconv4 = self.inter_conv4(concat4) + flow4 = self.predict_flow4(out_interconv4) + flow4_up = self.upsampled_flow4_to_3(flow4) + out_deconv3 = self.deconv3(concat4) + + concat3 = self.concat_op((out_conv3, out_deconv3, flow4_up)) + 
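# Note: unlike FlowNetS, FlowNetSD refines each deconv/skip concatenation with an intermediate convolution (inter_conv*) before predicting the flow at that scale. +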
out_interconv3 = self.inter_conv3(concat3) + flow3 = self.predict_flow3(out_interconv3) + flow3_up = self.upsampled_flow3_to_2(flow3) + out_deconv2 = self.deconv2(concat3) + + concat2 = self.concat_op((out_conv2, out_deconv2, flow3_up)) + out_interconv2 = self.inter_conv2(concat2) + flow2 = self.predict_flow2(out_interconv2) + + if self.training: + return flow2, flow3, flow4, flow5, flow6 + return flow2, None diff --git a/research/cv/flownet2/src/submodels/__init__.py b/research/cv/flownet2/src/submodels/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/research/cv/flownet2/src/submodels/custom_ops/correlation.cu b/research/cv/flownet2/src/submodels/custom_ops/correlation.cu new file mode 100644 index 0000000000000000000000000000000000000000..4e40a00f1435955b77a2daca2073996c6b4f435e --- /dev/null +++ b/research/cv/flownet2/src/submodels/custom_ops/correlation.cu @@ -0,0 +1,392 @@ +// Copyright 2022 Huawei Technologies Co., Ltd +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. +// ============================================================================ + +#include <thrust/reduce.h> +#include <stdio.h> +#include <algorithm> + +#define CUDA_NUM_THREADS 1024 +#define THREADS_PER_BLOCK 32 +#define FULL_MASK 0xffffffff + +__global__ void correlationInitKernel(size_t size_init, float *input) { + auto idx = blockIdx.x * CUDA_NUM_THREADS + threadIdx.x; + if (idx < size_init) { + input[idx] = static_cast<float>(.0); + } +} + + +__forceinline__ __device__ float warpReduceSum(float value) { + for (int offset = 16; offset > 0; offset /= 2) + value += __shfl_down_sync(FULL_MASK, value, offset); + return value; +} + + +__forceinline__ __device__ float blockReduceSum(float value) { + static __shared__ float shared[32]; + int lane = threadIdx.x % warpSize; + int windex = threadIdx.x / warpSize; + value = warpReduceSum(value); + if (lane == 0) + shared[windex] = value; + + __syncthreads(); + + value = (threadIdx.x < blockDim.x / warpSize) ? 
shared[lane] : 0; + + if (windex == 0) + value = warpReduceSum(value); + return value; +} + +__global__ void correlation_forward(float* output, + const int tdim_cyx, const int tdim_yx, const int tdim_x, + const float* padded_input1, const float* padded_input2, + const int pdim_yxc, const int pdim_xc, const int pdim_c, + const int kernel_size, const int max_displacement, const int stride1, const int stride2) { + int32_t kernel_radius = (kernel_size - 1) / 2; + int32_t displacement_radius = max_displacement / stride2; + int32_t displacement_size = 2 * displacement_radius + 1; + + int32_t nums = kernel_size * kernel_size * pdim_c; + + int32_t n = blockIdx.x; + int32_t y1 = blockIdx.y * stride1 + max_displacement; + int32_t x1 = blockIdx.z * stride1 + max_displacement; + int32_t c = threadIdx.x; + + // along channel axism, do element-wise product + for (int t_j = -displacement_radius; t_j <= displacement_radius; ++t_j) { + for (int t_i = -displacement_radius; t_i <= displacement_radius; ++t_i) { + int x2 = x1 + t_i * stride2; + int y2 = y1 + t_j * stride2; + float acc = 0.0f; + // add 2 feature kernel_radius + for (int j = -kernel_radius; j <= kernel_radius; ++j) { + for (int i = -kernel_radius; i <= kernel_radius; ++i) { + #pragma unroll + for (int ch = c; ch < pdim_c; ch += blockDim.x) { + int index1 = n * pdim_yxc + (y1 + j) * pdim_xc + (x1 + i) * pdim_c + ch; + int index2 = n * pdim_yxc + (y2 + j) * pdim_xc + (x2 + i) * pdim_c + ch; + acc += static_cast<float>(padded_input1[index1] * + padded_input2[index2]); + } + } + } + + if (blockDim.x == warpSize) { + __syncwarp(); + acc = warpReduceSum(acc); + } else { + __syncthreads(); + acc = blockReduceSum(acc); + } + + if (threadIdx.x == 0) { + int tc = (t_j + displacement_radius) * displacement_size + + (t_i + displacement_radius); + const int tindex = n * tdim_cyx + tc * tdim_yx + blockIdx.y * tdim_x + blockIdx.z; + output[tindex] = static_cast<float>(acc / nums); + } + } + } +} + +extern "C" int correlation(int nparam, void **params, int *ndims, int64_t **shapes, const char **dtypes, void *stream, + void *extra) { + cudaStream_t custream = static_cast<cudaStream_t>(stream); + constexpr int OUTPUT_INDEX = 2; + constexpr int INPUT_INDEX = 0; + constexpr int TOTAL_PARAM_NUM = 3; + if (nparam != TOTAL_PARAM_NUM) { + return 1; + } + // This is to check if the type of parameters the same as what the user wants. 
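+ // Expected tensors: padded_input1 and padded_input2 in NHWC layout (transposed and padded on the Python side), output in NCHW; all must be float32.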
+ for (int i = 0; i < nparam; i++) { + if (strcmp(dtypes[i], "float32") != 0) { + return 2; + } + } + // input1's index is 0, input2's index is 1 and output's index is 2 + float *input1 = static_cast<float *>(params[0]); + float *input2 = static_cast<float *>(params[1]); + float *output = static_cast<float *>(params[2]); + + int batchSize = shapes[OUTPUT_INDEX][0]; + int outputChannels = shapes[OUTPUT_INDEX][1]; + int outputHeight = shapes[OUTPUT_INDEX][2]; + int outputWidth = shapes[OUTPUT_INDEX][3]; + int inputChannels = shapes[INPUT_INDEX][3]; + int inputHeight = shapes[INPUT_INDEX][1]; + int inputWidth = shapes[INPUT_INDEX][2]; + + // notice: At Currently the parameter used in cuda is fixed because the interface have no place to pass parameters + // need to be changed in future + const int kernel_size = 1; + const int max_displacement = 20; + const int stride1 = 1; + const int stride2 = 2; + + int output_size = batchSize*outputChannels*outputWidth*outputHeight; + int n = output_size / CUDA_NUM_THREADS; + correlationInitKernel<<<n + 1, CUDA_NUM_THREADS, 0, custream>>>(output_size, output); + + dim3 threadsPerBlock(THREADS_PER_BLOCK); + dim3 totalBlocksCorr(batchSize, outputHeight, outputWidth); + + int32_t pdim_yxc = inputHeight * inputWidth * inputChannels; + int32_t pdim_xc = inputWidth * inputChannels; + int32_t pdim_c = inputChannels; + + int32_t tdim_cyx = outputChannels * outputHeight * outputWidth; + int32_t tdim_yx = outputHeight * outputWidth; + int32_t tdim_x = outputWidth; + + correlation_forward<<<totalBlocksCorr, threadsPerBlock, 0, custream>>> + (output, tdim_cyx, tdim_yx, tdim_x, + input1, input2, pdim_yxc, pdim_xc, pdim_c, + kernel_size, max_displacement, stride1, stride2); + return 0; +} + +// correlation_backward_input1 kernel +__global__ void correlation_backward_input1(int item, float *grad_input_1, + const int p_dim_yxc, const int p_dim_xc, const int p_dim_c, + const int o_dim_cyx, const int o_dim_yx, const int o_dim_x, + const float *gradOutput, int outputChannels, + int outputHeight, int outputWidth, + const float *padded_input2, int pad_size, + int kernel_size, int max_displacement, + int stride1, int stride2, int kernel_radius, int displacement_radius, + int displacement_size) { + // NCHW (bs,num of channels,height,width) + int n = item; + int y = blockIdx.x * stride1 + pad_size; + int x = blockIdx.y * stride1 + pad_size; + int c = blockIdx.z; + int tch_off = threadIdx.x; + + int t_dim_cyx = outputChannels * outputHeight * outputWidth; + int t_dim_yx = outputHeight * outputWidth; + int t_dim_x = outputWidth; + + int x_min = (x - kernel_radius - max_displacement) / stride1; + int y_min = (y - kernel_radius - max_displacement) / stride1; + int x_max = (x + kernel_radius - max_displacement) / stride1; + int y_max = (y + kernel_radius - max_displacement) / stride1; + + // grad_input_1 is zero filled + if (x_max < 0 || y_max < 0 || x_min >= outputWidth || y_min >= outputHeight + || x_min > x_max || y_min > y_max) { + return; + } + // add range limit of height and width to cal grad_input_1 + x_min = max(0, x_min); + x_max = min(outputWidth-1, x_max); + + y_min = max(0, y_min); + y_max = min(outputHeight-1, y_max); + + float nums = kernel_size * kernel_size * p_dim_c; + + __shared__ float temp_sum[THREADS_PER_BLOCK]; + temp_sum[tch_off] = 0; + // along channel axism + for (int tc = tch_off; tc < outputChannels; tc += THREADS_PER_BLOCK) { + int m_2 = (tc % displacement_size - displacement_radius) * stride2; + int n_2 = (tc / displacement_size - displacement_radius) * 
stride2; + int index2 = n * p_dim_yxc + (y + n_2) * p_dim_xc + (x + m_2) * p_dim_c + c; + + float val2 = padded_input2[index2]; + + for (int j = y_min; j <= y_max; ++j) { + for (int i = x_min; i <= x_max; ++i) { + int t_index = n * t_dim_cyx + tc * t_dim_yx + j * t_dim_x + i; + temp_sum[tch_off] += gradOutput[t_index] * val2; + } + } + } + __syncthreads(); + + if (tch_off == 0) { + float reduce_sum = 0; + for (int index = 0; index < THREADS_PER_BLOCK; index++) { + reduce_sum += temp_sum[index]; + } + const int index1 = n * o_dim_cyx + c * o_dim_yx + (y - pad_size) * o_dim_x + (x - pad_size); + grad_input_1[index1] = reduce_sum / nums; + } +} + +// correlation_backward_input2 kernel +__global__ void correlation_backward_input2(int item, float *grad_input_2, + const int p_dim_yxc, const int p_dim_xc, const int p_dim_c, + const int o_dim_cyx, const int o_dim_yx, const int o_dim_x, + const int t_dim_cyx, const int t_dim_yx, const int t_dim_x, + const float *gradOutput, int outputChannels, + int outputHeight, int outputWidth, + const float *padded_input1, int pad_size, + int kernel_size, int max_displacement, + int stride1, int stride2, int kernel_radius, int displacement_radius, + int displacement_size) { + // NCHW (bs,num of channels,height,width) + int n = item; + int y = blockIdx.x * stride1 + pad_size; + int x = blockIdx.y * stride1 + pad_size; + int c = blockIdx.z; + + int tch_off = threadIdx.x; + __shared__ float prod_sum[THREADS_PER_BLOCK]; + prod_sum[tch_off] = 0; + for (int tc = tch_off; tc < outputChannels; tc += THREADS_PER_BLOCK) { + int m_1 = (tc % displacement_size - displacement_radius) * stride2; + int n_1 = (tc / displacement_size - displacement_radius) * stride2; + + int x_min = (x - kernel_radius - max_displacement - m_1) / stride1; + int y_min = (y - kernel_radius - max_displacement - n_1) / stride1; + + int x_max = (x + kernel_radius - max_displacement - m_1) / stride1; + int y_max = (y + kernel_radius - max_displacement - n_1) / stride1; + + if (x_max < 0 || y_max < 0) { + continue; + } + if (x_min >= outputWidth || y_min >= outputHeight) { + continue; + } + if (x_min > x_max || y_min > y_max) { + continue; + } + + // add range limit of height and width to cal grad_input_2 + x_min = max(0, x_min); + x_max = min(outputWidth-1, x_max); + y_min = max(0, y_min); + y_max = min(outputHeight-1, y_max); + + // assign value of gradOutput to grad_input_2 + int index_1 = n * p_dim_yxc + (y - n_1) * p_dim_xc + (x - m_1) * p_dim_c + c; + float val_1 = padded_input1[index_1]; + for (int j = y_min; j <= y_max; ++j) { + for (int i = x_min; i <= x_max; ++i) { + int t_index = n * t_dim_cyx + tc * t_dim_yx + j * t_dim_x + i; + prod_sum[tch_off] += gradOutput[t_index] * val_1; + } + } + } + + __syncthreads(); + if (tch_off == 0) { + float reduce_sum = 0; + for (int index = 0; index < THREADS_PER_BLOCK; index++) { + reduce_sum += prod_sum[index]; + } + const int index_2 = n * o_dim_cyx + c * o_dim_yx + (y - pad_size) * o_dim_x + (x - pad_size); + float nums = kernel_size * kernel_size * p_dim_c; + grad_input_2[index_2] = reduce_sum / nums; + } +} + +extern "C" int correlationGrad(int nparam, void **params, int *ndims, int64_t **shapes, const char **dtypes, + void *stream, void *extra) { + cudaStream_t custream = static_cast<cudaStream_t>(stream); + constexpr int INPUT1_INDEX = 0; + constexpr int GRAD_OUTPUT_INDEX = 2; + constexpr int TOTAL_PARAM_NUM = 5; + + if (nparam != TOTAL_PARAM_NUM) { + return 1; + } + // This is to check if the type of parameters the same as what the user wants. 
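+ // Parameter order: padded_input1, padded_input2 (NHWC, padded), gradOutput (NCHW), gradInput1, gradInput2 (NCHW, unpadded); all must be float32.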
+ for (int i = 0; i < TOTAL_PARAM_NUM; i++) { + if (strcmp(dtypes[i], "float32") != 0) { + return 2; + } + } + + float *padded_input1 = static_cast<float *>(params[0]); + float *padded_input2 = static_cast<float *>(params[1]); + float *gradOutput = static_cast<float *>(params[2]); + float *gradInput1 = static_cast<float *>(params[3]); + float *gradInput2 = static_cast<float *>(params[4]); + + int batchSize = shapes[GRAD_OUTPUT_INDEX][0]; + int outputChannels = shapes[GRAD_OUTPUT_INDEX][1]; + int outputHeight = shapes[GRAD_OUTPUT_INDEX][2]; + int outputWidth = shapes[GRAD_OUTPUT_INDEX][3]; + + int inputChannels = shapes[INPUT1_INDEX][3]; + int p_inputHeight = shapes[INPUT1_INDEX][1]; + int p_inputWidth = shapes[INPUT1_INDEX][2]; + + // notice: these parameters are currently hard-coded because the custom-op interface provides no way to pass them; + // this needs to be changed in the future + const int pad_size = 20; + const int kernel_size = 1; + const int max_displacement = 20; + const int stride1 = 1; + const int stride2 = 2; + + int inputWidth = p_inputWidth - 2 * pad_size; + int inputHeight = p_inputHeight - 2 * pad_size; + + int kernel_radius = (kernel_size - 1) / 2; + int displacement_radius = max_displacement / stride2; + int displacement_size = 2 * displacement_radius + 1; + + int p_dim_yxc = p_inputHeight * p_inputWidth * inputChannels; + int p_dim_xc = p_inputWidth * inputChannels; + int p_dim_c = inputChannels; + + int t_dim_cyx = outputChannels * outputHeight * outputWidth; + int t_dim_yx = outputHeight * outputWidth; + int t_dim_x = outputWidth; + + int o_dim_cyx = inputChannels * inputHeight * inputWidth; + int o_dim_yx = inputHeight * inputWidth; + int o_dim_x = inputWidth; + + dim3 threadsPerBlock(THREADS_PER_BLOCK); + dim3 totalBlocksCorr(inputHeight, inputWidth, inputChannels); + + // initialize gradInput1 to zero + int gradInput1_size = batchSize*inputChannels*inputWidth*inputHeight; + correlationInitKernel<<<gradInput1_size / CUDA_NUM_THREADS + 1, CUDA_NUM_THREADS, + 0, custream>>>(gradInput1_size, gradInput1); + // call correlation_backward_input1 + for (int n = 0; n < batchSize; ++n) { + correlation_backward_input1<<<totalBlocksCorr, threadsPerBlock, 0, custream>>> ( + n, gradInput1, p_dim_yxc, p_dim_xc, p_dim_c, o_dim_cyx, o_dim_yx, o_dim_x, + gradOutput, outputChannels, outputHeight, outputWidth, + padded_input2, pad_size, kernel_size, max_displacement, stride1, stride2, + kernel_radius, displacement_radius, displacement_size); + } + // initialize gradInput2 to zero + int gradInput2_size = batchSize*inputChannels*inputWidth*inputHeight; + correlationInitKernel<<<gradInput2_size / CUDA_NUM_THREADS + 1, CUDA_NUM_THREADS, + 0, custream>>>(gradInput2_size, gradInput2); + // call correlation_backward_input2 + for (int n = 0; n < batchSize; n++) { + correlation_backward_input2<<<totalBlocksCorr, threadsPerBlock, 0, custream>>>( + n, gradInput2, p_dim_yxc, p_dim_xc, p_dim_c, o_dim_cyx, o_dim_yx, o_dim_x, + t_dim_cyx, t_dim_yx, t_dim_x, + gradOutput, outputChannels, outputHeight, outputWidth, + padded_input1, pad_size, kernel_size, max_displacement, stride1, stride2, + kernel_radius, displacement_radius, displacement_size); + } + return 0; +} diff --git a/research/cv/flownet2/src/submodels/custom_ops/custom_ops.py b/research/cv/flownet2/src/submodels/custom_ops/custom_ops.py new file mode 100644 index 0000000000000000000000000000000000000000..c985f27de1939ae93e2111c897f2c65a5786b4b4 --- /dev/null +++ b/research/cv/flownet2/src/submodels/custom_ops/custom_ops.py @@ -0,0 +1,93 @@ +# Copyright
2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +import os +import math +from mindspore.common import dtype as mstype +from mindspore.nn import Cell +import mindspore.ops as ops + + + +class TransposeAndPad(Cell): + def __init__(self, pad_size): + super(TransposeAndPad, self).__init__() + self.tanspose = ops.Transpose() + self.pad = ops.Pad(((0, 0), (pad_size, pad_size), (pad_size, pad_size), (0, 0))) + + def construct(self, x): + x_tanspose = self.tanspose(x, (0, 2, 3, 1)) + x_tanspose_pad = self.pad(x_tanspose) + return x_tanspose_pad + + +class Correlation(Cell): + def __init__(self, pad_size=0, kernel_size=0, max_displacement=0, stride1=1, stride2=2): + super(Correlation, self).__init__() + self.pad_size = pad_size + self.max_displacement = max_displacement + self.kernel_size = kernel_size + self.stride1 = stride1 + self.stride2 = stride2 + self.transpose_pad = TransposeAndPad(pad_size) + dir_path = os.path.dirname(os.path.abspath(__file__)) + self.func_path = dir_path + "/correlation.so" + + def construct(self, x1, x2): + pad_x1 = self.transpose_pad(x1) + pad_x2 = self.transpose_pad(x2) + n_output_channels = (int(self.max_displacement / self.stride2) * 2 + 1) \ + * (int(self.max_displacement / self.stride2) * 2 + 1) + x1_shape = x1.shape + kernel_radius = (self.kernel_size - 1) / 2 + border_radius = kernel_radius + self.max_displacement + padded_height = x1_shape[2] + 2 * self.pad_size + padded_width = x1_shape[3] + 2 * self.pad_size + output_height = int(math.ceil((padded_height - 2 * border_radius) / self.stride1)) + output_width = int(math.ceil((padded_width - 2 * border_radius) / self.stride1)) + out_shape = (x1_shape[0], n_output_channels, output_height, output_width) + correlation_forward = ops.Custom(self.func_path + ":correlation", out_shape, mstype.float32, "aot") + output = correlation_forward(pad_x1, pad_x2) + return output + + def bprop(self, x1, x2, out, dout): + pad_x1 = self.transpose_pad(x1) + pad_x2 = self.transpose_pad(x2) + correlation_backward = ops.Custom(self.func_path + ":correlationGrad", (x1.shape, x2.shape), + (mstype.float32, mstype.float32), "aot") + dx1, dx2 = correlation_backward(pad_x1, pad_x2, dout) + return dx1, dx2 + + +class Resample2D(Cell): + def __init__(self, kernel_size=1, bilinear=True): + super(Resample2D, self).__init__() + self.kernel_saize = kernel_size + self.bilinear = bilinear + dir_path = os.path.dirname(os.path.abspath(__file__)) + self.func_path = dir_path + "/resample2d.so" + + + def construct(self, x1, x2): + out_shape = (x2.shape[0], x1.shape[1], x2.shape[2], x2.shape[3]) + resample2d_forward = ops.Custom(self.func_path + ":Resample2d", out_shape, mstype.float32, "aot") + output = resample2d_forward(x1, x2) + return output + + def bprop(self, x1, x2, out, dout): + Resample2d_backward = ops.Custom(self.func_path + ":Resample2dGrad", (x1.shape, x2.shape), + (mstype.float32, mstype.float32), "aot") + dx1, dx2 = 
Resample2d_backward(x1, x2, dout) + return dx1, dx2 diff --git a/research/cv/flownet2/src/submodels/custom_ops/resample2d.cu b/research/cv/flownet2/src/submodels/custom_ops/resample2d.cu new file mode 100644 index 0000000000000000000000000000000000000000..fa7dfa4f752046997986b8e477838fd05bfa98e3 --- /dev/null +++ b/research/cv/flownet2/src/submodels/custom_ops/resample2d.cu @@ -0,0 +1,418 @@ +// Copyright 2022 Huawei Technologies Co., Ltd +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. +// ============================================================================ +#define CUDA_NUM_THREADS 512 +#define THREADS_PER_BLOCK 64 + +#include <algorithm> + +__device__ __forceinline__ float MsAtomicAdd(float *address, const float val) { + return atomicAdd(address, val); +} + +__global__ void Resample2dInitKernel(size_t size_init, float *input) { + auto idx = blockIdx.x * CUDA_NUM_THREADS + threadIdx.x; + if (idx < size_init) { + input[idx] = static_cast<float>(.0); + } +} + + +__device__ int GET_INDEX(const int batch , const int channels, const int height, const int width, + const int batch_stride , const int channels_stride, const int height_stride) { + return batch*batch_stride+channels*channels_stride+height*height_stride+width; +} + +__device__ float DIM3_INDEX(const float *input, const int batch , const int channels, const int height, const int width, + const int batch_stride , const int channels_stride, const int height_stride) { + return input[batch*batch_stride+channels*channels_stride+height*height_stride+width]; +} + + +__global__ void Resample2dKernel(size_t size, const float *input1, const float *input2, float *out_data, + int batch_stride_x1, int channel_stride_x1, int height_stride_x1, + int batch_stride_x2, int channel_stride_x2, int height_stride_x2, + int batch_output, int channel_output, int height_output, int width_output, + int kernel_size, bool bilinear) { + int index = blockIdx.x * blockDim.x + threadIdx.x; + + if (index >= size) { + return; + } + + float val = 0.0; + + int dim_b = batch_output; + int dim_c = channel_output; + int dim_h = height_output; + int dim_w = width_output; + int dim_chw = dim_c * dim_h * dim_w; + int dim_hw = dim_h * dim_w; + int b = (index / dim_chw) % dim_b; + int c = (index / dim_hw) % dim_c; + int y = (index / dim_w) % dim_h; + int x = (index) % dim_w; + + float dx = DIM3_INDEX(input2, b, 0, y, x, batch_stride_x2, channel_stride_x2, height_stride_x2); + float dy = DIM3_INDEX(input2, b, 1, y, x, batch_stride_x2, channel_stride_x2, height_stride_x2); + + float xf = x + dx; + float yf = y + dy; // img+flow + float alpha = xf - (floor(xf)); // alpha + float beta = yf - (floor(yf)); // beta + if (bilinear) { + int xL = max(min(static_cast<int>(floor(xf)), dim_w-1), 0); + int xR = max(min(static_cast<int>(floor(xf)+1), dim_w -1), 0); + int yT = max(min(static_cast<int>(floor(yf)), dim_h-1), 0); + int yB = max(min(static_cast<int>(floor(yf)+1), dim_h-1), 0); + for (int fy = 0; fy < kernel_size; fy += 1) { + for (int fx = 0; fx < kernel_size; fx 
+= 1) { + float offTL = DIM3_INDEX(input1, b, c, yT + fy, xL + fx, + batch_stride_x1, channel_stride_x1, height_stride_x1); + float offTR = DIM3_INDEX(input1, b, c, yT + fy, xR + fx, + batch_stride_x1, channel_stride_x1, height_stride_x1); + float offBL = DIM3_INDEX(input1, b, c, yB + fy, xL + fx, + batch_stride_x1, channel_stride_x1, height_stride_x1); + float offBR = DIM3_INDEX(input1, b, c, yB + fy, xR + fx, + batch_stride_x1, channel_stride_x1, height_stride_x1); + val += (1. - alpha)*(1. - beta) * offTL; + val += (alpha)*(1. - beta) * offTR; + val += (1. - alpha)*(beta) * offBL; + val += (alpha)*(beta) * offBR; + } + } + out_data[index] = val; + } else { + int xN = max(min(static_cast<int>(floor(xf + 0.5)), dim_w - 1), 0); + int yN = max(min(static_cast<int>(floor(yf + 0.5)), dim_h - 1), 0); + out_data[index] = DIM3_INDEX(input1, b, c, yN, xN, batch_stride_x1, channel_stride_x1, height_stride_x1); + } +} + + + +extern "C" int Resample2d(int nparam, void **params, int *ndims, int64_t **shapes, const char **dtypes, void *stream, + void *extra) { + cudaStream_t custream = static_cast<cudaStream_t>(stream); + constexpr int INPUT1_INDEX = 0; + constexpr int INPUT2_INDEX = 1; + constexpr int OUTPUT_INDEX = 2; + constexpr int TOTAL_PARAM_NUM = 3; + + if (nparam != TOTAL_PARAM_NUM) { + return 1; + } + // This is to check if the type of parameters the same as what the user wants. + for (int i = 0; i < nparam; i++) { + if (strcmp(dtypes[i], "float32") != 0) { + return 2; + } + } + + float *x1 = static_cast<float *>(params[0]); + float *x2 = static_cast<float *>(params[1]); + + float *out_data = static_cast<float *>(params[2]); + +// int batch_x1 = shapes[INPUT1_INDEX][0]; + int channel_x1 = shapes[INPUT1_INDEX][1]; + int height_x1 = shapes[INPUT1_INDEX][2]; + int width_x1 = shapes[INPUT1_INDEX][3]; + +// int batch_x2 = shapes[INPUT2_INDEX][0]; + int channel_x2 = shapes[INPUT2_INDEX][1]; + int height_x2 = shapes[INPUT2_INDEX][2]; + int width_x2 = shapes[INPUT2_INDEX][3]; + + int batch_output = shapes[OUTPUT_INDEX][0]; + int channel_output = shapes[OUTPUT_INDEX][1]; + int height_output = shapes[OUTPUT_INDEX][2]; + int width_output = shapes[OUTPUT_INDEX][3]; + + // fix at now ,need to be changed in future + const int kernel_size = 1; + const bool bilinear = true; + + int batch_stride_x1 = channel_x1 * height_x1 * width_x1; + int channel_stride_x1 = height_x1 * width_x1; + int height_stride_x1 = width_x1; + int batch_stride_x2 = channel_x2 * height_x2 * width_x2; + int channel_stride_x2 = height_x2 * width_x2; + int height_stride_x2 = width_x2; + size_t size = batch_output * channel_output * height_output * width_output; + Resample2dInitKernel<<<size / CUDA_NUM_THREADS +1, CUDA_NUM_THREADS, 0, custream>>>(size, out_data); + + Resample2dKernel<<< (size + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, 0, custream>>> + (size, x1, x2, out_data, batch_stride_x1, channel_stride_x1, height_stride_x1, + batch_stride_x2, channel_stride_x2, height_stride_x2, batch_output, channel_output, + height_output, width_output, kernel_size , bilinear); + return 0; +} + + +__global__ void kernel_resample2d_grad_input1(size_t size, + const float* input1, int batch_input1, int channel_input1, int height_input1, int width_input1, + const float* input2, int batch_stride_input2, int channel_stride_input2, int height_stride_input2, + const float* gradOutput, int batch_gradOutput, int channel_gradOutput, int height_gradOutput, int width_gradOutput, + int batch_stride_gradOutput, int channel_stride_gradOutput, int 
height_stride_gradOutput, + float* gradInput, int batch_stride_gradInput, int channel_stride_gradInput, int height_stride_gradInput, + int kernel_size, bool bilinear) { + + int index = blockIdx.x * blockDim.x + threadIdx.x; + + if (index >= size) { + return; + } + + int dim_b = batch_gradOutput; + int dim_c = channel_gradOutput; + int dim_h = height_gradOutput; + int dim_w = width_gradOutput; + int dim_chw = dim_c * dim_h * dim_w; + int dim_hw = dim_h * dim_w; + + int b = (index / dim_chw) % dim_b; + int c = (index / dim_hw) % dim_c; + int y = (index / dim_w) % dim_h; + int x = (index) % dim_w; + + float dx = DIM3_INDEX(input2, b, 0, y, x, batch_stride_input2, channel_stride_input2, height_stride_input2); + float dy = DIM3_INDEX(input2, b, 1, y, x, batch_stride_input2, channel_stride_input2, height_stride_input2); + + float xf = x + dx; + float yf = y + dy; + float alpha = xf - static_cast<int>(xf); // alpha + float beta = yf - static_cast<int>(yf); // beta + + int idim_h = height_input1; + int idim_w = width_input1; + + int xL = max(min(static_cast<int>(floor(xf)), idim_w-1), 0); + int xR = max(min(static_cast<int>(floor(xf)+1), idim_w -1), 0); + int yT = max(min(static_cast<int>(floor(yf)), idim_h-1), 0); + int yB = max(min(static_cast<int>(floor(yf)+1), idim_h-1), 0); + + float w1, w2, w3, w4; + float num = 1.f; + w1 = (num-alpha)*(num-beta); + w2 = (alpha)*(num-beta); + w3 = (num-alpha)*(beta); + w4 = (alpha)*(beta); + + float gradnum = DIM3_INDEX(gradOutput, b, c, y, x, + batch_stride_gradOutput, channel_stride_gradOutput, height_stride_gradOutput); + for (int fy = 0; fy < kernel_size; fy += 1) { + for (int fx = 0; fx < kernel_size; fx += 1) { + int indexTL = GET_INDEX(b, c, (yT + fy), (xL + fx), + batch_stride_gradInput, channel_stride_gradInput, height_stride_gradInput); + MsAtomicAdd(&gradInput[indexTL], w1 * gradnum); + + int indexTR = GET_INDEX(b, c, (yT + fy), (xR + fx), + batch_stride_gradInput, channel_stride_gradInput, height_stride_gradInput); + MsAtomicAdd(&gradInput[indexTR], w2 * gradnum); + + int indexBL = GET_INDEX(b, c, (yB + fy), (xL + fx), + batch_stride_gradInput, channel_stride_gradInput, height_stride_gradInput); + MsAtomicAdd(&gradInput[indexBL], w3 * gradnum); + + int indexBR = GET_INDEX(b, c, (yB + fy), (xR + fx), + batch_stride_gradInput, channel_stride_gradInput, height_stride_gradInput); + MsAtomicAdd(&gradInput[indexBR], w4 * gradnum); + } + } +} + + +__global__ void kernel_resample2d_grad_input2(size_t size, + const float *input1, int batch_stride_input1, int channel_stride_input1, int height_stride_input1, + const float *input2, int batch_stride_input2, int channel_stride_input2, int height_stride_input2, + const float *gradOutput, int channel_gradOutput, int batch_stride_gradOutput, + int channel_stride_gradOutput, int height_stride_gradOutput, + float *gradInput, int batch_gradInput, int channel_gradInput, int height_gradInput, int width_gradInput, + int batch_stride_gradInput, int channel_stride_gradInput, int height_stride_gradInput, + int kernel_size, bool bilinear) { + int index = blockIdx.x * blockDim.x + threadIdx.x; + + if (index >= size) { + return; + } + + float output = 0.0; + int kernel_rad = (kernel_size - 1)/2; + + int dim_b = batch_gradInput; + int dim_c = channel_gradInput; + int dim_h = height_gradInput; + int dim_w = width_gradInput; + int dim_chw = dim_c * dim_h * dim_w; + int dim_hw = dim_h * dim_w; + + int b = (index / dim_chw) % dim_b; + int c = (index / dim_hw) % dim_c; + int y = (index / dim_w) % dim_h; + int x = (index) % dim_w; 
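+    // Note: each thread below produces one element of the flow gradient (dx2). For the
+    // horizontal-flow channel (c % 2 == 0) it differences the bilinear interpolation weights
+    // of input1 along x, for the vertical-flow channel (c % 2 == 1) along y, accumulating
+    // the contribution of every image channel of gradOutput at pixel (y, x).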
+ + int odim_c = channel_gradOutput; + + float dx = DIM3_INDEX(input2, b, 0, y, x, batch_stride_input2, channel_stride_input2, height_stride_input2); + float dy = DIM3_INDEX(input2, b, 1, y, x, batch_stride_input2, channel_stride_input2, height_stride_input2); + + float xf = x + dx; + float yf = y + dy; + + int xL = max(min(static_cast<int>(floor(xf)), dim_w-1), 0); + int xR = max(min(static_cast<int>(floor(xf)+1), dim_w -1), 0); + int yT = max(min(static_cast<int>(floor(yf)), dim_h-1), 0); + int yB = max(min(static_cast<int>(floor(yf)+1), dim_h-1), 0); + + if (c % 2) { + float gamma = 1 - (xf - floor(xf)); // alpha + for (int i = 0; i <= 2*kernel_rad ; ++i) { + for (int j = 0; j <= 2*kernel_rad; ++j) { + for (int ch = 0; ch < odim_c; ++ch) { + float gradout = DIM3_INDEX(gradOutput, b, ch, y, x, + batch_stride_gradOutput, channel_stride_gradOutput, height_stride_gradOutput); + output += (gamma) * gradout * DIM3_INDEX(input1, b, ch, (yB + j), (xL + i), + batch_stride_input1, channel_stride_input1, height_stride_input1); + output -= (gamma) * gradout * DIM3_INDEX(input1, b, ch, (yT + j), (xL + i), + batch_stride_input1, channel_stride_input1, height_stride_input1); + output += (1-gamma) * gradout * DIM3_INDEX(input1, b, ch, (yB + j), (xR + i), + batch_stride_input1, channel_stride_input1, height_stride_input1); + output -= (1-gamma) * gradout * DIM3_INDEX(input1, b, ch, (yT + j), (xR + i), + batch_stride_input1, channel_stride_input1, height_stride_input1); + } + } + } + } else { + float gamma = 1 - (yf - floor(yf)); // alpha + for (int i = 0; i <= 2*kernel_rad; ++i) { + for (int j = 0; j <= 2*kernel_rad; ++j) { + for (int ch = 0; ch < odim_c; ++ch) { + float gradout = static_cast<float>(DIM3_INDEX(gradOutput, b, ch, y, x, + batch_stride_gradOutput, channel_stride_gradOutput, height_stride_gradOutput)); + output += (gamma) * gradout * static_cast<float>(DIM3_INDEX(input1, b, ch, (yT + j), (xR + i), + batch_stride_input1, channel_stride_input1, height_stride_input1)); + output -= (gamma)* gradout * static_cast<float>(DIM3_INDEX(input1, b, ch, (yT + j), (xL + i), + batch_stride_input1, channel_stride_input1, height_stride_input1)); + output += (1-gamma)* gradout * static_cast<float>(DIM3_INDEX(input1, b, ch, (yB + j), (xR + i), + batch_stride_input1, channel_stride_input1, height_stride_input1)); + output -= (1-gamma) * gradout * static_cast<float>(DIM3_INDEX(input1, b, ch, (yB + j), (xL + i), + batch_stride_input1, channel_stride_input1, height_stride_input1)); + } + } + } + } + gradInput[index] = output; +} + + +extern "C" int Resample2dGrad(int nparam, void **params, int *ndims, int64_t **shapes, const char **dtypes, + void *stream, void *extra) { + cudaStream_t custream = static_cast<cudaStream_t>(stream); + constexpr int INPUT1_INDEX = 0; + constexpr int INPUT2_INDEX = 1; + constexpr int GRAD_OUTPUT_INDEX = 2; + constexpr int TOTAL_PARAM_NUM = 5; + + if (nparam != TOTAL_PARAM_NUM) { + return 1; + } + // This is to check if the type of parameters the same as what the user wants. 
+ for (int i = 0; i < nparam; i++) { + if (strcmp(dtypes[i], "float32") != 0) { + return 2; + } + } + + float *x1 = static_cast<float *>(params[0]); + float *x2 = static_cast<float *>(params[1]); + float *dout = static_cast<float *>(params[2]); + float *dx1 = static_cast<float *>(params[3]); + float *dx2 = static_cast<float *>(params[4]); + + int batch_x1 = shapes[INPUT1_INDEX][0]; + int channel_x1 = shapes[INPUT1_INDEX][1]; + int height_x1 = shapes[INPUT1_INDEX][2]; + int width_x1 = shapes[INPUT1_INDEX][3]; + + int batch_x2 = shapes[INPUT2_INDEX][0]; + int channel_x2 = shapes[INPUT2_INDEX][1]; + int height_x2 = shapes[INPUT2_INDEX][2]; + int width_x2 = shapes[INPUT2_INDEX][3]; + + int batch_dout = shapes[GRAD_OUTPUT_INDEX][0]; + int channel_dout = shapes[GRAD_OUTPUT_INDEX][1]; + int height_dout = shapes[GRAD_OUTPUT_INDEX][2]; + int width_dout = shapes[GRAD_OUTPUT_INDEX][3]; + + // fix at now ,need to be changed in future + const int kernel_size = 1; + const bool bilinear = true; + + int batch_dx1 = batch_x1; + int channel_dx1 = channel_x1; + int height_dx1 = height_x1; + int width_dx1 = width_x1; + int batch_dx2 = batch_x2; + int channel_dx2 = channel_x2; + int height_dx2 = height_x2; + int width_dx2 = width_x2; + int batch_stride_x1 = channel_x1 * height_x1 * width_x1; + int channel_stride_x1 = height_x1 * width_x1; + int height_stride_x1 = width_x1; +// int width_stride_x1 = 1; + int batch_stride_x2 = channel_x2 * height_x2 * width_x2; + int channel_stride_x2 = height_x2 * width_x2; + int height_stride_x2 = width_x2; +// int width_stride_x2 = 1; + int batch_stride_dx1 = batch_stride_x1; + int channel_stride_dx1 = channel_stride_x1; + int height_stride_dx1 = height_stride_x1; +// int width_stride_dx1 = width_stride_x1; + int batch_stride_dx2 = batch_stride_x2; + int channel_stride_dx2 = channel_stride_x2; + int height_stride_dx2 = height_stride_x2; +// int width_stride_dx2 = width_stride_x2; + int batch_stride_dout = channel_dout * height_dout * width_dout; + int channel_stride_dout = height_dout * width_dout; + int height_stride_dout = width_dout; +// int width_stride_dout = 1; + + size_t dx1_size = batch_dx1 * channel_dx1 * height_dx1 * width_dx1; + + Resample2dInitKernel<<<dx1_size / CUDA_NUM_THREADS +1, CUDA_NUM_THREADS, 0, custream>>>(dx1_size, dx1); + size_t dx2_size = batch_dx2 * channel_dx2 * height_dx2 * width_dx2; + Resample2dInitKernel<<<dx2_size / CUDA_NUM_THREADS +1, CUDA_NUM_THREADS, 0, custream>>>(dx2_size, dx2); + + size_t dout_size = batch_dout * channel_dout * height_dout * width_dout; + + kernel_resample2d_grad_input1<<<(dout_size + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, + 0, custream>>>(dout_size, + x1, batch_x1, channel_x1, height_x1, width_x1, + x2, batch_stride_x2, channel_stride_x2, height_stride_x2, + dout, batch_dout, channel_dout, height_dout, width_dout, + batch_stride_dout, channel_stride_dout, height_stride_dout, + dx1, batch_stride_dx1, channel_stride_dx1, height_stride_dx1, + kernel_size, bilinear); + + kernel_resample2d_grad_input2<<<(dx2_size + CUDA_NUM_THREADS - 1)/CUDA_NUM_THREADS, CUDA_NUM_THREADS, + 0, custream>>>(dx2_size, + x1, batch_stride_x1, channel_stride_x1, height_stride_x1, + x2, batch_stride_x2, channel_stride_x2, height_stride_x2, + dout, channel_dout, batch_stride_dout, channel_stride_dout, height_stride_dout, + dx2, batch_dx2, channel_dx2, height_dx2, width_dx2, + batch_stride_dx2, channel_stride_dx2, height_stride_dx2, + kernel_size, bilinear); + return 0; +} diff --git 
a/research/cv/flownet2/src/submodels/submodules.py b/research/cv/flownet2/src/submodels/submodules.py new file mode 100644 index 0000000000000000000000000000000000000000..c8adc05bcbe9c1fde372725fbd30a89ff8dcd521 --- /dev/null +++ b/research/cv/flownet2/src/submodels/submodules.py @@ -0,0 +1,98 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +import mindspore.nn as nn +import mindspore.ops as ops + +class Norm(nn.Cell): + def __init__(self, axis=1, keep_dims=False): + super(Norm, self).__init__() + self.axis = axis + self.keep_dims = keep_dims + self.reduce_sum = ops.ReduceSum(True) + self.sqrt = ops.Sqrt() + self.squeeze = ops.Squeeze(self.axis) + + def construct(self, x): + x = self.sqrt(ops.maximum(self.reduce_sum(ops.square(x), self.axis), 1e-7)) + + if not self.keep_dims: + x = self.squeeze(x) + return x + + +class ChannelNorm(nn.Cell): + def __init__(self, axis=1): + super(ChannelNorm, self).__init__() + self.axis = axis + self.add = ops.Add() + self.norm = Norm(axis) + + def construct(self, x): + output = self.norm(x) + output = output.reshape(output.shape[0], 1, output.shape[1], output.shape[2]) + return output + + +class Upsample(nn.Cell): + + def __init__(self, scale_factor=4, mode='bilinear'): + super(Upsample, self).__init__() + self.scale_factor = scale_factor + self.mode = mode + + def construct(self, x): + shape = x.shape + new_height = shape[2] * self.scale_factor + new_width = shape[3] * self.scale_factor + if self.mode == 'nearest': + upsample_op = ops.ResizeNearestNeighbor((new_height, new_width)) + else: + upsample_op = ops.ResizeBilinear((new_height, new_width)) + return upsample_op(x) + + +def conv(batchnorm, in_planes, out_planes, kernel_size=3, stride=1): + if batchnorm: + conv2d = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, pad_mode='pad', + padding=(kernel_size - 1) // 2, has_bias=False) + batchNorm2d = nn.BatchNorm2d(out_planes) + leakyReLU = nn.LeakyReLU(0.1) + return nn.SequentialCell([conv2d, batchNorm2d, leakyReLU]) + conv2d = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, pad_mode='pad', + padding=(kernel_size - 1) // 2, has_bias=True) + leakyReLU = nn.LeakyReLU(0.1) + return nn.SequentialCell([conv2d, leakyReLU]) + + +def i_conv(batchNorm, in_planes, out_planes, kernel_size=3, stride=1, bias=True): + if batchNorm: + conv2d = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, pad_mode='pad', + padding=(kernel_size - 1) // 2, has_bias=bias) + batchNorm2d = nn.BatchNorm2d(out_planes) + return nn.SequentialCell([conv2d, batchNorm2d]) + conv2d = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, pad_mode='pad', + padding=(kernel_size - 1) // 2, has_bias=bias) + return nn.SequentialCell([conv2d]) + + +def predict_flow(in_planes): + return nn.Conv2d(in_planes, 2, kernel_size=3, stride=1, pad_mode='pad', padding=1, has_bias=True) + + +def 
deconv(in_planes, out_planes): + convTranspose2d = nn.Conv2dTranspose(in_planes, out_planes, kernel_size=4, stride=2, pad_mode='pad', padding=1, + has_bias=True) + leakyReLU = nn.LeakyReLU(0.1) + return nn.SequentialCell([convTranspose2d, leakyReLU]) diff --git a/research/cv/flownet2/train.py b/research/cv/flownet2/train.py new file mode 100644 index 0000000000000000000000000000000000000000..e6615a6b28898a2ea6802d7dfa1c4ed850e0ab87 --- /dev/null +++ b/research/cv/flownet2/train.py @@ -0,0 +1,162 @@ +# Copyright 2022 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +import os +import datetime +import glob +import mindspore as ms +import mindspore.dataset as ds +import mindspore.log as logger +import mindspore.nn as nn +from mindspore.context import ParallelMode +from mindspore.nn.optim.adam import Adam +from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor, TimeMonitor +from mindspore.train.loss_scale_manager import DynamicLossScaleManager, FixedLossScaleManager +from mindspore.train.model import Model +from mindspore.train.serialization import load_checkpoint, load_param_into_net +from mindspore.communication.management import init, get_rank, get_group_size +from mindspore.common import set_seed + +from src.eval_callback import EvalCallBack +import src.dataset as datasets +import src.models as models +from src.metric import FlowNetEPE +import src.model_utils.tools as tools +from src.model_utils.config import config + + +def set_save_ckpt_dir(): + """set save ckpt dir""" + ckpt_save_dir = config.save_checkpoint_path + if config.run_distribute: + ckpt_save_dir = ckpt_save_dir + "/ckpt_" + str(get_rank()) + "/" + return ckpt_save_dir + +def apply_eval(eval_param): + eval_model = eval_param["model"] + eval_ds = eval_param["dataset"] + metrics_name = eval_param["metrics_name"] + res = eval_model.eval(eval_ds, dataset_sink_mode=False) + return res[metrics_name] + + +def load_pre_trained_checkpoint(net, pre_trained, checkpoint_path): + param_dict = None + if pre_trained: + if os.path.isdir(checkpoint_path): + ckpt_save_dir = os.path.join(checkpoint_path, "ckpt_0") + ckpt_pattern = os.path.join(ckpt_save_dir, "*.ckpt") + ckpt_files = glob.glob(ckpt_pattern) + if not ckpt_files: + logger.warning(f"There is no ckpt file in {ckpt_save_dir}, " + f"pre_trained is unsupported.") + else: + ckpt_files.sort(key=os.path.getmtime, reverse=True) + time_stamp = datetime.datetime.now() + print(f"time stamp {time_stamp.strftime('%Y.%m.%d-%H:%M:%S')}" + f" pre trained ckpt model {ckpt_files[0]} loading", + flush=True) + param_dict = load_checkpoint(ckpt_files[0]) + elif os.path.isfile(checkpoint_path): + param_dict = load_checkpoint(checkpoint_path) + else: + print(f"Invalid pre_trained {checkpoint_path} parameter.") + return + load_param_into_net(net, param_dict) + print(f"loaded param from {checkpoint_path} into net") + + +def add_ckpt_callback(step_size, ckpt_save_dir, cbs): + 
if config.save_checkpoint: + config_ck = CheckpointConfig(save_checkpoint_steps=step_size * config.save_ckpt_interval, + keep_checkpoint_max=config.keep_checkpoint_max) + ckpoint_cb = ModelCheckpoint(prefix="flownet2_", directory=ckpt_save_dir, config=config_ck) + cbs += [ckpoint_cb] + + +def add_eval_callback(model, ckpt_save_dir, cbs): + if config.run_evalCallback: + if config.eval_data_path is None or (not os.path.isdir(config.eval_data_path)): + raise ValueError("{} is not a existing path.".format(config.eval_data_path)) + + config.eval_dataset_class = tools.module_to_dict(datasets)[config.eval_data] + flownet_eval_gen = config.eval_dataset_class("Center", config.crop_size, config.eval_size, + config.eval_data_path) + eval_dataset = ds.GeneratorDataset(flownet_eval_gen, ["images", "flow"], + num_parallel_workers=config.num_parallel_workers, + max_rowsize=config.max_rowsize) + eval_dataset = eval_dataset.batch(config.batch_size) + + eval_param_dict = {"model": model, "dataset": eval_dataset, "metrics_name": "FlowNetEPE"} + eval_cb = EvalCallBack(apply_eval, eval_param_dict, interval=config.eval_interval, + eval_start_epoch=config.eval_start_epoch, save_best_ckpt=config.save_best_ckpt, + ckpt_directory=ckpt_save_dir, besk_ckpt_name="best_acc.ckpt", + metrics_name="FlowNetEPE") + cbs += [eval_cb] + + +def run_train(): + set_seed(config.seed) + ms.set_context(mode=ms.context.GRAPH_MODE, enable_graph_kernel=True, device_target=config.device_target) + ds.config.set_enable_shared_mem(False) + if config.device_target == "GPU": + if config.run_distribute: + init() + parallel_mode = ParallelMode.DATA_PARALLEL + rank = get_rank() + group_size = get_group_size() + else: + parallel_mode = ParallelMode.STAND_ALONE + rank = 0 + group_size = 1 + + ms.context.set_auto_parallel_context(parallel_mode=parallel_mode, gradients_mean=True, device_num=group_size) + + # load dataset by config param + config.training_dataset_class = tools.module_to_dict(datasets)[config.train_data] + flownet_train_gen = config.training_dataset_class(config.crop_type, config.crop_size, config.eval_size, + config.train_data_path) + sampler = datasets.DistributedSampler(flownet_train_gen, rank, group_size, shuffle=True) + train_dataset = ds.GeneratorDataset(flownet_train_gen, ["images", "flow"], + sampler=sampler, num_parallel_workers=config.num_parallel_workers) + train_dataset = train_dataset.batch(config.batch_size) + step_size = train_dataset.get_dataset_size() + + # load model by config param + config.model_class = tools.module_to_dict(models)[config.model] + net = config.model_class(config.rgb_max, config.batchNorm) + + loss = nn.L1Loss() + if config.is_dynamicLoss_scale == 1: + loss_scale_manager = DynamicLossScaleManager(init_loss_scale=65536, scale_factor=2, scale_window=2000) + else: + loss_scale_manager = FixedLossScaleManager(config.scale, drop_overflow_update=False) + + optim = Adam(params=net.trainable_params(), learning_rate=config.lr) + + load_pre_trained_checkpoint(net, config.pre_trained, config.pre_trained_ckpt_path) + model = Model(net, loss_fn=loss, optimizer=optim, metrics={'FlowNetEPE': FlowNetEPE()}, + amp_level="O0", keep_batchnorm_fp32=True, loss_scale_manager=loss_scale_manager) + # add callback + time_cb = TimeMonitor(data_size=step_size) + loss_cb = LossMonitor() + cbs = [time_cb, loss_cb] + ckpt_save_dir = set_save_ckpt_dir() + add_ckpt_callback(step_size, ckpt_save_dir, cbs) + add_eval_callback(model, ckpt_save_dir, cbs) + + model.train(config.epoch_size, train_dataset, callbacks=cbs, 
dataset_sink_mode=True) + +if __name__ == '__main__': + run_train()
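For reference, the warping computed by `Resample2dKernel` above (with the current defaults `kernel_size = 1`, `bilinear = true`) is plain backward warping with bilinear interpolation: each output pixel samples `input1` at the position displaced by the two-channel flow `input2`, with the four neighbouring sample positions clamped to the image border. The NumPy sketch below is not part of the repository and the name `resample2d_reference` is hypothetical; it only restates the same computation in Python and may be handy for spot-checking the compiled `resample2d.so` operator on small tensors.

```python
# Minimal NumPy sketch of the Resample2d forward pass (kernel_size=1, bilinear=True).
# Illustration only; not part of the FlowNet2 source tree.
import numpy as np


def resample2d_reference(input1, input2):
    """Warp input1 (N, C, H, W) by the flow field input2 (N, 2, H, W)."""
    n, _, h, w = input1.shape
    out = np.zeros_like(input1)
    for b in range(n):
        for y in range(h):
            for x in range(w):
                # Displaced sampling position: channel 0 of the flow is dx, channel 1 is dy.
                xf = x + input2[b, 0, y, x]
                yf = y + input2[b, 1, y, x]
                alpha = xf - np.floor(xf)
                beta = yf - np.floor(yf)
                # Clamp the four neighbours to the border, mirroring the max/min in the kernel.
                xl = int(np.clip(np.floor(xf), 0, w - 1))
                xr = int(np.clip(np.floor(xf) + 1, 0, w - 1))
                yt = int(np.clip(np.floor(yf), 0, h - 1))
                yb = int(np.clip(np.floor(yf) + 1, 0, h - 1))
                # Bilinear blend of the four neighbouring pixels for every channel at once.
                out[b, :, y, x] = ((1 - alpha) * (1 - beta) * input1[b, :, yt, xl]
                                   + alpha * (1 - beta) * input1[b, :, yt, xr]
                                   + (1 - alpha) * beta * input1[b, :, yb, xl]
                                   + alpha * beta * input1[b, :, yb, xr])
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.standard_normal((1, 3, 8, 8)).astype(np.float32)
    flow = rng.standard_normal((1, 2, 8, 8)).astype(np.float32)
    print(resample2d_reference(img, flow).shape)  # (1, 3, 8, 8)
```

On a small random input pair like the one above, the output of the custom GPU operator should agree with this reference up to float32 rounding, which makes it a convenient sanity check after compiling the custom ops.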