Unverified commit e6e458d0, authored by i-robot, committed by Gitee

!3244 CLIFF init

Merge pull request !3244 from anzhengqi/master
parents 42c88451 d9d1f5c1
Showing 838 additions and 0 deletions
# Contents
- [Introduction](#introduction)
- [Dataset](#dataset)
- [Requirements](#requirements)
- [Quick Start](#quick-start)
- [ModelZoo Homepage](#modelzoo-homepage)
## Introduction
<img src="assets/teaser.gif" width="100%">
*(This test video is from the 3DPW test set and is processed frame by frame, without temporal smoothing.)*
This repo contains the CLIFF demo code (implemented in MindSpore) for the following paper.
> CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation. \
> Zhihao Li, Jianzhuang Liu, Zhensong Zhang, Songcen Xu, and Youliang Yan ⋆ \
> ECCV 2022 Oral
<img src="assets/arch.png" width="100%">
## Dataset
Not applicable.
## Requirements
```bash
conda create -n cliff python=3.9
pip install -r requirements.txt
```
Download the pretrained checkpoints and the test samples to run the demo. Note that `requirements.txt` only lists `opencv-python` and `numpy`; MindSpore itself must be installed separately.
[[Baidu Pan](https://pan.baidu.com/s/15v0jnoyEpKIXWhh2AjAZeQ?pwd=7777)]
[[Google Drive](https://drive.google.com/drive/folders/1_d12Q8Yj13TEvB_4vopAbMdwJ1-KVR0R?usp=sharing)]
Finally, arrange the downloaded files according to the directory structure below:
```text
${ROOT}
|-- ckpt
|   |-- cliff-hr48-PA43.0_MJE69.0_MVE81.2_3dpw.ckpt
|   |-- cliff-res50-PA45.7_MJE72.0_MVE85.3_3dpw.ckpt
|-- data
|   |-- im07937.png
|   |-- smpl_mean_params.npz
```
## Quick Start
```bash
python demo.py --input_path PATH --ckpt CKPT
```
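For example, using the checkpoint and the test image from the directory structure above (these are also the default arguments in `demo.py`):
```bash
python demo.py --input_path data/im07937.png --ckpt ckpt/cliff-res50-PA45.7_MJE72.0_MVE85.3_3dpw.ckpt
```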
<img src="assets/im08036/im08036.png" width="24%">
<img src="assets/im08036/im08036_bbox.jpg" width="24%">
<img src="assets/im08036/im08036_front_view_cliff_hr48.jpg" width="24%">
<img src="assets/im08036/im08036_side_view_cliff_hr48.jpg" width="24%">
<img src="assets/im00492/im00492.png" width="24%">
<img src="assets/im00492/im00492_bbox.jpg" width="24%">
<img src="assets/im00492/im00492_front_view_cliff_hr48.jpg" width="24%">
<img src="assets/im00492/im00492_side_view_cliff_hr48.jpg" width="24%">
The demo options can be changed on the command line; see the argument descriptions at the bottom of `demo.py`.
## ModelZoo Homepage
Please check the official [homepage](https://gitee.com/mindspore/models).
# Contents
- [Introduction](#introduction)
- [Dataset](#dataset)
- [Requirements](#requirements)
- [Quick Start](#quick-start)
- [ModelZoo Homepage](#modelzoo-homepage)
## Introduction
<img src="assets/teaser.gif" width="100%">
*(This test video is from the 3DPW test set and is processed frame by frame, without temporal smoothing.)*
CLIFF (ECCV 2022 Oral) is a human motion capture algorithm based on monocular images, which achieves excellent results on several public datasets.
> CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation. \
> Zhihao Li, Jianzhuang Liu, Zhensong Zhang, Songcen Xu, and Youliang Yan ⋆ \
> ECCV 2022 Oral
<img src="assets/arch.png" width="100%">
## Dataset
Not applicable.
## Requirements
```bash
conda create -n cliff python=3.9
pip install -r requirements.txt
```
Download the pretrained checkpoints and the test samples to run the inference demo.
[[Baidu Pan](https://pan.baidu.com/s/15v0jnoyEpKIXWhh2AjAZeQ?pwd=7777)]
[[Google Drive](https://drive.google.com/drive/folders/1_d12Q8Yj13TEvB_4vopAbMdwJ1-KVR0R?usp=sharing)]
Place the pretrained checkpoints under the `ckpt` directory and the test samples under the `data` directory, forming the directory structure below:
```text
${ROOT}
|-- ckpt
|   |-- cliff-hr48-PA43.0_MJE69.0_MVE81.2_3dpw.ckpt
|   |-- cliff-res50-PA45.7_MJE72.0_MVE85.3_3dpw.ckpt
|-- data
|   |-- im07937.png
|   |-- smpl_mean_params.npz
```
## Quick Start
Run the script `demo.py` to perform inference:
```bash
python demo.py --input_path PATH --ckpt CKPT
```
<p float="left">
<img src="assets/im08036/im08036.png" width="24%">
<img src="assets/im08036/im08036_bbox.jpg" width="24%">
<img src="assets/im08036/im08036_front_view_cliff_hr48.jpg" width="24%">
<img src="assets/im08036/im08036_side_view_cliff_hr48.jpg" width="24%">
</p>
<p float="left">
<img src="assets/im00492/im00492.png" width="24%">
<img src="assets/im00492/im00492_bbox.jpg" width="24%">
<img src="assets/im00492/im00492_front_view_cliff_hr48.jpg" width="24%">
<img src="assets/im00492/im00492_side_view_cliff_hr48.jpg" width="24%">
</p>
The demo options can be changed on the command line; see the argument descriptions at the bottom of `demo.py`.
## ModelZoo Homepage
Please check the official [homepage](https://gitee.com/mindspore/models).
New binary files:
research/cv/CLIFF/assets/arch.png (351 KiB)
research/cv/CLIFF/assets/im00492/im00492.png (206 KiB)
research/cv/CLIFF/assets/im00492/im00492_bbox.jpg (68.3 KiB)
research/cv/CLIFF/assets/im00492/im00492_front_view_cliff_hr48.jpg (67.7 KiB)
research/cv/CLIFF/assets/im00492/im00492_side_view_cliff_hr48.jpg (6.31 KiB)
research/cv/CLIFF/assets/im08036/im08036.png (403 KiB)
research/cv/CLIFF/assets/im08036/im08036_bbox.jpg (121 KiB)
research/cv/CLIFF/assets/im08036/im08036_front_view_cliff_hr48.jpg (115 KiB)
research/cv/CLIFF/assets/im08036/im08036_side_view_cliff_hr48.jpg (12.8 KiB)
research/cv/CLIFF/assets/teaser.gif (2.62 MiB)
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from os.path import join
curr_dir = os.path.dirname(os.path.abspath(__file__))
SMPL_MEAN_PARAMS = join(curr_dir, '../data/smpl_mean_params.npz')
CROP_IMG_HEIGHT = 256
CROP_IMG_WIDTH = 192
CROP_ASPECT_RATIO = CROP_IMG_HEIGHT / float(CROP_IMG_WIDTH)
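# = 256 / 192 = 4/3: the crop is taller than it is wide (portrait aspect)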
# Mean and standard deviation for normalizing input image
IMG_NORM_MEAN = [0.485, 0.456, 0.406]
IMG_NORM_STD = [0.229, 0.224, 0.225]
# Copyright (c) 2019, University of Pennsylvania, Max Planck Institute for Intelligent Systems
# This script is borrowed and extended from SPIN
import cv2
import numpy as np
from common import constants
def get_transform(center, scale, res, rot=0):
    """Generate transformation matrix."""
    # res: (height, width), (rows, cols)
    crop_aspect_ratio = res[0] / float(res[1])
    h = 200 * scale
    w = h / crop_aspect_ratio
    t = np.zeros((3, 3))
    t[0, 0] = float(res[1]) / w
    t[1, 1] = float(res[0]) / h
    t[0, 2] = res[1] * (-float(center[0]) / w + .5)
    t[1, 2] = res[0] * (-float(center[1]) / h + .5)
    t[2, 2] = 1
    if rot != 0:
        rot = -rot  # To match direction of rotation from cropping
        rot_mat = np.zeros((3, 3))
        rot_rad = rot * np.pi / 180
        sn, cs = np.sin(rot_rad), np.cos(rot_rad)
        rot_mat[0, :2] = [cs, -sn]
        rot_mat[1, :2] = [sn, cs]
        rot_mat[2, 2] = 1
        # Need to rotate around center
        t_mat = np.eye(3)
        t_mat[0, 2] = -res[1] / 2
        t_mat[1, 2] = -res[0] / 2
        t_inv = t_mat.copy()
        t_inv[:2, 2] *= -1
        t = np.dot(t_inv, np.dot(rot_mat, np.dot(t_mat, t)))
    return t
def transform(pt, center, scale, res, invert=0, rot=0):
    """Transform pixel location to different reference."""
    t = get_transform(center, scale, res, rot=rot)
    if invert:
        t = np.linalg.inv(t)
    new_pt = np.array([pt[0] - 1, pt[1] - 1, 1.]).T
    new_pt = np.dot(t, new_pt)
    return np.array([round(new_pt[0]), round(new_pt[1])], dtype=int) + 1
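# Example (hypothetical values): transform([321, 241], center=np.array([320, 240]),
# scale=1.2, res=(256, 192)) maps the bbox center to (approximately) the crop
# center, returning [97, 129] in 1-based coordinates.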
def crop(img, center, scale, res):
    """
    Crop image according to the supplied bounding box.
    res: [rows, cols]
    """
    # Upper left point
    ul = np.array(transform([1, 1], center, scale, res, invert=1)) - 1
    # Bottom right point
    br = np.array(transform([res[1] + 1, res[0] + 1], center, scale, res, invert=1)) - 1
    new_shape = [br[1] - ul[1], br[0] - ul[0]]
    if len(img.shape) > 2:
        new_shape += [img.shape[2]]
    new_img = np.zeros(new_shape, dtype=np.float32)
    # Range to fill new array
    new_x = max(0, -ul[0]), min(br[0], len(img[0])) - ul[0]
    new_y = max(0, -ul[1]), min(br[1], len(img)) - ul[1]
    # Range to sample from original image
    old_x = max(0, ul[0]), min(len(img[0]), br[0])
    old_y = max(0, ul[1]), min(len(img), br[1])
    new_img[new_y[0]:new_y[1], new_x[0]:new_x[1]] = img[old_y[0]:old_y[1], old_x[0]:old_x[1]]
    new_img = cv2.resize(new_img, (res[1], res[0]))  # (cols, rows)
    return new_img
def bbox_from_detector(bbox, rescale=1.1):
    """
    Get the center and scale of a bounding box.
    The expected format is [min_x, min_y, max_x, max_y].
    """
    # center
    center_x = (bbox[0] + bbox[2]) / 2.0
    center_y = (bbox[1] + bbox[3]) / 2.0
    center = np.array([center_x, center_y])
    # scale
    bbox_w = bbox[2] - bbox[0]
    bbox_h = bbox[3] - bbox[1]
    bbox_size = max(bbox_w * constants.CROP_ASPECT_RATIO, bbox_h)
    scale = bbox_size / 200.0
    # adjust bounding box tightness
    scale *= rescale
    return center, scale
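# Example (hypothetical box): bbox_from_detector([100, 200, 300, 600]) returns
# center=[200., 400.] and scale=2.2, since bbox_size = max(200 * 4/3, 400) = 400
# and 400 / 200 * 1.1 = 2.2.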
def process_image(orig_img_rgb, bbox,
                  crop_height=constants.CROP_IMG_HEIGHT,
                  crop_width=constants.CROP_IMG_WIDTH):
    """
    Do preprocessing and possibly crop the image according to the bounding box.
    If a bounding box is given, use it to crop the image.
    If no bounding box is specified, assume the person is centered in the image.
    """
    if bbox is not None:
        center, scale = bbox_from_detector(bbox)
    else:
        # Assume that the person is centered in the image
        height = orig_img_rgb.shape[0]
        width = orig_img_rgb.shape[1]
        center = np.array([width // 2, height // 2])
        scale = max(height, width * crop_height / float(crop_width)) / 200.
    img = crop(orig_img_rgb, center, scale, (crop_height, crop_width))
    img = img / 255.
    mean = np.array(constants.IMG_NORM_MEAN, dtype=np.float32)
    std = np.array(constants.IMG_NORM_STD, dtype=np.float32)
    norm_img = (img - mean) / std
    norm_img = np.transpose(norm_img, (2, 0, 1))
    return norm_img, center, scale
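# Example (hypothetical input): for a 1920x1080 RGB frame with bbox=None,
# norm_img has shape (3, 256, 192), center == [960, 540], and
# scale == max(1080, 1920 * 256 / 192) / 200 == 12.8.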
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import argparse
import cv2
import mindspore
from mindspore import Tensor
import numpy as np
from models.cliff_res50 import MindSporeModel as cliff_res50
from common.imutils import process_image
from common import constants
def main(args):
    # load the model
    print("ckpt:", args.ckpt)
    cliff = cliff_res50()
    param_dict = mindspore.load_checkpoint(args.ckpt)
    mindspore.load_param_into_net(cliff, param_dict)
    # load and pre-process the image
    print("input_path:", args.input_path)
    img_bgr = cv2.imread(args.input_path)
    img_rgb = img_bgr[:, :, ::-1]
    norm_img, center, scale = process_image(img_rgb, bbox=None)
    norm_img = norm_img[np.newaxis, :, :, :]
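    # The bbox information below is (cx - w/2, cy - h/2, b): the bbox center
    # offset from the image center plus the bbox size, normalized by a focal
    # length estimated from the image diagonal, as described in the CLIFF paper.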
    # calculate the bbox info
    cx, cy, b = center[0], center[1], scale * 200
    img_h, img_w, _ = img_rgb.shape
    focal_length = (img_w * img_w + img_h * img_h) ** 0.5  # fov: 55 degree
    bbox_info = np.array([cx - img_w / 2., cy - img_h / 2., b], dtype=np.float32)
    bbox_info = bbox_info[np.newaxis, :]
    bbox_info[:, :2] = bbox_info[:, :2] / focal_length * 2.8  # [-1, 1]
    bbox_info[:, 2] = (bbox_info[:, 2] - 0.24 * focal_length) / (0.06 * focal_length)  # [-1, 1]
    # load the initial parameters
    mean_params = np.load(constants.SMPL_MEAN_PARAMS)
    init_pose = mean_params['pose'][np.newaxis, :].astype('float32')
    init_shape = mean_params['shape'][np.newaxis, :].astype('float32')
    init_cam = mean_params['cam'][np.newaxis, :].astype('float32')
    # feed-forward
    pred_rotmat_6d, pred_betas, pred_cam_crop = cliff(Tensor(norm_img), Tensor(bbox_info),
                                                      Tensor(init_pose), Tensor(init_shape), Tensor(init_cam))
    print("pred_rotmat_6d", pred_rotmat_6d)
    print("pred_betas", pred_betas)
    print("pred_cam_crop", pred_cam_crop)
    print("Inference finished successfully!")
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_path', default='data/im07937.png', help='path to the input data')
    parser.add_argument('--ckpt', default="ckpt/cliff-res50-PA45.7_MJE72.0_MVE85.3_3dpw.ckpt",
                        help='path to the pretrained checkpoint')
    arguments = parser.parse_args()
    main(arguments)
opencv-python>=4.6.0.66
numpy>=1.23.1