Saeed Saadatnejad, Ali Rasekh, Mohammadreza Mofayezi, Yasamin Medghalchi, Sara Rajabzadeh, Taylor Mordan, Alexandre Alahi
[arXiv] [video] [poster] [live demo]
Predicting 3D human poses in real-world scenarios, also known as human pose forecasting, is inevitably subject to noisy inputs arising from inaccurate 3D pose estimations and occlusions. To address these challenges, we propose a diffusion-based approach that predicts future poses given noisy observations. We frame the prediction task as a denoising problem, where observation and prediction are treated as a single sequence containing missing elements (whether in the observation or the prediction horizon). All missing elements are treated as noise and denoised with our conditional diffusion model. To better handle long forecasting horizons, we present a temporal cascaded diffusion (TCD) model. We demonstrate the benefits of our approach on four publicly available datasets (Human3.6M, HumanEva-I, AMASS, and 3DPW), outperforming the state of the art. Additionally, we show that our framework is generic enough to improve any 3D pose prediction model, serving as a pre-processing step to repair their inputs and a post-processing step to refine their outputs.
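To make the framing concrete, here is a minimal sketch of the unified-sequence idea (not the repository's actual code; all names are illustrative): observation and prediction are packed into one sequence with a mask, so occluded observation frames and the entire future horizon start out as noise for the conditional diffusion model to denoise.

```python
import numpy as np

# Illustrative only: pack observation + future into one sequence with a mask.
# Frames where mask is False (occluded observations and the whole prediction
# horizon) are initialized with Gaussian noise; the conditional diffusion
# model's job is to denoise exactly those positions.
def build_noisy_sequence(observed, obs_mask, pred_len, rng=None):
    """observed: (T_obs, J, 3) poses; obs_mask: (T_obs,) bool, False = missing."""
    rng = rng or np.random.default_rng()
    T_obs, J, _ = observed.shape
    seq = np.zeros((T_obs + pred_len, J, 3))
    mask = np.zeros(T_obs + pred_len, dtype=bool)
    seq[:T_obs] = observed
    mask[:T_obs] = obs_mask  # only reliably observed frames are kept as-is
    noise = rng.standard_normal(seq.shape)
    seq = np.where(mask[:, None, None], seq, noise)
    return seq, mask
```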
The code requires Python 3.7 or later. The file requirements.txt contains the full list of required Python modules.
pip install -r requirements.txt
Human3.6M in exponential map format can be downloaded from here.
Directory structure:
H3.6m
|-- S1
|-- S5
|-- S6
|-- ...
|-- S11
Download AMASS and 3DPW from their official websites.
Specify the data path with the --data_dir argument.
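As a quick sanity check before training, you can verify the layout with something like the following (a hypothetical helper, not shipped with the repo; Human3.6M's standard subjects are S1, S5, S6, S7, S8, S9, and S11):

```python
import os

def check_h36m_layout(data_dir):
    # Human3.6M's standard subject folders; fail early if any are missing
    subjects = ["S1", "S5", "S6", "S7", "S8", "S9", "S11"]
    missing = [s for s in subjects
               if not os.path.isdir(os.path.join(data_dir, s))]
    if missing:
        raise FileNotFoundError(f"Missing subject folders in {data_dir}: {missing}")

check_h36m_layout("path/to/H3.6m")  # replace with your --data_dir value
```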
You need to train a short-term model and a long-term model using the following commands:
python main_tcd_h36m.py --mode train --epochs 50 --data all --joints 22 --input_n 50 --output_n 5 --data_dir data_dir --output_dir model_s
python main_tcd_h36m.py --mode train --epochs 50 --data all --joints 22 --input_n 55 --output_n 20 --data_dir data_dir --output_dir model_l
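Reading the input_n/output_n settings above, the two models plausibly combine as follows at inference time: the short-term model extends the 50 observed frames by 5, and the long-term model then conditions on those 55 frames to produce the remaining 20, giving the full 25-frame horizon. A hypothetical sketch (stub models stand in for the real denoisers; the repo wires this up inside main_tcd_h36m.py):

```python
import numpy as np

def cascade_predict(short_model, long_model, observed):
    """observed: (50, J, 3) noisy observation window."""
    short_future = short_model(observed)                          # (5, J, 3)
    extended = np.concatenate([observed, short_future], axis=0)   # (55, J, 3)
    long_future = long_model(extended)                            # (20, J, 3)
    return np.concatenate([short_future, long_future], axis=0)    # (25, J, 3)

# Toy usage: stubs that just repeat the last frame, to check shapes.
obs = np.random.randn(50, 22, 3)
pred = cascade_predict(lambda x: np.repeat(x[-1:], 5, axis=0),
                       lambda x: np.repeat(x[-1:], 20, axis=0),
                       obs)
print(pred.shape)  # (25, 22, 3)
```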
To evaluate the TCD model, run the following command. Specify the short-term and long-term model checkpoint directories with the --model_s and --model_l arguments.
python main_tcd_h36m.py --mode test --data all --joints 22 --input_n 50 --output_n 25 --data_dir data_dir --model_s model_s --model_l model_l --output_dir model_l
The results will be saved in a CSV file in the output directory.
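To inspect the metrics programmatically, something like the following works (the exact file name and column layout depend on the run, so this is only a sketch):

```python
import glob
import pandas as pd

# Load whichever CSV files the test run wrote to the output directory.
for path in glob.glob("model_l/*.csv"):
    df = pd.read_csv(path)
    print(path)
    print(df.head())
```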
You can train a model on AMASS dataset using the following command:
python main_amass.py --mode train --epochs 50 --dataset AMASS --data all --joints 18 --input_n 50 --output_n 25 --data_dir data_dir --output_dir model_amass
Then you can evaluate it on both AMASS and 3DPW datasets:
python main_amass.py --mode test --dataset AMASS --data all --joints 18 --input_n 50 --output_n 25 --data_dir data_dir --output_dir model_amass
python main_amass.py --mode test --dataset 3DPW --data all --joints 18 --input_n 50 --output_n 25 --data_dir data_dir --output_dir model_amass
The results will be saved in CSV files in the output directory.
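If you prefer one script over two shell invocations, the pair of evaluations above can be driven from Python with the same flags, looped over the dataset name:

```python
import subprocess

# Same arguments as the two test commands above, minus the dataset name.
common = ["python", "main_amass.py", "--mode", "test", "--data", "all",
          "--joints", "18", "--input_n", "50", "--output_n", "25",
          "--data_dir", "data_dir", "--output_dir", "model_amass"]
for dataset in ["AMASS", "3DPW"]:
    subprocess.run(common + ["--dataset", dataset], check=True)
```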
This repository is being updated, so stay tuned!
The overall code framework (data loading, training, testing, etc.) was adapted from HRI. The base of the diffusion model was borrowed from CSDI.
@INPROCEEDINGS{saadatnejad2023diffusion,
  author    = {Saeed Saadatnejad and Ali Rasekh and Mohammadreza Mofayezi and Yasamin Medghalchi and Sara Rajabzadeh and Taylor Mordan and Alexandre Alahi},
  title     = {A generic diffusion-based approach for {3D} human pose prediction in the wild},
  booktitle = {International Conference on Robotics and Automation (ICRA)},
  year      = {2023}
}
AGPL-3.0 license