by Haozhe Qi, Chen Zhao, Mathieu Salzmann, Alexander Mathis, EPFL (Switzerland).
- We show that HOISDF achieves state-of-the-art results on hand-object pose estimation benchmarks (DexYCB and HO3Dv2).
- We introduce a hand-object pose estimation network that uses signed distance fields (HOISDF) to incorporate implicit 3D shape information.
- This repo contains the official PyTorch implementation of HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields, published at CVPR'24.
News:
- August 2024: We also shared additional data: the rendered images and segmentation masks that we use to train our model on HO3Dv2, as well as preprocessed SDF samples and rendered data for HO3Dv2.
- July 2024: We shared preprocessed data of the interacting objects, SDF samples, & trained model weights on Zenodo!
- June 2024: We presented the poster at CVPR in Seattle.
- June 2024: We presented the poster at FENS in Vienna.
- Clone the current repo:
git clone git@github.com:amathislab/HOISDF.git
- Set up the conda environment:
conda create --name hoisdf python=3.9
conda activate hoisdf
# install the PyTorch version compatible with your CUDA version
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116
pip install -r requirements.txt
- Download the MANO model files (MANO_LEFT.pkl and MANO_RIGHT.pkl) from the website and place them in the tool/mano_models folder.
- Download the YCB models from here and set object_models_dir in config.py to point to the dataset folder. The original mesh models are large and have different numbers of vertices for different objects. To enable batched inference, we additionally use simplified object models with 1000 vertices. Download the simplified models from here and set simple_object_models_dir in config.py to point to the dataset folder.
- Download the processed annotation files for both datasets from here and set annotation_dir in config.py to point to the processed data folder.
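Before training or evaluating, it can help to confirm that the configured paths resolve. The following is a minimal sketch and not part of the repository: the helper `find_missing_paths` is hypothetical, the dictionary keys only mirror the config.py variable names mentioned above, and the placeholder paths must be replaced with your own. How config.py actually stores these values is not assumed here.

```python
import os


def find_missing_paths(paths):
    """Return the sorted labels of entries whose path does not exist on disk.

    `paths` maps a label (for example a config.py variable name) to a path.
    Hypothetical helper, not part of the HOISDF code base.
    """
    return sorted(name for name, path in paths.items() if not os.path.exists(path))


if __name__ == "__main__":
    # Placeholder locations; replace them with the values from your config.py.
    example = {
        "object_models_dir": "/path/to/ycb_models",
        "simple_object_models_dir": "/path/to/simplified_ycb_models",
        "annotation_dir": "/path/to/annotations",
        "MANO_LEFT.pkl": os.path.join("tool", "mano_models", "MANO_LEFT.pkl"),
        "MANO_RIGHT.pkl": os.path.join("tool", "mano_models", "MANO_RIGHT.pkl"),
    }
    for name in find_missing_paths(example):
        print(f"missing: {name}")
```

An empty result means every listed path exists; it does not verify that the folders contain the expected files.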
Depending on the dataset you intend to train/evaluate, follow the instructions below for the setup.
- Download the HO3Dv2 dataset from the website and set ho3d_data_dir in config.py to point to the dataset folder.
- Obtain Signed-Distance-Field (SDF) files for every sample. This is only needed for the training set. You can obtain them in either of the following ways. Set fast_data_dir in config.py to point to the processed SDF folder.
- If you want to train HOISDF with the rendered images, download the rendered data including the images from here as well as the SDF files from here and put them into the fast_data_dir folder.
- Download the DexYCB dataset from the website and set dexycb_data_dir in config.py to point to the dataset folder.
- Obtain Signed-Distance-Field (SDF) files for every sample. This is needed for both the training and test sets. You can obtain them in either of the following ways. Set fast_data_dir in config.py to point to the processed SDF folder.
  - Download the processed SDF files for the DexYCB test set from here and for the DexYCB full test set from here. The processed SDF files for the DexYCB training set and full training set are too big to share on Zenodo, so we encourage you to generate them yourself.
  - Follow the AlignSDF repo to generate the original SDF files. Then use the tool/pre_process_sdf.py script to process the SDF data.
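For readers new to the term, a signed distance field assigns every 3D point its distance to a surface, with the sign indicating on which side of the surface the point lies. The sketch below illustrates the idea for an analytic sphere. It assumes the common convention of negative values inside and positive values outside, and the function name `sphere_sdf` is made up for this example. It is not the AlignSDF pipeline or tool/pre_process_sdf.py, which work on hand and object meshes.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to the surface of a sphere.

    Convention assumed here: negative inside, zero on the surface,
    positive outside.
    """
    return math.dist(point, center) - radius


if __name__ == "__main__":
    # One point inside, one on the surface, one outside the unit sphere.
    for p in [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]:
        print(p, sphere_sdf(p))
```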
Depending on the dataset you intend to evaluate, follow the instructions below. To test the model with our trained weights, you can download the weights from the links provided here and put them in the ckpts folder.
- In config.py, modify the setting parameter.
  - setting = 'ho3d' for evaluating the model trained only on the HO3Dv2 training set.
  - setting = 'ho3d_render' for evaluating the model also trained on the rendered data.
- Run the command that matches the chosen setting:
python main/test.py --ckpt_path ckpts/ho3d/snapshot_ho3d.pth.tar # for ho3d setting
python main/test.py --ckpt_path ckpts/ho3d_render/snapshot_ho3d_render.pth.tar # for ho3d_render setting
- The results are dumped into a results.txt file in the folder containing the checkpoint.
- Also dumped is a pred_mano.json file, which can be submitted to the HO-3D (v2) challenge after zipping the file.
- Hand pose estimation accuracy in the HO-3D challenge leaderboard: here, user: inspire
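The zipping step can be done with any archive tool. Below is a minimal sketch using Python's standard zipfile module. The helper name `zip_prediction` and the default archive name are assumptions for illustration; check the challenge page for any naming requirements, which are not covered here.

```python
import os
import zipfile


def zip_prediction(json_path, zip_path=None):
    """Write `json_path` into a zip archive and return the archive path.

    By default the archive sits next to the JSON file and uses the same
    base name with a .zip suffix (an assumption, not a challenge rule).
    """
    if zip_path is None:
        zip_path = os.path.splitext(json_path)[0] + ".zip"
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store the file at the archive root, without its directory prefix.
        zf.write(json_path, arcname=os.path.basename(json_path))
    return zip_path
```

For the ho3d setting with the checkpoint layout shown above, a call such as `zip_prediction("ckpts/ho3d/pred_mano.json")` would write the archive into the same folder.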
- In config.py, modify the setting parameter.
  - setting = 'dexycb' for evaluating the model trained only on the DexYCB split, which includes only right-hand data.
  - setting = 'dexycb_full' for evaluating the model trained on the DexYCB Full split, which includes both right-hand and left-hand data.
- Run the command that matches the chosen setting:
python main/test.py --ckpt_path ckpts/dexycb/snapshot_dexycb.pth.tar # for dexycb setting
python main/test.py --ckpt_path ckpts/dexycb_full/snapshot_dexycb_full.pth.tar # for dexycb_full setting
- The results are dumped into a results.txt file in the folder containing the checkpoint.
- For the dexycb_full setting, additional hand mesh results are shown in the results.txt file (Table 3 in the paper).
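The four test commands above follow one naming pattern for the checkpoint path. As a small sketch, the hypothetical helper `default_ckpt_path` below rebuilds that path from the setting name. It assumes the downloaded weights are stored under ckpts exactly as in the commands above, and it is not part of the repository.

```python
# The four values of `setting` named in this README.
KNOWN_SETTINGS = ("ho3d", "ho3d_render", "dexycb", "dexycb_full")


def default_ckpt_path(setting, ckpt_root="ckpts"):
    """Build the checkpoint path used in the test commands of this README."""
    if setting not in KNOWN_SETTINGS:
        raise ValueError(f"unknown setting: {setting!r}")
    return f"{ckpt_root}/{setting}/snapshot_{setting}.pth.tar"


if __name__ == "__main__":
    for name in KNOWN_SETTINGS:
        print(f"python main/test.py --ckpt_path {default_ckpt_path(name)}")
```

Remember that the setting value in config.py still has to match the checkpoint you pass on the command line.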
Depending on the dataset you intend to train, follow the instructions below.
- Set output_dir in config.py to point to the directory where the checkpoints will be saved.
- In config.py, modify the setting parameter.
  - setting = 'ho3d' for training the model on the HO3Dv2 training set.
  - setting = 'ho3d_render' for training the model also on the rendered data.
  - setting = 'dexycb' for training the model on the DexYCB split, which includes only right-hand data.
  - setting = 'dexycb_full' for training the model on the DexYCB Full split, which includes both right-hand and left-hand data.
- Run the following command, setting CUDA_VISIBLE_DEVICES and --gpu to the desired GPU ids. Here is an example command for training on two GPUs:
CUDA_VISIBLE_DEVICES=0,1 python main/train.py --run_dir_name test --gpu 0,1
- To continue training from the last saved checkpoint, add the --continue argument to the above command.
- The checkpoints are dumped after every epoch in the 'output' folder of the base directory.
- TensorBoard logging is also available in the 'output' folder.
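If you launch training from a script, the command above can be assembled programmatically. The sketch below is a hypothetical helper, not part of the repository. It uses only the flags shown above (--run_dir_name, --gpu, --continue) and assumes, mirroring the two-GPU example, that the same comma-separated ids are passed to both CUDA_VISIBLE_DEVICES and --gpu.

```python
import os
import shlex


def build_train_command(gpu_ids, run_dir_name, resume=False):
    """Return (env, argv) for launching main/train.py on the given GPUs.

    Assumption: the same ids go to CUDA_VISIBLE_DEVICES and --gpu,
    as in the example command of this README.
    """
    ids = ",".join(str(i) for i in gpu_ids)
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=ids)
    argv = ["python", "main/train.py", "--run_dir_name", run_dir_name, "--gpu", ids]
    if resume:
        argv.append("--continue")  # resume from the last saved checkpoint
    return env, argv


if __name__ == "__main__":
    env, argv = build_train_command([0, 1], "test")
    # Print the equivalent shell command instead of starting a training run.
    print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']} {shlex.join(argv)}")
```

To actually start the run from Python, the pair can be handed to `subprocess.run(argv, env=env, check=True)` from the repository root.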
If you find our code or ideas useful, please cite:
@inproceedings{qi2024hoisdf,
title={HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields},
author={Qi, Haozhe and Zhao, Chen and Salzmann, Mathieu and Mathis, Alexander},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={10392--10402},
year={2024}
}
Link to CVPR article: HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields
- Some of the code has been reused from the Keypoint Transformer, HFL-Net, DenseMutualAttention, and AlignSDF repositories. We thank the authors for sharing their excellent work!
- Our work was funded by EPFL and Microsoft Swiss Joint Research Center (H.Q., A.M.) and a Boehringer Ingelheim Fonds PhD stipend (H.Q.). We are grateful to Niels Poulsen for comments on an earlier version of this manuscript. We also sincerely thank Rong Wang, Wei Mao and Hongdong Li for sharing the hand-object rendering pipeline.