This repository provides the official PyTorch implementation of STMixer: A One-Stage Sparse Action Detector (CVPR 2023).
- PyTorch == 1.8 or 1.12 (other versions are not tested)
- tqdm
- yacs
- opencv-python
- tensorboardX
- scipy
- fvcore
- timm
- iopath
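With these requirements in mind, a minimal environment setup could look like the sketch below. The exact PyTorch build (CUDA version, torchvision pairing) is not pinned by the list above, so treat those details as assumptions and adjust them to your system.

```bash
# Minimal setup sketch (assumed CUDA build; adjust to your system).
# PyTorch 1.8 and 1.12 are the tested versions.
pip install torch==1.12.0
# Remaining dependencies from the list above.
pip install tqdm yacs opencv-python tensorboardX scipy fvcore timm iopath
```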
Please follow PySlowFast's DATASET.md for AVA dataset preparation.
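As a rough orientation, the layout produced by those instructions is typically along the lines of the sketch below; the exact directory names are assumptions, so follow DATASET.md as the authoritative reference.

```bash
# Assumed AVA layout after following PySlowFast's DATASET.md (verify against that guide):
# data/AVA/
#   frames/        extracted RGB frames, one folder per video
#   frame_lists/   train.csv, val.csv
#   annotations/   AVA annotation csv/pbtxt files
```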
| Backbone | Config | Pre-training Dataset | Frames | Sampling Rate | Model |
|---|---|---|---|---|---|
| SlowOnly-R50 | cfg | K400 | 4 | 16 | Link |
| SlowFast-R50 | cfg | K400 | 8 | 8 | Link |
| SlowFast-R101-NL | cfg | K600 | 8 | 8 | Link |
| ViT-B (VideoMAE) | cfg | K400 | 16 | 4 | Link |
| ViT-B (VideoMAEv2) | cfg | K710+K400 | 16 | 4 | Link |
python -m torch.distributed.launch --nproc_per_node=8 train_net.py --config-file "config_files/config_file.yaml" --transfer --no-head --use-tfboard
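The command above assumes an 8-GPU node; a variant for fewer GPUs is sketched below. The config filename is a hypothetical placeholder of my own, and the flag meanings given in the comments are assumptions (transfer pre-trained backbone weights, skip loading the classification head, enable TensorBoard logging); check train_net.py for the authoritative definitions.

```bash
# Sketch of a 4-GPU run; the config filename is a hypothetical placeholder, pick the
# actual file from config_files/ matching the model-zoo entry you want to train.
# Assumed flag meanings (check train_net.py): --transfer loads pre-trained weights,
# --no-head skips loading the classification head, --use-tfboard enables TensorBoard logs.
python -m torch.distributed.launch --nproc_per_node=4 train_net.py \
    --config-file "config_files/slowfast_r50_stmixer.yaml" \
    --transfer --no-head --use-tfboard
```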
python -m torch.distributed.launch --nproc_per_node=8 test_net.py --config-file "config_files/config_file.yaml" MODEL.WEIGHT "/path/to/model"
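For example, to evaluate one of the checkpoints from the model zoo above, point MODEL.WEIGHT at the downloaded file; both paths below are hypothetical placeholders.

```bash
# Sketch: evaluate a downloaded model-zoo checkpoint; the config and checkpoint paths
# are hypothetical placeholders for the files you actually downloaded.
python -m torch.distributed.launch --nproc_per_node=8 test_net.py \
    --config-file "config_files/slowfast_r50_stmixer.yaml" \
    MODEL.WEIGHT "checkpoints/stmixer_slowfast_r50.pth"
```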
We would like to thank Ligeng Chen for his help in drawing the figures in the paper and Lei Chen for her support in the experiments. This project is built upon AlphAction, AdaMixer and PySlowFast. We also reference and use some code from Sparse R-CNN, WOO and VideoMAE. Sincere thanks to the contributors of these excellent codebases.
If this project helps your research, please consider citing our paper:
@inproceedings{wu2023stmixer,
  title={STMixer: A One-Stage Sparse Action Detector},
  author={Tao Wu and Mengqi Cao and Ziteng Gao and Gangshan Wu and Limin Wang},
  booktitle={{CVPR}},
  year={2023}
}