This repo provides the 2nd-place solution to the CVPR'23 LOVEU-AQTC challenge.
[Challenge Page] [Challenge Paper] [LOVEU@CVPR'23 Challenge] [CodaLab Leaderboard] [Technical Report]
(1) PyTorch. See https://pytorch.org/ for installation instructions. For example:

```shell
conda install pytorch torchvision torchtext cudatoolkit=11.3 -c pytorch
```

(2) PyTorch Lightning. See https://www.pytorchlightning.ai/ for installation instructions. For example:

```shell
pip install pytorch-lightning
```
Please replace the fairseq package file `fairseq/examples/MMPT/mmpt/models/mmfusion.py` with `/pretrain/mmfusion.py` from this repo.
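The file swap above can also be done with a short script. The snippet below is a sketch that additionally keeps a backup of the stock fairseq file; the paths are assumptions, taken relative to a directory that contains both this repo's `pretrain/` folder and the fairseq checkout.

```python
import os
import shutil

# Assumed layout: run from a directory holding both this repo's files
# and the fairseq checkout. Adjust the paths to your setup.
SRC = "pretrain/mmfusion.py"
DST = "fairseq/examples/MMPT/mmpt/models/mmfusion.py"

def replace_mmfusion(src=SRC, dst=DST):
    """Back up the original fairseq file, then overwrite it with ours."""
    if not (os.path.isfile(src) and os.path.isfile(dst)):
        raise FileNotFoundError("expected both the repo file and the fairseq file")
    shutil.copyfile(dst, dst + ".bak")  # keep a backup of the stock file
    shutil.copyfile(src, dst)           # install the patched version
```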
Download the training set and the testing set (without ground-truth labels) of the CVPR'23 LOVEU-AQTC challenge by filling in the [AssistQ Downloading Agreement].
Then carefully set your data path in the config file ;)
For segmenting videos by function, see `/encoder` for details.
We use pretrained S3D and VideoCLIP models to encode the videos and scripts; run `/pretrain/feature.sh` to extract the features.
`/sh/search_dim.sh` is the training script.
`/sh/search_inf.sh` is the inference script.
`ensemble_b.py` and `ensemble.py` are the scripts for ensembling results.
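The exact logic inside `ensemble.py` and `ensemble_b.py` is not reproduced here. A common approach for AQTC-style ranking tasks, sketched below purely as an assumption, is score-level ensembling: average each candidate answer's score across models and re-rank the candidates.

```python
# Hypothetical sketch of score-level ensembling. Each model produces, for
# every question, a list of scores over the candidate answers; we average
# the scores across models and rank candidates by the ensembled score.
def ensemble_scores(model_preds):
    """model_preds: list of dicts {question_id: [score per candidate answer]}.

    Returns {question_id: candidate indices ranked best-first}.
    """
    ensembled = {}
    for qid in model_preds[0]:
        per_model = [preds[qid] for preds in model_preds]
        # average each candidate's score across all models
        avg = [sum(col) / len(col) for col in zip(*per_model)]
        # rank candidate indices by averaged score, highest first
        ensembled[qid] = sorted(range(len(avg)), key=lambda i: -avg[i])
    return ensembled
```

Since the challenge evaluates ranked answer lists (recall at several cutoffs), returning a full ranking per question rather than a single top answer is convenient.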