zcfinal/LOVEU-CVPR23-AQTC

A solution to the CVPR'23 LOVEU-AQTC challenge

Video Alignment for Multi-step Inference

This repo provides the 2nd-place solution to the CVPR'23 LOVEU-AQTC challenge.

[Challenge Page] [Challenge Paper] [LOVEU@CVPR'23 Challenge] [CodaLab Leaderboard] [Technical Report]

Install

(1) PyTorch. See https://pytorch.org/ for instructions. For example,

conda install pytorch torchvision torchtext cudatoolkit=11.3 -c pytorch

(2) PyTorch Lightning. See https://www.pytorchlightning.ai/ for instructions. For example,

pip install pytorch-lightning

(3) VideoCLIP

After installing VideoCLIP (fairseq/examples/MMPT), replace the package file fairseq/examples/MMPT/mmpt/models/mmfusion.py with /pretrain/mmfusion.py from this repo.
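The replacement can be scripted as below. The paths assume you run from a directory containing both the fairseq checkout and this repo's /pretrain folder; adjust them to your layout.

```shell
# Swap in the patched mmfusion.py, keeping a backup of the original.
# Paths are assumptions based on the layout above; adjust as needed.
ORIG=fairseq/examples/MMPT/mmpt/models/mmfusion.py
PATCHED=pretrain/mmfusion.py

if [ -f "$ORIG" ] && [ -f "$PATCHED" ]; then
    cp "$ORIG" "$ORIG.bak"    # back up the package file
    cp "$PATCHED" "$ORIG"     # apply the patched model file
else
    echo "Run from the repository root: expected $ORIG and $PATCHED"
fi
```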

Data

Download the training set and testing set (without ground-truth labels) of the CVPR'22 LOVEU-AQTC challenge by filling in the [AssistQ Downloading Agreement].

Then carefully set your data path in the config file ;)
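As a sketch, the data section of the config might look like the fragment below; the key names here are hypothetical, so match them to the actual config file in this repo.

```yaml
# Hypothetical key names -- check them against the repo's config file.
DATASET:
  TRAIN_ROOT: /path/to/assistq/train  # extracted training set
  TEST_ROOT: /path/to/assistq/test    # testing set (no ground-truth labels)
```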

Encoding

For segmenting videos into function-based clips, see /encoder for details.

We use pretrained S3D and VideoCLIP models to encode the videos and scripts.

/pretrain/feature.sh is the script that runs the encoding.
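A typical invocation, assuming you run from the repository root with the data paths already set in the config:

```shell
# Run S3D + VideoCLIP feature extraction (script path from this repo).
ENCODE=pretrain/feature.sh
if [ -f "$ENCODE" ]; then
    bash "$ENCODE"
else
    echo "Run from the repository root: $ENCODE not found"
fi
```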

Training & Evaluation

/sh/search_dim.sh is the training script.

/sh/search_inf.sh is the inference script.

ensemble_b.py and ensemble.py are the scripts for ensembling results.
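Putting the steps together, a full run might look like the sketch below; it is guarded so each step only runs if the script is present, and any script arguments are left at their in-file defaults.

```shell
# Train, run inference, then ensemble the predictions.
# Script paths are from this repo; run from the repository root.
TRAIN=sh/search_dim.sh
INFER=sh/search_inf.sh

[ -f "$TRAIN" ] && bash "$TRAIN"               # training
[ -f "$INFER" ] && bash "$INFER"               # inference
[ -f ensemble.py ] && python ensemble.py       # ensemble results
[ -f ensemble_b.py ] && python ensemble_b.py   # ensembling variant
```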
