Semantic Fusion Augmentation and Semantic Boundary Detection: A Novel Approach to Multi-Target Video Moment Retrieval
This is the official implementation of the paper Semantic Fusion Augmentation and Semantic Boundary Detection: A Novel Approach to Multi-Target Video Moment Retrieval (WACV 2024).
- Install Python packages.
$ pip install -r requirements.txt
- (Optional) Install the extension that speeds up the mAP calculation.
$ python setup.py install
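As an optional sanity check (not part of the official scripts), the snippet below verifies that the interpreter can see a GPU. It assumes PyTorch is among the pinned requirements, which this README does not state explicitly:

# Optional environment check; assumes PyTorch was installed via requirements.txt
# (an assumption, not something this README guarantees).
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Visible GPUs:", torch.cuda.device_count())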
- The datasets can be downloaded from our OneDrive.
- The folder structure should be like this:
./data
├── ActivityNet
│   ├── C3D
│   │   └── activitynet_v1-3_c3d.hdf5
│   ├── I3D
│   │   ├── v_00Dk03Jr70M.npy
│   │   ├── v_00KMCm2oGhk.npy
│   │   ├── ...
│   │   └── ...
│   ├── multi_test.json
│   ├── test.json
│   ├── train.json
│   └── val.json
├── CharadesSTA
│   ├── VGG
│   │   └── vgg_rgb_features.hdf5
│   ├── C3D
│   │   └── Charades_C3D.hdf5
│   ├── I3D
│   │   ├── 001YG.npy
│   │   ├── 003WS.npy
│   │   ├── ...
│   │   └── ...
│   ├── multi_test.json
│   ├── test.json
│   └── train.json
└── QVHighlights
    ├── features
    │   ├── clip_features
    │   ├── clip_text_features
    │   └── slowfast_features
    ├── test.json
    ├── train.json
    └── val.json
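If you want to confirm the download landed in the right place, a minimal layout check such as the following can help. It is only a sketch that tests a handful of the paths listed in the tree above and makes no assumption about the annotation format:

# Minimal check of the ./data layout shown above (not an official script).
# Only verifies that a few of the listed files exist on disk.
import os

expected = [
    "./data/ActivityNet/C3D/activitynet_v1-3_c3d.hdf5",
    "./data/ActivityNet/multi_test.json",
    "./data/CharadesSTA/VGG/vgg_rgb_features.hdf5",
    "./data/CharadesSTA/multi_test.json",
    "./data/QVHighlights/train.json",
]

for path in expected:
    if os.path.exists(path):
        print(f"OK      {path} ({os.path.getsize(path) / 2**20:.1f} MB)")
    else:
        print(f"MISSING {path}")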
All our training configuration files are in the ./configs folder. The training command is as follows:
- Single-GPU training.
$ python main.py --config path/to/config.json --logdir path/to/log/dir
For example, to train the model on the CharadesSTA dataset with the VGG backbone:
$ python main.py --config ./configs/charades-VGG.json --logdir ./logs/charades-VGG-log
- Multi-GPU training. For example, to train the model on the CharadesSTA dataset with the VGG backbone on 4 GPUs:
$ CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --config ./configs/charades-VGG.json --logdir ./logs/charades-VGG-log
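To queue several configurations back to back, the documented flags can also be driven from a small launcher script. This is only a sketch: it globs ./configs and derives each log directory from the config file name, mirroring the charades-VGG example above:

# Hypothetical launcher that chains the documented
# "python main.py --config ... --logdir ..." command over all configs.
import glob
import os
import subprocess

for config in sorted(glob.glob("./configs/*.json")):
    name = os.path.splitext(os.path.basename(config))[0]
    logdir = os.path.join("./logs", f"{name}-log")  # e.g. ./logs/charades-VGG-log
    print("Training", name)
    subprocess.run(
        ["python", "main.py", "--config", config, "--logdir", logdir],
        check=True,
    )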
The pretrained models used in our paper can be downloaded from OneDrive.
- It is recommended to put the pretrained models in the ./logs folder:
./logs
├── activity-C3D-log
│   └── ...
├── activity-I3D-log
│   └── ...
├── charades-C3D-log
│   └── ...
├── charades-I3D-log
│   └── ...
├── charades-VGG-log
│   ├── best.pth
│   └── config.json
└── qv-log
    └── ...
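Before evaluating, you can peek inside a downloaded checkpoint. The sketch below assumes best.pth is an ordinary PyTorch checkpoint file; the keys it prints are simply whatever the training code stored and are not documented in this README:

# Inspect a downloaded checkpoint (not an official script).
# Assumes best.pth is a regular PyTorch checkpoint; stored keys are not documented here.
import torch

ckpt = torch.load("./logs/charades-VGG-log/best.pth", map_location="cpu")
if isinstance(ckpt, dict):
    for key, value in ckpt.items():
        info = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
        print(key, info)
else:
    print("Checkpoint object of type:", type(ckpt).__name__)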
- Reproduce the results in our paper. Take the CharadesSTA dataset as an example:
$ python main.py --test_only --config ./logs/charades-VGG-log/config.json --logdir ./logs/charades-VGG-log
If you find this code useful for your research, please cite our paper:
@InProceedings{Huang_2024_WACV,
author = {Huang, Cheng and Wu, Yi-Lun and Shuai, Hong-Han and Huang, Ching-Chun},
title = {Semantic Fusion Augmentation and Semantic Boundary Detection: A Novel Approach to Multi-Target Video Moment Retrieval},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2024}
}