Khai Trinh Xuan, Nguyen Nguyen Khoi, Huy Luong-Quang, Sang Hoa-Xuan, Anh Nguyen-Luong-Nam, Minh-Hung An
SOICT 2023 [Paper]
Dataset structure:
|- dict
|  |- ...
|- faiss_clip_cosine.bin
|- faiss_clipv2_cosine.bin
|- frontend
|  |- ai
|  |  |- public
|  |  |  |- data
|  |  |  |  |- KeyFrames
|  |  |  |  |  |- L01
|  |  |  |  |  |- L01_extra
|  |  |  |  |  |- ...
Download the dict zip file: dict
Download the bin files:
Download the keyframes zip files and extract them to the folder frontend/ai/public/data.
Data part 1:
Data part 2:
Data part 3:
- AIC_KeyframesB3_Reduced
- AIC_KeyframesB2_Extra_Reduced_0
- AIC_KeyframesB2_Extra_Reduced_1
- AIC_KeyframesB2_Extra_Reduced_2
Data part 1:
Data part 2:
Data part 3:
Details on dataset extraction: data
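The faiss_clip_cosine.bin and faiss_clipv2_cosine.bin files are FAISS indexes used for cosine-similarity search over CLIP keyframe embeddings. Conceptually, the lookup they accelerate is the one in this toy sketch (plain Python, made-up vectors; the real system loads the index with FAISS and queries it with actual CLIP embeddings):

```python
# Conceptual sketch of cosine-similarity search, i.e. what an index like
# faiss_clip_cosine.bin computes. The embeddings below are toy values,
# not real CLIP vectors.
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_search(index_vectors, query, k=1):
    """Return the indices of the k vectors most similar to the query."""
    q = normalize(query)
    scores = [
        (sum(a * b for a, b in zip(normalize(v), q)), i)
        for i, v in enumerate(index_vectors)
    ]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]

# Toy "keyframe embeddings" and a query embedding.
frames = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [0.6, 0.8]
print(cosine_search(frames, query, k=2))  # → [2, 1], most similar first
```

FAISS performs the same ranking, but over hundreds of thousands of keyframe embeddings with optimized index structures instead of a linear scan.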
conda create -n AIChallenge2023
conda activate AIChallenge2023
pip install git+https://github.com/openai/CLIP.git
pip install -r requirements.txt
Install nodejs: https://nodejs.org/en/download
npm install
pip install flask
pip install flask-cors
pip install flask-socketio
pip install pyngrok==4.1.1
ngrok authtoken your_token # Add your ngrok authentication
It is recommended to configure the environment using Anaconda. Only Linux is supported.
On a local machine, run from the root of the repo:
python3 app.py
On Colaboratory, run appNotebook (the App section) to start the backend.
Change the URL in frontend/ai/src/helper/web_url.js.
cd frontend/ai/
npm run dev
Open two terminals and run:
python appStorage.py
ngrok http 5000
If you have any questions, please leave an issue or contact us: [email protected]
@inproceedings{10.1145/3628797.3628957,
author = {Trinh Xuan, Khai and Nguyen Khoi, Nguyen and Luong-Quang, Huy and Hoa-Xuan, Sang and Nguyen-Luong-Nam, Anh and An, Minh-Hung and Nguyen, Hong-Phuc},
title = {Multi-User Video Search: Bridging the Gap Between Text and Embedding Queries},
year = {2023},
isbn = {9798400708916},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3628797.3628957},
doi = {10.1145/3628797.3628957},
abstract = {Video search is a crucial task in the modern era, as the rapid growth of video platforms has led to an exponential increase in the number of videos on the internet. Effective video management is therefore essential. Significant research has been conducted on video search, with most approaches leveraging image-text retrieval or searching by object, speech, color, and text in images. However, these approaches can be inefficient when multiple users search for the same query simultaneously, as they may overlap in their search spaces. Additionally, most video search systems do not support complex queries that require information from multiple frames in a video. In this paper, we propose a solution to these problems by splitting the search space for different users and ignoring images that have already been considered by other users to avoid redundant searches. To address complex queries, we split the query and apply a technique called forward and backward search.},
booktitle = {Proceedings of the 12th International Symposium on Information and Communication Technology},
pages = {923–930},
numpages = {8},
keywords = {embedding-based search, interactive video retrieval, multi-user search engine, multimedia and multimodal retrieval, text-based search},
location = {Ho Chi Minh, Vietnam},
series = {SOICT '23}
}