Multi-User Video Search: Bridging the Gap Between Text and Embedding Queries

Khai Trinh Xuan, Nguyen Nguyen Khoi, Huy Luong-Quang, Sang Hoa-Xuan, Anh Nguyen-Luong-Nam, Minh-Hung An

SOICT 2023 [Paper]

Pipeline

Dataset preparation

Dataset structure:

|- dict 
   |- ...
   |- faiss_clip_cosine.bin
   |- faiss_clipv2_cosine.bin
|- frontend
   |- ai
   |   |- public
   |   |   |- data
   |   |   |   |- KeyFrames
   |   |   |   |   |- L01
   |   |   |   |   |- L01_extra
   |   |   |   |   |- ...
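
Before starting the backend, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: it only checks the three paths named in the tree, and the helper name `missing_paths` is hypothetical.

```python
from pathlib import Path

# Paths taken from the dataset structure above, relative to the repo root.
EXPECTED_PATHS = [
    "dict/faiss_clip_cosine.bin",
    "dict/faiss_clipv2_cosine.bin",
    "frontend/ai/public/data/KeyFrames",
]


def missing_paths(repo_root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing dataset files or folders:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Dataset layout looks complete.")
```

Run it from the root of the repo; any path it prints still needs to be downloaded or extracted.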

Dict

Download the dict zip file: dict

Vector embeddings

Download bin file:

Keyframes

Download the keyframes zip files and extract them into the frontend/ai/public/data folder.
Data part 1:

Data part 2:

Data part 3:
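
The keyframe extraction step can also be scripted. This is a sketch under assumptions: the `downloads` folder and the helper name `extract_archives` are hypothetical, and only the target folder frontend/ai/public/data comes from the instructions above.

```python
import zipfile
from pathlib import Path

# Target folder named in the keyframes instructions.
KEYFRAMES_ROOT = Path("frontend/ai/public/data")


def extract_archives(archives, target=KEYFRAMES_ROOT):
    """Extract each zip archive into target; return the number of entries extracted."""
    target = Path(target)
    target.mkdir(parents=True, exist_ok=True)
    count = 0
    for archive in archives:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
            count += len(zf.namelist())
    return count


if __name__ == "__main__":
    # Hypothetical download location; adjust to where the zip files were saved.
    downloads = sorted(Path("downloads").glob("*.zip"))
    total = extract_archives(downloads)
    print(f"Extracted {total} entries into {KEYFRAMES_ROOT}")
```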

Raw video from AIChallenge 2023

Data part 1:

Data part 2:

AIC_VideoB2

Data part 3:

Dataset extraction

Details on dataset extraction: data

Installation

Backend

conda create -n AIChallenge2023
conda activate AIChallenge2023
pip install git+https://github.com/openai/CLIP.git
pip install -r requirements.txt

Frontend

Install nodejs: https://nodejs.org/en/download

npm install

DB Server

pip install flask
pip install flask-cors
pip install flask-socketio
pip install pyngrok==4.1.1
ngrok authtoken your_token # Add your ngrok authentication
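
To confirm that the packages above landed in the active environment, a short check like the following may help. It is an illustrative sketch: the helper name `missing_modules` is hypothetical, and the module names are the standard import names of the four packages installed above.

```python
import importlib.util

# Import names for the packages installed above
# (flask-cors -> flask_cors, flask-socketio -> flask_socketio).
DB_SERVER_MODULES = ["flask", "flask_cors", "flask_socketio", "pyngrok"]


def missing_modules(names=DB_SERVER_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All DB server dependencies found.")
```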

Usage

We recommend configuring the environment with Anaconda. Only Linux is supported.

Backend

On a local machine, from the root of the repo:

python3 app.py

On Google Colaboratory, run the App section of appNotebook.ipynb to start the backend.

Frontend

Change the URL in frontend/ai/src/helper/web_url.js.

cd frontend/ai/
npm run dev

DB Server

Open two terminals and run one command in each:

python appStorage.py

ngrok http 5000
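
If juggling two terminals is inconvenient, both commands can be started from one script. This is only a sketch and not part of the repository: `build_commands` and `run_all` are hypothetical helpers, and it assumes `ngrok` is on the PATH and that the script runs from the folder containing appStorage.py.

```python
import subprocess
import sys


def build_commands(python=sys.executable, port=5000):
    """Return the two DB server commands; port 5000 matches the ngrok command above."""
    return [
        [python, "appStorage.py"],
        ["ngrok", "http", str(port)],
    ]


def run_all(commands):
    """Start every command, wait for each to exit, and stop the rest on Ctrl+C."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        for proc in procs:
            proc.wait()
    except KeyboardInterrupt:
        pass
    finally:
        # Terminate anything still running so no process is left behind.
        for proc in procs:
            if proc.poll() is None:
                proc.terminate()
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    run_all(build_commands())
```

Both processes share one terminal here, so their output is interleaved; separate terminals remain the simpler option when the logs need to be read.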

Interface

Citation

If you have any questions, please open an issue or contact us: [email protected]

@inproceedings{10.1145/3628797.3628957,
author = {Trinh Xuan, Khai and Nguyen Khoi, Nguyen and Luong-Quang, Huy and Hoa-Xuan, Sang and Nguyen-Luong-Nam, Anh and An, Minh-Hung and Nguyen, Hong-Phuc},
title = {Multi-User Video Search: Bridging the Gap Between Text and Embedding Queries},
year = {2023},
isbn = {9798400708916},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3628797.3628957},
doi = {10.1145/3628797.3628957},
abstract = {Video search is a crucial task in the modern era, as the rapid growth of video platforms has led to an exponential increase in the number of videos on the internet. Effective video management is therefore essential. Significant research has been conducted on video search, with most approaches leveraging image-text retrieval or searching by object, speech, color, and text in images. However, these approaches can be inefficient when multiple users search for the same query simultaneously, as they may overlap in their search spaces. Additionally, most video search systems do not support complex queries that require information from multiple frames in a video. In this paper, we propose a solution to these problems by splitting the search space for different users and ignoring images that have already been considered by other users to avoid redundant searches. To address complex queries, we split the query and apply a technique called forward and backward search.},
booktitle = {Proceedings of the 12th International Symposium on Information and Communication Technology},
pages = {923–930},
numpages = {8},
keywords = {embedding-based search, interactive video retrieval, multi-user search engine, multimedia and multimodal retrieval, text-based search},
location = {Ho Chi Minh, Vietnam},
series = {SOICT '23}
}
