Aoyu Gong, Sepehr Mousavi, Yiting Xia, Savvas Zannettou
This repository contains the code for our paper:
ClipMind: A Framework for Auditing Short-Format Video Recommendations Using Multimodal AI Models
💡 If you have Conda installed, you may skip this section and proceed to the next one.
Follow these steps to set up a reproducible environment:
```bash
wget https://github.com/conda-forge/miniforge/releases/download/24.11.0-0/Miniforge3-24.11.0-0-Linux-x86_64.sh
bash Miniforge3-24.11.0-0-Linux-x86_64.sh -b -p ~/miniforge3
~/miniforge3/bin/conda init bash
source ~/.bashrc
```

To ensure Conda is initialized in login shells, add the following to `~/.bash_profile`:

```bash
echo 'source ~/.bashrc' >> ~/.bash_profile
```

If `~/.bash_profile` already exists, make sure it includes this line:

```bash
source ~/.bashrc
```

Then verify the Conda installation, create the project environment, and install the pinned dependencies:

```bash
conda --version
conda env create -f environment.yml
conda activate clipmind
pip install torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
conda install -c conda-forge ffmpeg=4.3.1 git-lfs
```

💡 FFmpeg version 4 is required for compatibility.
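After installation, you can sanity-check that the pinned FFmpeg is the one resolved inside the environment (a quick check, assuming the `clipmind` environment is active):

```bash
# Should report the conda-forge 4.3.1 build installed above
ffmpeg -version | head -n 1
```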
Update the following fields in `configuration.yaml`:

- `openai.api_key`: Insert your OpenAI API key.
- `working_trace`: Path to your short-format video trace directory.

💡 A default `test` trace is provided for demonstration purposes.
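As a rough sketch, the two fields might look like this (this assumes `openai.api_key` is a nested key; check the shipped `configuration.yaml` for the exact layout and any additional fields):

```yaml
# Hypothetical excerpt of configuration.yaml; only the two fields
# described above are shown, and the shipped file may contain more.
openai:
  api_key: "sk-..."          # your OpenAI API key
working_trace: "data/test"   # active trace directory (the bundled test trace)
```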
To analyze your short-format video traces, organize your data using the following folder structure:
```
./ClipMind/
└── data/
    └── your_trace_name/
        ├── metadata/        # Video metadata
        ├── videos/          # Video files
        └── viewing.json     # A JSON file with timestamped viewing history
```
💡 A default `test` trace is provided for demonstration purposes.
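The exact schema of `viewing.json` is defined by the repository; the sketch below is purely illustrative, and the field names (`video_id`, `watch_time`) are placeholders rather than the actual schema:

```json
[
  {"video_id": "7301234567890123456", "watch_time": "2025-01-15T14:32:07Z"},
  {"video_id": "7301234567890123457", "watch_time": "2025-01-15T14:32:41Z"}
]
```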
The `working_trace` field in `configuration.yaml` specifies the active data directory used by the framework.

- **Phase 1 – Calibration Trace:** Start by setting `working_trace` to a trace you want to use for sampling and annotation. This trace is used to identify the best feature combination and similarity threshold. After running the notebook `identify_best_features_threshold.ipynb`, the best parameters will be written back into `configuration.yaml`.
- **Phase 2 – Analysis Traces:** You can now switch `working_trace` to other traces you wish to analyze (see the example after this list). The notebook `video_sequence_analysis.ipynb` will apply the identified parameters to audit short-format video recommendations in those traces.
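Switching traces between phases is just an edit to `configuration.yaml`; for example, with GNU `sed` (assuming `working_trace` is a top-level key holding the trace path, as in the sketch above):

```bash
# Phase 1: calibrate on the bundled test trace
sed -i 's|^working_trace:.*|working_trace: "data/test"|' configuration.yaml
# ...run the calibration notebooks, then switch to the trace you want to analyze...
sed -i 's|^working_trace:.*|working_trace: "data/your_trace_name"|' configuration.yaml
```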
The following list outlines the recommended notebook execution order across the two phases:
1. `setup.ipynb`
2. `convert_video_to_audio.ipynb`
3. `llm_generated_description.ipynb`
4. `user_defined_metadata.ipynb`
5. `llm_generated_keywords.ipynb`
6. `text_embedding.ipynb`
7. `sampling.ipynb`
8. `annotation.ipynb`
9. `identify_best_features_threshold.ipynb`
10. `video_sequence_analysis.ipynb`
💡 Use Jupyter or VSCode to execute notebooks interactively.
For the two phases, run different subsets of notebooks depending on whether you are identifying the best parameters or analyzing new traces:
- **Prepare AI Models:** Run notebook 1
- **Phase 1 – Calibration Trace:** Run notebooks 2 → 9
- **Phase 2 – Analysis Traces:** Run notebooks 2 → 6 (prepare embeddings), then notebook 10 (analyze new traces)
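If you prefer not to step through each notebook by hand, they can also be executed headlessly with `jupyter nbconvert` (assuming Jupyter is installed in the `clipmind` environment; note that the sampling and annotation steps may expect manual input, so interactive execution is the safer default):

```bash
# Phase 2 example: rebuild embeddings (notebooks 2-6), then analyze (notebook 10)
for nb in convert_video_to_audio llm_generated_description user_defined_metadata \
          llm_generated_keywords text_embedding video_sequence_analysis; do
    jupyter nbconvert --to notebook --execute --inplace "${nb}.ipynb"
done
```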
If you find the codebase helpful, please consider giving a ⭐ and citing our paper:
```bibtex
@inproceedings{gong2025clipmind,
  title={ClipMind: A Framework for Auditing Short-Format Video Recommendations Using Multimodal AI Models},
  author={Gong, Aoyu and Mousavi, Sepehr and Xia, Yiting and Zannettou, Savvas},
  booktitle={Proceedings of the International AAAI Conference on Web and Social Media},
  volume={19},
  pages={671--687},
  year={2025}
}
```

If you run into problems or have suggestions, feel free to open an issue or reach out to us.
