diff --git a/tutorials/video/getting-started/video_pipeline_tutorial.ipynb b/tutorials/video/getting-started/video_pipeline_tutorial.ipynb new file mode 100644 index 000000000..b10c221ad --- /dev/null +++ b/tutorials/video/getting-started/video_pipeline_tutorial.ipynb @@ -0,0 +1,693 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Video Pipeline Tutorial with NeMo Curator\n", + "\n", + "This notebook demonstrates how to use NeMo Curator's video curation pipeline to process videos, extract clips, generate embeddings, and create captions.\n", + "\n", + "## Table of Contents\n", + "1. [Installation and Setup](#installation-and-setup)\n", + "2. [Understanding the Video Pipeline](#understanding-the-video-pipeline)\n", + "3. [Basic Example: Reading Videos](#basic-example-reading-videos)\n", + "4. [Advanced Example: Complete Video Processing](#advanced-example-complete-video-processing)\n", + "5. [Pipeline Parameters Explained](#pipeline-parameters-explained)\n", + "6. [Troubleshooting](#troubleshooting)\n", + "\n", + "---\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Installation and Setup\n", + "\n", + "### Prerequisites\n", + "\n", + "Before running the video pipeline, ensure you have:\n", + "\n", + "- **NVIDIA GPU** with Volta™ or higher (compute capability 7.0+)\n", + "- **CUDA 12 or above**\n", + "- **FFmpeg 7+** (will be installed using the provided script)\n", + "\n", + "### System Requirements\n", + "\n", + "- **Memory**: 16GB+ RAM for basic processing\n", + "- **GPU Memory**: 16GB+ VRAM recommended (up to 38GB for full pipeline with captions)\n", + "- **Storage**: Sufficient space for input videos and output clips\n", + "\n", + "### Installation Steps\n", + "\n", + "1. **Install FFmpeg:**\n", + "First, install FFmpeg using the provided installation script:\n", + "```bash\n", + "# Download and run the FFmpeg installation script\n", + "curl -O https://raw.githubusercontent.com/NVIDIA-NeMo/Curator/main/docker/common/install_ffmpeg.sh\n", + "chmod +x install_ffmpeg.sh\n", + "./install_ffmpeg.sh\n", + "```\n", + "\n", + "2. **Install UV (if not already installed):**\n", + "UV is a fast Python package installer and resolver that's significantly faster than pip:\n", + "```bash\n", + "# Install UV package manager\n", + "curl -LsSf https://astral.sh/uv/install.sh | sh\n", + "# Or on Windows: powershell -c \"irm https://astral.sh/uv/install.ps1 | iex\"\n", + "```\n", + "\n", + "3. **Create and activate a virtual environment with UV:**\n", + "```bash\n", + "uv venv .venv\n", + "source .venv/bin/activate # On Windows: .venv\\Scripts\\activate\n", + "```\n", + "\n", + "4. **Install NeMo Curator with video support using UV:**\n", + "```bash\n", + "uv pip install \"nemo-curator[video,video_cuda]\"\n", + "```\n", + "\n", + "5. **Verify installation:**\n", + "```bash\n", + "python -c \"import nemo_curator; print('Installation successful!')\"\n", + "```\n", + "\n", + "### Download Required Models\n", + "\n", + "The video pipeline requires several pre-trained models (e.g. [Cosmos Embed](https://huggingface.co/nvidia/Cosmos-Embed1-448p)). Models will be downloaded automatically based on the selected stages.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Understanding the Video Pipeline\n", + "\n", + "NeMo Curator's video pipeline is built on a **stage-based architecture** where each stage performs a specific processing step:\n", + "\n", + "### Core Components\n", + "\n", + "1. 
**Pipelines**: Ordered sequences of stages forming an end-to-end workflow\n", + "2. **Stages**: Individual processing units that perform single steps\n", + "3. **Tasks**: Data units that flow through the pipeline (`VideoTask` containing `Video` and `Clip` objects)\n", + "4. **Executors**: Components that run pipelines on distributed backends (Ray)\n", + "\n", + "### Pipeline Stages\n", + "\n", + "The video pipeline includes these stages (all optional - choose based on your needs):\n", + "\n", + "1. **VideoReader**: Reads video files and extracts metadata\n", + "2. **Splitting Algorithm**: \n", + " - **Fixed Stride**: Splits videos into fixed-length clips\n", + " - **TransNetV2**: Uses AI to detect scene transitions for intelligent splitting ([GitHub](https://github.com/soCzech/TransNetV2))\n", + "3. **ClipTranscodingStage**: Converts clips to standardized format\n", + "4. **MotionFilterStage**: Filters clips based on motion content\n", + "5. **ClipAestheticFilterStage**: Filters clips based on aesthetic quality using [CLIP](https://openai.com/research/clip) model\n", + "6. **Embedding Generation**: Creates vector embeddings for similarity search\n", + " - **Cosmos-Embed1**: NVIDIA's state-of-the-art video embedding model (224p, 336p, 448p variants) ([Hugging Face](https://huggingface.co/nvidia/Cosmos-Embed1-448p))\n", + " - **InternVideo2**: Advanced video understanding model for comprehensive embeddings ([GitHub](https://github.com/OpenGVLab/InternVideo2))\n", + "7. **Caption Generation**: Generates text descriptions of video content using [Qwen-VL](https://huggingface.co/Qwen/Qwen-VL) model\n", + "8. **Caption Enhancement**: Improves and refines generated captions using [Qwen-LM](https://huggingface.co/Qwen/Qwen2.5-7B) for better quality\n", + "9. **ClipWriterStage**: Saves processed clips and metadata\n", + "\n", + "### Data Flow\n", + "\n", + "```\n", + "Input Videos → VideoReader → Splitting → Transcoding → Filtering → Embeddings → Captions → Caption Enhancement → Output\n", + "```\n", + "\n", + "**Note**: All stages except VideoReader are optional. You can customize the pipeline by:\n", + "- **Basic**: VideoReader → Splitting → Transcoding → Output\n", + "- **With Quality Control**: Add Motion/Aesthetic filtering\n", + "- **With AI Features**: Add Embedding generation and/or Caption generation\n", + "- **Full Pipeline**: Include all stages for comprehensive video processing\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Running the Basic Example\n", + "\n", + "[`video_read_example.py`](https://github.com/NVIDIA-NeMo/Curator/blob/main/tutorials/video/getting-started/video_read_example.py). 
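This script builds a minimal pipeline that wires the `VideoReader` stage into a `Pipeline`, reads the videos, and logs their metadata.\n",
+    "\n",
+    "The sketch below only illustrates the general shape of such a read-only pipeline; it is not a drop-in replacement for the script. The `VideoReader` import path and the constructor arguments shown here are assumptions for illustration, so refer to the linked file for the actual code:\n",
+    "\n",
+    "```python\n",
+    "from nemo_curator.pipeline.pipeline import Pipeline\n",
+    "\n",
+    "# Assumed import path and arguments for the reader stage; check video_read_example.py for the real ones\n",
+    "from nemo_curator.stages.video.io.video_reader import VideoReader\n",
+    "\n",
+    "# Build a pipeline with a single stage that reads videos and extracts metadata\n",
+    "pipeline = Pipeline(name=\"video_read\")\n",
+    "pipeline.add_stage(VideoReader(input_video_path=\"/path/to/your/videos\", video_limit=5, verbose=True))\n",
+    "\n",
+    "# Execute the pipeline on the default (Ray-based) backend\n",
+    "pipeline.run()\n",
+    "```\n",
+    "\n",
+    "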
To run this example:\n", + "\n", + "```bash\n", + "python video_read_example.py --video-folder /path/to/your/videos --video-limit 5 --verbose\n", + "```\n", + "\n", + "**Parameters:**\n", + "- `--video-folder`: Path to directory containing video files\n", + "- `--video-limit`: Maximum number of videos to process (-1 for unlimited)\n", + "- `--verbose`: Enable detailed logging\n", + "\n", + "**What it does:**\n", + "- Reads video files from the specified directory\n", + "- Extracts metadata (duration, framerate, resolution, etc.)\n", + "- Processes videos in parallel using Ray\n", + "- Provides detailed logging of the process\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Running the Advanced Example\n", + "\n", + "The comprehensive video processing example is available in [video_split_clip_example.py](https://github.com/NVIDIA-NeMo/Curator/blob/main/tutorials/video/getting-started/video_split_clip_example.py)\n", + "\n", + "To run the comprehensive video processing pipeline, use the provided script:\n", + "\n", + "Key features of the comprehensive pipeline:\n", + "- Video reading and metadata extraction\n", + "- Multiple splitting algorithms (Fixed Stride and TransNetV2)\n", + "- Clip transcoding with various encoders \n", + "- Motion and aesthetic filtering\n", + "- Embedding generation (Cosmos-Embed1, InternVideo2)\n", + "- Caption generation (Qwen)\n", + "- Preview generation\n", + "- Flexible output options\n", + "\n", + "\n", + "```bash\n", + "python video_split_clip_example.py \\\n", + " --video-dir /path/to/your/videos \\\n", + " --model-dir /path/to/models \\\n", + " --output-clip-path /path/to/output/clips \\\n", + " --splitting-algorithm fixed_stride \\\n", + " --generate-embeddings \\\n", + " --video-limit 5 \\\n", + " --verbose\n", + "```\n", + "\n", + "**Key Parameters:**\n", + "- `--video-dir`: Input video directory\n", + "- `--model-dir`: Model directory (Can be empty and models will be automatically downloaded)\n", + "- `--output-clip-path`: Output directory for processed clips\n", + "- `--splitting-algorithm`: Choose between \"fixed_stride\" or \"transnetv2\"\n", + "- `--generate-embeddings`: Enable embedding generation\n", + "- `--generate-captions`: Enable caption generation\n", + "- `--aesthetic-threshold`: Filter clips by aesthetic score (e.g., 3.5)\n", + "- `--motion-filter`: Motion filtering mode (\"disable\", \"enable\", \"score-only\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Pipeline Parameters Explained\n", + "\n", + "### Splitting Algorithms\n", + "\n", + "#### Fixed Stride Splitting\n", + "**What it does**: Splits videos into clips of fixed duration at regular intervals.\n", + "- **Parameters**:\n", + " - `--fixed-stride-split-duration`: Duration of each clip in seconds (default: 10.0)\n", + " - `--fixed-stride-min-clip-length-s`: Minimum clip length in seconds (default: 2.0)\n", + " - `--limit-clips`: Maximum clips per video (0 = unlimited)\n", + "\n", + "#### TransNetV2 Splitting\n", + "**What it does**: Uses AI to detect scene transitions and intelligently split videos at natural break points.\n", + "- **Parameters**:\n", + " - `--transnetv2-threshold`: Probability threshold for scene transitions (default: 0.4)\n", + " - `--transnetv2-min-length-s`: Minimum scene length in seconds (default: 2.0)\n", + " - `--transnetv2-max-length-s`: Maximum scene length in seconds (default: 10.0)\n", + " - `--transnetv2-max-length-mode`: How to handle long scenes (\"truncate\" or \"stride\")\n", + " 
- `--transnetv2-crop-s`: Seconds to crop from start/end of scenes (default: 0.5)\n", + "\n", + "### Transcoding Parameters\n", + "\n", + "**What it does**: Converts video clips to a standardized format for consistent processing and storage.\n", + "- `--transcode-encoder`: Video encoder (\"libopenh264\", \"h264_nvenc\", \"libx264\")\n", + "- `--transcode-encoder-threads`: CPU threads per encoding operation\n", + "- `--transcode-ffmpeg-batch-size`: Number of clips to encode in parallel\n", + "- `--transcode-use-hwaccel`: Use GPU acceleration for decoding\n", + "- `--transcode-use-input-video-bit-rate`: Use input video's bit rate\n", + "\n", + "### Filtering Parameters\n", + "\n", + "#### Motion Filtering\n", + "**What it does**: Analyzes video motion content to filter out static or low-motion clips.\n", + "- `--motion-filter`: Mode (\"disable\", \"enable\", \"score-only\")\n", + "- `--motion-global-mean-threshold`: Global motion threshold (default: 0.00098)\n", + "- `--motion-per-patch-min-256-threshold`: Per-patch motion threshold (default: 0.000001)\n", + "\n", + "#### Aesthetic Filtering\n", + "**What it does**: Uses AI to score video clips based on visual quality and aesthetic appeal.\n", + "- `--aesthetic-threshold`: Minimum aesthetic score (e.g., 3.5)\n", + "- `--aesthetic-reduction`: Score reduction method (\"mean\" or \"min\")\n", + "\n", + "### Embedding Parameters\n", + "\n", + "**What it does**: Generates vector embeddings from video clips for similarity search and clustering.\n", + "- `--embedding-algorithm`: Algorithm (\"cosmos-embed1-224p\", \"cosmos-embed1-336p\", \"cosmos-embed1-448p\", \"internvideo2\")\n", + "- `--embedding-gpu-memory-gb`: GPU memory allocation (default: 20.0)\n", + "\n", + "### Captioning Parameters\n", + "\n", + "**What it does**: Generates text descriptions of video content using AI vision-language models.\n", + "- `--generate-captions`: Enable caption generation\n", + "- `--captioning-algorithm`: Model variant (\"qwen\")\n", + "- `--captioning-batch-size`: Batch size for processing (default: 8)\n", + "- `--captioning-max-output-tokens`: Maximum tokens per caption (default: 512)\n", + "- `--captioning-sampling-fps`: Frames per second for sampling (default: 2.0)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Example Usage Scenarios\n", + "\n", + "### Scenario 1: Basic Video Splitting\n", + "For simple video splitting without advanced features:\n", + "\n", + "```bash\n", + "python video_split_clip_example.py \\\n", + " --video-dir /path/to/videos \\\n", + " --model-dir /path/to/models \\\n", + " --output-clip-path /path/to/output \\\n", + " --splitting-algorithm fixed_stride \\\n", + " --fixed-stride-split-duration 15.0 \\\n", + " --video-limit 10\n", + "```\n", + "\n", + "### Scenario 2: High-Quality Video Processing\n", + "For production-quality processing with all features:\n", + "\n", + "```bash\n", + "python video_split_clip_example.py \\\n", + " --video-dir /path/to/videos \\\n", + " --model-dir /path/to/models \\\n", + " --output-clip-path /path/to/output \\\n", + " --splitting-algorithm transnetv2 \\\n", + " --transnetv2-threshold 0.3 \\\n", + " --transnetv2-min-length-s 3.0 \\\n", + " --transnetv2-max-length-s 15.0 \\\n", + " --generate-embeddings \\\n", + " --embedding-algorithm cosmos-embed1-336p \\\n", + " --generate-captions \\\n", + " --captioning-batch-size 4 \\\n", + " --aesthetic-threshold 3.5 \\\n", + " --motion-filter enable \\\n", + " --transcode-encoder h264_nvenc \\\n", + " --transcode-use-hwaccel 
\\\n", + " --video-limit 50\n", + "```\n", + "\n", + "### Scenario 3: Quick Testing\n", + "For rapid testing with minimal resources:\n", + "\n", + "```bash\n", + "python video_split_clip_example.py \\\n", + " --video-dir /path/to/videos \\\n", + " --model-dir /path/to/models \\\n", + " --output-clip-path /path/to/output \\\n", + " --splitting-algorithm fixed_stride \\\n", + " --fixed-stride-split-duration 5.0 \\\n", + " --transcode-encoder libopenh264 \\\n", + " --video-limit 3 \\\n", + " --dry-run\n", + "```\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Interactive End-to-End Example\n", + "\n", + "Now let's put everything together! This section will walk you through a complete video processing pipeline from start to finish.\n", + "\n", + "### What We'll Do\n", + "\n", + "1. **Download sample videos** from the PE-Video dataset\n", + "2. **Process the videos** using NeMo Curator's video pipeline\n", + "3. **Explore the results** and understand the output structure\n", + "\n", + "This hands-on example will help you understand how all the components work together in practice.\n", + "\n", + "---\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Step 1: Download Sample Videos\n", + "\n", + "First, let's download some sample videos from the [PE-Video](https://huggingface.co/datasets/facebook/PE-Video) dataset. This will give us real video content to work with.\n", + "\n", + "The following code cell would download 10 videos from PE-Video dataset:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Install required dependencies for this example\n", + "!pip install datasets\n", + "\n", + "import os\n", + "from pathlib import Path\n", + "\n", + "from datasets import load_dataset\n", + "\n", + "# Create output directory for sample videos\n", + "output_dir = Path(\"./pe_video_samples\")\n", + "output_dir.mkdir(exist_ok=True)\n", + "\n", + "print(f\"Downloading sample videos to: {output_dir.absolute()}\")\n", + "\n", + "# Load PE-Video dataset (streaming mode for efficiency)\n", + "dataset = load_dataset(\"facebook/PE-Video\", split=\"train\", streaming=True)\n", + "\n", + "# Download 10 sample videos (adjust this number as needed)\n", + "count = 0\n", + "max_videos = 10\n", + "\n", + "print(f\"Downloading {max_videos} sample videos...\")\n", + "\n", + "for sample in dataset:\n", + " if count >= max_videos:\n", + " break\n", + "\n", + " video_data = sample.get(\"mp4\")\n", + " description = sample.get(\"json\", {}).get(\"description\", f\"video_{count+1}\")\n", + "\n", + " if video_data:\n", + " # Create safe filename\n", + " safe_name = \"\".join(c for c in description[:30] if c.isalnum() or c in (\" \", \"-\", \"_\")).strip()\n", + " filename = f\"{safe_name}_{count+1}.mp4\" if safe_name else f\"video_{count+1}.mp4\"\n", + "\n", + " # Save video\n", + " with open(output_dir / filename, \"wb\") as f:\n", + " f.write(video_data)\n", + "\n", + " print(f\"✓ Downloaded: {filename}\")\n", + " count += 1\n", + "\n", + "print(f\"Successfully downloaded {count} videos to {output_dir.absolute()}\")\n", + "print(\"Video files:\")\n", + "for video_file in output_dir.glob(\"*.mp4\"):\n", + " file_size = video_file.stat().st_size / (1024 * 1024) # Size in MB\n", + " print(f\" - {video_file.name} ({file_size:.1f} MB)\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Step 2: Set Up Video Processing Pipeline\n", + "\n", + "Now let's configure and 
run the video processing pipeline on our downloaded videos. We'll use a moderate configuration that demonstrates key features without requiring excessive resources.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Command Breakdown\n", + "\n", + "The following command runs the complete video processing pipeline. Here's what each parameter does:\n", + "\n", + "**📁 Input/Output:**\n", + "- `--video-dir ./pe_video_samples` → Input directory containing our downloaded videos\n", + "- `--output-clip-path ./processed_clips` → Output directory where processed clips will be saved\n", + "\n", + "**✂️ Video Splitting:**\n", + "- `--splitting-algorithm fixed_stride` → Split videos into clips using fixed time intervals\n", + " - *Alternative: `transnetv2` for AI-based scene detection*\n", + "- `--fixed-stride-split-duration 8.0` → Each clip will be 8 seconds long\n", + "- `--fixed-stride-min-clip-length-s 2.0` → Discard clips shorter than 2 seconds\n", + "\n", + "**🎥 Video Processing:**\n", + "- `--transcode-encoder libopenh264` → Use libopenh264 codec (good speed/quality balance)\n", + " - *Alternatives: `h264_nvenc` (GPU), `libx264` (CPU)*\n", + "- `--transcode-ffmpeg-batch-size 8` → Process 8 clips in parallel during transcoding\n", + "\n", + "**🧠 AI Features:**\n", + "- `--generate-embeddings` → Generate vector embeddings for similarity search and clustering\n", + "- `--embedding-algorithm cosmos-embed1-224p` → Use NVIDIA's Cosmos-Embed1 model at 224p resolution\n", + " - *Alternatives: `cosmos-embed1-336p`, `cosmos-embed1-448p`, `internvideo2`*\n", + "- `--embedding-gpu-memory-gb 8.0` → Allocate 8GB of GPU memory for embedding generation\n", + "\n", + "**🔍 Quality Filtering:**\n", + "- `--motion-filter score-only` → Calculate motion scores but don't filter clips based on motion\n", + " - *Alternatives: `enable` (filter low-motion clips), `disable` (no motion analysis)*\n", + "- `--aesthetic-threshold 3.0` → Filter out clips with aesthetic scores below 3.0 (1-5 scale)\n", + " - *Higher values = more selective filtering*\n", + "\n", + "**⚙️ Processing Control:**\n", + "- `--video-limit 3` → Process only 3 videos (for this example)\n", + " - *Remove this parameter to process all videos*\n", + "- `--verbose` → Show detailed progress information during processing\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!python video_split_clip_example.py \\\n", + " --video-dir ./pe_video_samples \\\n", + " --output-clip-path ./processed_clips \\\n", + " --splitting-algorithm fixed_stride \\\n", + " --fixed-stride-split-duration 8.0 \\\n", + " --fixed-stride-min-clip-length-s 2.0 \\\n", + " --transcode-encoder libopenh264 \\\n", + " --transcode-ffmpeg-batch-size 8 \\\n", + " --generate-embeddings \\\n", + " --embedding-algorithm cosmos-embed1-224p \\\n", + " --embedding-gpu-memory-gb 8.0 \\\n", + " --motion-filter score-only \\\n", + " --aesthetic-threshold 3.0 \\\n", + " --video-limit 3 \\\n", + " --verbose\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Step 3: Understanding the Output\n", + "\n", + "The video pipeline produces several types of output:\n", + "\n", + "#### 📁 Directory Structure\n", + "```\n", + "processed_clips/\n", + "├── clips/ # Processed video clips (.mp4 files)\n", + "│ ├── video1_clip_0.mp4\n", + "│ ├── video1_clip_1.mp4\n", + "│ └── ...\n", + "├── metadata/ # Metadata files (.json)\n", + "│ ├── video1_metadata.json\n", + "│ └── ...\n", + "└── iv2_embd/ 
# InternVideo2 Embedding files (if generated)\n", + " └── ...\n", + "```\n", + "\n", + "#### 📊 Metadata Fields\n", + "Each clip in the metadata includes:\n", + "- **Basic Info**: `clip_path`, `duration`, `fps`, `resolution`\n", + "- **Quality Scores**: `aesthetic_score`, `motion_score`\n", + "- **AI Features**: `embedding` (vector), `caption` (text description)\n", + "- **Processing Info**: `source_video`, `clip_index`, `timestamp`\n", + "\n", + "#### 🎯 Next Steps\n", + "Now that you've seen the complete pipeline in action, you can:\n", + "\n", + "1. **Experiment with parameters** - Try different splitting algorithms, thresholds, or models\n", + "2. **Scale up** - Process more videos or use higher-quality settings\n", + "3. **Customize the pipeline** - Add or remove stages based on your needs\n", + "4. **Use the results** - Leverage embeddings for similarity search or captions for content analysis\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Video Deduplication Pipeline\n", + "\n", + "After processing videos and generating embeddings, you may want to remove duplicate or very similar video clips from your dataset. NeMo Curator provides a powerful semantic deduplication pipeline that uses the generated embeddings to identify and remove near-duplicate content.\n", + "\n", + "### What is Semantic Deduplication?\n", + "\n", + "Semantic deduplication goes beyond simple hash-based deduplication by understanding the *content* of videos. It uses the embeddings generated in the previous steps to:\n", + "\n", + "- **Identify similar content** even when videos have different encoding, resolution, or slight variations\n", + "- **Group similar clips** using clustering algorithms\n", + "- **Remove duplicates** while preserving the most representative examples\n", + "- **Maintain metadata** for all processed clips\n", + "\n", + "### When to Use Deduplication\n", + "\n", + "- **Large datasets** with potential duplicate content\n", + "- **Video collections** from multiple sources\n", + "- **Content curation** where quality over quantity matters\n", + "- **Storage optimization** by removing redundant clips\n", + "- **Training data preparation** for machine learning models\n", + "\n", + "### Deduplication Pipeline Parameters\n", + "\n", + "The semantic deduplication pipeline offers several key parameters:\n", + "\n", + "- **`n_clusters`**: Number of clusters for grouping similar content (default: 100)\n", + "- **`distance_metric`**: Method for measuring similarity (\"cosine\", \"euclidean\", \"manhattan\")\n", + "- **`eps`**: Maximum distance threshold for considering clips as duplicates (lower = more strict)\n", + "- **`which_to_keep`**: Strategy for selecting which clip to keep from duplicates (\"random\", \"first\", \"last\")\n", + "- **`random_state`**: Seed for reproducible results\n", + "\n", + "### Running the Deduplication Pipeline\n", + "\n", + "The following example shows how to run semantic deduplication on your processed video clips:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Import required modules\n", + "from nemo_curator.pipeline.pipeline import Pipeline\n", + "from nemo_curator.stages.deduplication.semantic import SemanticDeduplicationWorkflow\n", + "\n", + "# Configuration for deduplication\n", + "# Update these paths to match your actual processed video output\n", + "input_embeddings_path = \"./processed_clips/iv2_embd_parquet\" # Path to your embedding parquet files\n", + 
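"# NOTE: iv2_embd_parquet is only an example; point input_embeddings_path at the embedding parquet folder your run actually produced\n",
+    "\n",
+    "import os  # os.makedirs is called below; imported here in case the earlier download cell was not run in this session\n",
+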
"output_dedup_path = \"./processed_clips/dedup_output\" # Path for deduplicated results\n", + "\n", + "# Create output directory if it doesn't exist\n", + "os.makedirs(output_dedup_path, exist_ok=True)\n", + "\n", + "# Create the deduplication pipeline\n", + "def create_video_dedup_pipeline() -> Pipeline:\n", + " return SemanticDeduplicationWorkflow(\n", + " input_path=input_embeddings_path,\n", + " output_path=output_dedup_path,\n", + " id_field=\"id\", # Field containing unique clip identifiers\n", + " embedding_field=\"embeddings\", # Field containing the vector embeddings\n", + " metadata_fields=[\"id\"], # Additional metadata fields to preserve\n", + " n_clusters=100, # Number of clusters for grouping similar content\n", + " distance_metric=\"cosine\", # Distance metric for similarity calculation\n", + " which_to_keep=\"random\", # Strategy for selecting which duplicate to keep\n", + " random_state=42, # Random seed for reproducible results\n", + " eps=0.002, # Maximum distance threshold for duplicates (lower = more strict)\n", + " # Storage options for local filesystem\n", + " read_kwargs={\"storage_options\": {}},\n", + " write_kwargs={\"storage_options\": {}},\n", + " verbose=True, # Enable detailed logging\n", + " )\n", + "\n", + "# Run the deduplication pipeline\n", + "print(\"Starting video deduplication pipeline...\")\n", + "print(f\"Input embeddings: {input_embeddings_path}\")\n", + "print(f\"Output directory: {output_dedup_path}\")\n", + "\n", + "# Create and run the pipeline\n", + "pipeline = create_video_dedup_pipeline()\n", + "pipeline.run()\n", + "\n", + "print(\"Deduplication completed!\")\n", + "print(f\"Results saved to: {output_dedup_path}\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Understanding Deduplication Results\n", + "\n", + "After running the deduplication pipeline, you'll find:\n", + "\n", + "#### 📁 Output Structure\n", + "```\n", + "dedup_output/\n", + "├── deduplicated_embeddings.parquet # Deduplicated embedding data\n", + "├── cluster_assignments.parquet # Cluster membership for each clip\n", + "└── duplicate_groups.parquet # Groups of identified duplicates\n", + "```\n", + "\n", + "#### 📊 Key Metrics\n", + "The pipeline provides several useful metrics:\n", + "- **Total clips processed**: Number of input clips\n", + "- **Duplicates found**: Number of clips identified as duplicates\n", + "- **Deduplication ratio**: Percentage of clips removed\n", + "- **Clusters created**: Number of similarity groups formed\n", + "\n", + "#### 🎯 Customizing Deduplication\n", + "\n", + "You can adjust the deduplication behavior by modifying these parameters:\n", + "\n", + "**Strictness Control:**\n", + "- **`eps=0.001`**: Very strict (only nearly identical clips are considered duplicates)\n", + "- **`eps=0.005`**: Moderate (somewhat similar clips are considered duplicates)\n", + "- **`eps=0.01`**: Lenient (loosely similar clips are considered duplicates)\n", + "\n", + "**Clustering Strategy:**\n", + "- **`n_clusters=50`**: Fewer, larger clusters (more aggressive deduplication)\n", + "- **`n_clusters=200`**: More, smaller clusters (more conservative deduplication)\n", + "\n", + "**Distance Metrics:**\n", + "- **`\"cosine\"`**: Best for high-dimensional embeddings (recommended)\n", + "- **`\"euclidean\"`**: Good for normalized embeddings\n", + "- **`\"manhattan\"`**: Alternative for specific use cases\n", + "\n", + "### Integration with Video Pipeline\n", + "\n", + "The deduplication pipeline seamlessly integrates with the video 
processing pipeline:\n", + "\n", + "1. **Process videos** → Generate embeddings using the video pipeline\n", + "2. **Run deduplication** → Remove duplicate clips using this pipeline\n", + "3. **Use results** → Apply deduplicated dataset for your specific use case\n", + "\n", + "This two-step approach ensures you have both high-quality video content and an optimized, duplicate-free dataset.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Summary\n", + "\n", + "Now that you understand the basics of NeMo Curator's video pipeline, you can:\n", + "\n", + "1. **Experiment with different parameters** to optimize for your specific use case\n", + "2. **Scale up processing** by increasing `--video-limit` and using more powerful hardware\n", + "3. **Customize the pipeline** by adding or removing stages based on your needs\n", + "4. **Integrate with other tools** by using the generated embeddings and metadata\n", + "5. **Explore advanced features** like caption enhancement and preview generation\n", + "\n", + "### Additional Resources\n", + "\n", + "- **Official Documentation**: [NeMo Curator Video Guide](https://docs.nvidia.com/nemo-curator/)\n", + "- **API Reference**: Detailed documentation of all stages and parameters\n", + "- **Examples**: More complex examples in the `tutorials/` directory\n", + "- **Community**: Join discussions and get help from the community\n", + "\n", + "### Key Takeaways\n", + "\n", + "- NeMo Curator provides a powerful, scalable framework for video curation\n", + "- The pipeline is modular and can be customized for different use cases\n", + "- GPU acceleration significantly improves performance for large-scale processing\n", + "- Proper parameter tuning is essential for optimal results\n", + "- The system handles distributed processing automatically through Ray\n", + "\n", + "Happy video curating! 🎬✨" + ] + } + ], + "metadata": { + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}