small-thinking · yxjiang · Aug 7, 2025
diff --git a/.flake8 b/.flake8
@@ -0,0 +1,12 @@
+[flake8]
+max-line-length = 120
+extend-ignore = E203,W503,E402
+exclude = 
+    .git,
+    __pycache__,
+    build,
+    dist,
+    .venv,
+    .mypy_cache,
+    .tox,
+    .eggs
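The `.flake8` file above is INI-style, so its values can be inspected with the standard-library `configparser`. The following is a minimal sketch, not part of the commit, that embeds the same settings as a string and shows how the multi-line `exclude` value and the comma-separated `extend-ignore` codes break down into lists. `E203` and `W503` are commonly ignored in projects formatted with Black; `E402` allows module-level imports that are not at the top of the file.

```python
import configparser

# Same content as the .flake8 file added in this commit.
FLAKE8_CONFIG = """\
[flake8]
max-line-length = 120
extend-ignore = E203,W503,E402
exclude =
    .git,
    __pycache__,
    build,
    dist,
    .venv,
    .mypy_cache,
    .tox,
    .eggs
"""

parser = configparser.ConfigParser()
parser.read_string(FLAKE8_CONFIG)
section = parser["flake8"]

# Indented continuation lines are folded into a single value, so
# splitting on commas recovers the individual entries.
max_line_length = section.getint("max-line-length")
ignored = [code.strip() for code in section["extend-ignore"].split(",") if code.strip()]
excluded = [path.strip() for path in section["exclude"].split(",") if path.strip()]

print(max_line_length)
print(ignored)
print(excluded)
```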
diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
@@ -13,39 +13,34 @@ jobs:
   test:
     runs-on: ubuntu-latest
     steps:
-    - uses: actions/checkout@v2
+    - uses: actions/checkout@v4
 
     - name: Set up Python
-      uses: actions/setup-python@v2
+      uses: actions/setup-python@v4
       with:
-        python-version: '3.9'
+        python-version: '3.11'
 
-    - name: Install Poetry
-      run: |
-        python -m pip install --upgrade pip
-        pip install poetry
-        pip install -r requirements.txt
-
-    - name: Install dependencies
-      run: |
-        poetry install
+    - name: Install uv
+      uses: astral-sh/setup-uv@v1
+      with:
+        version: "latest"
 
-    - name: Install dependencies including optional dependencies
+    - name: Install dependencies
       run: |
-        python -m pip install toml
-        EXTRAS=$(python .github/workflows/extract_extra_deps.py)
-        echo "EXTRAS=$EXTRAS"
-        poetry install $EXTRAS
+        uv sync
 
-    - name: Install pytest
+    - name: Install development dependencies
       run: |
-        pip install pytest
+        uv sync --extra dev
 
     - name: Run Flake8
-      run: poetry run flake8 polymind/
+      run: uv run flake8 media_gen/
 
     - name: Run isort
-      run: poetry run isort --check-only .
+      run: uv run isort --check-only .
+
+    - name: Run Black
+      run: uv run black --check .
 
     # Set fake environment variables for openai
     - name: Set environment variables
@@ -56,4 +51,4 @@ jobs:
         echo "TAVILY_API_KEY=fake-api-key" >> $GITHUB_ENV
 
     - name: Run Pytest
-      run: poetry run pytest -vv --cov=polymind --cov-config=pyproject.toml -vv tests
+      run: uv run pytest -vv --cov=media_gen --cov-config=pyproject.toml
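The lint, format, and test steps of the updated workflow can be mirrored locally. Below is a minimal sketch, not part of the commit, that runs the same four `uv run` commands in CI order and stops at the first failure. The helper names `run_checks` and `_run` are hypothetical, and the sketch assumes `uv sync --extra dev` has already been run; it does not set the fake API-key environment variables that the workflow exports.

```python
import shlex
import subprocess

# Commands copied from the workflow steps above, in the order CI runs them.
CI_CHECKS = [
    ["uv", "run", "flake8", "media_gen/"],
    ["uv", "run", "isort", "--check-only", "."],
    ["uv", "run", "black", "--check", "."],
    ["uv", "run", "pytest", "-vv", "--cov=media_gen", "--cov-config=pyproject.toml"],
]


def _run(cmd):
    """Run one command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def run_checks(checks=CI_CHECKS, execute=_run):
    """Run each check in order; return the first non-zero exit code, else 0."""
    for cmd in checks:
        print("$ " + shlex.join(cmd))
        code = execute(cmd)
        if code != 0:
            return code
    return 0


# Dry run: print the commands without executing anything.
run_checks(execute=lambda cmd: 0)
```

Calling `run_checks()` with no arguments executes the commands for real; passing a different `execute` callable keeps the sketch usable without `uv` installed.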
diff --git a/.gitignore b/.gitignore
@@ -153,6 +153,9 @@ dmypy.json
 # Cython debug symbols
 cython_debug/
 
+# uv
+.uv/
+
 # PyCharm
 #  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
 #  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore

diff --git a/README.md b/README.md
@@ -2,84 +2,146 @@
 
 AI-powered image and video generation/understanding framework.
 
-## Quick Start
+## Setup
 
-1. **Setup environment:**
+1. **Install uv:**
    ```bash
-   cp env.example .env
-   # Add your REPLICATE_API_TOKEN to .env
-   pip install -r requirements.txt
+   curl -LsSf https://astral.sh/uv/install.sh | sh
+   # or with pip: pip install uv
    ```
 
-2. **Generate images:**
+2. **Set up with uv:**
    ```bash
-   python media-gen/image_regen_pipeline.py --image-path photo.jpg --user-interests "make it more vibrant"
+   python setup_uv.py
    ```
 
-## Features
+3. **Or manually with uv:**
+   ```bash
+   uv venv
+   uv sync
+   uv sync --extra dev  # For development dependencies
+   cp env.example .env
+   ```
 
-- **Image Understanding**: Analyze images with AI vision
-- **Image Generation**: Create new images via Replicate
-- **Video Understanding**: Extract and analyze video scenes
-- **Video Generation**: Generate videos from images and prompts
+4. **Configure API keys:**
+   ```bash
+   cp env.example .env
+   # Add your OPENAI_API_KEY to .env
+   # Add your REPLICATE_API_TOKEN to .env
+   ```
 
 ## Usage
 
-### Image Regeneration
+### Running with uv
+
+Use `uv run` to execute commands in the virtual environment:
+
 ```bash
-python media-gen/image_regen_pipeline.py \
-  --image-path ~/Pictures/photo.jpg \
-  --user-interests "convert to watercolor style" \
-  --output-folder ~/Desktop \
-  --aspect-ratio 16:9
+# Run any Python script
+uv run python media_gen/image_regen_pipeline.py --image-path photo.jpg --user-interests "basketball, capybara"
+
+# Run tests
+uv run pytest
+
+# Run with specific Python version
+uv run --python 3.11 python your_script.py
+```
+
+### Image Regeneration Pipeline
+
+Regenerate images with AI analysis and user preferences:
+
+```bash
+uv run python media_gen/image_regen_pipeline.py --image-path photo.jpg --user-interests "basketball, capybara"
 ```
 
 **Options:**
-- `--image-path`: Input image (required)
+- `--image-path`: Input image path (required)
 - `--user-interests`: Regeneration preferences (required)
 - `--output-folder`: Output directory (default: `~/Downloads`)
 - `--aspect-ratio`: Image ratio (default: `1:1`)
 - `--output-format`: Format (default: `png`)
 - `--debug`: Show detailed prompts
 
-### Video Understanding
-```python
-from tools.video_understanding_tool import VideoUnderstandingTool
+**Examples:**
+```bash
+# Basic regeneration
+uv run python media_gen/image_regen_pipeline.py --image-path landscape.jpg --user-interests "vintage style, steampunk"
+
+# Custom output and aspect ratio
+uv run python media_gen/image_regen_pipeline.py \
+  --image-path ~/Pictures/photo.jpg \
+  --user-interests "make it modern and professional" \
+  --output-folder ~/Desktop \
+  --aspect-ratio 16:9
+```
+
+### Video Regeneration Pipeline
+
+Process videos with AI understanding and generation:
+
+```bash
+uv run python media_gen/video_regen_pipeline.py --video-path video.mp4 --user-interests "basketball, capybara"
+```
+
+**Options:**
+- `--video-path`: Input video path (required)
+- `--user-interests`: Processing preferences (required)
+- `--output-folder`: Output directory (default: `~/Downloads`)
+- `--screenshot-interval`: Seconds between screenshots (default: `2.0`)
 
-video_tool = VideoUnderstandingTool()
-result = video_tool.run({
-    "video_path": "video.mp4",
-    "user_preference": "cinematic lighting",
-    "screenshot_interval": 2.0
-})
+## Development
+
+### Adding Dependencies
+
+```bash
+# Add a new dependency
+uv add package-name
+
+# Add a development dependency
+uv add --dev package-name
+
+# Update dependencies
+uv sync
 ```
 
-### Video Generation
-```python
-from tools.replicate_video_gen import ReplicateVideoGen
+### Running Tests
+
+```bash
+# Run all tests
+uv run pytest
+
+# Run with coverage
+uv run pytest --cov=media_gen
 
-video_gen = ReplicateVideoGen()
-result = video_gen.run({
-    "image": "image.jpg",
-    "prompt": "serene landscape with gentle movement"
-})
+# Run specific test file
+uv run pytest media_gen/tests/test_image_understanding.py
 ```
 
-## Testing
+### Code Quality
 
 ```bash
-# Test image generation
-python media-gen/image_regen_pipeline.py --image-path media-gen/test_scripts/test_image.png --user-interests "enhance visual appeal"
+# Format code
+uv run black .
+
+# Sort imports
+uv run isort .
 
-# Test video tools
-python media-gen/test_scripts/test_video_understanding_tool.py
-python media-gen/test_scripts/test_replicate_video_gen.py
+# Type checking
+uv run mypy media_gen/
+
+# Linting
+uv run flake8 media_gen/
 ```
 
-## Architecture
+## Extensibility
+
+The framework is designed for easy extension. You can create custom tools for:
+
+- **Arbitrary image generation models** (Stable Diffusion, DALL-E, etc.)
+- **Video generation models** (Runway, Pika Labs, etc.)
+- **Custom analysis tools** (scene detection, content filtering, etc.)
 
-- **`pipeline.py`**: Core pipeline infrastructure
-- **`image_regen_pipeline.py`**: Image regeneration CLI
-- **`video_regen_pipeline.py`**: Video processing pipeline
-- **`tools/`**: Media generation and understanding tools
-- **`utils/`**: Utility functions for video processing
+Tools follow the base class in `media_gen/tools/media_gen_tool_base.py` for consistent integration.
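The Extensibility section says custom tools follow a shared base class. The diff does not show that class, so the following is only an illustrative sketch: `ToolBase` is a hypothetical stand-in for whatever `media_gen/tools/media_gen_tool_base.py` defines, and the dict-in, dict-out `run` signature is an assumption modeled on the `tool.run({...})` calls in the removed README examples. `EchoImageGen` and its input keys are likewise invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class ToolBase(ABC):
    """Hypothetical stand-in for the real base class; the actual interface may differ."""

    @abstractmethod
    def run(self, input: Dict[str, Any]) -> Dict[str, Any]:
        """Process one request and return a result dict."""


class EchoImageGen(ToolBase):
    """Toy tool that returns a fake output path instead of calling a model."""

    def __init__(self, output_folder: str = "~/Downloads") -> None:
        self.output_folder = output_folder

    def run(self, input: Dict[str, Any]) -> Dict[str, Any]:
        prompt = input["prompt"]
        fmt = input.get("output_format", "png")
        # Build a short file name from the prompt text.
        slug = "_".join(prompt.lower().split())[:40]
        return {"image_path": f"{self.output_folder}/{slug}.{fmt}", "prompt": prompt}


tool = EchoImageGen(output_folder="/tmp")
result = tool.run({"prompt": "Vintage steampunk landscape"})
print(result["image_path"])  # /tmp/vintage_steampunk_landscape.png
```

A real tool would replace the body of `run` with a call to the chosen model backend while keeping the same input and output shape, so pipelines can swap tools without further changes.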
diff --git a/__init__.py b/__init__.py
@@ -1,7 +1,7 @@
 """
 Media Regeneration Framework
 
-A comprehensive AI-powered media generation and understanding framework 
+A comprehensive AI-powered media generation and understanding framework
 for images and videos.
 """