Merged
12 changes: 12 additions & 0 deletions .flake8
@@ -0,0 +1,12 @@
+[flake8]
+max-line-length = 120
+extend-ignore = E203,W503,E402
+exclude =
+    .git,
+    __pycache__,
+    build,
+    dist,
+    .venv,
+    .mypy_cache,
+    .tox,
+    .eggs
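A reviewer who wants to confirm what this configuration resolves to could load it with the standard library, as in the minimal sketch below. It is an approximation for inspection only, not flake8's own loader. The config text is inlined so the sketch runs on its own, and the four-space indentation of the `exclude` entries is an assumption (INI continuation lines must be indented, but the exact width is not visible in this view). For reference, E203 is whitespace before `:`, W503 is a line break before a binary operator, and E402 is a module-level import not at the top of the file.

```python
import configparser

# Inlined copy of the .flake8 file added in this change.
# Assumption: the exclude entries are indented by four spaces.
FLAKE8_CONFIG = """\
[flake8]
max-line-length = 120
extend-ignore = E203,W503,E402
exclude =
    .git,
    __pycache__,
    build,
    dist,
    .venv,
    .mypy_cache,
    .tox,
    .eggs
"""


def split_list(raw: str) -> list:
    """Split a comma- or newline-separated option value into clean items."""
    return [item.strip() for item in raw.replace("\n", ",").split(",") if item.strip()]


def parse_flake8_config(text: str) -> dict:
    """Return the three options from the [flake8] section as Python values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["flake8"]
    return {
        "max_line_length": section.getint("max-line-length"),
        "extend_ignore": split_list(section["extend-ignore"]),
        "exclude": split_list(section["exclude"]),
    }


settings = parse_flake8_config(FLAKE8_CONFIG)
print(settings)
```

Printing `settings` makes it easy to spot a missing comma or an unindented continuation line, both of which would change the parsed `exclude` list.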
39 changes: 17 additions & 22 deletions .github/workflows/test.yaml
@@ -13,39 +13,34 @@ jobs:
   test:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4

       - name: Set up Python
-        uses: actions/setup-python@v2
+        uses: actions/setup-python@v4
         with:
-          python-version: '3.9'
+          python-version: '3.11'

-      - name: Install Poetry
-        run: |
-          python -m pip install --upgrade pip
-          pip install poetry
-          pip install -r requirements.txt
-
-      - name: Install dependencies
-        run: |
-          poetry install
+      - name: Install uv
+        uses: astral-sh/setup-uv@v1
+        with:
+          version: "latest"

-      - name: Install dependencies including optional dependencies
+      - name: Install dependencies
         run: |
-          python -m pip install toml
-          EXTRAS=$(python .github/workflows/extract_extra_deps.py)
-          echo "EXTRAS=$EXTRAS"
-          poetry install $EXTRAS
+          uv sync

-      - name: Install pytest
+      - name: Install development dependencies
         run: |
-          pip install pytest
+          uv sync --extra dev

       - name: Run Flake8
-        run: poetry run flake8 polymind/
+        run: uv run flake8 media_gen/

       - name: Run isort
-        run: poetry run isort --check-only .
+        run: uv run isort --check-only .
+
+      - name: Run Black
+        run: uv run black --check .

       # Set fake environment variables for openai
       - name: Set environment variables
@@ -56,4 +51,4 @@ jobs:
           echo "TAVILY_API_KEY=fake-api-key" >> $GITHUB_ENV

       - name: Run Pytest
-        run: poetry run pytest -vv --cov=polymind --cov-config=pyproject.toml -vv tests
+        run: uv run pytest -vv --cov=media_gen --cov-config=pyproject.toml
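To reproduce the updated CI steps locally before pushing, the commands from this workflow can be driven from a small script. The sketch below is one possible way to do that, not part of the change itself. The helper names `build_plan` and `run_plan` are hypothetical, and only `TAVILY_API_KEY` is set because it is the only fake variable visible in this diff; any other variables the workflow exports are not guessed here.

```python
import os
import shlex
import subprocess

# Commands copied from the updated workflow steps, in the order CI runs them.
CI_COMMANDS = [
    "uv sync",
    "uv sync --extra dev",
    "uv run flake8 media_gen/",
    "uv run isort --check-only .",
    "uv run black --check .",
    "uv run pytest -vv --cov=media_gen --cov-config=pyproject.toml",
]

# Only the variable visible in the diff; the rest of that step is collapsed.
FAKE_ENV = {"TAVILY_API_KEY": "fake-api-key"}


def build_plan(commands=CI_COMMANDS):
    """Turn each command string into an argv list suitable for subprocess."""
    return [shlex.split(command) for command in commands]


def run_plan(plan, extra_env=FAKE_ENV, runner=subprocess.run):
    """Run each step in order and stop at the first non-zero exit code."""
    env = {**os.environ, **extra_env}
    for argv in plan:
        result = runner(argv, env=env)
        if result.returncode != 0:
            return result.returncode
    return 0


# Typical use from the repository root (requires uv on PATH):
#     exit_code = run_plan(build_plan())
```

Stopping at the first failing step mirrors how a workflow job halts when a step fails, and the injectable `runner` argument lets the sequencing be exercised without invoking `uv`.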
3 changes: 3 additions & 0 deletions .gitignore
@@ -153,6 +153,9 @@ dmypy.json
 # Cython debug symbols
 cython_debug/

+# uv
+.uv/
+
 # PyCharm
 # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
 # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
158 changes: 110 additions & 48 deletions README.md
@@ -2,84 +2,146 @@

 AI-powered image and video generation/understanding framework.

-## Quick Start
+## Setup

-1. **Setup environment:**
+## Setup
+
+1. **Install uv:**
 ```bash
-cp env.example .env
-# Add your REPLICATE_API_TOKEN to .env
-pip install -r requirements.txt
+curl -LsSf https://astral.sh/uv/install.sh | sh
+# or with pip: pip install uv
 ```

-2. **Generate images:**
+2. **Setup with uv:**
 ```bash
-python media-gen/image_regen_pipeline.py --image-path photo.jpg --user-interests "make it more vibrant"
+python setup_uv.py
 ```

-## Features
+3. **Or manually with uv:**
+```bash
+uv venv
+uv sync
+uv sync --extra dev # For development dependencies
+cp env.example .env
+```

-- **Image Understanding**: Analyze images with AI vision
-- **Image Generation**: Create new images via Replicate
-- **Video Understanding**: Extract and analyze video scenes
-- **Video Generation**: Generate videos from images and prompts
+4. **Configure API keys:**
+```bash
+cp env.example .env
+# Add your OPENAI_API_KEY to .env
+# Add your REPLICATE_API_TOKEN to .env
+```

 ## Usage

-### Image Regeneration
+### Running with uv
+
+Use `uv run` to execute commands in the virtual environment:
+
 ```bash
-python media-gen/image_regen_pipeline.py \
-  --image-path ~/Pictures/photo.jpg \
-  --user-interests "convert to watercolor style" \
-  --output-folder ~/Desktop \
-  --aspect-ratio 16:9
+# Run any Python script
+uv run python media_gen/image_regen_pipeline.py --image-path photo.jpg --user-interests "basketball, kapybara"
+
+# Run tests
+uv run pytest
+
+# Run with specific Python version
+uv run --python 3.11 python your_script.py
+```
+
+### Image Regeneration Pipeline
+
+Regenerate images with AI analysis and user preferences:
+
+```bash
+uv run python media_gen/image_regen_pipeline.py --image-path photo.jpg --user-interests "basketball, kapybara"
 ```

 **Options:**
-- `--image-path`: Input image (required)
+- `--image-path`: Input image path (required)
 - `--user-interests`: Regeneration preferences (required)
 - `--output-folder`: Output directory (default: `~/Downloads`)
 - `--aspect-ratio`: Image ratio (default: `1:1`)
 - `--output-format`: Format (default: `png`)
 - `--debug`: Show detailed prompts

-### Video Understanding
-```python
-from tools.video_understanding_tool import VideoUnderstandingTool
+**Examples:**
+```bash
+# Basic regeneration
+uv run python media_gen/image_regen_pipeline.py --image-path landscape.jpg --user-interests "vintage style, steam punk"
+
+# Custom output and aspect ratio
+uv run python media_gen/image_regen_pipeline.py \
+  --image-path ~/Pictures/photo.jpg \
+  --user-interests "make it modern and professional" \
+  --output-folder ~/Desktop \
+  --aspect-ratio 16:9
+```
+
+### Video Regeneration Pipeline
+
+Process videos with AI understanding and generation:
+
+```bash
+uv run python media_gen/video_regen_pipeline.py --video-path video.mp4 --user-interests "basketball, kapybara"
+```
+
+**Options:**
+- `--video-path`: Input video path (required)
+- `--user-interests`: Processing preferences (required)
+- `--output-folder`: Output directory (default: `~/Downloads`)
+- `--screenshot-interval`: Seconds between screenshots (default: `2.0`)

-video_tool = VideoUnderstandingTool()
-result = video_tool.run({
-    "video_path": "video.mp4",
-    "user_preference": "cinematic lighting",
-    "screenshot_interval": 2.0
-})
+## Development
+
+### Adding Dependencies
+
+```bash
+# Add a new dependency
+uv add package-name
+
+# Add a development dependency
+uv add --dev package-name
+
+# Update dependencies
+uv sync
 ```

-### Video Generation
-```python
-from tools.replicate_video_gen import ReplicateVideoGen
+### Running Tests
+
+```bash
+# Run all tests
+uv run pytest
+
+# Run with coverage
+uv run pytest --cov=media_gen

-video_gen = ReplicateVideoGen()
-result = video_gen.run({
-    "image": "image.jpg",
-    "prompt": "serene landscape with gentle movement"
-})
+# Run specific test file
+uv run pytest media_gen/tests/test_image_understanding.py
 ```

-## Testing
+### Code Quality

 ```bash
-# Test image generation
-python media-gen/image_regen_pipeline.py --image-path media-gen/test_scripts/test_image.png --user-interests "enhance visual appeal"
+# Format code
+uv run black .
+
+# Sort imports
+uv run isort .

-# Test video tools
-python media-gen/test_scripts/test_video_understanding_tool.py
-python media-gen/test_scripts/test_replicate_video_gen.py
+# Type checking
+uv run mypy media_gen/
+
+# Linting
+uv run flake8 media_gen/
 ```

-## Architecture
+## Extensibility
+
+The framework is designed for easy extension. You can create custom tools for:
+
+- **Arbitrary image generation models** (Stable Diffusion, DALL-E, etc.)
+- **Video generation models** (Runway, Pika Labs, etc.)
+- **Custom analysis tools** (scene detection, content filtering, etc.)

-- **`pipeline.py`**: Core pipeline infrastructure
-- **`image_regen_pipeline.py`**: Image regeneration CLI
-- **`video_regen_pipeline.py`**: Video processing pipeline
-- **`tools/`**: Media generation and understanding tools
-- **`utils/`**: Utility functions for video processing
+Tools follow the base class in `media_gen/tools/media_gen_tool_base.py` for consistent integration.
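The new README points to a tool base class without showing it, so the following is a hedged sketch of what a custom tool might look like. The contents of `media_gen/tools/media_gen_tool_base.py` are not part of this diff: the class name `MediaGenToolBase`, the dict-in/dict-out `run` signature (modelled on the `run({...})` calls in the removed README examples), and the toy `EchoImageGen` tool are all assumptions for illustration. The default values reuse the CLI defaults documented above (`1:1`, `png`).

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class MediaGenToolBase(ABC):
    """Hypothetical stand-in for the project's real base class."""

    @abstractmethod
    def run(self, params: Dict[str, Any]) -> Dict[str, Any]:
        """Take a dict of inputs and return a dict of outputs."""


class EchoImageGen(MediaGenToolBase):
    """Toy tool that validates input and echoes the request it would send.

    A real tool would call an image generation model here instead.
    """

    REQUIRED_KEYS = ("prompt",)

    def run(self, params: Dict[str, Any]) -> Dict[str, Any]:
        missing = [key for key in self.REQUIRED_KEYS if key not in params]
        if missing:
            raise ValueError(f"missing required input(s): {', '.join(missing)}")
        return {
            "prompt": params["prompt"],
            # Defaults mirror the documented CLI defaults.
            "aspect_ratio": params.get("aspect_ratio", "1:1"),
            "output_format": params.get("output_format", "png"),
        }


result = EchoImageGen().run({"prompt": "vintage style, steam punk"})
print(result)
```

Keeping every tool behind a single `run` method is what would let a pipeline swap one model backend for another without changing the calling code. The actual interface should be checked against the base class in the repository.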
2 changes: 1 addition & 1 deletion __init__.py
@@ -1,7 +1,7 @@
 """
 Media Regeneration Framework

-A comprehensive AI-powered media generation and understanding framework
+A comprehensive AI-powered media generation and understanding framework
 for images and videos.
 """
