A Python tool that uses Claude AI to analyze images in PDF documents for copyright considerations. The tool:
- Classifies each image into a copyright category
- Finds potential source citations through reverse image search
- Generates alternative images for replaceable content
- Produces a comprehensive report
Each image is classified into one of four categories:
- N: Not copyrighted/Creative Commons/Public Domain
- P: Permission needed (academic/university/federal/non-profit content)
- F: Fair use applicable
- R: Needs recreation/replacement
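The four labels above could be modeled as a small enum so that the rest of a pipeline works with named values instead of bare letters. This is only a sketch: the class name, member names, and the `parse_label` helper are illustrative assumptions, not identifiers taken from the tool.

```python
from enum import Enum


class CopyrightCategory(Enum):
    """The four one-letter labels listed above (member names are hypothetical)."""

    NOT_COPYRIGHTED = "N"    # Not copyrighted / Creative Commons / Public Domain
    PERMISSION_NEEDED = "P"  # Academic, university, federal, or non-profit content
    FAIR_USE = "F"           # Fair use applicable
    NEEDS_REPLACEMENT = "R"  # Needs recreation or replacement


def parse_label(label: str) -> CopyrightCategory:
    """Map a one-letter label to a category, ignoring case and whitespace.

    Raises ValueError for anything other than N, P, F, or R.
    """
    return CopyrightCategory(label.strip().upper())
```

Rejecting unknown letters with an exception makes it harder for a malformed model response to slip into the report unnoticed.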
Features:
- Extracts and analyzes all images from PDF files
- Performs reverse image search using Google Images
- Attempts to extract citations from source websites
- Annotates the original PDF with classification labels
- Generates a JSON report with detailed classifications
- Uses Claude 3.5 Sonnet for image analysis
- Automatically generates alternative images using Stable Diffusion (optional)
Requirements:
- Python 3.x
- Chrome WebDriver (for Selenium)
- API Keys:
  - Anthropic API key (for Claude)
  - Stable Diffusion API key (for image generation)
  - 2captcha API key (optional, for solving captchas)
Install the dependencies:
pip install anthropic python-dotenv PyMuPDF stability-sdk selenium beautifulsoup4 trafilatura 2captcha-python
Create a .env file with your API keys:
CLAUDE_API_KEY=your_claude_api_key
STABILITY_API_KEY=your_stability_api_key
CAPTCHA_API_KEY=your_2captcha_api_key # Optional
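A start-up check along the following lines can report missing keys before any API call is made. It is a sketch under two assumptions: that the values have already been loaded into the environment (for example by calling `load_dotenv()` from python-dotenv), and that the Stability key is only needed when alternative images are requested. The `missing_keys` helper is hypothetical and not part of the tool.

```python
import os


def missing_keys(env=None, generate_alternatives=False):
    """Return the names of required keys that are unset or empty.

    `env` is any mapping of variable names to values; it defaults to
    os.environ, which python-dotenv's load_dotenv() populates from .env.
    """
    env = os.environ if env is None else env
    required = ["CLAUDE_API_KEY"]
    if generate_alternatives:
        # Assumption: this key matters only when generating alternatives.
        required.append("STABILITY_API_KEY")
    # CAPTCHA_API_KEY is documented as optional, so it is never reported.
    return [name for name in required if not env.get(name)]
```

Calling `missing_keys()` with no arguments checks the real environment; passing a plain dict makes the check easy to test.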
Basic usage:
python main.py path/to/your.pdf
With alternative image generation:
python main.py path/to/your.pdf --generate-alternatives
Show browser during citation search (helpful for debugging):
python main.py path/to/your.pdf --show-browser
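The three invocations above imply one positional argument and two boolean flags. A parser that accepts them might look like the sketch below; the flag names come from the commands shown, while `build_parser`, the `pdf_path` destination name, and the help strings are assumptions about how main.py could be written.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented command lines (illustrative)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Analyze images in a PDF for copyright considerations.",
    )
    parser.add_argument("pdf_path", help="Path to the PDF to analyze")
    parser.add_argument(
        "--generate-alternatives",
        action="store_true",
        help="Generate alternative images for replaceable content",
    )
    parser.add_argument(
        "--show-browser",
        action="store_true",
        help="Show the browser during citation search (useful for debugging)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.pdf_path, args.generate_alternatives, args.show_browser)
```

Both flags default to off, so the basic usage runs without image generation and with the browser hidden.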
Output files:
- Annotated PDF (*_annotated.pdf):
  - Original PDF with classification labels (N/P/F/R) next to each image
- Classification Report (*_classifications.json):
  - Detailed information for each image
  - Includes classifications, citations, source URLs
  - Alternative image paths (if generated)
- Alternative Images (optional):
  - Generated images for content marked as 'R' or 'N'
  - Saved as separate PNG files
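The `*_annotated.pdf` and `*_classifications.json` patterns can be derived from the input path as sketched below. This assumes the `*` stands for the input file's name without its extension and that outputs are written next to the input; neither detail is confirmed here, and `output_paths` is a hypothetical helper.

```python
from pathlib import Path


def output_paths(pdf_path):
    """Return (annotated_pdf, classifications_json) for an input PDF path.

    Assumption: both outputs sit in the same directory as the input.
    """
    pdf = Path(pdf_path)
    return (
        pdf.with_name(f"{pdf.stem}_annotated.pdf"),
        pdf.with_name(f"{pdf.stem}_classifications.json"),
    )
```

For an input named `report.pdf`, this yields `report_annotated.pdf` and `report_classifications.json` in the same folder.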
- Citation Search:
  - Using the --show-browser flag helps monitor the citation search process
  - Useful for debugging when citations aren't being found correctly
  - Allows visual confirmation of successful Google Image searches
- Captcha Handling (Work in Progress):
  - Current implementation of captcha solving has limitations
  - The URL being sent to 2captcha API may not match the actual captcha site
  - Manual intervention might be needed for sites with strict captcha protection
- Performance Considerations:
  - Processing large PDFs with many images may take significant time
  - Image search and text extraction are the most time-consuming operations