Migrate from pip to uv for dependency management #188

Merged
speediedan merged 7 commits into main from pip_to_uv_migration on Nov 14, 2025

@speediedan
Owner

Overview

This PR migrates Interpretune from pip-based dependency management to uv, a modern, fast Python package installer. This migration improves installation speed, reliability, and reproducibility across development and CI environments.

Key Changes

1. Dependency Management Migration

Before: Multiple requirements files managed by custom regen_reqfiles.py script

  • requirements/ci/requirements.in + various constraint files
  • requirements/ci/platform_dependent.txt
  • requirements/ci/post_upgrades.txt
  • requirements/ci/circuit_tracer_pin.txt
  • requirements/docs.txt
  • Custom Python script for requirements generation

After: Centralized dependency specification in pyproject.toml with UV-based locking

  • All dependencies defined in pyproject.toml using PEP 735 dependency groups
  • Simple shell script requirements/utils/lock_ci_requirements.sh for generating locked requirements
  • Git URL dependencies separated into git-deps group (UV doesn't support URLs in universal lock files)
  • Locked requirements in requirements/ci/requirements.txt for CI reproducibility

2. Installation Flow Simplification

CI Installation (2 steps):

# Step 1: Install interpretune + git dependencies
uv pip install -e . --group git-deps

# Step 2: Install locked PyPI dependencies
uv pip install -r requirements.txt

Development Installation (single command):

uv pip install -e ".[test,examples,lightning,profiling]" --group git-deps --group dev

Advanced Development (with from-source packages):

./scripts/build_it_env.sh --repo_home=${PWD} --target_env_name=it_latest \
  --from-source="finetuning_scheduler:${HOME}/repos/finetuning-scheduler:all" \
  --from-source="circuit_tracer:${HOME}/repos/circuit-tracer"

3. Dependency Groups (PEP 735)

Organized dependencies into logical groups in pyproject.toml:

  • git-deps: Git URL dependencies (circuit-tracer) - installed separately from locked requirements
  • dev: Core development tools (uv, pre-commit, pyright)
  • test: Testing framework and tools (pytest, coverage, etc.)
  • docs: Documentation generation (Sphinx, theme, extensions)
  • profiling: Performance profiling tools (py-spy)

4. PyTorch weights_only checkpoint Compatibility Fixes

Issue: Loading checkpoints with weights_only=True now requires explicit allowlisting of classes for safe unpickling.

Solution: Auto-register all config classes and Path types as safe globals:

  • Added __init_subclass__ to ITSerializableCfg base class
  • Automatically registers all subclasses (including user experiment configs) with torch.serialization.add_safe_globals()
  • Registers pathlib.PosixPath and pathlib.WindowsPath to allow Path objects in serialized configs
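The registration pattern described above can be sketched roughly as follows. This is not the code from shared.py: `ITSerializableCfg` here is a bare stand-in, and `REGISTERED`, `_register_safe_global` and `MyExperimentCfg` are hypothetical names added for illustration. The sketch calls `torch.serialization.add_safe_globals()` only when PyTorch is installed and exposes it.

```python
import pathlib

# Hypothetical bookkeeping list so the sketch can be inspected without PyTorch.
REGISTERED = []


def _register_safe_global(cls):
    """Record cls and, when PyTorch provides it, allowlist cls for weights_only loading."""
    REGISTERED.append(cls)
    try:
        import torch
    except ImportError:
        return  # PyTorch not installed: keep only the local record
    add_safe_globals = getattr(torch.serialization, "add_safe_globals", None)
    if add_safe_globals is not None:  # present in recent PyTorch releases
        add_safe_globals([cls])


class ITSerializableCfg:
    """Stand-in for the base class named above; the real class does more."""

    def __init_subclass__(cls, **kwargs):
        # Runs once for every subclass at class-creation time.
        super().__init_subclass__(**kwargs)
        _register_safe_global(cls)


# Path types are registered once so serialized configs may contain Path objects.
for _path_cls in (pathlib.PosixPath, pathlib.WindowsPath):
    _register_safe_global(_path_cls)


class MyExperimentCfg(ITSerializableCfg):
    """Hypothetical user config: registered automatically, no explicit call needed."""
```

Because `__init_subclass__` fires when a subclass is defined, user experiment configs are allowlisted as soon as their module is imported, without any per-class boilerplate.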

Files Modified:

  • shared.py: Added auto-registration via the __init_subclass__ hook
  • datamodule.py: Updated type hints to allow Path | str instead of restricting to str

5. Build Script Improvements

build_it_env.sh:

  • Simplified 3-step installation: editable install → locked requirements → from-source packages
  • Support for multiple from-source packages with optional extras and environment variables
  • Improved venv placement options for hardlink performance (same filesystem as UV cache)
  • Better logging and error handling

lock_ci_requirements.sh:

  • New simple shell wrapper around uv pip compile
  • Generates universal lock file from pyproject.toml
  • Replaces 200+ lines of Python code with ~30 lines of shell
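For readers who want a feel for what such a wrapper does, here is a minimal Python sketch of assembling a `uv pip compile` call for a universal lock file. It is not the contents of lock_ci_requirements.sh, which is a shell script not shown in this PR description; the function names, default paths and the extras passed in the example are assumptions. Only the `--universal`, `--output-file` and `--extra` flags of `uv pip compile` are used.

```python
import shutil
import subprocess


def build_lock_command(pyproject="pyproject.toml",
                       output="requirements/ci/requirements.txt",
                       extras=()):
    """Assemble a `uv pip compile` invocation for a universal lock file."""
    cmd = ["uv", "pip", "compile", pyproject, "--universal", "--output-file", output]
    for extra in extras:
        cmd += ["--extra", extra]
    return cmd


def lock(dry_run=True, **kwargs):
    """Print the command, and run it only when dry_run is False and uv is on PATH."""
    cmd = build_lock_command(**kwargs)
    if dry_run or shutil.which("uv") is None:
        print(" ".join(cmd))
        return cmd
    subprocess.run(cmd, check=True)
    return cmd


# Dry run: shows the command without touching any files.
lock(extras=("test", "examples"))
```

Keeping the command construction separate from execution makes the wrapper easy to inspect, which is part of the appeal of replacing a larger generation script with a thin layer over uv.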

6. CI/CD Updates

GitHub Actions:

  • action.yml: Updated to 2-step UV installation
  • action.yml: Removed uv run prefixes (environment pre-activated)
  • copilot-setup-steps.yml: Uses --group dev instead of listing packages

Azure Pipelines:

  • gpu-tests.yml: Aligned with GitHub Actions installation flow

7. Documentation Updates

README.md:

  • Quick start with single-command installation
  • Development setup with build script examples
  • Locked requirements section for CI reproducibility

.github/copilot-instructions.md:

  • Detailed CI and development installation flows
  • Dependency update procedures using new lock script
  • From-source package installation examples

Makefile:

  • docs target now uses uv pip install --group docs instead of docs.txt

8. Type System Improvements

StrOrPath Type Alias:

  • Narrowed from Union[str, PathLike, Path] to Union[str, Path]
  • Fixes OmegaConf serialization issues ("Unions of containers not supported")
  • Aligns with actual usage patterns (no PathLike objects in practice)
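The narrowing can be pictured with the short sketch below. The two aliases follow the before and after forms stated above; `OldStrOrPath` and the `as_path` helper are hypothetical names introduced only for illustration and are not taken from protocol.py.

```python
from os import PathLike
from pathlib import Path
from typing import Union

# Before: three members, including the generic PathLike protocol.
OldStrOrPath = Union[str, PathLike, Path]

# After: only the two concrete types that are used in practice.
StrOrPath = Union[str, Path]


def as_path(value: StrOrPath) -> Path:
    """Normalize either accepted form to a Path (hypothetical helper)."""
    return value if isinstance(value, Path) else Path(value)


print(as_path("configs/experiment.yaml"))
print(as_path(Path("configs") / "experiment.yaml"))
```

With only `str` and `Path` in the union, consumers need a single `isinstance` check to normalize inputs.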

Files Deleted

  • requirements/ci/requirements.in
  • requirements/ci/platform_dependent.txt
  • requirements/ci/post_upgrades.txt
  • requirements/ci/circuit_tracer_pin.txt
  • requirements/utils/regen_reqfiles.py
  • tests/core/test_regen_reqfiles.py

Files Added

  • requirements/ci/requirements.txt
  • lock_ci_requirements.sh
  • uv.lock (UV's lock file, not committed but may be generated locally)

Testing

  • ✅ All 32 parity acceptance tests passing
  • ✅ OmegaConf Union type compatibility verified
  • ✅ PyTorch pickle security compatibility verified
  • ✅ CI installation flow tested locally
  • ✅ From-source package installation tested with finetuning_scheduler, transformer_lens, circuit_tracer

Performance Improvements

  • Installation speed: UV is 10-100x faster than pip for dependency resolution
  • Cache efficiency: UV uses hardlinks when venv and cache are on same filesystem
  • Reliability: Deterministic dependency resolution with locked requirements
  • Developer experience: Single-command development environment setup

Migration Impact

Breaking Changes: None for end users

  • Installation commands updated but documented
  • All existing functionality preserved
  • CI workflows updated transparently

Developer Impact:

  • Simpler dependency management workflow
  • Faster environment setup
  • Clearer separation of concerns (base deps vs. dev deps vs. CI locked deps)

Next Steps

After this PR merges:

  1. Monitor CI stability across all platforms (Linux, Windows, macOS)
  2. Update contributor documentation if needed
  3. Consider publishing circuit-tracer to PyPI to eliminate git-deps group

…interface

- Replace custom regen_reqfiles.py with simple lock_ci_requirements.sh
- Consolidate all dependencies in pyproject.toml using PEP 735 groups
- Simplify CI installation to 2-step process (editable + locked reqs)
- Auto-register config classes as safe globals for PyTorch checkpoint unpickling
- Update documentation (README, copilot-instructions) with new install flows
- Update Makefile docs target to use --group docs
- Remove obsolete requirements files (base.txt, devel.txt, docs.txt, etc.)
- Update dev infra scripts and CI workflows for uv pip interface

This improves installation speed, reliability, and maintainability while
preserving all existing functionality.
@github-actions bot added labels on Nov 13, 2025: module: config (Configuration system), module: protocol (Protocol.py specific), area: examples (Example code and demos), area: tests (Testing code), area: docs (Documentation files), area: ci (Continuous integration), area: build (Build system and packaging), area: scripts (Shell scripts and automation), dependencies (Dependency updates), config (Configuration file changes)
…egression for AnalysisStoreProtocol, remove deprecated use of pkg_resources
@github-actions bot added labels on Nov 13, 2025: module: analysis (Analysis functionality), module: utils (Utility functions)

github-actions bot commented Nov 13, 2025

regen-ci-req-check detected changes to pinned CI requirements and uploaded a patch artifact named 'regen-pins-diff'.

Top changes:

  1. bitsandbytes: ==0.48.2 → ==0.48.2

Please review the artifact and CI results. This workflow is report-only and will not open a PR; the scheduled regen workflow will open PRs automatically.

@speediedan speediedan marked this pull request as ready for review November 13, 2025 21:38
Copilot AI review requested due to automatic review settings November 13, 2025 21:38
Contributor

Copilot AI left a comment

Pull Request Overview

This PR migrates Interpretune from pip-based dependency management to uv, a modern, fast Python package installer. The migration simplifies dependency management, improves installation speed (10-100x faster), and enhances reproducibility across development and CI environments.

Key Changes:

  • Replaced custom Python script (regen_reqfiles.py) with simple shell script (lock_ci_requirements.sh) using uv pip compile
  • Migrated from deprecated pkg_resources to importlib.metadata for version checking
  • Introduced PEP 735 dependency groups in pyproject.toml for better organization
  • Updated PyTorch checkpoint loading to handle weights_only=True with automatic class registration
  • Simplified CI/CD workflows to 2-step installation: editable install + locked requirements
  • Enhanced build scripts with support for multiple from-source packages and flexible venv placement
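The move from pkg_resources to importlib.metadata for version checks can be sketched as below. The helper name `installed_version` and the example distribution names are assumptions for illustration; the actual helpers in import_utils.py, tests/warns.py and tests/runif.py are not shown in this review.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent.

    Stands in for the older pkg_resources.get_distribution(dist_name).version
    pattern, without importing setuptools at runtime.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Hypothetical lookups: the result depends on the local environment.
for name in ("pip", "surely-not-an-installed-distribution"):
    print(name, "->", installed_version(name))
```

Returning `None` for a missing distribution lets callers treat "not installed" as an ordinary case rather than catching an exception at every call site.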

Reviewed Changes

Copilot reviewed 43 out of 46 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/warns.py, tests/runif.py | Migrated from pkg_resources to importlib.metadata for version checking |
| tests/core/test_regen_reqfiles.py | Deleted (no longer needed with uv-based locking) |
| src/it_examples/utils/raw_graph_analysis*.py | Fixed attribute name (logit_tokens → logit_token_ids), commented out incomplete example |
| src/interpretune/utils/logging.py | Added safety check for pip_packages parsing |
| src/interpretune/utils/import_utils.py | Migrated from pkg_resources to importlib.metadata |
| src/interpretune/protocol.py | Narrowed StrOrPath type, added PathLike back to AnalysisStoreProtocol |
| src/interpretune/config/shared.py | Added auto-registration of config classes for PyTorch safe pickle loading |
| src/interpretune/analysis/core.py | Updated type hints to include os.PathLike |
| scripts/infra_utils.sh | Added utility functions for venv management and from-source package parsing |
| scripts/gen_it_coverage.sh | Updated to use new from-source specification format |
| scripts/build_it_env.sh | Rewritten to use uv with 3-step installation flow |
| requirements/utils/lock_ci_requirements.sh | New shell script replacing Python-based regeneration |
| requirements/utils/regen_reqfiles.py | Deleted (replaced by lock_ci_requirements.sh) |
| requirements/*.txt | Deleted individual requirement files (consolidated in pyproject.toml) |
| requirements/ci/requirements.txt | Updated to uv-generated lock file with 799 lines |
| requirements/ci/*.txt (other files) | Deleted (no longer needed with uv) |
| pyproject.toml | Reorganized dependencies into PEP 735 dependency groups |
| dockers/* | Updated PyTorch version to 2.9.1, migrated from pip to uv |
| README.md | Added comprehensive installation documentation |
| Makefile | Updated to use uv and removed obsolete environment variables |
| .gitignore | Added .python-version for uv |
| .github/workflows/* | Updated CI workflows to use uv with 2-step installation |
| .github/copilot-instructions.md | Updated with comprehensive uv-based workflows |
| .github/actions/* | Updated actions to use uv and simplified installation |
| .azure-pipelines/gpu-tests.yml | Aligned with GitHub Actions uv-based installation |

@speediedan speediedan merged commit 196a2d1 into main Nov 14, 2025
14 checks passed