Pin v2 dependency versions for reproducibility #14

Open
conorbronsdon wants to merge 1 commit into rungalileo:main from conorbronsdon:chore/pin-v2-dependencies

Conversation

@conorbronsdon
Contributor

@conorbronsdon conorbronsdon commented Mar 13, 2026

User description

Summary

  • Adds minimum version pins (>=) to all packages in v2/requirements.txt
  • Normalizes package names to use hyphens per PyPI convention (e.g. langchain_anthropic → langchain-anthropic)
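
For illustration, the hyphen normalization in the second bullet lines up with the name rule in PEP 503: runs of `-`, `_`, and `.` collapse to a single `-`, and the result is lowercased. Here is a minimal Python sketch of that rule; the helper name `normalize_name` is an assumption for this example and is not part of the PR.

```python
import re


def normalize_name(name: str) -> str:
    """Normalize a distribution name using the PEP 503 rule.

    Hypothetical helper for illustration; not code from this PR.
    """
    # Collapse any run of '-', '_' or '.' into one hyphen, then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


# The first name is the example from the PR summary; the others show
# that already-normalized and mixed-case names are handled as well.
for raw in ["langchain_anthropic", "langchain-openai", "Langchain.AWS"]:
    print(raw, "->", normalize_name(raw))
```

Because pip compares names in normalized form, the underscore and hyphen spellings resolve to the same project, so this part of the change is cosmetic and should not alter what gets installed.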

Context

The v2 requirements had zero version constraints, meaning evaluation results could vary depending on whatever package versions happen to install at the time. This is a reproducibility risk for a benchmarking tool. Minimum pins ensure a known-good baseline while still allowing compatible updates.

Test plan

  • pip install -r v2/requirements.txt resolves cleanly in a fresh virtualenv
  • Existing evaluation scripts import successfully

🤖 Generated with Claude Code


Generated description

Below is a concise technical summary of the changes proposed in this PR:
Pin the v2 benchmarking dependency list to minimum tested versions so that evaluations run against a known baseline. Normalize the package names in v2/requirements.txt to PyPI's hyphenated convention while keeping installs flexible above each baseline.

Latest Contributors (1)
User: pratik@galileo.ai · Commit: changes-for-kimi-thinking · Date: November 18, 2025
This pull request is reviewed by Baz.

The v2 requirements.txt had no version constraints at all, meaning
results could vary depending on whatever versions happen to install.
Add minimum version pins (>=) based on current latest releases to ensure
consistent behavior while still allowing compatible updates.

Also normalizes package names to use hyphens per PyPI convention.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@conorbronsdon conorbronsdon marked this pull request as ready for review March 13, 2026 06:13
Comment thread v2/requirements.txt
Comment on lines +1 to +12
galileo>=1.49.0
langchain>=1.2.0
langchain-anthropic>=1.3.0
langchain-baseten>=0.1.9
langchain-openai>=1.1.0
langchain-mistralai>=1.1.0
langchain-google-genai>=4.2.0
langchain-together>=0.3.0
langchain-fireworks>=1.1.0
langchain-writer>=0.3.5
langchain-deepseek>=1.0.0
langchain-aws>=1.4.0

v2 requirements use minimum-bound specifiers (e.g. galileo>=1.49.0) instead of exact pins, so pip can install newer/unbounded versions and installs remain non-deterministic; the PR's reproducibility goal is not met. Can we pin each dependency with == or add a lockfile/constraints file?
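
To make the distinction in this comment concrete, here is a simplified Python sketch of how a `>=` bound and an `==` pin treat candidate versions. It handles only plain three-part numeric versions, not the full PEP 440 rules that pip applies. The helper names and the candidate versions are hypothetical; only the `galileo>=1.49.0` line comes from the diff.

```python
def parse_version(text: str) -> tuple:
    """Turn '1.49.0' into (1, 49, 0). Numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def satisfies(requirement: str, candidate: str) -> bool:
    """Check a candidate version against a 'name>=X' or 'name==X' line.

    Simplified illustration; real resolvers follow PEP 440.
    """
    if ">=" in requirement:
        bound = requirement.split(">=", 1)[1]
        return parse_version(candidate) >= parse_version(bound)
    if "==" in requirement:
        bound = requirement.split("==", 1)[1]
        return parse_version(candidate) == parse_version(bound)
    raise ValueError(f"unsupported requirement: {requirement!r}")


# Hypothetical candidate releases, chosen only to show the behavior.
candidates = ["1.48.0", "1.49.0", "2.0.0"]
for requirement in ["galileo>=1.49.0", "galileo==1.49.0"]:
    accepted = [v for v in candidates if satisfies(requirement, v)]
    print(requirement, "accepts", accepted)
```

With the minimum bound, every release at or above the floor remains eligible, so two installs made at different times can differ. With the exact pin, only one release is eligible.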

Finding type: Logical Bugs | Severity: 🔴 High

Prompt for AI Agents:

In v2/requirements.txt around lines 1-19, the dependencies are currently specified with
minimum bounds (e.g., >=) which does not provide reproducible installs. Replace each >=
specifier with an exact pin (==) using the resolved versions from a reproducible lock
step (for example run pip freeze in a controlled environment or use pip-tools to compile
a constraints file) so each line becomes package==<exact_version>. Alternatively, if you
prefer a constraints file, create v2/constraints.txt with exact versions and update
v2/requirements.txt to reference the constraints file (or document how to install using
--constraint), and ensure there are no duplicate or inconsistent package names and that
formatting remains one package per line. Return the updated v2/requirements.txt and the
generated lock/constraints file.

Contributor Author

Intentionally using >= rather than == here. The previous requirements had zero version constraints, so >= is already a big improvement. Exact pins in requirements.txt cause unnecessary dependency conflicts for contributors and downstream users. If the project adopts uv or pip-tools later, a lockfile can layer on top of these minimum bounds for CI reproducibility without constraining end users.
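
As one possible way to layer exact versions on top of the minimum bounds without editing requirements.txt, the sketch below derives `name==version` constraint lines from whatever is installed in the current environment, using the standard-library `importlib.metadata`. The function names are assumptions, the parser handles only simple `name>=version` lines, and the versions in the demo are made up. Tools such as uv or pip-tools, mentioned above, do this job more thoroughly.

```python
import re
from importlib import metadata


def requirement_name(line: str) -> str:
    """Return the package name from a simple 'name>=version' line."""
    return re.split(r"[<>=!~\s;\[]", line.strip(), maxsplit=1)[0]


def build_constraints(lines, lookup=metadata.version):
    """Map requirement lines to exact 'name==version' constraint lines.

    `lookup` returns the installed version for a name; it defaults to
    importlib.metadata.version and can be swapped out for testing.
    """
    pinned = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        name = requirement_name(stripped)
        try:
            pinned.append(f"{name}=={lookup(name)}")
        except metadata.PackageNotFoundError:
            # Not installed here, so there is no version to record.
            pinned.append(f"# {name}: not installed, left unpinned")
    return pinned


# Demo with made-up installed versions (hypothetical, not real data).
fake_installed = {"galileo": "1.50.2", "langchain": "1.2.3"}


def fake_lookup(name: str) -> str:
    if name not in fake_installed:
        raise metadata.PackageNotFoundError(name)
    return fake_installed[name]


demo_lines = ["galileo>=1.49.0", "langchain>=1.2.0", "langchain-aws>=1.4.0"]
print("\n".join(build_constraints(demo_lines, lookup=fake_lookup)))
```

If the output were saved as v2/constraints.txt, CI could install with `pip install -r v2/requirements.txt -c v2/constraints.txt`, while contributors who omit the `-c` option would still get the flexible minimum bounds.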

Understood on preferring minimum bounds with >= here to reduce conflicts and rely on a future lockfile for reproducibility; I’ll save this preference to memory after the PR is merged.

@galileo-automation

No activity for 30 days — this PR will be closed in 5 days unless updated.
