Add AGENTS instructions #3953
# AGENT Instructions

## Development environment
- Use **Python 3.10+**. Create and activate a virtual environment via `uv`, `virtualenv`, Conda, or `pyenv` before installing dependencies.
FYI: I would recommend that new developers use uv, but I haven't gotten around to updating the documentation on ReadTheDocs about this yet.
Agreed. This is mostly about helping an AI agent get into the right development environment, although AGENTS.md files contain a lot of good info for human developers too. It probably makes sense to ditch anything non-uv at this point.
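As a rough illustration of the environment requirement discussed above, the following is a minimal sketch of a pre-flight check an agent or developer could run before installing dependencies. The function name `environment_problems` and its parameters are hypothetical and not part of HELM; the only facts taken from the draft are the Python 3.10+ minimum and the need for an active virtual environment. The `sys.prefix` comparison detects environments created by `venv`, `virtualenv`, or `uv`, but generally not Conda environments, so treat that part as an assumption.

```python
import sys

# Minimum version taken from the AGENTS.md draft ("Use Python 3.10+").
MIN_PYTHON = (3, 10)


def environment_problems(version_info=None, prefix=None, base_prefix=None):
    """Return a list of human-readable problems with the current interpreter.

    Hypothetical helper: arguments default to the running interpreter's
    values and exist only so the check can be exercised in isolation.
    """
    version_info = sys.version_info if version_info is None else version_info
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    problems = []
    # Compare only (major, minor) against the required minimum.
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            "Python %d.%d+ is required, found %d.%d"
            % (MIN_PYTHON[0], MIN_PYTHON[1], version_info[0], version_info[1])
        )
    # In a venv/virtualenv/uv environment, sys.prefix differs from
    # sys.base_prefix. Conda environments are usually not detected this way.
    if prefix == base_prefix:
        problems.append("no virtual environment appears to be active")
    return problems


if __name__ == "__main__":
    for problem in environment_problems():
        print(problem)
```

Called with no arguments, the function inspects the running interpreter; an empty list means both checks passed.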
## Operational notes
- Default local config path is `./prod_env/`; override with `--local-path` when running commands. Ensure required provider credentials are configured before executing model-dependent runs.
- Many scenarios/model clients download datasets or call external APIs; prefer running without `-m models`/`-m scenarios` in CI to avoid costs and failures. Use markers deliberately to target specific expensive suites.
- Static leaderboard assets reside under `src/helm/benchmark/static` and `static_build`; React frontend is an alternative UI and not the deployed default.
FYI: The React frontend is the default UI; static_build is a compiled version provided for the convenience of Python users who do not have Node installed. There might still be a few references in the code and documentation to it being an "alternative", because there used to be a separate legacy default frontend that has since been deleted.
This is helpful to know. An issue with agents right now is that they will home in on incorrect documentation and assume it is true (although if you tell them something is incorrect, they are typically good at respecting that).
While this is interesting, I'm not interested in merging this because I am currently the sole main maintainer, and I am not making significant use of agents in this codebase. Additionally, I am planning to transition HELM to maintenance mode later this year (I have not publicly announced this yet), after which there should be minimal large changes to the codebase. Relatedly, if you're interested in forking HELM to support your use cases, I would be open to that and happy to chat more.
I recommend playing around with agents, maybe not here, but just in general. They're at the point where they are useful a decent percentage of the time. They still need a lot of hand-holding, but the ability to find the right context in a repo is pretty nice. I'll often spin up a few to do busywork tasks while I focus on something else, and when I come back some of them have failed, but a few have returned a decent result. Having an AGENTS.md file can make them significantly more efficient, as it gives them high-level context. For big repos like this one it makes a big difference.

Understood about stepping back from active development. It's a lot of work, and I'm grateful for the time you've spent helping me get up to speed with this repo. The code here does a lot, and that can be a double-edged sword. Forking might be an option, but I'm also hesitant to pick up maintainership of another large repo (I already take care of 30+ packages, a few of which need active care). Still, it may be the best path forward for the MAGNET project. I'll let you know.
I've been experimenting with coding agents, and something that can help is adding a top-level https://agents.md file, which gives the agent high-level context about the repo so it can more quickly plan how to carry out whatever the user requested.
To create this AGENTS.md file, I used GPT Codex and prompted it to do a deep dive into the repo and write the resulting file. I manually checked the result and made some small modifications.
Codex output:
Summary
Codex Task