Website & Demo | Discord | Preprint
SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can fix bugs and issues in real GitHub repositories.
On SWE-bench, SWE-agent resolves 12.29% of issues, achieving state-of-the-art performance on the full test set.
We achieve these results by designing simple LM-centric commands and feedback formats that make it easier for the LM to browse the repository and to view, edit, and execute code files. We call this an Agent-Computer Interface (ACI). Read more about it in our paper!
SWE-agent is built and maintained by researchers from Princeton University.
If you found this work helpful, please consider using the following citation:
@misc{yang2024sweagent,
title={SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering},
author={John Yang and Carlos E. Jimenez and Alexander Wettig and Kilian Lieret and Shunyu Yao and Karthik Narasimhan and Ofir Press},
year={2024},
}
We provide a command line tool and a graphical web interface:
- Click the button to open this repository in GitHub Codespaces.
- Add your API keys to `keys.cfg` (find the file in the left sidebar and fill out the template).
- Make sure to wait until the `postCreateCommand` in the terminal window at the bottom is finished.
- Enter your SWE-agent command (see below).
Warning
Expect some issues with Windows (we're working on them). In the meantime, use Docker (see below).
- Install Docker, then start Docker locally.
- For the web interface only: Install `nodejs`.
- Clone this repository.
- Run `pip install --editable .` at the repository root (as with any Python setup, it's recommended to use conda or virtual environments to manage dependencies).
- Run `./setup.sh` to create the `swe-agent` Docker image.
- Create a `keys.cfg` file at the root of this repository (see below).
Tip
If you run into Docker issues, see the installation issues section for more help.
Warning
The latest containerized version does not yet provide the web interface.
Instead of installing SWE-agent from source, you can also run the software directly using Docker.
- Install Docker, then start Docker locally.
- Run `docker pull sweagent/swe-agent:latest`.
- Add your API tokens to a file `keys.cfg` as explained below.
Then run
# NOTE:
# This assumes that keys.cfg is in your current directory (else fix the path below)
# This command is equivalent to the script shown in the quickstart
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock \
-v $(pwd)/keys.cfg:/app/keys.cfg \
sweagent/swe-agent-run:latest \
python run.py --image_name=sweagent/swe-agent:latest \
--model_name gpt4 \
--data_path https://github.com/pvlib/pvlib-python/issues/1603 \
--config_file config/default_from_url.yaml --skip_existing=False
Tip
- For more information on the different API keys/tokens, see below.
- If you're using Docker on Windows, use `-v //var/run/docker.sock:/var/run/docker.sock` (double slash) to escape the path.
- See the installation issues section for more help if you run into trouble.
Create a `keys.cfg` file at the root of this repository and populate it with your API keys.
GITHUB_TOKEN: 'GitHub Token Here (optional)'
OPENAI_API_KEY: 'OpenAI API Key Here if using OpenAI Model (optional)'
More options for different keys:
All keys are optional.
GITHUB_TOKEN: 'GitHub Token for access to private repos' # <-- delete line if not used
OPENAI_API_KEY: 'OpenAI API Key Here if using OpenAI Model'
ANTHROPIC_API_KEY: 'Anthropic API Key Here if using Anthropic Model'
TOGETHER_API_KEY: 'Together API Key Here if using Together Model'
AZURE_OPENAI_API_KEY: 'Azure OpenAI API Key Here if using Azure OpenAI Model'
AZURE_OPENAI_ENDPOINT: 'Azure OpenAI Endpoint Here if using Azure OpenAI Model'
AZURE_OPENAI_DEPLOYMENT: 'Azure OpenAI Deployment Here if using Azure OpenAI Model'
AZURE_OPENAI_API_VERSION: 'Azure OpenAI API Version Here if using Azure OpenAI Model'
OPENAI_API_BASE_URL: 'LM base URL here if using Local or alternative api Endpoint'
See the following links for tutorials on obtaining Anthropic, OpenAI, and GitHub tokens.
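As a quick sanity check before a run, the sketch below lists which keys a `keys.cfg` file actually defines. It assumes the `KEY: 'value'` line format shown above; `read_keys` is a hypothetical helper written for illustration and is not how SWE-agent itself reads the file.

```python
import re

# Matches lines such as:  OPENAI_API_KEY: 'sk-...'   # optional trailing comment
# Assumption: keys are upper-case identifiers and values are single-quoted.
KEY_LINE = re.compile(r"^\s*([A-Z][A-Z0-9_]*)\s*:\s*'([^']*)'")


def read_keys(text: str) -> dict[str, str]:
    """Return the key/value pairs found in keys.cfg-style text."""
    keys = {}
    for line in text.splitlines():
        match = KEY_LINE.match(line)
        if match:  # blank lines and comment-only lines simply do not match
            keys[match.group(1)] = match.group(2)
    return keys
```

For example, `sorted(read_keys(open("keys.cfg").read()))` gives the names of the keys that are set, without printing their values.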
If you run into issues running Docker:
- Make sure that you allow the use of the Docker socket. In Docker Desktop, click Settings > Advanced > Allow the default Docker socket to be used (requires password).
- If your Docker installation uses a different socket, you might have to symlink it to the default location; see this command for an example.
Any remaining issues? Please open a GitHub issue!
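To narrow down socket problems like the ones above, a small diagnostic can help. The following is a minimal sketch that assumes the default socket path `/var/run/docker.sock` used in the commands in this README; `describe_socket` is a hypothetical helper and not part of SWE-agent.

```python
import os
import stat

DEFAULT_SOCKET = "/var/run/docker.sock"  # path mounted in the docker run command


def describe_socket(path: str = DEFAULT_SOCKET) -> str:
    """Return a short diagnosis of the Docker socket at `path`."""
    if not os.path.lexists(path):
        return "missing"
    if os.path.islink(path) and not os.path.exists(path):
        return "broken symlink"  # the link exists but its target does not
    if not stat.S_ISSOCK(os.stat(path).st_mode):
        return "not a socket"
    if not os.access(path, os.R_OK | os.W_OK):
        return "no permission"  # the current user cannot read/write the socket
    return "ok"


if __name__ == "__main__":
    print(f"{DEFAULT_SOCKET}: {describe_socket()}")
```

A result of "missing" or "broken symlink" points to the socket setting or symlink described above, while "no permission" suggests a user or group configuration issue.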
To start our web UI, simply run
./start_web_ui.sh
If the user interface doesn't automatically open in your browser, please open it at http://localhost:3000.
Currently, the web interface only has a subset of the options of the command line interface (CLI).
For the CLI, use the `run.py` script:
python run.py --model_name gpt4 \
--data_path https://github.com/pvlib/pvlib-python/issues/1603 \
--config_file config/default_from_url.yaml \
--per_instance_cost_limit 2.00
You can also apply it to a local repository:
python run.py --model_name gpt4 \
--data_path /path/to/my_issue.md \
--repo_path /path/to/my/local/repo \
--config_file config/default_from_url.yaml \
--per_instance_cost_limit 2.00 \
--apply_patch_locally
Tip
- Run `python run.py --help` to see all available options.
- You can have the agent automatically open a PR if the issue has been solved by supplying the `--open_pr` flag. Please use this feature responsibly (on your own repositories or after careful consideration).
- See the `scripts/` folder for other useful scripts and details.
- See the `config/` folder for details about how you can define your own configuration!
- See the `sweagent/agent/` folder for details about the logic behind configuration-based workflows.
- See the `sweagent/environment/` folder for details about the `SWEEnv` environment (interface + implementation).
- See the `trajectories/` folder for details about the output of `run.py`.
Ollama Support
Models served with an Ollama server can be used by specifying `--model_name` with `ollama:model_name` and `--host_url` to point to the URL used to serve Ollama (`http://localhost:11434` by default). See more details about using ollama here.
python run.py --model_name ollama:deepseek-coder:6.7b-instruct \
--host_url http://localhost:11434 \
--data_path https://github.com/pvlib/pvlib-python/issues/1603 \
--config_file config/default_from_url.yaml
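Before starting a run, it can be useful to confirm that the Ollama server is reachable and has the model pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models. The helper names are hypothetical, and the default host is the one given above.

```python
import json
import urllib.request


def model_name_flag(model: str) -> str:
    """Build the value passed to --model_name for an Ollama model."""
    return f"ollama:{model}"


def installed_models(tags_json: str) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    payload = json.loads(tags_json)
    return [entry["name"] for entry in payload.get("models", [])]


def fetch_tags(host_url: str = "http://localhost:11434", timeout: float = 5.0) -> str:
    """Download the raw /api/tags response from a running Ollama server."""
    url = f"{host_url.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With a server running, `"deepseek-coder:6.7b-instruct" in installed_models(fetch_tags())` indicates whether the model from the example command is available; a connection error from `fetch_tags` suggests the `--host_url` value is wrong or the server is not running.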
There are two steps to the SWE-agent pipeline. First, SWE-agent takes an input GitHub issue and returns a pull request that attempts to fix it. We call that step inference. The second step (currently only available for issues in the SWE-bench benchmark) is to evaluate the pull request to verify that it has indeed fixed the issue.
Warning
At this moment, there are known issues with a small number of repositories that don't install properly on `arm64` / `aarch64` architecture computers. We're working on a fix, but if you'd like to run and evaluate on the entirety of SWE-bench, the easiest way is by using an `x86` machine.
Inference on any GitHub Issue: See above.
Inference on SWE-bench: Run SWE-agent on SWE-bench Lite and generate patches.
python run.py --model_name gpt4 \
--per_instance_cost_limit 2.00 \
--config_file ./config/default.yaml
If you'd like to run on a single issue from SWE-bench, use the `--instance_filter` option as follows:
python run.py --model_name gpt4 \
--instance_filter marshmallow-code__marshmallow-1359
This step is only available for issues from the SWE-bench set. To evaluate generated pull requests:
cd evaluation/
./run_eval.sh <predictions_path>
Replace `<predictions_path>` with the path to the model's predictions, which should be generated from the Inference step. The `<predictions_path>` argument should look like `../trajectories/<username>/<model>-<dataset>-<hyperparams>/all_preds.jsonl`.
- See the `evaluation/` folder for details about how evaluation works.
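Before passing a predictions file to the evaluation script, it may be worth checking that it is well-formed. The sketch below assumes only what the `.jsonl` extension implies, namely one JSON object per line; it makes no assumption about the field names inside each record, and `load_predictions` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_predictions(path: str) -> list:
    """Parse a JSON Lines file, failing loudly on the first malformed line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as error:
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({error.msg})"
                ) from error
    return records
```

Calling `len(load_predictions(predictions_path))` then gives the number of predictions that will be evaluated.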
If you'd like to modify the example demonstration that we feed the model at the start of each run, first generate a trajectory manually by running the agent with `--model_name human` and then convert that trajectory into a demonstration by following the guide here.
To edit text in `human` mode:
- Run the command `edit edit_start_line:edit_end_line`.
- Write the text you want to insert. Feel free to write the text across multiple lines.
- Press `return`, then write `end_of_edit`, and then press `return` again to submit the edit.
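The three steps above amount to a small text protocol: the `edit` command line, the replacement text, and a closing `end_of_edit` line. As an illustration only, the hypothetical helper below assembles the exact text one would type for a given edit; it is not part of SWE-agent and does not check that the line numbers are valid for any particular file.

```python
def format_edit(start_line: int, end_line: int, new_text: str) -> str:
    """Return the text to type in human mode for one edit.

    The result consists of the `edit start:end` command, the replacement
    text (which may span several lines), and the `end_of_edit` terminator,
    each followed by a newline (the `return` key).
    """
    body = new_text.rstrip("\n")  # avoid an accidental blank line before the terminator
    return f"edit {start_line}:{end_line}\n{body}\nend_of_edit\n"
```

For instance, `print(format_edit(3, 4, "x = 1\ny = 2"))` shows a four-line sequence: the command, the two replacement lines, and `end_of_edit`.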
- If you'd like to ask questions, learn about upcoming features, and participate in future development, join our Discord community!
- If you'd like to contribute to the codebase, we welcome issues and pull requests!
- If you'd like to see a post or tutorial about some topic, please let us know via an issue.
Contact person: John Yang and Carlos E. Jimenez (Email: {jy1682, carlosej}@princeton.edu).
MIT. Check `LICENSE`.