cost_X_Y: X is the cost budget (in USD) for running SWE-agent in our experiment, and Y is the trial (repetition) number.
In this case, we used a budget of 2 USD, and repeated the experiment 3 times.
Inside each cost_X_Y directory, the *.traj files are the conversation logs, one per SWE-bench task instance.
all_pred.jsonl includes all the generated patches.
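The layout above can be explored with a short script. This is a minimal sketch, not part of the repository: the helper names `parse_run_dir`, `load_jsonl`, and `list_traj_files` are hypothetical, and the code makes no assumption about which fields each record in all_pred.jsonl contains beyond it being one JSON object per line.

```python
import json
import re
from pathlib import Path

# Directory names follow cost_X_Y: X = budget in USD, Y = trial number.
RUN_DIR_PATTERN = re.compile(r"^cost_(\d+(?:\.\d+)?)_(\d+)$")


def parse_run_dir(name):
    """Split a directory name such as 'cost_2_1' into (budget_usd, trial)."""
    match = RUN_DIR_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a cost_X_Y directory name: {name!r}")
    return float(match.group(1)), int(match.group(2))


def load_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def list_traj_files(run_dir):
    """Return the *.traj conversation logs in a run directory, sorted by name."""
    return sorted(Path(run_dir).glob("*.traj"))
```

For example, `parse_run_dir("cost_2_3")` gives a budget of 2.0 USD and trial number 3, and `load_jsonl("cost_2_3/all_pred.jsonl")` would return the generated patches for that trial, assuming the file sits directly inside the run directory.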
For AutoCodeRover, the acr-run-1, acr-run-2, and acr-run-3 results correspond to the ACR column under "In our environment" in Table 3.
generated: there is an agent-generated patch for this issue
with_logs: a log file is produced when executing the passing/failing test-cases of this issue
applied: the patch can be applied successfully to the original program.
resolved: the patch made the passing/failing test-cases of this issue pass
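The four categories above can be tallied per run. The sketch below is only an illustration under stated assumptions: it assumes each category has already been loaded as a collection of instance IDs (the actual on-disk format of the result files is not described here), and `summarize_run` is a hypothetical helper, not part of the repository.

```python
CATEGORIES = ("generated", "with_logs", "applied", "resolved")


def summarize_run(statuses, total_instances):
    """Count instances per category and compute the resolved rate.

    statuses: dict mapping a category name to an iterable of instance IDs
              (assumed input shape, not the repository's file format).
    total_instances: number of task instances attempted in the run.
    """
    counts = {name: len(set(statuses.get(name, ()))) for name in CATEGORIES}
    if total_instances > 0:
        resolved_rate = counts["resolved"] / total_instances
    else:
        resolved_rate = 0.0
    return counts, resolved_rate


# Placeholder IDs for illustration only; these are not real results.
example = {
    "generated": ["task-a", "task-b", "task-c"],
    "with_logs": ["task-a", "task-b"],
    "applied": ["task-a", "task-b"],
    "resolved": ["task-a"],
}
counts, rate = summarize_run(example, total_instances=4)
```

With the placeholder input, `counts["resolved"]` is 1 and `rate` is 0.25, since one of the four attempted instances is resolved.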
I am trying to understand results for Auto Code Rover and SWE-Agent.
Can you please let me know the format of the SWE-Agent test results in:
https://github.com/nus-apr/auto-code-rover/tree/main/results/swe-agent-results
What are all these cost_2_1, cost_2_2, and cost_2_3?
How can I understand the results in this directory?
Also, for Auto Code Rover, I see acr-run-1, acr-run-2, and acr-run-3. Which one should I use? Which result are you reporting in the paper?