bench readme
franlucc committed Jul 14, 2023
1 parent f7eb305 commit 135db20
Showing 1 changed file with 12 additions and 0 deletions.
12 changes: 12 additions & 0 deletions benchmark/readme.md
@@ -0,0 +1,12 @@
# Benchmark and Robot Simulator

The benchmark tasks are stored in `benchmark.jsonl`. Each task is evaluated by simulating the LLM-generated code in Answer Set Programming (ASP).

For each task, the simulation trace is checked against ASP temporal constraints. A readable version of the constraints can be found in `evaluator/constraints`. An ASP solver (Clingo) determines whether the trace satisfies them.
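To make the idea of a temporal constraint over a trace concrete, here is a minimal sketch in plain Python. It is only a stand-in: the actual check is written in ASP and solved by Clingo, and the trace shape, action names, and the `happens_before` helper below are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical trace format: (step, action, argument). The real trace is a
# set of ASP facts; this tuple shape is assumed only for this sketch.
Event = Tuple[int, str, str]


def happens_before(trace: List[Event],
                   first: Tuple[str, str],
                   second: Tuple[str, str]) -> bool:
    """Return True if `first` occurs and `second` occurs at a strictly later step."""
    first_steps = [t for (t, action, arg) in trace if (action, arg) == first]
    second_steps = [t for (t, action, arg) in trace if (action, arg) == second]
    return any(t1 < t2 for t1 in first_steps for t2 in second_steps)


# Example trace: the robot goes to the kitchen, speaks, then goes to the office.
trace = [
    (0, "go_to", "kitchen"),
    (1, "say", "hello"),
    (2, "go_to", "office"),
]

# A task might require that the robot reaches the kitchen before speaking.
kitchen_then_hello = happens_before(trace, ("go_to", "kitchen"), ("say", "hello"))
# The reverse ordering for the office does not hold in this trace.
office_then_hello = happens_before(trace, ("go_to", "office"), ("say", "hello"))
```

In ASP the same requirement would be stated declaratively as a constraint over time-stamped facts, and the solver reports whether the trace is consistent with it.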

## Walkthrough

- `evaluator/robot.lp` contains the ASP rules governing state changes in our simulated world.
- `simple_tracer.py` contains a script that turns generated Python code into a trace of ASP instructions to feed to the simulation.
- `evaluator/evaluate.py` is called by the top-level RoboEval script and runs the simulation.
- `evaluator/solve_utils.py` contains a class of Python helper functions that can be called from ASP.
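
The tracing step in the list above can be pictured roughly as follows. This is a sketch under stated assumptions, not the contents of `simple_tracer.py`: the `TracingRobot` class, its `go_to` and `say` methods, the `trace_program` function, and the fact format are all hypothetical.

```python
from typing import List


class TracingRobot:
    """Hypothetical robot API that records each call as an ASP fact instead of acting."""

    def __init__(self) -> None:
        self.facts: List[str] = []
        self._step = 0

    def _record(self, action: str, arg: str) -> None:
        # One time-stamped fact per call, e.g. go_to("kitchen", 0).
        # The fact format is assumed for illustration.
        self.facts.append(f'{action}("{arg}", {self._step}).')
        self._step += 1

    def go_to(self, location: str) -> None:
        self._record("go_to", location)

    def say(self, message: str) -> None:
        self._record("say", message)


def trace_program(source: str) -> List[str]:
    """Run generated code against the recording robot and return the ASP facts."""
    robot = TracingRobot()
    # exec runs arbitrary code; a real harness should sandbox untrusted output.
    exec(source, {"robot": robot})
    return robot.facts


# A stand-in for LLM-generated code.
generated = 'robot.go_to("kitchen")\nrobot.say("hello")\n'
facts = trace_program(generated)
```

The resulting list of facts is the kind of input an ASP simulation could consume alongside its world rules and per-task constraints.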
