From 5503b040d37cee68faeafa3922108b0c620bb35e Mon Sep 17 00:00:00 2001
From: flucchetti
Date: Fri, 14 Jul 2023 13:58:27 +0000
Subject: [PATCH] clean

---
 benchmark/readme.md | 12 ------------
 1 file changed, 12 deletions(-)
 delete mode 100644 benchmark/readme.md

diff --git a/benchmark/readme.md b/benchmark/readme.md
deleted file mode 100644
index 1ddd05c..0000000
--- a/benchmark/readme.md
+++ /dev/null
@@ -1,12 +0,0 @@
-# Benchmark and Robot Simulator
-
-The benchmark tasks are stored in `benchmark.jsonl`. The benchmark works by running a simulation of the LLM-generated code using ASP.
-
-The simulation is checked with ASP temporal constraints for each task. A readable version of constraints can be found in `evaluator/constraints`. An ASP solver (Clingo) is used to determine whether the simulation trace satisfies the constraints.
-
-## Walkthrough
-
-- `evaluator/robot.lp` contains the ASP rules governing state changes in our simulated world.
-- `simple_tracer.py` contains a script for turning python generated code into a trace of ASP instructions to feed to the simulation.
-- `evaluator/evaluate.py` is called by the top-level RoboEval script and runs the simulation.
-- `evaluator/solve_utils.py` contains a class of helper python functions that can be called in ASP.
\ No newline at end of file