From 140efea9043ef6dda65edafd1186f505d6795d80 Mon Sep 17 00:00:00 2001 From: Simon Dierl Date: Wed, 20 Dec 2023 14:16:32 +0100 Subject: [PATCH 1/8] changes for artifact draft --- AEC-README.md | 91 ++++++++++ LICENSE.md | 16 ++ README.md | 42 ++--- Zenodo-README.md | 155 ++++++++++++++++++ benchmarks/models/.DS_Store | Bin 8196 -> 0 bytes build.sh | 4 +- collate_experiment.sh | 2 + collate_results.sh | 2 + compare-learners.sh | 2 + experiments/ct-big-datastructures/config | 1 + .../fifo_size_7.register.xml | 0 install-deps.sh | 4 + ...ig-experiments-with-ce-search-tacas2024.sh | 7 + run_evaluation.sh | 1 + search_logs.sh | 2 + 15 files changed, 306 insertions(+), 23 deletions(-) create mode 100644 AEC-README.md create mode 100644 LICENSE.md create mode 100644 Zenodo-README.md delete mode 100644 benchmarks/models/.DS_Store create mode 120000 experiments/ct-big-datastructures/config rename experiments/{ct-datastructures => ct-big-datastructures}/fifo_size_7.register.xml (100%) create mode 100755 install-deps.sh create mode 100755 run-big-experiments-with-ce-search-tacas2024.sh diff --git a/AEC-README.md b/AEC-README.md new file mode 100644 index 0000000..fbfbbf3 --- /dev/null +++ b/AEC-README.md @@ -0,0 +1,91 @@ +# Artifact Evaluation Readme + +This file details how to perform artifact evaluation. See the +`README.md` accompanying the artifact for a more general +explanation. + +The artifact can replicate the data presented in Table 1 as well +as Figures 4 and 5 that is discussed in Section 6. +Note that timing data is heavily machine-dependent, +although the relative timings should be reproducible. + +## General Setup + +For both early light and full review, setup must be performed. +All dependencies are included with this artifact, i.e., +no network connection is required. +First, the system dependencies need to be installed. +When prompted, enter the sudo password (same as username). This +will take some minutes. + +``` +./install-deps.sh +``` + +Next, build the RaLib package. No administrative priviliges are +requires. This will also take some minutes, especially during unit testing. + +``` +./build.sh +``` + +## Reproducing Experimental Data + +Three scripts are provided that regenerate the experimental +output data in `results/`. Initially, this directory will +contain data produced by us which will be overwritten. If some +of the scripts are not run, our data will be used in the respective +parts of the table and figures. + +The experiments can be re-run using: +``` +./run-experiments-model-checker-tacas2024.sh +./run-experiments-with-ce-search-tacas2024.sh +./run-big-experiments-with-ce-search-tacas2024.sh +``` + +The first command uses model checking to find counterexamples +and should complete in TODO. It generates the data presented +in the top six lines of Table 1 as well as both Figures. +**Only performing this step is recommended for early light evaluation.** + +The second command uses slower counterexample search and should +complete in TODO. It generates the data presented in lines 7 and 8 +of Table 1. + +The third command uses counterexample search on larger models +and may time out on mid-end machines after TODO. It generates +the data presented in line 9 of Table 1. + +## Reproducing Table and Figures + +The plots can be re-rendered using two scripts +``` +./generate-table-tacas2024.sh +./generate-plots-tacas2024.sh +``` + +This will take some minutes. The plots will be generated from all +data present in `results/`, i.e., if some experiment scripts were +not run, they will include our data. + +The generated files that match table and figures are: + +1. `plots/tacas2024-table-results.pdf`, which should match Table 1. + Some extraneous data will be included and the experiment names + will slightly differ. +2. `plots/plot-dtls-resets.pdf`, which should match Figure 4(a). +3. `plots/plot-dtls-resets-noopt.pdf`, which should match Figure 4(b). +4. `plots/plot-dtls-counterexamples.pdf`, which should match Figure 4(c). +5. `plots/plot-dtls-wct.pdf`. which should match Figure 4(d). +6. `plots/plot-gen-transitions.pdf`, which should match Figure 5(a). +7. `plots/plot-gen-registers.pdf`, which should match Figure 5(b). +8. `plots/plot-gen-registers-noopt.pdf`, which should match Figure 5(c). + +Note that some additional files were not included in the paper. + +## Source Code + +The full source code of the tool is available in the `ralib/` directory. +Source code for dependencies provided by Ubuntu or Maven Central is not +provided. diff --git a/LICENSE.md b/LICENSE.md new file mode 100644 index 0000000..1f0e52c --- /dev/null +++ b/LICENSE.md @@ -0,0 +1,16 @@ +# License Information + +The components of this package are all released under various +open-source licenses: + +1. All Ubuntu packages in `packages/` are released under various + licenses. See each package's metadata for details. +2. All Maven dependencies in `m2/` are released under various + licenses. See each artifact's metadata for details. +3. RALib and our extensions are released under the Apache License, + Version 2.0. +4. All configuration files and scripts are released under the + MIT License. +5. The benchmark problems are either publicly available from the + Automata Wiki without restrictions or released under the MIT + License. diff --git a/README.md b/README.md index db8da2a..ca35fda 100644 --- a/README.md +++ b/README.md @@ -1,30 +1,30 @@ # RaLib Benchmarking Setup -This project contains benchmark examples and scripts to +This project contains benchmark examples and scripts to run RaLib on these benchmarks. The project is organized as follows: 1. **benchmark problems** are located in ```benchmarks/```. Problems are collected from papers on RaLib, from the [automata wiki](https://automata.cs.ru.nl), and some - are generated or manually created for this project. + are generated or manually created for this project. 2. **configurations** of RaLib are located in ```configs/```. - The different configurations are used in different - series of experiments and specify aspects of the + The different configurations are used in different + series of experiments and specify aspects of the learning experiments (how counterexamples are found, - how counterexamples are preprocessed, max. runtimes, - teachers/theories for types, etc.) -3. **configurations** are organized in series in ```experiments/```. + how counterexamples are preprocessed, max. runtimes, + teachers/theories for types, etc.). +3. **experiments** are organized in series in ```experiments/```. Series combine benchmark problems with a configuration. - The different series in this project originate from + The different series in this project originate from multiple papers on RaLib / register automata learning. 4. **results** from running experiments will be stored in ```results/```. Results include logs and models from individual learning experiments. Some scripts will use the results to compute plots and tables. - Templates and generated documents ce be found + Templates and generated documents can be found in ```plots```. -5. **RaLib** is a git sub module in ```ralib/```. +5. **RaLib** is a git submodule in ```ralib/```. ## RaLib Installation @@ -34,8 +34,6 @@ and build RaLib, which requires a JDK and Maven to be installed. ``` -git submodule init -git submodule update ./build.sh ``` @@ -74,9 +72,9 @@ stable. ./generate-table-tacas2024.sh ``` - The generated pdf is ```plots/tacas2024-table-results.pdf``` + The generated PDF is ```plots/tacas2024-table-results.pdf``` - *Note:* The generated pdf has more rows and columns + *Note:* The generated PDF has more rows and columns than the table in the paper. 2. Generating plots @@ -111,25 +109,27 @@ stable. ### Running individual experiments ``` -run_experiment.sh [-h] -s series -e experiment -i iterations -l learner +./run_experiment.sh [-h] -s series -e experiment -i iterations -l learner ``` Learn model of `experiments/[series]/[experiment].xml` with config -`experiments/[series]/config` and specified learner. +`experiments/[series]/config` and specified learner. Each `series` +corresponds to a subdirectory of `experiments/`, each experiment to +a XML file in a `series`. ### Running a series ``` -run_series.sh [-h] -s series -i iterations -l learner +./run_series.sh [-h] -s series -i iterations -l learner ``` -Run the series of experiments specified in folder -`experiments/[series]` with specified learner. +Run the series of experiments (i.e., all XML files) specified in the +directory `experiments/[series]` with the specified learner. ### Running the complete evaluation (2015 RaLib Paper) ``` -run_evaluation.sh +./run_evaluation.sh ``` This script contains explicit calls to `run_series.sh` for @@ -139,7 +139,7 @@ all experiments that are to be run as part of the evaluation. ### Searching in logs ``` -search_logs.sh series experiment learner search_term +./search_logs.sh series experiment learner search_term ``` Searches for `search_term` in the logs from all iterations. diff --git a/Zenodo-README.md b/Zenodo-README.md new file mode 100644 index 0000000..841fe45 --- /dev/null +++ b/Zenodo-README.md @@ -0,0 +1,155 @@ +# RaLib Benchmarking Setup + +This project contains benchmark examples and scripts to +run RaLib on these benchmarks. The project is organized +as follows: + +1. **benchmark problems** are located in ```benchmarks/```. + Problems are collected from papers on RaLib, from the + [automata wiki](https://automata.cs.ru.nl), and some + are generated or manually created for this project. +2. **configurations** of RaLib are located in ```configs/```. + The different configurations are used in different + series of experiments and specify aspects of the + learning experiments (how counterexamples are found, + how counterexamples are preprocessed, max. runtimes, + teachers/theories for types, etc.). +3. **experiments** are organized in series in ```experiments/```. + Series combine benchmark problems with a configuration. + The different series in this project originate from + multiple papers on RaLib / register automata learning. +4. **results** from running experiments will be stored in + ```results/```. Results include logs and models from + individual learning experiments. Some scripts will + use the results to compute plots and tables. + Templates and generated documents can be found + in ```plots/```. We ship experiment results and plots + generated by us in these locations. + +## Dependency Installation + +First, the system dependencies need to be installed. All +dependency packages are shipped with this artifact in the +```packages``` directory. When prompted, enter your password. + +``` +./install-deps.sh +``` + +## RaLib Installation + +Next, build the RaLib package. The build is performed using +Apache Maven. All Maven dependencies are pre-downloaded to the +```m2``` directory. + +``` +./build.sh +``` + +## Running experiments + +Generally, experiments are organized in series and +configurations do not specify which learner should +be used or how many times experiments should be run. +Shell scripts exist for running individual experiments, +series, and complete evaluations. + +### Experiments for the evaluation of CT-based learners (TACAS'24) + +This section describes how one can run the experiments +that are reported in the TACAS 2024 submission. +Concrete results may vary a little bit for every +execution due to randomization but trends should be +stable. + +1. Running experiments from TACAS 2024 submission + + ``` + ./run-experiments-model-checker-tacas2024.sh + ./run-experiments-with-ce-search-tacas2024.sh + ``` + + results (logs and models) can be found in ```results/```. + + +2. Generating table in TACAS 2024 submission + + this step requires latexmk and a latex distribution + to be installed + + ``` + ./generate-table-tacas2024.sh + ``` + + The generated PDF is ```plots/tacas2024-table-results.pdf``` + + *Note:* The generated PDF has more rows and columns + than the table in the paper. + +2. Generating plots + + this step requires latexmk and a latex distribution + with pgfplots to be installed + + ``` + ./generate-plots-tacas2024.sh + ``` + + The plots in Fig. 4. are: + + ``` + plots/plot-dtls-resets.pdf + plots/plot-dtls-resets-noopt.pdf + plots/plot-dtls-counterexamples.pdf + plots/plot-dtls-wct.pdf + ``` + + The plots in Fig. 5 are + + ``` + plots/plot-gen-transitions.pdf + plots/plot-gen-registers.pdf + plots/plot-gen-registers-noopt.pdf + ``` + + *Note:* Some additional plots that are generated + were not included in the paper. + +### Running individual experiments + +``` +./run_experiment.sh [-h] -s series -e experiment -i iterations -l learner +``` + +Learn model of `experiments/[series]/[experiment].xml` with config +`experiments/[series]/config` and specified learner. Each `series` +corresponds to a subdirectory of `experiments/`, each experiment to +a XML file in a `series`. + +### Running a series + +``` +./run_series.sh [-h] -s series -i iterations -l learner +``` + +Run the series of experiments (i.e., all XML files) specified in the +directory `experiments/[series]` with the specified learner. + +### Running the complete evaluation (2015 RaLib Paper) + +``` +./run_evaluation.sh +``` + +This script contains explicit calls to `run_series.sh` for +all experiments that are to be run as part of the evaluation. + + +### Searching in logs + +``` +./search_logs.sh series experiment learner search_term +``` + +Searches for `search_term` in the logs from all iterations. + diff --git a/benchmarks/models/.DS_Store b/benchmarks/models/.DS_Store deleted file mode 100644 index ce3c864d2f9a6872265ea7e04ffb8cc5608da693..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 8196 zcmeHLU2GIp6h3EL=1-7k%>FduQviEksF-8fGSQ z&;2>)+&ka>=AJzZ0GKM7tpG6qQ0U=O*Ql81;^X|8b47xlB_xsT0bD474+T-qQ4c#p z1VRKt1VRKt1VRM<0|-#e=9hAnxGzV;GDILm;D$tipAT_*xJ(9eN=k8bP$5zPlC{(= zE~--=AZ+4+Oa^jFN?}SqW%hu9DF!J9N^=^Ib9Iu*Ku$?1%>ku3U@$WV848No$u2I` z0h3aOWr#qCz+41)?yf>Rq@WM7mHE5rE{}1mEP)UH5RR9lsx$4!y=D;V1P$=f%=Q!U+J|2Mrhcfn}JQQ(M2i-H~))jm6O*& zC!~dsZt~Mx>EX|zH0+1|+_+cB*=~*i`Yl(mEzfUhxr~vT+64>il&GrlTd!B|OOLqY zY2WZmUdkvQ7cI+VxuD)>_#@?dKjA%FH*y0O8_(*E9j@aWmcxn_ro$P_G2QW$TXIZq zI4%8UEYly!>dX3yY>Z6%42KO^o|Us~%Rj9|`JL5c6BFALEzR-mJMWlmj!(2D63y{l zt%=D={i33*-jrzX87fSkJvTk`+Uw`v_~>K4{dAfJr!Fj&?b-4Zs#U>q4a973tWP2q zam$E(Q-Rn`oegSNR&U^|?=AS2>v#jK=#k9oOFV+HJMTKd+hz6oykX}{w&AmO+fG{# zGcS{O?b4{@4f^bq&wGw}Zqe@$ompKkdF%)yeQS($l9e}Tkvp!YV^n!_(el-6*KgXo zW7nQbin|~{0zS$@d+cSp^go>6dUnoybagm25iD@n7})6 z7q(#v@4>^^gZE+vPhcKRWa!`|K7v#DD4xaV@g+Qm(>Q}~ zrHZFS#^CdfWacH~{zmn{zeL8%izEdbmn{!CnVQ|$s;#W#lpvC1ieONgA%dh=(DKrX z*veI_*ATCWXzWz=)|V-v*GU?2ZDOjXD}p(^xuJ=g5u&#ud~M#Msd|kdF(2jp)E5Z4 z@+L7;)oTTzd9!UdjV=_t<;}hOXcXi}#M{v(O;w0!e}{6@@D{u;=ynmlhVS5e_yv9^ zqAetn#c&l->=s;$8*wu>dM)!qg5kqVm$xffA#H@P9!7Ydh1ODOzng zdR8x-=h|_4PSC?It~Vv6Frk90@P3hwlMK>v(H9TKl7XC(QkasBFeNE7`G5Y8fG#>( Q>F0YG|HJry1IPbA0HOt>YXATM diff --git a/build.sh b/build.sh index 40a54ea..e396015 100755 --- a/build.sh +++ b/build.sh @@ -1,4 +1,4 @@ -#!/usr/bin/env bash +#!/bin/sh -( cd ralib && mvn package assembly:single ) +cd ralib && mvn -Dmaven.repo.local=../m2 package assembly:single diff --git a/collate_experiment.sh b/collate_experiment.sh index 18574b3..af0f5e8 100755 --- a/collate_experiment.sh +++ b/collate_experiment.sh @@ -1,3 +1,5 @@ +#!/usr/bin/env bash + experiment=$1 inputs=$2 diff --git a/collate_results.sh b/collate_results.sh index 62b51ca..e08b476 100755 --- a/collate_results.sh +++ b/collate_results.sh @@ -1,3 +1,5 @@ +#!/usr/bin/env bash + inputs=${1:-0} series=($(ls results)) diff --git a/compare-learners.sh b/compare-learners.sh index 706bb8b..c9892d7 100755 --- a/compare-learners.sh +++ b/compare-learners.sh @@ -1,3 +1,5 @@ +#!/usr/bin/env bash + series=$1 experiment=$2 diff --git a/experiments/ct-big-datastructures/config b/experiments/ct-big-datastructures/config new file mode 120000 index 0000000..8f0b492 --- /dev/null +++ b/experiments/ct-big-datastructures/config @@ -0,0 +1 @@ +../../configs/twalk-eq-ds \ No newline at end of file diff --git a/experiments/ct-datastructures/fifo_size_7.register.xml b/experiments/ct-big-datastructures/fifo_size_7.register.xml similarity index 100% rename from experiments/ct-datastructures/fifo_size_7.register.xml rename to experiments/ct-big-datastructures/fifo_size_7.register.xml diff --git a/install-deps.sh b/install-deps.sh new file mode 100755 index 0000000..7e71e9d --- /dev/null +++ b/install-deps.sh @@ -0,0 +1,4 @@ +#!/bin/sh + +cd packages && sudo dpkg -i *.deb + diff --git a/run-big-experiments-with-ce-search-tacas2024.sh b/run-big-experiments-with-ce-search-tacas2024.sh new file mode 100755 index 0000000..56e6bc5 --- /dev/null +++ b/run-big-experiments-with-ce-search-tacas2024.sh @@ -0,0 +1,7 @@ +#!/usr/bin/env bash + +learners=("slstar" "sllambda" "sldt") + +for learner in ${learners[@]}; do + ./run_series.sh -s ct-big-datastructures -i 20 -l $learner +done diff --git a/run_evaluation.sh b/run_evaluation.sh index 164e6f8..b5b07a8 100755 --- a/run_evaluation.sh +++ b/run_evaluation.sh @@ -1,3 +1,4 @@ +#!/bin/sh # basic correctness diff --git a/search_logs.sh b/search_logs.sh index f6b87bc..51d5b85 100755 --- a/search_logs.sh +++ b/search_logs.sh @@ -1,3 +1,5 @@ +#!/usr/bin/env bash + series=$1 experiment=$2 learner=$3 From b9430ebf484bec18c7da8ef7eb67ee11b88be652 Mon Sep 17 00:00:00 2001 From: Falk Howar Date: Thu, 21 Dec 2023 23:26:07 +0100 Subject: [PATCH 2/8] update series name for fifo7 --- generate-summary-with-ce-search-tacas2024.sh | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/generate-summary-with-ce-search-tacas2024.sh b/generate-summary-with-ce-search-tacas2024.sh index 86394ad..ecb7165 100755 --- a/generate-summary-with-ce-search-tacas2024.sh +++ b/generate-summary-with-ce-search-tacas2024.sh @@ -11,12 +11,5 @@ ./compare-learners.sh ct-datastructures fifo_size_3.register ./compare-learners.sh ct-datastructures fifo_size_5.register -./compare-learners.sh ct-datastructures fifo_size_7.register +./compare-learners.sh ct-big-datastructures fifo_size_7.register -./compare-learners.sh ct-datastructures lifo_size_3.register -./compare-learners.sh ct-datastructures lifo_size_5.register -./compare-learners.sh ct-datastructures lifo_size_7.register - -./compare-learners.sh ct-datastructures set3 -./compare-learners.sh ct-datastructures set5 -./compare-learners.sh ct-datastructures set7 From 40a02feda98a09789a699eadebdb70efcde26f77 Mon Sep 17 00:00:00 2001 From: Simon Dierl Date: Fri, 22 Dec 2023 10:00:48 +0100 Subject: [PATCH 3/8] discussion results --- AEC-README.md | 183 ++++++++++++++++++++++++++++++++++++++++++----- README.md | 28 ++++---- Zenodo-README.md | 155 --------------------------------------- build.sh | 8 ++- install-deps.sh | 8 ++- 5 files changed, 193 insertions(+), 189 deletions(-) delete mode 100644 Zenodo-README.md diff --git a/AEC-README.md b/AEC-README.md index fbfbbf3..82b02a7 100644 --- a/AEC-README.md +++ b/AEC-README.md @@ -1,8 +1,8 @@ # Artifact Evaluation Readme This file details how to perform artifact evaluation. See the -`README.md` accompanying the artifact for a more general -explanation. +**Detailed RaLib Benchmarking Setup** part below for a more generic +documentation. The artifact can replicate the data presented in Table 1 as well as Figures 4 and 5 that is discussed in Section 6. @@ -17,14 +17,12 @@ no network connection is required. First, the system dependencies need to be installed. When prompted, enter the sudo password (same as username). This will take some minutes. - ``` ./install-deps.sh ``` -Next, build the RaLib package. No administrative priviliges are +Next, build the RaLib package. No administrative privileges are requires. This will also take some minutes, especially during unit testing. - ``` ./build.sh ``` @@ -37,25 +35,34 @@ contain data produced by us which will be overwritten. If some of the scripts are not run, our data will be used in the respective parts of the table and figures. -The experiments can be re-run using: +The first command uses counterexample search and should +complete in less than ten minutes: ``` -./run-experiments-model-checker-tacas2024.sh ./run-experiments-with-ce-search-tacas2024.sh -./run-big-experiments-with-ce-search-tacas2024.sh ``` - -The first command uses model checking to find counterexamples -and should complete in TODO. It generates the data presented -in the top six lines of Table 1 as well as both Figures. -**Only performing this step is recommended for early light evaluation.** - -The second command uses slower counterexample search and should -complete in TODO. It generates the data presented in lines 7 and 8 -of Table 1. +It generates the data presented +in lines 7 and 8 of Table 1. The data for the other lines and figures +is not recreated. If you are performing an **early light evaluation**, +you may want to skip the next two commands and proceed to reproducing +table and figures. + +The second command uses model checking to find counterexamples +and should complete in ca. three hours: +``` +./run-experiments-model-checker-tacas2024.sh +``` +It generates the data presented +in the top six lines of Table 1 as well as both Figures. Lines 7 to 9 of +the table are not recreated. The third command uses counterexample search on larger models -and may time out on mid-end machines after TODO. It generates -the data presented in line 9 of Table 1. +and may require more that one day on mid-end machines: +``` +./run-big-experiments-with-ce-search-tacas2024.sh +``` +It generates +the data presented in line 9 of Table 1. The data for the other lines +and the figures is not recreated. ## Reproducing Table and Figures @@ -89,3 +96,141 @@ Note that some additional files were not included in the paper. The full source code of the tool is available in the `ralib/` directory. Source code for dependencies provided by Ubuntu or Maven Central is not provided. + +# Detailed RaLib Benchmarking Setup + +This project contains benchmark examples and scripts to +run RaLib on these benchmarks. The project is organized +as follows: + +1. **benchmark problems** are located in ```benchmarks/```. + Problems are collected from papers on RaLib, from the + [automata wiki](https://automata.cs.ru.nl), and some + are generated or manually created for this project. +2. **configurations** of RaLib are located in ```configs/```. + The different configurations are used in different + series of experiments and specify aspects of the + learning experiments (how counterexamples are found, + how counterexamples are preprocessed, max. runtimes, + teachers/theories for types, etc.). +3. **experiments** are organized in series in ```experiments/```. + Series combine benchmark problems with a configuration. + The different series in this project originate from + multiple papers on RaLib / register automata learning. +4. **results** from running experiments will be stored in + ```results/```. Results include logs and models from + individual learning experiments. Some scripts will + use the results to compute plots and tables. + Templates and generated documents can be found + in ```plots/```. We ship experiment results and plots + generated by us in these locations. + +## Running experiments + +Generally, experiments are organized in series and +configurations do not specify which learner should +be used or how many times experiments should be run. +Shell scripts exist for running individual experiments, +series, and complete evaluations. + +### Experiments for the evaluation of CT-based learners (TACAS'24) + +This section describes how one can run the experiments +that are reported in the TACAS 2024 submission. +Concrete results may vary a little bit for every +execution due to randomization but trends should be +stable. + +1. Running experiments from TACAS 2024 submission + + ``` + ./run-experiments-model-checker-tacas2024.sh + ./run-experiments-with-ce-search-tacas2024.sh + ./run-big-experiments-with-ce-search-tacas2024.sh + ``` + + results (logs and models) can be found in ```results/```. + The expected running times are ca. 3 hours, less than ten + minutes, and (on mid-end machines) over a day, respectively. + + +2. Generating table in TACAS 2024 submission + + this step requires latexmk and a latex distribution + to be installed + + ``` + ./generate-table-tacas2024.sh + ``` + + The generated PDF is ```plots/tacas2024-table-results.pdf``` + + *Note:* The generated PDF has more rows and columns + than the table in the paper. + +2. Generating plots + + this step requires latexmk and a latex distribution + with pgfplots to be installed + + ``` + ./generate-plots-tacas2024.sh + ``` + + The plots in Fig. 4. are: + + ``` + plots/plot-dtls-resets.pdf + plots/plot-dtls-resets-noopt.pdf + plots/plot-dtls-counterexamples.pdf + plots/plot-dtls-wct.pdf + ``` + + The plots in Fig. 5 are + + ``` + plots/plot-gen-transitions.pdf + plots/plot-gen-registers.pdf + plots/plot-gen-registers-noopt.pdf + ``` + + *Note:* Some additional plots that are generated + were not included in the paper. + +### Running individual experiments + +``` +./run_experiment.sh [-h] -s series -e experiment -i iterations -l learner +``` + +Learn model of `experiments/[series]/[experiment].xml` with config +`experiments/[series]/config` and specified learner. Each `series` +corresponds to a subdirectory of `experiments/`, each experiment to +a XML file in a `series`. + +### Running a series + +``` +./run_series.sh [-h] -s series -i iterations -l learner +``` + +Run the series of experiments (i.e., all XML files) specified in the +directory `experiments/[series]` with the specified learner. + +### Running the complete evaluation (2015 RaLib Paper) + +``` +./run_evaluation.sh +``` + +This script contains explicit calls to `run_series.sh` for +all experiments that are to be run as part of the evaluation. + + +### Searching in logs + +``` +./search_logs.sh series experiment learner search_term +``` + +Searches for `search_term` in the logs from all iterations. diff --git a/README.md b/README.md index ca35fda..c298bf8 100644 --- a/README.md +++ b/README.md @@ -18,38 +18,39 @@ as follows: Series combine benchmark problems with a configuration. The different series in this project originate from multiple papers on RaLib / register automata learning. -4. **results** from running experiments will be stored in - ```results/```. Results include logs and models from - individual learning experiments. Some scripts will - use the results to compute plots and tables. +4. **results** from running experiments will be stored in + ```results/```. Results include logs and models from + individual learning experiments. Some scripts will + use the results to compute plots and tables. Templates and generated documents can be found - in ```plots```. + in ```plots/```. We ship experiment results and plots + generated by us in these locations. 5. **RaLib** is a git submodule in ```ralib/```. ## RaLib Installation -To install RaLib, you have to initialize the submodule -and build RaLib, which requires a JDK and Maven to be +To install RaLib, you have to initialize the submodule +and build RaLib, which requires a JDK and Maven to be installed. ``` ./build.sh ``` -## Running experiments +## Running experiments -Generally, experiments are organized in series and -configurations do not specify which learner should +Generally, experiments are organized in series and +configurations do not specify which learner should be used or how many times experiments should be run. Shell scripts exist for running individual experiments, series, and complete evaluations. ### Experiments for the evaluation of CT-based learners (TACAS'24) -This section describes how one can run the experiments +This section describes how one can run the experiments that are reported in the TACAS 2024 submission. -Concrete results may vary a little bit for every +Concrete results may vary a little bit for every execution due to randomization but trends should be stable. @@ -58,9 +59,12 @@ stable. ``` ./run-experiments-model-checker-tacas2024.sh ./run-experiments-with-ce-search-tacas2024.sh + ./run-big-experiments-with-ce-search-tacas2024.sh ``` results (logs and models) can be found in ```results/```. + The expected running times are ca. 3 hours, less than ten + minutes, and (on mid-end machines) over a day, respectively. 2. Generating table in TACAS 2024 submission diff --git a/Zenodo-README.md b/Zenodo-README.md deleted file mode 100644 index 841fe45..0000000 --- a/Zenodo-README.md +++ /dev/null @@ -1,155 +0,0 @@ -# RaLib Benchmarking Setup - -This project contains benchmark examples and scripts to -run RaLib on these benchmarks. The project is organized -as follows: - -1. **benchmark problems** are located in ```benchmarks/```. - Problems are collected from papers on RaLib, from the - [automata wiki](https://automata.cs.ru.nl), and some - are generated or manually created for this project. -2. **configurations** of RaLib are located in ```configs/```. - The different configurations are used in different - series of experiments and specify aspects of the - learning experiments (how counterexamples are found, - how counterexamples are preprocessed, max. runtimes, - teachers/theories for types, etc.). -3. **experiments** are organized in series in ```experiments/```. - Series combine benchmark problems with a configuration. - The different series in this project originate from - multiple papers on RaLib / register automata learning. -4. **results** from running experiments will be stored in - ```results/```. Results include logs and models from - individual learning experiments. Some scripts will - use the results to compute plots and tables. - Templates and generated documents can be found - in ```plots/```. We ship experiment results and plots - generated by us in these locations. - -## Dependency Installation - -First, the system dependencies need to be installed. All -dependency packages are shipped with this artifact in the -```packages``` directory. When prompted, enter your password. - -``` -./install-deps.sh -``` - -## RaLib Installation - -Next, build the RaLib package. The build is performed using -Apache Maven. All Maven dependencies are pre-downloaded to the -```m2``` directory. - -``` -./build.sh -``` - -## Running experiments - -Generally, experiments are organized in series and -configurations do not specify which learner should -be used or how many times experiments should be run. -Shell scripts exist for running individual experiments, -series, and complete evaluations. - -### Experiments for the evaluation of CT-based learners (TACAS'24) - -This section describes how one can run the experiments -that are reported in the TACAS 2024 submission. -Concrete results may vary a little bit for every -execution due to randomization but trends should be -stable. - -1. Running experiments from TACAS 2024 submission - - ``` - ./run-experiments-model-checker-tacas2024.sh - ./run-experiments-with-ce-search-tacas2024.sh - ``` - - results (logs and models) can be found in ```results/```. - - -2. Generating table in TACAS 2024 submission - - this step requires latexmk and a latex distribution - to be installed - - ``` - ./generate-table-tacas2024.sh - ``` - - The generated PDF is ```plots/tacas2024-table-results.pdf``` - - *Note:* The generated PDF has more rows and columns - than the table in the paper. - -2. Generating plots - - this step requires latexmk and a latex distribution - with pgfplots to be installed - - ``` - ./generate-plots-tacas2024.sh - ``` - - The plots in Fig. 4. are: - - ``` - plots/plot-dtls-resets.pdf - plots/plot-dtls-resets-noopt.pdf - plots/plot-dtls-counterexamples.pdf - plots/plot-dtls-wct.pdf - ``` - - The plots in Fig. 5 are - - ``` - plots/plot-gen-transitions.pdf - plots/plot-gen-registers.pdf - plots/plot-gen-registers-noopt.pdf - ``` - - *Note:* Some additional plots that are generated - were not included in the paper. - -### Running individual experiments - -``` -./run_experiment.sh [-h] -s series -e experiment -i iterations -l learner -``` - -Learn model of `experiments/[series]/[experiment].xml` with config -`experiments/[series]/config` and specified learner. Each `series` -corresponds to a subdirectory of `experiments/`, each experiment to -a XML file in a `series`. - -### Running a series - -``` -./run_series.sh [-h] -s series -i iterations -l learner -``` - -Run the series of experiments (i.e., all XML files) specified in the -directory `experiments/[series]` with the specified learner. - -### Running the complete evaluation (2015 RaLib Paper) - -``` -./run_evaluation.sh -``` - -This script contains explicit calls to `run_series.sh` for -all experiments that are to be run as part of the evaluation. - - -### Searching in logs - -``` -./search_logs.sh series experiment learner search_term -``` - -Searches for `search_term` in the logs from all iterations. - diff --git a/build.sh b/build.sh index e396015..28ecdda 100755 --- a/build.sh +++ b/build.sh @@ -1,4 +1,8 @@ #!/bin/sh -cd ralib && mvn -Dmaven.repo.local=../m2 package assembly:single - +cd ralib +if [ -e ../m2 ]; then + mvn -Dmaven.repo.local=../m2 package assembly:single +else + mvn package assembly:single +fi diff --git a/install-deps.sh b/install-deps.sh index 7e71e9d..f7d63bf 100755 --- a/install-deps.sh +++ b/install-deps.sh @@ -1,4 +1,10 @@ #!/bin/sh -cd packages && sudo dpkg -i *.deb +if [ -e packages ]; then + cd packages && sudo dpkg -i *.deb +else + echo "No packages directory found." + echo "This script is only needed for artifact evaluation." + echo "When using this repository normally, it should not be run." +fi From 0d6fe32296e1173d65fdc0fc575af2b63ea55a6d Mon Sep 17 00:00:00 2001 From: Falk Howar Date: Sun, 24 Dec 2023 07:44:39 +0100 Subject: [PATCH 4/8] reorder rows in table --- generate-summary-with-ce-search-tacas2024.sh | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/generate-summary-with-ce-search-tacas2024.sh b/generate-summary-with-ce-search-tacas2024.sh index ecb7165..7027471 100755 --- a/generate-summary-with-ce-search-tacas2024.sh +++ b/generate-summary-with-ce-search-tacas2024.sh @@ -2,12 +2,12 @@ ./compare-learners.sh -h -./compare-learners.sh ct-with-rwalk abp_output -./compare-learners.sh ct-with-rwalk abp_receiver3 ./compare-learners.sh ct-with-rwalk channel_frame +./compare-learners.sh ct-with-rwalk abp_receiver3 +./compare-learners.sh ct-with-rwalk palindrome ./compare-learners.sh ct-with-rwalk login +./compare-learners.sh ct-with-rwalk abp_output ./compare-learners.sh ct-with-rwalk sip -./compare-learners.sh ct-with-rwalk palindrome ./compare-learners.sh ct-datastructures fifo_size_3.register ./compare-learners.sh ct-datastructures fifo_size_5.register From 91b8f9faa03729a7a956c73a429302333808e528 Mon Sep 17 00:00:00 2001 From: Kostis Sagonas Date: Sun, 24 Dec 2023 13:02:10 +0100 Subject: [PATCH 5/8] Update to ralib with one memorable bug fixed --- ralib | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ralib b/ralib index 9bef181..08b1183 160000 --- a/ralib +++ b/ralib @@ -1 +1 @@ -Subproject commit 9bef1816799d80d2b8024e909b4fd652a8677416 +Subproject commit 08b118391d8a42e501a7794aa8d5041a63bdd815 From e773bd3d931a84b8f72ef2b1ca6b206cfd1d9710 Mon Sep 17 00:00:00 2001 From: Simon Dierl Date: Thu, 28 Dec 2023 14:24:10 +0100 Subject: [PATCH 6/8] automate artifact packaging --- .gitignore | 4 ++++ artifact-debian-deps | 56 ++++++++++++++++++++++++++++++++++++++++++++ build-artifact.sh | 55 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 115 insertions(+) create mode 100644 artifact-debian-deps create mode 100755 build-artifact.sh diff --git a/.gitignore b/.gitignore index 257c329..b638448 100644 --- a/.gitignore +++ b/.gitignore @@ -18,3 +18,7 @@ results/* plots/*.csv plots/*.dvi plots/*.pdf + +m2/ +packages/ +*-repro.zip diff --git a/artifact-debian-deps b/artifact-debian-deps new file mode 100644 index 0000000..52337f5 --- /dev/null +++ b/artifact-debian-deps @@ -0,0 +1,56 @@ +http://security.ubuntu.com/ubuntu/pool/main/t/texlive-bin/libkpathsea6_2021.20210626.59705-1ubuntu0.1_amd64.deb +http://security.ubuntu.com/ubuntu/pool/main/t/texlive-bin/libptexenc1_2021.20210626.59705-1ubuntu0.1_amd64.deb +http://security.ubuntu.com/ubuntu/pool/main/t/texlive-bin/libsynctex2_2021.20210626.59705-1ubuntu0.1_amd64.deb +http://security.ubuntu.com/ubuntu/pool/main/t/texlive-bin/libtexlua53_2021.20210626.59705-1ubuntu0.1_amd64.deb +http://security.ubuntu.com/ubuntu/pool/main/t/texlive-bin/libtexluajit2_2021.20210626.59705-1ubuntu0.1_amd64.deb +http://security.ubuntu.com/ubuntu/pool/universe/t/texlive-bin/texlive-binaries_2021.20210626.59705-1ubuntu0.1_amd64.deb +http://security.ubuntu.com/ubuntu/pool/universe/o/openjdk-17/openjdk-17-jre-headless_17.0.9%2b9-1%7e22.04_amd64.deb +http://security.ubuntu.com/ubuntu/pool/universe/o/openjdk-17/openjdk-17-jdk-headless_17.0.9%2b9-1%7e22.04_amd64.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/t/tex-common/tex-common_6.17_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/l/lmodern/fonts-lmodern_2.004.5-6.1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/main/t/t1utils/t1utils_1.41-4build2_amd64.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/t/teckit/libteckit0_2.5.11%2bds1-1_amd64.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/z/zziplib/libzzip-0-13_0.13.72%2bdfsg.1-1.1_amd64.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/t/texlive-base/texlive-base_2021.20220204-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/t/texlive-base/texlive-latex-base_2021.20220204-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/l/latexmk/latexmk_4.76-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/a/apache-pom/libapache-pom-java_18-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/a/atinject-jsr330/libatinject-jsr330-api-java_1.0%2bds1-5_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/g/geronimo-interceptor-3.0-spec/libgeronimo-interceptor-3.0-spec-java_1.0.1-4fakesync_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/c/cdi-api/libcdi-api-java_1.2-3_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/libc/libcommons-cli-java/libcommons-cli-java_1.4-2_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/c/commons-parent/libcommons-parent-java_43-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/c/commons-io/libcommons-io-java_2.11.0-2_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/libc/libcommons-lang3-java/libcommons-lang3-java_3.11-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/libc/libcommons-logging-java/libcommons-logging-java_1.2-2_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/g/geronimo-annotation-1.3-spec/libgeronimo-annotation-1.3-spec-java_1.3-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/libj/libjsr305-java/libjsr305-java_0.1%7e%2bsvn49-11_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/g/guava-libraries/libguava-java_29.0-6_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/liba/libaopalliance-java/libaopalliance-java_20070526-6_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/g/guice/libguice-java_4.2.3-2_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/h/hawtjni/libhawtjni-runtime-java_1.17-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/j/jansi-native/libjansi-native-java_1.8-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/j/jansi/libjansi-java_1.18-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/m/maven-parent/libmaven-parent-java_31-2_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/p/plexus-utils2/libplexus-utils2-java_3.3.0-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/w/wagon/libwagon-provider-api-java_3.3.4-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/m/maven-resolver/libmaven-resolver-java_1.4.2-3build1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/m/maven-shared-utils/libmaven-shared-utils-java_3.3.0-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/p/plexus-cipher/libplexus-cipher-java_1.8-2_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/p/plexus-classworlds/libplexus-classworlds-java_2.6.0-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/p/plexus-containers/libplexus-component-annotations-java_2.1.0-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/p/plexus-interpolation/libplexus-interpolation-java_1.26-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/p/plexus-sec-dispatcher/libplexus-sec-dispatcher-java_1.4-4_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/libs/libslf4j-java/libslf4j-java_1.7.32-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/s/sisu-inject/libsisu-inject-java_0.3.4-2_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/s/sisu-plexus/libsisu-plexus-java_0.3.4-3_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/m/maven/libmaven3-core-java_3.6.3-5_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/w/wagon/libwagon-file-java_3.3.4-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/w/wagon/libwagon-http-shaded-java_3.3.4-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/m/maven/maven_3.6.3-5_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/a/auctex/preview-latex-style_12.2-1ubuntu1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/libp/libpdfbox-java/libfontbox-java_1.8.16-2_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/libp/libpdfbox-java/libpdfbox-java_1.8.16-2_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/t/texlive-base/texlive-latex-recommended_2021.20220204-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/t/texlive-base/texlive-pictures_2021.20220204-1_all.deb +http://fr.archive.ubuntu.com/ubuntu/pool/universe/t/texlive-extra/texlive-latex-extra_2021.20220204-1_all.deb diff --git a/build-artifact.sh b/build-artifact.sh new file mode 100755 index 0000000..89c968a --- /dev/null +++ b/build-artifact.sh @@ -0,0 +1,55 @@ +#!/usr/bin/env bash + +set -euxo pipefail + +git clean -xfd +( cd ralib; git clean -xfd ) + +rm --recursive --force packages +mkdir packages +wget --input-file artifact-debian-deps --directory-prefix packages + +rm --recursive --force m2 +mkdir m2 +podman create \ + --volume "$(pwd):/workdir" --workdir /workdir \ + --pull always --replace --rm \ + --env DEBIAN_FRONTEND=noninteractive \ + --name ralib-m2 ubuntu:22.04 sleep infinity +podman start ralib-m2 +podman exec ralib-m2 apt-get update +podman exec ralib-m2 apt-get install --yes --no-install-recommends maven openjdk-17-jdk-headless +podman exec ralib-m2 cp --recursive /workdir/ralib /ralib +podman exec --workdir /ralib ralib-m2 mvn -Dmaven.repo.local=/workdir/m2 dependency:go-offline package assembly:single +podman stop --time 0 ralib-m2 + +wget --content-disposition https://tu-dortmund.sciebo.de/s/iVtNb1kVGtIXGuO/download +tar --extract --verbose --file results-plots-*.tar.gz + +zip --symlinks --recurse-paths -9 --verbose ralib-tacas24-repro.zip \ + benchmarks \ + configs \ + experiments \ + m2 \ + packages \ + plots \ + ralib \ + results \ + build.sh \ + collate_experiment.sh \ + collate_results.sh \ + compare-learners.sh \ + generate-plots-tacas2024.sh \ + generate-summary-model-checker-tacas2024.sh \ + generate-summary-with-ce-search-tacas2024.sh \ + generate-table-tacas2024.sh \ + install-deps.sh \ + print_series.sh \ + run-big-experiments-with-ce-search-tacas2024.sh \ + run_evaluation.sh \ + run_experiment.sh \ + run-experiments-model-checker-tacas2024.sh \ + run-experiments-with-ce-search-tacas2024.sh \ + run_series.sh \ + search_logs.sh \ + --exclude '*.git' '*.directory' '*.DS_Store' '*._*' From b4f0a39274659af6d0844861c4eec1d42d3248fb Mon Sep 17 00:00:00 2001 From: Kostis Sagonas Date: Fri, 29 Dec 2023 09:37:10 +0100 Subject: [PATCH 7/8] Changes in the README files --- AEC-README.md | 69 +++++++++++++++++++++++++++------------------------ README.md | 55 ++++++++++++++++++++++++---------------- 2 files changed, 69 insertions(+), 55 deletions(-) diff --git a/AEC-README.md b/AEC-README.md index 82b02a7..c03b8a7 100644 --- a/AEC-README.md +++ b/AEC-README.md @@ -15,14 +15,14 @@ For both early light and full review, setup must be performed. All dependencies are included with this artifact, i.e., no network connection is required. First, the system dependencies need to be installed. -When prompted, enter the sudo password (same as username). This -will take some minutes. +When prompted, enter the sudo password (same as username). +This step will complete in a few minutes. ``` ./install-deps.sh ``` Next, build the RaLib package. No administrative privileges are -requires. This will also take some minutes, especially during unit testing. +required. This step also takes a few minutes. ``` ./build.sh ``` @@ -30,43 +30,46 @@ requires. This will also take some minutes, especially during unit testing. ## Reproducing Experimental Data Three scripts are provided that regenerate the experimental -output data in `results/`. Initially, this directory will -contain data produced by us which will be overwritten. If some -of the scripts are not run, our data will be used in the respective -parts of the table and figures. +output data in `results/`. -The first command uses counterexample search and should -complete in less than ten minutes: +The first command uses counterexample search and, depending +on the speed of your machine, should complete in 5 to 20 minutes: ``` ./run-experiments-with-ce-search-tacas2024.sh ``` -It generates the data presented -in lines 7 and 8 of Table 1. The data for the other lines and figures -is not recreated. If you are performing an **early light evaluation**, -you may want to skip the next two commands and proceed to reproducing -table and figures. +It generates the data for all but the last row of Table 1. +As you will notice, it runs each benchmark for 20 iterations +and reports averages and standard deviation from the data collected. +It also reports how many of these 20 iterations were successful + +If you are performing an **early light evaluation**, you probaly +want to generate a table similar to Table 1 at this point. +You can do this with the following command: +``` +./generate-table-tacas2024.sh +``` +which will leave the table in `plots/tacas2024-table-results.pdf`. + +You can now view this table or proceed directly to generating +the data for the two figures and data for the fifo7 benchmark +(the missing last row of Table 1). The second command uses model checking to find counterexamples -and should complete in ca. three hours: +and should complete in a few (3-7) hours: ``` ./run-experiments-model-checker-tacas2024.sh ``` -It generates the data presented -in the top six lines of Table 1 as well as both Figures. Lines 7 to 9 of -the table are not recreated. +It generates the data for the two figures (Figs 4 and 5). -The third command uses counterexample search on larger models -and may require more that one day on mid-end machines: +The third command generates the data for the last row of Table 1. +Be warned that it may require more than a day on mid-end machines. ``` ./run-big-experiments-with-ce-search-tacas2024.sh ``` -It generates -the data presented in line 9 of Table 1. The data for the other lines -and the figures is not recreated. ## Reproducing Table and Figures -The plots can be re-rendered using two scripts +The plots can be created using two scripts ``` ./generate-table-tacas2024.sh ./generate-plots-tacas2024.sh @@ -106,7 +109,7 @@ as follows: 1. **benchmark problems** are located in ```benchmarks/```. Problems are collected from papers on RaLib, from the [automata wiki](https://automata.cs.ru.nl), and some - are generated or manually created for this project. + are generated or manually created. 2. **configurations** of RaLib are located in ```configs/```. The different configurations are used in different series of experiments and specify aspects of the @@ -125,7 +128,7 @@ as follows: in ```plots/```. We ship experiment results and plots generated by us in these locations. -## Running experiments +## Running Experiments Generally, experiments are organized in series and configurations do not specify which learner should @@ -136,25 +139,25 @@ series, and complete evaluations. ### Experiments for the evaluation of CT-based learners (TACAS'24) This section describes how one can run the experiments -that are reported in the TACAS 2024 submission. +that are reported in the TACAS 2024 paper. Concrete results may vary a little bit for every execution due to randomization but trends should be stable. -1. Running experiments from TACAS 2024 submission +1. Running experiments from TACAS 2024 paper ``` - ./run-experiments-model-checker-tacas2024.sh ./run-experiments-with-ce-search-tacas2024.sh + ./run-experiments-model-checker-tacas2024.sh ./run-big-experiments-with-ce-search-tacas2024.sh ``` results (logs and models) can be found in ```results/```. - The expected running times are ca. 3 hours, less than ten - minutes, and (on mid-end machines) over a day, respectively. - + The expected running times are between 5 and 15 minutes, + a few hours, and (on mid-end machines) over a day, + respectively. -2. Generating table in TACAS 2024 submission +2. Generating table in TACAS 2024 paper this step requires latexmk and a latex distribution to be installed diff --git a/README.md b/README.md index c298bf8..f0d912d 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,13 @@ -# RaLib Benchmarking Setup +# RaLib Benchmarking Repository -This project contains benchmark examples and scripts to -run RaLib on these benchmarks. The project is organized -as follows: +This repository contains benchmark examples and scripts to run +[RaLib](https://github.com/LearnLib/ralib) on these benchmarks. +The repository is organized as follows: 1. **benchmark problems** are located in ```benchmarks/```. Problems are collected from papers on RaLib, from the [automata wiki](https://automata.cs.ru.nl), and some - are generated or manually created for this project. + are generated or manually created. 2. **configurations** of RaLib are located in ```configs/```. The different configurations are used in different series of experiments and specify aspects of the @@ -16,7 +16,7 @@ as follows: teachers/theories for types, etc.). 3. **experiments** are organized in series in ```experiments/```. Series combine benchmark problems with a configuration. - The different series in this project originate from + The different series in this repository originate from multiple papers on RaLib / register automata learning. 4. **results** from running experiments will be stored in ```results/```. Results include logs and models from @@ -30,44 +30,56 @@ as follows: ## RaLib Installation -To install RaLib, you have to initialize the submodule -and build RaLib, which requires a JDK and Maven to be -installed. +To install RaLib on a machine with internet access, you have +to initialize the submodule with the following commands. +Note that the TACAS 2024 artifact already contains RaLib's +code, so you can skip this initial step and simply execute +the build script. + +``` +git submodule init +git submodule update +``` + +You can then build RaLib with the following command, +which requires a JDK and Maven to be installed. ``` ./build.sh ``` -## Running experiments +## Running Experiments -Generally, experiments are organized in series and +Generally, experiments are organized in series, and configurations do not specify which learner should be used or how many times experiments should be run. Shell scripts exist for running individual experiments, series, and complete evaluations. +Note that most of RaLib's algirithms use randomization, +so concrete results typically vary for every execution. +Thus, some of the scripts below run a number of +iterations and report averages and standard deviations. + ### Experiments for the evaluation of CT-based learners (TACAS'24) This section describes how one can run the experiments -that are reported in the TACAS 2024 submission. -Concrete results may vary a little bit for every -execution due to randomization but trends should be -stable. +that are reported in the TACAS 2024 paper. -1. Running experiments from TACAS 2024 submission +1. Running experiments from TACAS 2024 paper ``` - ./run-experiments-model-checker-tacas2024.sh ./run-experiments-with-ce-search-tacas2024.sh + ./run-experiments-model-checker-tacas2024.sh ./run-big-experiments-with-ce-search-tacas2024.sh ``` results (logs and models) can be found in ```results/```. - The expected running times are ca. 3 hours, less than ten - minutes, and (on mid-end machines) over a day, respectively. + The expected running times are between 5 and 15 minutes, + few hours, and (on mid-end machines) over a day, + respectively. - -2. Generating table in TACAS 2024 submission +2. Generating table in TACAS 2024 paper this step requires latexmk and a latex distribution to be installed @@ -147,4 +159,3 @@ all experiments that are to be run as part of the evaluation. ``` Searches for `search_term` in the logs from all iterations. - From 00d341bd2e26ca94e6ced973b6b4b3f63fd9e2c1 Mon Sep 17 00:00:00 2001 From: Paul Date: Tue, 2 Jan 2024 12:58:35 +0200 Subject: [PATCH 8/8] Fix compare learners --- compare-learners.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/compare-learners.sh b/compare-learners.sh index c9892d7..bf9b7e6 100755 --- a/compare-learners.sh +++ b/compare-learners.sh @@ -50,7 +50,7 @@ print_stat() { if [ $n -gt 0 ]; then avg=$((sum/n)) std=$(std_dev "${vals[@]}" | sed "s/,/\./") - printf "%5d, %6.2f, " $avg $std + LC_NUMERIC=C printf "%5d, %6.2f, " $avg $std else printf ", , " fi