4 changes: 4 additions & 0 deletions .gitignore
Expand Up @@ -18,3 +18,7 @@ results/*
plots/*.csv
plots/*.dvi
plots/*.pdf

m2/
packages/
*-repro.zip
239 changes: 239 additions & 0 deletions AEC-README.md
@@ -0,0 +1,239 @@
# Artifact Evaluation Readme

This file details how to perform artifact evaluation. See the
**Detailed RaLib Benchmarking Setup** part below for more general
documentation.

The artifact can replicate the data presented in Table 1 as well
as Figures 4 and 5, which are discussed in Section 6.
Note that timing data is heavily machine-dependent,
although the relative timings should be reproducible.

## General Setup

Setup must be performed for both the early light and the full review.
All dependencies are included with this artifact, i.e.,
no network connection is required.
First, install the system dependencies.
When prompted, enter the sudo password (same as the username).
This step will complete in a few minutes.
```
./install-deps.sh
```

Next, build the RaLib package. No administrative privileges are
required. This step also takes a few minutes.
```
./build.sh
```

## Reproducing Experimental Data

Three scripts are provided that regenerate the experimental
output data in `results/`.

The first command uses counterexample search and, depending
on the speed of your machine, should complete in 5 to 20 minutes:
```
./run-experiments-with-ce-search-tacas2024.sh
```
It generates the data for all but the last row of Table 1.
As you will notice, it runs each benchmark for 20 iterations
and reports averages and standard deviations computed from the
collected data. It also reports how many of these 20 iterations
were successful.
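
As a rough illustration (this is not the script's actual implementation,
and the numbers are made up), per-benchmark statistics of this kind can
be computed with a short awk program:

```shell
# Toy example: mean and (population) standard deviation of a column of
# numbers, as the experiment scripts report per benchmark. Input is fake.
printf '10\n12\n11\n13\n' | awk '
  { sum += $1; sumsq += $1 * $1; n++ }
  END {
    mean = sum / n
    printf "mean=%.2f stddev=%.2f\n", mean, sqrt(sumsq / n - mean * mean)
  }'
```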

If you are performing an **early light evaluation**, you probably
want to generate a table similar to Table 1 at this point.
You can do this with the following command:
```
./generate-table-tacas2024.sh
```
which will leave the table in `plots/tacas2024-table-results.pdf`.

You can now view this table or proceed directly to generating
the data for the two figures and for the fifo7 benchmark
(the missing last row of Table 1).

The second command uses model checking to find counterexamples
and should complete in a few (3-7) hours:
```
./run-experiments-model-checker-tacas2024.sh
```
It generates the data for the two figures (Figs 4 and 5).

The third command generates the data for the last row of Table 1.
Be warned that it may require more than a day on mid-end machines.
```
./run-big-experiments-with-ce-search-tacas2024.sh
```

## Reproducing Table and Figures

The table and the plots can be created using two scripts:
```
./generate-table-tacas2024.sh
./generate-plots-tacas2024.sh
```

This will take a few minutes. The table and plots are generated from all
data present in `results/`, i.e., if some experiment scripts were
not run, they will include our shipped data.

The generated files that match the table and figures are:

1. `plots/tacas2024-table-results.pdf`, which should match Table 1.
Some extraneous data will be included and the experiment names
will slightly differ.
2. `plots/plot-dtls-resets.pdf`, which should match Figure 4(a).
3. `plots/plot-dtls-resets-noopt.pdf`, which should match Figure 4(b).
4. `plots/plot-dtls-counterexamples.pdf`, which should match Figure 4(c).
5. `plots/plot-dtls-wct.pdf`, which should match Figure 4(d).
6. `plots/plot-gen-transitions.pdf`, which should match Figure 5(a).
7. `plots/plot-gen-registers.pdf`, which should match Figure 5(b).
8. `plots/plot-gen-registers-noopt.pdf`, which should match Figure 5(c).

Note that some additional files are generated that were not included in the paper.

## Source Code

The full source code of the tool is available in the `ralib/` directory.
Source code for dependencies obtained from Ubuntu or Maven Central is
not included.

# Detailed RaLib Benchmarking Setup

This project contains benchmark examples and scripts to
run RaLib on these benchmarks. The project is organized
as follows:

1. **benchmark problems** are located in ```benchmarks/```.
Problems are collected from papers on RaLib, from the
[automata wiki](https://automata.cs.ru.nl), and some
are generated or manually created.
2. **configurations** of RaLib are located in ```configs/```.
The different configurations are used in different
series of experiments and specify aspects of the
learning experiments (how counterexamples are found,
how counterexamples are preprocessed, max. runtimes,
teachers/theories for types, etc.).
3. **experiments** are organized in series in ```experiments/```.
Series combine benchmark problems with a configuration.
The different series in this project originate from
multiple papers on RaLib / register automata learning.
4. **results** from running experiments will be stored in
```results/```. Results include logs and models from
individual learning experiments. Some scripts will
use the results to compute plots and tables.
Templates and generated documents can be found
in ```plots/```. We ship experiment results and plots
generated by us in these locations.

## Running Experiments

Generally, experiments are organized in series and
configurations do not specify which learner should
be used or how many times experiments should be run.
Shell scripts exist for running individual experiments,
series, and complete evaluations.

### Experiments for the evaluation of CT-based learners (TACAS'24)

This section describes how one can run the experiments
that are reported in the TACAS 2024 paper.
Concrete results may vary slightly between executions
due to randomization, but trends should be stable.

1. Running experiments from the TACAS 2024 paper

```
./run-experiments-with-ce-search-tacas2024.sh
./run-experiments-model-checker-tacas2024.sh
./run-big-experiments-with-ce-search-tacas2024.sh
```

Results (logs and models) can be found in ```results/```.
The expected running times are between 5 and 15 minutes,
a few hours, and (on mid-end machines) over a day,
respectively.

2. Generating the table in the TACAS 2024 paper

This step requires latexmk and a LaTeX distribution
to be installed.

```
./generate-table-tacas2024.sh
```

The generated PDF is ```plots/tacas2024-table-results.pdf```.

*Note:* The generated PDF has more rows and columns
than the table in the paper.

3. Generating plots

This step requires latexmk and a LaTeX distribution
with pgfplots to be installed.

```
./generate-plots-tacas2024.sh
```

The plots in Fig. 4 are:

```
plots/plot-dtls-resets.pdf
plots/plot-dtls-resets-noopt.pdf
plots/plot-dtls-counterexamples.pdf
plots/plot-dtls-wct.pdf
```

The plots in Fig. 5 are:

```
plots/plot-gen-transitions.pdf
plots/plot-gen-registers.pdf
plots/plot-gen-registers-noopt.pdf
```

*Note:* Some additional plots that are generated
were not included in the paper.

### Running individual experiments

```
./run_experiment.sh [-h] -s series -e experiment -i iterations -l learner
```

Learns a model of `experiments/[series]/[experiment].xml` with the config
`experiments/[series]/config` and the specified learner. Each `series`
corresponds to a subdirectory of `experiments/`, and each `experiment` to
an XML file in that series.
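
For orientation, here is how the flags map to files on disk; the series
and experiment names below are placeholders, not real names from the
artifact:

```shell
# Illustrative only: -s selects a subdirectory of experiments/, -e an XML
# file inside it. "some-series" and "some-experiment" are placeholders.
series=some-series
experiment=some-experiment
echo "model:  experiments/${series}/${experiment}.xml"
echo "config: experiments/${series}/config"
```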

### Running a series

```
./run_series.sh [-h] -s series -i iterations -l learner
```

Runs the series of experiments (i.e., all XML files) in the
directory `experiments/[series]` with the specified learner.
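
Conceptually, a series run just iterates over every XML file in the
series directory; a minimal sketch (the directory and files here are
invented for the demo, not the artifact's actual layout):

```shell
# Sketch only: enumerate the experiments of a series as run_series.sh
# conceptually does. The directory and files are created for the demo.
dir=$(mktemp -d)
mkdir -p "$dir/demo-series"
touch "$dir/demo-series/a.xml" "$dir/demo-series/b.xml"
for f in "$dir"/demo-series/*.xml; do
  echo "would run experiment: $(basename "$f" .xml)"
done
```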

### Running the complete evaluation (2015 RaLib Paper)

```
./run_evaluation.sh
```

This script contains explicit calls to `run_series.sh` for
all experiments that are to be run as part of the evaluation.
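
The script's shape is presumably just a plain list of such calls; a
hypothetical sketch (series and learner names are placeholders, and
`run_series` is stubbed so the snippet is self-contained):

```shell
# Hypothetical shape of run_evaluation.sh. The stub below stands in for
# ./run_series.sh; series and learner names are invented.
run_series() { echo "run_series $*"; }
run_series -s first-series -i 10 -l some-learner
run_series -s second-series -i 10 -l some-learner
```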


### Searching in logs

```
./search_logs.sh series experiment learner search_term
```

Searches for `search_term` in the logs from all iterations.
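
In spirit, this is a grep over the per-iteration log files; a
self-contained sketch with invented file names and contents:

```shell
# Minimal sketch of searching logs; files and contents are invented.
dir=$(mktemp -d)
echo "iteration 1: learning finished" > "$dir/iter1.log"
echo "iteration 2: learning finished" > "$dir/iter2.log"
grep -h "finished" "$dir"/*.log
```
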
16 changes: 16 additions & 0 deletions LICENSE.md
@@ -0,0 +1,16 @@
# License Information

The components of this package are all released under various
open-source licenses:

1. All Ubuntu packages in `packages/` are released under various
licenses. See each package's metadata for details.
2. All Maven dependencies in `m2/` are released under various
licenses. See each artifact's metadata for details.
3. RALib and our extensions are released under the Apache License,
Version 2.0.
4. All configuration files and scripts are released under the
MIT License.
5. The benchmark problems are either publicly available from the
Automata Wiki without restrictions or released under the MIT
License.