This is the official implementation of the DeepHYDRA algorithm presented in the paper "DeepHYDRA: Resource-Efficient Time-Series Anomaly Detection in Dynamically-Configured Systems".
The folder envs/ contains the conda environment specification files for the different models. You can create an environment from one of these files with the following command:
conda create --name <env_name> --file <file>
In the respective conda environments, install the Python packages not installed via conda with
pip install -r <env_name>_python_requirements.txt
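The two steps above can be scripted across all specification files. The following is a minimal sketch, not part of the repository: it assumes each environment is named after its specification file (envs/tranad.txt becomes tranad) and that the matching <env_name>_python_requirements.txt sits next to it in envs/, neither of which is confirmed here. The helper name print_env_setup is hypothetical. The script only prints the commands so they can be reviewed before anything is installed.

```shell
#!/bin/sh
# Sketch: print the conda/pip setup commands for every spec file in envs/.
# Assumptions (not confirmed by the repository):
#   - each environment is named after its spec file (envs/tranad.txt -> tranad)
#   - the pip requirements file sits next to the spec file in envs/

print_env_setup() {
    spec_file=$1
    env_dir=$(dirname "$spec_file")
    env_name=$(basename "$spec_file" .txt)
    printf 'conda create --name %s --file %s\n' "$env_name" "$spec_file"
    printf 'conda run --name %s pip install -r %s/%s_python_requirements.txt\n' \
        "$env_name" "$env_dir" "$env_name"
}

for spec in envs/*.txt; do
    case $spec in
        *_python_requirements.txt) continue ;;  # pip file, not a conda spec
    esac
    [ -e "$spec" ] || continue                  # glob matched nothing
    print_env_setup "$spec"
done
```

Run it from the repository root, check the printed commands against the files actually present in envs/, and then execute them by hand or from a saved file.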
This is a bit messy, and we will probably streamline this in the future.
For the machine-1-1 dataset, extract the files in the archive datasets/smd/machine-1-1.tar.gz.
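A minimal sketch of the extraction step, assuming it is run from the repository root and that the files should be unpacked next to the archive in datasets/smd. The target directory is an assumption; adjust it if the scripts expect a different layout. The helper name extract_dataset is hypothetical.

```shell
#!/bin/sh
# Sketch: unpack a .tar.gz archive into the directory that contains it.

extract_dataset() {
    archive=$1
    dest=$(dirname "$archive")
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # -x extract, -z gunzip, -f archive file, -C target directory
    tar -xzf "$archive" -C "$dest"
}

extract_dataset datasets/smd/machine-1-1.tar.gz \
    || echo "Run this from the repository root." >&2
```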
For the HLT datasets, retrieve the original datasets from here and place them in the subfolder datasets/hlt. Afterwards, run the scripts
generate_hlt_datasets.py
generate_combined_detection_test_set.py
Use the conda environment contained in envs/dataset_generation.txt for this step.
To run the one-liner baselines, run the script
run_one_liners.sh
in the subfolder baselines/one-liners.
To run the MERLIN scripts, first clone the py-merlin repository, then build and install the package inside the environment defined in envs/merlin.txt. Afterwards, you will be able to run the script
run_merlin.sh
in the subfolder baselines/merlin.
Use the conda environments defined in envs/informers.txt and envs/tranad.txt to run the corresponding models. You can run the models with the parameters used in the paper by executing the scripts
run_smd.sh
run_hlt.sh
run_hlt_unaugmented.sh
contained in the subfolders transformer_based_detection/informers and transformer_based_detection/tranad.
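The training runs for both models can be listed in one pass. This sketch only prints the commands. It assumes that the environments are named informers and tranad after their specification files, that each script is started from inside its own subfolder, and that the scripts are bash scripts; none of this is confirmed here, and the helper name print_model_runs is hypothetical.

```shell
#!/bin/sh
# Sketch: list the training commands for both transformer-based models.
# Assumptions: environments named after envs/<model>.txt; scripts are bash
# scripts meant to be started from within their own subfolder.

print_model_runs() {
    for model in informers tranad; do
        for script in run_smd.sh run_hlt.sh run_hlt_unaugmented.sh; do
            printf '(cd transformer_based_detection/%s && conda run --name %s bash %s)\n' \
                "$model" "$model" "$script"
        done
    done
}

print_model_runs
```

Each printed line runs in a subshell, so the working directory of the calling shell is left unchanged between runs.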
After training the models on the reduced HLT data, you can run the combined detection method using the generated checkpoints. To do this, run the scripts
run_informers_combined.sh
run_tranad_combined.sh
contained in the subfolders detection_combined/benchmark/informers and detection_combined/benchmark/tranad.
Running all of the scripts described above should populate the folders evaluation/combined_detection/predictions, evaluation/reduced_detection/predictions, and evaluation/smd/predictions with the files needed to calculate the performance metrics and generate the comparison plots shown in the paper. You can calculate the metrics by running the script
get_results_over_random_seeds.sh
in each of the subfolders evaluation/combined_detection, evaluation/reduced_detection, and evaluation/smd. Note that the results for TranAD on the machine-1-1 dataset are stored directly in the model folder as results_tranad_machine-1-1.csv.
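Before computing the metrics, it can be worth confirming that the three prediction folders were actually populated. A minimal sketch, assuming it is run from the repository root; the helper name check_predictions is hypothetical, and it only checks that each folder contains at least one entry, not that the set of files is complete.

```shell
#!/bin/sh
# Sketch: report which prediction folders are missing or empty.

check_predictions() {
    root=$1
    rc=0
    for name in combined_detection reduced_detection smd; do
        dir=$root/evaluation/$name/predictions
        # ls -A lists every entry except . and ..; no output means an empty folder
        if [ -d "$dir" ] && [ -n "$(ls -A "$dir")" ]; then
            echo "ok:      $dir"
        else
            echo "missing: $dir"
            rc=1
        fi
    done
    return $rc
}

check_predictions . || echo "Some prediction folders are not populated yet."
```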
The plots can be generated by running
plot_comparison_plots.py
in the subfolder evaluation/combined_detection.