Releases: ubc-provenance/PIDSMaker

PIDSMaker 2.1.1 - KDD'26 paper

@TristanBilot TristanBilot released this 25 May 16:24
3260273

State of the PIDSMaker repo at the time of paper acceptance to KDD'26 D&B.

Includes an improvement to Velox: removing x_is_tuple makes Velox perform better on average.

PIDSMaker 2.1.0

@TristanBilot TristanBilot released this 27 Apr 08:53
73e9fc5

Release Notes

  • Add datasets: Carbanak v2 and Atlas v2
  • Add download script for datasets
  • Fix Docker installation permission errors
  • Fix wandb integration with new API key format by upgrading to [wandb 0.24.1](b69ad97)
  • Refine MAGIC architecture to more closely match the original paper
  • Minor code formatting improvements

PIDSMaker 2.0.0

@TristanBilot TristanBilot released this 29 Jan 23:25
667bc8f

Changelog

Support for FIVEDIRECTIONS and TRACE datasets (E3/E5)

These new datasets can now be used in PIDSMaker.

Best hyperparameters

We provide the hyperparameters we found through grid-search tuning for the main PIDSs. Because of training instability, they do not guarantee good results, but they give users a starting point for their own hyperparameter tuning.

Pipeline stages simplification (comes with breaking changes)

Stages are no longer organized as tasks and subtasks (e.g., detection.gnn_training): we removed prefixes such as detection, evaluation, preprocessing, etc., as they brought no value to the framework.
The overall pipeline has been simplified to:

config/
├── orthrus.yml
├── kairos.yml
└── ...
pidsmaker/
├── main.py
├── config/
│   ├── config.py
│   └── pipeline.py
├── tasks/                      
│   ├── construction.py
│   ├── transformation.py
│   ├── featurization.py
│   ├── batching.py
│   ├── training.py
│   ├── evaluation.py
│   └── triage.py

Renaming of argument paths:

Before                           After
preprocessing.build_graphs       construction
preprocessing.transformation     transformation
featurization.feat_training      featurization
detection.graph_preprocessing    batching
detection.gnn_training           training
detection.evaluation             evaluation
detection.triage                 triage
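
As an illustration of the renaming, the mapping above can be expressed as a small lookup that rewrites old dotted argument paths. This is only a sketch: `migrate_arg_path` is a hypothetical helper that is not part of PIDSMaker, and the parameter names used in the usage note (such as `lr`) are placeholders rather than real PIDSMaker arguments.

```python
# Old -> new argument paths, copied from the renaming table.
RENAMED_PATHS = {
    "preprocessing.build_graphs": "construction",
    "preprocessing.transformation": "transformation",
    "featurization.feat_training": "featurization",
    "detection.graph_preprocessing": "batching",
    "detection.gnn_training": "training",
    "detection.evaluation": "evaluation",
    "detection.triage": "triage",
}


def migrate_arg_path(path: str) -> str:
    """Rewrite a dotted argument path from the old layout to the 2.0.0 layout.

    Hypothetical helper: the matching old prefix is replaced and whatever
    follows it (e.g. a parameter name) is kept unchanged.
    """
    # Try longer prefixes first so a more specific entry always wins.
    for old in sorted(RENAMED_PATHS, key=len, reverse=True):
        if path == old or path.startswith(old + "."):
            return RENAMED_PATHS[old] + path[len(old):]
    # Already in the new layout, or not one of the renamed paths.
    return path
```

For example, `migrate_arg_path("detection.gnn_training.lr")` returns `"training.lr"`, while a path that already uses the new layout, such as `"training.lr"`, is returned unchanged.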

Docstring

Docstrings have been added within the code for better understanding.

Docs

Added details to the docs, notably on the pipeline, provenance basics, and instability.
Also changed the logo.

Fixes

We fixed a parsing error (#28) and an important error on the OpTC datasets (#22).

PIDSMaker 1.0.1

@TristanBilot TristanBilot released this 31 Oct 03:14
660c06c

Changelog

Fix non-determinism in graph construction and in word2vec (used by Orthrus and Velox)

Before this release, running the same system twice with the exact same config could lead to different results.
The first cause was the sorting applied in build_default_graphs, where a few edges have the same timestamp but different attributes.
Sorting with such collisions is non-deterministic, so some edges could be swapped, leading to radical changes in accuracy after multiple epochs of training. This kind of instability is the most important limitation of current architectures and should be addressed in future research.
Regarding word2vec, we were using num_workers>1 before this release, which led to non-determinism due to multiprocessing orchestration. Using a single worker makes the embedding training deterministic.
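
The sorting issue can be illustrated with a minimal, self-contained sketch. It is not the code of build_default_graphs and not necessarily the fix applied in this release: the edge tuples and both function names are hypothetical. It only shows why sorting on the timestamp alone leaves the order of tied edges dependent on how they arrived, and how adding tie-breakers yields a single canonical order.

```python
# Hypothetical edge records: (timestamp, src, dst, edge_type).
edges = [
    (100, "proc_a", "file_x", "write"),
    (100, "proc_a", "file_y", "read"),
    (100, "proc_b", "file_x", "read"),
    (101, "proc_b", "sock_z", "connect"),
]


def sort_by_timestamp_only(records):
    # Edges with equal timestamps keep their input order, so the result
    # depends on the order in which the records happened to arrive.
    return sorted(records, key=lambda e: e[0])


def sort_with_tie_breakers(records):
    # Comparing the full tuple breaks timestamp ties with the remaining
    # attributes, so every input order produces the same output.
    return sorted(records)


# Same edges, different arrival order.
reordered = list(reversed(edges))

# Timestamp-only sorting: the two results disagree on the tied edges.
assert sort_by_timestamp_only(edges) != sort_by_timestamp_only(reordered)

# Sorting with tie-breakers: the result no longer depends on input order.
assert sort_with_tie_breakers(edges) == sort_with_tie_breakers(reordered)
```

The tie-breaker only works if the remaining attributes distinguish the tied edges; fully identical records are interchangeable, so their relative order does not matter.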

Add support for the reapr labels for the E3-CADETS and E3-THEIA datasets

This release adds the reapr labels.
On the E3-CADETS and E3-THEIA datasets, these labels can now be used instead of those from Orthrus.

Support for installation with Apptainer

Installation is now also possible with Apptainer (formerly Singularity) instead of Docker.

Add dataset preprocessing scripts

We now provide the scripts that preprocess the raw DARPA files into the postgres databases, both for transparency and to support the integration of new datasets.

Remove tuned configurations

The --tuned yml files for existing PIDSs were obsolete due to the instability and non-determinism present in the framework. We removed them to make it clearer that users must ultimately run hyperparameter tuning on their end to obtain an optimal model.

README update

We added more concrete examples to the README showing how to use PIDSMaker to update existing systems.

Usenix Security 2025

@tfjmp tfjmp released this 05 Jun 16:49
usenixsec

usenixsec