The cache replacement pipeline was cloned and slightly modified from https://github.com/google-research/google-research/tree/master/cache_replacement.
To avoid reproduction problems, we have saved all the traces used in our work to Dropbox. Before proceeding, please copy the traces you would like to work with into the `./cache_replacement/policy_learning/cache/traces/` folder.
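For example, assuming the Dropbox traces were downloaded to a local `~/Downloads/traces` directory (a placeholder path), the copy step might look like this:

```bash
# Copy the downloaded traces (all three splits: train, valid, test) into the
# folder the pipeline expects; ~/Downloads/traces is only a placeholder path.
mkdir -p ./cache_replacement/policy_learning/cache/traces/
cp ~/Downloads/traces/* ./cache_replacement/policy_learning/cache/traces/
```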
This project uses Python 3.7. Please install the required packages with

```bash
pip install -r requirements.txt
```

Afterwards, please install OpenAI baselines with

```bash
pip install -e git+https://github.com/openai/baselines.git@ea25b9e8b234e6ee1bca43083f8f3cf974143998#egg=baselines
```
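A minimal setup sketch, assuming `python3.7` is available on your PATH (a conda environment would work just as well):

```bash
# Create and activate an isolated Python 3.7 environment, then install everything.
python3.7 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
pip install -e git+https://github.com/openai/baselines.git@ea25b9e8b234e6ee1bca43083f8f3cf974143998#egg=baselines
```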
To run training, use this command:

```bash
bash run.sh <DATASET> <DEVICE> <FRACTION> <DAGGER> <RESULT_DIR> <STEPS> <SAVE_EVAL_FREQ>
```
where:
- DATASET - the name of the dataset to use (do not forget to download all three splits: train, valid, test)
- DEVICE - the GPU device to train on
- FRACTION - the fraction of the train set to use (e.g. `1`, `0.01`)
- DAGGER (`True` or `False`) - whether to use DAgger
- RESULT_DIR - data folder
- STEPS - number of training steps
- SAVE_EVAL_FREQ - frequency of evaluation/checkpoint saving (should be at most half of STEPS)
For example:

```bash
bash run.sh astar 0 0.33 True ./results 20000 5000
```
This example script will do the following:
- Train a Parrot model with DAgger on 33% of the astar dataset, and save several things to the `./results/astar__dagger=true__fraction=0.33` folder (a sketch of the resulting layout is shown after this list):
  - the `evictions` and `predictions` folders will contain easily readable evictions and predictions of the model during training
  - three `.json` files containing the configs used in the training
  - a `tensorboard` folder with visualization data
  - a `checkpoints` folder with saved models
  - `logs.txt` with the cache hit rates per full validation (this will be used in evaluation to pick the best checkpoint)
- Evaluate the trained model on the test set, saving the results to the `./results/astar__dagger=true__fraction=0.33/test` folder.
- Create a `./results/parsed/astar__dagger=true__fraction=0.33` folder that will contain the crucial files (parsed outputs + logs with scores).
- Parse the evictions and predictions from the evaluation into a more lightweight format.
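As a rough sketch of what the example run produces (exact file names may differ), the results tree might look like:

```
./results/
├── astar__dagger=true__fraction=0.33/
│   ├── evictions/      # evictions of the model during training
│   ├── predictions/    # predictions of the model during training
│   ├── tensorboard/    # visualization data
│   ├── checkpoints/    # saved models
│   ├── logs.txt        # cache hit rates per full validation
│   ├── *.json          # three config files used for the run
│   └── test/           # evaluation results on the test set
└── parsed/
    └── astar__dagger=true__fraction=0.33/   # parsed outputs + logs with scores
```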
To evaluate the learning-augmented algorithms, run

```bash
for i in <RESULT_DIR>/parsed/*/0/*.pkl; do python3 mts/main.py all -k16 -d64 -f $i | tail -n3 >> evaluated.csv; done
```
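For the example run above (assuming `RESULT_DIR=./results`), the same loop, expanded for readability, would be:

```bash
# Evaluate every parsed .pkl produced by the run above; each invocation appends
# its last three output lines (the scores) to evaluated.csv.
for i in ./results/parsed/*/0/*.pkl; do
  python3 mts/main.py all -k16 -d64 -f "$i" | tail -n3 >> evaluated.csv
done
```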