
libCacheSim - building and running cache simulations


News

  • 2023 June: QDLP is now available; see our paper for details.
  • 2023 Oct: S3-FIFO and SIEVE are available! These are very simple algorithms that are very effective at reducing cache misses. Try them out in libCacheSim and in production!

What is libCacheSim

  • a high-performance cache simulator for running cache simulations.
  • a high-performance and versatile trace analyzer for analyzing different cache traces.
  • a high-performance library for building cache simulators.

libCacheSim features

  • High performance - over 20M requests/sec for a realistic trace replay.
  • High memory efficiency - predictable and small memory footprint.
  • State-of-the-art algorithms - eviction algorithms, admission algorithms, sampling techniques, approximate miss ratio computation, see here.
  • Parallelism out-of-the-box - uses many CPU cores to speed up trace analysis and cache simulations.
  • The ONLY feature-rich trace analyzer - all types of trace analysis you need, see here.
  • Simple API - easy to build cache clusters, multi-layer caching, etc.; see here.
  • Extensible - easy to support new trace types or eviction algorithms; see here.

Supported algorithms

cachesim supports a wide range of eviction algorithms, including FIFO, LRU, LeCaR, S3-FIFO, and SIEVE; see the repository for the full list.


Build and Install libCacheSim

One-line install

We provide some scripts for quick installation of libCacheSim.

cd scripts && bash install_dependency.sh && bash install_libcachesim.sh;

If this does not work, please

  1. let us know what system you are using and what error you get
  2. read the following sections for self-installation.

Install dependency

libCacheSim uses the cmake build system and has a few dependencies: glib, tcmalloc, and zstd. Please see install.md for instructions on how to install them.

Build libCacheSim

cmake recommends out-of-source builds, so we build in a new directory:

git clone https://github.com/1a1a11a/libCacheSim
pushd libCacheSim;
mkdir _build && cd _build;
cmake .. && make -j;
[sudo] make install;
popd;

Usage

cachesim (a high-performance cache simulator)

After building and installing libCacheSim, cachesim should be in the _build/bin/ directory.

basic usage

./bin/cachesim trace_path trace_type eviction_algo cache_size [OPTION...]

Use ./bin/cachesim --help to get more information.

Run a single cache simulation

Run the example traces with LRU eviction algorithm and 1GB cache size.

# Note: no space between the cache size and the unit; the unit is case-insensitive
./bin/cachesim ../data/trace.vscsi vscsi lru 1gb 

Run multiple cache simulations with different cache sizes

# Note that there is no space between the cache sizes
./bin/cachesim ../data/trace.vscsi vscsi lru 1mb,16mb,256mb,8gb

# Besides absolute cache size, you can also use a fraction of the working set size
./bin/cachesim ../data/trace.vscsi vscsi lru 0.001,0.01,0.1,0.2

# Besides using bytes as the unit, you can also treat all objects as having the same size; the cache size is then the number of objects
./bin/cachesim ../data/trace.vscsi vscsi lru 1000,16000 --ignore-obj-size 1

# use a csv trace; note the quotation marks when you pass multiple options
./bin/cachesim ../data/trace.csv csv lru 1gb -t "time-col=2, obj-id-col=5, obj-size-col=4"

# use a csv trace with more options
./bin/cachesim ../data/trace.csv csv lru 1gb -t "time-col=2, obj-id-col=5, obj-size-col=4, delimiter=,, has-header=true"

See quick start cachesim for more usages.

Plot miss ratio curve

You can plot miss ratios of different algorithms and sizes, and plot the miss ratios over time.

# plot miss ratio over size
cd scripts;
python3 plot_mrc_size.py --tracepath ../data/twitter_cluster52.csv --trace-format csv --trace-format-params="time-col=1,obj-id-col=2,obj-size-col=3,delimiter=," --algos=fifo,lru,lecar,s3fifo --sizes=0.001,0.002,0.005,0.01,0.02,0.05,0.1,0.2,0.3,0.4

# plot miss ratio over time
python3 plot_mrc_time.py --tracepath ../data/twitter_cluster52.csv --trace-format csv --trace-format-params="time-col=1, obj-id-col=2, obj-size-col=3, delimiter=,," --algos=fifo,lru,lecar,s3fifo --report-interval=30 --miss-ratio-type="accu"
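To clarify the --miss-ratio-type option: assuming "accu" means the miss ratio accumulated since the start of the trace (as opposed to the miss ratio computed per reporting interval), the difference can be sketched in a few lines of Python. This is illustrative only, not libCacheSim code, and the request outcomes below are made up:

```python
def miss_ratios(misses, interval):
    """misses: list of booleans, True = cache miss.
    Returns (accumulated, per_interval) miss ratios sampled every `interval` requests."""
    accumulated, per_interval = [], []
    total_miss = 0
    for start in range(0, len(misses), interval):
        window = misses[start:start + interval]
        total_miss += sum(window)
        accumulated.append(total_miss / (start + len(window)))  # since trace start
        per_interval.append(sum(window) / len(window))          # this window only
    return accumulated, per_interval

acc, itv = miss_ratios([True, False, False, True, True, False, False, False], 4)
print(acc)  # [0.5, 0.375]
print(itv)  # [0.5, 0.25]
```

The accumulated curve smooths out over time, while the per-interval curve shows how the miss ratio changes as the workload shifts.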

Trace analysis

libCacheSim also includes a trace analyzer that provides a lot of useful information about a trace. It is very fast and designed to work with traces containing billions of requests. It also comes with a set of scripts to help you analyze traces. See trace analysis for more details.


Using libCacheSim as a library

libCacheSim can be used as a library for building cache simulators. For example, you can build a cache cluster with consistent hashing or a multi-layer cache simulator.

Here is a simplified example showing the basic APIs.

#include <libCacheSim.h>
#include <stdio.h>

int main(void) {
  /* open trace, see quickstart_lib.md for opening csv and binary traces */
  reader_t *reader = open_trace("../data/trace.vscsi", VSCSI_TRACE, NULL);

  /* create a container for reading from the trace */
  request_t *req = new_request();

  /* create an LRU cache of 1 MiB */
  common_cache_params_t cc_params = {.cache_size = 1024 * 1024U};
  cache_t *cache = LRU_init(cc_params, NULL);

  /* counters */
  uint64_t n_req = 0, n_miss = 0;

  /* loop through the trace; cache->get returns whether the request hits */
  while (read_one_req(reader, req) == 0) {
    if (!cache->get(cache, req)) {
      n_miss++;
    }
    n_req++;
  }

  printf("miss ratio: %.4lf\n", (double)n_miss / n_req);

  /* clean up */
  close_trace(reader);
  free_request(req);
  cache->cache_free(cache);

  return 0;
}

Save this to test.c and compile it with:

gcc test.c $(pkg-config --cflags --libs libCacheSim glib-2.0) -o test.out

See here for more details, and see the example folder for examples of how to use libCacheSim, such as building a cache cluster with consistent hashing or a multi-layer cache simulator.
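As a concept sketch of the consistent-hashing idea mentioned above, the following Python illustrates how requests are routed to cache nodes on a hash ring. This is illustrative only, not the libCacheSim API; the node names and vnode count are made up, and the examples folder has the real C implementation:

```python
import hashlib
from bisect import bisect

def _hash(key: str) -> int:
    # a stable hash; Python's built-in hash() is salted per process
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=64):
        # each node gets `vnodes` points on the ring so load spreads evenly
        self.ring = sorted((_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self.keys = [h for h, _ in self.ring]

    def get_node(self, obj_id: str) -> str:
        # first ring point clockwise from the object's hash (wrapping around)
        idx = bisect(self.keys, _hash(obj_id)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["cache-0", "cache-1", "cache-2"])
node = ring.get_node("obj-42")  # the same object always routes to the same node
```

The key property: when a node is removed, only the objects that were mapped to it are reassigned; objects on surviving nodes keep their placement.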


Extending libCacheSim (new algorithms and trace types)

libCacheSim supports txt, csv, and binary traces. We prefer binary traces because they are more compact and allow libCacheSim to run faster.

We also support reading zstd-compressed binary traces without decompressing them first, which lets you store traces using less space.

If you need to add a new trace type or a new algorithm, please see here for details.


Open source cache traces

The repo includes sample traces (one from CloudPhysics and one from Twitter) in different formats (csv, txt, vscsi, and oracleGeneral). Note that the provided traces are very small samples and should not be used to evaluate different algorithms' miss ratios. The full traces can be found either in the original releases or in the processed oracleGeneral format.

Note that the oracleGeneral traces are compressed with zstd and have the following format:

struct {
    uint32_t timestamp;
    uint64_t obj_id;
    uint32_t obj_size;
    int64_t next_access_vtime;  // -1 if no next access
}

The compressed traces can be used with libCacheSim without decompression, and libCacheSim provides a tracePrint tool to print traces in a human-readable format.

Dataset         Year  Type       Original release  OracleGeneral format
Tencent Photo   2018  object     link              link
WikiCDN         2019  object     link              link
Tencent CBS     2020  block      link              link
Alibaba Block   2020  block      link              link
Twitter         2020  key-value  link              link
MetaKV          2022  key-value  link              link
MetaCDN         2023  object     link              link

Among the many traces, we recommend the newer traces from Twitter (cluster52), Wiki, and Meta.


Questions?

Please join the Google group https://groups.google.com/g/libcachesim and ask questions.


Contributions

We gladly welcome pull requests. Before making any large changes, we recommend opening an issue to discuss your proposed changes.
If the changes are minor, feel free to make them without discussion. This project adheres to Google's coding style; by participating, you are expected to follow it.


Reference

@inproceedings{yang2020-workload,
    author = {Juncheng Yang and Yao Yue and K. V. Rashmi},
    title = {A large scale analysis of hundreds of in-memory cache clusters at Twitter},
    booktitle = {14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)},
    year = {2020},
    isbn = {978-1-939133-19-9},
    pages = {191--208},
    url = {https://www.usenix.org/conference/osdi20/presentation/yang},
    publisher = {USENIX Association},
}

@inproceedings{yang2023-s3fifo,
  title = {FIFO Queues Are All You Need for Cache Eviction},
  author = {Juncheng Yang and Yazhuo Zhang and Ziyue Qiu and Yao Yue and K. V. Rashmi},
  isbn = {9798400702297},
  publisher = {Association for Computing Machinery},
  booktitle = {Symposium on Operating Systems Principles (SOSP'23)},
  pages = {130--149},
  numpages = {20},
  year = {2023}
}

@inproceedings{yang2023-qdlp,
  author = {Juncheng Yang and Ziyue Qiu and Yazhuo Zhang and Yao Yue and K. V. Rashmi},
  title = {FIFO Can Be Better than LRU: The Power of Lazy Promotion and Quick Demotion},
  year = {2023},
  isbn = {9798400701955},
  publisher = {Association for Computing Machinery},
  doi = {10.1145/3593856.3595887},
  booktitle = {Proceedings of the 19th Workshop on Hot Topics in Operating Systems (HotOS23)},
  pages = {70--79},
  numpages = {10},
}

If you use libCacheSim in your research, please cite the papers above. We also welcome you to send us a link to your paper and add it to references.md.


License

See LICENSE for details.

Related

  • PyMimircache: a Python-based cache trace analysis platform (now deprecated)