Code for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]

dywsjtu/apparate

Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving

This repository contains the source code for the SOSP '24 paper Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving.

Please note that the arXiv version is not up to date with our SOSP submission. We will update the arXiv paper once the camera-ready version is finalized.

Getting Started

Apparate is implemented in Python. We have tested Apparate on Ubuntu 22.04 with Python 3.8.13.

Detailed instructions on how to reproduce the main results from our SOSP paper are in EXPERIMENTS.md.

References

@article{dai2023apparate,
  title={Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving},
  author={Dai, Yinwei and Pan, Rui and Iyer, Anand and Li, Kai and Netravali, Ravi},
  journal={arXiv preprint arXiv:2312.05385},
  year={2023}
}
