The SuperSONIC project implements server infrastructure for inference-as-a-service applications in large high energy physics (HEP) and multi-messenger astrophysics (MMA) experiments. The server infrastructure is designed for deployment on Kubernetes clusters equipped with GPUs.
The main components of SuperSONIC are:
- NVIDIA Triton inference servers
- Dynamic multi-purpose Envoy Proxy:
  - Load balancing
  - Client connection rate limiting
  - GPU saturation prevention
  - Token-based authentication
- Load-based autoscaling via KEDA

To install SuperSONIC with Helm:

    helm repo add supersonic https://fastmachinelearning.org/SuperSONIC
    helm repo update
    helm install <release-name> supersonic/supersonic --values <your-values.yaml> -n <namespace>

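
The install command above fails if the release already exists. As a minimal sketch, the helper below prints an equivalent command that uses `helm upgrade --install` instead of `helm install`, so the same line works for both first-time installs and later updates. The function name `supersonic_install_cmd` is hypothetical, and `--create-namespace` is added on the assumption that the target namespace may not exist yet.

```shell
# Sketch only: supersonic_install_cmd is a hypothetical helper, not part of SuperSONIC.
# It prints (does not run) a helm command for the given release, values file, and namespace.
supersonic_install_cmd() {
  release="$1"    # Helm release name
  values="$2"     # path to your values.yaml
  namespace="$3"  # target Kubernetes namespace
  # "upgrade --install" installs the release if absent, otherwise upgrades it.
  printf 'helm upgrade --install %s supersonic/supersonic --values %s -n %s --create-namespace\n' \
    "$release" "$values" "$namespace"
}
```

Calling `supersonic_install_cmd <release-name> <your-values.yaml> <namespace>` prints the command for review; pipe the output to `sh` to run it once it looks right.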
To construct the `values.yaml` file for your application, follow the Configuration guide.
The full list of configuration parameters is available in the Configuration reference.
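
One possible starting point for a custom values file is the chart's own defaults, which Helm can print with `helm show values`. The sketch below assumes the `supersonic` repository alias has already been added with `helm repo add`; the helper name `dump_default_values` and the default output filename are hypothetical.

```shell
# Sketch only: dump_default_values is a hypothetical helper, not part of SuperSONIC.
# It writes the chart's default values to a file that can then be edited.
dump_default_values() {
  out="${1:-default-values.yaml}"  # output path (assumed default name)
  # Fail early with a clear message if helm is not on PATH.
  if ! command -v helm >/dev/null 2>&1; then
    echo "helm not found on PATH" >&2
    return 1
  fi
  # Assumes the "supersonic" repo alias was added beforehand.
  helm show values supersonic/supersonic > "$out"
}
```

For example, `dump_default_values my-values.yaml` writes the defaults to `my-values.yaml`, which can be trimmed to the settings you want to override and passed to Helm with `--values`.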
| | CMS | ATLAS | IceCube |
|---|---|---|---|
| Geddes cluster (Purdue) | ✅ | - | - |
| Nautilus cluster (NRP) | ✅ | ⏳ | ✅ |