Releases: substratusai/kubeai
Releases · substratusai/kubeai
helm-chart-models-0.10.0
A Helm chart for Kubernetes
helm-chart-kubeai-0.10.0
Private Open AI Platform for Kubernetes.
kubeai 0.12.0
What's Changed
- Add initial dependabot config by @alpe in #317
- Fix formatting for docs by @samos123 in #325
- bump openai python client in e2e test by @samos123 in #328
- Bump actions/checkout from 2 to 4 in the actions-all group by @dependabot in #327
- Bump gocloud.dev/pubsub/kafkapubsub from 0.39.0 to 0.40.0 by @dependabot in #319
- Bump gocloud.dev/pubsub/rabbitpubsub from 0.39.0 to 0.40.0 by @dependabot in #322
- Proposal: Cache-optimized routing by @nstogner in #314
- Add Vultr to adopters by @happytreees in #334
- Add Arcee to adopters in README.md by @nason in #335
- add nvidia-gpu-rtx4070-8gb and qwen2.5 models by @samos123 in #326
- remove deprecated owner field in models chart by @samos123 in #336
- helm: support adding labels to models by @samos123 in #338
- vLLM: Add support for loading models from PVC by @samos123 in #339
- update README by @samos123 in #337
- Add llama 3.3 70b GH200 model and GH200 profile by @samos123 in #341
New Contributors
- @dependabot made their first contribution in #327
- @nason made their first contribution in #335
Full Changelog: v0.11.0...v0.12.0
kubeai 0.11.0
What's Changed
- improve caching docs by @samos123 in #295
- Update kubernetes api reference by @samos123 in #290
- Deep Chat integration by @nstogner in #294
- Add gh200 support and model by @happytreees in #300
- update README by @samos123 in #296
- Update README.md by @samos123 in #305
- add llama 3.1 70b fp8 model on 1 x gh200 by @samos123 in #302
- Llama 3.1 70b with pipeline parallelism by @samos123 in #307
- add k8s device plugin / GPU operator values file by @samos123 in #308
- Add Lambda's tutorial and video to the README's table of adopters by @cbrownstein-lambda in #309
- update vllm image for GPU and TPU to v0.6.4.post1 by @samos123 in #310
- add a generic K8s install guide by @samos123 in #312
- LoRA Adapters for vLLM & support for s3, gs, oss for pulling adapters and models (to cache) from buckets by @nstogner in #304
- Add Configure Text Generation Models guide by @samos123 in #313
New Contributors
- @happytreees made their first contribution in #300
- @cbrownstein-lambda made their first contribution in #309
Full Changelog: v0.10.0...v0.11.0
helm-chart-models-0.9.0
A Helm chart for Kubernetes
helm-chart-kubeai-0.9.0
Private Open AI Platform for Kubernetes.
kubeai 0.10.0
What's Changed
- Adding Build WF timeout to address stuck WF's by @Sudhamsh in #281
- Add support for HTTP X-Label-Selector headers to support Multitenancy by @nstogner in #282
- add kubeai metrics service endpoint by @kaiehrhardt in #284
- increase caching e2e test timeout by @samos123 in #288
- Add EKS Installation Guide by @samos123 in #287
- add caching models with EFS guide by @samos123 in #289
New Contributors
- @Sudhamsh made their first contribution in #281
- @kaiehrhardt made their first contribution in #284
Full Changelog: v0.9.0...v0.10.0
helm-chart-models-0.8.0
A Helm chart for Kubernetes
helm-chart-kubeai-0.8.0
Private Open AI Platform for Kubernetes.
kubeai 0.9.0
Highlights
- Autoscaling now works for any engine including Ollama and FasterWhisper
- Add ability to cache models using shared filesystems (Filestore, EFS, etc)
What's Changed
- Autoscale based on KubeAI OpenTelemetry active requests metrics by @nstogner in #261
- add resourceProfiles and 405b on A100 80GB by @samos123 in #264
- Refactor e2e tests by @nstogner in #263
- Add Autoscaler State ConfigMap by @nstogner in #268
- add tpu quota to GKE install guide and use values-gke.yaml by @samos123 in #271
- update vllm images to 0.6.3 by @samos123 in #273
- Shared filesystem caching by @nstogner in #272
- add manual test of vLLM on GPU and TPU by @samos123 in #279
Full Changelog: v0.8.0...v0.9.0