[Feature] KubeAI for OPEA v1.4 #1074

@joshuayao

Priority

Undecided

OS type

Ubuntu

Hardware type

Xeon-GNR

Running nodes

Single Node

Description

OPEA will integrate KubeAI for efficient inference on Kubernetes in v1.4.

@poussa provided the following plan:

  • Model Profiles
    • 10 models for Gaudi and CPU
    • Benchmark and tune each model (1, 2, 4, and 8 replicas)
  • OPEA Configuration
    • Resource Profiles (k8s resources definition)
    • Cache Profiles (k8s storage definition)
    • Resource and cache profiles are referenced from the models
  • Load Balancing
    • Validation and tuning on Gaudi
    • Benchmarking
  • Auto Scaling
    • Validation and tuning on Gaudi
    • Benchmarking
  • Observability
    • Validation and tuning on Gaudi
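
As a rough illustration of the "profiles are referenced from the models" item in the plan, the sketch below models the relationship in Python: each model profile names a resource profile and a cache profile, a check reports references that do not resolve, and a helper expands the replica counts from the plan (1, 2, 4, 8) into a benchmark matrix. All class, field, and function names here are hypothetical placeholders chosen for the example; they are not taken from KubeAI or OPEA, whose actual schema may look different.

```python
from dataclasses import dataclass, field

# NOTE: every name in this sketch is a hypothetical placeholder,
# not an actual KubeAI or OPEA identifier.

REPLICA_COUNTS = (1, 2, 4, 8)  # replica counts listed in the plan


@dataclass
class ResourceProfile:
    name: str
    requests: dict = field(default_factory=dict)  # e.g. {"cpu": "8"}


@dataclass
class CacheProfile:
    name: str
    storage_class: str


@dataclass
class ModelProfile:
    name: str
    resource_profile: str  # refers to ResourceProfile.name
    cache_profile: str     # refers to CacheProfile.name


def find_dangling_references(models, resources, caches):
    """Return one message per profile reference that does not resolve."""
    resource_names = {r.name for r in resources}
    cache_names = {c.name for c in caches}
    errors = []
    for m in models:
        if m.resource_profile not in resource_names:
            errors.append(f"{m.name}: unknown resource profile {m.resource_profile!r}")
        if m.cache_profile not in cache_names:
            errors.append(f"{m.name}: unknown cache profile {m.cache_profile!r}")
    return errors


def benchmark_matrix(models):
    """Pair every model with every replica count to be benchmarked."""
    return [(m.name, n) for m in models for n in REPLICA_COUNTS]
```

Keeping the profiles as separate named objects means a resource or storage definition can be tuned once and shared by several models, which seems to be the intent of the plan; the validation step is an assumption about what a useful pre-deployment check might look like.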

Labels

Backlog (features in backlog), feature (New feature or request)