[Feature] KubeAI for OPEA v1.4 #1074

### Priority

Undecided

### OS type

Ubuntu

### Hardware type

Xeon-GNR

### Running nodes

Single Node

### Description

In v1.4, OPEA will integrate KubeAI for efficient inference on Kubernetes.

@poussa provided the following plan:

- [ ] Model Profiles
  - [ ] 10 models for Gaudi and CPU
  - [ ] Benchmark and tune each model (1, 2, 4, 8 replicas)
- [ ] OPEA Configuration
  - [ ] Resource Profiles (k8s resources definition)
  - [ ] Cache Profiles (k8s storage definition)
  - [ ] Resource and cache profiles are referenced from the models
- [ ] Load Balancing
  - [ ] Validation and tuning on Gaudi
  - [ ] Benchmarking
- [ ] Auto Scaling
  - [ ] Validation and tuning on Gaudi
  - [ ] Benchmarking
- [ ] Observability
  - [ ] Validation and tuning on Gaudi
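
To make the "profiles are referenced from the models" item concrete, here is a minimal Python sketch of how a model profile could point at named resource and cache profiles, and how the 1, 2, 4, 8 replica sweep could be generated for benchmarking. Everything in it is an assumption for illustration: the profile names, the storage class, the model URL, and the `habana.ai/gaudi` resource key are hypothetical placeholders, and the manifest fields (`resourceProfile`, `cacheProfile`, `minReplicas`, `maxReplicas`) are modeled on my reading of the KubeAI `Model` resource and should be checked against the KubeAI documentation before use.

```python
from dataclasses import dataclass

# Hypothetical profile tables; names and values are illustrative only.
RESOURCE_PROFILES = {
    "cpu": {"requests": {"cpu": "8", "memory": "32Gi"}},
    "gaudi": {"limits": {"habana.ai/gaudi": "1"}},  # assumed resource key
}
CACHE_PROFILES = {
    "shared-fs": {"storageClassName": "example-storage-class"},
}
REPLICA_SWEEP = (1, 2, 4, 8)  # replica counts named in the plan


@dataclass(frozen=True)
class ModelProfile:
    name: str
    url: str
    resource_profile: str  # "<profile-name>:<count>", e.g. "gaudi:1"
    cache_profile: str


def validate(model: ModelProfile) -> list[str]:
    """Return one error string per profile reference that does not resolve."""
    errors = []
    profile_name = model.resource_profile.split(":", 1)[0]
    if profile_name not in RESOURCE_PROFILES:
        errors.append(f"{model.name}: unknown resource profile {profile_name!r}")
    if model.cache_profile not in CACHE_PROFILES:
        errors.append(f"{model.name}: unknown cache profile {model.cache_profile!r}")
    return errors


def manifests_for_sweep(model: ModelProfile) -> list[dict]:
    """Build one manifest dict per replica count, pinning min == max replicas."""
    return [
        {
            "apiVersion": "kubeai.org/v1",  # assumed API group/version
            "kind": "Model",
            "metadata": {"name": f"{model.name}-r{n}"},
            "spec": {
                "url": model.url,
                "resourceProfile": model.resource_profile,
                "cacheProfile": model.cache_profile,
                "minReplicas": n,
                "maxReplicas": n,
            },
        }
        for n in REPLICA_SWEEP
    ]


# Hypothetical model entry used only to exercise the helpers above.
example = ModelProfile(
    name="example-model",
    url="hf://example-org/example-model",
    resource_profile="gaudi:1",
    cache_profile="shared-fs",
)
problems = validate(example)
sweep = manifests_for_sweep(example)
```

Pinning `minReplicas` and `maxReplicas` to the same value keeps autoscaling out of the picture while each replica count is benchmarked; the auto scaling item in the plan would then relax that constraint.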
