Conversation

@PeterYang12
Collaborator

Description

Add 4 LLM model CRs for KubeAI on the Gaudi platform.

Issues

N/A

Dependencies

N/A

Collaborator

@eero-t eero-t left a comment

Approved, looks fine to me.

@eero-t
Collaborator

eero-t commented Jun 3, 2025

@poussa After profiles have been merged for all the models we want to support in the 1.4 release, I think a model/profile matrix table like this one should also be added to the README:
https://github.com/opea-project/Enterprise-Inference/blob/main/docs/supported-models.md

IMHO it should also list how many devices each model (profile) needs, and maybe the currently specified min/max replicas, so that one knows how many fit into a given cluster if they're deployed as-is.
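
To illustrate the kind of sizing question such a table would answer, here is a minimal sketch of the arithmetic. The profile names, device counts, and replica limits are hypothetical placeholders, not values from this PR, and the check only compares totals; it does not model per-node placement.

```python
# Hypothetical profile data: names and numbers are illustrative only,
# they are NOT taken from the model CRs added in this PR.
profiles = {
    "model-a": {"devices_per_replica": 1, "min_replicas": 1, "max_replicas": 3},
    "model-b": {"devices_per_replica": 8, "min_replicas": 0, "max_replicas": 1},
}


def devices_needed(profile, replicas_key):
    """Devices one profile consumes at the given replica setting."""
    return profile["devices_per_replica"] * profile[replicas_key]


def fits(profiles, cluster_devices, replicas_key="max_replicas"):
    """Return (fits, total) for deploying every profile as-is.

    This compares total device demand with total cluster devices.
    It ignores scheduling constraints such as a replica needing all
    of its devices on a single node.
    """
    total = sum(devices_needed(p, replicas_key) for p in profiles.values())
    return total <= cluster_devices, total


if __name__ == "__main__":
    print(fits(profiles, cluster_devices=8))                  # at max replicas
    print(fits(profiles, cluster_devices=8, replicas_key="min_replicas"))
```

Listing devices per replica next to min/max replicas in the table would let readers do this calculation by hand for their own cluster.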

@joshuayao joshuayao linked an issue Jun 4, 2025 that may be closed by this pull request
2 tasks
@poussa poussa merged commit f3d3d1c into opea-project:main Jun 4, 2025
6 checks passed

Development

Successfully merging this pull request may close these issues.

[Feature][KubeAI] Creating model profiles for 10 LLMs on Gaudi and Xeon

3 participants