[KubeAI][Models] Added 2 new model files and updated model parameters #1150

vrantala · 2025-07-04T07:11:37Z

Description

Added model YAML files for mistral-7b-instruct-v0.3 and mixtral-8x7b-instruct-v0.1. Tuned the model parameters based on benchmark-serving tests run with the default parameters: request-rate=800, input/output tokens=200/200.

Issues

n/a.

Type of change

[x] New feature (non-breaking change which adds new functionality)

Dependencies

n/a.

Tests

Ran KubeAI's benchmark-serving tests against all models.

* Added mistral and mixtral model yaml files
* Updated model parameters to be more optimal, based on benchmark-serving testing with default parameters: request-rate=800, Input/Output tokens=200/200

Signed-off-by: Rantala <[email protected]>

kubeai/models/deepseek-r1-distill-llama-70b-gaudi.yaml

kubeai/models/deepseek-r1-distill-llama-8b-gaudi.yaml

poussa · 2025-07-04T07:55:53Z

Also, add a short README.md to the models directory stating that the vLLM benchmark serving script was used to find the parameters for the models, with the arguments request-rate=800, input/output tokens=200/200, concurrency = xxx.
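To illustrate the benchmark settings named in this comment, here is a minimal sketch that builds an argument list for vLLM's `benchmark_serving.py` script. The flag names are the ones I know from the upstream script and should be checked against the vLLM version in use. The script path, the use of the random dataset to get 200/200 input/output tokens, and the helper name `build_benchmark_args` are assumptions. The concurrency value is left to the caller because the thread does not state it.

```python
from typing import List


def build_benchmark_args(model: str, base_url: str, max_concurrency: int) -> List[str]:
    """Build a command line for vLLM's benchmark_serving.py (hypothetical helper).

    request-rate=800 and 200/200 input/output tokens come from the PR text;
    mapping the token counts to the random dataset flags is an assumption.
    """
    return [
        "python", "benchmark_serving.py",  # script location is an assumption
        "--backend", "vllm",
        "--base-url", base_url,
        "--model", model,
        "--dataset-name", "random",
        "--random-input-len", "200",
        "--random-output-len", "200",
        "--request-rate", "800",
        # Not stated in the thread ("concurrency = xxx"), so the caller supplies it.
        "--max-concurrency", str(max_concurrency),
    ]
```

The returned list can be passed to `subprocess.run(...)` once a served model endpoint is available.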

kubeai/models/llama-3.1-8b-instruct-gaudi.yaml

kubeai/models/llama-3.3-70b-instruct-gaudi.yaml

kubeai/models/mistral-7b-instruct-v0.3-gaudi.yaml

Signed-off-by: Rantala <[email protected]>

vrantala · 2025-07-04T08:45:26Z

> Also, add a short README.md to the models directory stating that the vLLM benchmark serving script was used to find the parameters for the models, with the arguments request-rate=800, input/output tokens=200/200, concurrency = xxx.

Added README.md file

eero-t

A few minor changes are still needed.

kubeai/models/README.md

kubeai/models/qwen2.5-7b-instruct-gaudi.yaml

eero-t · 2025-07-04T09:27:08Z

CI also seems to require adding README to some "toctree":
kubeai/models/README.md: WARNING: document isn't included in any toctree

@poussa Any idea what / where that is?

poussa · 2025-07-04T10:17:57Z

> CI also seems to require adding README to some "toctree": kubeai/models/README.md: WARNING: document isn't included in any toctree

When you add a README file, you need to add a reference to it from some other README. In this case, just add a link in kubeai/README.md, for example:

The following [models](models/README.md) are included.
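As a rough local approximation of this advice, the sketch below checks whether a Markdown file contains a relative link that resolves to a given target. It is not the Sphinx toctree check that CI runs; the helper name `links_to` and the simplified link regex are assumptions made for illustration.

```python
import re
from pathlib import Path

# Simplified pattern for inline Markdown links: [text](href). Anchors are ignored.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+)")


def links_to(index_text: str, index_dir: Path, target: Path) -> bool:
    """Return True if any relative Markdown link in index_text resolves to target."""
    for href in LINK_RE.findall(index_text):
        if "://" in href:
            continue  # skip absolute URLs
        if (index_dir / href).resolve() == target.resolve():
            return True
    return False
```

For example, `links_to("The following [models](models/README.md) are included.", Path("kubeai"), Path("kubeai/models/README.md"))` evaluates to `True`.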

Signed-off-by: Rantala <[email protected]>

for more information, see https://pre-commit.ci

vrantala · 2025-07-04T10:44:10Z

> > CI also seems to require adding README to some "toctree": kubeai/models/README.md: WARNING: document isn't included in any toctree
>
> When you add a README file, you need to add a reference to it from some other README. In this case, just add a link in kubeai/README.md, for example:
>
> The following [models](models/README.md) are included.

Added link to kubeai/models/README.md

eero-t

Approved.

The main README update did not fix the toctree CI check though, so apparently the README also needs to be linked from somewhere else?

eero-t · 2025-07-04T11:10:05Z

I think this can be merged despite the toctree error. The CI test should tell where the link should be added...

eero-t · 2025-07-04T12:12:57Z

kubeai/models/qwen2.5-7b-instruct-gaudi.yaml

+  maxReplicas: 8
+  # Equals to max-num-seqs (batch-size)
+  targetRequests: 512
+  resourceProfile: gaudi-for-text-generation::1


Missed the double-colon typo in resourceProfile (`::1` instead of `:1`)...
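A typo like this could be caught before review with a small format check. The sketch below assumes the `resourceProfile` value has the form `<profile-name>:<count>`; the allowed name characters and the helper name `is_valid_resource_profile` are assumptions, not taken from KubeAI's own validation.

```python
import re

# Assumed format: a profile name, exactly one colon, then a positive integer count.
PROFILE_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*:[1-9][0-9]*")


def is_valid_resource_profile(value: str) -> bool:
    """Return True if value looks like '<profile-name>:<count>'."""
    return PROFILE_RE.fullmatch(value) is not None
```

With this check, `gaudi-for-text-generation:1` passes and `gaudi-for-text-generation::1` is rejected.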

chensuyue · 2025-07-07T01:35:26Z

This PR was merged without passing CI, which will block the IO build CI in all other PRs: https://github.com/opea-project/GenAIExamples/actions/runs/16105452982/job/45440218033?pr=1922

vrantala requested review from mkbhanda and poussa as code owners July 4, 2025 07:11

Added 2 new model files and updated model parameters

0802ae8

* Added mistral and mixtral model yaml files
* Updated model parameters to be more optimal, based on benchmark-serving testing with default parameters: request-rate=800, Input/Output tokens=200/200

Signed-off-by: Rantala <[email protected]>

vrantala force-pushed the main branch from d968d7e to 0802ae8 Compare July 4, 2025 07:33

poussa requested a review from eero-t July 4, 2025 07:44

poussa requested changes Jul 4, 2025

kubeai/models/deepseek-r1-distill-llama-70b-gaudi.yaml (outdated)

kubeai/models/deepseek-r1-distill-llama-70b-gaudi.yaml (outdated)

kubeai/models/deepseek-r1-distill-llama-8b-gaudi.yaml (outdated)

eero-t requested changes Jul 4, 2025

kubeai/models/llama-3.1-8b-instruct-gaudi.yaml

kubeai/models/llama-3.3-70b-instruct-gaudi.yaml (outdated)

kubeai/models/mistral-7b-instruct-v0.3-gaudi.yaml

Added README.md file and updated model parameters

ce8ce09

Signed-off-by: Rantala <[email protected]>

vrantala force-pushed the main branch from 0758b20 to ce8ce09 Compare July 4, 2025 08:42

vrantala requested review from eero-t and poussa July 4, 2025 08:46

eero-t requested changes Jul 4, 2025

kubeai/models/README.md (outdated)

kubeai/models/README.md

kubeai/models/qwen2.5-7b-instruct-gaudi.yaml (outdated)

Fixed README link and updated model files according to review comments

71473ce

Signed-off-by: Rantala <[email protected]>

vrantala force-pushed the main branch from 48493ed to 71473ce Compare July 4, 2025 10:40

[pre-commit.ci] auto fixes from pre-commit.com hooks

be29b46

for more information, see https://pre-commit.ci

poussa approved these changes Jul 4, 2025

eero-t approved these changes Jul 4, 2025

poussa merged commit 4e70d8b into opea-project:main Jul 4, 2025
6 of 7 checks passed

eero-t reviewed Jul 4, 2025

chensuyue mentioned this pull request Jul 7, 2025

Fix IO doc build index opea-project/docs#390

Merged

eero-t mentioned this pull request Aug 13, 2025

[Feature][KubeAI] Providing resource profiles for Gaudi and Xeon #1075

Closed
