ChatQnA - Adding files for deploy application on ROCm vLLM and ROCm TGI with Helm #949

chyundunovDatamonsters · 2025-04-05T03:51:24Z

Description

Adds the files needed to deploy the ChatQnA application on ROCm vLLM and ROCm TGI with Helm.

Issues

Type of change

- [x] New feature (non-breaking change which adds new functionality)

Dependencies

Tests

lianhao

Please also update valuesfiles.yaml, which is used to sync the Helm values files from GenAIInfra to GenAIExamples.

helm-charts/chatqna/rocm-values.yaml

lianhao · 2025-04-07T01:06:40Z

@chensuyue @yongfengdu I think CI for AMD ROCm should be added in GenAIInfra too. Do you know how to do that?

chensuyue · 2025-04-10T03:21:20Z

> @chensuyue @yongfengdu I think CI for AMD ROCm should be added in GenAIInfra too. Do you know how to do that?

Yes, we need it. I will talk with the AMD team about adding the test machine to OPEA CI. We also need to modify the CI workflow to accommodate the ROCm tests.

yongfengdu · 2025-04-25T01:30:48Z

Could you rebase the PR onto the latest changes and address the comments?
Unless there is a special reason to keep it, please remove the "tag: cpu-1.5" line so that the default defined in values.yaml (cpu-1.6) is used.
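
The effect of dropping the override can be illustrated with a simplified model of how a values file is layered over chart defaults. This is only a sketch: the `merge_values` helper, the `image.tag` key layout, and the `example/tei` repository name are assumptions made for illustration, not the actual chart structure, and Helm's real merge logic has additional rules.

```python
from copy import deepcopy


def merge_values(defaults: dict, overrides: dict) -> dict:
    """Recursively layer overrides on top of defaults (simplified model)."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested maps are merged key by key.
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars (and anything else) replace the default outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stand-in for the chart's values.yaml defaults.
CHART_DEFAULTS = {"image": {"repository": "example/tei", "tag": "cpu-1.6"}}

# Values file that still pins the older tag.
pinned = merge_values(CHART_DEFAULTS, {"image": {"tag": "cpu-1.5"}})

# Same values file with the tag line removed.
unpinned = merge_values(CHART_DEFAULTS, {"image": {}})

print(pinned["image"]["tag"])    # cpu-1.5
print(unpinned["image"]["tag"])  # cpu-1.6
```

With the pin removed, the values file no longer needs to be touched when the chart default is bumped.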

lianhao

Besides the embedded comment, please also fix the rebase conflict.

helm-charts/chatqna/faqgen-rocm-tgi-values.yaml

helm-charts/chatqna/faqgen-rocm-values.yaml

helm-charts/chatqna/rocm-tgi-values.yaml

helm-charts/chatqna/rocm-values.yaml

lianhao · 2025-04-25T05:21:01Z

@chyundunovDatamonsters please do a manual rebase locally and fix the following conflict:

Also, your manual local rebase should fix the CI failure too.

helm-charts/chatqna/faqgen-rocm-tgi-values.yaml

lianhao · 2025-04-28T05:55:05Z

It seems the k8s-rocm K8s cluster has some issues: all running pods are automatically stopped and killed. @chyundunovDatamonsters please check the K8s cluster to make sure there is no node-level resource pressure (e.g., CPU, memory, or disk).
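
One way to check for node-level pressure is to inspect the standard node conditions (`MemoryPressure`, `DiskPressure`, `PIDPressure`) in the output of `kubectl get nodes -o json`. The sketch below assumes that JSON has already been loaded into a dict; the `nodes_under_pressure` helper and the node names in the sample are hypothetical. CPU saturation has no corresponding node condition and would need a separate check.

```python
PRESSURE_TYPES = {"MemoryPressure", "DiskPressure", "PIDPressure"}


def nodes_under_pressure(node_list: dict) -> dict:
    """Map node name -> sorted pressure conditions whose status is "True"."""
    report = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        conditions = node.get("status", {}).get("conditions", [])
        active = sorted(
            cond["type"]
            for cond in conditions
            if cond.get("type") in PRESSURE_TYPES and cond.get("status") == "True"
        )
        if active:
            report[name] = active
    return report


# Made-up sample shaped like `kubectl get nodes -o json` output.
SAMPLE_NODES = {
    "items": [
        {
            "metadata": {"name": "rocm-node-1"},
            "status": {"conditions": [
                {"type": "MemoryPressure", "status": "False"},
                {"type": "DiskPressure", "status": "False"},
                {"type": "Ready", "status": "True"},
            ]},
        },
        {
            "metadata": {"name": "rocm-node-2"},
            "status": {"conditions": [
                {"type": "MemoryPressure", "status": "True"},
                {"type": "DiskPressure", "status": "True"},
                {"type": "PIDPressure", "status": "False"},
                {"type": "Ready", "status": "True"},
            ]},
        },
    ]
}

print(nodes_under_pressure(SAMPLE_NODES))
# {'rocm-node-2': ['DiskPressure', 'MemoryPressure']}
```

Against a live cluster, the same function could be fed `json.load(sys.stdin)` with the `kubectl` output piped in.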

lianhao

Please fix the following yaml syntax error:

Error: parse error at (tgi/templates/deployment.yaml:101): unexpected "/" in operand
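
This message comes from Helm's Go template parser rather than from the YAML itself, and it is consistent with a `/` appearing inside a `{{ ... }}` action without being quoted. The thread does not show the offending line, so the following is only a heuristic pre-check: `find_bare_slashes` and the sample template are hypothetical, and the regex-based scan does not replace running `helm lint` or `helm template`.

```python
import re

# A template action, with optional "-" whitespace-trim markers.
ACTION = re.compile(r"\{\{-?\s*(.*?)\s*-?\}\}", re.DOTALL)
# Double-quoted, raw (backtick), and character literals.
QUOTED = re.compile(r'"(?:\\.|[^"\\])*"|`[^`]*`|\'(?:\\.|[^\'\\])*\'')
# An action whose whole body is a /* ... */ comment.
COMMENT = re.compile(r"^/\*.*\*/$", re.DOTALL)


def find_bare_slashes(template_text: str) -> list:
    """Return (line_number, action) pairs where "/" appears unquoted in an action."""
    hits = []
    for match in ACTION.finditer(template_text):
        body = match.group(1)
        if COMMENT.match(body):
            continue  # comments may contain slashes
        if "/" in QUOTED.sub("", body):
            line = template_text.count("\n", 0, match.start()) + 1
            hits.append((line, match.group(0)))
    return hits


# Made-up template fragment; the keys and paths are illustrative only.
SAMPLE = """\
volumes:
  - name: kfd
    hostPath:
      path: {{ default /dev/kfd .Values.device }}
  - name: dri
    hostPath:
      path: {{ "/dev/dri" }}
{{- /* comments may contain / safely */}}
"""

print(find_bare_slashes(SAMPLE))
# [(4, '{{ default /dev/kfd .Values.device }}')]
```

Quoting the path inside the action, as in the second `hostPath` entry of the sample, is the usual way to avoid this class of parse error.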

helm-charts/common/tgi/templates/deployment.yaml

helm-charts/common/tgi/rocm-values.yaml

chyundunovDatamonsters · 2025-05-15T13:02:11Z

The Gaudi tests fail. Please pay attention to this problem.

helm-charts/common/tgi/templates/deployment.yaml

lianhao · 2025-05-16T01:43:11Z

> The Gaudi tests fail. Please pay attention to this problem.

@chyundunovDatamonsters The Gaudi test environment should be fine now.

Please pay attention to my embedded comment above. Thanks!

chyundunovDatamonsters · 2025-05-16T04:28:06Z

> @chyundunovDatamonsters please do a manual rebase locally and fix the following conflict:
> Also, your manual local rebase should fix the CI failure too.

Fixed

lianhao

All seems ok except for my last unresolved comment. @chyundunovDatamonsters

helm-charts/common/tgi/values.yaml

ChatQnA - Adding files for deploy application on ROCm vLLM and ROCm T…

cb9687d

…GI with Helm Signed-off-by: Chingis Yundunov <[email protected]>

chyundunovDatamonsters requested review from lianhao and yongfengdu as code owners April 5, 2025 03:51

lianhao requested changes Apr 7, 2025

helm-charts/chatqna/rocm-values.yaml (resolved)

helm-charts/chatqna/rocm-values.yaml (resolved)

chensuyue added this to the v1.4 milestone Apr 10, 2025

Adapting ChatQnA applications for deployment in the K8S environment u…

20db4a2

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

lianhao requested changes Apr 25, 2025

lianhao added the rocm label Apr 25, 2025

Adapting ChatQnA applications for deployment in the K8S environment u…

808788e

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

chyundunovDatamonsters added 2 commits April 25, 2025 12:28

Adapting ChatQnA applications for deployment in the K8S environment u…

efe2356

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

Merge branch 'main' of https://github.com/opea-project/GenAIInfra int…

4db2445

…o feature/ChatQnA_k8s # Conflicts: # helm-charts/chatqna/README.md

lianhao requested changes Apr 25, 2025

helm-charts/chatqna/faqgen-rocm-tgi-values.yaml (resolved)

Adapting ChatQnA applications for deployment in the K8S environment u…

39f730e

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

chyundunovDatamonsters added 3 commits May 6, 2025 10:50

Merge branch 'main' of https://github.com/opea-project/GenAIInfra int…

9511da4

…o feature/ChatQnA_k8s

Adapting ChatQnA applications for deployment in the K8S environment u…

92d02d2

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

Adapting ChatQnA applications for deployment in the K8S environment u…

180f16f

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

lianhao requested changes May 8, 2025

lianhao reviewed May 8, 2025

helm-charts/common/tgi/templates/deployment.yaml (outdated, resolved)

lianhao mentioned this pull request May 9, 2025

ChatQnA - Adding files to deploy an application in the K8S environment using Helm opea-project/GenAIExamples#1759

Closed

chensuyue reviewed May 9, 2025

helm-charts/common/tgi/rocm-values.yaml (resolved)

chensuyue mentioned this pull request May 9, 2025

Adding a Dockerfile to build a TGI ROCm image with an unprivileged user in a container opea-project/GenAIComps#1638

Closed

chyundunovDatamonsters and others added 4 commits May 13, 2025 16:05

Adapting ChatQnA applications for deployment in the K8S environment u…

1298c18

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

Merge branch 'main' into feature/ChatQnA_k8s

70e2f6d

ChatQnA - Adding files for deploy application on ROCm vLLM and ROCm T…

76d47c2

…GI with Helm Signed-off-by: Chingis Yundunov <[email protected]>

Adapting ChatQnA applications for deployment in the K8S environment u…

7a6380d

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

chyundunovDatamonsters and others added 10 commits May 15, 2025 17:56

Adapting ChatQnA applications for deployment in the K8S environment u…

5055b00

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

Adapting ChatQnA applications for deployment in the K8S environment u…

5e3d4e8

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

Adapting ChatQnA applications for deployment in the K8S environment u…

7876371

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

Adapting ChatQnA applications for deployment in the K8S environment u…

2bf5991

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

Adapting ChatQnA applications for deployment in the K8S environment u…

efb97b4

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

Adapting ChatQnA applications for deployment in the K8S environment u…

d112b67

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

Merge branch 'main' into feature/ChatQnA_k8s

73ed118

Adapting ChatQnA applications for deployment in the K8S environment u…

9dc514e

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

Merge remote-tracking branch 'origin/feature/ChatQnA_k8s' into featur…

14a4e7a

…e/ChatQnA_k8s

[pre-commit.ci] auto fixes from pre-commit.com hooks

99fdebc

for more information, see https://pre-commit.ci

chyundunovDatamonsters requested a review from lianhao May 15, 2025 13:01

Adapting ChatQnA applications for deployment in the K8S environment u…

ed4d6a6

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

lianhao requested changes May 16, 2025

helm-charts/common/tgi/templates/deployment.yaml (resolved)

Adapting ChatQnA applications for deployment in the K8S environment u…

bce2798

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

lianhao requested changes May 16, 2025

helm-charts/common/tgi/values.yaml (outdated, resolved)

chensuyue approved these changes May 16, 2025

Adapting ChatQnA applications for deployment in the K8S environment u…

52bc8a7

…sing AMD GPU using Helm Signed-off-by: Chingis Yundunov <[email protected]>

lianhao approved these changes May 16, 2025

lianhao merged commit 4fdc4bb into opea-project:main May 16, 2025
29 checks passed

eero-t added a commit to eero-t/GenAIInfra that referenced this pull request May 16, 2025

Fix README bug introdoced by opea-project#949

8bc8afe

Signed-off-by: Eero Tamminen <[email protected]>

eero-t mentioned this pull request May 16, 2025

Fix README bug introdoced by #949 #1053

Merged

1 task

yongfengdu pushed a commit that referenced this pull request May 19, 2025

Fix README bug introdoced by #949 (#1053)

3b384d2

Signed-off-by: Eero Tamminen <[email protected]>

eero-t mentioned this pull request May 21, 2025

Fix ChatQnA README regression #1064

Merged

1 task

poussa pushed a commit that referenced this pull request May 22, 2025

Fix ChatQnA README regression (#1064)

31d0201

Introduced also by #949, and update first README clause to indicate that some of the subservices are conditional. Signed-off-by: Eero Tamminen <[email protected]>

eero-t mentioned this pull request Jun 27, 2025

tei: upgrade to version cpu-1.7 #1134

Closed

1 task




4 participants
