@chyundunovDatamonsters
Contributor

Description

This PR adds a Dockerfile for building a TGI ROCm image that runs as an unprivileged user inside the container. This is needed to follow deployment best practices on Kubernetes (K8s).
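
For readers who want a concrete picture, here is a minimal sketch of what such a Dockerfile could look like. It is not the Dockerfile from this PR. The base image tag, the user name `user`, the build arguments `APP_UID` and `APP_GID`, and the `/data` cache directory are all assumptions chosen for illustration.

```dockerfile
# Illustrative sketch only; not the Dockerfile added by this PR.
# BASE_IMAGE is a placeholder: substitute the TGI ROCm tag you actually use.
ARG BASE_IMAGE=ghcr.io/huggingface/text-generation-inference:latest-rocm
FROM ${BASE_IMAGE}

# Hypothetical IDs for the unprivileged account.
ARG APP_UID=1000
ARG APP_GID=1000

USER root

# Create the account and add it to the groups that usually own the
# ROCm device nodes (/dev/kfd and /dev/dri). `groupadd -f` succeeds
# even if the group already exists in the base image.
RUN groupadd --gid ${APP_GID} user \
 && useradd --uid ${APP_UID} --gid ${APP_GID} --create-home --shell /bin/bash user \
 && groupadd -f video \
 && groupadd -f render \
 && usermod -aG video,render user \
 && mkdir -p /data \
 && chown user:user /data

# A numeric USER lets Kubernetes verify `runAsNonRoot: true`
# without an explicit `runAsUser` in the pod spec.
USER ${APP_UID}:${APP_GID}
```

Group membership inside the image only helps if the group IDs match the owners of the device nodes on the host, so a real deployment may also need supplemental groups set at run time.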

Issues

Type of change

  • [*] New feature (non-breaking change which adds new functionality)

Dependencies

Tests

chensuyue and others added 25 commits April 29, 2025 20:19
Build and upstream latest base image on push event

opea-project#1314
Signed-off-by: chensuyue <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
…ea-project#1329)

* Add timeout param for DocSum and FaqGen to deal with long context

Make timeout param configurable, solve issue opea-project/GenAIExamples#1481

Signed-off-by: Xinyao Wang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Xinyao Wang <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Chingis Yundunov <[email protected]>
Signed-off-by: chensuyue <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
- Fix permission issue for when ingesting pptx file with embedded image
- Add more test coverage to the dataprep CI and unify common dataprep CI test code for DB backends: qdrant, milvus, redis, pgvector

Signed-off-by: Lianhao Lu <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
Signed-off-by: Raghava, Sharath <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
Signed-off-by: jeanyu-habana <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
* Add Dockerfile for build ROCm vLLM Docker image

Signed-off-by: Chingis Yundunov <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
* filter none test scripts in test matrix

Signed-off-by: chensuyue <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
…opea-project#1377)

* [Bug: 1375] Fix Readme errors in dataprep component for all VectorDBs

Fixes opea-project#1375
Signed-off-by: Piroozan, Nariman <[email protected]>
Signed-off-by:  Ghosh, Soumyadip <[email protected]>
Signed-off-by:  Jaini, Pallavi <[email protected]>
Signed-off-by: Kavulya, Soila <[email protected]>
Signed-off-by: Shifani Rajabose <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Shifani Rajabose <[email protected]>

* Improve dataprep CI and fix pptx file ingesting bug (opea-project#1334)

- Fix permission issue for when ingesting pptx file with embedded image
- Add more test coverage to the dataprep CI and unify common dataprep CI test code for DB backends: qdrant, milvus, redis, pgvector

Signed-off-by: Lianhao Lu <[email protected]>
Signed-off-by: Shifani Rajabose <[email protected]>

* Fix docker compose command in embedding BridgeTower readme (opea-project#1374)

Signed-off-by: Dina Suehiro Jones <[email protected]>
Signed-off-by: Shifani Rajabose <[email protected]>

* Changes to checkin text2graph microservice (opea-project#1357)

Signed-off-by: Raghava, Sharath <[email protected]>
Signed-off-by: Shifani Rajabose <[email protected]>

* [Bug: 1375] Fix Readme errors in dataprep component for all VectorDBs

Fixes opea-project#1375
Signed-off-by: Piroozan, Nariman <[email protected]>
Signed-off-by:  Ghosh, Soumyadip <[email protected]>
Signed-off-by:  Jaini, Pallavi <[email protected]>
Signed-off-by: Kavulya, Soila <[email protected]>
Signed-off-by: Shifani Rajabose <[email protected]>

---------

Signed-off-by: Shifani Rajabose <[email protected]>
Signed-off-by: Lianhao Lu <[email protected]>
Signed-off-by: Dina Suehiro Jones <[email protected]>
Signed-off-by: Raghava, Sharath <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Lianhao Lu <[email protected]>
Co-authored-by: Dina Suehiro Jones <[email protected]>
Co-authored-by: intelsharath <[email protected]>
Co-authored-by: Liang Lv <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
Signed-off-by: Jonathan Minkin <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
Add aiofiles in requirements.txt of retriever, which is caused by cross-component function call of retriever neo4j.

Signed-off-by: letonghan <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
…pea-project#1251)

* try to leverage existed env variable instead of introducing new one

Signed-off-by: Tsai, Louie <[email protected]>

* remove ENABLE_OPEA_TELEMETRY getenv

Signed-off-by: Tsai, Louie <[email protected]>

---------

Signed-off-by: Tsai, Louie <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
* vLLM lvm integration

- integrate vLLM LVMs and set vLLM as default
- use OpenAI chat completions and cover single-image/text-only cases

Signed-off-by: Chingis Yundunov <[email protected]>
opea-project#1380)

* [Bug: 1378] Added Multimodal support for Milvus for dataprep component

Fixes opea-project#1378
Co-authored-by: Jaini, Pallavi <[email protected]>
Signed-off-by:  Ghosh, Soumyadip <[email protected]>
Signed-off-by:  Piroozan, Nariman <[email protected]>
Signed-off-by: Kavulya, Soila <[email protected]>
Signed-off-by: Rajabose, Shifani <[email protected]>
Signed-off-by: Shifani Rajabose <[email protected]>

---------

Signed-off-by: Shifani Rajabose <[email protected]>
Signed-off-by: pallavi.jaini <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: pallavi.jaini <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
Signed-off-by: Zhu, Yongbo <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
Signed-off-by: chensuyue <[email protected]>
Signed-off-by: Chingis Yundunov <[email protected]>
* Add LLaMA Vision OH optimization

- Add mllama OH optimization
- Fix LVM README and fix steps
- add UT
- fix 422 request body issue from wrapper to dependency
- add LOGFLAG
- add healthcheck
- Upgrade HPU driver version
- Correct compose file mllama names

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <[email protected]>
@chyundunovDatamonsters
Contributor Author

Please review and approve these changes; they are blocking further changes to GenAIInfra and GenAIExamples. Thanks!

@chensuyue
Collaborator

The TGI Helm chart uses this method to run as non-root by default: https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56. Do we need this PR specifically for tgi-rocm?

@chensuyue
Collaborator

@lianhao @yongfengdu any comments?

@chyundunovDatamonsters
Contributor Author

> The TGI Helm chart uses this method to run as non-root by default: https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56. Do we need this PR specifically for tgi-rocm?

I've tried this approach, but I'll try again, maybe I did something wrong.

@chyundunovDatamonsters
Contributor Author

> The TGI Helm chart uses this method to run as non-root by default: https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56. Do we need this PR specifically for tgi-rocm?

I tried it. This solution does not work on ROCm images.

@chyundunovDatamonsters
Contributor Author

Please make these changes

@chensuyue
Collaborator

> Please make these changes

OK, then would you also update the images in compose.yaml to align with the Helm chart values? Or is this image used for Helm charts only?

@yongfengdu
Collaborator

Thanks. Could you paste the error you encountered when you tried to run ROCm as a non-root user?
This will let every reviewer know why we have to make this change.

> The TGI Helm chart uses this method to run as non-root by default: https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56. Do we need this PR specifically for tgi-rocm?

> I tried it. This solution does not work on ROCm images.

@chyundunovDatamonsters
Contributor Author

> Please make these changes

> OK, then would you also update the images in compose.yaml to align with the Helm chart values? Or is this image used for Helm charts only?

The changes have been tested with both Compose and Helm. For now, however, they are planned for use only in Helm.

@chyundunovDatamonsters
Contributor Author

> Thanks. Could you paste the error you encountered when you tried to run ROCm as a non-root user? This will let every reviewer know why we have to make this change.

> The TGI Helm chart uses this method to run as non-root by default: https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56. Do we need this PR specifically for tgi-rocm?

> I tried it. This solution does not work on ROCm images.

I will open an issue in the TGI project and attach the link here.

@chensuyue
Collaborator

> Thanks. Could you paste the error you encountered when you tried to run ROCm as a non-root user? This will let every reviewer know why we have to make this change.

> The TGI Helm chart uses this method to run as non-root by default: https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56. Do we need this PR specifically for tgi-rocm?

> I tried it. This solution does not work on ROCm images.

> I will open an issue in the TGI project and attach the link here.

Have you created the issue? Please share the link.

@chensuyue
Collaborator

I am OK with this PR, but as a workaround, why not just keep using root? opea-project/GenAIInfra#949 (comment)

@chyundunovDatamonsters
Contributor Author

> Thanks. Could you paste the error you encountered when you tried to run ROCm as a non-root user? This will let every reviewer know why we have to make this change.

> The TGI Helm chart uses this method to run as non-root by default: https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56. Do we need this PR specifically for tgi-rocm?

> I tried it. This solution does not work on ROCm images.

> I will open an issue in the TGI project and attach the link here.

> Have you created the issue? Please share the link.

huggingface/text-generation-inference#3225
