@chyundunovDatamonsters
Contributor

Description

Adds a Dockerfile that builds a TGI ROCm image whose container runs as an unprivileged user. This is needed to follow Kubernetes (K8s) deployment best practices.
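
For readers unfamiliar with the approach, a minimal sketch of such a Dockerfile might look like the following. It is not the file added by this PR: the `BASE_IMAGE` build argument, the user name `user`, the UID/GID of 1000, and the `/data` cache path are all assumptions made for illustration.

```dockerfile
# Illustrative sketch only; not the Dockerfile from this PR.
# BASE_IMAGE is a hypothetical build argument: pass the TGI ROCm image you use.
ARG BASE_IMAGE
FROM ${BASE_IMAGE}

# Assumed numeric IDs; choose values that match runAsUser/runAsGroup in the pod spec.
# (The names avoid UID/GID, which some shells treat as read-only variables.)
ARG USER_UID=1000
ARG USER_GID=1000

# Create an unprivileged group and user with fixed numeric IDs.
# This step fails if the IDs are already taken in the base image.
RUN groupadd --gid ${USER_GID} user && \
    useradd --uid ${USER_UID} --gid ${USER_GID} --create-home --shell /bin/bash user

# Assumed model cache location; make it writable for the new user.
RUN mkdir -p /data && chown -R ${USER_UID}:${USER_GID} /data

# Switch to the unprivileged user. The numeric form lets Kubernetes
# verify runAsNonRoot without resolving a user name.
USER ${USER_UID}:${USER_GID}
```

The sketch would be built with `docker build --build-arg BASE_IMAGE=<your TGI ROCm image> .`. On ROCm hosts the GPU device nodes (`/dev/kfd`, `/dev/dri`) are typically owned by the `video` and `render` groups, so the user may also need matching supplementary groups; that part is deployment-specific and omitted here.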

Issues

Type of change

  • [*] New feature (non-breaking change which adds new functionality)

Dependencies

Tests

chensuyue and others added 25 commits April 29, 2025 20:19
Build and upstream latest base image on push event

opea-project#1314
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
…ea-project#1329)

* Add timeout param for DocSum and FaqGen to deal with long context

Make timeout param configurable, solve issue opea-project/GenAIExamples#1481

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
- Fix permission issue for when ingesting pptx file with embedded image
- Add more test coverage to the dataprep CI and unify common dataprep CI test code for DB backends: qdrant, milvus, redis, pgvector

Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
…ect#1374)

Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Signed-off-by: Raghava, Sharath <sharath.raghava@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Signed-off-by: jeanyu-habana <jean1.yu@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
* Add Dockerfile for build ROCm vLLM Docker image

Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
* filter none test scripts in test matrix

Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
…ge (opea-project#1376)

opea-project/GenAIExamples#1436
---------
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
…opea-project#1377)

* [Bug: 1375] Fix Readme errors in dataprep component for all VectorDBs

Fixes opea-project#1375
Signed-off-by: Piroozan, Nariman <nariman.piroozan@intel.com>
Signed-off-by:  Ghosh, Soumyadip <soumyadip.ghosh@intel.com>
Signed-off-by:  Jaini, Pallavi <pallavi.jaini@intel.com>
Signed-off-by: Kavulya, Soila <soila.kavulya@intel.com>
Signed-off-by: Shifani Rajabose <srajabose@habana.ai>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Shifani Rajabose <srajabose@habana.ai>

* Improve dataprep CI and fix pptx file ingesting bug (opea-project#1334)

- Fix permission issue for when ingesting pptx file with embedded image
- Add more test coverage to the dataprep CI and unify common dataprep CI test code for DB backends: qdrant, milvus, redis, pgvector

Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>
Signed-off-by: Shifani Rajabose <srajabose@habana.ai>

* Fix docker compose command in embedding BridgeTower readme (opea-project#1374)

Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
Signed-off-by: Shifani Rajabose <srajabose@habana.ai>

* Changes to checkin text2graph microservice (opea-project#1357)

Signed-off-by: Raghava, Sharath <sharath.raghava@intel.com>
Signed-off-by: Shifani Rajabose <srajabose@habana.ai>

* [Bug: 1375] Fix Readme errors in dataprep component for all VectorDBs

Fixes opea-project#1375
Signed-off-by: Piroozan, Nariman <nariman.piroozan@intel.com>
Signed-off-by:  Ghosh, Soumyadip <soumyadip.ghosh@intel.com>
Signed-off-by:  Jaini, Pallavi <pallavi.jaini@intel.com>
Signed-off-by: Kavulya, Soila <soila.kavulya@intel.com>
Signed-off-by: Shifani Rajabose <srajabose@habana.ai>

---------

Signed-off-by: Shifani Rajabose <srajabose@habana.ai>
Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>
Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
Signed-off-by: Raghava, Sharath <sharath.raghava@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Lianhao Lu <lianhao.lu@intel.com>
Co-authored-by: Dina Suehiro Jones <dina.s.jones@intel.com>
Co-authored-by: intelsharath <116231490+intelsharath@users.noreply.github.com>
Co-authored-by: Liang Lv <liang1.lv@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Signed-off-by: Lianhao Lu <lianhao.lu@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Signed-off-by: Jonathan Minkin <minkinj@amazon.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Add aiofiles in requirements.txt of retriever, which is caused by cross-component function call of retriever neo4j.

Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
…pea-project#1251)

* try to leverage existed env variable instead of introducing new one

Signed-off-by: Tsai, Louie <louie.tsai@intel.com>

* remove ENABLE_OPEA_TELEMETRY getenv

Signed-off-by: Tsai, Louie <louie.tsai@intel.com>

---------

Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
…ct#1394)

Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
…t#1391)

Signed-off-by: minmin-intel <minmin.hou@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
* vLLM lvm integration

- integrate vLLM LVMs and set vLLM as default
- use OpenAI chat completions and cover single-image/text-only cases

Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
opea-project#1380)

* [Bug: 1378] Added Multimodal support for Milvus for dataprep component

Fixes opea-project#1378
Co-authored-by: Jaini, Pallavi <pallavi.jaini@intel.com>
Signed-off-by:  Ghosh, Soumyadip <soumyadip.ghosh@intel.com>
Signed-off-by:  Piroozan, Nariman <nariman.piroozan@intel.com>
Signed-off-by: Kavulya, Soila <soila.kavulya@intel.com>
Signed-off-by: Rajabose, Shifani <shifani.rajabose@intel.com>
Signed-off-by: Shifani Rajabose <srajabose@habana.ai>

---------

Signed-off-by: Shifani Rajabose <srajabose@habana.ai>
Signed-off-by: pallavi.jaini <pallavi.jaini@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: pallavi.jaini <pallavi.jaini@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Signed-off-by: Zhu, Yongbo <yongbo.zhu@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
…roject#1440)

Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>
Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
* Add LLaMA Vision OH optimization

- Add mllama OH optimization
- Fix LVM README and fix steps
- add UT
- fix 422 request body issue from wrapper to dependency
- add LOGFLAG
- add healthcheck
- Upgrade HPU driver version
- Correct compose file mllama names

Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
…er in a container. This is necessary to ensure that the best deployment practices in K8S are followed.

Signed-off-by: Chingis Yundunov <c.yundunov@datamonsters.com>
@chyundunovDatamonsters
Contributor Author

Please review and confirm these changes; they are blocking further changes to GenAIInfra and GenAIExamples. Thanks!

@chensuyue
Collaborator

TGI in the Helm charts uses this method to run as non-root by default (https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56). Do we need this PR specifically for tgi-rocm?

@chensuyue
Collaborator

@lianhao @yongfengdu any comments?

@chyundunovDatamonsters
Contributor Author

> TGI in the Helm charts uses this method to run as non-root by default (https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56). Do we need this PR specifically for tgi-rocm?

I've tried this approach, but I'll try again; maybe I did something wrong.

@chyundunovDatamonsters
Contributor Author

> TGI in the Helm charts uses this method to run as non-root by default (https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56). Do we need this PR specifically for tgi-rocm?

I tried it. This solution does not work with ROCm images.

@chyundunovDatamonsters
Contributor Author

Please make these changes

@chensuyue
Collaborator

> Please make these changes

OK. Would you then also update the images in compose.yaml to align with the Helm chart values? Or is this image used for the Helm charts only?

@yongfengdu
Collaborator

Thanks. Could you paste the issue you encountered when you tried to run ROCm as a non-root user?
This will let every reviewer know why we have to make these changes.

> TGI in the Helm charts uses this method to run as non-root by default (https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56). Do we need this PR specifically for tgi-rocm?

> I tried it. This solution does not work with ROCm images.

@chyundunovDatamonsters
Contributor Author

> Please make these changes

> OK. Would you then also update the images in compose.yaml to align with the Helm chart values? Or is this image used for the Helm charts only?

The changes have been tested with both Compose and Helm. However, for now we plan to use them only in Helm.

@chyundunovDatamonsters
Contributor Author

> Thanks. Could you paste the issue you encountered when you tried to run ROCm as a non-root user? This will let every reviewer know why we have to make these changes.

> TGI in the Helm charts uses this method to run as non-root by default (https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56). Do we need this PR specifically for tgi-rocm?

> I tried it. This solution does not work with ROCm images.

I will create an issue in the TGI project and attach the link here.

@chensuyue
Collaborator

> Thanks. Could you paste the issue you encountered when you tried to run ROCm as a non-root user? This will let every reviewer know why we have to make these changes.

> TGI in the Helm charts uses this method to run as non-root by default (https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56). Do we need this PR specifically for tgi-rocm?

> I tried it. This solution does not work with ROCm images.

> I will create an issue in the TGI project and attach the link here.

Have you created the issue? Please let us know the link.

@chensuyue
Collaborator

I am OK with this PR, but as a workaround, why not just keep using root? opea-project/GenAIInfra#949 (comment)

@chyundunovDatamonsters
Contributor Author

> Thanks. Could you paste the issue you encountered when you tried to run ROCm as a non-root user? This will let every reviewer know why we have to make these changes.

> TGI in the Helm charts uses this method to run as non-root by default (https://github.com/opea-project/GenAIInfra/blob/1e20de1016dfeae9cbb1cc00476f43b3024f55aa/helm-charts/common/tgi/values.yaml#L56). Do we need this PR specifically for tgi-rocm?

> I tried it. This solution does not work with ROCm images.

> I will create an issue in the TGI project and attach the link here.

> Have you created the issue? Please let us know the link.

huggingface/text-generation-inference#3225
