Conversation

Contributor

@devpramod devpramod commented Apr 14, 2025

Description

This PR implements support for using external OpenAI-compatible LLM endpoints with ChatQnA, CodeGen, and DocSum Helm charts while maintaining backward compatibility with existing configurations.

Motivation
Users need the flexibility to connect to external LLM services (like OpenAI or self-hosted OpenAI-compatible endpoints) instead of always relying on the included LLM components. This adds versatility to our Helm charts without disrupting existing functionality.

Key Changes
For each Helm chart (ChatQnA, CodeGen, DocSum):

  1. Added external-llm-values.yaml configuration file (a sketch is shown after this list):
  • Provides settings for external LLM endpoints (LLM_SERVER_HOST_IP, LLM_MODEL, OPENAI_API_KEY)
  • Disables internal LLM services (tgi, vllm, llm-uservice) when using external endpoints
  2. Created templates/external-llm-configmap.yaml:
  • Manages environment variables required for external LLM integration
  • Conditionally created only when external LLM is enabled
  3. Updated templates/deployment.yaml:
  • Added conditional logic to use the ConfigMap for environment variables when external LLM is enabled
  • Maintained all existing functionality and conditional paths for internal LLM services
  4. Updated Chart.yaml:
  • Added proper conditions for all dependencies to make them optional
  • Enables selective component deployment based on configuration values
  5. Updated README.md:
  • Added example commands showing how to use external LLM endpoints
  • Maintained consistent placeholder values across all services
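
A minimal sketch of what such an external-llm-values.yaml could look like, based on the settings listed above (the externalLLM block and its key names follow the README examples quoted further down in this thread; the enabled flag, placeholder values, and exact layout are illustrative assumptions, not the merged file):

  # external-llm-values.yaml (illustrative sketch)
  externalLLM:
    enabled: true
    LLM_SERVER_HOST_IP: "http://your-llm-server"   # OpenAI-compatible endpoint
    LLM_MODEL: "your-model"                        # model served by that endpoint
    OPENAI_API_KEY: "your-api-key"                 # API key for the endpoint

  # Disable the bundled LLM services when the external endpoint is used
  llm-uservice:
    enabled: false
  vllm:
    enabled: false
  tgi:
    enabled: false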

Issues

Fixes #1015

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)

Dependencies

n/a

Tests

Tested backward compatibility to ensure that existing Helm charts with the default values.yaml are still working.

Collaborator

@lianhao lianhao left a comment

Two more things besides the embedded comment:

  1. Will there be a different image of opea/chatqna for the external LLM endpoint use case, or will the image be the same?
  2. In order to make CI happy when testing external-llm-values.yaml, we need a valid LLM_SERVER_HOST_IP and OPENAI_API_KEY in the CI environment. I believe you had similar CI requirements in the GenAIExamples repo when making changes to the chatqna mega gateway; we should follow the same convention here.

@devpramod
Contributor Author

@lianhao
Should I set the defaults for codegen and docsum values.yaml as well, i.e. .enabled?

Collaborator

lianhao commented Apr 16, 2025

@lianhao Should I set the defaults for codegen and docsum values.yaml as well, i.e. .enabled?

Yes. Also, maybe we can split this into separate PRs for chatqna, codegen, and docsum, in case the mega gateway PR in GenAIExamples is not merged at the same time?

Collaborator

lianhao commented Apr 17, 2025

@devpramod We need you to do us a favor. Could you please do a manual rebase of your branch before your next round of updates to this PR? Otherwise it will trigger lots of unnecessary test cases.

@lvliang-intel lvliang-intel marked this pull request as ready for review April 21, 2025 02:27
Collaborator

@lianhao lianhao left a comment

Please note that CI is currently under a heavy burden doing the release CD tests for 1.3.

Collaborator

@eero-t eero-t left a comment

Looks OK once Lianhao's comments are resolved (and CI passes).

Collaborator

@lianhao lianhao left a comment

@devpramod please address the 2 remaining embedded comments. Also, please rebase and fix the test error.

Thx~

Member

@poussa poussa left a comment

LGTM. Please address @lianhao's comments and we are good to merge, I think.

Member

poussa commented May 13, 2025

LGTM. Please address @lianhao's comments and we are good to merge, I think.

@devpramod ping.

@devpramod
Contributor Author

Hi @poussa,
I'm working on resolving the comment and also on updating the CI for the new values file.
Will update soon.

@devpramod devpramod force-pushed the helm-external-llm branch from 0a69a59 to 897fa9d on May 13, 2025 23:42
Collaborator

eero-t commented May 16, 2025

Deployment templates are missing llm-uservice.enabled check(s), because the CI test Helm installs fail with a no template "llm-uservice.fullname" error, both for "CodeGen":

+ helm install --create-namespace --namespace infra-codegen-7875974e --wait --timeout 600s --set GOOGLE_API_KEY=*** --set GOOGLE_CSE_ID=*** --set web-retriever.GOOGLE_API_KEY=*** --set web-retriever.GOOGLE_CSE_ID=*** --set global.HUGGINGFACEHUB_API_TOKEN=*** --set global.modelUseHostPath=/data2/hf_model --values helm-charts//codegen/external-llm-values.yaml codegen14000636 helm-charts//codegen
Error: INSTALLATION FAILED: template: codegen/templates/deployment.yaml:38:24: executing "codegen/templates/deployment.yaml" at <include "llm-uservice.fullname" (index .Subcharts "llm-uservice")>: error calling include: template: no template "llm-uservice.fullname" associated with template "gotpl"
+ echo 'Failed to install chart /codegen'

And for "DocSum":

+ helm install --create-namespace --namespace infra-docsum-50d1e674 --wait --timeout 600s --set GOOGLE_API_KEY=*** --set GOOGLE_CSE_ID=*** --set web-retriever.GOOGLE_API_KEY=*** --set web-retriever.GOOGLE_CSE_ID=*** --set global.HUGGINGFACEHUB_API_TOKEN=*** --set global.modelUseHostPath=/data2/hf_model --values helm-charts//docsum/external-llm-values.yaml docsum14000012 helm-charts//docsum
Error: INSTALLATION FAILED: template: docsum/templates/deployment.yaml:38:24: executing "docsum/templates/deployment.yaml" at <include "llm-uservice.fullname" (index .Subcharts "llm-uservice")>: error calling include: template: no template "llm-uservice.fullname" associated with template "gotpl"
+ echo 'Failed to install chart /docsum'
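
For reference, a guard along these lines in templates/deployment.yaml would avoid looking up the subchart template when llm-uservice is disabled (a sketch only: the environment variable name and surrounding context are assumptions, and index is needed because the subchart name contains a dash):

  {{- if index .Values "llm-uservice" "enabled" }}
  - name: LLM_SERVICE_HOST_IP
    value: {{ include "llm-uservice.fullname" (index .Subcharts "llm-uservice") | quote }}
  {{- end }}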

I'm not sure why chatqna,k8s-gaudi,guardrails-gaudi-values fails, maybe because of a Helm install or vLLM timeout?

+ helm install --create-namespace --namespace infra-chatqna-cbdb40b4 --wait --timeout 900s --set GOOGLE_API_KEY=*** --set GOOGLE_CSE_ID=*** --set web-retriever.GOOGLE_API_KEY=*** --set web-retriever.GOOGLE_CSE_ID=*** --set global.HUGGINGFACEHUB_API_TOKEN=*** --set global.modelUseHostPath=/data2/hf_model --values helm-charts//chatqna/guardrails-gaudi-values.yaml chatqna13235020 helm-charts//chatqna
Error: INSTALLATION FAILED: context deadline exceeded
+ echo 'Failed to install chart /chatqna'
...
[pod/chatqna13235020-vllm-f4587ff8b-6fdgl/vllm] INFO 05-14 00:05:31 hpu_model_runner.py:1730] [Warmup][Graph/Decode][355/416] batch_size:2 num_blocks:384 free_mem:14.45 GiB
-----------------------------------
+ exit 1

The chatqna,external-llm-values CI test fails due to what looks like a mismatch between how CI tests ChatQnA:

 [pod/chatqna13235243-5dd6bff8c6-rsgmk/chatqna13235243]   File "/home/user/comps/cores/telemetry/opea_telemetry.py", line 61, in wrapper
[pod/chatqna13235243-5dd6bff8c6-rsgmk/chatqna13235243]     res = await func(*args, **kwargs)
[pod/chatqna13235243-5dd6bff8c6-rsgmk/chatqna13235243]           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[pod/chatqna13235243-5dd6bff8c6-rsgmk/chatqna13235243]   File "/home/user/comps/cores/mega/orchestrator.py", line 258, in execute
[pod/chatqna13235243-5dd6bff8c6-rsgmk/chatqna13235243]     endpoint = self.services[cur_node].endpoint_path(inputs["model"])
[pod/chatqna13235243-5dd6bff8c6-rsgmk/chatqna13235243]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[pod/chatqna13235243-5dd6bff8c6-rsgmk/chatqna13235243]   File "/home/user/comps/cores/mega/micro_service.py", line 142, in endpoint_path
[pod/chatqna13235243-5dd6bff8c6-rsgmk/chatqna13235243]     model_endpoint = model.split("/")[1]
[pod/chatqna13235243-5dd6bff8c6-rsgmk/chatqna13235243]                      ~~~~~~~~~~~~~~~~^^^
[pod/chatqna13235243-5dd6bff8c6-rsgmk/chatqna13235243] IndexError: list index out of range

and PR opea-project/GenAIComps#1583, merged a few weeks ago, which changed the code above.

Collaborator

@lianhao lianhao left a comment

We should follow the chatqna method of setting the external-LLM-related config to avoid the errors @eero-t mentioned above.

pre-commit-ci bot and others added 8 commits June 30, 2025 18:48
…rnal LLM and data-prep configurations; update values.yaml for productivity suite and docsum to enable necessary services and endpoints.

Signed-off-by: devpramod <[email protected]>
…onditionals in deployment and configmap templates for chatqna, codegen, and docsum; update values.yaml to enable whisper and redis-vector-db services.

Signed-off-by: devpramod <[email protected]>
…o override them if necessary

Signed-off-by: devpramod <[email protected]>
@devpramod devpramod force-pushed the helm-external-llm branch from c713304 to 167d0f5 on June 30, 2025 22:59
Collaborator

lianhao commented Jul 1, 2025

@devpramod please do the following steps to make CI happy:

  1. Ask Suyue to add the secrets used in CI to the GenAIInfra repo. (She's the one who has access to do that.)
  2. Submit a separate PR for the changes in _helm_e2e.html alone.
  3. Once the step 2 PR is merged, rebase this PR and we should be all good.

Collaborator

@lianhao lianhao left a comment

I'm OK with this PR itself, but we need to make some changes to make CI happy, following the steps I listed here: #993 (comment)

@poussa poussa self-requested a review July 1, 2025 05:40
@devpramod devpramod force-pushed the helm-external-llm branch from 167d0f5 to 8b1b207 on July 1, 2025 22:44
Collaborator

@lianhao lianhao left a comment

@louie-tsai @devpramod I would suggest some minor changes; also, please help resolve the merge conflict. Otherwise LGTM.

#helm install chatqna chatqna --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set global.modelUseHostPath=${MODELDIR} --set tgi.LLM_MODEL_ID=${MODELNAME} -f chatqna/rocm-tgi-values.yaml

# To use with external OpenAI compatible LLM endpoint
#helm install chatqna chatqna -f chatqna/external-llm-values.yaml --set externalLLM.LLM_SERVER_HOST_IP="http://your-llm-server" --set externalLLM.LLM_MODEL="your-model" --set externalLLM.OPENAI_API_KEY="your-api-key"
Collaborator

Please change the file name external-llm-values.yaml accordingly

Collaborator

@poussa already did a merge, so I added a separate PR #1153 to fix the values file names.

# helm install codegen codegen --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set global.modelUseHostPath=${MODELDIR} --set llm-uservice.LLM_MODEL_ID=${MODELNAME} --set tgi.LLM_MODEL_ID=${MODELNAME} -f codegen/rocm-tgi-values.yaml

# To use with external OpenAI compatible LLM endpoint
# helm install codegen codegen -f codegen/external-llm-values.yaml --set externalLLM.LLM_SERVER_HOST_IP="http://your-llm-server" --set externalLLM.LLM_MODEL="your-model" --set externalLLM.OPENAI_API_KEY="your-api-key"
Collaborator

Please change the file name external-llm-values.yaml accordingly

@lianhao lianhao changed the title Add support for external OpenAI-compatible LLM endpoints across Helm charts (chatqna, codegen, docsum) Add support for external OpenAI-compatible LLM endpoints across Helm charts (chatqna, codegen) Jul 4, 2025
@lianhao lianhao added this to the v1.4 milestone Jul 4, 2025
@poussa poussa merged commit c977012 into opea-project:main Jul 4, 2025
32 checks passed
Comment on lines +73 to +78
condition: data-prep.enabled
- name: ui
alias: chatqna-ui
version: 0-latest
repository: "file://../common/ui"
condition: chatqna-ui.enabled
Collaborator

Disabling either data-prep or chatqna-ui means that Helm fails parsing the ChatQnA NGINX template: https://github.com/opea-project/GenAIInfra/blob/main/helm-charts/chatqna/templates/nginx.yaml

(Because their names contain dashes, Helm cannot parse the template if one simply adds ifs checking whether they are enabled; see the sketch below.)
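
For example, a plain dotted reference fails to parse, so the nginx template would have to use index instead (a sketch under the assumption that the guard wraps the relevant ui/data-prep references):

  # Fails to parse: a dash is not a valid character in a template field name
  # {{- if .Values.chatqna-ui.enabled }}

  # Works: look the key up with index instead
  {{- if index .Values "chatqna-ui" "enabled" }}
  # ... references to the chatqna-ui service ...
  {{- end }}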

Comment on lines +40 to +45
condition: data-prep.enabled
- name: ui
version: 0-latest
repository: "file://../common/ui"
alias: codegen-ui
condition: codegen-ui.enabled
Collaborator

Disabling either data-prep or codegen-ui means that Helm fails parsing the CodeGen NGINX template:
https://github.com/opea-project/GenAIInfra/blob/main/helm-charts/codegen/templates/ui-nginx.yaml

Comment on lines +12 to +20
# Disable internal LLM services when using external LLM
llm-uservice:
enabled: false

vllm:
enabled: false

tgi:
enabled: false
Collaborator

This is missing:

ollama:
  enabled: false

Collaborator

eero-t commented Jul 18, 2025

Added PR #1166 to fix the above and some additional issues I noticed later on.
