3 changes: 3 additions & 0 deletions helm-charts/docsum/Chart.yaml
@@ -21,13 +21,16 @@ dependencies:
- name: llm-uservice
version: 0-latest
repository: "file://../common/llm-uservice"
condition: llm-uservice.enabled
- name: whisper
version: 0-latest
repository: "file://../common/whisper"
condition: whisper.enabled
- name: ui
version: 0-latest
repository: "file://../common/ui"
alias: docsum-ui
condition: docsum-ui.enabled
- name: nginx
version: 0-latest
repository: "file://../common/nginx"
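Note: the new `condition: whisper.enabled` entry means Helm only installs the whisper subchart when that value resolves to true. A minimal override sketch for a text-only deployment (the file name is illustrative, not part of the chart):

```yaml
# no-audio-values.yaml (hypothetical override file)
# Because Chart.yaml declares `condition: whisper.enabled`, setting this to
# false skips the whisper dependency entirely rather than just scaling it down.
whisper:
  enabled: false
```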
6 changes: 5 additions & 1 deletion helm-charts/docsum/README.md
@@ -26,7 +26,11 @@ helm install docsum docsum --set global.HF_TOKEN=${HFTOKEN} --set global.modelUs
# helm install docsum docsum --set global.HF_TOKEN=${HFTOKEN} --values docsum/rocm-values.yaml
# To use AMD ROCm device with TGI
# helm install docsum docsum --set global.HF_TOKEN=${HFTOKEN} --values docsum/rocm-tgi-values.yaml

# To use with external OpenAI-compatible LLM endpoints (OpenAI, vLLM, TGI, etc.)
# This configures the llm-uservice to connect to external LLM providers while maintaining DocSum compatibility
# helm install docsum docsum --set global.HF_TOKEN=${HFTOKEN} --values docsum/variant_external-llm-values.yaml --set llm-uservice.env.OPENAI_API_KEY="your-api-key" --set llm-uservice.env.LLM_MODEL_ID="gpt-4-turbo"
Collaborator:
Shouldn't this also set a suitable (OpenAI) LLM_ENDPOINT?

Contributor Author:
@eero-t Thanks for pointing that out. This made me realize I wasn't passing the env variables correctly and that we needed a new env variable in the configmap for OPENAI_API_KEY. Please review.

# For vLLM/TGI endpoints:
# helm install docsum docsum --set global.HF_TOKEN=${HFTOKEN} --values docsum/variant_external-llm-values.yaml --set llm-uservice.env.LLM_ENDPOINT="http://your-vllm-server/v1" --set llm-uservice.env.LLM_MODEL_ID="your-model"
```

## Verify
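Note: the `--set` chains in the README example above can get long; under the same assumptions, the external-endpoint overrides could equally be collected into a small extra values file (name and values below are placeholders):

```yaml
# my-external-llm.yaml (hypothetical) -- apply with an extra --values flag
# after docsum/variant_external-llm-values.yaml
llm-uservice:
  env:
    LLM_ENDPOINT: "http://your-vllm-server/v1"  # OpenAI-compatible endpoint (OpenAI, vLLM, TGI, ...)
    LLM_MODEL_ID: "your-model"                  # model served at that endpoint
    OPENAI_API_KEY: "your-api-key"              # only needed for hosted providers
```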
2 changes: 2 additions & 0 deletions helm-charts/docsum/templates/deployment.yaml
@@ -38,10 +38,12 @@ spec:
value: {{ include "llm-uservice.fullname" (index .Subcharts "llm-uservice") }}
- name: LLM_SERVICE_PORT
value: {{ index .Values "llm-uservice" "service" "port" | quote }}
{{- if .Values.whisper.enabled }}
- name: ASR_SERVICE_HOST_IP
value: {{ include "whisper.fullname" (index .Subcharts "whisper") }}
- name: ASR_SERVICE_PORT
value: {{ index .Values "whisper" "service" "port" | quote }}
{{- end }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
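Note: with whisper.enabled left at its default of true, the guarded block above renders two extra env entries, roughly as sketched below (the host name and port are placeholders; the real values come from the whisper subchart and the release name):

```yaml
# Rendered sketch, assuming a release named "docsum" (values are illustrative)
- name: ASR_SERVICE_HOST_IP
  value: docsum-whisper
- name: ASR_SERVICE_PORT
  value: "7066"
```

With whisper.enabled=false the block is omitted, so the docsum gateway starts without any ASR wiring.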
15 changes: 15 additions & 0 deletions helm-charts/docsum/values.yaml
@@ -58,6 +58,7 @@ affinity: {}

# To override values in subchart llm-uservice
llm-uservice:
enabled: true
image:
repository: opea/llm-docsum
DOCSUM_BACKEND: "vLLM"
@@ -79,6 +80,8 @@ vllm:
nginx:
enabled: false
docsum-ui:
# if false, set also nginx.enabled=false
enabled: true
image:
repository: opea/docsum-gradio-ui
tag: "latest"
@@ -101,8 +104,20 @@ docsum-ui:
# type: ClusterIP

dashboard:
enabled: true
prefix: "OPEA DocSum"

whisper:
enabled: true

# External LLM configuration
externalLLM:
enabled: false
LLM_SERVER_HOST: "http://your-llm-server"
LLM_SERVER_PORT: "80"
LLM_MODEL: "your-model"
OPENAI_API_KEY: "your-api-key"

global:
http_proxy: ""
https_proxy: ""
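Note: the comment on docsum-ui above implies the nginx front end should be toggled together with the UI. A sketch of an API-only override under that assumption (file name is illustrative):

```yaml
# headless-docsum.yaml (hypothetical override)
docsum-ui:
  enabled: false
nginx:
  enabled: false  # per the values.yaml comment, keep this in sync with docsum-ui
```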
19 changes: 19 additions & 0 deletions helm-charts/docsum/variant_external-llm-values.yaml
@@ -0,0 +1,19 @@
# Copyright (C) 2024 Intel Corporation
Collaborator:
Suggested change:
- # Copyright (C) 2024 Intel Corporation
+ # Copyright (C) 2025 Intel Corporation

# SPDX-License-Identifier: Apache-2.0

# External LLM configuration - configures llm-uservice to use external LLM providers
# This keeps the llm-uservice wrapper (required for /v1/docsum endpoint) but connects it to external LLMs
llm-uservice:
enabled: true # Keep the wrapper service for DocSum compatibility
env:
# Configure llm-uservice to use external OpenAI-compatible endpoints
LLM_ENDPOINT: "https://api.openai.com/v1" # External LLM API endpoint (OpenAI, vLLM, TGI, etc.)
OPENAI_API_KEY: "${OPENAI_API_KEY}" # API key for authentication
LLM_MODEL_ID: "gpt-4-turbo" # Model to use
TEXTGEN_BACKEND: "openai" # Backend type for OpenAI-compatible endpoints
Collaborator @eero-t (Aug 15, 2025):
This is the DocSum service, not TextGen.

Also, it looks like giving anything other than TGI or vLLM to the *_BACKEND variables will fail the Helm install: https://github.com/opea-project/GenAIInfra/blob/main/helm-charts/common/llm-uservice/templates/configmap.yaml

(In the TEXTGEN_BACKEND case there also seems to be a BEDROCK option, but that does not really help here.)

Looking at the backend code of llm-uservice for the DOCSUM_BACKEND="vLLM" option, it appears to be OpenAI API compatible, but unfortunately it uses a hard-coded openai_api_key value (an EMPTY string): https://github.com/opea-project/GenAIComps/blob/main/comps/llms/src/doc-summarization/integrations/vllm.py#L58

Comment:
The value is hard-coded to an empty string, but the access_token is passed as an authentication header, and it is taken from the environment variable: https://github.com/opea-project/GenAIComps/blob/2444e6984e27dd34aed5f0b341690e046356940d/comps/llms/src/doc-summarization/integrations/vllm.py#L52

Contributor Author:
@eero-t Based on Sri's comment, I have fallen back to the docsum backend with vLLM; please review.


# Disable local inference services since we're using external LLMs
vllm:
enabled: false
tgi:
enabled: false
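
Note: following the discussion above (only TGI/vLLM pass the configmap template check, and the author fell back to the vLLM docsum backend), the revised variant file would presumably look roughly like the sketch below; this is an illustration of that direction, not the merged file:

```yaml
# Hypothetical shape of the revised external-LLM variant (may differ from the final PR)
llm-uservice:
  enabled: true                     # keep the wrapper that serves /v1/docsum
  DOCSUM_BACKEND: "vLLM"            # per the review, only TGI/vLLM are accepted here
  env:
    LLM_ENDPOINT: "http://your-vllm-server/v1"  # external OpenAI-compatible endpoint
    LLM_MODEL_ID: "your-model"
    OPENAI_API_KEY: "your-api-key"  # per the review, forwarded as an auth header by the backend

# External inference, so no local serving pods
vllm:
  enabled: false
tgi:
  enabled: false
```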