diff --git a/cloud-service-provider/aws/eks/terraform/README.md b/cloud-service-provider/aws/eks/terraform/README.md
index 1570db699..027316398 100644
--- a/cloud-service-provider/aws/eks/terraform/README.md
+++ b/cloud-service-provider/aws/eks/terraform/README.md
@@ -49,7 +49,7 @@ Now you should have access to the cluster via the `kubectl` command.
 Deploy ChatQnA Application with Helm
 
 ```bash
-helm install -n chatqna --create-namespace chatqna oci://ghcr.io/opea-project/charts/chatqna --set service.type=LoadBalancer --set global.modelUsePVC=model-volume --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
+helm install -n chatqna --create-namespace chatqna oci://ghcr.io/opea-project/charts/chatqna --set service.type=LoadBalancer --set global.modelUsePVC=model-volume --set global.HF_TOKEN=${HFTOKEN}
 ```
 
 Create the PVC as mentioned [above](#-persistent-volume-claim)
diff --git a/cloud-service-provider/azure/aks/terraform/README.md b/cloud-service-provider/azure/aks/terraform/README.md
index 3df1799d5..dd2a6c20a 100644
--- a/cloud-service-provider/azure/aks/terraform/README.md
+++ b/cloud-service-provider/azure/aks/terraform/README.md
@@ -53,7 +53,7 @@ Now you should have access to the cluster via the `kubectl` command.
 Deploy ChatQnA Application with Helm
 
 ```bash
-helm install -n chatqna --create-namespace chatqna oci://ghcr.io/opea-project/charts/chatqna --set service.type=LoadBalancer --set global.modelUsePVC=model-volume --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
+helm install -n chatqna --create-namespace chatqna oci://ghcr.io/opea-project/charts/chatqna --set service.type=LoadBalancer --set global.modelUsePVC=model-volume --set global.HF_TOKEN=${HFTOKEN}
 ```
 
 Create the PVC as mentioned [above](#-persistent-volume-claim)
diff --git a/cloud-service-provider/gcp/gke/terraform/README.md b/cloud-service-provider/gcp/gke/terraform/README.md
index 398d42a99..96d0ac1ce 100644
--- a/cloud-service-provider/gcp/gke/terraform/README.md
+++ b/cloud-service-provider/gcp/gke/terraform/README.md
@@ -92,7 +92,7 @@ Now you should have access to the cluster via the `kubectl` command.
 Deploy ChatQnA Application with Helm
 
 ```bash
-helm install -n chatqna --create-namespace chatqna oci://ghcr.io/opea-project/charts/chatqna --set service.type=LoadBalancer --set global.modelUsePVC=model-volume --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
+helm install -n chatqna --create-namespace chatqna oci://ghcr.io/opea-project/charts/chatqna --set service.type=LoadBalancer --set global.modelUsePVC=model-volume --set global.HF_TOKEN=${HFTOKEN}
 ```
 
 Create the Storage Class and PVC as mentioned [above](#-persistent-volume-claim)
diff --git a/dev/helm-chart-starter/README.md b/dev/helm-chart-starter/README.md
index 981c34b87..876123d58 100644
--- a/dev/helm-chart-starter/README.md
+++ b/dev/helm-chart-starter/README.md
@@ -10,7 +10,7 @@ To install the chart, run the following:
 cd GenAIInfra/helm-charts/common
 export HFTOKEN="insert-your-huggingface-token-here"
 # To deploy microserice on CPU
-helm install --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
+helm install --set global.HF_TOKEN=${HFTOKEN}
 ```
 
 
@@ -32,6 +32,6 @@ curl http://localhost:8080/v1/ \
 
 ## Values
 
-| Key                             | Type   | Default                              | Description           |
-| ------------------------------- | ------ | ------------------------------------ | --------------------- |
-| global.HUGGINGFACEHUB_API_TOKEN | string | `insert-your-huggingface-token-here` | HuggingFace API token |
+| Key             | Type   | Default                              | Description           |
+| --------------- | ------ | ------------------------------------ | --------------------- |
+| global.HF_TOKEN | string | `insert-your-huggingface-token-here` | HuggingFace API token |
diff --git a/dev/helm-chart-starter/templates/configmap.yaml b/dev/helm-chart-starter/templates/configmap.yaml
index 1cf0af8f3..216a7e294 100644
--- a/dev/helm-chart-starter/templates/configmap.yaml
+++ b/dev/helm-chart-starter/templates/configmap.yaml
@@ -8,8 +8,7 @@ metadata:
   labels:
     {{- include ".labels" . | nindent 4 }}
 data:
-  HUGGINGFACEHUB_API_TOKEN: {{ .Values.global.HUGGINGFACEHUB_API_TOKEN | quote}}
-  HF_TOKEN: {{ .Values.global.HUGGINGFACEHUB_API_TOKEN | quote}}
+  HF_TOKEN: {{ .Values.global.HF_TOKEN | quote}}
   {{- if .Values.global.HF_ENDPOINT }}
   HF_ENDPOINT: {{ .Values.global.HF_ENDPOINT | quote}}
   {{- end }}
diff --git a/dev/helm-chart-starter/values.yaml b/dev/helm-chart-starter/values.yaml
index 5cb24d39b..80b390a18 100644
--- a/dev/helm-chart-starter/values.yaml
+++ b/dev/helm-chart-starter/values.yaml
@@ -157,7 +157,7 @@ global:
   http_proxy: ""
   https_proxy: ""
   no_proxy: ""
-  HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
+  HF_TOKEN: "insert-your-huggingface-token-here"
   # service account name to be shared with all parent/child charts.
   # If set, it will overwrite serviceAccount.name.
   # If set, and serviceAccount.create is false, it will assume this service account is already created by others.
diff --git a/helm-charts/README.md b/helm-charts/README.md
index 40af54d25..2fc37b064 100644
--- a/helm-charts/README.md
+++ b/helm-charts/README.md
@@ -45,14 +45,14 @@ Refer to [GenAIComps](https://github.com/opea-project/GenAIComps) for details of
 ### From Source Code
 
 These Helm charts are designed to be easy to start, which means you can deploy a workload easily without further options.
-However, `HUGGINGFACEHUB_API_TOKEN` should be set in most cases for a workload to start up correctly.
+However, `HF_TOKEN` should be set in most cases for a workload to start up correctly.
 
 Examples of deploy a workload:
 ```
 export myrelease=mytgi
 export chartname=common/tgi
 helm dependency update $chartname
-helm install $myrelease $chartname --set global.HUGGINGFACEHUB_API_TOKEN="insert-your-huggingface-token-here"
+helm install $myrelease $chartname --set global.HF_TOKEN="insert-your-huggingface-token-here"
 ```
 
 Depending on your environment, you may want to customize some of the options, see [Helm Charts Options](#helm-charts-options) for further information.
@@ -76,7 +76,7 @@ There are global options (which should be shared across all components of a work
 
 | Helm chart | Options | Description |
 | ---------- | ------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| global | HUGGINGFACEHUB_API_TOKEN | Your own HuggingFace token, there is no default value. If not set, you might fail to start the component. |
+| global | HF_TOKEN | Your own HuggingFace token, there is no default value. If not set, you might fail to start the component. |
 | global | http_proxy https_proxy no_proxy | Proxy settings. If you are running the workloads behind the proxy, you'll have to add your proxy settings here. |
 | global | modelUsePVC | The PersistentVolumeClaim you want to use as HuggingFace hub cache. Default "" means not using PVC. Only one of modelUsePVC/modelUseHostPath can be set. |
 | global | modelUseHostPath | If you don't have Persistent Volume in your k8s cluster and want to use local directory as HuggingFace hub cache, set modelUseHostPath to your local directory name. Note that this can't share across nodes. Default "". Only one of modelUsePVC/modelUseHostPath can be set. |
diff --git a/helm-charts/TDX.md b/helm-charts/TDX.md
index edb44bec6..a24d15443 100644
--- a/helm-charts/TDX.md
+++ b/helm-charts/TDX.md
@@ -30,7 +30,7 @@ This guide assumes that:
 
 Follow the below steps on the server node with Intel Xeon Processor:
 
-1. [Install Ubuntu 24.04 and enable Intel TDX](https://github.com/canonical/tdx/blob/noble-24.04/README.md#setup-host-os)
+1. [Install Ubuntu 24.04 and enable Intel TDX](https://github.com/canonical/tdx/blob/3.2/README.md#setup-host-os)
 2. Check, if Intel TDX is enabled:
 
 ```bash
@@ -84,7 +84,7 @@ Follow the steps below to deploy ChatQnA:
 
 ```
 helm install $myrelease $chartname \
-  --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" --set vllm.LLM_MODEL_ID="${MODELNAME}" \
+  --set global.HF_TOKEN="${HFTOKEN}" --set vllm.LLM_MODEL_ID="${MODELNAME}" \
   --set redis-vector-db.tdxEnabled=true --set redis-vector-db.resources.limits.memory=4Gi \
   --set retriever-usvc.tdxEnabled=true --set retriever-usvc.resources.limits.memory=7Gi \
   --set tei.tdxEnabled=true --set tei.resources.limits.memory=4Gi \
diff --git a/helm-charts/docsum/README.md b/helm-charts/docsum/README.md
index 8e05c6300..cd03cdd4d 100644
--- a/helm-charts/docsum/README.md
+++ b/helm-charts/docsum/README.md
@@ -17,7 +17,7 @@ helm dependency update docsum
 export HFTOKEN="insert-your-huggingface-token-here"
 export MODELDIR="/mnt/opea-models"
 export MODELNAME="meta-llama/Meta-Llama-3-8B-Instruct"
-helm install docsum docsum --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set global.modelUseHostPath=${MODELDIR} --set llm-uservice.LLM_MODEL_ID=${MODELNAME} --set vllm.LLM_MODEL_ID=${MODELNAME}
+helm install docsum docsum --set global.HF_TOKEN=${HFTOKEN} --set global.modelUseHostPath=${MODELDIR} --set llm-uservice.LLM_MODEL_ID=${MODELNAME} --set vllm.LLM_MODEL_ID=${MODELNAME}
 # To use Gaudi device with vLLM
 # helm install docsum docsum --set global.HF_TOKEN=${HFTOKEN} --values docsum/gaudi-values.yaml
 # To use Gaudi device with TGI
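A quick way to sanity-check the `HUGGINGFACEHUB_API_TOKEN` to `HF_TOKEN` rename after applying this patch is to render one of the charts locally and confirm which token keys end up in the generated manifests. The sketch below is illustrative and not part of the change set; it assumes a local GenAIInfra checkout with the `chatqna` chart at `helm-charts/chatqna` and uses a placeholder token value.

```bash
# Minimal sketch: render the chart without installing it and grep for token keys.
# Assumes this runs from the root of a GenAIInfra checkout; adjust the chart path to your layout.
export HFTOKEN="insert-your-huggingface-token-here"
helm dependency update helm-charts/chatqna
helm template chatqna helm-charts/chatqna --set global.HF_TOKEN=${HFTOKEN} \
  | grep -E "HF_TOKEN|HUGGINGFACEHUB_API_TOKEN"
# Once a chart has picked up the rename, only HF_TOKEN entries should appear in the output.
```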