16 changes: 9 additions & 7 deletions kubeai/README.md
@@ -300,26 +300,28 @@ on the configuration options.
 
 # Observability
 
-With [Prometheus](../helm-charts/monitoring.md) running, install script can enable monitoring of the vLLM inference engine instances.
+With [kube-prometheus-stack](../helm-charts/monitoring.md) Helm chart already deployed, install script will automatically enable monitoring for the vLLM inference engine pods.
 
-Script requires Prometheus Helm chart release name for that, e.g.
+If script did not detect it, one can specify Prometheus Helm chart release manually:
 
 ```
 release=prometheus-stack
 ./install.sh $release
 ```
 
-Port-forward Grafana.
+If script finds also a (running) Grafana instance, it will install "vLLM scaling" and "vLLM details" dashboards for it.
+
+But they can be installed also manually afterwards:
 
 ```
-kubectl port-forward -n $ns svc/$release-grafana 3000:80
+ns=monitoring # Grafana namespace
+kubectl apply -n $ns -f grafana/vllm-scaling.yaml -f grafana/vllm-details.yaml
 ```
 
-Install "vLLM scaling" and "vLLM details" dashboards, to the same namespace as Grafana.
+Then port-forward Grafana.
 
 ```
-ns=monitoring
-kubectl apply -n $ns -f grafana/vllm-scaling.yaml -f grafana/vllm-details.yaml
+kubectl port-forward -n $ns svc/$release-grafana 3000:80
 ```
 
 Open web-browser to `http://localhost:3000` with `admin` / `prom-operator` given as the username / password for login, to view the dashboards.
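For context, the `kubectl apply` of dashboard manifests works because kube-prometheus-stack's Grafana runs a dashboard sidecar that watches for ConfigMaps carrying a `grafana_dashboard` label (the chart's default). A minimal sketch of what such a dashboard ConfigMap looks like — hypothetical name and placeholder dashboard JSON, assuming default sidecar settings:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vllm-example-dashboard    # hypothetical name
  labels:
    grafana_dashboard: "1"        # default label the Grafana sidecar selects on
data:
  vllm-example.json: |
    {"title": "vLLM example", "panels": []}
```

With default chart values the sidecar only watches its own namespace, which is why the dashboards above are applied to the same namespace as Grafana.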
44 changes: 36 additions & 8 deletions kubeai/install.sh
@@ -8,38 +8,66 @@ DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 error_exit () {
     name=${0##*/}
     cat <<EOF
 
 Usage: [HF_TOKEN=<token>] $name [HF token file] [Prometheus release name]
 
 HuggingFace token for accessing models can be given either as an
 environment variable, or as a name of a file containing the token
 value.
 
-If Prometheus Helm release name is given, Prometheus monitoring is
-enabled for the inference engine instances, and vLLM dashboard
-(configMap) is installed for Grafana.
+If script finds deployed "kube-prometheus-stack" Helm chart release,
+it enables Prometheus monitoring for the vLLM inference engine pods.
+vLLM dashboards (configMaps) are installed if also a running Grafana
+instance is detected.
+
+Prometheus release name can also be given as an argument,
 
 ERROR: $1!
 EOF
     exit 1
 }
 
-metrics=""
+release=""
 for arg in "$@"; do
     if [ -f "$arg" ]; then
         echo "Using HF token from '$arg' file."
         HF_TOKEN=$(cat "$arg")
     else
-        echo "Enabling vLLM inference pod monitoring for '$arg' Prometheus Helm install."
-        metrics="--set metrics.prometheusOperator.vLLMPodMonitor.labels.release=$arg"
-        metrics="$metrics --set metrics.prometheusOperator.vLLMPodMonitor.enabled=true"
+        release="$arg"
     fi
 done
 
 if [ -z "$HF_TOKEN" ]; then
     error_exit "HF token missing"
 fi
 
+if [ -z "$release" ]; then
+    if [ -z "$(which jq)" ]; then
+        error_exit "please install 'jq' to parse Helm releases info"
+    fi
+    # check whether cluster has deployed Prometheus Helm chart release, if none specified
+    release=$(helm list -A -o json | jq '.[] | select(.chart|match("^kube-prometheus-stack")) | select(.status=="deployed") | .name' | tr -d '"')
+fi
+
+metrics=""
+if [ -n "$release" ]; then
+    running="status.phase=Running"
+    grafana="app.kubernetes.io/name=grafana"
+    jsonpath="{.items[0].metadata.namespace}"
+
+    # check for Grafana namespace
+    ns=$(kubectl get -A pod --field-selector="$running" --selector="$grafana" -o jsonpath="$jsonpath")
+    if [ -n "$ns" ]; then
+        echo "Grafana available, installing vLLM dashboards to '$ns' namespace."
+        kubectl apply -n $ns -f $DIR/grafana/vllm-scaling.yaml -f $DIR/grafana/vllm-details.yaml
+    fi
+
+    echo "Enabling vLLM pod monitoring for '$release' Prometheus Helm install."
+    metrics="--set metrics.prometheusOperator.vLLMPodMonitor.labels.release=$release"
+    metrics="$metrics --set metrics.prometheusOperator.vLLMPodMonitor.enabled=true"
+fi
+
-helm upgrade --install opea -n kubeai kubeai/kubeai \
+helm upgrade --install opea-kubeai -n kubeai kubeai/kubeai \
     --create-namespace \
     --set secrets.huggingface.token="$HF_TOKEN" \
     -f $DIR/opea-values.yaml $metrics
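The release auto-detection added above hinges on the `jq` filter over `helm list` output. A standalone sketch of the same selection logic, exercised against canned JSON with hypothetical release names (no cluster or Helm needed, but `jq` must be installed):

```shell
# Canned sample in the shape of `helm list -A -o json` output (release
# names are hypothetical, invented for this illustration).
releases='[
  {"name":"prometheus-stack","chart":"kube-prometheus-stack-58.1.0","status":"deployed"},
  {"name":"old-stack","chart":"kube-prometheus-stack-45.0.0","status":"failed"},
  {"name":"opea-kubeai","chart":"kubeai-0.9.0","status":"deployed"}
]'

# Same filter as in install.sh: keep deployed releases of the
# kube-prometheus-stack chart, print their names, strip the JSON quotes.
release=$(echo "$releases" | jq '.[] | select(.chart|match("^kube-prometheus-stack")) | select(.status=="deployed") | .name' | tr -d '"')
echo "$release"   # prometheus-stack
```

Note that the non-deployed release and the unrelated chart are both filtered out; if several matching releases were deployed, `$release` would contain one name per line.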