# OPEA applications GCP GKE deployment guide

This guide shows how to deploy OPEA applications on Google Cloud Platform (GCP) Google Kubernetes Engine (GKE) using Terraform.

## Prerequisites

- Access to GCP GKE
- [Terraform](https://developer.hashicorp.com/terraform/tutorials/gcp-get-started/install-cli), [GCP CLI](https://cloud.google.com/sdk/docs/install-sdk), [Helm](https://helm.sh/docs/helm/helm_install/) and [kubectl](https://kubernetes.io/docs/tasks/tools/) installed on your local machine.

## Setup

The setup uses Terraform to create a GKE cluster with the following properties:

- 1-node GKE cluster with a 100 GB disk and an `n4-standard-8` Spot (preemptible) instance (8 vCPU and 32 GB memory)
- Cluster autoscaling up to 5 nodes

Before creating the GKE cluster, complete the following gcloud setup:

- After you've installed the gcloud SDK, initialize it by running the following command:

```bash
gcloud init
```

- This authorizes the SDK to access GCP using your user account credentials and adds the SDK to your PATH. The step requires you to log in and select the project you want to work in. Finally, add your account to the Application Default Credentials (ADC), which allows Terraform to use these credentials to provision resources on Google Cloud.

```bash
gcloud auth application-default login
```
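
If you want to double-check that ADC is in place before running Terraform, you can ask gcloud to mint a token from those credentials (this only verifies the setup; its output is not needed later):

```bash
# Should print an access token if Application Default Credentials are configured
gcloud auth application-default print-access-token
```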

In this directory, you will find four files used to provision a VPC, subnets and a GKE cluster:

- `vpc.tf` provisions a VPC and subnet. A new VPC is created for this tutorial so it doesn't impact your existing cloud environment and resources. This file outputs the region.

- `main.tf` provisions a GKE cluster and a separately managed node pool (recommended). Separately managed node pools let you customize your Kubernetes cluster profile, which is useful if some Pods require more resources than others. The number of nodes in the node pool is also defined here.

- `opea-chatqna.tfvars` is a template for the `project_id`, `cluster_name` and `region` variables.

- `versions.tf` sets the Terraform version to at least 0.14.

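Since `versions.tf` pins a minimum Terraform version, it is worth confirming your local install meets it before going further (run this from the directory that contains the files above):

```bash
# Compare your installed Terraform version with the minimum required in versions.tf
terraform version
cat versions.tf
```
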
## Update your opea-chatqna.tfvars file

Replace the values in your `opea-chatqna.tfvars` file with your `project_id`, `cluster_name` and `region`. Terraform uses these values to target your project when provisioning resources. Your `opea-chatqna.tfvars` file should look like the following:

```bash
# opea-chatqna.tfvars
project_id   = "REPLACE_ME"
cluster_name = "REPLACE_ME"
region       = "us-central1"
```

You can find the project your gcloud CLI is currently configured to use with this command:

```bash
gcloud config get-value project
```

The region defaults to `us-central1`; you can find the full list of regions at https://cloud.google.com/compute/docs/regions-zones.
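
If you prefer to look the regions up from the CLI instead, gcloud can list them (this requires the Compute Engine API to be enabled in your project):

```bash
# List all available GCP regions
gcloud compute regions list
```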

Initialize the Terraform environment:

```bash
terraform init
```
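
Optionally, you can sanity-check the configuration before planning; both commands below are standard Terraform subcommands and make no changes to your cloud resources:

```bash
# Verify the configuration is syntactically valid and internally consistent
terraform validate
# Check formatting of the .tf files (reports files that would be reformatted)
terraform fmt -check
```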

## GKE cluster

By default, a 1-node cluster is created, which is suitable for running the OPEA application, and the node pool can autoscale up to `max_node_count = 5`. See `main.tf` if you want to tune the cluster properties, e.g., the number of nodes, instance types or disk size.
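
To find the relevant settings quickly, you can grep `main.tf`; the argument names in the pattern below follow the standard Terraform Google provider schema and are assumptions about how the file is written, so adjust them if your copy differs:

```bash
# Locate the node count, machine type and disk size settings in main.tf
grep -nE 'node_count|machine_type|disk_size' main.tf
```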

## Persistent Volume Claim

OPEA needs a volume in which to store the model, so we create a Kubernetes Persistent Volume Claim (PVC). OPEA requires the `ReadWriteMany` access mode, since multiple pods need access to the storage and they can run on different nodes. On GKE we also install a Storage Class suited to the cluster's `n4-standard-8` nodes, which use Hyperdisk Balanced storage. Each OPEA application below therefore uses the file `gke-fs-pvc.yaml` to create the Storage Class and PVC in its namespace.

## OPEA Applications

### ChatQnA

Use the commands below to create the GKE cluster:

```bash
terraform plan --var-file opea-chatqna.tfvars -out opea-chatqna.plan
terraform apply "opea-chatqna.plan"
```
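
Provisioning typically takes several minutes. Once `terraform apply` finishes, you can confirm the cluster exists from the gcloud CLI:

```bash
# The new cluster should appear with STATUS RUNNING
gcloud container clusters list
```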

Once the cluster is ready, update your kubectl config (replace `cluster_name` and `project_id` with the values from your `opea-chatqna.tfvars` file):

```bash
gcloud container clusters get-credentials "cluster_name"-gke --region us-central1 --project "project_id"
```

Now you should have access to the cluster via the `kubectl` command.
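
To verify that kubectl is pointing at the new cluster, list its nodes; with the defaults above you should see a single node initially:

```bash
kubectl get nodes
```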

Deploy the ChatQnA application with Helm (set `HFTOKEN` to your Hugging Face Hub API token first):

```bash
helm install -n chatqna --create-namespace chatqna oci://ghcr.io/opea-project/charts/chatqna --set service.type=LoadBalancer --set global.modelUsePVC=model-volume --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
```

Create the Storage Class and PVC as mentioned [above](#persistent-volume-claim):

```bash
kubectl apply -f gke-fs-pvc.yaml -n chatqna
```
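
You can check that the objects were created; note that depending on the Storage Class's volume binding mode, the PVC may stay `Pending` until the first pod that uses it is scheduled:

```bash
# Confirm the Storage Class and the model PVC exist
kubectl get storageclass
kubectl get pvc -n chatqna
```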

After a while, the OPEA application should be running. You can check the status via `kubectl`:

```bash
kubectl get pod -n chatqna
```
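
Downloading the model can take some time, so the pods may not become ready immediately. If you prefer to block until everything is up, `kubectl wait` can do that (the timeout below is an arbitrary example):

```bash
# Wait until all pods in the chatqna namespace report Ready
kubectl wait --for=condition=Ready pod --all -n chatqna --timeout=30m
```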

You can now start using the OPEA application. On GKE the load balancer exposes an external IP address, so look it up and query the ChatQnA endpoint:

```bash
OPEA_SERVICE=$(kubectl get svc -n chatqna chatqna -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl http://${OPEA_SERVICE}:8888/v1/chatqna \
  -H "Content-Type: application/json" \
  -d '{"messages": "What is the revenue of Nike in 2023?"}'
```
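
If `OPEA_SERVICE` comes back empty, the external load balancer is probably still being provisioned; you can watch the service until an external IP appears:

```bash
# EXTERNAL-IP shows <pending> until the load balancer is ready
kubectl get svc -n chatqna chatqna
```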

### Cleanup

Uninstall the application and delete the cluster with the following commands:

```bash
helm uninstall -n chatqna chatqna
terraform destroy -var-file opea-chatqna.tfvars
```