feat(examples): Vertex Machine Learning Pipeline (#66)
* refacto ml-vertexpipeline

* fix empty spaces

* fix conflict

* fix lint

* Add additional instructions before run Notebooks

* Add additional instructions before run Notebooks

* change data to use variable as input

* add new variables values for data filters

* fixes for envs and iam roles

* add missing variables

* README update

* update with tests

* update tests

* kfp==2.7.0

* add vertex_model_sa as prod_sa

* updating PR

* add vpc-sc rules

* update

* update readme

* update vpc-sc rules

* small fixes for lint and documentation

* update README

* update README

* Update README for Github App ID and more details about Develop, Non-Production and Production environments

* Add missing logging project at vpc-sc directional rule

* Set github_app_installation_id and github_remote_uri value as empty

* fix github_app_installation_id format

* fix for_each for artifact_registry_iam_member

* fix lint

* add terraform init for 1-org

* fix for_each for google_storage_bucket_iam_member

* Fixes for machine-learning-pipeline/README.md

* fix for Github_app_id

* READMEs update

* Update README

* add changes

* bump project-factory version

* add note about bash terminal

* bump project-factory version

* Note about inconsistent final plan

* fix project-factory bump version

* remove hardcode data

* Fix and improvements for Machine Learning Example

* Automated replacement of placeholders

* add disclaimers

* rewriting

* rewrite

* Path fix

* Update for deploy with terraform local and cloudbuild sections

* Fix indentation.

* update steps to add SA in the service perimeter

* perma-diff in provider causes Cloud Functions in 1-org to always fail

* fix command path

* Revert "fix command path"

This reverts commit ea02006.

* add step to unset billing/quota_project

* fix path for BQ commands

* fix placeholders for census_pipeline.ipynb

* fix path for terraform local deploy

* update notebook dependencies

* update placeholders

* fix typo

* fix placeholders for compile_pipeline

* add detail about https in the clone repo step for Vertex

* update docker image

* update juniper notebooks

* fix conflict

* Update steps from machine-learning-pipeline example

* add README

* update README for machine learning example

---------

Co-authored-by: caetano-colin <[email protected]>
renato-rudnicki and caetano-colin authored Oct 8, 2024
1 parent aea1dd9 commit ba52535
Showing 74 changed files with 3,419 additions and 1,212 deletions.
5 changes: 4 additions & 1 deletion 0-bootstrap/README.md
@@ -63,7 +63,10 @@ To run the commands described in this document, install the following:
- [Terraform](https://www.terraform.io/downloads.html) version 1.5.7
- [jq](https://jqlang.github.io/jq/download/) version 1.6.0 or later

**Note:** Make sure that you use version 1.5.7 of Terraform throughout this series. Otherwise, you might experience Terraform state snapshot lock errors.
**Notes:**

- Make sure that you use version 1.5.7 of Terraform throughout this series. Otherwise, you might experience Terraform state snapshot lock errors.
- It is recommended to use a Bash terminal to deploy the code from this repository. Using other terminals might cause unexpected behaviours.
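
The version requirement in the notes above can be checked before starting. The following is a minimal sketch, assuming `terraform` and `jq` are on the `PATH`; `require_version` is a hypothetical helper name, not part of this repository:

```bash
# Hypothetical helper: succeed only when the actual version equals the required one.
require_version() {
  required="$1"
  actual="$2"
  if [ "$actual" = "$required" ]; then
    echo "OK: Terraform ${actual}"
  else
    echo "Expected Terraform ${required}, found ${actual:-none}" >&2
    return 1
  fi
}

# Usage (needs terraform and jq installed):
#   require_version "1.5.7" "$(terraform version -json | jq -r '.terraform_version')"
```

Taking the version string as an argument keeps the comparison separate from the lookup, so the check can be exercised without Terraform installed.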

Also make sure that you've done the following:

2 changes: 2 additions & 0 deletions 1-org/README.md
@@ -192,6 +192,8 @@ If required, run `terraform output cloudbuild_project_id` in the `0-bootstrap` f
```bash
git checkout -b production
git push origin production

cd ..
```

1. Proceed to the [2-environments](../2-environments/README.md) step.
3 changes: 2 additions & 1 deletion 1-org/envs/shared/ml_key_rings.tf
@@ -21,7 +21,8 @@ module "kms_keyring" {
keyring_admins = [
"serviceAccount:${local.projects_step_terraform_service_account_email}"
]
project_id = module.common_kms.project_id

project_id = module.org_kms.project_id
keyring_regions = var.keyring_regions
keyring_name = var.keyring_name
}
18 changes: 9 additions & 9 deletions 1-org/envs/shared/projects.tf
@@ -38,7 +38,7 @@ locals {

module "org_audit_logs" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
@@ -66,7 +66,7 @@ module "org_audit_logs" {

module "org_billing_logs" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
@@ -98,7 +98,7 @@ module "org_billing_logs" {

module "org_kms" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
@@ -131,7 +131,7 @@ module "org_kms" {

module "org_secrets" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
@@ -163,7 +163,7 @@ module "org_secrets" {

module "interconnect" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
@@ -195,7 +195,7 @@ module "interconnect" {

module "scc_notifications" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
@@ -227,7 +227,7 @@ module "scc_notifications" {

module "dns_hub" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
@@ -267,7 +267,7 @@ module "dns_hub" {

module "base_network_hub" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"
count = var.enable_hub_and_spoke ? 1 : 0

random_project_id = true
@@ -316,7 +316,7 @@ resource "google_project_iam_member" "network_sa_base" {

module "restricted_network_hub" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"
count = var.enable_hub_and_spoke ? 1 : 0

random_project_id = true
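
The hunks above move each `project-factory` module from `~> 14.0` to `~> 15.0`. Terraform's `~>` operator lets only the rightmost listed component increase, so `~> 15.0` accepts 15.x releases and rejects 16.0, and `~> 14.0` can never select a 15.x release. A minimal Bash sketch of that rule, assuming two-part constraints and purely numeric versions; `satisfies_pessimistic` is a hypothetical helper and the version numbers are illustrative:

```bash
# Hypothetical helper: does VERSION satisfy "~> MAJOR.MINOR"?
# Expects a two-part constraint (e.g. 15.0) and a version with at least MAJOR.MINOR.
satisfies_pessimistic() {
  constraint="$1"                 # e.g. 15.0
  version="$2"                    # e.g. 15.1.0
  c_major="${constraint%%.*}"     # constraint major component
  c_minor="${constraint#*.}"      # constraint minor component
  v_major="${version%%.*}"        # version major component
  v_rest="${version#*.}"          # everything after the first dot
  v_minor="${v_rest%%.*}"         # version minor component
  # Same major, and minor at least as large as the constraint's minor.
  [ "$v_major" -eq "$c_major" ] && [ "$v_minor" -ge "$c_minor" ]
}

satisfies_pessimistic 15.0 15.1.0 && echo "15.1.0 matches ~> 15.0"
satisfies_pessimistic 14.0 15.1.0 || echo "15.1.0 does not match ~> 14.0"
```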
6 changes: 3 additions & 3 deletions 1-org/modules/cai-monitoring/main.tf
@@ -71,7 +71,7 @@ data "archive_file" "function_source_zip" {

module "cloudfunction_source_bucket" {
source = "terraform-google-modules/cloud-storage/google//modules/simple_bucket"
version = "~>3.4"
version = "~>5.0"

project_id = var.project_id
name = "bkt-cai-monitoring-${random_id.suffix.hex}-sources-${data.google_project.project.number}-${var.location}"
@@ -121,7 +121,7 @@ resource "google_cloud_asset_organization_feed" "organization_feed" {

module "pubsub_cai_feed" {
source = "terraform-google-modules/pubsub/google"
version = "~> 5.0"
version = "~> 6.0"

topic = "top-cai-monitoring-${random_id.suffix.hex}-event"
project_id = var.project_id
@@ -142,7 +142,7 @@ resource "google_scc_source" "cai_monitoring" {
// Cloud Function
module "cloud_function" {
source = "GoogleCloudPlatform/cloud-functions/google"
version = "0.4.1"
version = "0.5"

function_name = "caiMonitoring"
description = "Check on the Organization for members (users, groups and service accounts) that contains the IAM roles listed."
4 changes: 2 additions & 2 deletions 1-org/modules/cai-monitoring/versions.tf
@@ -19,11 +19,11 @@ terraform {
required_providers {
google = {
source = "hashicorp/google"
version = ">= 3.77"
version = ">= 3.77, <=5.37"
}
google-beta = {
source = "hashicorp/google-beta"
version = ">= 3.77"
version = ">= 3.77, <=5.37"
}
random = {
source = "hashicorp/random"
4 changes: 2 additions & 2 deletions 1-org/modules/network/main.tf
@@ -20,7 +20,7 @@

module "base_shared_vpc_host_project" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
@@ -56,7 +56,7 @@ module "base_shared_vpc_host_project" {

module "restricted_shared_vpc_host_project" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
12 changes: 5 additions & 7 deletions 2-environments/README.md
@@ -171,7 +171,7 @@ Run `terraform output cloudbuild_project_id` in the `0-bootstrap` folder to get
git push origin production
```

### Read this before continuing further
### `N.B.` Read this before continuing further

A logging project will be created in every environment (`development`, `non-production`, `production`) when running this code. This project contains a storage bucket for the purposes of project logging within its respective environment. This requires the `cloud-storage-analytics@google.com` group permissions for the storage bucket. Since foundations has more restricted security measures, a domain restriction constraint is enforced. This constraint will prevent the Google `cloud-storage-analytics` group from being added to any permissions. In order for this Terraform code to execute without error, manual intervention must be made to ensure everything applies without issue.

@@ -196,7 +196,7 @@ You will be doing this procedure for each environment (`development`, `non-produ
Make sure your git is checked out to the development branch by running `git checkout development` on `GCP_ENVIRONMENTS_PATH`.

```bash
(cd $GCP_ENVIRONMENTS_PATH && git checkout development)
(cd $GCP_ENVIRONMENTS_PATH && git checkout development && ./tf-wrapper.sh init development)
```

2. Retrieve the bucket name and project id from terraform outputs.
@@ -244,7 +244,7 @@ You will be doing this procedure for each environment (`development`, `non-produ
Make sure your git is checked out to the `non-production` branch by running `git checkout non-production` on `GCP_ENVIRONMENTS_PATH`.

```bash
(cd $GCP_ENVIRONMENTS_PATH && git checkout non-production)
(cd $GCP_ENVIRONMENTS_PATH && git checkout non-production && ./tf-wrapper.sh init non-production)
```

2. Retrieve the bucket name and project id from terraform outputs.
@@ -292,7 +292,7 @@ You will be doing this procedure for each environment (`development`, `non-produ
Make sure your git is checked out to the `production` branch by running `git checkout production` on `GCP_ENVIRONMENTS_PATH`.

```bash
(cd $GCP_ENVIRONMENTS_PATH && git checkout production)
(cd $GCP_ENVIRONMENTS_PATH && git checkout production && ./tf-wrapper.sh init production)
```

2. Retrieve the bucket name and project id from terraform outputs.
@@ -405,7 +405,6 @@ To use the `validate` option of the `tf-wrapper.sh` script, please follow the [i
export GOOGLE_IMPERSONATE_SERVICE_ACCOUNT=$(terraform -chdir="../0-bootstrap/" output -raw environment_step_terraform_service_account_email)
echo ${GOOGLE_IMPERSONATE_SERVICE_ACCOUNT}
```
1. Ensure you [disable The Organization Policy](#read-this-before-continuing-further) on the `development` folder before continuing further.
1. Run `init` and `plan` and review output for environment development.
@@ -447,7 +446,6 @@ To use the `validate` option of the `tf-wrapper.sh` script, please follow the [i
```bash
./tf-wrapper.sh apply non-production
```
1. Ensure you [disable The Organization Policy](#read-this-before-continuing-further) on the `non-production` folder before continuing further.
1. Run `init` and `plan` and review output for environment production.
@@ -477,6 +475,6 @@ Before executing the next stages, unset the `GOOGLE_IMPERSONATE_SERVICE_ACCOUNT`
unset GOOGLE_IMPERSONATE_SERVICE_ACCOUNT
cd ../..
```
```
1. You can now move to the instructions in the network step. To use the [Dual Shared VPC](https://cloud.google.com/architecture/security-foundations/networking#vpcsharedvpc-id7-1-shared-vpc-) network mode go to [3-networks-dual-svpc](../3-networks-dual-svpc/README.md).
2 changes: 1 addition & 1 deletion 2-environments/modules/env_baseline/kms.tf
@@ -21,7 +21,7 @@

module "env_kms" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
2 changes: 1 addition & 1 deletion 2-environments/modules/env_baseline/ml_logging.tf
@@ -24,7 +24,7 @@ data "google_storage_project_service_account" "gcs_logging_account" {

module "env_logs" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
2 changes: 1 addition & 1 deletion 2-environments/modules/env_baseline/monitoring.tf
@@ -20,7 +20,7 @@

module "monitoring_project" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
2 changes: 1 addition & 1 deletion 2-environments/modules/env_baseline/secrets.tf
@@ -21,7 +21,7 @@

module "env_secrets" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4
2 changes: 2 additions & 0 deletions 3-networks-dual-svpc/README.md
@@ -417,6 +417,8 @@ Before executing the next stages, unset the `GOOGLE_IMPERSONATE_SERVICE_ACCOUNT`

```bash
unset GOOGLE_IMPERSONATE_SERVICE_ACCOUNT
cd ../..
```

1. You can now move to the instructions in the [4-projects](../4-projects/README.md) step.
5 changes: 3 additions & 2 deletions 5-app-infra/README.md
@@ -134,9 +134,9 @@ The Pipeline is connected to a Google Cloud Source Repository with a simple stru
└── tf2-gpu.2-13:0.1
└── Dockerfile
```
for the purposes of this example, the pipeline is configured to monitor the `main` branch of this repository.
For the purposes of this example, the pipeline is configured to monitor the `main` branch of this repository.

each folder under `images` has the full name and tag of the image that must be built. Once a change to the `main` branch is pushed, the pipeline will analyse which files have changed and build that image out and place it in the artifact repository. For example, if there is a change to the Dockerfile in the `tf2-cpu-13:0.1` folder, or if the folder itself has been renamed, it will build out an image and tag it based on the folder name that the Dockerfile has been housed in.
Each folder under `images` has the full name and tag of the image that must be built. Once a change to the `main` branch is pushed, the pipeline will analyse which files have changed and build that image out and place it in the artifact repository. For example, if there is a change to the Dockerfile in the `tf2-cpu-13:0.1` folder, or if the folder itself has been renamed, it will build out an image and tag it based on the folder name that the Dockerfile has been housed in.
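
The folder-to-image mapping described above can be sketched as follows. This is only an illustration, assuming the `images/<name:tag>/Dockerfile` layout; `image_ref_from_path` is a hypothetical helper, not the pipeline's actual build logic:

```bash
# Hypothetical helper: print the image name:tag implied by a changed file path.
# Assumes changed files live under images/<name:tag>/.
image_ref_from_path() {
  case "$1" in
    images/*/*)
      dir="${1#images/}"     # drop the leading "images/"
      echo "${dir%%/*}"      # keep only the first remaining path component
      ;;
    *)
      return 1               # not under images/<folder>/, nothing to build
      ;;
  esac
}

image_ref_from_path "images/tf2-cpu-13:0.1/Dockerfile"   # prints tf2-cpu-13:0.1
```

Deriving the reference from the folder name means a rename of the folder is enough to produce a differently tagged image, which matches the behaviour the paragraph describes.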

Once pushed, the pipeline build logs can be accessed by navigating to the artifacts project name created in step-4:

@@ -363,6 +363,7 @@ The pipeline also listens for changes made to `plan`, `development`, `non-produc
1. Update the `log_bucket` variable with the value of the `logs_export_storage_bucket_name`.

```bash
terraform -chdir="../gcp-org/envs/shared" init
export log_bucket=$(terraform -chdir="../gcp-org/envs/shared" output -raw logs_export_storage_bucket_name)
echo "log_bucket = ${log_bucket}"
sed -i "s/REPLACE_LOG_BUCKET/${log_bucket}/" ./common.auto.tfvars
1 change: 0 additions & 1 deletion 5-app-infra/modules/service_catalog/main.tf
@@ -57,7 +57,6 @@ resource "google_storage_bucket_iam_member" "bucket_role" {
role = "roles/storage.admin"
member = google_service_account.trigger_sa.member
}

resource "google_sourcerepo_repository_iam_member" "read" {
project = var.project_id
repository = var.name
@@ -20,9 +20,9 @@ remote_state_bucket = "REMOTE_STATE_BUCKET"

log_bucket = "REPLACE_LOG_BUCKET"

# github_ api_ token = "PUT IN TOKEN"
# github_ api_ token = "GITHUB_APP_TOKEN"

# github_app_installation_id = "18685983"
# github_app_installation_id = "GITHUB_APP_ID"

# github_remote_uri = "https://github.com/badal-io/ml-foundations-tf-modules.git"
# github_remote_uri = "GITHUB_REMOTE_URI"
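
Placeholders such as `REPLACE_LOG_BUCKET` in this tfvars file are meant to be substituted before applying, which the 5-app-infra README does with `sed -i`. A hedged sketch of that substitution with a check that no placeholder is left behind; `replace_placeholder`, the temporary file, and `example-log-bucket` are hypothetical, and the value is assumed not to contain `|` or `&`:

```bash
# Hypothetical helper: replace a placeholder token in a file, then confirm it is gone.
replace_placeholder() {
  file="$1"
  placeholder="$2"
  value="$3"
  # Write to a temporary file and move it back, which avoids relying on sed -i.
  sed "s|${placeholder}|${value}|g" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
  # Fail if the placeholder still appears anywhere in the file.
  ! grep -qF -- "$placeholder" "$file"
}

# Demonstration on a throwaway file rather than a real common.auto.tfvars.
demo_file="$(mktemp)"
printf 'log_bucket = "REPLACE_LOG_BUCKET"\n' > "$demo_file"
replace_placeholder "$demo_file" "REPLACE_LOG_BUCKET" "example-log-bucket"
cat "$demo_file"   # prints: log_bucket = "example-log-bucket"
```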

@@ -12,5 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#
FROM tensorflow/tensorflow:2.8.0
RUN pip install tensorflow-io==0.25.0 protobuf==3.20.0 google-cloud-bigquery==3.13.0 pandas==2.0.3 db-dtypes==1.2.0 google-cloud-aiplatform==1.36.0 google-cloud-storage==2.14.0 kfp google-cloud-pipeline-components
FROM python:3.10

RUN python3 -m pip install --no-cache-dir tensorflow-cpu==2.8.0
RUN pip install tensorflow-io==0.25.0 protobuf==3.20.3 google-cloud-bigquery==3.13.0 pandas==2.0.3 db-dtypes==1.2.0 google-cloud-aiplatform==1.36.0 google-cloud-storage==2.14.0 kfp google-cloud-pipeline-components numpy==1.26.4
@@ -21,7 +21,7 @@ The following table outlines which of the suggested controls for Vertex Generati
|------|-------------|------|---------|:--------:|
| airflow\_config\_overrides | Airflow configuration properties to override. Property keys contain the section and property names, separated by a hyphen, for example "core-dags\_are\_paused\_at\_creation". | `map(string)` | `{}` | no |
| env\_variables | Additional environment variables to provide to the Apache Airflow scheduler, worker, and webserver processes. Environment variable names must match the regular expression [a-zA-Z\_][a-zA-Z0-9\_]*. They cannot specify Apache Airflow software configuration overrides (they cannot match the regular expression AIRFLOW\_\_[A-Z0-9\_]+\_\_[A-Z0-9\_]+), and they cannot match any of the following reserved names: [AIRFLOW\_HOME,C\_FORCE\_ROOT,CONTAINER\_NAME,DAGS\_FOLDER,GCP\_PROJECT,GCS\_BUCKET,GKE\_CLUSTER\_NAME,SQL\_DATABASE,SQL\_INSTANCE,SQL\_PASSWORD,SQL\_PROJECT,SQL\_REGION,SQL\_USER]. | `map(any)` | `{}` | no |
| github\_app\_installation\_id | The app installation ID that was created when installing Google Cloud Build in GitHub: https://github.com/apps/google-cloud-build. | `number` | n/a | yes |
| github\_app\_installation\_id | The app installation ID that was created when installing Google Cloud Build in GitHub: https://github.com/apps/google-cloud-build. | `number` | `null` | no |
| github\_name\_prefix | A name for your GitHub connection to Cloud Build. | `string` | `"github-modules"` | no |
| github\_remote\_uri | URL of your GitHub repo. | `string` | n/a | yes |
| github\_secret\_name | Name of the GitHub secret to extract GitHub token info. | `string` | `"github-api-token"` | no |
@@ -110,6 +110,7 @@ variable "github_name_prefix" {
variable "github_app_installation_id" {
type = number
description = "The app installation ID that was created when installing Google Cloud Build in GitHub: https://github.com/apps/google-cloud-build."
default = null
}

variable "service_account_prefix" {
2 changes: 1 addition & 1 deletion docs/assets/terraform/2-environments/ml_logging.tf
@@ -24,7 +24,7 @@ data "google_storage_project_service_account" "gcs_logging_account" {

module "env_logs" {
source = "terraform-google-modules/project-factory/google"
version = "~> 14.0"
version = "~> 15.0"

random_project_id = true
random_project_id_length = 4