feat(examples): Vertex Machine Learning Pipeline #66

Merged

renato-rudnicki
Contributor

This PR creates a Machine Learning Pipeline example with the following:

  • Refactors the previous machine learning pipeline example
  • Fixes Cloud Build service account breaking issues
  • Adds the service accounts, permissions, and VPC-SC rules necessary to run the example
  • Fixes the documentation

Contributor

caetano-colin left a comment

Tested in pair with Renato - LGTM

validate_bootstrap_step_external_repo
validate_bootstrap_step_external_repo
elif [[ "$TERRAFORM_LOCAL" == "true" ]]; then
validate_bootstrap_terraform_local

Suggested change
validate_bootstrap_terraform_local
validate_bootstrap_terraform_local
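
For context, the branch touched by this suggestion can be sketched as follows. This is a minimal sketch, not the repository's script: the function bodies are placeholders, `EXTERNAL_REPO` and `run_bootstrap_validation` are hypothetical names, and only `TERRAFORM_LOCAL` and the two `validate_bootstrap_*` function names come from the diff.

```shell
# Placeholder bodies: the real checks live in the repository's validation script.
validate_bootstrap_step_external_repo() {
  echo "validating bootstrap (external repo)"
}

validate_bootstrap_terraform_local() {
  echo "validating bootstrap (local Terraform)"
}

# Choose one validation path from two flags.
# EXTERNAL_REPO is a hypothetical flag name; TERRAFORM_LOCAL appears in the diff.
run_bootstrap_validation() {
  if [ "${EXTERNAL_REPO:-false}" = "true" ]; then
    validate_bootstrap_step_external_repo
  elif [ "${TERRAFORM_LOCAL:-false}" = "true" ]; then
    validate_bootstrap_terraform_local
  else
    echo "no bootstrap validation selected"
  fi
}
```

With `TERRAFORM_LOCAL=true` set and the other flag unset, `run_bootstrap_validation` takes the `elif` branch and calls `validate_bootstrap_terraform_local`.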

Contributor

caetano-colin left a comment

Overall LGTM. I am only suggesting some changes to the documentation.

file.
This example demonstrates the process of interactive coding and experimentation using the Google Vertex AI Workbench for data scientists. The guide outlines the creation of a machine learning (ML) pipeline within a notebook on a Google Vertex AI Workbench Instance.

This environment is set up for interactive coding and experimentations. After the project is up, the vertex workbench is deployed from service catalog and the datascientis can use it to write their code including any experiments, data processing code and pipeline components. In addition, a cloud storage bucket is deployed to use as the storage for our operations. Optionally a composer environment is which will later be used to schedule the pipeline run on intervals.
Contributor

Suggested change
This environment is set up for interactive coding and experimentations. After the project is up, the vertex workbench is deployed from service catalog and the datascientis can use it to write their code including any experiments, data processing code and pipeline components. In addition, a cloud storage bucket is deployed to use as the storage for our operations. Optionally a composer environment is which will later be used to schedule the pipeline run on intervals.
This environment is set up for interactive coding and experimentations. After the project is up, the vertex workbench will be deployed from the base environment module (`/modules/base_env/main.tf`) and the data scientists can use it to write their data processing code and pipeline components. In addition, a cloud storage bucket should be deployed to use as the storage for our operations. Optionally, a composer environment can be set up to schedule the pipeline run on intervals.

- The model is trained and deployed using the census income dataset.
- Deployment and monitoring occur in the production environment.
- A/B Testing:
- After successful pipeline runs, a new model version is deployed for A/B testing.

## Purpose
Contributor

Remove the Purpose section or correct its text.

@@ -420,519 +672,1534 @@ Run `terraform output cloudbuild_project_id` in the `0-bootstrap` folder to get
cd ..
```
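
The command named in the hunk header, `terraform output cloudbuild_project_id`, can also be wrapped so that later steps reuse the value. The following is a minimal sketch under stated assumptions: the function name `get_cloudbuild_project_id` and the variable `CLOUDBUILD_PROJECT_ID` are hypothetical, and it relies on Terraform's `-chdir` global option and the `-raw` flag of `terraform output`, both available in Terraform 0.14 and later.

```shell
# Print the Cloud Build project ID from the bootstrap step's Terraform outputs.
# $1 is the path to the 0-bootstrap folder (defaults to ./0-bootstrap).
get_cloudbuild_project_id() {
  terraform -chdir="${1:-0-bootstrap}" output -raw cloudbuild_project_id
}

# Example use (hypothetical variable name, shown commented out):
# CLOUDBUILD_PROJECT_ID="$(get_cloudbuild_project_id ../0-bootstrap)"
```

Using `-raw` avoids the surrounding quotes that the default output format adds to string values, so the result can be assigned to a variable directly.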

## Running Terraform locally
### VPC-SC - Infrastructure Deployment with Local Terraform - Only proceed with these if you have not used Cloud Build
Contributor

Move the section "VPC-SC - Infrastructure Deployment with Local Terraform - Only proceed with these if you have not used Cloud Build" so that it becomes a subsection of Prerequisites, after the VPC-SC instructions for Cloud Build.

Contributor

Also remove the duplicated "Usage" section that is right above it.

Contributor Author

I created this issue as an improvement. I believe having two separate READMEs will keep the docs more organized.

sleighton2022 merged commit ba52535 into GoogleCloudPlatform:main on Oct 8, 2024
2 checks passed