
Tips

A collection of tips for scaling ML training jobs, generalizing them for flexibility, and making them portable. Think of this as DevOps for ML training jobs. The tips cover how to:

  • run multiple tasks in parallel within your code
  • pass parameters to jobs from the command line and from input files
  • package training code
  • build custom containers with training code
  • deploy training code on Vertex AI Training to take advantage of scalable managed infrastructure at the job level

Using This Repository

  • Each notebook that defines a parameter as BUCKET = PROJECT_ID can be customized:
    • change this to BUCKET = PROJECT_ID + 'suffix' if a GCS bucket with the same name as the project already exists.
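For example, the bucket override is plain string concatenation (the project ID below is hypothetical):

```python
PROJECT_ID = "my-project"  # hypothetical project ID

# default: bucket named after the project
BUCKET = PROJECT_ID

# if a bucket with the project's name already exists, append a suffix
BUCKET = PROJECT_ID + "-suffix"
```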

Notes

  • aiplatform Python Client
    • All about the Vertex AI Python client: versions (aiplatform_v1 and aiplatform_v1beta) and layers (aiplatform and aiplatform.gapic), with deeper details and examples of using each.

Python: Notebooks on Skills For ML Training Jobs and Tasks

  • Python Multiprocessing
    • tips for executing multiple tasks at the same time
  • Python Job Parameters
    • tips for passing values to programs from the command line (argparse, docopt, click) or with files (JSON, YAML, pickle)
  • Python Client for GCS
    • tips for interacting with Google Cloud Storage (GCS) from Python and Vertex AI
  • Python Packages
    • prepare ML training code as files (modules), folders, packages, and distributions (source and built), and store them in custom repositories with Artifact Registry
  • Python Custom Containers
    • tips for building derivative containers with Cloud Build and Artifact Registry
  • Python Training
    • move training code out of a notebook and into Vertex AI Training Custom Jobs
    • demonstrates many workflows that directly use the code formats created in Python Packages and the custom container workflows created in Python Custom Containers

BigQuery: Notebooks on BigQuery Topics

  • New series will go here (see ToDo below)

Additional Tips

ToDo:

  • split this folder into subfolders
  • Python, BigQuery, KFP, ...
    • KFP Layers: components, tasks, artifacts, pipelines, IO
    • BQ Layers: project, dataset, table, rows, columns, cells + access, operations, ...
  • [IP] BigQuery Tips:
    • BigQuery - Python Clients
    • BigQuery - R
    • BigQuery - Data Types
    • BigQuery - Tables
    • BigQuery - UDF
    • BigQuery - Remote Functions
  • Add Git workflow tip - how to clone with a PAT (personal access token)
  • [DEV] add KFP tip, include component authoring