Common solutions and tools developed by Google Cloud's Professional Services team.
The examples folder contains example solutions across a variety of Google Cloud Platform products. Use these solutions as a reference for your own or extend them to fit your particular use case.
- Audio Content Profiling - A tool that builds a pipeline to scale the process of moderating audio files for inappropriate content using machine learning APIs.
- BigQuery Audit Log Dashboard - Solution to help audit BigQuery usage using Data Studio for visualization and a sample SQL script to query the back-end data source consisting of audit logs.
- BigQuery Billing Dashboard - Solution to help displaying billing info using Data Studio for visualization and a sample SQL script to query the back-end billing export table in BigQuery.
- BigQuery Cross Project Slot Monitoring - Solution to help monitoring slot utilization across multiple projects, while breaking down allocation per project.
- BigQuery Group Sync For Row Level Access - Sample code to synchronize group membership from G Suite/Cloud Identity into BigQuery and join that with your data to control access at row level.
- BigQuery Pipeline Utility - Python utility class for defining data pipelines in BigQuery.
- Bigtable Dataflow Cryptocurrencies Exchange RealTime Example - Apache Beam example that reads from the Crypto Exchanges WebSocket API as Google Cloud Dataflow pipeline and saves the feed in Google Cloud Bigtable. Real time visualization and query examples from GCP Bigtable running on Flask server are included.
- Bigtable Dataflow Update Table Key Pipeline - Dataflow pipeline with an example of how to update the key of an existing table. It works with any table, regardless the schema. It shows how to update your key for a table with existing data, to try out different alternatives to improve performance.
- Cloud Composer Examples - Examples of using Cloud Composer, GCP's managed Apache Airflow service.
- Cloud Composer CI/CD - Examples of using Cloud Build to deploy airflow DAGs to Cloud Composer.
- Cloud Function VM Delete Event Handler Example - Solution to automatically delete A records in Cloud DNS when a VM is deleted. This solution implements a Google Cloud Function Background Function triggered on
compute.instances.delete
events published through Stackdriver Logs Export. - Cloud SQL Custom Metric - An example of creating a Stackdriver custom metric monitoring Cloud SQL Private Services IP consumption.
- CloudML Bank Marketing - Notebook for creating a classification model for marketing using CloudML.
- CloudML Bee Health Detection - Detect if a bee is unhealthy based on an image of it and its subspecies.
- CloudML Churn Prediction - Predict users' propensity to churn using Survival Analysis.
- CloudML Deep Collaborative Filtering - Recommend songs given either a user or song.
- CloudML Energy Price Forecasting - Predicting the future energy price based on historical price and weather.
- CloudML Fraud Detection - Fraud detection model for credit-cards transactions.
- CloudML Sentiment Analysis - Sentiment analysis for movie reviews using TensorFlow
RNNEstimator
. - CloudML Scikit-learn Pipeline - This is a example for building a scikit-learn-based machine learning pipeline trainer that can be run on AI Platform. The pipeline can be trained locally or remotely on AI platform. The trained model can be further deployed on AI platform to serve online traffic.
- CloudML TensorFlow Profiling - TensorFlow profiling examples for training models with CloudML
- Data Generator - Generate random data with a custom schema at scale for integration tests or demos.
- Dataflow BigQuery Transpose Example - An example pipeline to transpose/pivot/rotate a BigQuery table.
- Dataflow Elasticsearch Indexer - An example pipeline that demonstrates the process of reading JSON documents from Cloud Pub/Sub, enhancing the document using metadata stored in Cloud Bigtable and indexing those documents into Elasticsearch.
- Dataflow Python Examples - Various ETL examples using the Dataflow Python SDK.
- Dataflow Scala Example: Kafka2Avro - Example to read objects from Kafka, and persist them encoded in Avro in Google Cloud Storage, using Dataflow with SCIO.
- Dataflow Streaming Benchmark - Utility to publish randomized fake JSON messages to a Cloud Pub/Sub topic at a configured QPS.
- Dataflow Template Pipelines - Pre-implemented Dataflow template pipelines for solving common data tasks on Google Cloud Platform.
- Dataproc Persistent History Server for Ephemeral Clusters - Example of writing logs from an ephemeral cluster to GCS and using a separate single node cluster to look at Spark and YARN History UIs.
- DLP API Examples - Examples of the DLP API usage.
- GCE Access to Google AdminSDK - Example to help manage access to Google's AdminSDK using GCE's service account identity
- Home Appliance Status Monitoring from Smart Power Readings - An end-to-end demo system featuring a suite of Google Cloud Platform products such as IoT Core, ML Engine, BigQuery, etc.
- IoT Nirvana - An end-to-end Internet of Things architecture running on Google Cloud Platform.
- Kubeflow Pipelines Sentiment Analysis - Create a Kubeflow Pipelines component and pipelines to analyze sentiment for New York Times front page headlines using Cloud Dataflow (Apache Beam Java) and Cloud Natural Language API.
- Kubeflow Fairing Example - Provided three notebooks to demonstrate the usage of Kubeflow Faring to train machine learning jobs (Scikit-Learn, XGBoost, Tensorflow) locally or in the Cloud (AI platform training or Kubeflow cluster).
- Python CI/CD with Cloud Builder and CSR - Example that uses Cloud Builder and Cloud Source Repositories to automate testing and linting.
- Pub/Sub Client Batching Example - Batching in Pub/Sub's Java client API.
- QAOA - Examples of parsing a max-SAT problem in a proprietary format.
- Redis Cluster on GKE Example - Deploying Redis cluster on GKE.
- Spinnaker - Example pipelines for a Canary / Production deployment process.
- Uploading files directly to Google Cloud Storage by using Signed URL - Example architecture to enable uploading files directly to GCS by using Signed URL.
The tools folder contains ready-made utilities which can simpilfy Google Cloud Platform usage.
- Agile Machine Learning API - A web application which provides the ability to train and deploy ML models on Google Cloud Machine Learning Engine, and visualize the predicted results using LIME through simple post request.
- Apache Beam Client Throttling - A library that can be used to limit the number of requests from an Apache Beam pipeline to an external service. It buffers requests to not overload the external service and activates client-side throttling when the service starts rejecting requests due to out of quota errors.
- AssetInventory - Import Cloud Asset Inventory resourcs into BigQuery.
- BigQuery Discount Per-Project Attribution - A tool that automates the generation of a BigQuery table that uses existing exported billing data, by attributing both CUD and SUD charges on a per-project basis.
- BigQuery Query Plan Exporter - Command line utility for exporting BigQuery query plans in a given date range.
- BigQuery Query Plan Visualizer - A web application which provides the ability to visualise the execution stages of BigQuery query plans to aid in the optimization of queries.
- BigQuery z/OS Mainframe Connector - A utility used to load COBOL MVS data sets into BigQuery and execute query and load jobs from the IBM z/OS Mainframe.
- CloudConnect - A package that automates the setup of dual VPN tunnels between AWS and GCP.
- Cloudera Parcel GCS Connector - This script helps you create a Cloudera parcel that includes Google Cloud Storage connector. The parcel can be deployed on a Cloudera managed cluster. This script helps you create a Cloudera parcel that includes Google Cloud Storage connector. The parcel can be deployed on a Cloudera managed cluster.
- Cloud AI Vision Utilities - This is an installable Python package that provides support tools for Cloud AI Vision. Currently there are a few scripts for generating an AutoML Vision dataset CSV file from either raw images or image annotation files in PASCAL VOC format.
- DNS Sync - Sync a Cloud DNS zone with GCE resources. Instances and load balancers are added to the cloud DNS zone as they start from compute_engine_activity log events sent from a pub/sub push subscription. Can sync multiple projects to a single Cloud DNS zone.
- GCE Disk Encryption Converter - A tool that converts disks attached to a GCE VM instnace from Google-managed keys to a customer-managed key stored in Cloud KMS.
- GCE Quota Sync - A tool that fetches resource quota usage from the GCE API and synchronizes it to Stackdriver as a custom metric, where it can be used to define automated alerts.
- GCE Usage Log - Collect GCE instance events into a BigQuery dataset, surfacing your vCPUs, RAM, and Persistent Disk, sliced by project, zone, and labels.
- GCP Architecture Visualizer - A tool that takes CSV output from a Forseti Inventory scan and draws out a dynamic hierarchical tree diagram of org -> folders -> projects -> gcp_resources using the D3.js javascript library.
- GCP Organization Hierarchy Viewer - A CLI utility for visualizing your organization hierarchy in the terminal.
- GCS Bucket Mover - A tool to move user's bucket, including objects, metadata, and ACL, from one project to another.
- GCS Usage Recommender - A tool that generates bucket-level intelligence and access patterns across all projects for a GCP project to generate recommended object lifecycle management.
- GKE Billing Export - Google Kubernetes Engine fine grained billing export.
- GSuite Exporter - A Python package that automates syncing Admin SDK APIs activity reports to a GCP destination. The module takes entries from the chosen Admin SDK API, converts them into the appropriate format for the destination, and exports them to a destination (e.g: Stackdriver Logging).
- Hive to BigQuery - A Python framework to migrate Hive table to BigQuery using Cloud SQL to keep track of the migration progress.
- LabelMaker - A tool that reads key:value pairs from a json file and labels the running instance and all attached drives accordingly.
- Machine Learning Auto Exploratory Data Analysis and Feature Recommendation - A tool to perform comprehensive auto EDA, based on which feature recommendations are made, and a summary report will be generated.
- Maven Archetype Dataflow - A maven archetype which bootstraps a Dataflow project with common plugins pre-configured to help maintain high code quality.
- Netblock Monitor - An Apps Script project that will automatically provide email notifications when changes are made to Google’s IP ranges.
- Site Verification Group Sync - A tool to provision "verified owner" permissions (to create GCS buckets with custom dns) based on membership of a Google Group.
- SLO Generator - A Python package that automates computation of Service Level Objectives, Error Budgets and Burn Rates on GCP, and export the computation results to available exporters (e.g: PubSub, BigQuery, Stackdriver Monitoring), using policies written in JSON format.
- Snowflake_to_BQ - A shell script to transfer tables (schema & data) from Snowflake to BigQuery.
See the contributing instructions to get started contributing.
All solutions within this repository are provided under the Apache 2.0 license. Please see the LICENSE file for more detailed terms and conditions.
This repository and its contents are not an official Google Product.
Questions, issues, and comments should be directed to [email protected].