Data Science Pipeline with ML DSL

ML-DSL is an open source machine learning library that simplifies how data specialists interact with cloud platforms such as Amazon Web Services (AWS) and Google Cloud Platform. It lets data scientists and data analysts configure and execute ML/DS pipelines.

ML-DSL is the property of Grid Dynamics International. It uses Amazon services, including S3, EMR, and SageMaker, and Google services such as Cloud Storage, Cloud Dataproc, and Cloud AI.

The following features are available:

Configuring and executing Spark jobs for data processing on Google Dataproc and Amazon EMR
Configuring and executing ML/DS pipelines for model training and deployment on Google AI Platform and Amazon SageMaker using the ml-dsl API
Configuring and executing ML/DS pipelines for data processing, model training, and model deployment using Jupyter Notebook magic functions
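The idea behind the features above, describing a job once and running it on either cloud, can be sketched in plain Python. This sketch does not use the ml-dsl API: `CloudTarget`, `SparkJobSpec`, and `build_submit_request` are hypothetical names invented for illustration, the request dictionaries are not real Dataproc or EMR payloads, and nothing is sent to any cloud service. See the User Guide and API Reference pages for the library's actual classes and calls.

```python
from dataclasses import dataclass, field
from enum import Enum


class CloudTarget(Enum):
    """Hypothetical enum naming the two clouds mentioned on this page."""
    GCP = "gcp"
    AWS = "aws"


@dataclass
class SparkJobSpec:
    """Hypothetical, platform-neutral description of a Spark job."""
    job_name: str
    script_path: str
    cluster: str
    args: dict = field(default_factory=dict)


def build_submit_request(spec: SparkJobSpec, target: CloudTarget) -> dict:
    """Map the neutral spec onto an illustrative, platform-flavoured dict.

    The returned dict only shows the shape of the mapping; it is not a
    payload accepted by Dataproc or EMR.
    """
    # Render keyword arguments as sorted "--key=value" strings.
    arg_list = [f"--{key}={value}" for key, value in sorted(spec.args.items())]
    if target is CloudTarget.GCP:
        service = "Dataproc"
    elif target is CloudTarget.AWS:
        service = "EMR"
    else:
        raise ValueError(f"unsupported target: {target}")
    return {
        "service": service,
        "cluster": spec.cluster,
        "job": spec.job_name,
        "script": spec.script_path,
        "args": arg_list,
    }


if __name__ == "__main__":
    spec = SparkJobSpec(
        job_name="wordcount",
        script_path="jobs/wordcount.py",
        cluster="demo-cluster",
        args={"input": "data.txt", "output": "counts"},
    )
    # The same spec is reused for both targets; only the service differs.
    for target in CloudTarget:
        print(build_submit_request(spec, target))
```

Keeping the job description separate from the target platform is the design choice this sketch is meant to show: the same specification is reused unchanged when the target switches from Google Dataproc to Amazon EMR.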

An example Jupyter notebook that uses ml-dsl on Google Cloud Platform is provided for your convenience.

An example Jupyter notebook that uses ml-dsl on Amazon is provided for your convenience.

ML-DSL User Guide

INSTALLING ML-DSL

Running spark jobs

Google Dataproc

Amazon EMR

Getting logs

Google Dataproc

Upgrading spark jobs

Google Dataproc

Train models

Google AIPlatform

Amazon SageMaker

Deployment models

Google AIPlatform

Amazon SageMaker

Getting predictions

Google AIPlatform

ML-DSL API Reference

ExecMagics

Executors

Helpers

Jobs

Profiles

Arguments

Artifact

ComponentType

Model

ModelBuilder

Platform

PyScript

ScriptState
