Releases: zenml-io/zenml
0.3.4
This release is a big design change and refactor. It involves a significant change in the Configuration file structure, meaning this is a breaking upgrade.
If you are upgrading from an older version of ZenML, please delete your old pipelines directory and .zenml folder and start afresh with zenml init.
If only working locally, this is as simple as:
cd zenml_enabled_repo
rm -rf pipelines/
rm -rf .zenml/
And then another ZenML init:
pip install --upgrade zenml
cd zenml_enabled_repo
zenml init
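If you would rather keep the old state than delete it outright, the cleanup can be done as a move instead. The Python sketch below is an illustration only and not part of ZenML: the function name archive_old_zenml_state and the zenml_backup_<timestamp> folder are hypothetical. It assumes it is run against the repo root before zenml init.

```python
import shutil
import time
from pathlib import Path


def archive_old_zenml_state(repo_root: str) -> list:
    """Move pipelines/ and .zenml/ into a timestamped backup folder.

    Hypothetical helper, not a ZenML API. Returns the new locations of
    whatever was moved (an empty list if neither folder existed).
    """
    root = Path(repo_root)
    backup = root / f"zenml_backup_{time.strftime('%Y%m%d_%H%M%S')}"
    moved = []
    for name in ("pipelines", ".zenml"):
        target = root / name
        if target.exists():
            # Create the backup folder lazily so nothing is left behind
            # when there is nothing to archive.
            backup.mkdir(exist_ok=True)
            shutil.move(str(target), str(backup / name))
            moved.append(backup / name)
    return moved


if __name__ == "__main__":
    for path in archive_old_zenml_state("."):
        print(f"archived to {path}")
```

Once you are happy with the fresh repo, the backup folder can be deleted by hand.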
New Features
- Introduced another higher-level pipeline: the NLPPipeline. This is a generic NLP pipeline for a text-datasource based training task. A full example of how to use the NLPPipeline can be found here.
- Introduced a BaseTokenizerStep as a simple mechanism to define how to train and encode using any generic tokenizer (again for NLP-based tasks).
- Introduced a new HuggingFace integration, with the first concrete implementation of the BaseTokenizerStep, i.e., the HuggingFaceTokenizer.
- Showcased how to use HuggingFace with the ZenML TrainerStep in the NLP Example.
Bug Fixes + Refactor
- Significant change to imports: imports are now much simpler and more user-friendly. For example, instead of:
from zenml.core.pipelines.training_pipeline import TrainingPipeline
a user can simply do:
from zenml.pipelines import TrainingPipeline
The caveat is of course that this might involve a re-write of older ZenML code imports.
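For larger codebases, the rewrite can be scripted. The following is a minimal sketch that handles only the single mapping shown above (the old TrainingPipeline import to the new one); the helper names rewrite_imports and rewrite_tree are hypothetical, and any other changed import paths would need to be added by hand after checking them against the new package layout.

```python
import re
from pathlib import Path

# Only the mapping named in these release notes is covered here.
# Lines that import additional names are left alone on purpose,
# because their new locations are not known from the notes.
OLD_IMPORT = re.compile(
    r"^([ \t]*)from[ \t]+zenml\.core\.pipelines\.training_pipeline"
    r"[ \t]+import[ \t]+TrainingPipeline[ \t]*$",
    re.MULTILINE,
)
NEW_IMPORT = r"\1from zenml.pipelines import TrainingPipeline"


def rewrite_imports(source: str) -> str:
    """Return source with the old TrainingPipeline import replaced."""
    return OLD_IMPORT.sub(NEW_IMPORT, source)


def rewrite_tree(root: str) -> list:
    """Rewrite every .py file under root in place; return changed paths."""
    changed = []
    for path in Path(root).rglob("*.py"):
        original = path.read_text(encoding="utf-8")
        updated = rewrite_imports(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling rewrite_tree(".") from the repo root edits files in place, so commit or back up your code first.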
Note: Future releases are also expected to be breaking. Until announced, please expect that upgrading ZenML versions may cause older-ZenML generated pipelines to behave unexpectedly.
Special shout-out to @nicholasmaiot for major contributions to this release!
0.3.3
This release is a significant one as it includes the first version of the AWS integration. It allows you to use ZenML to launch an EC2 instance as an orchestrator and execute a ZenML pipeline possibly coupled with an S3 artifact store and RDS metadata store.
It is a new feature and it does not include any breaking changes.
To install ZenML with the AWS integration, run:
pip install --upgrade "zenml[aws]"
zenml init
New Features
- OrchestratorAWSBackend implemented to launch an EC2 instance as the orchestrator.
- While using the new orchestrator backend, you may also use an S3 artifact store and an RDS metadata store.
- Implemented an example which covers the basic process if you would like to start testing it right away.
Bug Fixes + Refactor
- For more advanced use-cases, more examples will follow in the future.
- Numerous small bugs and refinements.
0.3.2
Earlier release to get the PostgreSQL datasource out quicker.
To upgrade:
pip install --upgrade zenml
New Features
- scikit-learn example.
- PostgreSQL Datasource added.
Bug Fixes + Refactor
- Slight change to telemetry utils: opting out now also sends a signal.
0.3.1
This release is a big design change and refactor. It involves a significant change in the Configuration file structure, meaning this is a breaking upgrade. If you are upgrading from 0.2.0, please delete your old pipelines directory and .zenml folder and start afresh with zenml init.
If only working locally, this is as simple as:
cd zenml_enabled_repo
rm -rf pipelines/
rm -rf .zenml/
And then another init:
pip install --upgrade zenml
zenml init
New Features
- BatchInferencePipeline added for offline batch inference use-cases.
- Google Cloud Platform Bootstrapping Terraform script added for one-command bootstrapping of ZenML on GCP.
- DeployPipeline added to deploy a pipeline directly without having to create a TrainingPipeline.
Bug Fixes + Refactor
- Now you can run pipelines from within any subdirectory in the repo.
- Relaxed restriction on custom steps having sub-directories with their module.
- Relationship between Datasource and Data Step refined.
- Numerous small bugs and refinements to facilitate flexible API design.
Note: Future releases are also expected to be breaking. Until announced, please expect that upgrading ZenML versions may cause older-ZenML generated pipelines to behave unexpectedly.
0.2.0
This new release is a major one. It's the first to introduce our new integrations system, which is meant to be used to extend ZenML with various other ML/MLOps libraries easily. The first big advantage one gets is 🚀 PyTorch Support 🚀!
pip install --upgrade zenml
And to enable the PyTorch extension:
pip install "zenml[pytorch]"
New Features
- Introduced integrations for ZenML using the setuptools extras_require paradigm.
- Added PyTorchTrainer support with an easily extendable TorchBaseTrainer example.
- Restructured trainer steps to be more intuitive to extend for TensorFlow and PyTorch. There is now a TrainerStep, followed by TFBaseTrainerStep and TorchBaseTrainerStep.
- The input_fn of the TorchTrainer has been implemented so that it can ingest from a TFRecords file. This makes ZenML one of the few projects out there with native support for ingesting the TFRecords format into PyTorch directly.
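Because integrations are installed as optional extras, their dependencies may be absent at runtime. As a rough sketch of how calling code could check for an optional dependency before importing integration-specific classes, the snippet below uses the standard library's importlib.util.find_spec. The helper name is_integration_available is hypothetical and is not a ZenML API; the module name "torch" is the import name of PyTorch.

```python
import importlib.util


def is_integration_available(module_name: str) -> bool:
    """Return True if module_name can be imported in this environment.

    Hypothetical helper for illustration; it only looks the module up
    and does not import it.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names whose parent package is missing,
        # or for malformed module names.
        return False


if __name__ == "__main__":
    if is_integration_available("torch"):
        print("PyTorch found: the PyTorch trainer steps can be used.")
    else:
        print('PyTorch missing: run pip install "zenml[pytorch]" first.')
```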
Bug Fixes
- Fixed an issue with Repository.get_zenml_dir() that caused any pipeline created below root level to fail on creation.
Documentation Announcement
The docs are almost complete! We are at 80% completion. Keep an eye out as we update them with more details on how to use and extend ZenML, and let us know via Slack if something is missing!
0.1.5
New Features
- Added Kubernetes Orchestrator to run pipelines on a Kubernetes cluster.
- Added timeseries support with StandardSequencerStep.
- Added more CLI groups such as step, datasource and pipelines. E.g. zenml pipeline list gives a list of pipelines in the current repo.
- Completed a significant portion of the Docs.
- Refactored Step Interfaces for easier integrations into other libraries.
- Added a GAN Example to showcase ImageDatasource.
- Set up base for more Trainer Interfaces like PyTorch, scikit etc.
- Added ability to see historical steps.
Bug Fixes
- All files except YAML files picked up while parsing pipelines_dir, in reference to concerns raised in #13.
Upcoming changes
- Next release will be a major one and will involve refactoring of design decisions that might cause backward incompatible changes to existing ZenML repos.
0.1.4
New Features
- Ability to add a custom image to Dataflow ProcessingBackend.
Bug Fixes
- Fixed requirements.txt and setup.py to enable local build.
- Pip package should install without any requirement conflicts now.
- Added custom docs made by Jupyter book in the docs/book folder.
0.1.3
New Features
- Launch GCP preemptible VM instances to orchestrate pipelines with OrchestratorGCPBackend. See full example here.
- Train using Google Cloud AI Platform with SingleGPUTrainingGCAIPBackend. See full example here.
- Use Dataflow for distributed preprocessing. See full example here.
- Run pipelines locally with SQLite Metadata Store, local Artifact Store, and local Pipelines Directory.
- Native Git integration: All steps are pinned with the Git SHA of the code at the time the pipeline they are used in is run. See details here.
- All pipelines run are reproducible with a unique combination of the Metadata Store, Artifact Store and the Pipelines Directory.
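To illustrate the idea behind Git pinning, here is a minimal sketch of reading the current commit SHA of a working tree. It is not ZenML's implementation: the function name current_git_sha is hypothetical, and it assumes the git executable is on the PATH and shells out to git rev-parse HEAD.

```python
import subprocess
from typing import Optional


def current_git_sha(repo_dir: str = ".") -> Optional[str]:
    """Return the commit SHA checked out in repo_dir, or None.

    None is returned when repo_dir is not inside a Git repository with
    at least one commit, or when git is not installed.
    """
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    sha = current_git_sha()
    print(f"step code pinned at: {sha}" if sha else "not in a Git repo")
```

Storing such a SHA next to each step's configuration is one way a later run could check out exactly the code that produced an earlier result.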
Bug Fixes
- Metadata Store and Artifact Store specified in pipelines disassociated from default .zenml_config file.
- Fixed typo in default docker images constants.