diff --git a/README.md b/README.md
index 121d827..0e8bb7c 100644
--- a/README.md
+++ b/README.md
@@ -1,21 +1,29 @@
-![dmlcloud logo](./misc/logo/dmlcloud_color.png)
+![Dmlcloud Logo](./misc/logo/dmlcloud_color.png)
 --------------
-[![](https://img.shields.io/pypi/v/dmlcloud)](https://pypi.org/project/dmlcloud/)
-[![](https://img.shields.io/github/actions/workflow/status/sehoffmann/dmlcloud/run_tests.yml?label=tests&logo=github)](https://github.com/sehoffmann/dmlcloud/actions/workflows/run_tests.yml)
-[![](https://img.shields.io/github/actions/workflow/status/sehoffmann/dmlcloud/run_linting.yml?label=lint&logo=github)](https://github.com/sehoffmann/dmlcloud/actions/workflows/run_linting.yml)
+[![PyPI Status](https://img.shields.io/pypi/v/dmlcloud)](https://pypi.org/project/dmlcloud/)
+[![Documentation Status](https://readthedocs.org/projects/dmlcloud/badge/?version=latest)](https://dmlcloud.readthedocs.io/en/latest/?badge=latest)
+[![Test Status](https://img.shields.io/github/actions/workflow/status/sehoffmann/dmlcloud/run_tests.yml?label=tests&logo=github)](https://github.com/sehoffmann/dmlcloud/actions/workflows/run_tests.yml)
 
-*Flexibel, easy-to-use, opinionated*
+A torch library for easy distributed deep learning on HPC clusters. Supports both Slurm and MPI. No unnecessary abstractions or overhead. A simple, yet powerful, API.
 
-*dmlcloud* is a library for **distributed training** of deep learning models with *torch*. Unlike other similar frameworks, dmcloud adds as little additional complexity and abstraction as possible. It is tailored towards a carefully selected set of libraries and workflows.
+## Highlights
+- Simple, yet powerful, API
+- Easy initialization of `torch.distributed`
+- Distributed checkpointing and metrics
+- Extensive logging and diagnostics
+- Wandb support
+- A wealth of useful utility functions
 
 ## Installation
 ```
 pip install dmlcloud
 ```
 
-## Why dmlcloud?
-- Easy initialization of `torch.distributed` (supports *slurm* and *MPI*).
-- Simple, yet powerful, API. No unnecessary abstractions and complications.
-- Checkpointing and metric tracking (distributed)
-- Extensive logging and diagnostics out-of-the-box. Greatly improve reproducability and traceability.
-- A wealth of useful utility functions required for distributed training (e.g. for data set sharding)
+## Minimal Example
+*TODO*
+
+## Documentation
+
+You can find the official documentation at [Read the Docs](https://dmlcloud.readthedocs.io/en/latest/).
+
+
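The new README advertises Slurm support and easy `torch.distributed` initialization but leaves the minimal example as a TODO. For context, a Slurm launch for such a training job might be sketched as follows — the job name, resource counts, and `train.py` are made-up placeholders, and this uses plain `srun`, not any dmlcloud-specific command:

```shell
#!/bin/bash
#SBATCH --job-name=dmlcloud-train   # hypothetical job name
#SBATCH --nodes=2                   # placeholder resource request
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-task=1

# srun starts one task per rank and sets SLURM_PROCID, SLURM_NTASKS, etc.,
# from which torch.distributed-based setups can derive rank and world size.
srun python train.py                # train.py is a placeholder script
```

This is only a sketch of a typical Slurm workflow; the library's actual entry point should come from the official Minimal Example once it lands.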