Skip to content

Distributed-Deep-Learning/fairscale

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

57 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

fairscale

fairscale is a PyTorch extension library for high performance and large scale training.

fairscale supports:

  • pipeline parallelism (fairscale.nn.Pipe)
  • tensor parallelism (fairscale.nn.model_parallel)
  • optimizer state sharding (fairscale.optim.oss)

Examples

Run a 4-layer model on 2 GPUs. The first two layers run on cuda:0 and the next two layers run on cuda:1.

import torch

import fairscale

model = torch.nn.Sequential(a, b, c, d)
model = fairscale.nn.Pipe(model, balance=[2, 2], devices=[0, 1], chunks=8)

Requirements

  • PyTorch >= 1.4

Installation

Normal installation:

pip install .

Development mode:

pip install -e .

Contributors

See the CONTRIBUTING file for how to help out.

License

fairscale is licensed under the BSD-3-Clause License.

fairscale.nn.pipe is forked from torchgpipe, Copyright 2019, Kakao Brain, licensed under Apache License.

fairscale.nn.model_parallel is forked from Megatron-LM, Copyright 2020, NVIDIA CORPORATION, licensed under Apache License.

References

Here is a list of all authors on relevant research papers this work is based on:

  • torchgpipe: Chiheon Kim, Heungsub Lee, Myungryong Jeong, Woonhyuk Baek, Boogeon Yoon, Ildoo Kim, Sungbin Lim, Sungwoong Kim. [Paper] [Code]
  • ZeRO: Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He. [Paper] [Code]
  • Megatron-LM: Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, Bryan Catanzaro. [Paper][Code]

About

PyTorch extensions for high performance and large scale training.

Resources

License

Code of conduct

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages

  • Python 97.6%
  • Cuda 2.3%
  • Other 0.1%