- PyTorch example for training on MNIST across multiple nodes in AWS.
- Launch two or more EC2 instances in AWS
- Configure the security group so the instances can reach each other (see https://pytorch.org/tutorials/beginner/aws_distributed_training_tutorial.html)
- Edit the master address and master port in the main functions of both node0.py and node1.py; both files must use the same values
- Run node0.py on node0 and node1.py on node1
- Example:
  - node0: `python node0.py -n 2 -g 2 -nr 0`
  - node1: `python node1.py -n 2 -g 2 -nr 1`
- Python version: 3.7.4
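
The flags in the example commands can be read as follows. This is a minimal sketch and not the code from node0.py or node1.py. It assumes that `-n` is the total number of nodes, `-g` is the number of GPUs per node, and `-nr` is the rank of the current node. The helper names, the address `10.0.0.1`, and the port `8888` are hypothetical placeholders.

```python
import argparse
import os


def parse_args(argv=None):
    # Assumed meaning of the flags used in the example commands.
    parser = argparse.ArgumentParser()
    parser.add_argument("-n", "--nodes", type=int, default=1,
                        help="total number of nodes")
    parser.add_argument("-g", "--gpus", type=int, default=1,
                        help="number of GPUs per node")
    parser.add_argument("-nr", "--nr", type=int, default=0,
                        help="rank of this node among all nodes")
    return parser.parse_args(argv)


def configure_rendezvous(master_addr, master_port):
    # Every node must export the same address and port so that all
    # processes can find the rank 0 process.
    os.environ["MASTER_ADDR"] = master_addr
    os.environ["MASTER_PORT"] = str(master_port)


def global_rank(node_rank, gpus_per_node, local_gpu):
    # One process per GPU: ranks are numbered node by node.
    return node_rank * gpus_per_node + local_gpu


def describe(argv, master_addr="10.0.0.1", master_port=8888):
    # Placeholder address and port; replace with the values for node0.
    args = parse_args(argv)
    configure_rendezvous(master_addr, master_port)
    world_size = args.nodes * args.gpus
    ranks = [global_rank(args.nr, args.gpus, g) for g in range(args.gpus)]
    return world_size, ranks


if __name__ == "__main__":
    world_size, ranks = describe(None)
    print("world size:", world_size, "ranks on this node:", ranks)
```

Under these assumptions, `-n 2 -g 2 -nr 1` gives a world size of 4, and node1 hosts global ranks 2 and 3. In a real training script each process would typically pass its rank and the world size to `torch.distributed.init_process_group` with `init_method="env://"`, which reads `MASTER_ADDR` and `MASTER_PORT` from the environment.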