This example uses the CIFAR-10 dataset to demonstrate how to train a convolutional neural network (CNN) on a multi-node, multi-GPU cluster. You can run this recipe on a single node or on multiple nodes.
- For demonstration purposes, the CIFAR-10 data preparation script and ConvNet_CIFAR10_DataAug_Distributed.py, together with its dependencies, are deployed to an Azure File Share;
- The standard output of the job and the trained model are stored on the Azure File Share;
- The CIFAR-10 dataset (http://www.cs.toronto.edu/~kriz/cifar.html) has been preprocessed and is available here.
- The official CNTK example ConvNet_CIFAR10_DataAug_Distributed.py (https://github.com/Microsoft/CNTK/blob/master/Examples/Image/Classification/ConvNet/Python/ConvNet_CIFAR10_DataAug_Distributed.py) is used.
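To give a rough idea of what running on several nodes and GPUs means for the data, the sketch below assumes data-parallel training, where each worker (typically one per GPU) processes a different part of the training set in every epoch. It is a plain-Python illustration only: it is not CNTK code and does not claim to reproduce how the official script partitions data. The function `shard_indices` and the cluster shape of 2 nodes with 4 GPUs each are hypothetical.

```python
# Illustrative only: a round-robin split of sample indices across workers.
# This is NOT CNTK's implementation; the function name and the cluster
# shape used below are hypothetical.

CIFAR10_TRAIN_SAMPLES = 50_000  # size of the CIFAR-10 training set


def shard_indices(num_samples, num_workers, rank):
    """Return the sample indices that worker `rank` would process in one epoch."""
    if not 0 <= rank < num_workers:
        raise ValueError("rank must be in [0, num_workers)")
    # Worker `rank` takes every num_workers-th sample, starting at `rank`.
    return range(rank, num_samples, num_workers)


if __name__ == "__main__":
    nodes, gpus_per_node = 2, 4  # hypothetical cluster shape
    num_workers = nodes * gpus_per_node
    for rank in range(num_workers):
        shard = shard_indices(CIFAR10_TRAIN_SAMPLES, num_workers, rank)
        print(f"worker {rank}: {len(shard)} samples")
```

With this split, adding nodes reduces the number of samples each worker handles per epoch, which is the main reason to scale the recipe beyond a single node.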
You can find the Jupyter Notebook for this recipe in CNTK-GPU-Python-Distrbuted.ipynb.
You can find Azure CLI 2.0 instructions for this recipe in cli-instructions.md.
Under construction...
If you have any problems or questions, you can reach the Batch AI team at [email protected] or you can create an issue on GitHub.
We also welcome your contributions of additional sample notebooks, scripts, or other examples of working with Batch AI.