Update docs (#588)
michaelbenayoun authored Jul 17, 2024
1 parent 1d41316 commit 2430f9f
Showing 17 changed files with 692 additions and 457 deletions.
26 changes: 13 additions & 13 deletions docs/source/_toctree.yml
@@ -8,24 +8,24 @@
 - local: containers
   title: Optimum Containers
 - sections:
-  - local: tutorials/overview
-    title: Overview
-  - local: tutorials/notebooks
+  - local: training_tutorials/notebooks
     title: Notebooks
-  - local: tutorials/fine_tune_bert
+  - local: training_tutorials/fine_tune_bert
     title: Fine-tune BERT for Text Classification on AWS Trainium
-  - local: tutorials/stable_diffusion
-    title: Generate images with Stable Diffusion models on AWS Inferentia
-  - local: tutorials/llama2-13b-chatbot
+  - local: training_tutorials/finetune_llm
+    title: Fine-tune Llama 3 8B on AWS Trainium
+  title: Training Tutorials
+- sections:
+  - local: inference_tutorials/notebooks
+    title: Notebooks
+  - local: inference_tutorials/llama2-13b-chatbot
     title: Create your own chatbot with llama-2-13B on AWS Inferentia
-  - local: tutorials/fine_tune_llama_7b
-    title: Fine-tune Llama 2 7B on AWS Trainium
-  - local: tutorials/sentence_transformers
+  - local: inference_tutorials/sentence_transformers
     title: Sentence Transformers on AWS Inferentia
-  title: Tutorials
+  - local: inference_tutorials/stable_diffusion
+    title: Generate images with Stable Diffusion models on AWS Inferentia
+  title: Inference Tutorials
 - sections:
   - local: guides/overview
     title: Overview
   - local: guides/setup_aws_instance
     title: Set up AWS Trainium instance
   - local: guides/sagemaker
5 changes: 3 additions & 2 deletions docs/source/guides/distributed_training.mdx
@@ -18,8 +18,9 @@ But there is a caveat: each Neuron core is an independent data-parallel worker b
To alleviate that, `optimum-neuron` supports parallelism features enabling you to harness the full power of your Trainium instance:

 1. [ZeRO-1](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-neuronx/tutorials/training/zero1_gpt2.html): It is an optimization of data parallelism which consists of sharding the optimizer state (which usually represents half of the memory needed on the device) over the data-parallel ranks.
-2. [Tensor Parallelism](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/tensor_parallelism_overview.html): It is a technique which consists of sharding each of your model parameters along a given dimension on multiple devices. The number of devices to shard your parameters on is called the `tensor_parallel_size`.
-3. [Pipeline Parallelism](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/pipeline_parallelism_overview.html): **coming soon!**
+2. [Tensor Parallelism](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/tensor_parallelism_overview.html): It is a technique which consists of sharding each of your model's matrix multiplications along a given axis (row or column) on multiple devices. It is also known as intra-layer model parallelism. The number of devices to shard your parameters on is called the `tensor_parallel_size`.
+3. [Sequence Parallelism](https://arxiv.org/pdf/2205.05198.pdf): It is an optimization on top of Tensor Parallelism which shards the activations along the sequence axis outside of the tensor-parallel regions. It saves memory by sharding the activations.
+4. [Pipeline Parallelism](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/pipeline_parallelism_overview.html): It consists of sharding the model's layers over multiple devices. It is also known as inter-layer model parallelism. The number of devices to shard your layers on is called the `pipeline_parallel_size`.


The good news is that it is possible to combine those techniques, and `optimum-neuron` makes it very easy!
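To get a feel for how the different parallel sizes fit together, here is a small arithmetic sketch (the helper below is hypothetical, not part of `optimum-neuron`): the Neuron cores are first divided among tensor and pipeline parallelism, and whatever remains forms the data-parallel replicas.

```python
def data_parallel_size(world_size: int, tensor_parallel_size: int, pipeline_parallel_size: int) -> int:
    """Number of data-parallel replicas left once tensor and pipeline
    parallelism have each claimed their share of the Neuron cores."""
    model_parallel_size = tensor_parallel_size * pipeline_parallel_size
    if world_size % model_parallel_size != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor_parallel_size * pipeline_parallel_size = {model_parallel_size}"
        )
    return world_size // model_parallel_size

# A trn1.32xlarge instance exposes 32 Neuron cores:
print(data_parallel_size(32, 8, 2))  # → 2 data-parallel replicas
```

ZeRO-1 then shards the optimizer state over those remaining data-parallel ranks, so the three techniques compose rather than compete.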
30 changes: 0 additions & 30 deletions docs/source/guides/overview.mdx

This file was deleted.

21 changes: 21 additions & 0 deletions docs/source/guides/setup_aws_instance.mdx
@@ -16,6 +16,13 @@ limitations under the License.

# Set up AWS Trainium instance

In this guide, we will show you:

1. How to create an AWS Trainium instance
2. How to use and run Jupyter Notebooks on your instance

## Create an AWS Trainium Instance

The simplest way to work with AWS Trainium and Hugging Face Transformers is the [Hugging Face Neuron Deep Learning AMI](https://aws.amazon.com/marketplace/pp/prodview-gr3e6yiscria2) (DLAMI). The DLAMI comes with all required libraries pre-packaged for you, including the Neuron Drivers, Transformers, Datasets, and Accelerate.

To create an EC2 Trainium instance, you can start from the console or the Marketplace. This guide will start from the [EC2 console](https://console.aws.amazon.com/ec2sp/v2/).
@@ -96,4 +103,18 @@ instance-id: i-0570615e41700a481
+--------+--------+--------+---------+
```

## Configuring `Jupyter Notebook` on your AWS Trainium Instance

With the instance up and running, we can ssh into it.
But instead of developing inside a terminal, it is also possible to use a `Jupyter Notebook` environment. We can use it to prepare our dataset and launch the training (at least when working on a single node).

For this, we need to add port forwarding to the `ssh` command, which will tunnel our localhost traffic to the Trainium instance.

```bash
PUBLIC_DNS="" # public DNS name of the instance, e.g. ec2-3-80-....
KEY_PATH=""   # local path to the key pair, e.g. ssh/trn.pem

ssh -L 8080:localhost:8080 -i ${KEY_PATH} ubuntu@${PUBLIC_DNS}
```
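Once the tunnel is open, start Jupyter on the remote machine, bound to the same port. This is a sketch; the exact invocation may differ depending on how Jupyter is installed on your AMI:

```shell
# Run on the Trainium instance after ssh-ing in.
# --no-browser: the remote machine has no display.
# --port 8080 matches the local port forwarded by the ssh command above.
python -m notebook --no-browser --port 8080
```

You can then open `http://localhost:8080` in your local browser and paste the token printed by the command above.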

You are done! You can now start using the Trainium accelerators with Hugging Face Transformers. Check out the [Fine-tune Transformers with AWS Trainium](./fine_tune) guide to get started.
4 changes: 2 additions & 2 deletions docs/source/index.mdx
@@ -24,7 +24,7 @@ The list of officially validated models and tasks is available [here](https://hu
<div class="w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5">
<a
class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg"
-href="./tutorials/overview"
+href="./tutorials/fine_tune_bert"
>
<div class="w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">
Tutorials
@@ -34,7 +34,7 @@ The list of officially validated models and tasks is available [here](https://hu
Start here if you are using 🤗 Optimum Neuron for the first time!
</p>
</a>
-<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./guides/overview">
+<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./guides/setup_aws_instance">
<div class="w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">
How-to guides
</div>
@@ -1,5 +1,5 @@
<!---
-Copyright 2023 The HuggingFace Team. All rights reserved.
+Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -18,18 +18,10 @@ limitations under the License.

We prepared some notebooks for you, so that you can directly run the tutorials from the documentation.

-## Training
-
-| Notebook | Description | Studio Lab |
-|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
-| [Fine-tune BERT for text classification on AWS Trainium](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/text-classification/notebook.ipynb) | Show how to fine-tune BERT on AWS Trainium for text classification. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/text-classification/notebook.ipynb) |
-
-## Inference
-
| Notebook | Description | Studio Lab |
|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| [Create your own chatbot with llama-2-13B on AWS Inferentia](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/text-generation/llama2-13b-chatbot.ipynb) | Show how to run LLama-2 13B chat model on AWS inferentia 2. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/text-generation/llama2-13b-chatbot.ipynb) |
| [How to generate images with Stable Diffusion](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/stable-diffusion/stable-diffusion-txt2img.ipynb) | Show how to use stable-diffusion v2.1 model to generate images from prompts on Inferentia 2. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/stable-diffusion/stable-diffusion-txt2img.ipynb) |
| [How to generate images with Stable Diffusion XL](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/stable-diffusion/stable-diffusion-xl-txt2img.ipynb) | Show how to use stable-diffusion XL model to generate images from prompts on Inferentia 2. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/stable-diffusion/stable-diffusion-xl-txt2img.ipynb) |
| [Compute text embeddings with Sentence Transformers on Inferentia](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/sentence-transformers/getting-started.ipynb) | Show how to use Sentence Transformers to compute sentence / text embeddings on Inferentia 2. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/sentence-transformers/getting-started.ipynb) |
| [How to compile (if needed) and generate text with CodeLlama 7B](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/text-generation/CodeLlama-7B-Compilation.ipynb) | How to use CodeLlama 7B to generate code. Also walks through compilation. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/text-generation/CodeLlama-7B-Compilation.ipynb) |
File renamed without changes.
File renamed without changes.
