Update docs (#588)
michaelbenayoun authored Jul 17, 2024
1 parent 1d41316 commit 2430f9f
Showing 17 changed files with 692 additions and 457 deletions.
26 changes: 13 additions & 13 deletions docs/source/_toctree.yml
@@ -8,24 +8,24 @@
 - local: containers
   title: Optimum Containers
 - sections:
-  - local: tutorials/overview
-    title: Overview
-  - local: tutorials/notebooks
+  - local: training_tutorials/notebooks
     title: Notebooks
-  - local: tutorials/fine_tune_bert
+  - local: training_tutorials/fine_tune_bert
     title: Fine-tune BERT for Text Classification on AWS Trainium
-  - local: tutorials/stable_diffusion
-    title: Generate images with Stable Diffusion models on AWS Inferentia
-  - local: tutorials/llama2-13b-chatbot
+  - local: training_tutorials/finetune_llm
+    title: Fine-tune Llama 3 8B on AWS Trainium
+  title: Training Tutorials
+- sections:
+  - local: inference_tutorials/notebooks
+    title: Notebooks
+  - local: inference_tutorials/llama2-13b-chatbot
     title: Create your own chatbot with llama-2-13B on AWS Inferentia
-  - local: tutorials/fine_tune_llama_7b
-    title: Fine-tune Llama 2 7B on AWS Trainium
-  - local: tutorials/sentence_transformers
+  - local: inference_tutorials/sentence_transformers
     title: Sentence Transformers on AWS Inferentia
-  title: Tutorials
+  - local: inference_tutorials/stable_diffusion
+    title: Generate images with Stable Diffusion models on AWS Inferentia
+  title: Inference Tutorials
 - sections:
   - local: guides/overview
     title: Overview
   - local: guides/setup_aws_instance
     title: Set up AWS Trainium instance
   - local: guides/sagemaker
5 changes: 3 additions & 2 deletions docs/source/guides/distributed_training.mdx
@@ -18,8 +18,9 @@ But there is a caveat: each Neuron core is an independent data-parallel worker b
To alleviate that, `optimum-neuron` supports parallelism features enabling you to harness the full power of your Trainium instance:

 1. [ZeRO-1](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-neuronx/tutorials/training/zero1_gpt2.html): It is an optimization of data parallelism which consists of sharding the optimizer state (which usually represents half of the memory needed on the device) over the data-parallel ranks.
-2. [Tensor Parallelism](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/tensor_parallelism_overview.html): It is a technique which consists of sharding each of your model parameters along a given dimension on multiple devices. The number of devices to shard your parameters on is called the `tensor_parallel_size`.
-3. [Pipeline Parallelism](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/pipeline_parallelism_overview.html): **coming soon!**
+2. [Tensor Parallelism](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/tensor_parallelism_overview.html): It is a technique which consists of sharding each of your model's matrix multiplications along a given axis (row or column) on multiple devices. It is also known as intra-layer model parallelism. The number of devices to shard your parameters on is called the `tensor_parallel_size`.
+3. [Sequence Parallelism](https://arxiv.org/pdf/2205.05198.pdf): It is an optimization on top of Tensor Parallelism which shards the activations along the sequence axis outside of the tensor-parallel regions. It saves memory by sharding the activations.
+4. [Pipeline Parallelism](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/pipeline_parallelism_overview.html): It consists of sharding the model's layers over multiple devices. It is also known as inter-layer model parallelism. The number of devices to shard your layers on is called the `pipeline_parallel_size`.


The good news is that it is possible to combine those techniques, and `optimum-neuron` makes it very easy!
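To get a feel for how the different parallel sizes fit together, here is a small arithmetic sketch (the helper below is hypothetical, not part of `optimum-neuron`): the Neuron cores are first divided among tensor and pipeline parallelism, and whatever remains forms the data-parallel replicas.

```python
def data_parallel_size(world_size: int, tensor_parallel_size: int, pipeline_parallel_size: int) -> int:
    """Number of data-parallel replicas left once tensor and pipeline
    parallelism have each claimed their share of the Neuron cores."""
    model_parallel_size = tensor_parallel_size * pipeline_parallel_size
    if world_size % model_parallel_size != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor_parallel_size * pipeline_parallel_size = {model_parallel_size}"
        )
    return world_size // model_parallel_size

# A trn1.32xlarge instance exposes 32 Neuron cores:
print(data_parallel_size(32, 8, 2))  # → 2 data-parallel replicas
```

ZeRO-1 then shards the optimizer state over those remaining data-parallel ranks, so the three techniques compose rather than compete.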
30 changes: 0 additions & 30 deletions docs/source/guides/overview.mdx

This file was deleted.

21 changes: 21 additions & 0 deletions docs/source/guides/setup_aws_instance.mdx
@@ -16,6 +16,13 @@ limitations under the License.

# Set up AWS Trainium instance

In this guide, we will show you:

1. How to create an AWS Trainium instance
2. How to use and run Jupyter Notebooks on your instance

## Create an AWS Trainium Instance

The simplest way to work with AWS Trainium and Hugging Face Transformers is the [Hugging Face Neuron Deep Learning AMI](https://aws.amazon.com/marketplace/pp/prodview-gr3e6yiscria2) (DLAMI). The DLAMI comes with all required libraries pre-packaged for you, including the Neuron Drivers, Transformers, Datasets, and Accelerate.

To create an EC2 Trainium instance, you can start from the console or the Marketplace. This guide will start from the [EC2 console](https://console.aws.amazon.com/ec2sp/v2/).
@@ -96,4 +103,18 @@ instance-id: i-0570615e41700a481
+--------+--------+--------+---------+
```

## Configuring `Jupyter Notebook` on your AWS Trainium Instance

With the instance up and running, we can ssh into it.
But instead of developing inside a terminal, it is also possible to use a `Jupyter Notebook` environment. We can use it to prepare our dataset and launch the training (at least when working on a single node).

For this, we need to add port forwarding to the `ssh` command, which will tunnel our localhost traffic to the Trainium instance.

```bash
PUBLIC_DNS="" # public DNS name of the instance, e.g. ec2-3-80-....
KEY_PATH=""   # local path to the key pair, e.g. ssh/trn.pem

ssh -L 8080:localhost:8080 -i ${KEY_PATH} ubuntu@${PUBLIC_DNS}
```
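Once the tunnel is open, start Jupyter on the remote machine, bound to the same port. This is a sketch; the exact invocation may differ depending on how Jupyter is installed on your AMI:

```shell
# Run on the Trainium instance after ssh-ing in.
# --no-browser: the remote machine has no display.
# --port 8080 matches the local port forwarded by the ssh command above.
python -m notebook --no-browser --port 8080
```

You can then open `http://localhost:8080` in your local browser and paste the token printed by the command above.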

You are done! You can now start using the Trainium accelerators with Hugging Face Transformers. Check out the [Fine-tune Transformers with AWS Trainium](./fine_tune) guide to get started.
4 changes: 2 additions & 2 deletions docs/source/index.mdx
@@ -24,7 +24,7 @@ The list of officially validated models and tasks is available [here](https://hu
<div class="w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5">
<a
class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg"
-href="./tutorials/overview"
+href="./tutorials/fine_tune_bert"
>
<div class="w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">
Tutorials
@@ -34,7 +34,7 @@ The list of officially validated models and tasks is available [here](https://hu
Start here if you are using 🤗 Optimum Neuron for the first time!
</p>
</a>
-<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./guides/overview">
+<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./guides/setup_aws_instance">
<div class="w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">
How-to guides
</div>
@@ -1,5 +1,5 @@
<!---
-Copyright 2023 The HuggingFace Team. All rights reserved.
+Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -18,18 +18,10 @@ limitations under the License.

We prepared some notebooks for you, so that you can directly run the tutorials from the documentation.

-## Training
-
-| Notebook | Description | Studio Lab |
-|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
-| [Fine-tune BERT for text classification on AWS Trainium](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/text-classification/notebook.ipynb) | Show how to fine-tune BERT on AWS Trainium for text classification. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/text-classification/notebook.ipynb) |
-
-## Inference
-
| Notebook | Description | Studio Lab |
|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| [Create your own chatbot with llama-2-13B on AWS Inferentia](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/text-generation/llama2-13b-chatbot.ipynb) | Show how to run LLama-2 13B chat model on AWS inferentia 2. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/text-generation/llama2-13b-chatbot.ipynb) |
| [How to generate images with Stable Diffusion](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/stable-diffusion/stable-diffusion-txt2img.ipynb) | Show how to use stable-diffusion v2.1 model to generate images from prompts on Inferentia 2. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/stable-diffusion/stable-diffusion-txt2img.ipynb) |
| [How to generate images with Stable Diffusion XL](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/stable-diffusion/stable-diffusion-xl-txt2img.ipynb) | Show how to use stable-diffusion XL model to generate images from prompts on Inferentia 2. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/stable-diffusion/stable-diffusion-xl-txt2img.ipynb) |
| [Compute text embeddings with Sentence Transformers on Inferentia](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/sentence-transformers/getting-started.ipynb) | Show how to use Sentence Transformers to compute sentence / text embeddings on Inferentia 2. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/sentence-transformers/getting-started.ipynb) |
| [How to compile (if needed) and generate text with CodeLlama 7B](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/text-generation/CodeLlama-7B-Compilation.ipynb) | How to use CodeLlama 7B to generate code. Also walks through compilation. | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-neuron/blob/main/notebooks/text-generation/CodeLlama-7B-Compilation.ipynb) |
File renamed without changes.
File renamed without changes.
