diff --git a/extensions/xla/README.md b/extensions/xla/README.md
index d71a0e0f2c..6182f24d54 100644
--- a/extensions/xla/README.md
+++ b/extensions/xla/README.md
@@ -78,7 +78,7 @@ export PJRT_DEVICE=TPU
 > An extensive guide on setup and available options can be found [here](https://cloud.google.com/tpu/docs/v4-users-guide).

 Since a new machine was created, you may need to download pretrained weights.
-They can be copied to the machine using `gcloud compute tpus tpu-vm scp`, or you can follow the steps described in our [downloading guide](download_model_weights.md).
+They can be copied to the machine using `gcloud compute tpus tpu-vm scp`, or you can follow the steps described in our [downloading guide](../../tutorials/download_model_weights.md).

 It is also recommended to set up a persistent disk from which to load checkpoints. Follow [this guide](https://cloud.google.com/tpu/docs/setup-persistent-disk#setting_up_a_tpu_vm_and_a_persistent_disk) to do so.

diff --git a/tutorials/0_to_litgpt.md b/tutorials/0_to_litgpt.md
index f0497b45ae..760c5d00e0 100644
--- a/tutorials/0_to_litgpt.md
+++ b/tutorials/0_to_litgpt.md
@@ -527,7 +527,7 @@ lm_eval --model hf \

 **More information and additional resources**

-- [tutorials/convert_lit_models](tutorials/convert_lit_models.md): Tutorial on converting LitGPT weights
+- [tutorials/convert_lit_models](./convert_lit_models.md): Tutorial on converting LitGPT weights



diff --git a/tutorials/inference.md b/tutorials/inference.md
index 81cefe6816..4675624149 100644
--- a/tutorials/inference.md
+++ b/tutorials/inference.md
@@ -1,6 +1,6 @@
 # Inference

-We demonstrate how to run inference (next token prediction) with the GPT base model in the [`generate.py`](generate.py) script:
+We demonstrate how to run inference (next token prediction) with the GPT base model in the [`generate.py`](../litgpt/generate/base.py) script:

 ```bash
 litgpt generate base --prompt "Hello, my name is" --checkpoint_dir checkpoints/stabilityai/stablelm-base-alpha-3b
diff --git a/tutorials/oom.md b/tutorials/oom.md
index c12573da10..c02ee5b2fd 100644
--- a/tutorials/oom.md
+++ b/tutorials/oom.md
@@ -34,7 +34,7 @@ However, your hardware may not support such large context lengths. Here's what y
 * For the finetuning scripts, you can trim the length of the samples in your dataset.
   All the finetuning scripts expose a `--data.max_seq_length=...` argument. This might also be useful in cases where
   sample lengths are highly unbalanced, as the presence of a single very long sample would incur a larger memory usage for all other
-  shorter samples. For example, the median length of the samples in Alpaca is 110 tokens. Truncating the Alpaca dataset to 256 max tokens reduces the memory requirements of a Falcon 7B model from 23.52 GB to 15.73 GB. For more information about the dataset truncation, please see the *Truncating datasets* section in the [prepare_datasets.md](prepare_datasets.md) tutorial.
+  shorter samples. For example, the median length of the samples in Alpaca is 110 tokens. Truncating the Alpaca dataset to 256 max tokens reduces the memory requirements of a Falcon 7B model from 23.52 GB to 15.73 GB. For more information about the dataset truncation, please see the *Truncating datasets* section in the [prepare_dataset.md](prepare_dataset.md) tutorial.

 Keep in mind that reducing the context length will affect the modelling performance on text sequences longer than the limit.

diff --git a/tutorials/prepare_dataset.md b/tutorials/prepare_dataset.md
index 2cb63ecee6..7f7cf238ae 100644
--- a/tutorials/prepare_dataset.md
+++ b/tutorials/prepare_dataset.md
@@ -79,7 +79,6 @@ For comparison, the Falcon 7B model requires 23.52 GB of memory for the original

 ### Alpaca-GPT4

-
 The Alpaca-GPT4 was built by using the prompts of the original Alpaca dataset and generate the responses via GPT 4.
 The dataset consists of 52,000 instructions and responses.

@@ -126,7 +125,6 @@ litgpt finetune lora \
   --train.max_seq_length 256
 ```

-

 ### Deita

@@ -162,7 +160,6 @@ litgpt finetune lora \
   --train.max_seq_length 512
 ```

-

 ### Dolly

@@ -281,7 +278,6 @@ litgpt finetune lora \

 However, you can also select individual subsets via comma-separated strings as follows:

-
 ```bash
 litgpt finetune lora \
   --data FLAN \
@@ -385,5 +381,4 @@ Note that you only need to modify a small fraction of the code file, namely the
 In addition to the finetuning dataset described above, LitGPT also supports several datasets for pretraining.
 The pretraining datasets are described in more detail in the following separate tutorial documents:

-- [Pretrain Llama 2 on OpenWebText](./pretrain_openwebtext.md)
 - [Pretrain TinyLlama on Slimpajama and Starcoder](./pretrain_tinyllama.md)