This document explains how to pretrain LLMs using LitGPT.
You can pretrain models in LitGPT with the `litgpt pretrain` command, starting from any of the available model architectures. To see the full list, run `litgpt pretrain` without any additional arguments:
```bash
litgpt pretrain
```
Shown below is an abbreviated list:
```
ValueError: Please specify --model_name <model_name>. Available values:
Camel-Platypus2-13B
...
Gemma-2b
...
Llama-2-7b-hf
...
Mixtral-8x7B-v0.1
...
pythia-14m
```
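Each of these names corresponds to an architecture configuration that can also be inspected from Python. The following is a minimal sketch, assuming the `Config.from_name` helper exported by the `litgpt` package in current releases:

```python
# Minimal sketch: map a --model_name value to its architecture definition.
# Assumes litgpt exports Config with a from_name() helper, as in current releases.
from litgpt import Config

config = Config.from_name("pythia-14m")
print(config.n_layer, config.n_head, config.n_embd, config.block_size)
```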
For demonstration purposes, we can pretrain a small 14 million-parameter Pythia model on the TinyStories dataset using the `debug.yaml` config file as follows:
```bash
litgpt pretrain \
  --model_name pythia-14m \
  --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/pretrain/debug.yaml
```
You can find an end-to-end tutorial for pretraining a TinyLlama model with LitGPT here.
Lightning Thunder is a source-to-source compiler for PyTorch that is fully compatible with LitGPT. In experiments, Thunder resulted in a 40% speed-up compared to regular PyTorch when finetuning a 7B Llama 2 model.
For more information, see the Lightning Thunder extension README.
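To give a rough idea of what the integration looks like, the sketch below compiles a small LitGPT model with `thunder.jit`. This is an illustrative outline rather than the maintained extension script; it assumes `lightning-thunder` is installed and a supported CUDA GPU is available.

```python
# Illustrative sketch only -- see the Thunder extension README for the
# supported integration. Assumes `pip install lightning-thunder` and a CUDA GPU.
import torch
import thunder
from litgpt import GPT, Config

config = Config.from_name("pythia-14m")
with torch.device("cuda"):
    model = GPT(config)

# thunder.jit traces the module and returns a compiled, drop-in replacement
compiled_model = thunder.jit(model)

idx = torch.randint(0, config.vocab_size, (1, 128), device="cuda")
logits = compiled_model(idx)  # same output as model(idx): (batch, seq, padded vocab)
print(logits.shape)
```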
The following Lightning Studio templates provide LitGPT pretraining projects in reproducible environments with multi-GPU and multi-node support:
- Pretrain LLMs - TinyLlama 1.1B
- Continued Pretraining with TinyLlama 1.1B