Pretrain LLMs with LitGPT

This document explains how to pretrain LLMs using LitGPT.

 

The Pretraining API

You can pretrain models in LitGPT with the litgpt pretrain command, starting from any of the available architectures. To list them, call litgpt pretrain without any additional arguments:

litgpt pretrain

Shown below is an abbreviated list:

ValueError: Please specify --model_name <model_name>. Available values:
Camel-Platypus2-13B
...
Gemma-2b
...
Llama-2-7b-hf
...
Mixtral-8x7B-v0.1
...
pythia-14m
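
Since the full list is long, you can narrow it down with standard shell tools. The line below is a minimal sketch, assuming the listing is printed to stderr as part of the error message shown above:

litgpt pretrain 2>&1 | grep -i pythia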

For demonstration purposes, we can pretrain the small, 14-million-parameter pythia-14m model on the TinyStories dataset using the debug.yaml config file as follows:

litgpt pretrain \
   --model_name pythia-14m \
   --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/pretrain/debug.yaml
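
Individual values from the config file can also be overridden directly on the command line. The following is a sketch, assuming LitGPT's usual nested option names (for example, --train.max_tokens and --out_dir); run litgpt pretrain --help to see the exact options available:

litgpt pretrain \
   --model_name pythia-14m \
   --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/pretrain/debug.yaml \
   --train.max_tokens 100000 \
   --out_dir out/pythia-14m-debug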

 

Pretrain a 1.1B TinyLlama model

You can find an end-to-end tutorial for pretraining a 1.1B TinyLlama model with LitGPT here.
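
At its core, such a run boils down to a single pretraining command. The sketch below is illustrative only: the --model_name and --data values are assumptions based on LitGPT's naming conventions, and TinyLlama additionally needs a tokenizer directory (it reuses the Llama 2 tokenizer), so follow the linked tutorial for the exact, tested steps:

litgpt pretrain \
   --model_name tiny-llama-1.1b \
   --data TinyLlama \
   --tokenizer_dir <path/to/a/llama-2-tokenizer>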

 

Optimize LitGPT pretraining with Lightning Thunder

Lightning Thunder is a source-to-source compiler for PyTorch, which is fully compatible with LitGPT. In experiments, Thunder resulted in a 40% speed-up compared to using regular PyTorch when finetuning a 7B Llama 2 model.

For more information, see the Lightning Thunder extension README.

 

Project templates

The following Lightning Studio templates provide LitGPT pretraining projects in reproducible environments with multi-GPU and multi-node support:  

Prepare the TinyLlama 1T token dataset

Pretrain LLMs - TinyLlama 1.1B

Continued Pretraining with TinyLlama 1.1B