templates/src/llm/finetune_distributed (1 file changed, +5 −5)

@@ -71,19 +71,19 @@ Where:
 **Pythia-1B on 2× NVIDIA L40 GPUs with batch=16, seq=512:**

 ``` text
-Params: 1.0B × 2 bytes = 2.00 GB total
+Params: 1.0B × 2 bytes = 2.00 GiB total

 Model + Gradients + Optimizer States (with master weights):
-  (1.0B × 2 bytes × 8) / 2 GPUs ≈ 7.45 GB per GPU
+  (1.0B × 2 bytes × 8) / 2 GPUs ≈ 7.45 GiB per GPU

 Activations (not sharded):
-  ≈ 1.0 GB per GPU (depends on hidden_dim × layers)
+  ≈ 1.0 GiB per GPU (depends on hidden_dim × layers)

 Total steady-state:
-  ≈ 8.45 GB per GPU
+  ≈ 8.45 GiB per GPU

 Transient all-gather overhead:
-  +10–20% (≈ 9.3 – 10.2 GB peak)
+  +10–20% (≈ 9.3 – 10.2 GiB peak)
 ```

 ---
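The arithmetic in the example above can be sketched as a small helper. This is a rough estimator, not part of the template: the 8× state multiplier assumes fp16 params and grads plus fp32 master weights and Adam moments (2 + 2 + 4 + 4 + 4 ≈ 16 bytes per param, i.e. 8× the 2-byte fp16 weights), the fixed 1.0 GiB activation term is taken from the example, and the 15% transient overhead sits in the quoted 10–20% band.

``` python
GIB = 1024 ** 3  # GiB, matching the units used in the example


def fsdp_memory_estimate(n_params, bytes_per_param=2, n_gpus=2,
                         state_multiplier=8, activations_gib=1.0,
                         overhead=0.15):
    """Rough per-GPU steady-state and peak memory for FSDP-style sharding.

    state_multiplier=8 assumes fp16 params + grads plus fp32 master
    weights and Adam moments; activations are not sharded, so they are
    added per GPU. overhead models the transient all-gather spike.
    """
    # Sharded model/grad/optimizer states, split evenly across GPUs.
    sharded_gib = n_params * bytes_per_param * state_multiplier / n_gpus / GIB
    steady_gib = sharded_gib + activations_gib
    peak_gib = steady_gib * (1 + overhead)
    return steady_gib, peak_gib


steady, peak = fsdp_memory_estimate(1.0e9)  # Pythia-1B on 2 GPUs
print(f"steady ≈ {steady:.2f} GiB per GPU, peak ≈ {peak:.2f} GiB")
```

For the Pythia-1B numbers this reproduces the ≈ 8.45 GiB steady-state figure, and the 15% overhead lands the peak inside the 9.3–10.2 GiB range quoted above.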