From d5606e815d6d045635a36661e718383589e61927 Mon Sep 17 00:00:00 2001
From: Sami Jaghouar
Date: Thu, 26 Sep 2024 17:19:52 +0000
Subject: [PATCH] update readme with 4 gpus instruction

---
 README.md | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index bd660d28..25861da1 100644
--- a/README.md
+++ b/README.md
@@ -50,12 +50,22 @@ ZERO_BAND_LOG_LEVEL=DEBUG torchrun --nproc_per_node=2 src/zeroband/train.py @co
 
 ## run diloco
 
-To run diloco locally you can use the helper script `scripts/simulatsimulate_multi_nodee_mutl.sh`
+To run diloco locally you can use the helper script `scripts/simulatsimulate_multi_nodee_mutl.sh` 
+
+:note: you need 4 gpus to run the following command
 
 ```bash
 ZERO_BAND_LOG_LEVEL=DEBUG ./scripts/simulate_multi_node.sh 2 2 src/zeroband/train.py @configs/debug/diloco.toml
 ```
 
+if you have only two gpus
+
+```bash
+ZERO_BAND_LOG_LEVEL=DEBUG ./scripts/simulate_multi_node.sh 2 1 src/zeroband/train.py @configs/debug/diloco.toml
+```
+
+One gpu is not supported at the moment because of a fsdp bug in our implementation.
+
 ## run test
 
 You need a machine with a least two gpus to run the full test suite.
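
Reviewer note: the two commands in this patch differ only in the second numeric argument (`2 2` needs 4 gpus, `2 1` needs 2). The sketch below shows one way to choose that argument from the number of gpus available. It assumes the two numbers are the simulated node count and the gpus per node, so the total required is their product; that reading is inferred from the README text, not confirmed from the script. `required_gpus` and `pick_gpus_per_node` are hypothetical helpers and are not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the repository.
# Assumption: simulate_multi_node.sh takes <nodes> <gpus_per_node>,
# so the total gpu requirement is nodes * gpus_per_node.

# Print the total number of gpus needed for a given layout.
required_gpus() {
  local nodes="$1" gpus_per_node="$2"
  echo $(( nodes * gpus_per_node ))
}

# Given the number of available gpus, print the gpus-per-node value
# to use with 2 simulated nodes, or fail if there are too few gpus.
pick_gpus_per_node() {
  local available="$1" nodes=2
  if (( available >= $(required_gpus "$nodes" 2) )); then
    echo 2
  elif (( available >= $(required_gpus "$nodes" 1) )); then
    echo 1
  else
    echo "need at least 2 gpus, found $available" >&2
    return 1
  fi
}
```

A possible use, assuming `nvidia-smi -L` prints one line per gpu on the machine: `per_node="$(pick_gpus_per_node "$(nvidia-smi -L | wc -l)")"`, then pass `2 "$per_node"` to `./scripts/simulate_multi_node.sh` in place of the hard-coded numbers.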