Commit

disable wandb for non-main workers in simulate script
samsja committed Dec 12, 2024
1 parent 337f731 commit 3387a27
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion scripts/simulate_multi_node_diloco.sh
@@ -61,7 +61,7 @@ export GLOO_SOCKET_IFNAME=lo
for i in $(seq 0 $(($N - 1 )))
do
> logs/log$i.log
-WANDB_MODE=$([ $i -eq 0 ] && echo "online" || echo "online") GLOBAL_UNIQUE_ID=$i GLOBAL_RANK=$i CUDA_VISIBLE_DEVICES=$(get_cuda_devices $NUM_GPU $i) uv run torchrun --nproc_per_node=$NUM_GPU --node-rank 0 --rdzv-endpoint localhost:$((BASE_PORT + $i)) --nnodes=1 $@ --data.data_rank $i --data.data_world_size $N > logs/log$i.log 2>&1 &
+WANDB_MODE=$([ $i -eq 0 ] && echo "online" || echo "offline") GLOBAL_UNIQUE_ID=$i GLOBAL_RANK=$i CUDA_VISIBLE_DEVICES=$(get_cuda_devices $NUM_GPU $i) uv run torchrun --nproc_per_node=$NUM_GPU --node-rank 0 --rdzv-endpoint localhost:$((BASE_PORT + $i)) --nnodes=1 $@ --data.data_rank $i --data.data_world_size $N > logs/log$i.log 2>&1 &
child_pids+=($!)
done
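
The per-worker mode selection in the changed line can be read in isolation as the sketch below. The helper name `wandb_mode_for_rank` and the worker count `N=3` are hypothetical and chosen only for illustration; the script itself inlines the test with `&&`/`||` inside a command substitution.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script): print the wandb
# mode for a given worker index. Worker 0 logs online; all others go offline.
wandb_mode_for_rank() {
    local rank="$1"
    if [ "$rank" -eq 0 ]; then
        echo "online"
    else
        echo "offline"
    fi
}

N=3  # assumed number of simulated workers, for illustration only
for i in $(seq 0 $((N - 1))); do
    echo "worker $i: WANDB_MODE=$(wandb_mode_for_rank "$i")"
done
```

Note that `offline` still records run data locally, where it can be uploaded later with `wandb sync`. If the goal is to skip logging entirely on non-main workers, `WANDB_MODE=disabled` may be the closer fit.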
