Can I run a job with PyTorch distributed training?
If I run this command, does it work?
torchrun --nproc_per_node=$WORLD_SIZE --master_port=1234 newsreclib/train.py experiment=nrms_mindsmall_pretrainedemb_celoss_bertsent
You can run a job with PyTorch distributed training by changing the trainer's accelerator, strategy, and number of devices. For example, you can use the provided ddp trainer config.
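For reference, here is a minimal sketch of what such a ddp trainer config could look like, assuming newsreclib follows the usual lightning-hydra-template layout (the file path and the base config name are assumptions, not confirmed in this thread):

```yaml
# configs/trainer/ddp.yaml (assumed path and layout)
defaults:
  - default  # inherit the base trainer settings

accelerator: gpu  # train on GPUs
strategy: ddp     # DistributedDataParallel, one process per GPU
devices: 4        # number of GPUs per node
```

With a config like this, the run would be selected with trainer=ddp on the command line instead of overriding the three fields individually.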
Alternatively, you can do this from the command line as python newsreclib/train.py experiment=nrms_mindsmall_pretrainedemb_celoss_bertsent trainer.accelerator=gpu trainer.strategy=ddp trainer.devices=4