Open
Description
Log here:
https://github.com/AI-Hypercomputer/torchprime/actions/runs/16889868996/job/47847279717#step:7:1103
[2025-08-11 19:36:26,339][torchprime.torch_xla_models.trainer.base_trainer][INFO] - Finished training run
[2025-08-11 19:36:28,814][root][INFO] - Found newest profile: /tmp/gcs-mount/tp-run/16889868996-1/ds-v3-shallow-7c6cczci/profile/0-0/plugins/profile/2025_08_11_19_34_49/gke-tpu-7059ce52-gxns.xplane.pb
[2025-08-11 19:36:28,814][torchprime.metrics.step_duration][INFO] - Loading /tmp/gcs-mount/tp-run/16889868996-1/ds-v3-shallow-7c6cczci/profile/0-0/plugins/profile/2025_08_11_19_34_49/gke-tpu-7059ce52-gxns.xplane.pb
Plane ID: 2, Name: /device:TPU:0
Line ID: 2, Name: XLA Modules
Event Metadata Name: SyncTensorsGraph.13208(16646631473018641282), ID: 5703, Offset: 0.260 s, Duration: 0.288 s
Event Metadata Name: SyncTensorsGraph.13214(15300851610953471503), ID: 11532, Offset: 71.466 s, Duration: 0.289 s
Event Metadata Name: SyncTensorsGraph.13214(15300851610953471503), ID: 11532, Offset: 71.755 s, Duration: 0.290 s
Event Metadata Name: SyncTensorsGraph.13226(1256450336661157904), ID: 17517, Offset: 142.631 s, Duration: 0.315 s
Event Metadata Name: SyncTensorsGraph.13214(15300851610953471503), ID: 11532, Offset: 142.946 s, Duration: 0.290 s
Event Metadata Name: SyncTensorsGraph.13226(1256450336661157904), ID: 17517, Offset: 143.236 s, Duration: 0.291 s
Event Metadata Name: SyncTensorsGraph.13214(15300851610953471503), ID: 11532, Offset: 143.528 s, Duration: 0.290 s
Error executing job with overrides: ['model=deepseek-v3-shallow', 'dataset=wikitext', 'dataset.block_size=512', 'task=train', 'task.lr_scheduler.type=constant', 'task.global_batch_size=4', 'task.max_steps=15', 'ici_mesh.fsdp=4', 'profile_start_step=3', 'profile_dir=/tmp/gcs-mount/tp-run/16889868996-1/ds-v3-shallow-7c6cczci/profile/0-0', 'output_dir=/tmp/gcs-mount/tp-run/16889868996-1/ds-v3-shallow-7c6cczci/outputs/0-0']
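For triage, here is a minimal sketch that parses the `XLA Modules` event lines pasted above and summarizes them. It is not torchprime's step-duration code; the regex, the `parse_events` and `summarize` helpers, and the 1-second gap threshold are all assumptions made for illustration. It only shows what the log itself contains: how many distinct module names appear and how large the idle gaps between consecutive events are.

```python
import re
from collections import Counter

# Hypothetical pattern matching the event lines exactly as printed in the log above.
EVENT_RE = re.compile(
    r"Event Metadata Name: (?P<name>[^\s(]+)\((?P<fingerprint>\d+)\), "
    r"ID: (?P<id>\d+), Offset: (?P<offset>[\d.]+) s, Duration: (?P<duration>[\d.]+) s"
)

# Event lines copied verbatim from the log in this issue.
LOG = """\
Event Metadata Name: SyncTensorsGraph.13208(16646631473018641282), ID: 5703, Offset: 0.260 s, Duration: 0.288 s
Event Metadata Name: SyncTensorsGraph.13214(15300851610953471503), ID: 11532, Offset: 71.466 s, Duration: 0.289 s
Event Metadata Name: SyncTensorsGraph.13214(15300851610953471503), ID: 11532, Offset: 71.755 s, Duration: 0.290 s
Event Metadata Name: SyncTensorsGraph.13226(1256450336661157904), ID: 17517, Offset: 142.631 s, Duration: 0.315 s
Event Metadata Name: SyncTensorsGraph.13214(15300851610953471503), ID: 11532, Offset: 142.946 s, Duration: 0.290 s
Event Metadata Name: SyncTensorsGraph.13226(1256450336661157904), ID: 17517, Offset: 143.236 s, Duration: 0.291 s
Event Metadata Name: SyncTensorsGraph.13214(15300851610953471503), ID: 11532, Offset: 143.528 s, Duration: 0.290 s
"""


def parse_events(text):
    """Return one dict per event line found in `text`, in log order."""
    return [
        {
            "name": m["name"],
            "id": int(m["id"]),
            "offset": float(m["offset"]),
            "duration": float(m["duration"]),
        }
        for m in EVENT_RE.finditer(text)
    ]


def summarize(events, gap_threshold_s=1.0):
    """Count events per module name and list idle gaps above the threshold.

    A gap is the time between the end of one event and the start of the next.
    The 1.0 s default threshold is an arbitrary choice for this sketch.
    """
    counts = Counter(e["name"] for e in events)
    gaps = [
        nxt["offset"] - (cur["offset"] + cur["duration"])
        for cur, nxt in zip(events, events[1:])
    ]
    large_gaps = [round(g, 3) for g in gaps if g > gap_threshold_s]
    return counts, large_gaps


events = parse_events(LOG)
counts, large_gaps = summarize(events)
print(f"{len(events)} events, {len(counts)} distinct module names")
for name, n in counts.most_common():
    print(f"  {name}: {n}")
print("gaps over 1 s:", large_gaps)
```

On the lines above this reports seven events across three distinct module names, with two gaps of roughly 70 s between groups of events. Whether either of those is what trips `torchprime.metrics.step_duration` is not established here, since the pasted log stops at the `Error executing job with overrides` line and does not include the traceback.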