---
description: Save a file to the cloud to associate the current run
---
There are two ways to save a file to associate with a run:

- Call `wandb.save(filename)`.
- Put a file in the wandb run directory, and it will be uploaded at the end of the run.
{% hint style="info" %}
If you're resuming a run, you can recover a file by calling `wandb.restore(filename)`.
{% endhint %}
If you want to sync files as they're being written, you can specify a filename or glob in `wandb.save`.
See this report for a complete working example.
```python
import os
import wandb

# Save a model file from the current directory
wandb.save('model.h5')

# Save all files that currently exist containing the substring "ckpt"
wandb.save('../logs/*ckpt*')

# Save any files starting with "checkpoint" as they're written to
wandb.save(os.path.join(wandb.run.dir, "checkpoint*"))
```
{% hint style="info" %} W&B's local run directories are by default inside the ./wandb directory relative to your script, and the path looks like run-20171023_105053-3o4933r0 where 20171023_105053 is the timestamp and 3o4933r0 is the ID of the run. You can set the WANDB_DIR environment variable, or the dir keyword argument of wandb.init to an absolute path and files will be written within that directory instead. {% endhint %}
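To make the layout above concrete, here is a minimal stdlib sketch of how the run directory path is composed; the `/tmp/wandb_runs` directory is a hypothetical example, and the timestamp and run ID are the ones from the hint above.

```python
import os

# Point W&B at an absolute directory before the run starts
# (the WANDB_DIR variable is read at wandb.init time).
os.environ["WANDB_DIR"] = "/tmp/wandb_runs"

# Run directories follow the pattern run-<timestamp>-<run_id>:
timestamp = "20171023_105053"
run_id = "3o4933r0"
run_dir = os.path.join(os.environ["WANDB_DIR"], f"run-{timestamp}-{run_id}")
print(run_dir)  # /tmp/wandb_runs/run-20171023_105053-3o4933r0
```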
The file "model.h5" is saved into `wandb.run.dir` and will be uploaded at the end of training.

```python
import os
import wandb

wandb.init()
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          callbacks=[wandb.keras.WandbCallback()])
model.save(os.path.join(wandb.run.dir, "model.h5"))
```
Here's a public example page. You can see on the files tab, there's the model-best.h5. That's automatically saved by default by the Keras integration, but you can save a checkpoint manually and we'll store it for you in association with your run.
You can edit the `wandb/settings` file and set `ignore_globs` to a comma-separated list of globs. You can also set the `WANDB_IGNORE_GLOBS` environment variable. A common use case is to prevent the git patch that we automatically create from being uploaded, e.g. `WANDB_IGNORE_GLOBS=*.patch`.
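As a sketch of how a comma-separated ignore list filters uploads, here is a minimal stdlib illustration using `fnmatch` (the file names are hypothetical; this stands in for the filtering W&B applies, it is not the actual implementation):

```python
import fnmatch

# A comma-separated ignore list, as you might set in WANDB_IGNORE_GLOBS
ignore_globs = "*.patch,*.tmp".split(",")

files = ["diff.patch", "model.h5", "scratch.tmp"]

# Keep only files that match none of the ignore globs
uploaded = [f for f in files
            if not any(fnmatch.fnmatch(f, g) for g in ignore_globs)]
print(uploaded)  # ['model.h5']
```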
If you have a long run, you might want to see files like model checkpoints uploaded to the cloud before the end of the run. By default, we wait to upload most files until the end of the run. You can add `wandb.save('*.pth')` (or just `wandb.save('latest.pth')`) to your script to upload those files whenever they are written or updated.
If you default to saving files in AWS S3 or Google Cloud Storage, you might get this error: `events.out.tfevents.1581193870.gpt-tpu-finetune-8jzqk-2033426287 is a cloud storage url, can't save file to wandb.`
To change the log directory for TensorBoard events files or other files you'd like us to sync, save your files to the wandb.run.dir so they're synced to our cloud.
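The fix above amounts to writing (or copying) your event files into the run directory instead of the cloud-storage path. A minimal stdlib sketch, with a temp dir standing in for `wandb.run.dir` and a hypothetical events file name:

```python
import os
import shutil
import tempfile

# Stand-ins: run_dir plays the role of wandb.run.dir, and the
# source file plays the role of a TensorBoard events file that
# was written somewhere W&B cannot save from.
run_dir = tempfile.mkdtemp()
src = os.path.join(tempfile.mkdtemp(), "events.out.tfevents.12345")
open(src, "w").close()

# Copy the file into the run directory so it gets synced
shutil.copy(src, run_dir)
print(sorted(os.listdir(run_dir)))  # ['events.out.tfevents.12345']
```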
If you'd like to use the run name from within your script, you can use `wandb.run.name` to get the run name, for example "blissful-waterfall-2". You need to call save on the run before being able to access the display name:
```python
run = wandb.init(...)
run.save()
print(run.name)
```
Call `wandb.save("*.pt")` once at the top of your script after `wandb.init`; all files matching that pattern will then save immediately once they're written to `wandb.run.dir`.
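To illustrate which files such a pattern picks up, here is a minimal stdlib sketch using `glob`, with a temp dir standing in for `wandb.run.dir` and hypothetical checkpoint names:

```python
import glob
import os
import tempfile

# Temp dir stands in for wandb.run.dir
run_dir = tempfile.mkdtemp()
for name in ["epoch1.pt", "epoch2.pt", "notes.txt"]:
    open(os.path.join(run_dir, name), "w").close()

# Only files matching the "*.pt" pattern are picked up
matched = sorted(os.path.basename(p)
                 for p in glob.glob(os.path.join(run_dir, "*.pt")))
print(matched)  # ['epoch1.pt', 'epoch2.pt']
```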
There's a `wandb gc` command you can run to remove local files that have already been synced to cloud storage. More information about usage can be found with `wandb gc --help`.