---
description: >-
  Here are some common use cases for pulling down data from W&B using our
  Python API.
---

# API Examples

## Find the run path

To use the public API, you'll often need the run path, which has the form `<entity>/<project>/<run_id>`. In the app, open a run and click on the Overview tab to see the run path for any run.
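For example, with a hypothetical entity `my-team`, project `my-project`, and run ID `1a2b3c4d`, the run path would look like this:

```python
import wandb
api = wandb.Api()

# "my-team/my-project/1a2b3c4d" is a hypothetical <entity>/<project>/<run_id> path
run = api.run("my-team/my-project/1a2b3c4d")
```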

## Read metrics from a run

This example outputs the timestamp and accuracy saved with `wandb.log({"accuracy": acc})` for a run saved to `<entity>/<project>/<run_id>`.

```python
import wandb
api = wandb.Api()

run = api.run("<entity>/<project>/<run_id>")
if run.state == "finished":
    for i, row in run.history().iterrows():
        print(row["_timestamp"], row["accuracy"])
```

## Compare two runs

This outputs the config parameters that differ between `run1` and `run2`.

```python
import wandb
import pandas as pd

api = wandb.Api()

# Replace each path with your <entity>/<project>/<run_id>
run1 = api.run("<entity>/<project>/<run_id>")
run2 = api.run("<entity>/<project>/<run_id>")

df = pd.DataFrame([run1.config, run2.config]).transpose()

df.columns = [run1.name, run2.name]
print(df[df[run1.name] != df[run2.name]])
```

Outputs:

```
              c_10_sgd_0.025_0.01_long_switch base_adam_4_conv_2fc
batch_size                                 32                   16
n_conv_layers                               5                    4
optimizer                             rmsprop                 adam
```

## Update metrics for a run (after the run has finished)

This example sets the accuracy of a previous run to 0.9. It also changes the accuracy histogram of a previous run to the histogram of `numpy_array`.

```python
import wandb
api = wandb.Api()

run = api.run("<entity>/<project>/<run_id>")
run.summary["accuracy"] = 0.9
# numpy_array is a placeholder for your array of values to histogram
run.summary["accuracy_histogram"] = wandb.Histogram(numpy_array)
run.summary.update()
```

## Update config in a run

This example updates one of your configuration settings.

```python
import wandb
api = wandb.Api()

run = api.run("<entity>/<project>/<run_id>")
run.config["key"] = 10
run.update()
```

## Export metrics from a single run to a CSV file

This script finds all the metrics saved for a single run and saves them to a CSV file.

```python
import wandb
api = wandb.Api()

# The run is specified by <entity>/<project>/<run_id>
run = api.run("<entity>/<project>/<run_id>")

# Save the metrics for the run to a CSV file
metrics_dataframe = run.history()
metrics_dataframe.to_csv("metrics.csv")
```
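If you only need a subset of the logged metrics, `history()` also accepts a `keys` argument. A minimal sketch, assuming you logged a metric named `"accuracy"` (the metric name and output filename are placeholders):

```python
import wandb
api = wandb.Api()

run = api.run("<entity>/<project>/<run_id>")
# Only pull the "accuracy" column instead of every logged metric
accuracy_df = run.history(keys=["accuracy"])
accuracy_df.to_csv("accuracy.csv")
```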

## Export metrics from a large single run without sampling

The default `history` method samples the metrics down to a fixed number of points (the default is 500; you can change this with the `samples` argument). If you want to export all of the data from a large run, use the `run.scan_history()` method. This script loads all of the `Loss` metrics into the variable `losses` for a longer run.

```python
import wandb
api = wandb.Api()

run = api.run("<entity>/<project>/<run_id>")
history = run.scan_history()
losses = [row["Loss"] for row in history]
```
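As a middle ground, you can raise the sample count with the `samples` argument mentioned above, or build a DataFrame from the scanned rows. A sketch, where 2000 is an arbitrary example value:

```python
import wandb
import pandas as pd

api = wandb.Api()
run = api.run("<entity>/<project>/<run_id>")

# Raise the sample limit from the default of 500 points
sampled_df = run.history(samples=2000)

# Or collect every logged row from scan_history into a DataFrame
full_df = pd.DataFrame(list(run.scan_history()))
```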

## Export metrics from all runs in a project to a CSV file

This script finds a project and outputs a CSV of runs with names, configs, and summary stats.

```python
import wandb
import pandas as pd

api = wandb.Api()

# Replace with your <entity>/<project>
runs = api.runs("<entity>/<project>")
summary_list = []
config_list = []
name_list = []
for run in runs:
    # run.summary contains the output keys/values, like accuracy.
    # We use ._json_dict to omit large files.
    summary_list.append(run.summary._json_dict)

    # run.config contains the input hyperparameters.
    # We remove special values that start with _.
    config_list.append({k: v for k, v in run.config.items() if not k.startswith("_")})

    # run.name is the human-readable name of the run.
    name_list.append(run.name)

summary_df = pd.DataFrame.from_records(summary_list)
config_df = pd.DataFrame.from_records(config_list)
name_df = pd.DataFrame({"name": name_list})
all_df = pd.concat([name_df, config_df, summary_df], axis=1)

all_df.to_csv("project.csv")
```
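If you only want a subset of the project's runs, `api.runs` also accepts a MongoDB-style `filters` dictionary. A sketch, where the config key `experiment_name` and its value are hypothetical:

```python
import wandb
api = wandb.Api()

# Only fetch runs whose config has experiment_name == "baseline" (hypothetical key/value)
runs = api.runs(
    "<entity>/<project>",
    filters={"config.experiment_name": "baseline"},
)
print(f"Found {len(runs)} matching runs")
```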

## Download a file from a run

This example finds the file `model-best.h5` associated with a run and saves it locally.

```python
import wandb
api = wandb.Api()

run = api.run("<entity>/<project>/<run_id>")
run.file("model-best.h5").download()
```

## Download all files from a run

This example finds all files associated with a run and saves them locally. (Note: you can also accomplish this by running `wandb restore <RUN_ID>` from the command line.)

```python
import wandb
api = wandb.Api()

run = api.run("<entity>/<project>/<run_id>")
for file in run.files():
    file.download()
```
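If you want the files somewhere other than the current directory, `download` takes a `root` argument for the target directory and a `replace` flag to overwrite existing copies. A minimal sketch, where the directory name is a placeholder:

```python
import wandb
api = wandb.Api()

run = api.run("<entity>/<project>/<run_id>")
for file in run.files():
    # "run_files" is an arbitrary target directory; replace=True overwrites existing copies
    file.download(root="run_files", replace=True)
```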

## Download the best model file

This example sorts the runs in a sweep by validation accuracy and downloads the best model file from the top run.

```python
import wandb
api = wandb.Api()

sweep = api.sweep("<entity>/<project>/<sweep_id>")
runs = sorted(sweep.runs, key=lambda run: run.summary.get("val_acc", 0), reverse=True)
val_acc = runs[0].summary.get("val_acc", 0)
print(f"Best run {runs[0].name} with {val_acc}% validation accuracy")
runs[0].file("model-best.h5").download(replace=True)
print("Best model saved to model-best.h5")
```

## Get runs from a specific sweep

```python
import wandb
api = wandb.Api()

sweep = api.sweep("<entity>/<project>/<sweep_id>")
print(sweep.runs)
```
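`sweep.runs` is iterable, so you can loop over it to inspect individual runs. A sketch, where `"val_acc"` is a placeholder for whichever summary metric your sweep logs:

```python
import wandb
api = wandb.Api()

sweep = api.sweep("<entity>/<project>/<sweep_id>")
for run in sweep.runs:
    # "val_acc" is a placeholder summary key; substitute the metric your sweep logs
    print(run.name, run.summary.get("val_acc"))
```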

## Download system metrics data

This gives you a DataFrame with all the system metrics for a run.

```python
import wandb
api = wandb.Api()

run = api.run("<entity>/<project>/<run_id>")
system_metrics = run.history(stream="events")
```
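As with the regular metrics above, the result is a DataFrame, so you can export it the same way (a minimal sketch; the filename is a placeholder):

```python
# Save the system metrics to a CSV file
system_metrics.to_csv("system_metrics.csv")
```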

## Update summary metrics

You can pass a dictionary to update the summary metrics of a run.

```python
import wandb
api = wandb.Api()

run = api.run("<entity>/<project>/<run_id>")
# "key" and val are placeholders for the summary entry you want to set
run.summary.update({"key": val})
```