# chore: lint code and text in integration tutorials #1167

Open · wants to merge 1 commit into base `main`
## `content/tutorials/integration-tutorials/huggingface.md` (96 changes: 64 additions & 32 deletions)
@@ -6,79 +6,99 @@ menu:
title: Hugging Face
weight: 3
---

{{< img src="/images/tutorials/huggingface.png" alt="" >}}

{{< cta-button colabLink="https://colab.research.google.com/github/wandb/examples/blob/master/colabs/huggingface/Huggingface_wandb.ipynb" >}}
Visualize your [Hugging Face](https://github.com/huggingface/transformers)
> **Contributor:** Can you please undo the hard line wrapping that is introduced in this PR? Markdown is easier to maintain and review with soft wrapping per paragraph.

model's performance quickly with a seamless [W&B](https://wandb.ai/site)
integration.

Compare hyperparameters, output metrics, and system stats like GPU utilization
across your models.

## Why should I use W&B?

{.skipvale}

{{< img src="/images/tutorials/huggingface-why.png" alt="" >}}

- **Unified dashboard**: Central repository for all your model metrics and
predictions
- **Lightweight**: No code changes required to integrate with Hugging Face
- **Accessible**: Free for individuals and academic teams
- **Secure**: All projects are private by default
- **Trusted**: Used by machine learning teams at OpenAI, Toyota, Lyft and more

Think of W&B like GitHub for machine learning models: save machine learning
experiments to your private, hosted dashboard. Experiment quickly with the
confidence that all the versions of your models are saved for you, no matter
where you're running your scripts.

W&B's lightweight integration works with any Python script, and all you need
to do is sign up for a free W&B account to start tracking and visualizing your
models.

In the Hugging Face Transformers repo, we've instrumented the Trainer to
automatically log training and evaluation metrics to W&B at each logging step.

Here's an in-depth look at how the integration works:
[Hugging Face + W&B Report](https://app.wandb.ai/jxmorris12/huggingface-demo/reports/Train-a-model-with-Hugging-Face-and-Weights-%26-Biases--VmlldzoxMDE2MTU).

## Install, import, and log in

Install the Hugging Face and Weights & Biases libraries, and the GLUE dataset
and training script for this tutorial.


- [Hugging Face Transformers](https://github.com/huggingface/transformers):
Natural language models and datasets
- [Weights & Biases]({{< relref "/" >}}): Experiment tracking and visualization

- [GLUE dataset](https://gluebenchmark.com/): A language understanding benchmark
dataset
- [GLUE script](https://raw.githubusercontent.com/huggingface/transformers/refs/heads/main/examples/pytorch/text-classification/run_glue.py):
Model training script for sequence classification

```notebook
!pip install datasets wandb evaluate accelerate -qU
!wget https://raw.githubusercontent.com/huggingface/transformers/refs/heads/main/examples/pytorch/text-classification/run_glue.py
```


```notebook
# the run_glue.py script requires transformers dev
!pip install -q git+https://github.com/huggingface/transformers
```

Before continuing,
[sign up for a free account](https://app.wandb.ai/login?signup=true).

## Put in your API key
> **Contributor suggestion:** `## Put in your API key` → `## Populate your API key`



Once you've signed up, run the next cell and click on the link to get your API
key and authenticate this notebook.

```python
import wandb
wandb.login()
```


Optionally, we can set environment variables to customize W&B logging. See
[documentation]({{< relref "/guides/integrations/huggingface/" >}}).
> **Contributor suggestion (lines +84 to +85):** Replace the two sentences above with: "To customize W&B logging by setting environment variables, refer to the [Hugging Face documentation]({{< relref "/guides/integrations/huggingface/" >}})."


> **Contributor suggestion:** Use `` ```notebook `` instead of `` ```jupyter ``. Also elsewhere in this PR.

```jupyter
# Optional: log both gradients and parameters
%env WANDB_WATCH=all
```
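Outside a notebook, the `%env` magics used in this tutorial have a plain-Python equivalent; a stdlib-only sketch:

```python
import os

# Equivalent of the %env magics: set W&B variables before training starts.
os.environ["WANDB_PROJECT"] = "huggingface-demo"  # project name used in this tutorial
os.environ["WANDB_WATCH"] = "all"                 # log gradients and parameters

print(os.environ["WANDB_PROJECT"])  # → huggingface-demo
```

Set these before `wandb` (or the training script) runs so the integration picks them up.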

## Train the model

Next, call the downloaded training script
[run_glue.py](https://huggingface.co/transformers/examples.html#glue) and see
training automatically get tracked to the Weights & Biases dashboard. This
script fine-tunes BERT on the Microsoft Research Paraphrase Corpus: pairs of
sentences with human annotations indicating whether they are semantically
equivalent.
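For intuition, MRPC-style examples are just labeled sentence pairs; a tiny illustrative record (made-up sentences, not taken from the dataset) looks like:

```python
# Hypothetical MRPC-style records: sentence pairs with a binary
# "is paraphrase" label, mirroring the task run_glue.py fine-tunes on.
examples = [
    {"sentence1": "The company posted strong profits.",
     "sentence2": "Profits at the firm were strong.",
     "label": 1},  # semantically equivalent
    {"sentence1": "The company posted strong profits.",
     "sentence2": "The weather was cold on Tuesday.",
     "label": 0},  # not equivalent
]

positives = sum(ex["label"] for ex in examples)
print(positives)  # → 1
```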

```jupyter
%env WANDB_PROJECT=huggingface-demo
%env TASK_NAME=MRPC

# ... collapsed in diff (@@ -96,27 +116,39 @@) ...
--logging_steps 50
```

## Visualize results in dashboard

Click the link printed out above, or go to [wandb.ai](https://app.wandb.ai) to
see your results stream in live. The link to see your run in the browser will
appear after all the dependencies are loaded. Look for the following output:
"**wandb**: 🚀 View run at [URL to your unique run]"

**Visualize Model Performance** It's easy to look across dozens of experiments,
zoom in on interesting findings, and visualize highly dimensional data.

{{< img src="/images/tutorials/huggingface-visualize.gif" alt="" >}}

**Compare Architectures** Here's an example comparing
[BERT vs DistilBERT](https://app.wandb.ai/jack-morris/david-vs-goliath/reports/Does-model-size-matter%3F-Comparing-BERT-and-DistilBERT-using-Sweeps--VmlldzoxMDUxNzU).
It's easy to see how different architectures affect the evaluation accuracy
throughout training with automatic line plot visualizations.

{{< img src="/images/tutorials/huggingface-comparearchitectures.gif" alt="" >}}

## Track key information effortlessly by default

Weights & Biases saves a new run for each experiment. Here's the information
that gets saved by default:

- **Hyperparameters**: Settings for your model are saved in Config
- **Model Metrics**: Time series data of metrics streaming in are saved in Log
- **Terminal Logs**: Command line outputs are saved and available in a tab
- **System Metrics**: GPU and CPU utilization, memory, temperature, etc.
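As a loose illustration of the Config/Log split above (a stand-in class, not the real `wandb` API): static hyperparameters live in a run's config, while each `log` call appends a time-series row.

```python
# Stand-in run object illustrating what W&B records by default:
# static hyperparameters go to `config`, streamed metrics to `history`.
class FakeRun:
    def __init__(self, config):
        self.config = dict(config)   # hyperparameters (W&B "Config")
        self.history = []            # time-series metrics (W&B "Log")

    def log(self, metrics):
        self.history.append(dict(metrics))

run = FakeRun({"learning_rate": 2e-5, "epochs": 3})
for step in range(3):
    run.log({"step": step, "loss": 1.0 / (step + 1)})

print(len(run.history))  # → 3
```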

## Learn more

- [Documentation]({{< relref "/guides/integrations/huggingface" >}}): docs on
the Weights & Biases and Hugging Face integration
- [Videos](http://wandb.me/youtube): tutorials, interviews with practitioners,
and more on our YouTube channel
- Contact: Message us at [email protected] with questions
## `content/tutorials/integration-tutorials/keras.md` (39 changes: 20 additions & 19 deletions)
@@ -7,23 +7,25 @@ title: Keras
---

{{< cta-button colabLink="https://colab.research.google.com/github/wandb/examples/blob/master/colabs/keras/Use_WandbMetricLogger_in_your_Keras_workflow.ipynb" >}}
Use Weights & Biases for machine learning experiment tracking, dataset
versioning, and project collaboration.

{{< img src="/images/tutorials/huggingface-why.png" alt="" >}}


This notebook introduces the `WandbMetricsLogger` callback. Use this callback
for [Experiment Tracking]({{< relref "/guides/models/track" >}}). It will log
your training and validation metrics along with system metrics to Weights and
Biases.

## Setup and Installation


First, let us install the latest version of Weights and Biases. We will then
authenticate this colab instance to use W&B.

> **Contributor suggestion:** Replace with "In the notebook, install Weights and Biases and sign in." It's better to avoid "let us" and "we" in favor of the simple imperative. I believe both of these code blocks should be tagged up as notebooks.

```shell
pip install -qq -U wandb
```


```python
import os
import tensorflow as tf
# ... collapsed in diff (@@ -36,17 +38,20 @@) ...
import wandb
from wandb.integration.keras import WandbMetricsLogger
```


If this is your first time using W&B or you are not logged in, the link that
appears after running `wandb.login()` will take you to the sign-up/login page.
Signing up for a [free account](https://wandb.ai/signup) is as easy as a few
clicks.
> **Contributor suggestion (lines +41 to +44):** Replace with: "If your browser is not logged in to W&B, the link that appears after running `wandb.login()` will take you to the login page. If you don't yet have an account, [sign up for free](https://wandb.ai/signup), then log in."


```python
wandb.login()
```

## Hyperparameters


Use of proper config system is a recommended best practice for reproducible
machine learning. We can track the hyperparameters for every experiment using
W&B. In this colab we will be using simple Python `dict` as our config system.
> **Contributor suggestion (lines +52 to +54):** Replace with: "To reproduce a machine learning experiment, it's important to save the hyperparameters that the experiment uses. This example shows how to track an experiment's hyperparameters in a Python `dict`." This sentence is very hard for me to understand (not due to your changes). Here is my attempt to help, which may in fact be incorrect.
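A concrete sketch of the pattern with hypothetical keys (the tutorial's actual dict is collapsed in this diff):

```python
# Hypothetical hyperparameter dict; the real tutorial's keys are not shown here.
configs = dict(
    batch_size=64,
    image_size=28,
    learning_rate=1e-3,
    epochs=5,
)

# Passing this to wandb.init(config=configs) records every key with the run;
# the training code then reads values back from the same dict.
print(configs["batch_size"])  # → 64
```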


```python
configs = dict(
    # ... body collapsed in diff (@@ -63,14 +68,15 @@) ...
)
```

## Dataset


In this colab, we will be using the
[CIFAR100](https://www.tensorflow.org/datasets/catalog/cifar100) dataset from
the TensorFlow Datasets catalog. We aim to build a simple image classification
pipeline using TensorFlow/Keras.

```python
train_ds, valid_ds = tfds.load("fashion_mnist", split=["train", "test"])
```


```python
AUTOTUNE = tf.data.AUTOTUNE

# ... collapsed in diff (@@ -98,15 +104,13 @@) ...

def get_dataloader(ds, configs, dataloader_type="train"):
    # ... body collapsed in diff ...
    return dataloader
```


```python
trainloader = get_dataloader(train_ds, configs)
validloader = get_dataloader(valid_ds, configs, dataloader_type="valid")
```
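`get_dataloader`'s role (shuffle, batch, prefetch) can be sketched with a stdlib-only stand-in; this is purely illustrative and does not use TensorFlow:

```python
import random

def get_batches(samples, batch_size, shuffle=True, seed=0):
    """Stdlib stand-in for the tf.data pipeline: optional shuffle, then batch."""
    items = list(samples)
    if shuffle:
        random.Random(seed).shuffle(items)
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

batches = get_batches(range(10), batch_size=4)
print([len(b) for b in batches])  # → [4, 4, 2]
```

The real pipeline returns a `tf.data.Dataset`, but the batching arithmetic is the same.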

## Model


```python
def get_model(configs):
    backbone = tf.keras.applications.mobilenet_v2.MobileNetV2(
        # ... collapsed in diff (@@ -127,7 +131,6 @@) ...
    )
    # ... remaining lines collapsed in diff ...
    return models.Model(inputs=inputs, outputs=outputs)
```


```python
tf.keras.backend.clear_session()
model = get_model(configs)
# ... collapsed in diff (@@ -136,7 +139,6 @@) ...
model.summary()
```

## Compile Model


```python
model.compile(
optimizer="adam",
    # ... remaining arguments collapsed in diff (@@ -150,7 +152,6 @@) ...
)
```

## Train


```python
# Initialize a W&B run
run = wandb.init(project="intro-keras", config=configs)
# ... collapsed in diff (@@ -167,4 +168,4 @@) ...
model.fit(
    # ... arguments collapsed in diff ...
)

# Close the W&B run
run.finish()
```