
How to freeze parameters with nnx and optax? #4167

Open
maxencefaldor opened this issue Sep 1, 2024 · 7 comments

maxencefaldor commented Sep 1, 2024

I would like to know the best way to freeze parameters in a model using nnx and optax (see https://flax.readthedocs.io/en/latest/guides/training_techniques/transfer_learning.html#optax-multi-transform).

I think it would be useful to add an example to https://flax.readthedocs.io/en/latest/nnx/index.html.
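
For reference, the optax.multi_transform pattern described in the linked guide looks roughly like this (a minimal sketch with a hypothetical params tree split into 'backbone' and 'head'; optax.set_to_zero is used to freeze the backbone):

import jax
import jax.numpy as jnp
import optax

# Hypothetical params pytree with a frozen 'backbone' and a trainable 'head'.
params = {
  'backbone': {'kernel': jnp.ones((784, 1024))},
  'head': {'kernel': jnp.ones((1024, 10))},
}

# Label each leaf by its top-level key.
param_labels = jax.tree_util.tree_map_with_path(
  lambda path, _: 'frozen' if path[0].key == 'backbone' else 'trainable',
  params,
)

tx = optax.multi_transform(
  {'trainable': optax.adamw(3e-4), 'frozen': optax.set_to_zero()},
  param_labels,
)
opt_state = tx.init(params)

grads = jax.tree_util.tree_map(jnp.ones_like, params)  # dummy grads
updates, opt_state = tx.update(grads, opt_state, params)
# updates['backbone'] is all zeros, so the backbone stays frozen;
# updates['head'] contains the adamw updates.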

cgarciae (Collaborator) commented Sep 2, 2024

Check out 08_save_load_checkpoints.py, which contains a very simple example of Orbax checkpointing. We will have a proper checkpointing guide in the future as we migrate the Linen docs.

maxencefaldor (Author) commented

Thank you for your response, but I believe there's been a misunderstanding. My question was about freezing parameters for transfer learning, not about checkpointing or saving/loading model states.

cgarciae (Collaborator) commented Sep 2, 2024

Oh no, I'm sorry! I misread that as an Orbax question.

Transfer learning, and model surgery in general, is a lot simpler with the new nnx API. Here's a small working example:

import jax.numpy as jnp
import optax
from flax import nnx

class Classifier(nnx.Module):
  def __init__(self, embed_dim, num_classes, backbone, rngs):
    self.backbone = backbone
    self.head = nnx.Linear(embed_dim, num_classes, rngs=rngs)

  def __call__(self, x):
    x = self.backbone(x)
    x = self.head(x)
    return x

def load_model():
  # stand-in for loading a pretrained backbone
  return nnx.Linear(784, 1024, rngs=nnx.Rngs(0))

backbone = load_model()
classifier = Classifier(1024, 10, backbone, rngs=nnx.Rngs(1))

# filter that selects only the Params under the 'head' path
head_params = nnx.All(nnx.Param, nnx.PathContains('head'))

optimizer = nnx.Optimizer(
  classifier,
  tx=optax.adamw(3e-4),
  wrt=head_params,  # only optimize the head params
)

# simple train step
@nnx.jit
def train_step(model, optimizer, x, y):
  def loss_fn(model):
    logits = model(x)
    return optax.softmax_cross_entropy_with_integer_labels(logits, y).mean()

  # differentiate only w.r.t. the head params of the first argument (argnums=0)
  diff_state = nnx.DiffState(0, head_params)
  grads = nnx.grad(loss_fn, argnums=diff_state)(model)
  optimizer.update(grads)

x = jnp.ones((1, 784))
y = jnp.ones((1,), jnp.int32)
train_step(classifier, optimizer, x, y)
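
A quick sanity check (a sketch of my own, assuming the example above has already run): since jax arrays are immutable, you can snapshot the parameter leaves before a step and compare them afterwards; only the head leaves should change.

import jax

# Snapshot all Param leaves, take one step, compare.
before = jax.tree_util.tree_leaves(nnx.state(classifier, nnx.Param))
train_step(classifier, optimizer, x, y)
after = jax.tree_util.tree_leaves(nnx.state(classifier, nnx.Param))

# True for the frozen backbone leaves, False for the updated head leaves.
print([bool(jnp.all(a == b)) for a, b in zip(before, after)])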

maxencefaldor (Author) commented

Excellent, that's exactly what I was looking for!

mmorinag127 commented

I also have a similar question about adamw's mask for weight decay.
How can I specify which parameters the weight decay is applied to?
Is there an elegant way to do it, like the above?

cgarciae (Collaborator) commented Sep 17, 2024

@mmorinag127 Because the nnx.Optimizer wrapper is very generic, there is no support for masks using filters, but you can probably generate a compatible mask from the state, e.g.:

state = nnx.state(classifier, head_params)
mask = create_mask(state) # TODO

optimizer = nnx.Optimizer(
  classifier,
  tx=optax.adamw(3e-4, mask=mask),
  wrt=head_params,  # filter head params
)
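
The create_mask above is left as a TODO; one possible sketch (my own assumption, not from the thread) is to apply weight decay only to leaves with two or more dimensions, i.e. kernels but not biases:

import jax

# Hypothetical create_mask: returns a boolean pytree with the same structure
# as `state`, True where weight decay should apply (2D+ leaves), False elsewhere.
def create_mask(state):
  return jax.tree_util.tree_map(lambda leaf: leaf.ndim >= 2, state)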

mmorinag127 commented

Thanks a lot @cgarciae !!
