I've found the Zuko library extremely useful for my work, and I sincerely appreciate the effort that has gone into its development. In the Masked Autoregressive Flow paper (Papamakarios et al., NeurIPS 2017), the authors apply batch normalization after each autoregressive layer. Could this be supported in the MaskedAutoregressiveTransform module?
I am not a fan of batch normalization, as it often leads to train/test gaps that are hard to diagnose, but I see why one would want to use it (mainly faster training).
IMO the best way to add batch normalization in Zuko would be to implement a standalone (lazy) BatchNormTransform. The user could then insert batch norm transformations anywhere in the flow.
We would accept a PR that implements this.
Edit: I think that normalizing with the current batch statistics is invalid, as the result would not be an invertible transformation $y = f(x)$: the statistics depend on the whole batch, so it is impossible to recover $x$ from $y$ alone. So, we should use running statistics both during training and evaluation, and update these statistics during training.
Also, I am not sure that learnable scale and shift parameters are relevant, since the target is zero mean and unit variance.
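For concreteness, here is a minimal sketch of what such a transform could look like, assuming Zuko's convention that a lazy transform is an nn.Module whose forward(c) returns a torch.distributions.Transform. The class names BatchNormTransform and _BatchNormTransform are placeholders, not part of Zuko's API; following the remark above, no learnable scale/shift is included, and only the running statistics are ever used to normalize.

```python
import torch
import torch.nn as nn
from torch.distributions import Transform, constraints


class _BatchNormTransform(Transform):
    """Affine normalization built from running statistics only."""

    domain = constraints.real_vector
    codomain = constraints.real_vector
    bijective = True

    def __init__(self, module: "BatchNormTransform"):
        super().__init__()
        self.module = module

    def _call(self, x: torch.Tensor) -> torch.Tensor:
        m = self.module

        # Normalize with the running statistics, never the batch
        # statistics, so that the map stays invertible.
        y = (x - m.running_mean) / torch.sqrt(m.running_var + m.eps)

        if m.training:
            # Update the running statistics afterwards, as a side effect.
            # Note: an inverse computed after this update uses the new
            # statistics, which is inherent to running-stats batch norm.
            with torch.no_grad():
                m.running_mean.lerp_(x.mean(dim=0), m.momentum)
                m.running_var.lerp_(x.var(dim=0, unbiased=False), m.momentum)

        return y

    def _inverse(self, y: torch.Tensor) -> torch.Tensor:
        m = self.module
        return y * torch.sqrt(m.running_var + m.eps) + m.running_mean

    def log_abs_det_jacobian(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        m = self.module
        # Diagonal Jacobian: -0.5 * log(var + eps), summed over features.
        ladj = -0.5 * torch.log(m.running_var + m.eps).sum()
        return ladj.expand(x.shape[:-1])


class BatchNormTransform(nn.Module):
    """Lazy batch-norm transform; forward(c) returns the actual transform.

    Running statistics are used during both training and evaluation,
    and are updated during training.
    """

    def __init__(self, features: int, momentum: float = 0.1, eps: float = 1e-5):
        super().__init__()
        self.momentum = momentum
        self.eps = eps
        self.register_buffer('running_mean', torch.zeros(features))
        self.register_buffer('running_var', torch.ones(features))

    def forward(self, c: torch.Tensor = None) -> Transform:
        # The context c is ignored: the transformation is unconditional.
        return _BatchNormTransform(self)
```

One could then interleave such modules with the autoregressive transforms when assembling a flow, which keeps batch normalization orthogonal to MaskedAutoregressiveTransform itself.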