about the prior distribution #6
Comments
In the top prior layer, the mean and logs are shared across the spatial dimensions in the non-conditional case (ycond=False), i.e. (mean, logs) is a tensor of shape (1, 1, 1, 2n). In the implementation we set (mean, logs) to be the bias of that conv layer and let conv(0) broadcast the bias to shape (batch_size, height, width, 2n). Nothing more than that. So you can replace it with (mean, logs) = tf.get_variable([1, 1, 1, 2*n]) plus tf.tile() to get the right shape; it's just a programming trick.
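For concreteness, a minimal TensorFlow 1.x sketch of the two equivalent ways to obtain the shared (mean, logs); the function and variable names (`shared_prior_stats_via_conv`, `prior_bias`, `z`, `n`) are illustrative, not the repository's own identifiers:

```python
import tensorflow as tf

def shared_prior_stats_via_conv(z, n):
    # conv(0) trick: a conv applied to an all-zeros input returns only its
    # bias, which broadcasting expands to (batch, height, width, 2n).
    h = tf.zeros_like(z)[:, :, :, :1]  # all-zeros input, one channel
    h = tf.layers.conv2d(h, filters=2 * n, kernel_size=1,
                         bias_initializer=tf.zeros_initializer())
    mean, logs = tf.split(h, 2, axis=-1)
    return mean, logs

def shared_prior_stats_via_variable(z, n):
    # Equivalent explicit form: one learned (1, 1, 1, 2n) tensor tiled
    # to the full (batch, height, width, 2n) shape.
    bias = tf.get_variable("prior_bias", shape=[1, 1, 1, 2 * n],
                           initializer=tf.zeros_initializer())
    s = tf.shape(z)
    h = tf.tile(bias, tf.stack([s[0], s[1], s[2], 1]))
    mean, logs = tf.split(h, 2, axis=-1)
    return mean, logs
```

Both versions learn exactly one (1, 1, 1, 2n) set of parameters; the conv form just reuses machinery that already exists in the codebase.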
I see that the code uses
It's for training stability; see the experiments section of our paper.
In the top layer, the prior distribution uses h = conv(0) + embedding as the mean and std in the case of ycond=True.
It seems that the conv layer is unnecessary.
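For reference, a hedged sketch of the conditional setup the question describes (ycond=True), again in TensorFlow 1.x; `y_onehot` and the layer choices are assumptions for illustration, not the repository's code:

```python
import tensorflow as tf

def conditional_prior_stats(z, y_onehot, n):
    # Class-independent part: the same conv(0) trick, i.e. a learned bias
    # broadcast over the batch and spatial dimensions.
    h = tf.zeros_like(z)[:, :, :, :1]
    h = tf.layers.conv2d(h, filters=2 * n, kernel_size=1)
    # Class-dependent part: project the one-hot label to 2n channels and
    # add it as a per-example embedding.
    emb = tf.layers.dense(y_onehot, 2 * n, use_bias=False)
    h = h + tf.reshape(emb, [-1, 1, 1, 2 * n])
    mean, logs = tf.split(h, 2, axis=-1)
    return mean, logs
```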