Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add event_ndims to Transform class to fix shape errors in non-autobatching mode #321

Closed

Conversation

jdehning
Copy link

@jdehning jdehning commented Sep 21, 2020

Closes #297

Description of the problem:

When not using auto_batching, the event_shape of the distributions and potentials must match. The reason is, when the log-likelihood is calculated, it is not possible to reduce the sum over all dimensions, as the batch_shape has to be preserved. This works well for distributions, however, transformations add a potential as log_det_jacobian with too high dimensionality. It is assumed in SamplingState.collect_unreduced_log_prob that each potential and distribution consist only of batch_dim as shape, however the backward_log_det_jacobian also includes the event_dim.

Solution

An event_ndims attribute is added to the transform class that saves the number of event dimensions of the distributions it belongs to. This number is passed to the log_det_jacobian function implemented in tensorflow_probability, which then reduces over these axes. At the end of the initialization of a pymc4 distribution, the event_ndims attribute of the transform is set by reading out the event_shape attribute of the just initialized tfp.distributions.distribution.
There is some additional code to take care of the case when the transformation changes the number of dimensions. This change of dimensions is defined in the inverse_event_ndims method of the tfp.bijectors.bijector, which is used to calculate the number of dimensions of the transformed variable.

Comparison to tensorflow_probability

In tensorflow_probability, in TransformedTransitionKernel, a similar problem has been already been solved. Their solution was to calculate the rank of the untransformed likelihood function, which is equal to the length of the batch_shape and subtract it from the rank of the input to each bijector to get event_ndims link. This is however done at the sampling stage. I suppose that theoretically, one could also implement something similar in PyMC4 in the SamplingExecutor, however, I think that it is cleaner to do it already at the initialization stage because the logic for the event_shapes of the distributions is also implemented there.

Implementation details

  • The event_ndims is read out from the tfp.distribution with prefer_static, which is copied from link.
  • One can provide a event_ndims as initialization parameter to a transform object, which than hinders the automatic setting of event_ndims by the distribution class.

Other Changes

@jdehning jdehning marked this pull request as draft September 21, 2020 00:42
@jdehning jdehning force-pushed the transformation_event_ndims branch from d4ac695 to 954c4ca Compare September 22, 2020 10:04
…tching is performed by adding an event_ndims attribute to the Transform class
@jdehning jdehning force-pushed the transformation_event_ndims branch from b4fb90a to e150815 Compare September 22, 2020 10:19
@jdehning jdehning marked this pull request as ready for review September 22, 2020 10:31
@codecov
Copy link

codecov bot commented Sep 22, 2020

Codecov Report

Merging #321 into master will decrease coverage by 0.15%.
The diff coverage is 70.27%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #321      +/-   ##
==========================================
- Coverage   90.89%   90.74%   -0.16%     
==========================================
  Files          36       36              
  Lines        2822     2841      +19     
==========================================
+ Hits         2565     2578      +13     
- Misses        257      263       +6     
Impacted Files Coverage Δ
pymc4/distributions/multivariate.py 100.00% <ø> (ø)
pymc4/inference/sampling.py 94.11% <ø> (ø)
pymc4/distributions/transforms.py 75.00% <66.66%> (-3.27%) ⬇️
pymc4/distributions/distribution.py 88.88% <100.00%> (+0.26%) ⬆️

@jdehning
Copy link
Author

There is still an error with the notebooks tests. I don't know exactly how to best fix it

@jdehning
Copy link
Author

jdehning commented Sep 22, 2020

The notebook creation is fixed in the makefile. Works now but I am not 100% sure that this is the correct fix.

@fonnesbeck fonnesbeck closed this Jan 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error in transformed variables with manual model vectorization
2 participants