Enhance wrapped distributions #414

alyst · 2022-06-25T06:13:32Z

Add a basic WrappedDistribution type for NoDist and NamedDist and teach them a few tricks like length() and bijector().
I discovered that these methods were missing when trying to call

DynamicPPL.tilde_assume!!(context, NoDist(prior), @varname(v), varinfo)

where prior was a multivariate Product distribution. With the changes in this PR it works.

devmotion · 2022-06-25T19:13:55Z

I'm slightly worried about the additional complexity introduced by the new abstract type and functions such as wrapped_dist and wrapped_dist_type. Can't we just add whatever definition was missing?

In general, both distributions are only used internally in DynamicPPL and hence only the parts of the Distributions API relevant for DynamicPPL are implemented. What exactly was missing? Did you actually try to call tilde_assume!! directly?

alyst · 2022-06-25T19:51:20Z

I'm slightly worried about the additional complexity introduced by the new abstract type and functions such as wrapped_dist and wrapped_dist_type. Can't we just add whatever definition was missing?

It's just one abstract type and a few standard boilerplate definitions around it (wrapped_distr() etc.). OTOH it avoids duplicating method definitions like length(). I see your point, but I think both approaches have advantages in terms of maintenance. Before this patch I got errors about length() and bijector() missing for NoDist, and I can see how more methods from the Distributions API might be required in the future, so this PR makes it easier to add them.

Did you actually try to call tilde_assume!! directly?

Yes. I'm not using the @model macro; I'm using DynamicPPL directly to have more control and flexibility when generating statistical models.
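
As a rough illustration of the trade-off discussed here, the sketch below shows how a single abstract supertype plus one accessor lets a method such as length() be forwarded once for every wrapper. All names (AbstractWrapper, NoDistLike, NamedDistLike, Inner, wrapped) are hypothetical stand-ins and not the PR's actual code; Inner merely plays the role of a multivariate distribution.

```julia
# Hypothetical names throughout; this is NOT the code from the PR.
abstract type AbstractWrapper end

# Stand-in for a multivariate distribution with a known length.
struct Inner
    n::Int
end
Base.length(d::Inner) = d.n

# Two toy wrappers that both hold an inner "distribution".
struct NoDistLike{D} <: AbstractWrapper
    dist::D
end

struct NamedDistLike{D} <: AbstractWrapper
    dist::D
    name::Symbol
end

# The single accessor every wrapper is assumed to support.
wrapped(d::AbstractWrapper) = d.dist

# Forwarded once; applies to every subtype of AbstractWrapper.
Base.length(d::AbstractWrapper) = length(wrapped(d))

n_nodist = length(NoDistLike(Inner(5)))
n_named = length(NamedDistLike(Inner(3), :x))
```

The alternative raised in the review would be to define Base.length separately for NoDistLike and NamedDistLike, which avoids the extra type at the cost of one method per wrapper.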

devmotion · 2022-06-25T19:57:40Z

It's just one abstract type and a few standard boilerplate definitions around it (wrapped_distr() etc.). OTOH it avoids duplicating method definitions like length(). I see your point, but I think both approaches have advantages in terms of maintenance. Before this patch I got errors about length() and bijector() missing for NoDist, and I can see how more methods from the Distributions API might be required in the future, so this PR makes it easier to add them.

I can see that point, but I'm probably biased here towards not adding additional types and things that are potentially useful at some point in the future due to the history of DynamicPPL, and VarInfo in particular: At this point it is really unclear what methods in varinfo.jl are needed, useful or should be removed. That even motivated a complete refactor and rewrite but it is still messy.

So my suggestion would be to

- add an MWE to the tests that is currently failing,
- and add only the missing definitions that make the test pass.

Did you actually try to call tilde_assume!! directly?

It would be interesting to know if that can be reproduced with a regular @model as well, or if there is some problem with how tilde_assume!! was called.

yebai

It looks sensible to me. I agree with @devmotion that we want to be careful about introducing additional types and functions into DynamicPPL in general. It does seem that this PR only adds an internal type that fixes some known issues.

For the future, we probably want to move distribution_wrappers into src/contrib so it is clear they are not part of the official DynamicPPL API.

devmotion · 2022-06-27T19:13:10Z

Can we add at least tests for every new function and type and fix the CI errors?

And I think it would be nice to see as well what actually went wrong and what has to be fixed.

devmotion · 2022-06-27T19:14:30Z

Oh it seems maybe @torfjelde has already fixed the problems in 0f9765b?

alyst · 2022-06-27T21:07:39Z

Oh it seems maybe @torfjelde has already fixed the problems in ...

It doesn't define the bijector for NoDist though.

I've added an MWE to the tests.

devmotion · 2022-06-27T21:29:32Z

test/context_implementations.jl

+                x ~ NoDist(Product(fill(Uniform(-20, 20), 5)))
+                for i in eachindex(x)
+                    x[i] ~ Normal(0, 1)
+                end


This seems quite surprising, I have never seen anyone using NoDist in a model. I'm also not sure, why would you want to do that? When would such a model as the example here be useful?

This seems quite surprising, I have never seen anyone using NoDist in a model. I'm also not sure, why would you want to do that? When would such a model as the example here be useful?

a) This is an MWE.
b) In the real use case the variable has ~500 elements. When I'm using x[i] ~ ... (or dot_tilde_assume()), profiling indicates that with the current state of DynamicPPL ~50% of the time is spent on indexing individual elements. That's why I've switched to a multivariate distribution, which removes the indexing overhead.
c) In the real use case the prior is logpdf.(Ref(Normal(mean(x), sigma)), x) |> sum |> addlogp!!, so NoDist helps to declare x and its domain (also see d).
d) In the real use case I'm switching between evolutionary programming (BlackBoxOptim.jl) and gradient-based methods to get the MAP estimates. So while the model allows an alternative parametrization, e.g. xmean ~ Normal(0, 1), xdelta .~ Normal(0, sigma), x = xmean .+ xdelta, it would be suboptimal for crossover operations; it would also introduce one extra degree of freedom.
e) I appreciate your concerns regarding the usability of the MWE, but I think the problem of wrapped distributions not supporting all of the necessary Distributions.jl API is there, and the tests do cover that.

NoDist is an internal workaround/implementation detail and, like NamedDist, it's not a "proper" user-facing distribution. Therefore it was never supposed to be used in a model directly, and it is neither tested nor implemented to support such use cases.

More generally, your workarounds and use of internal functionality (also addlogp!! is somewhat internal, the user-facing alternative is @addlogprob! which is still somewhat dangerous - IIRC in some cases it leads to incorrect or at least surprising results) make me wonder if there is some other functionality missing or some part of DynamicPPL that should be changed. I don't think the best solution is to start promoting and supporting such workarounds but rather we should better support the actual use cases and models in the first place. I think ideally you just implement your model in the most natural way and it works.

One thing is still not clear to me (also in your real usecase): Why do you want to declare x with a NoDist?

rather we should better support the actual use cases and models in the first place

I guess what I'm trying to achieve here with NoDist() is to declare x first, and define its prior later.

Why do you want to declare x with a NoDist?

It's not necessary, but I wanted to avoid calculating Uniform priors, both for performance and for having meaningful probabilities.

I guess what I'm trying to achieve here with NoDist() is to declare x first, and define its prior later.

But what I don't understand is why you add a statement with NoDist first. You could just provide x as data to the model (if it is not sampled) or sample it from the actual priors (and here just preallocate the array first).

Having different statements for x where one is basically wrong seems a bit strange.

It's not necessary, but I wanted to avoid calculating Uniform priors, both for performance and for having meaningful probabilities.

But if x has a uniform prior, you should use it properly, shouldn't you? If you don't want to include the prior in your log density calculations you could condition on x or only evaluate the loglikelihood (you can even just do it for a subset of parameters).

Thanks! Would it work properly if I declare a truncated(Flat(), a, b) distribution?

Yeah that should work.

@devmotion I'm a bit confused: are you saying it's undesirable that @alyst has to do this to achieve the desired performance, or are you suggesting that he can achieve the same performance by writing it as a for-loop and pre-allocating? If it's the former, I think we're all on the same page.

Yes, I meant that it's undesirable that apparently workarounds such as two tilde statements for the same variable are needed to achieve performance.

Maybe we should add an official way of declaring a variable in the model (i.e., registering it without a distribution)? Possibly an official macro (similar to @addlogprob!) that would then make sure that it ends up in the variable structure. I just don't know how it would be implemented exactly. Maybe it would be easiest to only support SimpleVarInfo? I assume it could be useful in cases where you would like to loop but don't want to end up with n different variables x[1], ..., x[n] in the resulting named tuple or dictionary. Alternatively, maybe we could add something like a (arguably also a bit hacky) For/Map distribution that would allow one to write something like

@model function ...
    ...
    x ~ For(1:n) do i
        f(i)
    end
    ...
end

The main difference to the existing possibilities would be that 1) it does not require preallocating an array etc. (such as .~), 2) it does not create n different variables x[1], ..., x[n] (such as a regular for loop), 3) it does not require allocating an array of distributions (such as arraydist/product_distribution) but only create the individual distributions on the fly.

Maybe the better approach would be to not introduce a new distribution but just support something like arraydist(f, xs).

I guess one of the main challenges would be to figure out what the type of arraydist(f, xs) should be. I assume that in general it might not be possible to infer whether it is a MultivariateDistribution, a MatrixDistribution, etc.
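
To make the "create the individual distributions on the fly" idea more concrete, here is a minimal, dependency-free sketch. It does not use Distributions.jl or DynamicPPL: ToyNormal, LazyProduct and toy_logpdf are hypothetical names invented for this illustration, and the sketch sidesteps the type-inference question raised above by only handling the vector case.

```julia
# Toy stand-in for a univariate distribution (hypothetical, not Distributions.jl).
struct ToyNormal
    μ::Float64
    σ::Float64
end

# Log-density of a normal distribution with mean μ and standard deviation σ.
toy_logpdf(d::ToyNormal, x::Real) =
    -0.5 * ((x - d.μ) / d.σ)^2 - log(d.σ) - 0.5 * log(2π)

# Hypothetical lazy product: stores only the constructor `f` and the index set,
# not an array of distributions.
struct LazyProduct{F,I}
    f::F
    idxs::I
end
Base.length(d::LazyProduct) = length(d.idxs)

# Each element distribution is built on the fly while summing the log-densities.
function toy_logpdf(d::LazyProduct, x::AbstractVector)
    length(x) == length(d) ||
        throw(DimensionMismatch("length of x must match the index set"))
    return sum(toy_logpdf(d.f(i), xi) for (i, xi) in zip(d.idxs, x))
end

d = LazyProduct(i -> ToyNormal(0.0, 1.0), 1:3)
lp = toy_logpdf(d, [0.0, 0.0, 0.0])
```

Here x is treated as one vector-valued variable, so no separate x[1], ..., x[n] entries would be needed, and no array of n distributions is ever materialized.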

Would it work properly if I declare a truncated(Flat(), a, b) distribution?

Yeah that should work.

Actually, Flat() doesn't define cdf(), which is required for truncated(). But even if we define cdf(d::Flat, x) = one(x), then P(a <= d <= b) would be zero. So it would trigger an error in truncated(), and most likely in many other places.
One can define the new FlatBounded(a, b) pseudodistribution, but it looks very similar to NoDist(Uniform(a, b)) to me (except the transformation).
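
The argument can be spelled out as a short calculation. This assumes the usual definition of truncation to [a, b], where the density p is renormalized by the probability mass of the interval computed from the cdf F:

```latex
\[
  p_{[a,b]}(x) = \frac{p(x)}{F(b) - F(a)}, \qquad a \le x \le b,
\]
\[
  F(x) \equiv 1 \;\Longrightarrow\; P(a \le d \le b) = F(b) - F(a) = 1 - 1 = 0 .
\]
```

With a zero normalizing mass the truncated density is undefined, which is why a constant cdf for Flat() cannot make truncated() work.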

Actually, Flat() doesn't define cdf(), which is required for truncated(). But even if we define cdf(d::Flat, x) = one(x), then P(a <= d <= b) would be zero. So it would trigger an error in truncated(), and most likely in many other places.
One can define the new FlatBounded(a, b) pseudodistribution, but it looks very similar to NoDist(Uniform(a, b)) to me (except the transformation).

Ah, I guess this is why we have FlatPositive rather than just using truncated. But yes, it ends up being very similar to NoDist but not quite: the logpdf_with_transform is going to be different. For NoDist we want no correction, but for something like FlatPositive we do want the correction.

So I've added bijector for NoDist in #415 now because it's useful for the new getindex(vi, vn, dist) methods introduced (also found a pretty significant bug when combining NoDist + transformed VarInfo) 👍

But, as I said previously, this will produce different results than something like FlatPositive which will, unlike NoDist, also include the log-absdet-jacobian correction.
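
For reference, the difference can be written out as follows. This is a sketch under the assumption that the positive variable x is mapped to unconstrained space by y = g(x) = log x; the thread itself does not spell out the transform.

```latex
\[
  \log p_Y(y) = \log p_X\bigl(g^{-1}(y)\bigr)
              + \log\left|\det J_{g^{-1}}(y)\right|
\]
\[
  \text{FlatPositive-like: } p_X \equiv 1,\; g^{-1}(y) = e^{y}
  \;\Longrightarrow\; \log p_Y(y) = 0 + y = \log x
\]
\[
  \text{NoDist-like: contribution } = 0 \text{ (no Jacobian term added)}
\]
```

So even though both leave the log density flat in the constrained space, only the FlatPositive-style variable contributes a non-zero term once the variable is transformed.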

torfjelde · 2022-06-28T12:28:26Z

It doesn't define the bijector for NoDist though.

I actually deliberately didn't do this because I'm uncertain whether we ever want to hit this. NoDist should represent "don't do anything with this variable", but if we at some point hit bijector(nodist), then this indicates that we might be trying to compute the logabsdetjac correction, which actually shouldn't be included in the log-joint computation 😕

So are we certain adding this implementation isn't doing something silently incorrect?

EDIT: See #414 (comment)

devmotion · 2022-06-29T21:36:32Z

I was just looking at https://github.com/TuringLang/Turing.jl/blob/master/src/stdlib/distributions.jl for completely unrelated reasons, and discovered
some definitions of Bijectors.logpdf_with_trans(::NoDist, x, t) 😮

Regardless of whether they are useful etc., this seems like one of the worst places to hide them 😄

torfjelde · 2022-06-30T10:21:34Z

I was just looking at https://github.com/TuringLang/Turing.jl/blob/master/src/stdlib/distributions.jl for completely unrelated reasons, and discovered some definitions of Bijectors.logpdf_with_trans(::NoDist, x, t) 😮

Regardless of whether they are useful etc., this seems like one of the worst places to hide them 😄

Those shouldn't be there 😳

alyst · 2022-10-01T05:26:04Z

bors try

bors · 2022-10-01T05:26:06Z

🔒 Permission denied

Existing reviewers: click here to make alyst a reviewer

ParadaCarleton · 2022-12-19T19:53:05Z

bors try

bors · 2022-12-19T20:14:08Z

try

Build failed:

test (1, ubuntu-latest, x64, 1)

ParadaCarleton · 2022-12-19T21:57:56Z

bors try

ParadaCarleton · 2022-12-19T21:58:13Z

@alyst Very sorry for the delay; looks like tests aren't passing ATM.

bors · 2022-12-19T22:18:16Z

try

Build failed:

test (1, ubuntu-latest, x64, 2)

devmotion · 2022-12-21T13:47:52Z

Maybe I missed something (haven't checked this PR for a while) but I think @torfjelde's and my concerns above are still valid?

yebai approved these changes Jun 27, 2022


devmotion reviewed Jun 27, 2022


devmotion mentioned this pull request Jun 30, 2022

Remove implementations of logpdf_with_trans for NoDist TuringLang/Turing.jl#1849

Merged

alyst force-pushed the enhance_wrapped_distr branch from a4f3f28 to ebc634b on August 2, 2022 22:42

alyst force-pushed the enhance_wrapped_distr branch from ebc634b to e9291c7 on September 30, 2022 16:57

bors bot added a commit that referenced this pull request Dec 19, 2022

Try #414:

430bd21

bors bot added a commit that referenced this pull request Dec 19, 2022

Try #414:

17bffbd

alyst force-pushed the enhance_wrapped_distr branch 2 times, most recently from dd7b80b to 0fbe51f on March 21, 2023 19:31

alyst added 4 commits April 2, 2023 14:21

enhance wrapped distributions

f90056b

distr_wrappers: add tests for multivariate distrs

7f79c23

add tests for model with multivariate NoDist

bcca942

fix commented out tests

ec07ed5

alyst added 5 commits April 2, 2023 14:21

fix reviewdog formatting issues

e8710b3

2nd round of reviewdog fixes

2976bbf

refer WrappedDist and NoDist from API docs

ece33c6

export WrappedDist to make docs happy

490257a

3rd round of trying to make the format doggy happy

71f3304

alyst force-pushed the enhance_wrapped_distr branch from 0fbe51f to 71f3304 on April 2, 2023 21:25

Merge branch 'master' into enhance_wrapped_distr

017fa67

yebai mentioned this pull request Oct 24, 2024

Adds @returned_quantities macro #696

Open

2 tasks
