
Change .~ to use filldist rather than a loop #824

Draft · wants to merge 3 commits into base: main
8 changes: 4 additions & 4 deletions HISTORY.md
@@ -45,7 +45,9 @@ x ~ product_distribution(Normal.(y))
x ~ MvNormal(fill(0.0, 2), I)
```

This is often more performant as well. Note that using `~` rather than `.~` does change the internal storage format a bit: With `.~` `x[i]` are stored as separate variables, with `~` as a single multivariate variable `x`. In most cases this does not change anything for the user, but if it does cause issues, e.g. if you are dealing with `VarInfo` objects directly and need to keep the old behavior, you can always expand into a loop, such as
This is often more performant as well.

The new implementation of `x .~ ...` is just a short-hand for `x ~ filldist(...)`, which means that `x` will be seen as a single multivariate variable. In most cases this does not change anything for the user, with the one notable exception being `pointwise_loglikelihoods`, which previously treated `.~` assignments as assigning multiple univariate variables. If you _do_ want a variable to be seen as an array of univariate variables rather than a single multivariate variable, you can always expand into a loop, such as
Member commented:

It's okay to clearly state that each `.~` and `~` defines a single random variable. For more flexible conditioning, I think you could try to support `model | x = [missing, 1., 2., missing]`, which would allow users to condition on a subset of elements in `x` but still treat `x` as a single random variable. Again, please document this clearly in breaking changes.

Member commented:

Suggested change:
- The new implementation of `x .~ ...` is just a short-hand for `x ~ filldist(...)`, which means that `x` will be seen as a single multivariate variable. In most cases this does not change anything for the user, with the one notable exception being `pointwise_loglikelihoods`, which previously treated `.~` assignments as assigning multiple univariate variables. If you _do_ want a variable to be seen as an array of univariate variables rather than a single multivariate variable, you can always expand into a loop, such as
+ The new implementation of `x .~ ...` is just a short-hand for `x ~ product_distribution(...)`, which means that `x` will be seen as a single multivariate variable. In most cases this does not change anything for the user, with the one notable exception being `pointwise_loglikelihoods`, which previously treated `.~` assignments as assigning multiple univariate variables. If you _do_ want a variable to be seen as an array of univariate variables rather than a single multivariate variable, you can always expand into a loop, such as


@@ -54,7 +56,7 @@
```julia
dists = Normal.(y)
for i in 1:length(dists)
    x[i] ~ dists[i]
end
```

Cases where the right hand side is of a different dimension than the left hand side, and neither is a scalar, must be replaced with a loop. For example,
Cases where the right hand side is of a different dimension than the left hand side, and neither is a scalar, must always be replaced with a loop. For example,

@@ -70,8 +72,6 @@
```julia
x = Array{Float64,3}(undef, 2, 3, 4)
for i in 1:3, j in 1:4
    x[:, i, j] ~ MvNormal(fill(0, 2), I)
end
```

This release also completely rewrites the internal implementation of `.~`, where from now on all `.~` statements are turned into loops over `~` statements at macro time. However, the only breaking aspect of this change is the above change to what's allowed on the right hand side.
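
The equivalence behind this change can be sanity-checked directly with Distributions.jl. The following is a sketch (not part of the release notes) showing that a single `product_distribution` statement carries the same log density as the old elementwise loop:

```julia
using Distributions

# `x .~ Normal()` now corresponds to a single multivariate statement,
# roughly `x ~ product_distribution(fill(Normal(), size(x)...))`.
x = [1.0, 2.0, 3.0]
d = product_distribution(fill(Normal(), size(x)...))

# The joint log density equals the sum over the old per-element statements.
lp_joint = logpdf(d, x)
lp_loop = sum(logpdf(Normal(), xi) for xi in x)
println(lp_joint ≈ lp_loop)  # true
```

The densities agree exactly; only the bookkeeping (one multivariate variable versus many univariate ones) differs.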

### Remove indexing by samplers

This release removes the feature of `VarInfo` where it kept track of which variable was associated with which sampler. This means removing all user-facing methods where `VarInfo`s were being indexed with samplers. In particular,
7 changes: 2 additions & 5 deletions src/compiler.jl
@@ -514,13 +514,10 @@ end
Generate the expression that replaces `left .~ right` in the model body.
"""
function generate_dot_tilde(left, right)
@gensym dist left_axes idx
@gensym dist
return quote
$dist = DynamicPPL.check_dot_tilde_rhs($right)
$left_axes = axes($left)
for $idx in Iterators.product($left_axes...)
$left[$idx...] ~ $dist
end
$left ~ DynamicPPL.product_distribution(Base.fill($dist, Base.size($left)...))
end
end
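
To illustrate what the generated expression evaluates to, here is a sketch (assuming Distributions.jl is loaded, mirroring the `product_distribution`/`Base.fill` calls above) for a matrix-valued left-hand side:

```julia
using Distributions

# For a 2×3 matrix `left` and a univariate RHS distribution, the generated
# code builds one product distribution with an iid component per element.
left = randn(2, 3)
dist = Normal(0.0, 1.0)
d = product_distribution(fill(dist, size(left)...))

# `left ~ d` then treats the whole matrix as one random variable, with the
# same log density the old elementwise loop accumulated.
println(logpdf(d, left) ≈ sum(logpdf.(dist, left)))  # true
```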

16 changes: 5 additions & 11 deletions src/pointwise_logdensities.jl
@@ -157,8 +157,8 @@ y .~ Normal(μ, σ)
y ~ MvNormal(fill(μ, n), σ^2 * I)
```

In (1) and (2), `y` will be treated as a collection of `n` i.i.d. 1-dimensional variables,
while in (3) `y` will be treated as a _single_ n-dimensional observation.
In (1) `y` will be treated as a collection of `n` i.i.d. 1-dimensional variables,
while in (2) and (3) `y` will be treated as a _single_ n-dimensional observation.

This is important to keep in mind, in particular if the computation is used
for downstream computations.
@@ -216,23 +216,17 @@ OrderedDict{VarName, Matrix{Float64}} with 6 entries:
```

## Broadcasting
Note that `x .~ Dist()` will treat `x` as a collection of
_independent_ observations rather than as a single observation.
Note that `x .~ Dist()` will treat `x` as a single multivariate observation.

```jldoctest; setup = :(using Distributions)
julia> @model function demo(x)
           x .~ Normal()
       end;

julia> m = demo([1.0, ]);

julia> ℓ = pointwise_logdensities(m, VarInfo(m)); first(ℓ[@varname(x[1])])
-1.4189385332046727

julia> m = demo([1.0; 1.0]);

julia> ℓ = pointwise_logdensities(m, VarInfo(m)); first.((ℓ[@varname(x[1])], ℓ[@varname(x[2])]))
(-1.4189385332046727, -1.4189385332046727)
julia> ℓ = pointwise_logdensities(m, VarInfo(m)); first(ℓ[@varname(x)])
-2.8378770664093453
```

"""
12 changes: 6 additions & 6 deletions src/test_utils/models.jl
@@ -208,7 +208,7 @@ function logprior_true_with_logabsdet_jacobian(
return _demo_logprior_true_with_logabsdet_jacobian(model, s, m)
end
function varnames(model::Model{typeof(demo_dot_assume_observe)})
return [@varname(s[1]), @varname(s[2]), @varname(m)]
return [@varname(s), @varname(m)]
end

@model function demo_assume_index_observe(
@@ -293,7 +293,7 @@ function logprior_true_with_logabsdet_jacobian(
return _demo_logprior_true_with_logabsdet_jacobian(model, s, m)
end
function varnames(model::Model{typeof(demo_dot_assume_observe_index)})
return [@varname(s[1]), @varname(s[2]), @varname(m)]
return [@varname(s), @varname(m)]
end

# Using vector of `length` 1 here so the posterior of `m` is the same
@@ -374,7 +374,7 @@ function logprior_true_with_logabsdet_jacobian(
return _demo_logprior_true_with_logabsdet_jacobian(model, s, m)
end
function varnames(model::Model{typeof(demo_dot_assume_observe_index_literal)})
return [@varname(s[1]), @varname(s[2]), @varname(m)]
return [@varname(s), @varname(m)]
end

@model function demo_assume_observe_literal()
@@ -458,7 +458,7 @@ function logprior_true_with_logabsdet_jacobian(
return _demo_logprior_true_with_logabsdet_jacobian(model, s, m)
end
function varnames(model::Model{typeof(demo_assume_submodel_observe_index_literal)})
return [@varname(s[1]), @varname(s[2]), @varname(m)]
return [@varname(s), @varname(m)]
end

@model function _likelihood_multivariate_observe(s, m, x)
@@ -492,7 +492,7 @@ function logprior_true_with_logabsdet_jacobian(
return _demo_logprior_true_with_logabsdet_jacobian(model, s, m)
end
function varnames(model::Model{typeof(demo_dot_assume_observe_submodel)})
return [@varname(s[1]), @varname(s[2]), @varname(m)]
return [@varname(s), @varname(m)]
end

@model function demo_dot_assume_observe_matrix_index(
@@ -521,7 +521,7 @@ function logprior_true_with_logabsdet_jacobian(
return _demo_logprior_true_with_logabsdet_jacobian(model, s, m)
end
function varnames(model::Model{typeof(demo_dot_assume_observe_matrix_index)})
return [@varname(s[1]), @varname(s[2]), @varname(m)]
return [@varname(s), @varname(m)]
end

@model function demo_assume_matrix_observe_matrix_index(
9 changes: 2 additions & 7 deletions test/varinfo.jl
@@ -426,12 +426,7 @@ end
# Transform only one variable
all_vns = vcat(meta.s.vns, meta.m.vns, meta.x.vns, meta.y.vns)
for vn in [
@varname(s),
@varname(m),
@varname(x),
@varname(y),
@varname(x[2]),
@varname(y[2])
@varname(s), @varname(m), @varname(x), @varname(y), @varname(x), @varname(y[2])
]
target_vns = filter(x -> subsumes(vn, x), all_vns)
other_vns = filter(x -> !subsumes(vn, x), all_vns)
@@ -874,7 +869,7 @@ end
varinfo2 = last(
DynamicPPL.evaluate!!(model2, deepcopy(varinfo1), SamplingContext())
)
for vn in [@varname(x), @varname(y[1])]
for vn in [@varname(x), @varname(y)]
@test DynamicPPL.istrans(varinfo2, vn)
end
end