
Using JET.jl to determine if typed varinfo is okay #728

Open: torfjelde wants to merge 45 commits into base: master
Conversation

torfjelde (Member)

After a quick experiment with JET.jl I found some bugs in DynamicPPL.jl (#726), but also realized that we can use JET.jl to properly check whether a given model supports the use of TypedVarInfo rather than requiring UntypedVarInfo.

This has been a looooooong-standing issue, and this approach seems to work really, really well.

The problem

In Turing.jl, we use TypedVarInfo almost everywhere due to the performance characteristics that come with it. The problem is that we do so by simply evaluating the given model once and then using the resulting (hopefully, concretely typed) varinfo for all subsequent computations. This works nicely for most typical models, but fails horribly (and uninformatively) for a good chunk of models, such as

@model function demo1()
    x ~ Bernoulli()
    if x
        y ~ Normal()
    else
        z ~ Normal()
    end
end

Here we will execute the model once and get, say, a TypedVarInfo containing the variables x and y (because x happened to result in a true sample). If we then re-use this varinfo for sampling, we will of course run into issues since z is nowhere to be seen.

Technically we can handle this by just widening the container a bit, but if we do that, we need to capture the new varinfo, which isn't always possible, e.g. when using the LogDensityFunction in a sampler.

As a result, we have a lot of code that just makes the assumption "surely this model is 'static' in what variables and types it contains", which can sometimes be false.
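The core of the issue can be illustrated with plain Julia, independent of DynamicPPL (a minimal sketch; the `vi` NamedTuple below is just a stand-in for the typed varinfo's storage): a NamedTuple-backed container is fixed to the keys it was created with, so recording a variable that was absent from the first evaluation forces a container of a *new* type.

```julia
# First run happened to take the `if` branch, so only :x and :y were recorded.
vi = (x = true, y = 0.42)

haskey(vi, :z)    # false: :z is not part of the container's type

# "Widening" means building a *new* NamedTuple with a different type;
# any downstream code still holding the old `vi` never sees :z.
vi_widened = merge(vi, (z = 0.0,))
typeof(vi) == typeof(vi_widened)    # false: the container type changed
```

This is why capturing the widened varinfo matters: the update is not in place, so code that cannot re-capture the result keeps operating on the old, incomplete container.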

The solution

This PR introduces a determine_varinfo method which, using the abstract interpretation offered by JET.jl, can automagically figure out, fully statically, whether we can use the type-stable varinfo safely (i.e. without having to always capture the resulting varinfo, etc.) or whether we need to fall back to the untyped varinfo.

Effectively what determine_varinfo does is:

  1. Execute the model once to get the typed varinfo.
  2. Using JET.jl, statically check if we can run into type issues, e.g. a container of NamedTuple{(:x, :y)} cannot handle the value for z being updated (because the entry does not exist).
  3. If we do run into errors, we return an untyped varinfo. If we don't, we return a typed one.

Note that this method doesn't say anything about whether there might be type instabilities; this only checks if we would encounter errors. We can also use JET to check type instabilities, etc., but I think that's a separate functionality and thus a separate PR.
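The three steps above can be sketched roughly as follows. This is a hedged illustration only: the actual determine_varinfo in this PR may have a different signature and internals, and `typed_varinfo_for`, `untyped_varinfo_for`, and the exact JET call shape are assumptions, not the real DynamicPPL API.

```julia
using DynamicPPL, JET

# Sketch of the determine_varinfo logic; helper names are hypothetical.
function determine_varinfo_sketch(model)
    # 1. Evaluate the model once to obtain a typed (NamedTuple-backed) varinfo.
    typed_vi = typed_varinfo_for(model)  # hypothetical helper

    # 2. Statically check the model evaluation via JET.jl's abstract
    #    interpretation; a reported error corresponds to e.g. updating a
    #    variable whose entry does not exist in the typed container.
    result = JET.report_call(m -> evaluate_with(m, typed_vi), (typeof(model),))  # hypothetical call shape

    # 3. Fall back to the untyped varinfo if JET reports any problems.
    if isempty(JET.get_reports(result))
        return typed_vi
    else
        return untyped_varinfo_for(model)  # hypothetical helper
    end
end
```

The key point is that step 2 happens without running the model again: JET reasons about the inferred types, so branches not taken in the single concrete evaluation (like the `z ~ Normal()` branch in demo1) are still checked.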

@torfjelde (Member Author)

See the tests for what we can properly check here. It honestly seems really good for our purposes 👀

torfjelde and others added 3 commits November 28, 2024 15:44
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@yebai (Member) commented Nov 28, 2024

That seems like an elegant trick!


codecov bot commented Nov 28, 2024

Codecov Report

Attention: Patch coverage is 91.89189% with 3 lines in your changes missing coverage. Please review.

Project coverage is 84.88%. Comparing base (0a39979) to head (b6b4bff).

Files with missing lines   Patch %   Lines
src/model_utils.jl         60.00%    2 Missing ⚠️
src/DynamicPPL.jl          80.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #728      +/-   ##
==========================================
+ Coverage   84.66%   84.88%   +0.21%     
==========================================
  Files          35       36       +1     
  Lines        4180     4207      +27     
==========================================
+ Hits         3539     3571      +32     
+ Misses        641      636       -5     


@coveralls commented Nov 28, 2024

Pull Request Test Coverage Report for Build 12097699282

Details

  • 34 of 37 (91.89%) changed or added relevant lines in 6 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 84.882%

Changes missing coverage   Covered Lines   Changed/Added Lines   %
src/DynamicPPL.jl          4               5                     80.0%
src/model_utils.jl         3               5                     60.0%

Totals:
  Change from base Build 12088194096: 0.2%
  Covered Lines: 3571
  Relevant Lines: 4207

💛 - Coveralls

torfjelde and others added 11 commits November 29, 2024 09:25
fallback to current behavior + `supports_varinfo` to `is_suitable_varinfo`
longer needed on Julia 1.10 and onwards + added error hint for when
JET.jl has not been loaded
provided context, but uses `SamplingContext` by default (as this
should be a stricter check than just evaluation)
in sampling context now so no need to handle this explicitly elsewhere
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@torfjelde (Member Author)

This honestly seems to work really well. I've now made it so that default_sampler and LogDensityFunction also make use of this. The question is just how well it works with Turing.jl (will try this now).

(Resolved review comment on src/model_utils.jl, now outdated)
@torfjelde (Member Author)

Uhmm this is weird. Looking at the CI logs, e.g. https://github.com/TuringLang/DynamicPPL.jl/actions/runs/12094832976/job/33727325064?pr=728#step:6:59, this PR is running with [email protected]. However, looking at the Project.toml, this is not allowed in the current commit o.O

As in, it seems as if the CI is running with the Project.toml of master for some reason? Is this intentional?

@torfjelde (Member Author)

Hmm, this is interesting. One of the demo models fails the JET.jl check specifically on Ubuntu with Julia 1.10; however, I cannot reproduce this error locally on either of my devices (one macOS, one Linux).

I'm guessing the source of the issue is this line

s .~ product_distribution([InverseGamma(2, 3) for _ in 1:d])

in combination with the following impl in Bijectors.jl

https://github.com/TuringLang/Bijectors.jl/blob/d342371da2ab090b5b519265c1951c9322a39879/src/Bijectors.jl#L198-L200

Somehow, this results in a Vector{Any} being broadcasted (see https://github.com/TuringLang/DynamicPPL.jl/actions/runs/12095001178/job/33727679874?pr=728#step:6:314).

I'm uncertain if this is the line

product_distribution([InverseGamma(2, 3) for _ in 1:d])

being inferred to a product of Vector{Any} or if it's because of the line in Bijectors.jl which is

    return logabsdetjac.((bijector(d),), eachcol(x))

and type inference just fails to infer this statement properly.

I'll try to replace the broadcast with a map, i.e.

function _logabsdetjac_dist(d::MultivariateDistribution, x::AbstractMatrix)
    return map(Base.Fix1(logabsdetjac, bijector(d)), eachcol(x))
end

which should be much better type-inference-wise.
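The `map` + `Base.Fix1` pattern can be exercised with plain Julia (a stand-in sketch: `f` below is just a placeholder two-argument function, not the real `logabsdetjac`):

```julia
# Stand-in for logabsdetjac(bijector, column): any two-argument function works.
f(a, x) = a * sum(x)

X = rand(3, 4)

# The broadcast form, wrapping the fixed first argument in a one-element
# tuple so broadcasting treats it as a scalar (as in the Bijectors.jl line):
ys_broadcast = f.((2.0,), eachcol(X))

# The map form with Base.Fix1, which partially applies the first argument.
# A single concrete callable over a uniform collection tends to be easier
# for type inference than broadcasting with a tuple-wrapped argument.
ys_map = map(Base.Fix1(f, 2.0), eachcol(X))

ys_broadcast == ys_map    # both forms compute the same values
```

Whether the map form actually fixes the Vector{Any} seen in CI depends on where inference loses the element type, but it removes one inference-hostile construct from the chain.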

torfjelde and others added 2 commits November 30, 2024 10:15
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@torfjelde (Member Author) commented Nov 30, 2024

Aye, it indeed seems like this requires a fix to Bijectors.jl.

Buuuut if we're hitting this with JET.jl, then we're likely also hitting this with some other things sometimes, e.g. I'd assume that Mooncake.jl would also hit type instabilities here? @willtebbutt

EDIT: TuringLang/Bijectors.jl#354
