
Compatibility with new DPPL version #1900

Merged (37 commits) · Nov 12, 2022

Conversation

@torfjelde (Member) commented Nov 7, 2022

The goal of this PR is just to make Turing work with the new DynamicPPL version, not to obtain full feature parity between SimpleVarInfo and VarInfo (though it is a significant step towards exactly that).

Main changes are (a hedged sketch of the new calls follows the list):

  1. unflatten is now used to convert a vector into an AbstractVarInfo.
  2. Use (inv)link!! instead of the deprecated (inv)link!.
  3. Use setindex!! instead of setindex!.
  4. Use evaluate and capture the resulting AbstractVarInfo, so that immutable implementations of AbstractVarInfo are also supported.
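
A minimal sketch of what this migration looks like in downstream code. The function names come from DynamicPPL, but the exact signatures (e.g. whether a sampler argument is required, or how the evaluation context is passed) differ between DynamicPPL versions, so treat this as illustrative rather than the PR's literal diff:

```julia
using DynamicPPL, Distributions

@model function demo(x)
    m ~ Normal()
    x ~ Normal(m, 1)
end

model = demo(1.5)
vi = VarInfo(model)

# (1) Vector -> AbstractVarInfo now goes through unflatten rather than
# mutating the existing object in place:
vi = DynamicPPL.unflatten(vi, vi[:])

# (2) The !!-suffixed variants return the (possibly new) object, which is
# what makes immutable AbstractVarInfo implementations workable:
vi = DynamicPPL.link!!(vi, model)
vi = DynamicPPL.invlink!!(vi, model)
# (3) setindex!! (BangBang-style) similarly replaces setindex!.

# (4) Evaluate the model and capture the returned AbstractVarInfo instead
# of relying on mutation:
_, vi = DynamicPPL.evaluate!!(model, vi, DefaultContext())
```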

Closes #1899 and #1898

[Resolved review thread on src/inference/mh.jl (outdated)]
@torfjelde (Member, Author)

So it seems like on macOS the estimates are a bit noisier; e.g., some tests are failing because the error is now slightly above the atol=0.2 used in https://github.com/TuringLang/Turing.jl/blob/master/test/inference/Inference.jl#L76-L80.

And there's a bug with MH (it's producing completely incorrect results). Trying to figure that out now.

@torfjelde (Member, Author)

> So it seems like on macOS the estimates are a bit noisier; e.g., some tests are failing because the error is now slightly above the atol=0.2 used in https://github.com/TuringLang/Turing.jl/blob/master/test/inference/Inference.jl#L76-L80.

@yebai @devmotion Regarding this, should I just increase the threshold? The samplers definitely seem to be doing the right thing (and locally on Linux, the tests are passing).
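
For context, this is the shape of such a tolerance check, with made-up samples and a made-up target value (the real model and values live in the linked test file):

```julia
using Test, Statistics

# Hypothetical stand-in for the failing check: the actual model, variables,
# and target values are in test/inference/Inference.jl#L76-L80.
samples = randn(1_000) .* 0.1 .+ 1.0   # pretend posterior samples of m
@test mean(samples) ≈ 1.0 atol = 0.2   # fails whenever the MC error exceeds atol
```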

@yebai (Member) commented Nov 10, 2022

Sure

@torfjelde (Member, Author)

Hmm, one thing to note here: the tests seem to take ~2x as long to run as before (I just looked at some previously closed PRs) 😕 I'm uncertain whether this is compile time or run time, but it's something to be aware of.

@devmotion (Member)

Hmm, that's very unfortunate. It would be good to know where the regression comes from: whether it is run time or compilation time, whether it is AD-backend specific, etc. Maybe comparing the timings and allocations in the logs (e.g., https://github.com/TuringLang/Turing.jl/actions/runs/3441679090/jobs/5741483837#step:6:825) could give some hints?
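
One low-tech way to separate compilation time from run time locally (a sketch; the model and the Emcee constructor arguments are assumptions, and on Julia >= 1.8 @time also reports the fraction of time spent compiling):

```julia
using Turing

@model function demo(x)
    m ~ Normal()
    x ~ Normal(m, 1)
end
model = demo(1.5)

# First call includes compilation; @time prints the percentage
# of time spent compiling.
@time sample(model, Emcee(10), 100)
# Second call is (mostly) pure run time.
@time sample(model, Emcee(10), 100)
```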

@yebai (Member) commented Nov 11, 2022

@torfjelde Re the performance regression, can you check

Also, if this PR hasn't broken any tests, I am happy to merge it as is and fix the regression in a follow-up PR. That is fine as long as we don't make any new releases until the performance regression is fixed.

@torfjelde (Member, Author)

Good news! It looks like the Emcee sampler is the cause.

  1. Current
  2. Old

I need to have a look at what's causing this though. It's also "interesting" that the effect differs significantly between architectures.

@torfjelde (Member, Author)

> whether we accidentally evaluate the model twice somewhere in the flatten/unflatten pipeline?

And just for the record, we don't execute the model anywhere in this pipeline. In particular, for VarInfo, nothing has changed in the flatten/unflatten functionality, i.e. unflatten(vi, spl, x) = VarInfo(vi, spl, x).

@torfjelde (Member, Author)

Okay, so this is all very weird.

If I run the tests locally, then the same slowdown occurs when I hit include("inference/emcee.jl").

But if I copy-paste the code from runtests.jl up to and including the include("inference/emcee.jl") line and execute it in the REPL (using the env created by TestEnv.activate), then it runs in ~30s, as before.

So I'm very confused.

Have you seen anything like this before @devmotion?

@devmotion (Member)

Maybe, I'm not sure. But I've definitely seen cases where Pkg.test behaves differently from running the same code in the REPL.

The most obvious difference is that Pkg.test starts a Julia process with some specific command-line options (by default), leading to differences such as those described in https://discourse.julialang.org/t/erratic-test-failure-difference-between-test-and-include-test-runtests-jl/72342 and similar Discourse posts. So the first thing to check might be to start the Julia REPL with exactly the same command-line options as the ones used by Pkg.test (you could, e.g., inspect Base.julia_cmd()).

I've also seen differences due to the use of Revise, so it might be good to also check the behaviour if Revise is not loaded (e.g., by starting a clean Julia process with --startup-file=no).

I guess you already checked if the same package versions are used?

And then, of course, scoping is different in the REPL: https://docs.julialang.org/en/v1/manual/variables-and-scoping/
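
To make that concrete, here is a sketch of reproducing the Pkg.test environment by hand; the flag set below is illustrative, not necessarily the exact set your Julia version uses (compare against what Base.julia_cmd() prints; --check-bounds=yes is the classic Pkg.test difference):

```julia
# Step 1: in a regular REPL, inspect the base command Julia would use for
# subprocesses (Pkg.test builds on this plus its own flags):
Base.julia_cmd()

# Step 2: start a clean process with matching flags and re-run the tests
# from there (shell/REPL steps shown as comments):
#   $ julia --startup-file=no --check-bounds=yes --depwarn=yes
#   julia> using TestEnv; TestEnv.activate("Turing")
#   julia> include("test/runtests.jl")
```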

@torfjelde (Member, Author)

After reading your reply @devmotion, I was reminded of how depwarn=yes has often been the cause of performance regressions like this, aaaand lo and behold, it indeed was the case. Once TuringLang/DynamicPPL.jl#433 is through, we should be good here.
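
For the record, a toy illustration (not Turing code) of why depwarn=yes can be so costly: Base.depwarn captures a backtrace on every call when deprecation warnings are enabled, so a deprecated method on a hot path gets dramatically slower, even though the message itself only prints a limited number of times.

```julia
# Define a "new" method and deprecate an old name in its favour:
new_sum(xs) = sum(xs)
@deprecate old_sum(xs) new_sum(xs)

xs = randn(10_000)
@time for _ in 1:1_000; new_sum(xs); end  # fast
@time for _ in 1:1_000; old_sum(xs); end  # slow under --depwarn=yes
```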

bors bot pushed a commit to TuringLang/DynamicPPL.jl that referenced this pull request Nov 11, 2022
This PR fixes the performance regressions seen for `Emcee` in TuringLang/Turing.jl#1900.

@yebai @devmotion This should be an easy merge.
[Resolved review thread on test/runtests.jl (outdated)]
@torfjelde (Member, Author)

Oh, you reverted some of my changes @yebai

Increasing the number of samples helped but didn't solve it. I guess I'll just increase it further.

[Resolved review thread on test/inference/mh.jl (outdated)]
@codecov (bot) commented Nov 12, 2022

Codecov Report

Base: 81.49% // Head: 81.24% // Decreases project coverage by 0.25% ⚠️

Coverage data is based on head (ab69a21) compared to base (2d41f09).
Patch coverage: 87.35% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1900      +/-   ##
==========================================
- Coverage   81.49%   81.24%   -0.26%     
==========================================
  Files          21       21              
  Lines        1421     1418       -3     
==========================================
- Hits         1158     1152       -6     
- Misses        263      266       +3     
Impacted Files                        Coverage Δ
src/stdlib/distributions.jl           57.95% <0.00%> (ø)
src/inference/hmc.jl                  77.41% <44.44%> (-0.65%) ⬇️
src/modes/ModeEstimation.jl           81.96% <85.71%> (-0.82%) ⬇️
src/Turing.jl                         82.35% <100.00%> (+1.10%) ⬆️
src/contrib/inference/dynamichmc.jl   100.00% <100.00%> (ø)
src/contrib/inference/sghmc.jl        98.50% <100.00%> (ø)
src/inference/emcee.jl                94.11% <100.00%> (ø)
src/inference/gibbs.jl                97.33% <100.00%> (ø)
src/inference/mh.jl                   84.37% <100.00%> (-0.70%) ⬇️
src/modes/OptimInterface.jl           89.18% <100.00%> (+0.14%) ⬆️
... and 3 more

