
API design for non-Turing samplers #886

Closed
yebai opened this issue Aug 12, 2019 · 15 comments

@yebai
Member

yebai commented Aug 12, 2019

One important design goal of Turing is to make implementing new sampling algorithms easier. There is already an ongoing effort to decouple modelling and inference algorithms (see, e.g., #793, #746, #750). For the next steps, it would be helpful to …

Related: #895 #889

@yebai yebai added the doc label Aug 12, 2019
@trappmartin
Member

trappmartin commented Aug 12, 2019

I think the following additional functions could be a good start. (I think that the assume and observe statements should be kept.) Please correct me if I’m wrong or if I’m missing something.

  • parameters(model) -> Vector{VarName}
  • prior(model, vn) -> Distribution
  • logpdf(model, vi)
  • logjoint(model, vi)
  • rand(model) -> VarInfo
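
To make these concrete, here is a minimal, self-contained toy sketch of a subset of the proposal. ToyModel stands in for a Turing Model and a NamedTuple stands in for a VarInfo; none of the names below are existing Turing API:

using Distributions, Random

struct ToyModel
    x::Float64   # one observed data point
    σ::Float64   # prior standard deviation of the latent mean m
end

# parameters(model) -> vector of variable names (plain Symbols here)
parameters(::ToyModel) = [:m]

# prior(model, vn) -> Distribution attached to variable vn
prior(model::ToyModel, vn::Symbol) = vn === :m ? Normal(0, model.σ) : error("unknown variable $vn")

# logjoint(model, vi) -> log p(x, m) at the values stored in vi
logjoint(model::ToyModel, vi) = logpdf(prior(model, :m), vi.m) + logpdf(Normal(vi.m, 1.0), model.x)

# rand(model) -> a draw of the latent variables from the prior
Base.rand(rng::AbstractRNG, model::ToyModel) = (m = rand(rng, prior(model, :m)),)

# usage: model = ToyModel(1.2, 2.0); vi = rand(model); logjoint(model, vi)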

In addition, it would in my opinion be good to have a way to track the result (log pdf) of each assume and/or observe statement. This could be stored in an additional data type called Trace.

For this purpose I would add the following macros.

  • @addlogjoint logpdf(d, x) # add return value to log joint using current VarInfo
  • @track logpdf(d, x) # track log pdf value
  • @track z = x-1 # track result of expression

These macros could be used by users in the model block to track expressions, or by developers of inference algorithms; a rough sketch of one possible implementation follows.
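
A rough, hypothetical sketch of how a @track-style macro could work (it is not part of Turing): evaluate the expression, record the result under a name in a trace dictionary, and return the value so the surrounding code is unaffected:

const TRACE = Dict{Symbol,Any}()

macro track(ex)
    # use the assigned variable as the key for `@track z = x - 1`,
    # otherwise fall back to the expression text itself
    key = (ex isa Expr && ex.head === :(=)) ? ex.args[1] : Symbol(string(ex))
    return quote
        local val = $(esc(ex))
        TRACE[$(QuoteNode(key))] = val
        val
    end
end

# usage: @track z = x - 1   or   @track logpdf(d, x)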

@cpfiffer
Member

Are we assuming for this conversation that people are building a new InferenceAlgorithm, or do you want the discussion to be broader to include cases where people get rid of Turing's Sampler struct?

If the goal is just to look at new inference functions, we can streamline what we have pretty well. By default #793 will add a VarInfo to the Sampler if you don't provide a SamplerState object. It might also be worth moving the VarInfo to the Sampler struct itself rather than keeping it in the state, so that it's basically impossible for a user not to have a VarInfo that we can manipulate for them with some common functions.

Some general functions that would work using the enforced VarInfo scheme:

  • parameters(model, spl) or get_parameters(model, spl) to retrieve a vector of parameters or a NamedTuple
  • parameters!(model, spl, val) or set_parameters!(model, spl, val) to set the parameters to a value (like vi[spl])
  • For MH-like samplers, it might be useful to have propose(model, spl) that returns an AbstractProposal type with a convenient default like the MH proposal. The proposal might include a log PDF, some parameter values, and maybe some other stuff. Then you could call accept!(model, spl, proposal) and it would handle all the updating for you (a toy sketch follows below). Users who care a lot about the proposal mechanism could overload or wrap this function if they needed to.

One thing that we might look at is how close we can make the code in Turing to the pseudo-code you might see in an algorithm block in a new paper. Turing's pretty good on the model side, but the pseudo-code side is not quite as intuitive. I think the functions mentioned by @trappmartin would go a long way towards making everything generally accessible.
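
As a toy illustration of how close this could get to paper-style pseudo-code, here is a self-contained sketch of the suggested propose/accept! workflow for a random-walk MH step on a single Gaussian target. Everything in it (MiniSampler, Proposal, propose, accept!, mh_step!) is hypothetical and only mirrors the interface floated above; it is not Turing code:

using Distributions, Random

mutable struct MiniSampler        # stands in for Turing's Sampler plus its VarInfo
    θ::Float64
    logp::Float64
end

struct Proposal                   # stands in for the suggested AbstractProposal
    θ::Float64
    logp::Float64
end

propose(target, spl::MiniSampler) = (θ = spl.θ + 0.5randn(); Proposal(θ, logpdf(target, θ)))
accept!(target, spl::MiniSampler, p::Proposal) = (spl.θ = p.θ; spl.logp = p.logp; spl)

function mh_step!(target, spl::MiniSampler)
    p = propose(target, spl)
    # Metropolis acceptance for a symmetric proposal
    log(rand()) < p.logp - spl.logp && accept!(target, spl, p)
    return spl
end

# usage: spl = MiniSampler(0.0, logpdf(Normal(), 0.0)); foreach(_ -> mh_step!(Normal(), spl), 1:1000)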

@trappmartin
Member

I like that we can do this in several iterations and get rid of the Sampler in the last iteration. It would probably already be nice if one could interface their own package, e.g. nested sampling, with Turing without too much hassle.

I'm not sure about the MH-like functions, but I might be missing something. Maybe instead we could have simple access to internal samplers (MH, IS) and sample a value using mh(model, vn) -> Real?

I think we should also think about the interface functions we would need for VI. @torfjelde what do you think?

@torfjelde
Member

I think we should also think about the interface functions we would need for VI. @torfjelde what do you think?

Sorry, I completely missed this mention; didn't see it until yesterday. Also not sure if this discussion is still ongoing?

Anyways, for VI in particular the logjoint(model, vi) method proposed would be very useful. In fact I've just implemented such a method internally in my new VI PR for convenience (not meant to be exported):
https://github.com/torfjelde/Turing.jl/blob/547bb7b0f24d015371f0470345b737e52af95123/src/variational/advi.jl#L95-L100

Also a way of tracking the logpdf for each assume and observe call would be great since it would allow computation of p(x | z) and p(z). This is related to #817.

Worth noting that, when using vi, the Model is already instantiated, so tracking logpdf computations by adding a @track logpdf into the model wouldn't be too helpful in the VI case.

Similarly, the parameters and parameters! functions would also be useful. Internally it's okay to just extract the parameters "manually" from the VarInfo, but for users constructing their own variational posteriors (other than the mean-field approximation), these methods would make things much more intuitive.
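
A hypothetical sketch of how those pieces would fit together for VI: a naive single-sample ELBO estimate only needs the joint log density and the variational log density. Here q is any Distributions.jl distribution standing in for the variational posterior, and parameters! and logjoint are the proposed (not yet existing) interface functions, with the VarInfo passed directly for simplicity:

function elbo_estimate(model, vi, q)
    z = rand(q)                      # sample from the variational posterior
    parameters!(model, vi, z)        # write the sample into the VarInfo (proposed API)
    return logjoint(model, vi) - logpdf(q, z)
end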

The @track macro mentioned is also something I've personally found myself wanting at times (not for VI purposes though). I have some past experience using pymc3 in Python, and found myself using such a feature quite often. It's of course not necessary, since these deterministic values can be computed from the samples of the random variables, but it's quite convenient.

@trappmartin
Member

Yes, you are right. Things like @track are mostly relevant for the user and not so much for the internals or devs.

@yebai yebai pinned this issue Sep 17, 2019
@cpfiffer
Member

Another bit of low-hanging fruit is to make a function model(spl). Currently the signature is model(vi, spl), but since all the samplers have a vi we can just reduce the call size.
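
A sketch of what that convenience method could look like, assuming the sampler keeps its VarInfo in its state (the field layout here is an assumption for illustration, not the actual definition):

(model::Model)(spl::Sampler) = model(spl.state.vi, spl)    # forward to the existing two-argument call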

@cpfiffer
Member

In reference to this comment:

Yes; I added it here a few days ago. I feel we can have a really flexible/advanced MH sampler in Turing by falling back to different MH backends (e.g. Ensemble, RAM, vanilla MH).

I wonder if we should spin off an "AdvancedMH" package to show people how you might build a Turing-interfaced library.

@yebai
Member Author

yebai commented Sep 18, 2019

Sounds good to me, but maybe we can do the implementation in the Turing repo first? This avoids the pain of keeping two packages synced.

@cpfiffer
Member

I'm going through some of these suggestions while I'm updating the MH code, and I'm wondering if we can just give everyone default observe/assume statements that mostly just work all the time. Is anyone opposed to updating the ones here a little?

Specifically, I'd like to do two things:

  1. Remove the type constraints so that this function catches all samplers that do not have a custom observe/assume.
  2. Implement an insupport call to return -Inf when something is out of bounds, or add an insupport check to the logpdf_with_trans call (see the sketch below).
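
A hypothetical sketch of point 2 (the function name is illustrative; logpdf_with_trans comes from Bijectors, and istrans is assumed to follow the existing VarInfo API):

# guard the log-density computation with an insupport check so that
# out-of-bounds values yield -Inf instead of throwing an error
function default_assume(dist, vn, vi)
    x = vi[vn]
    lp = insupport(dist, x) ? logpdf_with_trans(dist, x, istrans(vi, vn)) : -Inf
    return x, lp
end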

The reason I want a generically useful observe/assume set is that a lot of the inference logic we currently do in assume (take MH for example) should probably not be the go-to for developers. The main place for inference should be step!.

Additionally, most inference algorithms already implement the most basic possible assume statement (HMC, for example), so we could remove a lot of these sampler-specific definitions and stick with one well-designed default that new developers can fall back on without thinking too hard about it.

Thoughts on that?

@cpfiffer
Member

Whoops, didn't mean to close that. Sorry.

@cpfiffer cpfiffer reopened this Sep 25, 2019
@trappmartin
Member

What about having an abstract type AbstractSimulationBasedSampler, similar to the Hamiltonian one, and defining assume and observe statements for both? But in that case the assume for AbstractSimulationBasedSampler has to be much more flexible than the one you pointed at, so that we can handle e.g. #906 in the future.

Maybe it is better to make an extra PR for this?

@cpfiffer
Member

Probably. I think we basically just need to implement the observe/assume pairs that turn the model block into a log-density function, because the newer interface stuff lends itself a lot better to cases where step! functions spit out immutable Transition structs instead of doing the state mutation we currently do everywhere.

I have a prototype function with the signature Transition(model::Model, spl::Sampler, theta::T) which just sets the VarInfo to theta and then runs the model to get the PDF. I like the workflow a lot more.
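
A hypothetical sketch of that constructor (the spl.state.vi layout and the indexing/getlogp calls are assumptions for illustration, not the prototype itself):

struct Transition{T,L}
    θ::T     # parameter values
    lp::L    # log density at θ
end

function Transition(model::Model, spl::Sampler, theta)
    vi = spl.state.vi      # assuming the VarInfo lives on the sampler state
    vi[spl] = theta        # write the parameters into the VarInfo
    model(vi, spl)         # rerun the model to recompute the log joint
    return Transition(theta, getlogp(vi))
end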

@cpfiffer
Member

What about having an abstract type AbstractSimulationBasedSampler, similar to the Hamiltonian one, and defining assume and observe statements for both? But in that case the assume for AbstractSimulationBasedSampler has to be much more flexible than the one you pointed at, so that we can handle e.g. #906 in the future.

I should clarify that my proposal has little to do with dispatch on a sampler type -- it should default to basically the following, regardless of sampler type:

function assume( ... )
    return vi[vn], logpdf(dist, vi[vn])
end

This is just something helpful for external developers. If they want to manually modify the contents of the VarInfo, then this essentially turns the model block into a big log-density function and they don't even have to import observe or assume. People who really like the way we do assume now can overload their own version, but I think just relaxing the type constraints and providing a super general function is really helpful for different types of workflows.
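
For completeness, a matching generic observe under the same idea might simply return the observation's log density, again with the type constraints relaxed as in point 1 (hypothetical sketch, not the current definition):

observe(spl, dist, value, vi) = logpdf(dist, value)    # log p(x | z) contribution of one observation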

@xukai92
Member

xukai92 commented Oct 27, 2019

Having this default assume and observe makes a lot of sense to me. I think we should provide two abstract types, one with assume and observe falling back to the current HMC's and one with them falling back to the current IS's. This should cover a lot of the sampling algorithms as a default.
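
A hypothetical sketch of the two fallback groups suggested here; the type names are illustrative, not existing Turing types:

abstract type GradientBasedAlgorithm <: InferenceAlgorithm end      # falls back to the HMC-style assume/observe
abstract type SimulationBasedAlgorithm <: InferenceAlgorithm end    # falls back to the IS-style assume/observe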

@yebai
Member Author

yebai commented Dec 16, 2021

Closed in favour of #1735

@yebai yebai closed this as completed Dec 16, 2021