
API design for non-Turing samplers #886

Closed
yebai opened this issue Aug 12, 2019 · 15 comments

@yebai
Member

yebai commented Aug 12, 2019

One important design goal of Turing is to make implementing new sampling algorithms easier. There is already an ongoing effort to decouple modelling and inference algorithms (see, e.g., #793, #746, #750). For the next steps, it would be helpful to …

Related: #895 #889

@yebai yebai added the doc label Aug 12, 2019
@trappmartin
Member

trappmartin commented Aug 12, 2019

I think the following additional functions could be a good start. (I think that the assume and observe statements should be kept.) Please correct me if I’m wrong or if I’m missing something.

  • parameters(model) -> Vector{VarName}
  • prior(model, vn) -> Distribution
  • logpdf(model, vi)
  • logjoint(model, vi)
  • rand(model) -> VarInfo
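
To make these concrete, here is a minimal, self-contained toy sketch of a subset of the proposal. ToyModel stands in for a Turing Model and a NamedTuple stands in for a VarInfo; none of the names below are existing Turing API:

using Distributions, Random

struct ToyModel
    x::Float64   # one observed data point
    σ::Float64   # prior standard deviation of the latent mean m
end

# parameters(model) -> vector of variable names (plain Symbols here)
parameters(::ToyModel) = [:m]

# prior(model, vn) -> Distribution attached to variable vn
prior(model::ToyModel, vn::Symbol) = vn === :m ? Normal(0, model.σ) : error("unknown variable $vn")

# logjoint(model, vi) -> log p(x, m) at the values stored in vi
logjoint(model::ToyModel, vi) = logpdf(prior(model, :m), vi.m) + logpdf(Normal(vi.m, 1.0), model.x)

# rand(model) -> a draw of the latent variables from the prior
Base.rand(rng::AbstractRNG, model::ToyModel) = (m = rand(rng, prior(model, :m)),)

# usage: model = ToyModel(1.2, 2.0); vi = rand(model); logjoint(model, vi)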

In addition, it would in my opinion be good to have a way to track the result (log pdf) of each assume and/or observe statement. This could be stored in an additional data type called Trace.

For this purpose I would add the following macros.

  • @addlogjoint logpdf(d, x) # add return value to log joint using current VarInfo
  • @track logpdf(d, x) # track log pdf value
  • @track z = x-1 # track result of expression

These macros could be used by users in the model block to track expressions, or by developers of inference algorithms; a rough sketch of one possible implementation follows.
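
A rough, hypothetical sketch of how a @track-style macro could work (it is not part of Turing): evaluate the expression, record the result under a name in a trace dictionary, and return the value so the surrounding code is unaffected:

const TRACE = Dict{Symbol,Any}()

macro track(ex)
    # use the assigned variable as the key for `@track z = x - 1`,
    # otherwise fall back to the expression text itself
    key = (ex isa Expr && ex.head === :(=)) ? ex.args[1] : Symbol(string(ex))
    return quote
        local val = $(esc(ex))
        TRACE[$(QuoteNode(key))] = val
        val
    end
end

# usage: @track z = x - 1   or   @track logpdf(d, x)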

@cpfiffer
Member

Are we assuming for this conversation that people are building a new InferenceAlgorithm, or do you want the discussion to be broader to include cases where people get rid of Turing's Sampler struct?

If the goal is just to look at new inference functions, we can streamline what we have pretty well. By default #793 will add a VarInfo to the Sampler if you don't provide a SamplerState object. It might also be worth moving the VarInfo to the Sampler struct itself rather than keeping it in the state, so that it's basically impossible for a user not to have a VarInfo that we can manipulate for them with some common functions.

Some general functions that would work using the enforced VarInfo scheme:

  • parameters(model, spl) or get_parameters(model, spl) to retrieve a vector of parameters or a NamedTuple
  • parameters!(model, spl, val) or set_parameters!(model, spl, val) to set the parameters to a value (like vi[spl])
  • For MH-like samplers, it might be useful to have propose(model, spl) that returns an AbstractProposal type with a convenient default like the MH proposal. The proposal might include a log PDF, some parameter values, and maybe some other stuff. Then you could call accept!(model, spl, proposal) and it would handle all the updating for you (a toy sketch follows below). Users who care a lot about the proposal mechanism could overload or wrap this function if they needed to.

One thing that we might look at is how close we can make the code in Turing to the pseudo-code you might see in an algorithm block in a new paper. Turing's pretty good on the model side, but the pseudo-code side is not quite as intuitive. I think the functions mentioned by @trappmartin would go a long way towards making everything generally accessible.
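
As a toy illustration of how close this could get to paper-style pseudo-code, here is a self-contained sketch of the suggested propose/accept! workflow for a random-walk MH step on a single Gaussian target. Everything in it (MiniSampler, Proposal, propose, accept!, mh_step!) is hypothetical and only mirrors the interface floated above; it is not Turing code:

using Distributions, Random

mutable struct MiniSampler        # stands in for Turing's Sampler plus its VarInfo
    θ::Float64
    logp::Float64
end

struct Proposal                   # stands in for the suggested AbstractProposal
    θ::Float64
    logp::Float64
end

propose(target, spl::MiniSampler) = (θ = spl.θ + 0.5randn(); Proposal(θ, logpdf(target, θ)))
accept!(target, spl::MiniSampler, p::Proposal) = (spl.θ = p.θ; spl.logp = p.logp; spl)

function mh_step!(target, spl::MiniSampler)
    p = propose(target, spl)
    # Metropolis acceptance for a symmetric proposal
    log(rand()) < p.logp - spl.logp && accept!(target, spl, p)
    return spl
end

# usage: spl = MiniSampler(0.0, logpdf(Normal(), 0.0)); foreach(_ -> mh_step!(Normal(), spl), 1:1000)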

@trappmartin
Member

I like that we can do this in several iterations and get rid of the Sampler in the last iteration. It would probably already be nice if one could interface their own package, e.g. nested sampling, with Turing without too much hassle.

I'm not sure about the MH-like functions, but I might be missing something. Maybe instead we could have simple access to internal samplers (MH, IS) and sample a value using mh(model, vn) -> Real?

I think we should also think about the interface functions we would need for VI. @torfjelde what do you think?

@torfjelde
Member

I think we should also think about the interface functions we would need for VI. @torfjelde what do you think?

Sorry, I completely missed this mention; didn't see it until yesterday. Also not sure if this discussion is still ongoing?

Anyways, for VI in particular the logjoint(model, vi) method proposed would be very useful. In fact I've just implemented such a method internally in my new VI PR for convenience (not meant to be exported):
https://github.com/torfjelde/Turing.jl/blob/547bb7b0f24d015371f0470345b737e52af95123/src/variational/advi.jl#L95-L100

Also a way of tracking the logpdf for each assume and observe call would be great since it would allow computation of p(x | z) and p(z). This is related to #817.

Worth noting that, when using vi, the Model is already instantiated, so tracking logpdf computations by adding a @track logpdf into the model wouldn't be too helpful in the VI case.

Similarly, the parameters and parameters! functions would also be useful. Internally it's okay to just extract the parameters "manually" from the VarInfo, but for users constructing their own variational posteriors (other than the mean-field approximation), these methods would make things much more intuitive.
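
A hypothetical sketch of how those pieces would fit together for VI: a naive single-sample ELBO estimate only needs the joint log density and the variational log density. Here q is any Distributions.jl distribution standing in for the variational posterior, and parameters! and logjoint are the proposed (not yet existing) interface functions, with the VarInfo passed directly for simplicity:

function elbo_estimate(model, vi, q)
    z = rand(q)                      # sample from the variational posterior
    parameters!(model, vi, z)        # write the sample into the VarInfo (proposed API)
    return logjoint(model, vi) - logpdf(q, z)
end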

The @track macro mentioned is also something I've personally found myself wanting at times (not for VI purposes though). I have some past experience using pymc3 in Python, and found myself using such a feature quite often. It's of course not necessary, since these deterministic values can be computed from the samples of the random variables, but it's quite convenient.

@trappmartin
Member

Yes, you are right. Things like @track are mostly relevant for the user and not so much for the internals or devs.

@yebai yebai pinned this issue Sep 17, 2019
@cpfiffer
Member

Another bit of low-hanging fruit is to make a function model(spl). Currently the signature is model(vi, spl), but since all the samplers have a vi we can just reduce the call size.
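
A sketch of what that convenience method could look like, assuming the sampler keeps its VarInfo in its state (the field layout here is an assumption for illustration, not the actual definition):

(model::Model)(spl::Sampler) = model(spl.state.vi, spl)    # forward to the existing two-argument call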

@cpfiffer
Member

In reference to this comment:

Yes; I added it here a few days ago. I feel we can have a really flexible/advanced MH sampler in Turing by falling back to different MH backends (e.g. Ensemble, RAM, vanilla MH).

I wonder if we should spin off an "AdvancedMH" package to show people how you might build a Turing-interfaced library.

@yebai
Member Author

yebai commented Sep 18, 2019

Sounds good to me, but maybe we can do the implementation in the Turing repo first? This avoids the pain of keeping two packages synced.

@cpfiffer
Member

I'm going through some of these suggestions while I'm updating the MH code, and I'm wondering if we can just give everyone default observe/assume statements that mostly just work all the time. Is anyone opposed to updating the ones here a little?

Specifically, I'd like to do two things:

  1. Remove the type constraints so that this function catches all samplers that do not have a custom observe/assume.
  2. Implement an insupport call to return -Inf when something is out of bounds, or add an insupport check to the logpdf_with_trans call (see the sketch below).
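
A hypothetical sketch of point 2 (the function name is illustrative; logpdf_with_trans comes from Bijectors, and istrans is assumed to follow the existing VarInfo API):

# guard the log-density computation with an insupport check so that
# out-of-bounds values yield -Inf instead of throwing an error
function default_assume(dist, vn, vi)
    x = vi[vn]
    lp = insupport(dist, x) ? logpdf_with_trans(dist, x, istrans(vi, vn)) : -Inf
    return x, lp
end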

The reason I want a generically useful observe/assume set is that a lot of the inference logic we currently do in assume (take MH for example) should probably not be the go-to for developers. The main place for inference should be step!.

Additionally, most inference algorithms already implement the most basic possible assume statement (HMC, for example), so we could remove a lot of these sampler-specific definitions and stick with one well-designed default that new developers can fall back on without thinking too hard about it.

Thoughts on that?

@cpfiffer
Member

Whoops, didn't mean to close that. Sorry.

@cpfiffer cpfiffer reopened this Sep 25, 2019
@trappmartin
Member

What about having an abstract type AbstractSimulationBasedSampler, similar to the Hamiltonian one, and defining assume and observe statements for both? But in that case the assume for AbstractSimulationBasedSampler has to be much more flexible than the one you pointed at, so that we can handle e.g. #906 in the future.

Maybe it is better to make an extra PR for this?

@cpfiffer
Member

Probably. I think we basically just need to implement the observe/assume pairs that turn the model block into a log-density function, because the newer interface stuff lends itself a lot better to cases where step! functions spit out immutable Transition structs instead of doing the state mutation we currently do everywhere.

I have a prototype function with the signature Transition(model::Model, spl::Sampler, theta::T) which just sets the VarInfo to theta and then runs the model to get the PDF. I like the workflow a lot more.
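
A hypothetical sketch of that constructor (the spl.state.vi layout and the indexing/getlogp calls are assumptions for illustration, not the prototype itself):

struct Transition{T,L}
    θ::T     # parameter values
    lp::L    # log density at θ
end

function Transition(model::Model, spl::Sampler, theta)
    vi = spl.state.vi      # assuming the VarInfo lives on the sampler state
    vi[spl] = theta        # write the parameters into the VarInfo
    model(vi, spl)         # rerun the model to recompute the log joint
    return Transition(theta, getlogp(vi))
end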

@cpfiffer
Member

What about having an abstract type AbstractSimulationBasedSampler, similar to the Hamiltonian one, and defining assume and observe statements for both? But in that case the assume for AbstractSimulationBasedSampler has to be much more flexible than the one you pointed at, so that we can handle e.g. #906 in the future.

I should clarify that my proposal has little to do with dispatch on a sampler type -- it should default to basically the following, regardless of sampler type:

function assume( ... )
    return vi[vn], logpdf(dist, vi[vn])
end

This is just something helpful for external developers. If they want to manually modify the contents of the VarInfo, then this essentially turns the model block into a big log-density function and they don't even have to import observe or assume. People who really like the way we do assume now can overload their own version, but I think just relaxing the type constraints and providing a super general function is really helpful for different types of workflows.
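
For completeness, a matching generic observe under the same idea might simply return the observation's log density, again with the type constraints relaxed as in point 1 (hypothetical sketch, not the current definition):

observe(spl, dist, value, vi) = logpdf(dist, value)    # log p(x | z) contribution of one observation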

@xukai92
Member

xukai92 commented Oct 27, 2019

Having this default assume and observe makes a lot of sense to me. I think we should provide two abstract types, one with assume and observe falling back to the current HMC's and one with them falling back to the current IS's. This should cover a lot of the sampling algorithms as a default.
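
A hypothetical sketch of the two fallback groups suggested here; the type names are illustrative, not existing Turing types:

abstract type GradientBasedAlgorithm <: InferenceAlgorithm end      # falls back to the HMC-style assume/observe
abstract type SimulationBasedAlgorithm <: InferenceAlgorithm end    # falls back to the IS-style assume/observe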

@yebai
Member Author

yebai commented Dec 16, 2021

Closed in favour of #1735

@yebai yebai closed this as completed Dec 16, 2021