Improve documentation #512
Comments
This is really useful stuff @BenjaminJCox; thank you so much! This is partially because the documentation used to be separate from the actual code (this has improved now that we have https://turinglang.org/library), and partially because there have been numerous improvements over the past year in particular, e.g. most samplers don't have to interact with Turing.jl at all to be compatible with Turing; you can just work with LogDensityProblems.jl, and then, thanks to [...] But all of this stuff is not at all obvious.
I think this is the way to go. A lot of the points here could probably be improved significantly by just implementing the same sampler using the different approaches, showing the user the different possible paths one can take. I'll have a shot at this 👍
Something like this but more on the sampling?
digraph {
# Nodes.
tilde_node [shape=box, label="x ~ Normal()", fontname="Courier"];
base_node [shape=box, label=< left = <FONT COLOR="#3B6EA8">@varname</FONT>(x)<BR/>right = Normal()<BR/>x, vi = ... >, fontname="Courier"];
ifobs [label=< <FONT COLOR="#3B6EA8">if</FONT> isobservation >, fontname="Courier"];
tilde_assume_bangbang [shape=box, label="tilde_assume!!(context, left, right, vi)", fontname="Courier"];
tilde_observe_bangbang [shape=box, label="tilde_observe!!(context, left, right, vi)", fontname="Courier"];
tilde_assume_bangbang_inner [shape=box, label="x, lp, vi = tilde_assume(context, left, right, vi)\nreturn x, acclogp!!(vi, lp)", fontname="Courier"];
tilde_observe_bangbang_inner [shape=box, label="lp, vi = tilde_observe(context, left, right, vi)\n return left, acclogp!!(vi, lp)", fontname="Courier"];
tilde_assume [shape=box, label="tilde_assume(context, left, right, vi)", fontname="Courier"];
tilde_assume_sampling [shape=box, label="tilde_assume(rng, context, sampler, left, right, vi)", fontname="Courier"];
assume [shape=box, label="assume(left, right, vi)", style=dashed, fontname="Courier"];
assume_sampling [shape=box, label="assume(rng, sampler, left, right, vi)", style=dashed, fontname="Courier"];
tilde_observe [shape=box, label="tilde_observe(context, left, right, vi)", fontname="Courier"];
observe [shape=box, label="observe(left, right, vi)", style=dashed, fontname="Courier"];
ifsampling [label=< <FONT COLOR="#3B6EA8">if</FONT> sampling >, fontname="Courier"];
ifleafcontext1, ifleafcontext2, ifleafcontext3 [label=<<FONT COLOR="#3B6EA8">if</FONT> <FONT COLOR="#9A7500">LeafContext</FONT>>, fontname="Courier"];
childcontext1, childcontext2, childcontext3 [shape=box, label="context = childcontext(context)", fontname="Courier"];
# Edges.
tilde_node -> base_node [style=dashed, label=< <FONT COLOR="#3B6EA8">@model</FONT>>, fontname="Courier"]
base_node -> ifobs;
ifobs -> tilde_assume_bangbang [label=< <FONT COLOR="#97365B">false</FONT>>, fontname="Courier"];
ifobs -> tilde_observe_bangbang [label=< <FONT COLOR="#4F894C">true</FONT>>, fontname="Courier"];
# Assume
tilde_assume_bangbang -> tilde_assume_bangbang_inner;
tilde_assume_bangbang_inner -> tilde_assume;
tilde_assume -> ifsampling;
ifsampling -> ifleafcontext1 [label=< <FONT COLOR="#97365B">false</FONT>>, fontname="Courier"];
ifsampling -> tilde_assume_sampling [label=< <FONT COLOR="#4F894C">true</FONT>>, fontname="Courier"];
ifleafcontext1 -> childcontext1 [label=< <FONT COLOR="#97365B">false</FONT>>, fontname="Courier"]
childcontext1 -> tilde_assume;
ifleafcontext1 -> assume [label=< <FONT COLOR="#4F894C">true</FONT>>, fontname="Courier"]
tilde_assume_sampling -> ifleafcontext2;
ifleafcontext2 -> assume_sampling [label=< <FONT COLOR="#4F894C">true</FONT>>, fontname="Courier"];
ifleafcontext2 -> childcontext2 [label=< <FONT COLOR="#97365B">false</FONT>>, fontname="Courier"];
childcontext2 -> tilde_assume_sampling;
# Observe
tilde_observe_bangbang -> tilde_observe_bangbang_inner;
tilde_observe_bangbang_inner -> tilde_observe;
tilde_observe -> ifleafcontext3;
ifleafcontext3 -> childcontext3 [label=< <FONT COLOR="#97365B">false</FONT>>, fontname="Courier"];
ifleafcontext3 -> observe [label=< <FONT COLOR="#4F894C">true</FONT>>, fontname="Courier"]
childcontext3 -> tilde_observe;
}
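To make the flow in the diagram a bit more concrete, here is a toy sketch of the same branching written in Python. It only mirrors the diagram: the observation check, the unwrapping of nested contexts until a leaf context is reached, and the accumulation of the log probability. Everything in it is an assumption for illustration (the `WrapperContext` class, the dictionary-based `VarInfo`, the hard-coded standard normal); it is not how DynamicPPL is implemented.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class LeafContext:
    """Terminal context: nothing left to unwrap."""


@dataclass
class WrapperContext:
    """Hypothetical non-leaf context; exists only to show the unwrapping."""
    child: object


@dataclass
class VarInfo:
    """Toy stand-in for `vi`: stored values plus an accumulated log probability."""
    values: dict = field(default_factory=dict)
    logp: float = 0.0


def normal_logpdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def assume(name, vi, rng=None):
    # "if sampling": draw a fresh value; otherwise reuse the value stored in vi.
    if rng is not None:
        vi.values[name] = rng.gauss(0.0, 1.0)
    x = vi.values[name]
    return x, normal_logpdf(x)


def observe(value):
    return normal_logpdf(value)


def tilde_assume(context, name, vi, rng=None):
    # "if LeafContext": call assume; otherwise recurse on the child context.
    if isinstance(context, LeafContext):
        return assume(name, vi, rng)
    return tilde_assume(context.child, name, vi, rng)


def tilde_observe(context, value):
    if isinstance(context, LeafContext):
        return observe(value)
    return tilde_observe(context.child, value)


def tilde(context, name, vi, observations, rng=None):
    # "if isobservation": observed variables contribute a likelihood term,
    # all other variables go down the assume branch.
    if name in observations:
        value = observations[name]
        lp = tilde_observe(context, value)
    else:
        value, lp = tilde_assume(context, name, vi, rng)
    vi.logp += lp  # plays the role of acclogp!!(vi, lp)
    return value
```

Calling `tilde(WrapperContext(LeafContext()), "x", VarInfo(), {}, random.Random(0))` walks the assume branch, while passing `{"x": 0.5}` as the observations walks the observe branch.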
@torfjelde excellent diagram!
@torfjelde This is the type of diagram I had in mind, however I think that explaining what each node means would be very helpful. Implementing the same sampler using the various different approaches seems like a very good idea. Having an implementation using LogDensityProblems.jl and the interface would be very useful to someone who is used to implementing samplers based on statistical methodology papers, as I think it would reduce the required knowledge of the internals of the PPL.
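As a rough illustration of why a log-density interface reduces the required knowledge of the internals, here is a minimal sketch in Python of a random-walk Metropolis sampler that only ever asks its target for a dimension and a log density. The class and method names (`StandardNormalTarget`, `dimension`, `logdensity`) are hypothetical and chosen for this sketch; they are loosely inspired by the idea behind LogDensityProblems.jl but are not its API.

```python
import math
import random


class StandardNormalTarget:
    """Hypothetical target: exposes only a dimension and an unnormalised log density."""

    def __init__(self, dim):
        self.dim = dim

    def dimension(self):
        return self.dim

    def logdensity(self, theta):
        return -0.5 * sum(t * t for t in theta)


def random_walk_metropolis(target, n_samples, step_size=0.5, seed=0):
    """Random-walk Metropolis using nothing but target.dimension() and target.logdensity()."""
    rng = random.Random(seed)
    theta = [0.0] * target.dimension()
    logp = target.logdensity(theta)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = [t + step_size * rng.gauss(0.0, 1.0) for t in theta]
        logp_prop = target.logdensity(proposal)
        # Accept with probability min(1, exp(logp_prop - logp)).
        if rng.random() < math.exp(min(0.0, logp_prop - logp)):
            theta, logp = proposal, logp_prop
            accepted += 1
        samples.append(list(theta))
    return samples, accepted / n_samples
```

The sampler never needs to know how the log density was produced, which is the point: a model-building layer can generate it, and the sampler author can work straight from the methodology paper.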
Just as a small update, I've been working (mainly for my own benefit!) on implementing Metropolis–Hastings, and writing it up as I go along. I'm currently hosting this on my own website, but I'm sure eventually some parts of it could be reshaped into Turing docs. So far I've only been doing an implementation from scratch (part 1, part 2) but the plan is to implement it as per the AbstractMCMC API, and then talk about how DynamicPPL generates the function that returns the log density. Happy to refresh some of the above as and when I get to it.
ps. since this issue is so wide-ranging, and covers both the sampling side of things as well as the probabilistic DSL, I'm going to move this to the docs repo. :) |
As requested on the Slack I am posting my perspective as to how the docs could be improved.
The below is not complete, but these are the issues that I have come across when trying to implement an adaptive multiple importance sampler along the lines of https://arxiv.org/abs/0907.1254
All of the below comes from the perspective of a guy that designs samplers and parameter inference methods, so some things are probably obvious but perhaps named or laid out differently than expected by me as I am educated in statistics only.
Overview of potential improvements to documentation for DynamicPPL (and associated):
Suggestion:
Denote by θ the model parameters and by x the data
Most samplers can be implemented using (a subset of) the following:
• p(θ|x) the posterior density
• p(θ) the parameter prior
• p(x|θ) the data likelihood
• A way to evaluate the above at arbitrary parameter values
• A way to evaluate conditionals of the above for subsets of the sampled variables (e.g. let θ = [μ,σ], get p(μ|σ, x))
• first and second order derivatives of the posterior density
• Transforming the parameter space to an unconstrained space
• A place to store samples from each iteration (potentially multiple samples per iteration)
• A place to store weights and other (meta)data associated with samples at each iteration
• A way to accumulate probabilities from each iteration
To this end I think that it would be invaluable to have a tutorial that implements a very basic HMC algorithm (literally just a textbook method) within Turing (i.e. using DynamicPPL and AbstractMCMC), as it will cover the majority of these in a way that allows extension.
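For reference, this is roughly what I mean by a textbook method: a minimal HMC sketch in plain Python (standard library only, not Turing code) on a standard normal target. It touches three of the ingredients listed above: evaluating the log density at arbitrary parameter values, first-order derivatives, and storing one sample per iteration. The target, the hand-written gradient, and all function names are assumptions made for this sketch.

```python
import math
import random


def logp(theta):
    """Unnormalised log density of a standard normal target (assumed for illustration)."""
    return -0.5 * sum(t * t for t in theta)


def grad_logp(theta):
    """Gradient of logp, written by hand for this toy target."""
    return [-t for t in theta]


def leapfrog(theta, r, step_size, n_steps, grad):
    """Leapfrog integrator: half momentum step, full position steps, half momentum step."""
    theta = list(theta)
    g = grad(theta)
    r = [ri + 0.5 * step_size * gi for ri, gi in zip(r, g)]
    for i in range(n_steps):
        theta = [ti + step_size * ri for ti, ri in zip(theta, r)]
        g = grad(theta)
        # Full momentum step inside the trajectory, half step at the very end.
        scale = step_size if i < n_steps - 1 else 0.5 * step_size
        r = [ri + scale * gi for ri, gi in zip(r, g)]
    return theta, r


def hmc(logp, grad, theta0, n_samples, step_size=0.2, n_steps=10, seed=0):
    rng = random.Random(seed)
    theta = list(theta0)
    samples = []
    for _ in range(n_samples):
        r0 = [rng.gauss(0.0, 1.0) for _ in theta]  # resample the momentum
        theta_new, r_new = leapfrog(theta, r0, step_size, n_steps, grad)
        # Hamiltonian = potential energy (-logp) + kinetic energy.
        h0 = -logp(theta) + 0.5 * sum(r * r for r in r0)
        h1 = -logp(theta_new) + 0.5 * sum(r * r for r in r_new)
        # Metropolis correction for the integration error.
        if rng.random() < math.exp(min(0.0, h0 - h1)):
            theta = theta_new
        samples.append(list(theta))
    return samples
```

A tutorial version would replace `logp` and `grad_logp` with whatever the modelling layer provides, and would add the transformation to an unconstrained space, which this sketch leaves out.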
Addendum:
I believe that with some tweaking the Turing ecosystem has potential to be an excellent tool for prototyping inference algorithms on complex models, however it is currently rather opaque as to how to implement samplers. Furthermore, it seems to be built around the implicit assumptions of MCMC, with two obvious ones being one sample per iteration and equal weighting of samples. Whilst these assumptions can be circumvented (e.g. by writing your method using only the DynamicPPL modelling interface), it would be useful to have native interop, although I appreciate this is a big ask.
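To illustrate the two assumptions I mean, here is a small sketch in Python of a plain adaptive importance sampler: every iteration produces many samples, and every sample carries its own weight. This is deliberately not the AMIS scheme from the paper linked above (there is no recycling of past weights), and the target, the fixed proposal scale, and the function names are all assumptions for the example.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def log_target(x):
    """Unnormalised log density of N(2, 1); an assumed toy target."""
    return -0.5 * (x - 2.0) ** 2


def adaptive_importance_sampling(log_target, n_iters=5, n_per_iter=500, seed=0):
    rng = random.Random(seed)
    mu, sigma = 0.0, 3.0  # initial Gaussian proposal; sigma is kept fixed here
    history = []  # one entry per iteration: a list of (sample, log_weight) pairs
    for _ in range(n_iters):
        draws = [rng.gauss(mu, sigma) for _ in range(n_per_iter)]
        log_w = [log_target(x) - normal_logpdf(x, mu, sigma) for x in draws]
        history.append(list(zip(draws, log_w)))
        # Self-normalised weights, shifted by the maximum for numerical stability.
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        total = sum(w)
        # Adapt the proposal mean to the weighted sample mean.
        mu = sum(wi * x for wi, x in zip(w, draws)) / total
    return history, mu
```

The `history` structure is the part that does not map cleanly onto a one-sample-per-iteration, equal-weight chain: each iteration yields a whole weighted population.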
I understand that improving this API is secondary to improving the user-facing part of Turing, as many more people use Turing without implementing their own samplers. However, I believe that improving the documentation will attract more contributors, and allow the use of the Turing ecosystem in researching sampler design in addition to statistical studies.
I am of course happy to help to the best of my ability, but I do not understand the design of DynamicPPL or AbstractMCMC well enough to be confident that I could.
(apologies for the formatting, I copied this over from notepad)