
Adds @is_post_processing macro #589

Closed
wants to merge 1 commit

Conversation


@torfjelde torfjelde commented Apr 19, 2024

NOTE: I'd love suggestions for different names though!

We're seeing more and more users of generated_quantities and the like, which means there are more and more use-cases where the user wants to exclude certain computations from the model during inference but include them during "post-processing" steps (can we find a better name?).

This PR adds a macro @is_post_processing (again, better name please <3) which can be used to conditionally perform computation when not doing inference.

E.g.

julia> @model function demo()
           x ~ Normal(0, 1)
           if @is_post_processing
               return x
           else
               return nothing
           end
       end
demo (generic function with 2 methods)

julia> model = demo();

julia> model()

julia> generated_quantities(model, (x = 1,))
1.0

This topic has come up on multiple occasions: #510, #94 (comment), and definitely other places too. I know there's generally a reluctance to rely on additional macros for these purposes, but AFAIK there's no better solution, so at some point, e.g. now, I think we gotta bite the bullet and get something done.

And I don't think making the user add specific arguments to the model indicating whether they're in "post-processing"-mode or inference-mode is a good way to go for the following reasons:

  1. The user needs to do this. As we get more users who are not familiar with Julia, this is not immediately obvious.
  2. Doing this "correctly", i.e. in a way where the check actually compiles away, is non-trivial for basic users.
  3. Only a context-based approach works well with nested models, i.e. usage of @submodel, since otherwise this "are we inference-ing"-argument needs to be passed down through aaaall the models, which is annoying. With contexts, as we do in this PR, nesting of models etc. just works.
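For contrast, here's a rough sketch (hypothetical code, not from this PR; `expensive_summary` is a made-up stand-in) of the argument-based alternative described in points 1–3, showing how the flag has to be threaded through every nested model by hand:

    # Hypothetical argument-based approach (NOT what this PR does):
    # every model, and every submodel, needs an explicit flag.
    @model function inner(is_post_processing)
        y ~ Normal()
        # `expensive_summary` stands in for work we only want post-inference.
        return is_post_processing ? expensive_summary(y) : nothing
    end

    @model function outer(is_post_processing=false)
        x ~ Normal()
        # The flag must be forwarded manually to every submodel call.
        @submodel z = inner(is_post_processing)
        return z
    end

With the context-based approach in this PR, `@is_post_processing` reads this information from the evaluation context instead, so nested models need no extra arguments.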

inside-model-macros a bit clearer + a `PostProcessingContext` and
macro to enable user to perform conditional computation outside of inference
coveralls commented Apr 19, 2024

Pull Request Test Coverage Report for Build 8755557702

Details

  • 23 of 25 (92.0%) changed or added relevant lines in 6 files are covered.
  • 17 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.09%) to 81.909%

Changes missing coverage:
  • src/contexts.jl — 16 of 17 changed/added lines covered (94.12%)
  • src/model.jl — 2 of 3 changed/added lines covered (66.67%)

Files with coverage reduction:
  • src/model.jl — 1 new missed line (89.22%)
  • src/threadsafe.jl — 16 new missed lines (48.25%)

Totals:
  • Change from base Build 8745687397: +0.09%
  • Covered Lines: 2635
  • Relevant Lines: 3217

💛 - Coveralls

@@ -129,6 +129,7 @@ export AbstractVarInfo,
unfix,
# Convenience macros
@addlogprob!,
@is_post_processing,
Suggested change
@is_post_processing,
@is_generated_quantities,

of post-processing the inference results, e.g. making predictions or computing
generated quantities.
"""
struct PostProcessingContext{Ctx} <: AbstractContext
Suggested change
struct PostProcessingContext{Ctx} <: AbstractContext
struct GeneratedQuantitiesContext{Ctx} <: AbstractContext

@yebai yebai left a comment


It looks like a helpful feature, but I'm not sure we want to promote it as the standard workflow. In my view, users should write separate functions that take the MCMCChains object as input and return generated quantities. But it might be hard to resist a sugar syntax for convenience!

@yebai yebai changed the title Adds @is_post_processing macro Adds @is_generated_quantities macro Apr 19, 2024
@torfjelde torfjelde changed the title Adds @is_generated_quantities macro Adds @is_post_processing macro Apr 19, 2024
@@ -664,3 +664,66 @@ function fixed(context::FixedContext)
# precedence over decendants of `context`.
return merge(context.values, fixed(childcontext(context)))
end

"""
@yebai yebai Apr 19, 2024

It might be better to move these new codes and the existing generated_quantities functions into a new file generated_quantities.jl, so it is self-contained.


yebai commented May 6, 2024

In my view, users should write separate functions that take the MCMCChains object as input and return generated quantities.

Now that we can track any variable into MCMCChains #594, I propose that we encourage users to write separate

function f(s::NamedTuple)

end

Then provide

generated_quantities(f::Function, c::MCMCChains) 

as a convenience function to calculate compute-intensive generated variables instead of optionally skipping them inside the model.
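A rough sketch of what such a convenience method might look like (the implementation and its use of MCMCChains's Tables.jl support are assumptions on my part, not part of this proposal):

    # Hypothetical implementation sketch: apply `f` to the named
    # parameter values of each sample in the chain.
    function generated_quantities(f::Function, chain::MCMCChains.Chains)
        # Restrict to the :parameters section, dropping internals.
        params = MCMCChains.get_sections(chain, :parameters)
        # Iterate samples as NamedTuples and apply `f` to each.
        return map(f, Tables.rowtable(params))
    end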

@torfjelde
Member Author

I'm very much against this. How is this better than using return? Here you'll have to extract all the parameters from the Chains (which is really, really annoying if they're not univariate), and then probably rewrite a bunch of stuff you're already doing in the model. I personally use generated_quantities very heavily for all sorts of things, e.g. extracting an ODESolution object (which doesn't fit into a chains object), and having to write such a function instead would be very annoying.

IMO this is not worth it compared to just using return; the conditional evaluation is not that important + it can easily be achieved by just doing something like

@model demo() = x ~ Normal()
@model demo_for_post_inference() = (@submodel x = demo(); return f(x))

and then you can run inference on demo, avoiding f(x) during inference, while still maintaining the flexibility.


yebai commented May 6, 2024

It's good to know the limitations of MCMCChains. I am not proposing to remove return for models, so it can continue to be used.

For clarity, I am unsure of the motivation for a second model, demo_for_post_inference. Isn't what you wrote equivalent to

@model demo() = x ~ Normal()

generated_quantities(f ∘ demo(), chain::MCMCChains.Chain) # maybe: compose operator not implemented

If so, we can consider adding a new method, generated_quantities(::ComposedFunction{…}, ::Chains{…}) to support it.

@torfjelde
Member Author

Even if we define this generated_quantities for ComposedFunction, this would then be equivalent to

map(f, generated_quantities(model, chain))

which makes me somewhat uncertain why we'd want to hide this behind a generated_quantities method 😕


yebai commented May 8, 2024

map(f, generated_quantities(model, chain))

That works well and could be recommended as the standard workflow. I am closing this PR now since it introduces additional syntax and complexity that can otherwise be avoided without sacrificing functionality.
