Skip to content

Derived quantities#1547

Merged
danielturek merged 79 commits into
develfrom
derived_quantities
Jun 28, 2025
Merged

Derived quantities#1547
danielturek merged 79 commits into
develfrom
derived_quantities

Conversation

@danielturek
Copy link
Copy Markdown
Member

Adds derived quantities to the MCMC.

Let's see how many issues the testing turns up.

danielturek and others added 4 commits May 27, 2025 13:45
* Tests for mean/variance derived quantities

* Tests for logProb derived quantity

* Tests for predictive derived quantity

* Tests for custom derived quantity

* Update derived quantity tests after fixes

* Remove unneeded TODO comment

---------

Co-authored-by: Daniel Turek <danielturek@gmail.com>
@paciorek
Copy link
Copy Markdown
Contributor

I merged devel into derived_quantities to address a few testing issues that I had fixed in recent weeks. That should fix all but the mcmc and filtering gold file issues.

Let's see what this run of CI shows in terms of the gold file issues. Hopefully those are heisen-failures.

@paciorek
Copy link
Copy Markdown
Contributor

There's a failure in test-bnp that I can replicate running manually on my machine. It occurs in the MCMC compilation in this particular chunk of code:

  code=nimbleCode(
    {
      xi[1:10] ~ dCRP(1 , size=10)
      thetatilde[1] ~ dnorm(0, 1)
      thetatilde[2] ~ dt(0, 1, 1)
      thetatilde[3] ~ dt(0, 1, 1)
      s2tilde[1] ~ dinvgamma(2, 1)
      s2tilde[2] ~ dgamma(1, 1)
      s2tilde[3] ~ dgamma(1, 1)
      for(i in 1:10){
        y[i] ~ dnorm(thetatilde[xi[i]], var=s2tilde[xi[i]])
      }
    }
  )
  Inits=list(xi=rep(1, 10), thetatilde=rep(0,3), s2tilde=rep(1,3))#
  Data=list(y=rnorm(10, 0,1))
  m <- nimbleModel(code, data=Data, inits=Inits)
  cm <- compileNimble(m)
  mConf <- configureMCMC(m, monitors =  c('thetatilde', 's2tilde', 'xi'))
  expect_message(mMCMC <- buildMCMC(mConf), "The number of clusters")
  cMCMC <- compileNimble(mMCMC, project = m) 
  cMCMC$run(1, reset=FALSE) 
[...snip...]
Compiling
  [Note] This may take a minute.
  [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
terminate called after throwing an instance of 'std::length_error'
  what():  vector::_M_default_append
Aborted

@danielturek
Copy link
Copy Markdown
Member Author

@paciorek What is the intention of including reset = FALSE in the call

cMCMC$run(1, reset=FALSE) 

?

I believe that's what's causing the problem here, and the MCMC system is not designed to run with reset = FALSE, unless a prior run of the MCMC has already taken place.

@paciorek
Copy link
Copy Markdown
Contributor

Hmm.

  1. I don't know why I put reset=FALSE in that test.
  2. That said, this test has been fine for years. And to confirm, I just ran it on current nimble and does run without error.
  3. I would hope that even if we don't intend users to use reset=FALSE on first run that it would not cause an error and certainly not a crash. That said, I see that we tell users not to use FALSE on first run in roxygen.

I'm ok with taking out reset=FALSE but before doing so, I think it would be good if we understand why it causes problems now but not before.

@danielturek
Copy link
Copy Markdown
Member Author

@paciorek I'm away from a computer this week, this is from my phone; forgive briefness. This is caused by some reorganization of the initialization code I did with the setup of the MCMC, wrt mvSamples and related objects. The changes (alongside new initialization code for the derived quantities) is more logical and correct, for how the MCMC is designed to operate. I wouldn't mind also adding an error trap for this case (first run of MCMC using reset = FALSE), which would error trap this case, but I can't do that now. What do you think is best?

@paciorek
Copy link
Copy Markdown
Contributor

To me the most natural thing is that if one does reset=FALSE on first run of the MCMC it has no effect. I.e. if one does not reset MCMC sampling quantities before even running the MCMC, then the MCMC just runs as it otherwise would with whatever the initial quantities are set to..

But you have probably thought this through/understand this more than I do, so if you prefer to error trap this case, it is ok with me.

Perhaps when you work on this you can just remove the reset=FALSE from that test in test-bnp.R.

No immediate hurry here, but I'll note to myself that we are planning on making this change as part of 1.4.0 release.

@danielturek
Copy link
Copy Markdown
Member Author

danielturek commented Jun 25, 2025 via email

@paciorek
Copy link
Copy Markdown
Contributor

Sure -- I need to look at the MCMC gold file for another PR, so it may be related.

@danielturek
Copy link
Copy Markdown
Member Author

danielturek commented Jun 25, 2025 via email

@paciorek
Copy link
Copy Markdown
Contributor

@danielturek there's a problem with your fix:

** byte-compile and prepare package for lazy loading
Warning in .checkFieldsInMethod(def, fieldNames, allMethods) :
  non-local assignment to non-field names (possibly misspelled?)
    reset <<- TRUE
( in method "run" for class "MCMC")

@danielturek
Copy link
Copy Markdown
Member Author

@paciorek I just made a small update.

@paciorek
Copy link
Copy Markdown
Contributor

So some (not sure if all) of the gold file problems seem to be caused by a new space character inserted in the reporting of sampler assignment.

E.g., in line 2709 of mcmcTestLog_Correct.Rout, we have:

RW sampler: a[2]

The saved gold file has no space after "[2]", but running tests now produces a space after "a[2]".

On first glance, it looks like this is coming from the change to cat() in the show method for samplerConf.

@paciorek
Copy link
Copy Markdown
Contributor

I think all of the mcmc gold file issues are the extra space. Hopefully the filtering gold issues are too.

@danielturek
Copy link
Copy Markdown
Member Author

That sounds about right, and yes, I would prefer the space not be there. Is there any chance of re-writing the gold file? I suspect you'll say the correct path is to reintroduce the space, the (theoretically) pass testing, then re-remove the space.

@paciorek
Copy link
Copy Markdown
Contributor

@danielturek

  • Ok to merge this in? Do you want to do it?
  • I see various roxygen entries, so I am guessing roxygen is more-or-less done, but wanted to check.
  • Were you planning to draft a section for the manual or would you like me to do that?

@danielturek
Copy link
Copy Markdown
Member Author

@paciorek

  • Yes, I consider this ready to merge. I'll take care of it.
  • Yes, the roxygen is accurate.
  • There is currently no manual entry. If you're willing to draft that I'd appreciate it. It might be reasonable to (in a large part) just reference the vignette; or if not, you could repurpose a lot of information from there.

@danielturek danielturek merged commit e0f18a8 into devel Jun 28, 2025
8 checks passed
@danielturek
Copy link
Copy Markdown
Member Author

@paciorek Looks like I didn't update the NEWS file, however.

@paciorek
Copy link
Copy Markdown
Contributor

That's fine -- as of late I've just been doing NEWS all in one go right before release by looking through all the commits/closed PRs/closed issues.

I'm not sure how to feel about vignette vs. manual. Modularity can be nice and I know other packages structure docs around vignettes and the manual can be overwhelming, but I also like have all the content in one place, particularly since we're thinking of this as a core part of the MCMC system.

@perrydv @kenkellner do you have any inclinations in terms of how much to say about derived quantities in the manual vs. the material already in the vignette?

@kenkellner
Copy link
Copy Markdown
Contributor

The existing vignette covers everything and is pretty concise, so maybe it just all goes in the manual?

If most of it stays in the vignette, is it OK that the vignette is not actually included with the package but instead is on Daniel's site? Not sure if there are other situations like that in the manual.

@paciorek
Copy link
Copy Markdown
Contributor

paciorek commented Jul 3, 2025

I'm working on manual content for derived quantities.

@danielturek (and no need to respond until you have time) in your vignette, you distinguish between (1) "predictive nodes" and (2) "posterior derived quantities". I think by the latter you mean deterministic predictive "quantities", but elsewhere in our docs we refer to both determ and stoch predictive cases as "predictive nodes". So I am going to modify your language and not distinguish between (1) and (2). Please let me know if I'm misunderstanding or you otherwise are opposed to me modifying your language.

@danielturek
Copy link
Copy Markdown
Member Author

@paciorek Yes, in the text I generally said "predictive nodes" for PP stochastic nodes, and "posterior derived quantities" for PP deterministic nodes. And yes, as you said both of these are just different cases of "predictive nodes". It sounds like we're on the same page, and if you've already modified text according to how you think this should be worded, that's fine with me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants