Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poisson change-point with offset? #115

Closed
jdselwyn opened this issue Apr 1, 2021 · 6 comments
Closed

Poisson change-point with offset? #115

jdselwyn opened this issue Apr 1, 2021 · 6 comments
Labels
wontfix This will not be worked on

Comments

@jdselwyn
Copy link

jdselwyn commented Apr 1, 2021

I'm trying to fit a Poisson change point model with an offset and I seem to have run into a problem with how to set up the model with the offset. Is this possible to do with mcp?

Here is an example model fit with mcp and glm where when including the offset there should be no change-point at 3 but without the offset there is.

library(tidyverse)
library(mcp)

fake_data <- tibble(x = runif(100, 0, 5),
                    y = rpois(100, if_else(x > 3, 10, 3)),
                    n = rpois(100, if_else(x > 3, 20, 6)),
                    group = x > 3)

offset_mcp <- mcp(list(y ~ 1 + offset(log(n)), 
                         ~ 1 + offset(log(n))), 
                    par_x = "x",
                    data = fake_data,
                    family = poisson(),
                    cores = 1,
                    chains = 3,
                    iter = 3000,
                    adapt = 1500)

offset_glm <- glm(y ~ group + offset(log(n)), data = fake_data, family = 'poisson')

Thank you!

@lindeloev
Copy link
Owner

offset() is not supported by mcp (yet). I must admit that I didn't know about it before now. Sorry for asking a basic question, but is it different than doing fake_data$y = fake_data$y - log(fake_data$n) and then not having an offset term in the model?

@jdselwyn
Copy link
Author

jdselwyn commented Apr 4, 2021

A bit different yeah. The example I was taught to learn offsets in poisson models was the case of modelling shark attacks where you want to account for (human) population growth as well. So instead of just modelling the number of attacks you model the number of attacks per capita which makes the model actually y/n ~ x. Which then after the glm link transformation becomes log(y) - log(n) ~ x and then log(y) ~ x + log(n). The reason not to just model y/n as gaussian is that the underlying y still follows a poisson distribution so poisson is presumably a better fit given that underlying process (i.e. no negative values).

I had a go at trying to tweak some of the jags code output by the mcp model and this is how I included it:

    # Fitted value
    y_[i_] = 
    
      # Segment 1: y ~ 1 + offset(log(n))
      (x[i_] >= cp_0) * (x[i_] < cp_1) * int_1 + 
    
      # Segment 2: y ~ 1 ~ 1 + offset(log(n))
      (x[i_] >= cp_1) * int_2 +
	  
	  #Include the offset
	  log(n[i_])

    # Likelihood and log-density for family = poisson()
    y[i_] ~ dpois(exp(y_[i_]))
    loglik_[i_] = logdensity.pois(y[i_], exp(y_[i_]))

The only change being the line adding log(n[i_]) to the fitted value calculation block. So I think the only thing that would need to happen is if an offset is used either pass to jags the transformed n vector or an untransformed n vector and whatever the person puts in as a transformation inside the offset function. And I'm sure other things to help with prediction/plotting etc. that I'm not familiar enough with the package to have noticed.

@lindeloev
Copy link
Owner

OK, I see. Thanks for the explanation! Yes, this would be possible to add, but I'll close it for now and will reconsider if it's requested by more people. It would add complexity to places I'd like to keep as clean as possible :-)

For now, to make plotting, predictions, etc. work, in v0.3 simply modify the fit$simulate() function with the updated formula too. It basically mirrors the JAGS code, so it should be easy. Save in place. This function is used by plot(), fitted(), etc.

In the upcoming v0.4+, one would need to modify fit$.internal$formula_r in addition to the JAGS code.

@lindeloev lindeloev added the wontfix This will not be worked on label Apr 5, 2021
@jdselwyn
Copy link
Author

jdselwyn commented Apr 5, 2021

Ok thank you for the pointer and understandable about keeping the code clean!. Is there a way to pass an extra variable to the jags code? When I try just changing the jags_code itself I get an error because n isn't found.

modified_model <- '
model {

  # Priors for population-level effects
  cp_0 = MINX  # mcp helper value.
  cp_2 = MAXX  # mcp helper value.

  cp_1 ~ dunif(MINX, MAXX)
  int_1 ~ dnorm(0, 1/(10)^2) 
  int_2 ~ dnorm(0, 1/(10)^2) 


  # Model and likelihood
  for (i_ in 1:length(x)) {
    X_1_[i_] = min(x[i_], cp_1)
    X_2_[i_] = min(x[i_], cp_2) - cp_1
    
    # Fitted value
    y_[i_] = 
    
      # Segment 1: y ~ 1 + offset(log(n))
      (x[i_] >= cp_0) * (x[i_] < cp_1) * int_1 + 
    
      # Segment 2: y ~ 1 ~ 1 + offset(log(n))
      (x[i_] >= cp_1) * int_2 +
	  
	  #Include the offset
	  log(n[i_])

    # Likelihood and log-density for family = poisson()
    y[i_] ~ dpois(exp(y_[i_]))
    loglik_[i_] = logdensity.pois(y[i_], exp(y_[i_]))
  }
}
'

offset_fit_mod <- mcp(list(y ~ 1 + offset(log(n)), 
                        ~ 1 + offset(log(n))), 
                   par_x = "x",
                   data = fake_data,
                   family = poisson(),
                   cores = 1,
                   chains = 3,
                   iter = 3000,
                   adapt = 1500,
                   jags_code = modified_model)

Error in rjags::jags.model(file = textConnection(jags_code), data = jags_data,  : 
  RUNTIME ERROR:
Compilation error on line 28.
Unknown variable n
Either supply values for this variable with the data
or define it  on the left hand side of a relation.


Warning message:
In run_jags(data = data, jags_code = jags_code, pars = c(all_pars,  :
  --------------
JAGS failed with the above error. Returning an `mcpfit` without samples. Inspect fit$prior and cat(fit$jags_code) to identify the problem. Read about typical problems and fixes here: https://lindeloev.github.io/mcp/articles/tips.html.

@lindeloev lindeloev mentioned this issue Apr 5, 2021
4 tasks
@lindeloev
Copy link
Owner

Ah, that raises the ability to write custom models to a level where I reaised an issue :-) See #116. It'll likely be another ½ year or so before I get to that. For now, here are two solutions (not tested). Let me know if any of them works for you!

Run it directly using rjags:

jags_model = rjags::jags.model(
  textConnection(jags_code),
  list(
    x = my_data$x,
    y = my_data$y,
    n = my_data$n
  ),
  n.chains = 3,
  n.adapt = 1000
)

fit = rjags::coda.samples(
  jags_model,
  variable.names = c("int_1", "int_2", "cp_1"),
  n.iter = 1000
)

Then explore using summary(fit), tidybayes::spread_draws(fit), etc. This prevents you from using mcp here and on

Hack v0.4 to do this

v0.4 is in development and can be installed using remotes::install_github("lindeloev/[email protected]"). This supports multiple regression, so you could include n as a model term and set the associated regression parameter to zero to cancel it out:

model = list(
  y ~ 1 + n,
  ~ 1
)

prior = list(n_1 = 0)  # multiply n with zero

fit = mcp(model, fake_data, jags_code = your_modified_jags_code)
fit$.internal$formula_r = your_modified_r_code

It's in development. Everything works except plot() for multiple-regression models and some others.

@jdselwyn
Copy link
Author

jdselwyn commented Apr 5, 2021

Awesome. Thank you for all your help!

Repository owner locked and limited conversation to collaborators Apr 5, 2021

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

2 participants