# Multiple predictors #90
@lindeloev Out of curiosity, do you have a timeline on when this will be implemented?
@sjmgarnier I got this to work locally, but while it provides many more modeling options (and is easier to maintain/extend), sampling of the currently supported models takes double the time. mcp is already something like 100x slower than other packages; it tries to be useful for modeling options rather than speed. So I'm in two minds about whether to continue down this path. What do you think?
@lindeloev I'll answer very selfishly by saying that I need it for a project I'm working on :-) I could not find a satisfying alternative for performing weighted segmented regressions with random effects. And I'm a patient man with a powerful computer, so I can wait a few minutes for the fit to finish. But I'm also very well aware of how time-consuming it is to develop and maintain packages, so I would completely understand if you decided to focus on other priorities.
Cool. The first version with this will likely not have random effects on the RHS. But using a categorical intercept (like
I'm working on this now and have almost finished making all design decisions. Luckily, I found an implementation that won't negatively affect performance. Expect a release in a few months, depending on how hard it is to make sensible
Is this implemented in a development version? A reviewer demands a multivariate analysis :/ |
@mattmoo I just pushed the development version to branch v0.4 here. I think you can install it using
But I've only tested it on Gaussian models so far, so please triple-check. And any feedback and ideas would be much appreciated, BTW!
There is good progress on this. I just pushed the latest version to branch v0.4. Install using
I'm basically just missing
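The exact install commands were stripped from the two comments above. For a development branch on GitHub, the call was presumably something along these lines (repository path and branch name are assumptions inferred from context, not a verbatim reconstruction):

```r
# Assumed reconstruction: install the v0.4 development branch from GitHub
remotes::install_github("lindeloev/mcp@v0.4")
```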
That's great! I'm waiting on compute resources to actually run the analysis (50,000 datapoints :/) |
I just tested the performance locally on a Ryzen 5 3600. For 55,000 data points and a 15-parameter model with categorical predictors, I get around 1 sample per second. So if you run in parallel with the default settings (3000 warmup iterations + 1500 sampling iterations), you might complete sampling in a matter of hours?

```r
ex = mcp_example("multiple")
df_tmp = tidyr::expand_grid(rep = 1:250, ex$data) %>%  # upscale to 55,000 data points
  dplyr::select(-rep)
fit = mcp(ex$model, df_tmp, par_x = "x", chains = 1, adapt = 100, iter = 100)
```

I just pushed a new commit to v0.4 which requires ~10% of the memory of the previous version during sampling. Maybe that helped too.
I'll give it a go. From my previous experience with these data (see link), the sampling does not converge so quickly (and I do not have such a nice processor!).
Hi, is this issue still up to date, or is there additional information on how to incorporate additional predictors?
@adrose Unfortunately not. The v0.4 branch currently doesn't run out of the box due to some backwards-incompatible changes in the dependency packages that I only incorporated into the v0.3 series. I'm presently prioritizing another project higher, but I'm really looking forward to getting v0.4 out, since it's awesome!
Love the package. Looking forward to v0.4.
Each segment should take an arbitrary number of linear predictors. As with the `segmented` package, the only requirement is that one continuous predictor (say, `x`) is the dimension of the change point. The change point is simply the value on `x` where the predictions of `y` change to a different regression model (parameter structure and/or values). So this API should work. It has the following features:
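The concrete API example is missing from this copy; as a hedged illustration only (these formulas and the `z`/`group` predictors are hypothetical, not a released mcp syntax), a multi-predictor model could look like:

```r
# Hypothetical: x is the change-point dimension; z and group are extra predictors
model = list(
  y ~ 1 + x + z,        # segment 1: intercept, slope on x, effect of z
  ~ 0 + x + group       # segment 2: new slope on x plus a categorical term
)
# par_x would be required explicitly when there is more than one continuous predictor:
# fit = mcp(model, data, par_x = "x")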
JAGS-wise, the indicator functions would be the same, but now we additionally pass design matrices (`X1_`, `X2_`, etc.) and use `inprod()` per segment, where each `Xi_` is a model matrix that is built R-side and `x_` is `par_x`, along which the change points are defined. Implementing this adds the following work points:
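The JAGS snippet the comment refers to was lost from this copy. A rough sketch of the structure just described, for a two-segment Gaussian model (all identifiers and the exact parameterization are assumptions, not mcp's actual generated code):

```jags
model {
  for (i in 1:length(y_)) {
    # Indicator functions switch at change point cp_1 along x_ (par_x);
    # each segment contributes inprod() of its design matrix row and coefficients.
    mu_[i] <- (x_[i] <  cp_1) * inprod(X1_[i, ], beta1_) +
              (x_[i] >= cp_1) * inprod(X2_[i, ], beta2_)
    y_[i] ~ dnorm(mu_[i], 1 / sigma_^2)  # JAGS dnorm takes a precision
  }
}
```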
**Data structure**

- `Intercept_i` instead of `int_i`.
- Require `par_x` if there is not exactly one continuous predictor.
- Name parameters as in `lm` and `brms`, but add `_segmentnumber`.
- `base::qr()`.
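How the R-side model matrices and the `base::qr()` point above might fit together, as a minimal sketch (formulas and variable names are assumptions):

```r
# One design matrix per segment, built R-side from that segment's RHS formula
X1_ = model.matrix(~ 1 + x + z, data = df)
X2_ = model.matrix(~ 0 + x + group, data = df)

# base::qr() can orthogonalize the columns for better sampling;
# coefficients are then back-transformed with qr.R() after fitting.
qr1 = base::qr(X1_)
Q1_ = qr.Q(qr1)
```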
**Modeling, sampling, and summaries**
- `get_formula()` to match the new segment table.
- `get_jagscode()`.
- `run_jags()` and `get_jags_data()` to work with design matrices. One per (dpar, segment) combo.
- `sigma(1 + x * group)` etc. (I think it will work out of the box.)
- `plot_pars()`, `hypothesis()`, `summary()`, `fixef()`, etc.
- `get_summary()`: translate between the code parameter name and the user-facing parameter name.

**Simulated, fitted, and predicted values**
- `fit$simulate(fit, data, par1, par2, ...)`, i.e., add `fit` and replace `par_x` with a `data.frame`/`tibble`.
- `data` must have the correct format. Set factor levels to match the original data.
- Make `fit$simulate()` a wrapper around a lower-level fast function to use internally. Call it `fit$.internal$simulate_vec(par_x, cp_1, ..., rhs_par1, rhs_par2, ...)`. Only the former should do asserts, call `add_simulated()`, etc.
- `ar()`
- Call `simulate_vectorized()` from all internal functions instead of `fit$simulate()`.
- `fitted()`, `predict()`, `pp_check()`, etc.

**mcp_examples**
- `mcp_examples`?

**Plot**
- `plot(fit, facet_by = c("my_rhs", "my_varying_cp"))`. Still default to no facets.
- `color_by = c("my_categorical1", "my_categorical2")`. It defaults to `color_by = "all_categorical"`, i.e., all unique combinations of categorical levels on the RHS. This will also set the grouping for spaghettis. I think that `color_by` should pertain solely to the RHS terms which share change points. Varying change points will not be accepted.
- `plot(fit, effects = "my_categorical1")`. It's like `brms::marginal_effects()`. This should probably be implemented in `tidy_samples()`.
- `plot(fit, filter = data.frame(my_categorical1 = c("levelA", "levelB"), my_categorical2 = "level1"))`. This is like `brms::marginal_effects()`, only `filter` using a `data.frame` replaces `int_conditions`, which is a named list. For variables in `effects` that are not in `filter`, all levels will be included. This should probably be implemented in `tidy_samples()`.
- `plot_pars()`
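One way the proposed `filter` data.frame could behave inside `tidy_samples()` (a sketch of the semantics, not mcp code): restrict via a semi-join, so that variables absent from `filter` automatically keep all their levels:

```r
library(dplyr)

# Hypothetical long-format samples with two categorical RHS variables
samples = tidyr::expand_grid(
  my_categorical1 = c("levelA", "levelB", "levelC"),
  my_categorical2 = c("level1", "level2"),
  estimate = 0)

filter_df = data.frame(my_categorical1 = c("levelA", "levelB"),
                       my_categorical2 = "level1")

# Rows matching any filter row are kept; unlisted variables keep all values
filtered = dplyr::semi_join(samples, filter_df, by = names(filter_df))
```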
**Tests**
- `~exp(1 + x)`.
- `tidy_draws()`