Description
I am just switching over from the slurmR package to this package, because I started getting strange errors with slurmR. rslurm seems to be working well for me so far.

One of the things I need to do is fit a Bayesian model (using cmdstanr) on multiple data sets. The model for each data set will be estimated on its own node, so if I have 50 data sets, I want to estimate 50 models in parallel, but each model estimation needs 4 cores for 4 parallel MCMC chains. I do not think the cpus_per_node argument will help, because my guess is that it will try to put one data set on each node/CPU. I thought processes_per_node might work, but it doesn't seem to, and the documentation isn't entirely clear. (In slurmR I was able to specify mc.cores = 4 so that I could run 4 parallel chains within each node.) Here is the call to cmdstanr's $sample() method where I request 4 parallel chains:
fit <- mod$sample(
  data = data_list,
  refresh = 0,
  chains = 4L,
  parallel_chains = 4L,  # run the 4 chains in parallel on the node
  iter_warmup = 500,
  iter_sampling = 2500,
  step_size = 0.1,
  show_messages = FALSE
)
Thanks so much for any advice you might have. And if you'd prefer I post these questions in another forum, let me know.
Update: It looks like I don't need to specify anything extra in rslurm, as long as the parallel chains (parallel_chains, or the mc.cores option) are requested in my $sample() call.
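For completeness, here is a minimal, untested sketch of how this could be wired up with slurm_apply: one data set per task, with 4 CPUs reserved per task so cmdstanr can run its 4 chains in parallel. The fit_one() helper, the data_sets list, and model.stan are placeholder names; global_objects is called add_objects in older rslurm versions; and the exact behavior of cpus_per_node / processes_per_node may differ by version, so treat this as an assumption rather than a recipe.

library(rslurm)
library(cmdstanr)

# Hypothetical helper: fit the model to the i-th data set on one node.
fit_one <- function(i) {
  mod <- cmdstan_model("model.stan")  # placeholder model file
  fit <- mod$sample(
    data = data_sets[[i]],
    chains = 4L,
    parallel_chains = 4L,  # 4 chains in parallel within this task's CPUs
    iter_warmup = 500,
    iter_sampling = 2500,
    step_size = 0.1,
    refresh = 0,
    show_messages = FALSE
  )
  fit$summary()
}

sjob <- slurm_apply(
  fit_one,
  data.frame(i = seq_along(data_sets)),  # one row (task) per data set
  jobname = "stan_fits",
  nodes = 50,            # spread the 50 tasks over 50 nodes
  cpus_per_node = 4,     # reserve 4 CPUs for each node's task (assumption)
  # processes_per_node = 1,  # newer rslurm versions: one R process per node (assumption)
  global_objects = "data_sets",
  submit = TRUE
)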