Parallelization support #3

Open
the-x-at opened this issue Oct 6, 2016 · 5 comments

@the-x-at (Contributor) commented Oct 6, 2016

Support for parallel simulation computation on a multiprocessor/multicore machine would be great. Limiting the number of cores used should be an optional parameter when running simulations, ideally defaulting to a single job, as many queuing systems have their own load balancing and discourage use of multiple cores for a single job.
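For illustration, such an optional parameter could follow the common Bioconductor convention of a BPPARAM argument defaulting to serial execution. The function below is a hypothetical sketch, not FamAgg's actual API:

library(BiocParallel)

## Hypothetical simulation entry point: defaults to a single job
## (SerialParam); callers can opt in to more cores explicitly.
runSimulations <- function(nsim = 1000, BPPARAM = SerialParam()) {
    bplapply(seq_len(nsim), function(i) {
        mean(rnorm(100))  ## placeholder for one simulation replicate
    }, BPPARAM = BPPARAM)
}

## Single core by default; use e.g.
## runSimulations(100, BPPARAM = MulticoreParam(4)) to parallelize.
res <- runSimulations(100)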

@jorainer (Member) commented Oct 6, 2016

Parallel random sampling might be tricky, but eventually there might be something in BiocParallel for it.
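(Indeed, BiocParallel's backend constructors accept an RNGseed argument that, at least in recent versions, gives the workers reproducible, independent random number streams; a minimal sketch, assuming a current BiocParallel:)

library(BiocParallel)

## RNGseed seeds independent random number streams for the workers, so
## parallel replicates do not reuse the same random numbers.
p <- MulticoreParam(workers = 2, RNGseed = 123)
res <- bplapply(1:4, function(i) rnorm(3), BPPARAM = p)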

@the-x-at (Contributor, Author) commented

Five years have gone by and not much has happened. In the meantime, fixing issue #22 splits the whole simulation into small chunks of short simulations. In theory, this chunking opens a path to parallelization (see the sketch below). On the other hand, this way of running threads in parallel is not appreciated by queuing systems like SLURM, as you gain an advantage over others by using multiple cores. Unless SLURM is informed about this, it will kill the job for excessive CPU usage.
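A hypothetical sketch of that chunked approach (the chunk sizes and the per-replicate body are made up for illustration, not the actual code from #22):

library(BiocParallel)

## Split n.sim simulations into chunks; each chunk is one unit of work.
n.sim <- 10000
chunk.size <- 500
chunks <- split(seq_len(n.sim), ceiling(seq_len(n.sim) / chunk.size))
res <- bplapply(chunks, function(idx) {
    ## run length(idx) short simulations; placeholder replicate body
    vapply(idx, function(i) mean(rnorm(100)), numeric(1))
}, BPPARAM = MulticoreParam(2))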

@jorainer (Member) commented

Parallel processing with SLURM works like a marvel with:

library(BiocParallel)

## Use the number of CPUs SLURM assigned to the job, keeping one core free.
ncores <- as.integer(Sys.getenv("SLURM_JOB_CPUS_PER_NODE", 7)) - 1L
register(MulticoreParam(ncores))

Any subsequent call to bplapply will then by default use this parallel setup with the number of cores assigned by SLURM. The main issue I see is with the random numbers: we would have to ensure that the parallel jobs do not pick up the same random numbers. Anyway, since we're running FamAgg on multiple traits in one analysis, parallelizing by trait is at present my favorite approach.
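A sketch of that by-trait approach (the trait list and the per-trait function are placeholders, not FamAgg's actual API); with the backend registered as above, bplapply picks it up automatically:

library(BiocParallel)

## Placeholder per-trait analysis; in practice this would run the FamAgg
## test for a single trait.
analyzeTrait <- function(trait) sprintf("result for %s", trait)

traits <- c("trait_a", "trait_b", "trait_c")
## bplapply uses the registered backend (the MulticoreParam above) by default.
res <- bplapply(traits, analyzeTrait)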

@the-x-at (Contributor, Author) commented

Looks very simple. It also looks like one has to supply the number of CPUs (cores/threads) used by a single process when submitting a job to SLURM via sbatch -c N, where N is the number of threads to use. This value then ends up in the environment variable SLURM_JOB_CPUS_PER_NODE.
Anyway, at the moment we don't see much need for this, so the issue will remain open, but there are no plans to tackle it.

@jorainer (Member) commented

Yep, exactly: the value passed to sbatch -c N is assigned to the environment variable SLURM_JOB_CPUS_PER_NODE (I guess the other SLURM variables will also be available). And yes, I agree, there's no need to implement anything at present.
