When running multiple metagenomes, it can easily happen that the tasks of one module flood the Slurm queue and thereby "block" other modules, which then get no processing time. A concrete example: I want to run hundreds of metagenomes, and I am mainly interested in MAGs, their quality, and their annotation, but I would also like to compute everything else to look at later. If I start everything together, the plasmid module jobs could get a head start, as they only need bins to run. The plasmid jobs would then flood the queue and block everything else until that module completes.
It would be nice if we could use Slurm's `--nice` flag:
> `--nice[=adjustment]`
> Run the job with an adjusted scheduling priority within Slurm. With no adjustment value the scheduling priority is decreased by 100. A negative nice value increases the priority, otherwise decreases it. The adjustment range is +/- 2147483645. Only privileged users can specify a negative adjustment.
to prioritize jobs in the queue, together with Nextflow's `maxForks`, to make sure that no single type of job fills the whole queue.
Combining the two, we could influence the order of execution, so that the modules a user is most interested in at a given time finish before the others.
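For illustration, a minimal Nextflow config sketch. The label name `plasmid` and the concrete values are assumptions for the example, not part of the pipeline:

```groovy
// Sketch only: deprioritize plasmid jobs and cap their concurrency,
// assuming the module's processes carry a (hypothetical) 'plasmid' label.
process {
    withLabel: 'plasmid' {
        clusterOptions = '--nice=100'   // lower Slurm scheduling priority
        maxForks       = 20             // at most 20 concurrent plasmid tasks
    }
}
```

`clusterOptions` is passed straight through to `sbatch`, so the same mechanism would work for any other Slurm submission option.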
It looks like there is a finite limit on the number of jobs that Nextflow submits at a time (or that Slurm accepts).
If that is the case, using the `--nice` flag alone would not be enough: for example, 400 low-priority jobs could fill the whole queue and leave no room for high-priority ones. Hence the idea of using `maxForks`, to give every job type a chance to make it into the queue.
I understand, but the issue with `maxForks` is that once the high-priority jobs are done, all remaining ones are still restricted by `maxForks` and therefore do not use the underlying infrastructure efficiently.
Why not increase the Nextflow queue size instead? We could add it as a user parameter.
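A sketch of how that could look; the parameter name `queue_size` and the value are assumptions:

```groovy
// Sketch: expose Nextflow's executor queue size as a user parameter.
// 'queue_size' is a hypothetical parameter name; 100 is the usual
// Nextflow default for grid executors.
params.queue_size = 500

executor {
    name      = 'slurm'
    queueSize = params.queue_size   // max jobs Nextflow keeps queued at once
}
```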