-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sfitz add F8 and adjust F16 and F32 configs #256
Conversation
README.md
Outdated
- [Testing and Validation](#testing-and-validation) | ||
- [Test Data Set](#test-data-set) | ||
- [Performance Validation](#performance-validation) | ||
- [Performance Validation and Resource Requirements](#performance-validation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern here was that almost all users want to know what resources are required, but it was somewhat hidden under Performance Validation
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah i think this is a good change, more clearly outlines what they can expect to find
README.md
Outdated
### Performance Validation | ||
Testing was performed in the Boutros Lab SLURM Development cluster. Metrics below will be updated where relevant with additional testing and tuning outputs. Pipeline version used here is v4.0.0-rc.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update the metrics using a more recent pipeline version in another PR
config/F2.config
Outdated
@@ -6,19 +6,19 @@ | |||
process { | |||
withName: run_validate_PipeVal { | |||
cpus = 1 | |||
memory = 1.GB | |||
memory = 2.GB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these processes that require 1 cpu, may as well give half of the memory. Two of these processes can run at a time and no others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The F-series nodes don't have exactly 2xCPUs memory available so this will likely result in only one process of PipeVal running at a time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right! I changed them to 1500.MB
Sorry I am new to this pipeline, @sorelfitzgibbon is there a known reason why EDIT: disregard I see the notes in the README |
main.nf
Outdated
@@ -54,7 +54,18 @@ if (params.max_cpus < 16 || params.max_memory < 30) { | |||
------------------------------------ | |||
ERROR: Insufficient resources: ${params.max_cpus} CPUs and ${params.max_memory} of memory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems you cited 'To run Mutect2 the pipeline requires at least 8 CPUs and 16 GB of memory', so I think the nested if block needs to be changed or updated depending on intended use.
if (params.max_cpus < 16 || params.max_memory < 30) {
if (params.algorithm.contains('muse') || params.algorithm.contains('mutect2'))
currently I think if you specify mutect2 and provide the cited resources it will error.
I might suggest inverting these if blocks to check first for algorithm, and then check the resources meet the minimum requirement for the given algorithm.
if (params.algorithm.contains('muse') && (params.max_cpus < 16 || params.max_memory < 32)){...}
else if (params.algorithm.contains('mutect2') && (params.max_cpus < 8 || params.max_memory < 16)){...}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh good catch @kiarod. The first pair of cutoffs was supposed to be 8 and 16. It was accidentally reverted when I merged in the main
branch using --no-ff
. I need to learn a better way to do this.
Do you think it's worth inverting the blocks? I think it would still require the same number of nested if statements and (trivially) would require evaluation of both every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no worries! you're 100% right. Operationally they are the same thing, so your call on the if-blocks. The motivation for inverting the if-blocks was more so because the purpose of the if-blocks are to check sufficient resource requirement for specific algorithms, so in the case I suggested, there is one check per algorithm so it might be more straightforward to read/update/maintain, but no real performance difference especially considering these blocks are encountered once in a program that basically runs for a minimum of 2 hours haha
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there may be an advantage to this way in that the user is given relevant information for their resources. If it's done by algorithm they may have to fail twice before getting all the information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed only one line of the error message was being printed, so I've updated them.
See new error messages:
/hot/code/sfitzgibbon/gitHub/uclahs-cds/pipeline-call-sSNV/slurm-82745.out
/hot/code/sfitzgibbon/gitHub/uclahs-cds/pipeline-call-sSNV/slurm-82746.out
Generally looks good @sorelfitzgibbon! Added one note about a potentially misbehaving if block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @sorelfitzgibbon!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments:
config/F2.config
Outdated
@@ -6,19 +6,19 @@ | |||
process { | |||
withName: run_validate_PipeVal { | |||
cpus = 1 | |||
memory = 1.GB | |||
memory = 2.GB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The F-series nodes don't have exactly 2xCPUs memory available so this will likely result in only one process of PipeVal running at a time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Description - Configuration and testing for lower resource nodes.
Added
F8.config
and adjustedF16
andF32
. Several tests were run and general conclusions added to README. I plan in a future PR to additionally update thePerformance Validation
section with recent performance data for higher resource nodes.Closes #249
Testing Results
output directory:
/hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-add-F8
call_sSNV_MuSE
(even after retry with all mem - 16G)/hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-add-F8/SACC-exome-all-tools-exome-F8-failed-slurm-81445.out
MuSE
, succeeded, runtime = 2 hr 5 min/hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-add-F8/SACC-exome-all-tools-except-muse-F8-success-slurm-81827.out
call_sSNV_MuSE
(even after retry with all mem - 16G)/hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-add-F8/a_partial-all-tools-F8-failed-slurm-81803.out
MuSE
, succeeded, runtime = 2 days 4 hr 20 min./hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-add-F8/CPCG.noContam-all-tools-except-muse-F8-slurm-81834.out
call_sSNV_MuSE
: 24G + 8G = 32G), runtime = 1 hr 21 min/hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-add-F8/SACC-exome-all-tools-F16-succeed2ndtry-slurm-81846.out
call_sSNV_MuSE
(even after retry with all mem: 32G)/hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-add-F8/a_partial-all-tools-F16-failed-slurm-81873.out
Re-tested after merging in
main
(sfitz-external-parse-bam
branch)muse
is included - as expectedmuse
is excludedChecklist
I have read the code review guidelines and the code review best practice on GitHub check-list.
I have reviewed the Nextflow pipeline standards.
The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
I have set up or verified the branch protection rule following the github standards before opening this pull request.
I have added my name to the contributors listings in the
manifest
block in thenextflow.config
as part of this pull request; I am listed already, or do not wish to be listed. (This acknowledgement is optional.)I have added the changes included in this pull request to the
CHANGELOG.md
under the next release version or unreleased, and updated the date.I have updated the version number in the
metadata.yaml
andmanifest
block of thenextflow.config
file following semver, or the version number has already been updated. (Leave it unchecked if you are unsure about new version number and discuss it with the infrastructure team in this PR.)I have tested the pipeline on at least one A-mini sample.