Optimize Manta CPU #161

Merged
merged 6 commits into from
Jun 25, 2024

Conversation

@Faizal-Eeman Faizal-Eeman commented Jun 24, 2024

Description

Increase the CPU allocation for Manta.

The increments in this PR were primarily adapted from pipeline-call-sSNV and tailored to pipeline-call-gSV's requirements.

Closes #154

Testing Results

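| Node | Sample | BAM Size | Old Runtime (CPU = 1) | New Runtime (CPU = n) |
|------|--------|----------|-----------------------|-----------------------|
| F16 | HG003-5X | 21 GB | 1h 27m | 11m 38s |
| F32 | ILHNLNEV000002-N001-B01-F | 85 GB | 7h 38m | 34m |
| F72 | ZMBNGIAB000008-N001-A01-F | 369 GB | >20h (still running) | 6h 1m |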
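
To make the table easier to compare across nodes, the runtimes can be turned into speedup factors (old runtime divided by new runtime). The snippet below is a minimal sketch using only the numbers reported above; the helper names `to_seconds` and `speedup` are illustrative and not part of the pipeline. The F72 single-CPU run had not finished after 20h, so its figure is only a lower bound.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert an hours/minutes/seconds duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def speedup(old_s, new_s):
    """Speedup factor: how many times faster the new run is."""
    return old_s / new_s


# (node, old runtime with 1 CPU, new runtime), copied from the table.
runs = [
    ("F16", to_seconds(hours=1, minutes=27), to_seconds(minutes=11, seconds=38)),
    ("F32", to_seconds(hours=7, minutes=38), to_seconds(minutes=34)),
    # Old run was still going after 20h, so 20h is a lower bound on its runtime.
    ("F72", to_seconds(hours=20), to_seconds(hours=6, minutes=1)),
]

speedups = {node: round(speedup(old, new), 1) for node, old, new in runs}

for node, value in speedups.items():
    prefix = ">" if node == "F72" else ""
    print(f"{node}: {prefix}{value}x")
# F16: 7.5x
# F32: 13.5x
# F72: >3.3x
```

These factors compare runs on different samples and nodes, so they indicate the size of the improvement per node rather than a scaling curve for Manta.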

output dir: /hot/software/pipeline/pipeline-call-gSV/Nextflow/development/unreleased/mmootor-optimize-manta-cpu/

Checklist

  • I have read the code review guidelines and the code review best practice on GitHub check-list.

  • I have reviewed the Nextflow pipeline standards.

  • The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].

  • I have set up or verified the branch protection rule following the github standards before opening this pull request.

  • I have added my name to the contributors listings in the manifest block in the nextflow.config as part of this pull request, am listed
    already, or do not wish to be listed. (This acknowledgement is optional.)

  • I have added the changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.

  • I have updated the version number in the metadata.yaml and manifest block of the nextflow.config file following semver, or the version number has already been updated. (Leave it unchecked if you are unsure about new version number and discuss it with the infrastructure team in this PR.)

  • I have tested the pipeline on at least one A-mini sample with run_delly = true, run_manta = true, run_qc = true. For run_delly = true, I have tested 'variant_type' set to gSV, gCNV, and both. The paths to the test config files and output directories are captured above in the Testing Results section.

@Faizal-Eeman Faizal-Eeman requested a review from a team as a code owner June 24, 2024 21:47

Bleep bloop, I am a robot.

Alas, some of the Nextflow configuration tests failed!

test/configtest-F16.json

@ ["params","proc_resource_params","call_gSV_Manta","cpus"]
- "1"
+ "6"
@ ["process","withName:call_gSV_Manta","cpus"]
- "1"
+ "6"

If the above changes are surprising, stop and determine what happened.

If the above changes are expected, there are two ways to fix this:

  1. Automatically: Post a comment starting with "/fix-tests" (without the quotes) and I will update the tests for you (you must review my work afterwards).
  2. Manually: Follow these steps on Confluence.
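
The report above lists each changed setting as a key path followed by its old (`-`) and new (`+`) value. As a rough illustration of how such a path-based diff can be computed for two nested configurations, here is a minimal sketch. It assumes both configs are available as parsed JSON dictionaries; the function name `path_diff` and the sample data are hypothetical, and this is not the bot's actual implementation.

```python
import json


def path_diff(old, new, path=()):
    """Yield (path, old_value, new_value) for every leaf that differs."""
    if isinstance(old, dict) and isinstance(new, dict):
        # Walk the union of keys so added and removed entries are reported too.
        for key in sorted(set(old) | set(new)):
            yield from path_diff(old.get(key), new.get(key), path + (key,))
    elif old != new:
        yield path, old, new


# Hypothetical sample data mirroring the two changes reported for F16.
old_cfg = {
    "params": {"proc_resource_params": {"call_gSV_Manta": {"cpus": "1"}}},
    "process": {"withName:call_gSV_Manta": {"cpus": "1"}},
}
new_cfg = {
    "params": {"proc_resource_params": {"call_gSV_Manta": {"cpus": "6"}}},
    "process": {"withName:call_gSV_Manta": {"cpus": "6"}},
}

changes = list(path_diff(old_cfg, new_cfg))

for path, before, after in changes:
    print("@", json.dumps(list(path)))
    print("-", json.dumps(before))
    print("+", json.dumps(after))
```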

@yashpatel6
Contributor

/fix-tests

@yashpatel6 yashpatel6 left a comment

Any reason to only update the F16 and F32 configs and not the rest?

@yashpatel6 yashpatel6 self-assigned this Jun 25, 2024

Bleep bloop, I am a robot.

I have updated all of the failing tests for you with 4b7584b. You must review my work before merging this pull request!

@Faizal-Eeman
Contributor Author

> Any reason to only update the F16 and F32 configs and not the rest?

@yashpatel6

  • The F72 test was still in progress when I opened the PR. It has finished now.
  • I am updating M64 as well, but without a test.
  • Updating F2 wouldn't matter, as Manta doesn't run on F2.

@yashpatel6 yashpatel6 left a comment

Looks good!

@@ -24,7 +24,7 @@ process {
         }
     }
     withName: call_gSV_Manta {
-        cpus = 1
+        cpus = 30

(question: non-blocking) @Faizal-Eeman can you remind us of the optimal number of CPUs based on the existing benchmarking?

@Faizal-Eeman Faizal-Eeman Jun 25, 2024

Based on the paper you shared (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10976732/figure/Fig4/), it seems that the more CPUs Manta gets, the shorter its runtime. I set 30 CPUs so that we still have enough CPUs left to allot to Delly (and other callers) in the future.

@Faizal-Eeman Faizal-Eeman merged commit b0b9d3f into main Jun 25, 2024
7 checks passed
@Faizal-Eeman Faizal-Eeman deleted the mmootor-optimize-manta-cpu branch June 26, 2024 19:32
Development

Successfully merging this pull request may close these issues.

Optimize CPU allocation for Manta
3 participants