Update resource handling #266

Merged: 4 commits, May 12, 2023


yashpatel6
Contributor

Description

Closes #264

Updating some resource allocations:

  • Add a retry mechanism that reruns the alignment process with fewer CPUs
  • Add a retry mechanism for MarkDuplicates with Picard
  • Add a memory difference for Picard/GATK, since MarkDuplicates in alt-aware mode seems to need a few GB more than the memory set through the Java options
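
For readers unfamiliar with how such changes look in a Nextflow config, here is a minimal sketch of the three ideas above. It is not the code from this PR: the process names, the CPU counts, and the `java_mem_diff` parameter are hypothetical, and attaching `60.GB` to the Picard process is an assumption based on the review discussion. Only standard Nextflow directives (`cpus`, `memory`, `errorStrategy`, `maxRetries`) and the `task.attempt` variable are used.

```groovy
// Illustrative sketch only. Process names, CPU counts, and `java_mem_diff`
// are hypothetical and are not taken from the pipeline's actual config.
params {
    // Headroom between the Nextflow memory allocation and the JVM heap
    // (-Xmx), so the Java tool can use a few GB beyond its heap setting.
    java_mem_diff = 5.GB
}

process {
    withName: 'align_DNA_BWA_MEM2' {
        // Dynamic directive: full CPU count on the first attempt,
        // half as many CPUs on the retry.
        cpus = { task.attempt == 1 ? 16 : 8 }
        errorStrategy = 'retry'
        maxRetries = 1
    }

    withName: 'run_MarkDuplicate_Picard' {
        cpus = 1
        memory = 60.GB
        // Rerun the task once if it fails; a second failure
        // falls through to Nextflow's default error handling.
        errorStrategy = 'retry'
        maxRetries = 1
    }
}
```

With a setup like this, the process script would derive its Java heap size from the task allocation minus `params.java_mem_diff` instead of passing the full allocation to `-Xmx`.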

Testing Results

  • BWA-MEM2 - DTB-002T

    • Alignment fails for one pair of FASTQs due to memory; the retry with fewer CPUs succeeds
    • Config: /hot/software/pipeline/pipeline-align-DNA/Nextflow/development/unreleased/yashpatel-update-resource-handling/DTB-002T.config
    • Input: /hot/software/pipeline/pipeline-align-DNA/Nextflow/development/unreleased/yashpatel-update-resource-handling/DTB-002T.csv
    • Output: /hot/software/pipeline/pipeline-align-DNA/Nextflow/development/unreleased/yashpatel-update-resource-handling
  • Tested Picard memory difference - /hot/project/disease/HeadNeckTumor/HNSC-000084-LNMEvolution/pipelines/align-DNA/yashpatel_test

Checklist

  • I have read the code review guidelines and the code review best practice on GitHub check-list.

  • I have reviewed the Nextflow pipeline standards.

  • The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].

  • I have set up the branch protection rule following the github standards before opening this pull request, or the branch protection rule has already been set up.

  • I have added my name to the contributors listings in the manifest block in the nextflow.config as part of this pull request, am listed
    already, or do not wish to be listed. (This acknowledgement is optional.)

  • I have added the changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.

  • I have updated the version number in the metadata.yaml and manifest block of the nextflow.config file following semver, or the version number has already been updated. (Leave it unchecked if you are unsure about new version number and discuss it with the infrastructure team in this PR.)

  • I have tested the pipeline on at least one A-mini sample with aligner setting to BWA-MEM2, HISAT2, and both. The paths to the test config files and output directories were attached in the Testing Results section.

@yashpatel6 yashpatel6 requested a review from a team as a code owner May 4, 2023 19:04
@tyamaguchi-ucla tyamaguchi-ucla self-assigned this May 4, 2023
@tyamaguchi-ucla
Contributor

Looks good to me. @yashpatel6, do we want additional tests from @graceooh or @rhughwhite (or somebody else)?

@yashpatel6
Contributor Author

> Looks good to me. @yashpatel6, do we want additional tests from @graceooh or @rhughwhite (or somebody else)?

I tested with the samples that Jieun and Rupert had errors with, and they ran fine. Nick also had errors related to this branch, so I've asked him to test the fix as well.

@yashpatel6
Contributor Author

Nick ran some tests and they were successful, so we should be good with these allocations.

Contributor

@tyamaguchi-ucla tyamaguchi-ucla left a comment

Looks good. Anything else to add, guys? @rhughwhite @nkwang24

Comment on lines 45 to +46
  cpus = 1
- memory = 10.GB
+ memory = 60.GB
Contributor

This is a huge increase, but I guess some multi-library samples need that much memory. Picard is single-threaded and slow, which is another reason we want to implement #234, although Picard is library-aware.

Yeah, I had ~6 multi-lane samples from the head and neck project (HNSC0000016) that each took ~50 GB for this step. I initially tried Spark but ran out of scratch space.

Contributor Author

Yeah, Picard can actually run with very little memory, but I increased the value here since this process is generally a bottleneck in each of the tool workflows currently. So I kept the allocation at a little under half of the total memory, to use as much as possible while leaving some for other miscellaneous processes like validation and checksum generation.

@yashpatel6 yashpatel6 merged commit 5d2f9a7 into main May 12, 2023
@yashpatel6 yashpatel6 deleted the yashpatel-update-resource-handling branch May 12, 2023 21:18
Development

Successfully merging this pull request may close these issues.

Segmentation Fault in align-DNA
3 participants