
Add example doc with CPU/Mem resources for very large samples #288

Open
Faizal-Eeman opened this issue May 10, 2024 · 8 comments
@Faizal-Eeman

Faizal-Eeman commented May 10, 2024

When running very large sample BAMs through call-sSNV, the pipeline is likely to fail under the default resource configurations.

Although the base_resource_update function in template.config is a great utility for updating resources on a case-by-case basis, it is often unclear how much each resource needs to be increased for a successful run. It would be nice to provide examples of resource configurations that worked for large BAMs, perhaps in a doc/ dir of the repo.

Here are the resources I set in my pipeline run's config,

```
base_resource_update {
    cpus = [
        ['call_sIndel_Manta', 0.1]
    ]
    memory = [
        [['run_validate_PipeVal', 'call_sSNV_Strelka2', 'call_sSNV_Mutect2', 'call_sIndel_Manta', 'concat_VCFs_BCFtools', 'plot_VennDiagram_R', 'run_LearnReadOrientationModel_GATK', 'convert_BAM2Pileup_SAMtools'], 10],
        [['call_sSNV_MuSE', 'run_sump_MuSE'], 2]
    ]
}
```
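For anyone reading along: the second element of each entry appears to act as a multiplier on the process's default allocation (an assumption inferred from the values above, not confirmed against the pipeline source). A toy sketch of that arithmetic, with purely illustrative defaults:

```python
# Hedged sketch of the scaling arithmetic, ASSUMING each
# base_resource_update entry multiplies the process's default
# allocation (inferred from the config above, not confirmed
# against the pipeline source). Default values are illustrative.
defaults = {"call_sIndel_Manta": {"cpus": 8, "memory_gb": 4}}

multipliers = {
    "cpus": {"call_sIndel_Manta": 0.1},
    "memory_gb": {"call_sIndel_Manta": 10},
}

def effective(process, resource):
    """Scale the illustrative default by its configured multiplier."""
    return defaults[process][resource] * multipliers[resource].get(process, 1)

print(effective("call_sIndel_Manta", "memory_gb"))  # 40 under this assumption
```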

Nextflow trace files

Case 1:

  • Normal - 2.5TB
  • Tumor - 1.1TB - /hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/AshkenazimParents/somatic-variants/call-sSNV-7.0.0/HG002-T/log-call-sSNV-7.0.0-20231201T000959Z/nextflow-log/trace.txt

Case 2:

  • Normal - 369GB

  • Tumor 1 - 312GB - /hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/CancerGIAB/metapipeline/metapipeline-DNA-5.3.1/BNCH000122/main_workflow/output/call-sSNV-8.0.0/ZMBNGIAB000008-T001-C01-F/log-call-sSNV-8.0.0-20240421T231004Z/nextflow-log/trace.txt
  • Tumor 2 - 328GB - /hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/CancerGIAB/metapipeline/metapipeline-DNA-5.3.1/BNCH000122/main_workflow/output/call-sSNV-8.0.0/ZMBNGIAB000008-T001-C02-F/log-call-sSNV-8.0.0-20240421T231102Z/nextflow-log/trace.txt
@sorelfitzgibbon
Contributor

@Faizal-Eeman, congrats on getting 2.5/1.1 TB samples through the pipeline! To help future users with large input files, I summarized the maximum resources actually used for each process:

Maximum values:

| Tool | realtime | %cpu* | peak_rss |
| --- | --- | --- | --- |
| run_validate_PipeVal | 16h 24m | 64% | 20 MB |
| call_sSNV_SomaticSniper | 5d 5h 4m | 85% | 18 GB |
| convert_BAM2Pileup_SAMtools | 9d 1h 46m | 90% | 25 GB |
| call_sIndel_Manta | 4d 4h 24m | 239% | 5 GB |
| call_sSNV_Strelka2 | 23h 33m | 2530% | 20 GB |
| call_sSNV_Mutect2 | 1d 7h 32m | 125% | 23 GB |
| run_LearnReadOrientationModel_GATK | 22m | 103% | 28 GB |
| call_sSNV_MuSE | 1d 9h 27m | 1589% | 119 GB |
| run_sump_MuSE | 3m | 172% | 3 GB |

*don't trust the %cpu numbers

@sorelfitzgibbon
Contributor

Maximum values for Case 2 tumors 1 and 2:

| Tumor | Tool | realtime | %cpu* | peak_rss |
| --- | --- | --- | --- | --- |
| tumor1 | call_sSNV_SomaticSniper | 13h 30m | 93% | 3.2 GB |
| tumor2 | call_sSNV_SomaticSniper | 12h 47m | 96% | 2.6 GB |
| tumor1 | convert_BAM2Pileup_SAMtools | 12h 4m | 95% | 13.7 GB |
| tumor2 | convert_BAM2Pileup_SAMtools | 11h 23m | 96% | 13.5 GB |
| tumor1 | call_sIndel_Manta | 1d 12h 52m | 79% | < 1 GB |
| tumor2 | call_sIndel_Manta | 1d 11h 34m | 79% | < 1 GB |
| tumor1 | call_sSNV_Strelka2 | 7h 3m | 703% | 3.3 GB |
| tumor2 | call_sSNV_Strelka2 | 6h 27m | 750% | 2.8 GB |
| tumor1 | call_sSNV_Mutect2 | 6h 40m | 100% | 2.8 GB |
| tumor2 | call_sSNV_Mutect2 | 6h 13m | 100% | 2.5 GB |
| tumor1 | run_LearnReadOrientationModel_GATK | 6m | 98% | 2.7 GB |
| tumor2 | run_LearnReadOrientationModel_GATK | 8m | 98% | 3.2 GB |
| tumor1 | call_sSNV_MuSE | 5h 22m | 1196% | 62.5 GB |
| tumor2 | call_sSNV_MuSE | 6h 10m | 1198% | 67.5 GB |
| tumor1 | run_sump_MuSE | < 1m | 1427% | 9.1 GB |
| tumor2 | run_sump_MuSE | 6m | 1126% | 32.4 GB |

*don't trust the %cpu numbers

@Faizal-Eeman
Author

Faizal-Eeman commented May 14, 2024

Here are the failed logs for the 2TB sample that led me to the CPU/memory updates in the description. Memory allocation was at the defaults for these failed runs; when I identified error code 137, I updated that process's allocation accordingly. I also increased the allocation for processes where I anticipated a 137 memory error.

/hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/AshkenazimParents/somatic-variants/test/failed/call-sSNV-7.0.0/HG002-T/log-call-sSNV-7.0.0-20231102T215524Z
/hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/AshkenazimParents/somatic-variants/test/failed/call-sSNV-7.0.0/HG002-T/log-call-sSNV-7.0.0-20231112T041924Z
/hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/AshkenazimParents/somatic-variants/test/failed/call-sSNV-7.0.0/HG002-T/log-call-sSNV-7.0.0-20231114T003236Z
/hot/data/unregistered/Zook-Mootor-BNCH-GIAB/analysis/GIAB/AshkenazimParents/somatic-variants/test/failed/call-sSNV-7.0.0/HG002-T/log-call-sSNV-7.0.0-20231115T032828Z
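As a side note for anyone triaging similar failures: Nextflow trace files are tab-separated with a header row, so the processes killed with exit status 137 (out-of-memory kill) can be pulled out mechanically. A minimal sketch, using an inline stand-in for a real trace.txt (the column subset and values here are illustrative):

```python
# Minimal sketch: pull out Nextflow trace entries that were killed for
# memory (exit status 137). Trace files are tab-separated with a header
# row; the inline sample below is an illustrative stand-in for trace.txt.
import csv
import io

sample = (
    "name\texit\tpeak_rss\n"
    "call_sSNV_MuSE\t137\t119 GB\n"
    "call_sIndel_Manta\t0\t5 GB\n"
)

oom = [
    (row["name"], row["peak_rss"])
    for row in csv.DictReader(io.StringIO(sample), delimiter="\t")
    if row["exit"] == "137"
]
print(oom)  # [('call_sSNV_MuSE', '119 GB')]
```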

@tyamaguchi-ucla
Contributor

@sorelfitzgibbon @Faizal-Eeman do you guys think it's worth updating the M64 config based on these results? https://github.com/uclahs-cds/pipeline-call-sSNV/blob/main/config/M64.config

@sorelfitzgibbon
Contributor

> @sorelfitzgibbon @Faizal-Eeman do you guys think it's worth updating the M64 config based on these results? https://github.com/uclahs-cds/pipeline-call-sSNV/blob/main/config/M64.config

Yes, it looks like several values can be substantially lowered. I'll work on this.
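As a starting point for that discussion, here is a sketch of per-process overrides in plain Nextflow configuration syntax; the numbers below are illustrative only (loosely informed by the peak_rss values above) and are not taken from the pipeline's actual M64.config:

```
// Illustrative sketch only; not the pipeline's actual M64.config values.
process {
    withName: 'call_sSNV_MuSE' {
        memory = 128.GB  // observed peak_rss above was ~119 GB
    }
    withName: 'call_sSNV_Mutect2' {
        memory = 32.GB   // observed peak_rss above was ~23 GB
    }
}
```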

@sorelfitzgibbon
Contributor

@Faizal-Eeman it looks like these files have moved, are they still easily accessible? I'd like to check a couple little things, but not urgent.

@Faizal-Eeman
Author

Faizal-Eeman commented Jun 4, 2024

> @Faizal-Eeman it looks like these files have moved, are they still easily accessible? I'd like to check a couple little things, but not urgent.

Yes. I've updated the file paths here now.

@tyamaguchi-ucla
Contributor

It looks like our configs are missing a few SomaticSniper processes (e.g. generate_ReadCount_bam_readcount) @sorelfitzgibbon
