Reduce available memory for Picard Mark Duplicates #272

Open: wants to merge 5 commits into dev


ryan-moreno

Right now, the pipeline fails for me with the following error:

Error executing process > 'NFCORE_ATACSEQ:ATACSEQ:MERGED_LIBRARY_MARKDUPLICATES_PICARD:PICARD_MARKDUPLICATES (DMSO_REP4)'

Caused by:
  Process `NFCORE_ATACSEQ:ATACSEQ:MERGED_LIBRARY_MARKDUPLICATES_PICARD:PICARD_MARKDUPLICATES (DMSO_REP4)` terminated with an error exit status (1)

Command executed:

  picard \
      -Xmx36g \
      MarkDuplicates \
      --ASSUME_SORTED true --REMOVE_DUPLICATES false --VALIDATION_STRINGENCY LENIENT --TMP_DIR tmp \
      --INPUT DMSO_REP4.mLb.sorted.bam \
      --OUTPUT DMSO_REP4.mLb.mkD.sorted.bam \
      --REFERENCE_SEQUENCE genome.fa \
      --METRICS_FILE DMSO_REP4.mLb.mkD.sorted.MarkDuplicates.metrics.txt
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_ATACSEQ:ATACSEQ:MERGED_LIBRARY_MARKDUPLICATES_PICARD:PICARD_MARKDUPLICATES":
      picard: $(echo $(picard MarkDuplicates --version 2>&1) | grep -o 'Version:.*' | cut -f2- -d:)
  END_VERSIONS

Command exit status:
  1

Command output:
  Error occurred during initialization of VM
  Could not reserve enough space for 37748736KB object heap

I ran into the same issue with the cutandrun and rnaseq pipelines. In both cases, the fix was to give the JVM only a fraction of the task's available memory (via -Xmx) when launching the process. Here is the relevant change in the cutandrun pipeline. Thanks to @drpatelh for making that change in the cutandrun repo.
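As a rough illustration of the idea (not the actual diff from either pipeline), the heap flag can be derived from the task's memory instead of passing the full amount. The helper name `heap_flag` and the 80% default are assumptions made for this sketch; the fraction used in the real change may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a JVM max-heap flag from the memory granted to a task.
# The 80% default is an assumption for illustration only.
heap_flag() {
    local task_mem_mb=$1      # memory granted to the task, in MiB
    local percent=${2:-80}    # share of that memory to request as JVM heap
    # Integer arithmetic truncates; the remainder stays outside the heap.
    echo "-Xmx$(( task_mem_mb * percent / 100 ))M"
}

# The failing task asked for a 36 GiB heap (37748736 KB = 36864 MiB).
heap_flag 36864
# The result would replace -Xmx36g in the command above, e.g.:
#   picard $(heap_flag 36864) MarkDuplicates ...
```

With the numbers from the error above, this asks for roughly 29 GiB of heap rather than the full 36 GiB, so the JVM no longer tries to reserve everything the task was given.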

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/atacseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@ryan-moreno marked this pull request as ready for review May 1, 2023 16:10
@ryan-moreno
Author

To make this run with my setup, I also had to hard-code the memory allocated in /atacseq/modules/nf-core/picard/collectmultiplemetrics/main.nf as 2g. I'm not sure what the proper way to handle that is.
