Clarify docs on different tximport count files #1366

Merged: 7 commits merged into nf-core:dev on Sep 3, 2024

pmoris (Contributor) commented Aug 29, 2024

I went down a rabbit hole trying to understand the different types of abundance files generated by the rnaseq pipeline. After reading through the various docs (pipeline, tximport, DESeq2) and several GitHub/Slack threads, I think the current wording of the rnaseq docs is confusing, poorly worded, or possibly wrong.

Basically, I believe that the first of the three options described in this section of the docs should actually be bias-uncorrected counts with an offset (or original counts with an offset, as in the tximport docs). I.e., you use one of:

  1. bias-corrected counts through library/length scaling (recommended for things like limma), or
  2. original counts with an offset based on the length matrix (done automatically when loading txi objects via DESeq2's DESeqDataSetFromTximport), or
  3. original counts without offset or scaling, only for cases where the reads don't have length bias (like 3' RNA-seq).

Currently, the docs read as if bias correction and offsets are two separate procedures that can be combined. That confused me quite a bit when I first read it, since it matched neither the explanation in tximport's docs nor the one in the GitHub issue comment that is linked in the docs (#499 (comment)).
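
To make the distinction between the three options above concrete, here is a minimal sketch on made-up numbers. It is plain Python for illustration only, not tximport or DESeq2 code: the function names (`length_scaled_counts`, `length_offsets`) and all values are hypothetical, the scaling follows the `lengthScaledTPM` idea as described in the tximport docs, and the offset is a simplified per-gene centered log-length, which is an assumption rather than the exact quantity DESeq2 computes.

```python
import math

def col_sums(matrix):
    """Per-sample totals for a genes x samples matrix (list of rows)."""
    return [sum(col) for col in zip(*matrix)]

def length_scaled_counts(abundance, lengths, counts):
    """Option 1 (sketch): rebuild counts from abundances (TPM), weighting each
    gene by its mean length across samples, then rescale every sample back to
    its original library size. No offset is used downstream."""
    mean_len = [sum(row) / len(row) for row in lengths]
    raw = [[a * m for a in row] for row, m in zip(abundance, mean_len)]
    target = col_sums(counts)   # original library sizes
    current = col_sums(raw)     # totals before rescaling
    return [[v * t / c for v, t, c in zip(row, target, current)] for row in raw]

def length_offsets(lengths):
    """Option 2 (sketch): keep the original counts and pass a separate
    gene x sample offset. Simplifying assumption: log length, centered per gene."""
    offsets = []
    for row in lengths:
        logs = [math.log(x) for x in row]
        center = sum(logs) / len(logs)
        offsets.append([v - center for v in logs])
    return offsets

# Toy data: 2 genes (rows) x 2 samples (columns). Values are invented.
counts    = [[100.0, 200.0], [300.0, 100.0]]      # original estimated counts
abundance = [[10.0, 30.0], [20.0, 5.0]]           # TPM-like abundances
lengths   = [[1000.0, 2000.0], [1500.0, 1500.0]]  # effective gene lengths

option_1 = length_scaled_counts(abundance, lengths, counts)  # scaled counts, no offset
option_2 = (counts, length_offsets(lengths))                 # original counts + offset
option_3 = counts                                            # original counts only
```

The point of the sketch is that options 1 and 2 are alternatives: the length correction is either baked into the counts or carried alongside the untouched counts as an offset, not both. In the toy data, the second gene has the same length in both samples, so its offset is zero and only the first gene is adjusted.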

I talked about this some more in this Slack thread, if more background is required: https://nfcore.slack.com/archives/CE8SSJV3N/p1724863202019919?thread_ts=1686156473.934039&cid=CE8SSJV3N

PR checklist

Linting fails, but not on any of the files I changed (│ files_unchanged: docs/images/nf-core-rnaseq_logo_dark.png does not match the template). The other checklist items are not applicable.

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/rnaseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

github-actions bot commented Aug 29, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 3587b18

+| ✅ 173 tests passed       |+
#| ❔   9 tests were ignored |#
!| ❗   7 tests had warnings |!

❗ Test warnings:

  • files_exist - File not found: assets/multiqc_config.yml
  • files_exist - File not found: .github/workflows/awstest.yml
  • files_exist - File not found: .github/workflows/awsfulltest.yml
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-09-03 14:57:22

pinin4fjords (Member) left a comment

Thanks for the input! Just clarified/extended a couple of points.

pinin4fjords and others added 3 commits September 3, 2024 14:11
Clarify output files:
- add gene_lengths description
- clarify that scaled output files contain estimated counts from abundances
@pinin4fjords pinin4fjords added this to the 3.15.0 milestone Sep 3, 2024
pmoris (Contributor, Author) commented Sep 3, 2024

Thanks for the review and feedback by the way! Hope this ends up being helpful.

@pinin4fjords pinin4fjords merged commit 0d93da5 into nf-core:dev Sep 3, 2024
24 checks passed