Explicit subsampling step in rnaseq pipeline #1097

Open
ewallace opened this issue Oct 17, 2023 · 1 comment

@ewallace

Description of feature

Subsampling sequencing data before running a pipeline is good practice for testing configurations and failing fast. Letting the user subsample the input data before running the entire pipeline would provide a quicker, inline way to validate that the pipeline runs, troubleshoot problems, and check inputs.

I would like to request optional subsampling as a feature; I think it would save a lot of people a lot of time. Yes, users can subsample data manually and then feed it into the pipeline, but that seems to go against the Nextflow spirit. Having this option inline would let users test-run the pipeline with --subsample-reads 100000, check everything within minutes, and then edit that one parameter to run on all the input data.

This is probably achievable with fq subsample.
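
To make the request concrete, here is a minimal sketch of what such a step would need to do: draw a fixed number of read pairs at random while keeping R1 and R2 in sync. It uses plain reservoir sampling in Python and is only an illustration, not how fq or the pipeline implements anything. The function names read_fastq and subsample_pairs are hypothetical, and n stands in for the value passed to the proposed --subsample-reads parameter.

```python
import random
from itertools import islice


def read_fastq(lines):
    """Yield FASTQ records as 4-line tuples from an iterable of lines."""
    it = iter(lines)
    while True:
        record = tuple(islice(it, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_pairs(r1_lines, r2_lines, n, seed=0):
    """Reservoir-sample up to n read pairs, keeping mates together.

    Hypothetical helper for illustration only. Note that zip() stops at
    the shorter input, so mismatched R1/R2 files are not detected here.
    """
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    reservoir = []
    pairs = zip(read_fastq(r1_lines), read_fastq(r2_lines))
    for i, pair in enumerate(pairs):
        if i < n:
            # Fill the reservoir with the first n pairs.
            reservoir.append((i, pair))
        else:
            # Replace an existing entry with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = (i, pair)
    # Restore the original input order of the selected pairs.
    reservoir.sort(key=lambda item: item[0])
    return [pair for _, pair in reservoir]
```

With real data the two inputs would be open file handles (for example from gzip.open(path, "rt")), and the selected records would be written back out as the inputs to the rest of the pipeline.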

Note that the current (v3.12.0) "subsample" step does not do that, see issue #1095.

Issue #1096 suggests a different workaround, which applies only when using fastp for trimming.

@drpatelh drpatelh added this to the 3.15.0 milestone May 13, 2024
@pinin4fjords pinin4fjords modified the milestones: 3.15.0, 3.16.0 May 29, 2024
@drpatelh
Member

Could be solved by #1096
