Incorporate primer-based binning prior to primer removal #51

Closed
lina-kim opened this issue May 10, 2023 · 2 comments · Fixed by #72

@lina-kim
Collaborator

lina-kim commented May 10, 2023

Primer trimming, originally part of this issue, has been split out into #62. Cutadapt can't be used for binning because its QIIME 2 plugin requires an input artifact of type MultiplexedSingleEndBarcodeInSequence rather than the SampleData[PairedEndSequencesWithQuality] used for sequences downloaded with q2-fondue and typical of those downloaded from the SRA.

Instead, use VSEARCH to bin primers, as suggested on the QIIME 2 Forum. Update: the alignment-only method of VSEARCH isn't wrapped in the QIIME 2 ecosystem!

To incorporate in two steps before denoising:

  • First, demultiplex amplicons by binning reads into each library-amplicon combination
    • e.g. V1V2, V2V3, V3V4, V4V5, V5V7, V7V9, ITS
  • Then primer/adapter trimming and QC (this one with MultiQC output)
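
For illustration, the binning step could be sketched roughly as below. This is a minimal pure-Python sketch, not Cutadapt, VSEARCH, or the workflow's actual implementation: it assigns each read to the first primer that matches the start of the read, honouring IUPAC degenerate codes. The primer sequences, read sequences, and the names `starts_with_primer` and `bin_reads` are all hypothetical.

```python
from collections import defaultdict

# IUPAC nucleotide codes mapped to the concrete bases each may represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def starts_with_primer(read, primer, max_mismatches=0):
    """Return True if the read begins with the (possibly degenerate) primer."""
    if len(read) < len(primer):
        return False
    mismatches = sum(base not in IUPAC[code] for base, code in zip(read, primer))
    return mismatches <= max_mismatches


def bin_reads(reads, primers, max_mismatches=0):
    """Group read IDs by the first primer whose sequence matches the read start."""
    bins = defaultdict(list)
    for read_id, seq in reads:
        for name, primer in primers.items():
            if starts_with_primer(seq, primer, max_mismatches):
                bins[name].append(read_id)
                break
        else:
            # No primer matched this read.
            bins["unassigned"].append(read_id)
    return dict(bins)


# Hypothetical toy primers and reads, far shorter than real ones.
PRIMERS = {"V1V2": "AGRGTT", "V3V4": "CCTAYG"}
READS = [("r1", "AGAGTTCCGGA"), ("r2", "CCTACGTTTAA"), ("r3", "TTTTTTTTTTT")]

print(bin_reads(READS, PRIMERS))
```

A real implementation would also need to handle paired-end reads, reverse primers, and indels, which this sketch ignores.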
@lina-kim lina-kim added core-function Related to core functionality quick-fix Low-hanging fruit labels May 10, 2023
@lina-kim lina-kim added this to the Microbiota Vault v1.0 🪁 milestone May 10, 2023
@lina-kim lina-kim self-assigned this May 10, 2023
@lina-kim lina-kim removed the quick-fix Low-hanging fruit label Jun 5, 2023
@lina-kim lina-kim changed the title Incorporate cutadapt into workflow for QC Incorporate primer-based binning prior to primer removal Jun 9, 2023
@lina-kim
Collaborator Author

lina-kim commented Jun 9, 2023

Might as well close. Makes sense to return to the original plan of custom artifact, split, followed by primer trimming. Unless I hear a strong case for binning?

@lina-kim
Collaborator Author

lina-kim commented Nov 8, 2023

Let's rethink this. The original one-step cutadapt binning method won't be possible without this feature or something along those lines.

If we could perform a simplified binning in a computationally efficient way, though, that would be great. The primary advantage is that we'd be able to run the workflow without requiring extensive input from the user, i.e. attaching a primer name, primer sequence, and truncation length to every single sample. Binning allows the primer name/sequence/truncation length to be supplied completely separately from the samples.
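
To make the input-shape argument concrete, here is a rough sketch of what "primers supplied separately from samples" could look like. Everything in it is an assumption for illustration: `PrimerSpec`, `plan_jobs`, the primer sequences, and the truncation lengths are hypothetical and not taken from the workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerSpec:
    """One row of a hypothetical primer table, independent of any sample."""
    name: str
    sequence: str
    trunc_len: int


# Hypothetical primer table: defined once, not repeated per sample.
PRIMER_TABLE = [
    PrimerSpec("V1V2", "AGRGTT", 200),
    PrimerSpec("V3V4", "CCTAYG", 250),
]

# Samples carry no primer annotations at all.
SAMPLES = ["sampleA", "sampleB"]


def plan_jobs(samples, primer_table):
    """Pair every sample with every primer; binning decides which reads belong where."""
    return [(s, p.name, p.trunc_len) for s in samples for p in primer_table]


for job in plan_jobs(SAMPLES, PRIMER_TABLE):
    print(job)
```

With this layout the user maintains one short primer table, and the binning step, rather than per-sample metadata, determines which reads fall under which primer.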

lina-kim added a commit that referenced this issue Nov 10, 2023
Closes #51 and #62. Introduces Cutadapt to the workflow for both primer
removal and binning. Though it may not be the most computationally
efficient method, the workflow takes a single FASTQ input artifact and
splits it into N artifacts depending on the N primers provided by the
user.

As a result, this removes the former FASTQ split processes. I'm open to
further discussion on whether we should bring that back for efficiency's
sake.