Add info about the sequencing tech TAG and reflect that on the reports #150

abhi18av · 2023-04-04T09:37:20Z

As part of 4-APR meeting.

Focus on homogeneous (single-technology) sequencing datasets
(IN FUTURE) Accommodate hybrid datasets (nanopore/illumina) and reflect that in the final results

@vrennie @TimHHH , where exactly do we need to add this sequencing platform information i.e. which summary files?

TimHHH · 2023-04-04T12:50:15Z

> @vrennie @TimHHH , where exactly do we need to add this sequencing platform information i.e. which summary files?

I would think a column in the summary stats file.

vrennie · 2023-04-11T07:02:50Z

Yes, I agree with Tim, just a column that looks like this:

Sequencing Technology
Illumina
ONT
ONT
Illumina
Illumina
Illumina
...
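
A minimal sketch of appending such a column to a summary stats CSV, under stated assumptions: the helper name `add_technology_column`, the `SAMPLE` key column, and the sample-to-technology mapping are all hypothetical and are not taken from the MAGMA code.

```python
import csv
import io


def add_technology_column(summary_csv, technology_by_sample):
    """Return the CSV text with a 'Sequencing Technology' column appended.

    Assumes (hypothetically) that the summary file has a 'SAMPLE' column
    whose values are keys of `technology_by_sample`.
    """
    reader = csv.DictReader(io.StringIO(summary_csv))
    fieldnames = list(reader.fieldnames) + ["Sequencing Technology"]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # A missing sample raises KeyError instead of silently writing a blank.
        row["Sequencing Technology"] = technology_by_sample[row["SAMPLE"]]
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    stats = "SAMPLE,DEPTH\ns1,50\ns2,30\n"
    print(add_technology_column(stats, {"s1": "Illumina", "s2": "ONT"}))
```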

abhi18av · 2023-04-11T08:36:21Z

Okay, I understand this would be added to the summary stats file 👍

However, there's one more detail worth mentioning here: we currently hard-code the sequencing technology in the bam_rg_string.

MAGMA/workflows/validate_fastqs_wf.nf

Line 30 in 786d13d

    
           bam_rg_string ="@RG\\tID:${flowcell}.${lane}\\tSM:${study}.${sample}\\tPL:illumina\\tLB:lib${library}\\tPU:${flowcell}.${lane}.${index_sequence}"

Should we not add this column to the input-samplesheet as well?

vrennie · 2023-04-11T08:58:11Z

Yes, good catch @abhi18av, let's add this as a column to the samplesheet.

TimHHH · 2023-04-17T09:01:36Z

Yes, ideally the user provides the sequencing technology in the sample sheet and this is then used in the bam_rg_string along the lines of PL:${technology}. The documentation has to be clear that only one technology is allowed per sample sheet.
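
A rough sketch of that idea, written in Python rather than Nextflow so it can be run on its own. The samplesheet keys mirror the variable names in the bam_rg_string line quoted earlier; the `technology` column name and both function names are assumptions for illustration, not the pipeline's actual interface.

```python
def single_technology(rows):
    """Return the one technology used in the samplesheet, or raise if mixed.

    Values are compared case-insensitively and returned in lower case,
    matching the lower-case 'illumina' in the current hard-coded string.
    """
    techs = {row["technology"].strip().lower() for row in rows}
    if len(techs) != 1:
        raise ValueError(
            "exactly one sequencing technology is allowed per samplesheet, "
            f"found: {sorted(techs)}"
        )
    return techs.pop()


def bam_rg_string(row, technology):
    """Build the @RG string with PL taken from the samplesheet."""
    sep = r"\t"  # literal backslash + t, mirroring the original string
    flowcell_lane = f"{row['flowcell']}.{row['lane']}"
    fields = [
        "@RG",
        f"ID:{flowcell_lane}",
        f"SM:{row['study']}.{row['sample']}",
        f"PL:{technology}",
        f"LB:lib{row['library']}",
        f"PU:{flowcell_lane}.{row['index_sequence']}",
    ]
    return sep.join(fields)


if __name__ == "__main__":
    rows = [
        {"study": "S", "sample": "a", "library": "1", "flowcell": "1",
         "lane": "1", "index_sequence": "1", "technology": "illumina"},
    ]
    tech = single_technology(rows)
    print(bam_rg_string(rows[0], tech))
```

Checking the whole samplesheet before building any read-group string means a mixed sheet fails early with a clear message, which is in line with the one-technology-per-samplesheet rule.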

abhi18av · 2023-04-17T16:47:31Z

Guys, what about reflecting that on the actual sample name as well? Something like Shea2017_2021_396.SRR16089406.LNA.A1.ILMN.1.1.1

NCBI currently lists the following sequencing platforms:

ILLUMINA
ION_TORRENT
ABI_SOLID
PACBIO_SMRT
CAPILLARY
OXFORD_NANOPORE
LS454
BGISEQ

To avoid long names, we could perhaps standardize on acronyms like ILMN / ONT / PCB / ION etc. What do you think?
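
For illustration only, a small lookup sketch of such a scheme. Only the four acronyms proposed above are included, and pairing them with the NCBI platform names is an assumption; the function name `platform_tag` and the `abbreviate` flag are hypothetical.

```python
# Assumed pairing of NCBI platform names with the acronyms proposed above.
# Platforms without an agreed acronym are deliberately left out.
PLATFORM_ACRONYMS = {
    "ILLUMINA": "ILMN",
    "OXFORD_NANOPORE": "ONT",
    "PACBIO_SMRT": "PCB",
    "ION_TORRENT": "ION",
}


def platform_tag(platform, abbreviate=True):
    """Return the tag to embed in a sample name for an NCBI platform name.

    With abbreviate=False the full platform name is returned unchanged,
    so either naming choice can be tried.
    """
    if not abbreviate:
        return platform
    try:
        return PLATFORM_ACRONYMS[platform]
    except KeyError:
        raise ValueError(f"no acronym agreed for platform {platform!r}") from None


if __name__ == "__main__":
    print(platform_tag("ILLUMINA"))
    print(platform_tag("OXFORD_NANOPORE", abbreviate=False))
```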

vrennie · 2023-04-17T19:14:06Z

I think unless the full name messes up the .csv, it's better to keep the full name.

abhi18av changed the title ~~Accommodate the nanopore (only) and hybrid datasets and reflect on the final results (nanopore/illumina)~~ Add info about the sequencing tech TAG and reflect that on the reports Apr 4, 2023

abhi18av added the enhancement New feature or request label Apr 4, 2023

abhi18av self-assigned this Apr 4, 2023
