question for constructing samplesheet #349

Closed
pseudacriscrucifer opened this issue Feb 15, 2024 · 2 comments

Comments

@pseudacriscrucifer

Hi,

How would you recommend I assemble a samplesheet for nf-core/atacseq from input fastq.gz files like the following?

12_B9_KOL4052A19_S2_L001_I1_001.fastq.gz
12_B9_KOL4052A19_S2_L001_R1_001.fastq.gz
12_B9_KOL4052A19_S2_L001_R2_001.fastq.gz
12_B9_KOL4052A19_S2_L001_R3_001.fastq.gz
12_B9_KOL4052A19_S2_L002_I1_001.fastq.gz
12_B9_KOL4052A19_S2_L002_R1_001.fastq.gz
12_B9_KOL4052A19_S2_L002_R2_001.fastq.gz
12_B9_KOL4052A19_S2_L002_R3_001.fastq.gz
13_B10_KOL4052A20_S4_L001_I1_001.fastq.gz
13_B10_KOL4052A20_S4_L001_R1_001.fastq.gz
13_B10_KOL4052A20_S4_L001_R2_001.fastq.gz
13_B10_KOL4052A20_S4_L001_R3_001.fastq.gz
13_B10_KOL4052A20_S4_L002_I1_001.fastq.gz
13_B10_KOL4052A20_S4_L002_R1_001.fastq.gz
13_B10_KOL4052A20_S4_L002_R2_001.fastq.gz
13_B10_KOL4052A20_S4_L002_R3_001.fastq.gz

These are paired-end sequencing files from two samples (12_B9 and 13_B10). Any help would be appreciated - I have tried numerous layouts for samplesheet.csv!

@bjlang
Contributor

bjlang commented Mar 20, 2024

Depending on whether Rx refers to biological or technical replicates, your samplesheet would look something like this for biological replicates:

sample,fastq_1,fastq_2,replicate
12_B9,12_B9_KOL4052A19_S2_L001_R1_001.fastq.gz,12_B9_KOL4052A19_S2_L002_R1_001.fastq.gz,1
12_B9,12_B9_KOL4052A19_S2_L001_R2_001.fastq.gz,12_B9_KOL4052A19_S2_L002_R2_001.fastq.gz,2
12_B9,12_B9_KOL4052A19_S2_L001_R3_001.fastq.gz,12_B9_KOL4052A19_S2_L002_R3_001.fastq.gz,3
13_B10,13_B10_KOL4052A20_S4_L001_R1_001.fastq.gz,13_B10_KOL4052A20_S4_L002_R1_001.fastq.gz,1
13_B10,13_B10_KOL4052A20_S4_L001_R2_001.fastq.gz,13_B10_KOL4052A20_S4_L002_R2_001.fastq.gz,2
13_B10,13_B10_KOL4052A20_S4_L001_R3_001.fastq.gz,13_B10_KOL4052A20_S4_L002_R3_001.fastq.gz,3

or, for technical replicates:

sample,fastq_1,fastq_2,replicate
12_B9,12_B9_KOL4052A19_S2_L001_R1_001.fastq.gz,12_B9_KOL4052A19_S2_L002_R1_001.fastq.gz,1
12_B9,12_B9_KOL4052A19_S2_L001_R2_001.fastq.gz,12_B9_KOL4052A19_S2_L002_R2_001.fastq.gz,1
12_B9,12_B9_KOL4052A19_S2_L001_R3_001.fastq.gz,12_B9_KOL4052A19_S2_L002_R3_001.fastq.gz,1
13_B10,13_B10_KOL4052A20_S4_L001_R1_001.fastq.gz,13_B10_KOL4052A20_S4_L002_R1_001.fastq.gz,1
13_B10,13_B10_KOL4052A20_S4_L001_R2_001.fastq.gz,13_B10_KOL4052A20_S4_L002_R2_001.fastq.gz,1
13_B10,13_B10_KOL4052A20_S4_L001_R3_001.fastq.gz,13_B10_KOL4052A20_S4_L002_R3_001.fastq.gz,1

In the latter case (technical replicates) the files will be merged after alignment but before any analysis.
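
If typing the rows by hand is error-prone, the two layouts above can be generated from the file names. The following is only a sketch: `build_samplesheet` and the regular expression are hypothetical helpers, the naming scheme is inferred from the file list in the question, and the column assignment (L001 file in fastq_1, L002 file in fastq_2, Rx as the replicate number) simply copies the layout suggested in this reply. Whether that assignment suits your data is not something the script can decide.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed naming scheme, inferred from the file list in the question:
# <sample>_<library>_S<n>_L<lane>_<read>_001.fastq.gz
PATTERN = re.compile(
    r"^(?P<sample>\d+_[A-Za-z]+\d+)_(?P<library>[^_]+)_S\d+"
    r"_L(?P<lane>\d{3})_(?P<read>[RI]\d)_001\.fastq\.gz$"
)


def build_samplesheet(filenames, biological=True):
    """Return samplesheet CSV text in the layout shown above (hypothetical helper)."""
    lanes = defaultdict(dict)  # (sample, read) -> {lane: filename}
    for name in filenames:
        m = PATTERN.match(name)
        if m is None or not m["read"].startswith("R"):
            continue  # skip unparsable names and the I1 files
        lanes[(m["sample"], m["read"])][m["lane"]] = name

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["sample", "fastq_1", "fastq_2", "replicate"])
    for (sample, read), by_lane in sorted(lanes.items()):
        # Biological replicates: number them after Rx; technical: all share 1.
        replicate = int(read[1:]) if biological else 1
        writer.writerow([sample, by_lane["001"], by_lane["002"], replicate])
    return out.getvalue()
```

Calling `build_samplesheet(names)` gives the first layout and `build_samplesheet(names, biological=False)` the second; a file group missing either lane raises a `KeyError` rather than writing an incomplete row.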

Both examples still omit the *_I1_001.fastq.gz files. If these are further replicates, add them accordingly. However, if they are input controls, run the pipeline with the --with_control flag, and your samplesheet would instead look like

sample,fastq_1,fastq_2,replicate,control,control_replicate
12_B9,12_B9_KOL4052A19_S2_L001_R1_001.fastq.gz,12_B9_KOL4052A19_S2_L002_R1_001.fastq.gz,1,12_B9_INPUT_CTRL,1
12_B9,12_B9_KOL4052A19_S2_L001_R2_001.fastq.gz,12_B9_KOL4052A19_S2_L002_R2_001.fastq.gz,2,12_B9_INPUT_CTRL,1
12_B9,12_B9_KOL4052A19_S2_L001_R3_001.fastq.gz,12_B9_KOL4052A19_S2_L002_R3_001.fastq.gz,3,12_B9_INPUT_CTRL,1
12_B9_INPUT_CTRL,12_B9_KOL4052A19_S2_L001_I1_001.fastq.gz,12_B9_KOL4052A19_S2_L002_I1_001.fastq.gz,1,,
13_B10,13_B10_KOL4052A20_S4_L001_R1_001.fastq.gz,13_B10_KOL4052A20_S4_L002_R1_001.fastq.gz,1,13_B10_INPUT_CTRL,1
13_B10,13_B10_KOL4052A20_S4_L001_R2_001.fastq.gz,13_B10_KOL4052A20_S4_L002_R2_001.fastq.gz,2,13_B10_INPUT_CTRL,1
13_B10,13_B10_KOL4052A20_S4_L001_R3_001.fastq.gz,13_B10_KOL4052A20_S4_L002_R3_001.fastq.gz,3,13_B10_INPUT_CTRL,1
13_B10_INPUT_CTRL,13_B10_KOL4052A20_S4_L001_I1_001.fastq.gz,13_B10_KOL4052A20_S4_L002_I1_001.fastq.gz,1,,
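
With the control columns it is easy to mistype a control name so that it no longer matches any sample row. A small check along these lines might catch that before launching the pipeline. It is a sketch only: `missing_controls` is a hypothetical helper that tests the internal consistency of the CSV and does not reproduce the pipeline's own samplesheet validation.

```python
import csv
import io


def missing_controls(samplesheet_text):
    """List (control, control_replicate) pairs that have no matching sample row."""
    rows = list(csv.DictReader(io.StringIO(samplesheet_text)))
    # Every (sample, replicate) pair that the sheet defines.
    defined = {(r["sample"], r["replicate"]) for r in rows}
    # Every control that some row points at; empty cells are ignored.
    wanted = {
        (r["control"], r["control_replicate"])
        for r in rows
        if r.get("control")
    }
    return sorted(wanted - defined)
```

An empty list means every referenced control is defined in the sheet; any pair returned is a control name or replicate number without a corresponding row.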

@JoseEspinosa
Member

As there has been no further activity on this issue, I will close it now. Feel free to reach out to us if you have any further questions, @pseudacriscrucifer
