This workflow has been tested on Narval but should work on any of the other Digital Research Alliance of Canada systems. With a bit of editing, these scripts should run on any system with Nextflow, FastQC, fastp, SeqKit, and Bioawk installed.
Note: The Nextflow scripts expect compressed (gzipped) FASTQ files.
Note: Nextflow does not accept relative paths for input files. An absolute path such as ~/somewhere/or/other works, as does $(realpath a/relative/path).
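For example, a minimal illustration of both forms (my_sample.fastq.gz and ~/scratch/my_results are placeholder names):
nextflow run <path/to/fastqc_and_fastp.nf> --input_file "$(realpath my_sample.fastq.gz)" --output_dir ~/scratch/my_results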
In an interactive job:
module load StdEnv/2020 nextflow/23.04.3
nextflow run <path/to/fastqc_and_fastp.nf> --input_file <input_file.fastq.gz> --output_dir <path/to/results_folder>
nextflow run <path/to/IgG_filtering.nf> --input_file <fastp-trimmed-file.fastq.gz> --output_dir <path/to/results_folder>
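For a single sample, the two steps chain together roughly as follows (bc01 is a placeholder sample name; the fastp/bc01-trimmed.fastq.gz location and the reuse of the bc01_results folder are inferred from the farm commands further down):
module load StdEnv/2020 nextflow/23.04.3
nextflow run <path/to/fastqc_and_fastp.nf> --input_file "$(realpath bc01.fastq.gz)" --output_dir "$(pwd)/bc01_results"
nextflow run <path/to/IgG_filtering.nf> --input_file "$(pwd)/bc01_results/fastp/bc01-trimmed.fastq.gz" --output_dir "$(pwd)/bc01_results"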
- Check quality and trim sequences using FastQC and fastp
Assuming you've put the basecalled files in
~/scratch/bovine_nanopore_data
and that you are currently in the folder where you want all the results to go.
export SCRIPTS_DIR="path/to/scripts/folder"
module load meta-farm/1.0.2
farm_init.run fastqc_fastp-farm
find ~/scratch/bovine_nanopore_data -name '*.fastq.gz' | parallel --dry-run 'NXF_WORK=$SLURM_TMPDIR/work nextflow run '${SCRIPTS_DIR}'/fastqc_and_fastp.nf --input_file {} --output_dir '$(pwd)/'$(echo "{/.}" | sed "s/.fastq//")_results' > fastqc_fastp-farm/table.dat
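Each line of table.dat should now look roughly like the following, all on one line (bc01 and the absolute paths are placeholders; the $(echo ... | sed ...) part is deliberately left unexpanded and only resolves to bc01_results when meta-farm runs the case):
NXF_WORK=$SLURM_TMPDIR/work nextflow run /path/to/scripts/fastqc_and_fastp.nf --input_file /home/username/scratch/bovine_nanopore_data/bc01.fastq.gz --output_dir /current/working/dir/$(echo "bc01.fastq" | sed "s/.fastq//")_results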
eval cp ${SCRIPTS_DIR}/fastqc_and_fastp-job_script.sh fastqc_fastp-farm/job_script.sh
eval cp ${SCRIPTS_DIR}/single_case.sh fastqc_fastp-farm/single_case.sh
cd fastqc_fastp-farm && submit.run 2
- Run the filtering pipeline on each fastp-trimmed file. This assumes you are still in the fastqc_fastp-farm folder, that the meta-farm module is still loaded, and that the previous jobs have finished.
cd ..
farm_init.run IgG_filtering-farm
find bc*/fastp -name 'bc*trimmed.fastq.gz' | parallel --dry-run 'NXF_WORK=$SLURM_TMPDIR/work nextflow run '${SCRIPTS_DIR}'/IgG_filtering.nf --input_file '$(pwd)'/{} --output_dir '$(pwd)/'$(echo "{/.}" | sed "s/-trimmed.fastq//")_results' > IgG_filtering-farm/table.dat
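As above, each line of this table.dat should look roughly like the following (again with placeholder names and paths); the sed expression strips -trimmed.fastq, so the filtering results land back in the same bc01_results folder:
NXF_WORK=$SLURM_TMPDIR/work nextflow run /path/to/scripts/IgG_filtering.nf --input_file /current/working/dir/bc01_results/fastp/bc01-trimmed.fastq.gz --output_dir /current/working/dir/$(echo "bc01-trimmed.fastq" | sed "s/-trimmed.fastq//")_results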
eval cp ${SCRIPTS_DIR}/IgG_filtering-job_script.sh IgG_filtering-farm/job_script.sh
eval cp ${SCRIPTS_DIR}/single_case.sh IgG_filtering-farm/single_case.sh
cd IgG_filtering-farm && submit.run 2
Use query.run inside the farm folders to check on the overall status of the jobs.
If you have a choice of which def-account_name to use when submitting jobs, you can override the default for the meta-farm jobs by submitting with submit.run 4 '--account def-other_account'