RNASeq I QC, Mapping w STAR
Log into Isaac Next Gen. After your password, type 1, and it will send you a Duo push.
ssh <yourusername>@login.isaac.utk.edu
Go to the project directory
cd /lustre/isaac/proj/UTK0208/rnaseq
Let's take a peek at the raw data:
ls raw_data
- A209 bud 600 chill hours rep1; Prunus persica; RNA-Seq (SRR10269867) - early blooming, bud in ecodormancy
- A209 bud 600 chill hours rep2; Prunus persica; RNA-Seq (SRR10269868) - early blooming, bud in ecodormancy
- A318 bud 600 chill hours rep1; Prunus persica; RNA-Seq (SRR10269871) - late blooming, bud in endodormancy
- A318 bud 600 chill hours rep2; Prunus persica; RNA-Seq (SRR10269872) - late blooming, bud in endodormancy
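To get a feel for the size of each library, you can count the reads in every file. This is a minimal sketch, assuming the files in raw_data end in .fastq.gz and hold standard four-line FASTQ records; the function names count_reads and report_read_counts are made up for illustration.

```shell
# Count reads in a gzipped FASTQ file: each record spans four lines.
count_reads() {
    local lines
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

# Print "<file name><TAB><read count>" for every FASTQ file in a directory.
# The directory defaults to raw_data (assumed layout from the ls above).
report_read_counts() {
    local dir="${1:-raw_data}"
    local fq
    for fq in "$dir"/*.fastq.gz; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$fq" ] || continue
        printf '%s\t%s\n' "$(basename "$fq")" "$(count_reads "$fq")"
    done
}
```

After pasting the functions into your shell, run report_read_counts raw_data from the project directory. The two files of a read pair should report the same count.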
You will also see a directory, analysis, set up for our practice. cd into it and create a directory for your lab work:
cd analysis
mkdir <yourusername>
cd <yourusername>
Let's check the quality of the files. This is worth a look, as the quality stats for RNASeq differ in important ways from those for DNA sequencing.
mkdir 1_fastqc
cd 1_fastqc
ln -s /lustre/isaac/proj/UTK0208/rnaseq/raw_data/*fastq.gz .
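A symlink is created even when its target path is wrong, so a quick check that the links resolve can save a failed job. A minimal sketch follows; the function name list_broken_links is hypothetical.

```shell
# Print every *fastq.gz symlink in a directory whose target does not exist.
# Prints nothing when all links resolve.
list_broken_links() {
    local dir="${1:-.}"
    local f
    for f in "$dir"/*fastq.gz; do
        # -L: the entry is a symlink; ! -e: following it finds no file
        if [ -L "$f" ] && [ ! -e "$f" ]; then
            echo "$f"
        fi
    done
}
```

Run list_broken_links . inside 1_fastqc. No output means every link points at a real file.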
To run fastqc, we need to create an analysis script. Open fastqc.qsub in a text editor and paste in:
#!/bin/bash
#SBATCH -J fastqc
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH -A ISAAC-UTK0208
#SBATCH -p condo-epp622
#SBATCH -q condo
#SBATCH -t 00:30:00
module load fastqc
fastqc *gz
Run on Isaac:
sbatch fastqc.qsub
Monitor:
squeue -u <yourusername>
You can also monitor progress by keeping tabs on the slurm output:
cat slurm-######.out
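Once the job finishes, FastQC leaves one <name>_fastqc.zip per input file, and each archive holds a summary.txt with tab-separated columns: status (PASS, WARN or FAIL), module name, and file name. In RNA-seq data, modules such as "Per base sequence content" and "Sequence Duplication Levels" are commonly flagged even for good libraries, so it helps to see at a glance which modules did not pass. The sketch below assumes the archives have already been unpacked; the function name flag_fastqc_problems is made up for illustration.

```shell
# List every FastQC module that did not PASS, one line per sample and module.
# Assumes each sample has an unpacked <sample>_fastqc/summary.txt whose
# columns are: status <TAB> module name <TAB> input file name.
flag_fastqc_problems() {
    local dir="${1:-.}"
    local summary
    for summary in "$dir"/*_fastqc/summary.txt; do
        # Skip the unexpanded pattern when nothing has been unpacked yet.
        [ -e "$summary" ] || continue
        awk -F'\t' '$1 != "PASS" { printf "%s\t%s\t%s\n", $3, $1, $2 }' "$summary"
    done
}
```

To use it, unpack the archives first with for z in *_fastqc.zip; do unzip -q "$z"; done and then run flag_fastqc_problems . in the same directory.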
Before mapping, STAR needs an index of the reference genome. I already built it, so you do not have to. Here is the script I used:
#!/bin/bash
#SBATCH -J star-index
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH -A ISAAC-UTK0208
#SBATCH -p condo-epp622
#SBATCH -q condo
#SBATCH -t 00:30:00
#SBATCH --mem-per-cpu=16G
module load star
STAR \
--runMode genomeGenerate \
--genomeDir STAR_idx \
--genomeFastaFiles Ppersica_298_v2.0.fa \
--runThreadN 1 \
--genomeSAindexNbases 11 \
--sjdbGTFfile Ppersica_298_v2.1.gene_exons.gff3 \
--sjdbGTFtagExonParentTranscript Parent \
--sjdbOverhang 149
Note the addition of more memory (--mem-per-cpu=16G)! Without it, the job fails with the error: slurmstepd: error: Detected 1 oom-kill event(s) in StepId=261994.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
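Two values in the indexing script depend on the data: --genomeSAindexNbases and --sjdbOverhang. The STAR manual suggests scaling the first down for small genomes to min(14, log2(GenomeLength)/2 - 1) and setting the second to the read length minus 1. The sketch below shows how such values could be derived for another genome or read set; it is not how the values above were chosen, and the function names are hypothetical.

```shell
# Suggest --genomeSAindexNbases from a FASTA file, following the STAR
# manual's rule of thumb: min(14, log2(GenomeLength)/2 - 1), rounded down.
suggest_sa_index_nbases() {
    awk '!/^>/ { n += length($0) }          # sum sequence lines, skip headers
         END {
             if (n == 0) { print "no sequence found" > "/dev/stderr"; exit 1 }
             v = int(log(n) / log(2) / 2 - 1)
             if (v > 14) v = 14
             print v
         }' "$1"
}

# Suggest --sjdbOverhang as (length of the first read) - 1.
# Assumes all reads in the gzipped FASTQ file have the same length.
suggest_sjdb_overhang() {
    gzip -dc "$1" | awk 'NR == 2 { print length($0) - 1; exit }'
}
```

For example, suggest_sa_index_nbases Ppersica_298_v2.0.fa prints the value the rule of thumb gives for the peach genome, and suggest_sjdb_overhang run on any one of the fastq.gz files prints the overhang for that read length.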
Create and move into a new directory, 2_star, then link the fastq.gz files again.
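This repeats the setup used for 1_fastqc, so it can be wrapped in a small helper. A minimal sketch, assuming you start in your own analysis/<yourusername> directory (cd .. out of 1_fastqc first); the function name setup_step_dir is hypothetical.

```shell
# Create a directory for an analysis step, move into it, and link the
# raw reads there. The source directory must be an absolute path so the
# symlinks still resolve from inside the new directory.
setup_step_dir() {
    local step="$1"
    local src="$2"
    mkdir -p "$step" && cd "$step" || return 1
    # -f replaces links left over from an earlier attempt
    ln -sf "$src"/*fastq.gz .
}
```

With the function defined, setup_step_dir 2_star /lustre/isaac/proj/UTK0208/rnaseq/raw_data leaves you inside 2_star with the links in place.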
Let's start by mapping one read pair. Create a STAR.qsh script:
#!/bin/bash
#SBATCH -J star-map
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH -A ISAAC-UTK0208
#SBATCH -p condo-epp622
#SBATCH -q condo
#SBATCH -t 00:30:00
#SBATCH --mem-per-cpu=8G
module load star
STAR \
--genomeDir /lustre/isaac/proj/UTK0208/rnaseq/raw_data/STAR_idx \
--runThreadN 2 \
--readFilesIn EarlyBlommingRep1_1.fastq.gz EarlyBlommingRep1_2.fastq.gz \
--readFilesCommand zcat \
--outFileNamePrefix EarlyBlommingRep1 \
--outSAMtype BAM SortedByCoordinate
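Submit the script with sbatch STAR.qsh. With the prefix used above, STAR writes the sorted alignments to EarlyBlommingRep1Aligned.sortedByCoord.out.bam and a mapping report to EarlyBlommingRep1Log.final.out. Each statistic in that report is a line of the form "name | value", so single values are easy to pull out. The helper below is a sketch for illustration; the function name star_stat is made up.

```shell
# Print one statistic from a STAR Log.final.out file.
#   $1: path to the Log.final.out file
#   $2: text of the statistic name, e.g. "Uniquely mapped reads %"
star_stat() {
    awk -F'|' -v key="$2" '
        index($1, key) {
            # Trim the padding STAR puts around the value.
            gsub(/^[ \t]+|[ \t]+$/, "", $2)
            print $2
            exit
        }' "$1"
}
```

For example, star_stat EarlyBlommingRep1Log.final.out "Uniquely mapped reads %" prints the share of reads that mapped to exactly one place in the genome.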