ivlachos/Manatee1.0

Manatee

Manatee version 1.0

What is Manatee?

Manatee is a tool for detection, quantification and analysis of small ncRNAs 
from next-generation sequencing data.

DEPENDENCIES

  1. Perl
  2. Set::IntervalTree: Perl package
  3. SAMtools: must be installed and added to your PATH
  4. Bowtie: executable included in the Manatee package, no installation required

INSTALLATION (Unix/Linux)

Install the required dependencies, then run the Manatee main script as described in the usage sections.

Set::IntervalTree

cpan

install Set::IntervalTree
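Before running Manatee, it can help to confirm that the dependencies listed above are visible from the shell. The following is a minimal sketch, not part of the Manatee package; the helper names `have_cmd` and `have_perl_module` are hypothetical and introduced only for illustration.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
have_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

# Report whether a Perl module can be loaded (also "missing" if perl is absent).
have_perl_module() {
    if perl -M"$1" -e 1 >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

echo "perl:              $(have_cmd perl)"
echo "samtools:          $(have_cmd samtools)"
echo "Set::IntervalTree: $(have_perl_module Set::IntervalTree)"
```

Any line reporting "missing" points to a dependency that still needs to be installed or added to PATH.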

PACKAGE FILES

The following components are included in the Manatee package.

bowtie-1.0.1       % directory with bowtie aligner

config             % configuration file

Manatee            % Perl core program for sRNA analysis

README.md          % this file

trans-index        % directory where the transcriptome index will be stored

USAGE with configuration file

Syntax:

Manatee -config <file> -i <file> -o <dir>

-config <file>

Path to configuration file.

-i  <file>

Path to pre-processed FASTQ or FASTA file. Valid formats: .fa, .fasta, .fastq, .fq, .fa.gz, .fasta.gz, .fastq.gz, .fq.gz.

-o <dir>

Path to directory where the output will be stored.
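As an illustration of the syntax above, the sketch below checks the input file extension against the formats listed for -i and then prints the resulting command line without running it. The file names and the helper `is_valid_input` are hypothetical; adjust the path to the Manatee script to match your installation.

```shell
#!/bin/sh
# Return 0 if the file name ends in one of the extensions listed for -i.
is_valid_input() {
    case "$1" in
        *.fa|*.fasta|*.fastq|*.fq|*.fa.gz|*.fasta.gz|*.fastq.gz|*.fq.gz) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical paths, used only for illustration.
CONFIG="config"
INPUT="sample_trimmed.fastq.gz"
OUTDIR="manatee_out"

if is_valid_input "$INPUT"; then
    # Dry run: print the command instead of executing it.
    echo "Manatee -config $CONFIG -i $INPUT -o $OUTDIR"
else
    echo "unsupported input format: $INPUT" >&2
fi
```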

USAGE with input parameters

Syntax:

Manatee [OPTIONS] -i <file> -o <dir> -index <ebwt> -genome <file> -annotation <file>

-i <file>

Path to pre-processed FASTQ or FASTA file. Valid formats: .fa, .fasta, .fastq, .fq, .fa.gz, .fasta.gz, .fastq.gz, .fq.gz.

-o <dir>

Path to directory where the output will be stored.

-index <ebwt>

Path and basename of the genome index to be searched. The basename is the name of any of the index files up to but not including the final .1.ebwt/.rev.1.ebwt/etc.
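To make the basename rule concrete, here is a small sketch that strips the Bowtie suffix from an index file path. The function name `index_basename` and the example paths are hypothetical.

```shell
#!/bin/sh
# Strip a trailing .N.ebwt or .rev.N.ebwt suffix from an index file path,
# leaving the value expected by -index.
index_basename() {
    printf '%s\n' "$1" | sed -E 's/(\.rev)?\.[0-9]+\.ebwt$//'
}

index_basename "/data/index/hg38.1.ebwt"      # prints /data/index/hg38
index_basename "/data/index/hg38.rev.2.ebwt"  # prints /data/index/hg38
```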

-genome <file>

Path to genome FA or FASTA file.

-annotation <file>

Path to the non-coding annotation file. The file should contain the following tab-separated fields: chromosome, strand, start locus, end locus, biotype, transcript id, transcript name.
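A quick sanity check of the annotation file can catch formatting problems before a run. The sketch below assumes only what is stated above (seven tab-separated fields) plus two assumptions of its own: that start and end are non-negative integers and that start does not exceed end. The records are hypothetical placeholders, not real annotation data, and `check_annotation` is not part of Manatee.

```shell
#!/bin/sh
# Print "ok" if every line has exactly 7 tab-separated fields with integer
# start/end and start <= end; otherwise print the first offending line number.
check_annotation() {
    awk -F '\t' '
        NF != 7 || $3 !~ /^[0-9]+$/ || $4 !~ /^[0-9]+$/ || $3 + 0 > $4 + 0 {
            print "bad line " NR; bad = 1; exit
        }
        END { if (!bad) print "ok" }
    ' "$1"
}

# Hypothetical records, for illustration only.
ANNOT=$(mktemp)
printf 'chr1\t+\t1000\t1022\tmiRNA\ttx0001\texample-mir-1\n' >  "$ANNOT"
printf 'chr2\t-\t5000\t5075\ttRNA\ttx0002\texample-trna-1\n' >> "$ANNOT"
check_annotation "$ANNOT"
```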

OPTIONS

-t_index <ebwt>

Path and basename of the transcriptome index to be searched. The basename is the name of any of the index files up to but not including the final .1.ebwt/.rev.1.ebwt/etc. If this option is left blank and no index exists, Manatee will generate a transcriptome index from the provided non-coding annotation and store it in the transcriptome index directory (trans-index).

-cores <int>

Number of cores used for alignment (default: -cores 1).

-collapse <yes/no>

Collapse reads with the same sequence. This setting significantly reduces execution time. Possible values: yes/no (default: -collapse yes).

-m <int>

Maximum number of multimapping loci, passed to Bowtie as -m. The mapping algorithm is applied only to reads that map to at most -m loci. Reads that exceed this limit are aligned against the transcriptome (default: -m 50).

-mismatches <int> 

Maximum number of mismatches in genomic alignments (default: -mismatches 1).

-m <int>

Maximum number of multimapping loci, passed to Bowtie as -m. The mapping algorithm is applied only to reads that map to at most -m loci. Reads that exceed this limit are aligned against the transcriptome (default: -m 200).

-s <yes/no>

Strand-specific mode of the algorithm (default: yes).

-cd <int>

Minimum number of unannotated read abundances per cluster (default: -cd 5).

-cdi <int>

Clusters of unannotated reads will be merged if the distance between them is less than or equal to -cdi (default: -cdi 50).
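The effect of -cdi can be pictured with a simple distance-based merge. The sketch below is not Manatee's implementation; it assumes intervals on the same chromosome and strand, sorted by start, and defines distance as the gap between the end of one cluster and the start of the next. The function name `merge_clusters` is hypothetical.

```shell
#!/bin/sh
# Read "start<TAB>end" lines sorted by start and merge neighbours whose gap
# is less than or equal to the given maximum distance.
merge_clusters() {
    awk -v d="$1" -F '\t' '
        NR == 1 { s = $1 + 0; e = $2 + 0; next }
        $1 - e <= d { if ($2 + 0 > e) e = $2 + 0; next }
        { print s "\t" e; s = $1 + 0; e = $2 + 0 }
        END { if (NR > 0) print s "\t" e }
    '
}

# Gap of 30 between the first two clusters (merged), 130 before the third (kept apart).
printf '100\t120\n150\t170\n300\t320\n' | merge_clusters 50
```

With a maximum distance of 50, the first two intervals are reported as a single cluster from 100 to 170, while the third stays separate.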
