Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions

Biopiece: mate_pair_dist

Description

mate_pair_dist calculates the distance between mate-pair reads that have been mapped to one or more sequences, such as a genome or a set of contigs. The resulting distances can be used to assess the integrity of the reads.

Mate-pairs are located by mapping the mate-pair reads against a genome or a set of contigs using one of the mapping tools listed under See also (for example bowtie_seq). The mapped records must contain the following keys:

  • S_ID
  • STRAND
  • Q_ID
  • S_BEG

The input mate-pair reads must have Illumina-type read names, where the ID of the first read ends with /1 and the ID of the second read ends with /2. Input order does not matter.
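The naming convention can be checked before mapping. The following is a minimal Python sketch, not part of Biopieces; the helper name split_mate_id is hypothetical and only illustrates how a read name splits into a shared base ID and a mate number.

```python
def split_mate_id(q_id):
    """Split 'name/1' or 'name/2' into (name, mate_number).

    Hypothetical helper for illustration; raises ValueError for names
    that do not follow the /1 and /2 convention.
    """
    base, sep, mate = q_id.rpartition("/")
    if not sep or not base or mate not in ("1", "2"):
        raise ValueError(f"not an Illumina-type mate-pair read name: {q_id!r}")
    return base, int(mate)


print(split_mate_id("1_ClditxwXsN1/1"))  # ('1_ClditxwXsN1', 1)
```

Two reads belong to the same mate-pair when their base IDs are equal and their mate numbers differ.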

mate_pair_dist then separates the reads based on S_ID and STRAND and, for each Q_ID, outputs all distances between /1 and /2 records. Thus, records of the type below are output if mate-pairs are found:

Q_ID1: 1_ClditxwXsN1/1
S_ID: M1_c1
Q_ID2: 1_ClditxwXsN1/2
DIST: 1210
---
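The grouping described above can be sketched in Python. This is an illustration under stated assumptions, not the actual mate_pair_dist implementation: it assumes that mates are only paired when they share both S_ID and STRAND, and that DIST is the absolute difference of the two S_BEG values. The function name and the sample positions are made up.

```python
from collections import defaultdict


def mate_pair_distances(records):
    """Return one output record per /1 and /2 combination.

    Assumptions (not confirmed by the documentation): mates are paired
    within the same (S_ID, STRAND) group, and DIST = |S_BEG2 - S_BEG1|.
    """
    groups = defaultdict(lambda: {"1": [], "2": []})
    for rec in records:
        base, sep, mate = rec["Q_ID"].rpartition("/")
        if sep and mate in ("1", "2"):
            key = (rec["S_ID"], rec["STRAND"], base)
            groups[key][mate].append(int(rec["S_BEG"]))

    output = []
    for (s_id, _strand, base), begs in groups.items():
        # All distances between /1 and /2 hits of the same read pair.
        for beg1 in begs["1"]:
            for beg2 in begs["2"]:
                output.append({
                    "Q_ID1": base + "/1",
                    "S_ID": s_id,
                    "Q_ID2": base + "/2",
                    "DIST": abs(beg2 - beg1),
                })
    return output


# Made-up mapping positions for illustration only.
hits = [
    {"Q_ID": "1_ClditxwXsN1/1", "S_ID": "M1_c1", "STRAND": "+", "S_BEG": 100},
    {"Q_ID": "1_ClditxwXsN1/2", "S_ID": "M1_c1", "STRAND": "+", "S_BEG": 1310},
]
for rec in mate_pair_distances(hits):
    print(rec)
```

Reads whose mate maps to a different sequence, or is missing altogether, produce no output in this sketch.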

Usage

... | mate_pair_dist [options]

Options

[-?         | --help]               #  Print full usage description.
[-I <file!> | --stream_in=<file!>]  #  Read input from stream file   -  Default=STDIN
[-O <file>  | --stream_out=<file>]  #  Write output to stream file   -  Default=STDOUT
[-v         | --verbose]            #  Verbose output.

Examples

Here is a two-step analysis of mate-pair read integrity. First we index a FASTA file with contigs using create_bowtie_index:

read_fasta -i contigs.fna | create_bowtie_index -d my_dir -i contigs -x

And then we read Illumina reads with read_fastq, map them with bowtie_seq, and determine the mate-pair distances with mate_pair_dist, which we finally plot with plot_lendist:

read_fastq -i reads.fq | bowtie_seq -i my_dir/contigs -m 3 | mate_pair_dist | plot_lendist -k DIST -x
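As an alternative to plotting, the output records can be inspected outside Biopieces. The Python sketch below parses records in the "KEY: VALUE" layout shown in the Description, with each record terminated by a --- line, and collects the DIST values. The function name and the sample text are hypothetical; in practice the text would come from a file written with the -O option.

```python
import statistics


def parse_records(lines):
    """Yield one dict per record from 'KEY: VALUE' lines ending in '---'."""
    record = {}
    for line in lines:
        line = line.rstrip("\n")
        if line == "---":
            if record:
                yield record
            record = {}
        elif ": " in line:
            key, value = line.split(": ", 1)
            record[key] = value
    if record:
        # Tolerate a final record without a closing '---' line.
        yield record


# Made-up sample output for illustration only.
sample = """Q_ID1: 1_ClditxwXsN1/1
S_ID: M1_c1
Q_ID2: 1_ClditxwXsN1/2
DIST: 1210
---
Q_ID1: 2_example/1
S_ID: M1_c1
Q_ID2: 2_example/2
DIST: 1190
---
"""

dists = [int(rec["DIST"]) for rec in parse_records(sample.splitlines())
         if "DIST" in rec]
print(len(dists), statistics.median(dists))
```

A narrow distribution around the expected insert size suggests a consistent library, while many outliers may point to chimeric pairs or misassembled contigs.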

See also

read_fasta

read_fastq

create_bowtie_index

blast_seq

blat_seq

vmatch_seq

patscan_seq

bowtie_seq

plot_lendist

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

[email protected]

September 2009

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

mate_pair_dist is part of the Biopieces framework.

http://www.biopieces.org
