remove_indel_columns

Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions

Biopiece: remove_indel_columns

Description

remove_indel_columns removes alignment columns that contain only indel characters (~, -, . or _) in every sequence in the stream. A column is kept if at least one sequence has a non-indel character at that position.

Usage

... | remove_indel_columns [options]

Options

[-?         | --help]               #  Print full usage description.
[-I <file!> | --stream_in=<file!>]  #  Read input from stream file  -  Default=STDIN
[-O <file>  | --stream_out=<file>]  #  Write output to stream file  -  Default=STDOUT
[-v         | --verbose]            #  Verbose output.

Examples

Consider the following aligned sequences in the FASTA file test.fna:

>test1
_cgta.cgta-ctacg~actcgtacg-
>test2
_cgta.cgta-ctacg~actcgtacg-
>test3
_cgta.cgta-ctacg~actcgtacg-
>test4
_cgta.cgta-ctacg~actcgtacg-
>test5
_cgta.cgtagctacg~actcgtacg-

To remove the indel-only columns, run:

read_fasta -i test.fna | remove_indel_columns

SEQ_NAME: test1
SEQ: cgtacgta-ctacgactcgtacg
SEQ_LEN: 23
---
SEQ_NAME: test2
SEQ: cgtacgta-ctacgactcgtacg
SEQ_LEN: 23
---
SEQ_NAME: test3
SEQ: cgtacgta-ctacgactcgtacg
SEQ_LEN: 23
---
SEQ_NAME: test4
SEQ: cgtacgta-ctacgactcgtacg
SEQ_LEN: 23
---
SEQ_NAME: test5
SEQ: cgtacgtagctacgactcgtacg
SEQ_LEN: 23
---
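
The column filtering shown in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not the Biopieces implementation: the function name `remove_indel_columns`, the constant `INDEL_CHARS`, and the requirement that all sequences have equal length are assumptions made for the sketch.

```python
# Characters treated as indels, taken from the description above.
INDEL_CHARS = set("~-._")


def remove_indel_columns(seqs):
    """Drop every column in which all sequences carry an indel character.

    Hypothetical helper for illustration; assumes equal-length sequences.
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    # Keep a column if at least one sequence has a non-indel character there.
    keep = [i for i in range(length)
            if any(s[i] not in INDEL_CHARS for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


# The five aligned sequences from test.fna above.
aligned = [
    "_cgta.cgta-ctacg~actcgtacg-",
    "_cgta.cgta-ctacg~actcgtacg-",
    "_cgta.cgta-ctacg~actcgtacg-",
    "_cgta.cgta-ctacg~actcgtacg-",
    "_cgta.cgtagctacg~actcgtacg-",
]

for seq in remove_indel_columns(aligned):
    print(seq, len(seq))
```

Columns 1, 6, 17 and 27 hold an indel character in all five sequences and are dropped, which takes the length from 27 to 23. Column 11 is kept because test5 has a g at that position, so the - in test1 to test4 survives.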

See also

read_fasta

remove_indels

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

[email protected]

December 2011

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

remove_indel_columns is part of the Biopieces framework.

http://www.biopieces.org
