intersect_records
intersect_records intersects records in the stream based on overlapping intervals. Intersection is done by
splitting the stream and intersecting all records that have a specific key with all records that lack that key.
Intersections are found by locating overlapping intervals of S_BEG and S_END positions with the same S_ID and, optionally,
the same STRAND. If an overlap is found, the record without the specific key is emitted to the stream, unless the --inverse
switch is set, in which case the non-intersecting records are emitted instead.
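The overlap rule described above can be sketched in Python. This is only an illustration, not the Biopieces implementation: the function name `overlaps` and the `same_strand` parameter are hypothetical, and the sketch assumes S_BEG and S_END are inclusive coordinates, which the text above does not state.

```python
def overlaps(a, b, same_strand=False):
    """Return True if records a and b share S_ID and their intervals overlap.

    Assumption: S_BEG and S_END are inclusive positions.
    """
    if a["S_ID"] != b["S_ID"]:
        return False
    # Optional strand check, mirroring the idea behind the --strand switch.
    if same_strand and a["STRAND"] != b["STRAND"]:
        return False
    # Two closed intervals overlap when each begins no later than the other ends.
    return a["S_BEG"] <= b["S_END"] and b["S_BEG"] <= a["S_END"]


foo = {"S_ID": "ID000001", "S_BEG": 500, "S_END": 800, "STRAND": "+"}
bar = {"S_ID": "ID000001", "S_BEG": 600, "S_END": 700, "STRAND": "-"}

print(overlaps(foo, bar))                    # True
print(overlaps(foo, bar, same_strand=True))  # False
```

With the strand check enabled, the two records no longer count as overlapping because they lie on opposite strands.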
... | intersect_records <-k key> [options]
[-? | --help] # Print full usage description.
[-k <string> | --key=<string>] # Key used for intersection.
[-s | --strand] # Only intervals on the same strand are intersected.
[-i | --inverse] # Only non-intersecting records are emitted.
[-I <file!> | --stream_in=<file!>] # Read input from stream file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output to stream file - Default=STDOUT
[-v | --verbose] # Verbose output.
Consider the following two test files, foo.tab:
#S_ID S_BEG S_END STRAND
ID000001 100 400 +
ID000001 500 800 +
and bar.tab:
#S_ID S_BEG S_END STRAND
ID000001 200 300 +
ID000001 600 700 -
To intersect foo.tab with bar.tab so that records from bar.tab are emitted if they intersect with records in foo.tab, we use read_tab like this, tagging the foo.tab records with the INTERSECT key via add_ident before bar.tab is read:
read_tab -i foo.tab | add_ident -k INTERSECT | read_tab -i bar.tab | intersect_records -k INTERSECT
STRAND: +
S_ID: ID000001
S_BEG: 200
S_END: 300
---
STRAND: -
S_ID: ID000001
S_BEG: 600
S_END: 700
---
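The stream splitting behind this result can be modelled with a short Python sketch. It is an approximation under stated assumptions, not the actual Biopieces code: the helper names `overlaps` and `intersect` are hypothetical, intervals are assumed to be inclusive, and the integer stored under INTERSECT merely stands in for whatever identifier add_ident really writes.

```python
KEY = "INTERSECT"  # key name passed with -k in the example pipeline


def overlaps(a, b):
    """Same S_ID and overlapping intervals (assumed inclusive)."""
    return (a["S_ID"] == b["S_ID"]
            and a["S_BEG"] <= b["S_END"]
            and b["S_BEG"] <= a["S_END"])


def intersect(stream, key):
    """Split the stream on the key; emit keyless records that overlap a keyed one."""
    stream = list(stream)
    with_key = [r for r in stream if key in r]
    without_key = [r for r in stream if key not in r]
    return [r for r in without_key if any(overlaps(r, k) for k in with_key)]


foo = [
    {"S_ID": "ID000001", "S_BEG": 100, "S_END": 400, "STRAND": "+"},
    {"S_ID": "ID000001", "S_BEG": 500, "S_END": 800, "STRAND": "+"},
]
bar = [
    {"S_ID": "ID000001", "S_BEG": 200, "S_END": 300, "STRAND": "+"},
    {"S_ID": "ID000001", "S_BEG": 600, "S_END": 700, "STRAND": "-"},
]

# Tag the foo records (placeholder identifier values), then append bar untagged.
stream = [dict(r, **{KEY: i}) for i, r in enumerate(foo)] + bar

result = intersect(stream, KEY)
for rec in result:
    print(rec["S_BEG"], rec["S_END"], rec["STRAND"])
```

Both bar.tab records are emitted because strand is ignored in this mode, which matches the output shown above.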
Now, to intersect in a strand-dependent manner, use the -s switch:
read_tab -i foo.tab | add_ident -k INTERSECT | read_tab -i bar.tab | intersect_records -k INTERSECT -s
STRAND: +
S_ID: ID000001
S_BEG: 200
S_END: 300
---
And to invert the result so that only non-intersecting records are emitted, use the -i switch (combined here with -s):
read_tab -i foo.tab | add_ident -k INTERSECT | read_tab -i bar.tab | intersect_records -k INTERSECT -si
STRAND: -
S_ID: ID000001
S_BEG: 600
S_END: 700
---
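How the two switches interact can be checked with a small Python model. This is a sketch under assumptions rather than the tool's own logic: the function `intersect` and its `same_strand` and `inverse` parameters are hypothetical stand-ins for -s and -i, intervals are assumed inclusive, and the value stored under INTERSECT is a placeholder.

```python
def intersect(stream, key, same_strand=False, inverse=False):
    """Emit keyless records that overlap (or, if inverse, do not overlap) a keyed record."""
    stream = list(stream)
    with_key = [r for r in stream if key in r]
    without_key = [r for r in stream if key not in r]

    def hit(a, b):
        if a["S_ID"] != b["S_ID"]:
            return False
        if same_strand and a["STRAND"] != b["STRAND"]:
            return False
        return a["S_BEG"] <= b["S_END"] and b["S_BEG"] <= a["S_END"]

    # A record is emitted when its overlap status differs from the inverse flag.
    return [r for r in without_key
            if any(hit(r, k) for k in with_key) != inverse]


stream = [
    {"S_ID": "ID000001", "S_BEG": 100, "S_END": 400, "STRAND": "+", "INTERSECT": 0},
    {"S_ID": "ID000001", "S_BEG": 500, "S_END": 800, "STRAND": "+", "INTERSECT": 1},
    {"S_ID": "ID000001", "S_BEG": 200, "S_END": 300, "STRAND": "+"},
    {"S_ID": "ID000001", "S_BEG": 600, "S_END": 700, "STRAND": "-"},
]


def spans(records):
    return [(r["S_BEG"], r["S_END"]) for r in records]


strand_only = spans(intersect(stream, "INTERSECT", same_strand=True))
strand_inverse = spans(intersect(stream, "INTERSECT", same_strand=True, inverse=True))
inverse_only = spans(intersect(stream, "INTERSECT", inverse=True))

print(strand_only)     # [(200, 300)]
print(strand_inverse)  # [(600, 700)]
print(inverse_only)    # []
```

In this model the inverse flag on its own yields nothing for the test files, since both bar.tab records overlap a foo.tab record when strand is ignored. Only in combination with the strand check does the minus-strand record count as non-intersecting.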
Martin Asser Hansen - Copyright (C) - All rights reserved.
December 2009
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
intersect_records is part of the Biopieces framework.