-
Notifications
You must be signed in to change notification settings - Fork 6
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
137 changed files
with
6,998 additions
and
3,292 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,149 @@ | ||
Commands from section 'annotation' | ||
------------------------------------ | ||
|
||
|
||
closest_gn_to_feat | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
**Description:** Find the n closest genes/transcripts for each peak (or the oppposite). | ||
|
||
**Example:** Find the closest tss to a set of peak | ||
|
||
.. command-output:: gtftk closest_gn_to_feat -t tss -r simple_peaks.bed6 -i simple.gtf -c simple.chromInfo -p 10 -K toto -n transcript_id,gene_id | ||
:shell: | ||
|
||
**Example:** Find the closest tss to a set of peak. Use the gene-centric and uncollapsed outout. | ||
|
||
.. command-output:: gtftk closest_gn_to_feat -t tss -r simple_peaks.bed6 -i simple.gtf -c simple.chromInfo -p 10 -K toto -n transcript_id,gene_id -gu | ||
:shell: | ||
|
||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk closest_gn_to_feat -h | ||
:shell: | ||
|
||
|
||
closest_genes | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
**Description:** Find the n closest genes for each transcript. | ||
|
||
**Example:** | ||
|
||
.. command-output:: gtftk get_example | bedtools sort | gtftk closest_genes -f | ||
:shell: | ||
|
||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk closest_genes -h | ||
:shell: | ||
|
||
|
||
overlapping | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
**Description:** Find transcripts whose body/TSS/TTS region extended in 5' and 3' (-u/-d) overlaps with any transcript from another gene. Strandness is not considered by default. Used --invert-match to find those that do not overlap. If --annotate-gtf is used, all lines of the input GTF file will be printed and a new key containing the list of overlapping transcripts will be added to the transcript features/lines (key will be 'overlapping_*' with * one of body/TSS/TTS). The --annotate-gtf and --invert-match arguments are mutually exclusive. | ||
|
||
|
||
**Example:** Find transcript whose promoter overlap transcript from other genes. | ||
|
||
.. command-output:: gtftk get_example -f chromInfo > simple_join_chromInfo.txt; gtftk get_example | gtftk overlapping -c simple_join_chromInfo.txt -t promoter -u 10 -d 10 -a | gtftk select_by_key -k feature -v transcript | gtftk tabulate -k transcript_id,overlap_promoter_u0.01k_d0.01k | head | ||
:shell: | ||
|
||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk overlapping -h | ||
:shell: | ||
|
||
------------------------------------------------------------------------------------------------------------------ | ||
|
||
divergent | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
**Description:** Find transcript with divergent promoters. These transcripts will be defined here | ||
as those whose promoter region (defined by -u/-d) overlaps with the tss of | ||
another gene in reverse/antisens orientation. This may be useful to select | ||
coding genes in head-to-head orientation or LUAT as described in "Divergent | ||
transcription is associated with promoters of transcriptional regulators" | ||
(Lepoivre C, BMC Genomics, 2013). The ouput is a GTF with an additional key | ||
('divergent') whose value is set to '.' if the gene has no antisens transcript | ||
in its promoter region. If the gene has an antisens transcript in its promoter | ||
region the 'divergent' key is set to the identifier of the transcript whose tss | ||
is the closest relative to the considered promoter. The tss to tss distance is | ||
also provided as an additional key (dist_to_divergent). | ||
|
||
|
||
**Example:** Flag divergent transcripts in the example dataset. Select them and produce a tabulated output. | ||
|
||
.. command-output:: gtftk get_example -f chromInfo > simple_join_chromInfo.txt; gtftk get_example | gtftk divergent -c simple_join_chromInfo.txt -u 10 -d 10| gtftk select_by_key -k feature -v transcript | gtftk tabulate -k transcript_id,divergent,dist_to_divergent | head -n 7 | ||
:shell: | ||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk divergent -h | ||
:shell: | ||
|
||
------------------------------------------------------------------------------------------------------------------ | ||
|
||
convergent | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
**Description:** Find transcript with convergent tts. These transcripts will be defined here | ||
as those whose tts region (defined by -u/-d) overlaps with the tts of | ||
another gene in reverse/antisens orientation. The ouput is a GTF with an | ||
additional key ('convergent') whose value is set to '.' if the gene has no | ||
convergent transcript in its tts region. If the gene has an antisens transcript | ||
in its tts region the 'convergent' key is set to the identifier of the | ||
transcript whose tts is the closest relative to the considered tts. | ||
The tts to tts distance is also provided as an additional key (dist_to_convergent). | ||
|
||
|
||
**Example:** Flag divergent transcripts in the example dataset. Select them and produce a tabulated output. | ||
|
||
.. command-output:: gtftk get_example -f chromInfo > simple_join_chromInfo.txt; gtftk get_example | gtftk convergent -c simple_join_chromInfo.txt -u 25 -d 25| gtftk select_by_key -k feature -v transcript | gtftk tabulate -k transcript_id,convergent,dist_to_convergent| head -n 4 | ||
:shell: | ||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk convergent -h | ||
:shell: | ||
|
||
------------------------------------------------------------------------------------------------------------------ | ||
|
||
exon_sizes | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
**Description:** Add a new key to transcript features containing a comma separated list of exon sizes. | ||
|
||
|
||
**Example:** | ||
|
||
.. command-output:: gtftk get_example | gtftk exon_sizes | gtftk select_by_key -t | ||
:shell: | ||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk exon_sizes -h | ||
:shell: | ||
|
||
------------------------------------------------------------------------------------------------------------------ | ||
|
||
|
||
intron_sizes | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
**Description:** Add a new key to transcript features containing a comma separated list of intron sizes. | ||
|
||
|
||
**Example:** | ||
|
||
.. command-output:: gtftk get_example | gtftk intron_sizes | gtftk select_by_key -t | ||
:shell: | ||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk intron_sizes -h | ||
:shell: |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
Commands from section 'convertion' | ||
----------------------------------- | ||
|
||
convert | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
**Description:** This command can be used to convert to various formats. Currently only a limited number is supported. | ||
|
||
* **bed**: classical bed6 format. | ||
* **bed6**: classical bed6 format. | ||
* **bed3**: bed3 format. | ||
|
||
|
||
**Example:** Get the gene features and convert them to bed6. | ||
|
||
.. command-output:: gtftk get_example | gtftk select_by_key -k feature -v gene | gtftk convert -n gene_id -f bed6| head -n 3 | ||
:shell: | ||
|
||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk convert -h | ||
:shell: | ||
|
||
------------------------------------------------------------------------------------------------------------------ | ||
|
||
tabulate | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
**Description:** Extract key/values from the GTF and convert them to tabulated format. When requesting coordinates they will be provided in 1-based format. | ||
|
||
|
||
**Example:** Simply get the list of transcripts and gene. | ||
|
||
.. command-output:: gtftk get_example -f gtf | gtftk select_by_key -k feature -v transcript| gtftk tabulate -k gene_id,transcript_id -s "|" | ||
:shell: | ||
|
||
.. warning:: By default tabulate will discard any line for which one of the selected key is not defined. Use -x (--accept-undef) to print them. | ||
|
||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk tabulate -h | ||
:shell: | ||
|
||
------------------------------------------------------------------------------------------------------------------ | ||
|
||
|
||
bed_to_gtf | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
|
||
**Description:** Convert a bed file to gtf-like format. | ||
|
||
**Example:** | ||
|
||
.. command-output:: gtftk get_example |gtftk convert| gtftk bed_to_gtf -t transcript | head -n 5 | ||
:shell: | ||
|
||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk bed_to_gtf -h | ||
:shell: | ||
|
||
|
||
------------------------------------------------------------------------------------------------------------------ | ||
|
||
|
||
convert_ensembl | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
|
||
**Description:** Convert the GTF file to ensembl format. Essentially add 'transcript'/'gene' features. | ||
|
||
**Example:** Delete gene and transcript feature. Regenerate them. | ||
|
||
.. command-output:: gtftk get_example | gtftk select_by_key -k feature -v gene,transcript -n| gtftk convert_ensembl | gtftk select_by_key -k gene_id -v G0001 | ||
:shell: | ||
|
||
|
||
**Arguments:** | ||
|
||
.. command-output:: gtftk bed_to_gtf -h | ||
:shell: |
Oops, something went wrong.