adaptation from issues #13 #14
lpantano committed Aug 9, 2017
1 parent a93387d commit 5125b18
Showing 1 changed file with 19 additions and 15 deletions.
34 changes: 19 additions & 15 deletions format/definition.md
@@ -21,25 +21,29 @@ Note: Keep in mind this is for the output of a pipeline, so we know there will b
Please add a description for each column/attribute

* header:
* database: `##source-ontology LINK TO DATABASE` include version
* commands used to generate the file. At least information about adapter removal and filtering
* database: `##source-ontology LINK TO DATABASE` include version and link
* commands used to generate the file. At least information about adapter removal, filtering, aligner, mirna tool. All of them starting like: `## CMD: `
* genome version used (maybe try to get from BAM file if GFF3 generated from it)
* sample names used in attribute:Expression
* column1: seqID:
* column2: source: databases used for the annotation (miRBase, mirDBgene,tRNA...etc): https://github.com/miRTop/incubator/issues/13
* column3: type: ref_miRNA, isomiRs: https://github.com/miRTop/incubator/issues/13 (SO:0002166 ref_miRNA and SO:0002167)
* column4/5: start/end: question about precursor position or genomic position?
* sample names used in attribute:Expression: `## colData:` separated by spaces
* small RNA GFF version `## version: 0.9`
* column1: seqID: precursor name
* column2: source: databases (lower case) used for the annotation (miRBase, mirDBgene,tRNA...etc): https://github.com/miRTop/incubator/issues/13. With the version number after `_` character: `mirbase_21`
* column3: type: `ref_miRNA, isomiR`: https://github.com/miRTop/incubator/issues/13 (SO:0002166 ref_miRNA and SO:0002167 isomiR)
* column4/5: start/end: precursor start/end as indicated by alignment tool
* column6: score:
* column7: strand:
* column7: strand
* column8: phase: (For features of type "CDS", the phase indicates where the feature begins with reference to the reading frame)
* column9: attributes
* ID: unique ID based on sequence like mintmap has for tRNA: prefix-22-BZBZOS4Y1 (https://github.com/TJU-CMC-Org/MINTmap/tree/master/MINTplates). good way to use it as cross-mapper ID between different naming or future changes.
* Name:
* column9: attributes:
* ID: unique ID based on the sequence, like the one MINTmap has for tRNA: prefix-22-BZBZOS4Y1 (https://github.com/TJU-CMC-Org/MINTmap/tree/master/MINTplates). It is a good way to cross-map IDs between different naming schemes or future changes. The tool will implement this, so an API can be used to fill this field.
* Name: mature name
* Parent: hairpin precursor name
* Alias: get names from miRBase/miRgeneDB
* Expression: raw counts separated by `,`
* Filter: PASS or REJECT (this allows keeping all the data and selecting the features you really want to consider valid)

* Variant: categorical types: iso_5p, iso_3p, iso_snp(_seed/_central_supp), iso_add (adapted from isomiR-SEA)
* Cigar: CIGAR string as indicated here: []
* Alias: get names from miRBase/miRgeneDB or other database separated by `,`
* Genomic: positions on the genome in the following format: `chr:start-end,chr:start-end`
* Expression: raw counts separated by `,`. They should be in the same order as `colData` in the header.
* Filter: PASS or REJECT (this allows keeping all the data and selecting the features you really want to consider valid). PASS can have subclasses, e.g. `PASS:te`, meaning the sequence passes but the tool considers the variants shown here untrustworthy. REJECT can be followed by any short word explaining why it was rejected: `REJECT:lowcounts`. In this case the sequence will be skipped when mining the file, for example when querying counts or summarizing miRNA expression.
* Seed_fam: nucleotides 2-8 and the reference miRNA sharing that seed. Useful for looking up pre-computed target predictions: `ATGCTGT:mir34a_5p`
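
To make the layout above concrete, here is a minimal Python sketch of how a reader might parse the header and one record. It is not part of the proposal and rests on assumptions: the attribute column is assumed to use GFF3-style `key=value` pairs separated by `;` (the definition above does not fix the separator), columns are assumed to be tab-separated, and the function names `parse_header` and `parse_record`, the `## CMD:` text, the sample names and the coordinates are all made up for illustration.

```python
def parse_header(lines):
    """Collect the `## CMD:`, `## colData:` and `## version:` header fields."""
    header = {"commands": [], "samples": [], "version": None}
    for line in lines:
        if not line.startswith("##"):
            continue
        body = line[2:].strip()
        if body.startswith("CMD:"):
            header["commands"].append(body[len("CMD:"):].strip())
        elif body.startswith("colData:"):
            # Sample names are separated by spaces.
            header["samples"] = body[len("colData:"):].split()
        elif body.startswith("version:"):
            header["version"] = body[len("version:"):].strip()
    return header


def parse_record(line, samples):
    """Split one 9-column line and pair Expression counts with sample names."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    # ASSUMPTION: attributes are `key=value` pairs separated by `;`.
    attrs = {}
    for field in cols[8].split(";"):
        field = field.strip()
        if field:
            key, _, value = field.partition("=")
            attrs[key.strip()] = value.strip()
    counts = [int(c) for c in attrs.get("Expression", "").split(",") if c]
    if len(counts) != len(samples):
        raise ValueError("Expression must have one count per colData sample")
    # Filter is PASS or REJECT, optionally followed by `:` and a short reason.
    status, _, reason = attrs.get("Filter", "").partition(":")
    return {
        "seqid": cols[0],
        "source": cols[1],
        "type": cols[2],
        "start": int(cols[3]),
        "end": int(cols[4]),
        "strand": cols[6],
        "attributes": attrs,
        "expression": dict(zip(samples, counts)),
        "filter_status": status,
        "filter_reason": reason or None,
    }


# Made-up example data; only the ID and the REJECT:lowcounts value come from the text above.
header_lines = [
    "## CMD: example_aligner --hypothetical-option",
    "## colData: sampleA sampleB",
    "## version: 0.9",
]
record_line = (
    "hsa-let-7a-1\tmirbase_21\tisomiR\t4\t25\t.\t+\t.\t"
    "ID=prefix-22-BZBZOS4Y1;Name=hsa-let-7a-5p;Parent=hsa-let-7a-1;"
    "Variant=iso_5p;Expression=10,0;Filter=REJECT:lowcounts"
)

header = parse_header(header_lines)
record = parse_record(record_line, header["samples"])
print(record["expression"], record["filter_status"], record["filter_reason"])
```

Pairing the counts with `colData` at parse time makes the ordering rule for `Expression` explicit, and splitting `Filter` on `:` lets a consumer skip `REJECT` rows without having to interpret the reason.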

**API**
