Cannot Access the coding sequence in specific transcript #267

Open
HealHer opened this issue Aug 4, 2022 · 0 comments

Hi,

Thank you very much for this package. I use it daily to get sequences and related information about transcripts.

I am getting a bit confused with one particular example regarding the transcript named ENST00000429617.
I am following this method to get it:

pyensembl install --release 75 --species homo_sapiens
python3
>>> import pyensembl
>>> ensembl = pyensembl.EnsemblRelease(release=75)
>>> tx = ensembl.transcript_by_id("ENST00000429617")

To recover the coding sequence I tried tx.coding_sequence, but it fails with an error stating that the transcript has no start codon.
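A possible workaround, sketched under assumptions: as far as I understand the pyensembl source, a Transcript exposes `coding_sequence_position_ranges` (genomic CDS ranges), `spliced_offset(position)` and `sequence` (the spliced cDNA). I have not verified this against release 75. The helper `cds_from_ranges` and the `FakeTranscript` stand-in below are hypothetical names used only for illustration, so the sketch runs without any Ensembl data installed.

```python
def cds_from_ranges(tx):
    """Slice the annotated CDS out of the spliced transcript sequence.

    Assumes `tx` has `coding_sequence_position_ranges`, `spliced_offset`
    and `sequence` (believed to match pyensembl.Transcript, unverified).
    Does not require start_codon or stop_codon features.
    """
    ranges = tx.coding_sequence_position_ranges
    if not ranges:
        return None
    # Map both ends of every CDS fragment to offsets in the spliced
    # transcript; min/max makes the result independent of strand order.
    offsets = [tx.spliced_offset(pos) for start, end in ranges for pos in (start, end)]
    return tx.sequence[min(offsets):max(offsets) + 1]


class FakeTranscript:
    """Hypothetical plus-strand stand-in so the sketch can be run offline."""

    def __init__(self, exons, cds_ranges, sequence):
        self.exons = exons
        self.coding_sequence_position_ranges = cds_ranges
        self.sequence = sequence

    def spliced_offset(self, position):
        # Count exonic bases that precede `position` in transcript order.
        offset = 0
        for start, end in self.exons:
            if start <= position <= end:
                return offset + (position - start)
            offset += end - start + 1
        raise ValueError("position %d is not inside an exon" % position)


demo = FakeTranscript(
    exons=[(1, 5), (11, 15)],
    cds_ranges=[(4, 5), (11, 13)],
    sequence="AAACCGGGTT",
)
print(cds_from_ranges(demo))  # CCGGG
```

With a real transcript the call would be `cds_from_ranges(tx)`. Since this transcript is tagged cds_end_NF, the result should not be expected to end in a stop codon or to have a length that is a multiple of three.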

When I look inside the GTF, there are CDS records associated with this specific transcript (starting from exon 2):
grep "ENST00000429617" Homo_sapiens.GRCh37.75.gtf

10      protein_coding  transcript      115438942       115486178       .       +       .       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  exon    115438942       115439108       .       +       .       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "1"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; exon_id "ENSE00001604369"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  exon    115457253       115457362       .       +       .       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "2"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; exon_id "ENSE00003512533"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  CDS     115457253       115457362       .       +       0       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "2"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; protein_id "ENSP00000400094"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  exon    115480791       115480927       .       +       .       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "3"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; exon_id "ENSE00003505017"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  CDS     115480791       115480927       .       +       1       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "3"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; protein_id "ENSP00000400094"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  exon    115481410       115481538       .       +       .       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "4"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; exon_id "ENSE00003555462"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  CDS     115481410       115481538       .       +       2       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "4"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; protein_id "ENSP00000400094"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  exon    115485121       115485296       .       +       .       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "5"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; exon_id "ENSE00003542525"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  CDS     115485121       115485296       .       +       2       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "5"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; protein_id "ENSP00000400094"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  exon    115486064       115486178       .       +       .       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "6"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; exon_id "ENSE00001632192"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  CDS     115486064       115486178       .       +       0       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; exon_number "6"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; protein_id "ENSP00000400094"; tag "cds_end_NF"; tag "mRNA_end_NF";
10      protein_coding  UTR     115438942       115439108       .       +       .       gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; gene_name "CASP7"; gene_source "ensembl_havana"; gene_biotype "protein_coding"; transcript_name "CASP7-004"; transcript_source "havana"; tag "cds_end_NF"; tag "mRNA_end_NF";
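To make the grep output easier to check, here is a minimal sketch that groups the records by feature type and sums the CDS length. The rows are the coordinates shown above with the attribute column trimmed and the frame column set to "."; the function name `features_for_transcript` is hypothetical and nothing here depends on pyensembl.

```python
from collections import defaultdict


def features_for_transcript(gtf_lines, transcript_id):
    """Group (start, end) intervals by feature type for one transcript."""
    wanted = 'transcript_id "%s"' % transcript_id
    by_feature = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; column 9 holds attributes.
        if len(fields) < 9 or wanted not in fields[8]:
            continue
        by_feature[fields[2]].append((int(fields[3]), int(fields[4])))
    return {name: sorted(spans) for name, spans in by_feature.items()}


# Coordinates copied from the grep output; attributes trimmed for brevity.
ROWS = [
    ("exon", 115438942, 115439108),
    ("exon", 115457253, 115457362),
    ("CDS", 115457253, 115457362),
    ("exon", 115480791, 115480927),
    ("CDS", 115480791, 115480927),
    ("exon", 115481410, 115481538),
    ("CDS", 115481410, 115481538),
    ("exon", 115485121, 115485296),
    ("CDS", 115485121, 115485296),
    ("exon", 115486064, 115486178),
    ("CDS", 115486064, 115486178),
    ("UTR", 115438942, 115439108),
]
ATTRS = 'gene_id "ENSG00000165806"; transcript_id "ENST00000429617"; gene_name "CASP7";'
SAMPLE = [
    "\t".join(["10", "protein_coding", feature, str(start), str(end), ".", "+", ".", ATTRS])
    for feature, start, end in ROWS
]

features = features_for_transcript(SAMPLE, "ENST00000429617")
cds = features.get("CDS", [])
cds_length = sum(end - start + 1 for start, end in cds)

print("feature types:", sorted(features))
print("CDS fragments:", len(cds), "total length:", cds_length)
print("has start_codon record:", "start_codon" in features)
```

On these rows the CDS spans exons 2 to 6 (5 fragments, 667 nt), and there is no start_codon record among the lines that grep returned.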

This is also confirmed by the UCSC Genome Browser (link to specific region).

Can you please point me to where I can access this coding sequence?
