Provide mapping of non standard chromosome names from ensembl to UCSC #88
Comments
Thanks @ivanek for the input! As you point out we have to include the |
Hello ensembldb people, FWIW a while ago I replaced
Note that this only works for "registered genomes" (unfortunately genome registration in GenomeInfoDb is a manual process 😞).
The See H. |
Thanks Herve @hpages , I'll look into how I can integrate that into |
Thanks for the info @hpages - is there a way to translate also the genome version from Ensembl to UCSC? Your |
I'm not aware of an easy/reliable way to translate an Ensembl genome version to a UCSC genome. Could the data.frame returned by
This is assuming that:
That's a lot of assumptions, but maybe they are satisfied for the small set of organisms you want to support. Also, for 3. we can always register new UCSC genomes in GenomeInfoDb. Note however that using a loose assembly name like GRCh38 sounds risky. In the latest Ensembl release (107), they use GRCh38.p13 for Homo sapiens, but this could change at any time, e.g. they could switch to GRCh38.p14 in the next release, in which case some Ensembl chromosome names will no longer be mapped to a UCSC name. FWIW
We could export and document it if that would help. H. |
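To make the patch-level concern above concrete, here is a minimal sketch of exact versus loose assembly matching. The registry contents and the helper name `ucsc_genome_for` are hypothetical and only illustrate the lookup logic; they are not part of GenomeInfoDb or ensembldb.

```python
# Hypothetical registry: exact assembly name -> UCSC genome.
# The pairings are illustrative assumptions, not fetched data.
REGISTERED = {
    "GRCh38.p13": "hg38",
    "GRCm38.p6": "mm10",
}


def base_name(assembly):
    """Strip a trailing patch suffix such as '.p13' from an assembly name."""
    return assembly.partition(".p")[0]


def ucsc_genome_for(assembly, registry, loose=False):
    """Return the UCSC genome for an assembly, or None if it is not registered.

    With loose=True, fall back to matching on the base name only. This finds
    a genome even after a patch bump, but the chromosome names of the newer
    patch are not guaranteed to be covered by the older mapping.
    """
    if assembly in registry:
        return registry[assembly]
    if loose:
        hits = {genome for name, genome in registry.items()
                if base_name(name) == base_name(assembly)}
        if len(hits) == 1:
            return hits.pop()
    return None


exact = ucsc_genome_for("GRCh38.p14", REGISTERED)               # not registered
fallback = ucsc_genome_for("GRCh38.p14", REGISTERED, loose=True)  # base-name match
```

The exact lookup fails as soon as the patch level changes, while the loose lookup still returns a genome but silently hides the fact that sequences added in the newer patch may have no UCSC counterpart.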
Is there a chance to implement conversion of non-standard chromosome names from Ensembl format to UCSC (NCBI)?
The [GenomeInfoDb](http://bioconductor.org/packages/release/bioc/html/GenomeInfoDb.html) package provides the function fetchExtendedChromInfoFromUCSC to fetch additional chromosome info; however, the Ensembl names are not part of it. I guess the ideal situation would be for this function to also consider the additional tables present in the goldenPath database directory.
For human (hg38):
For mouse (mm10):
Unfortunately, these table names and formats are not identical across genome versions, but the fetchExtendedChromInfoFromUCSC function seems to handle this inconsistency anyway for the information it already provides.
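As a rough illustration of the requested behaviour, the sketch below translates Ensembl-style sequence names to UCSC-style names through an alias table and reports the names it could not map. The table rows and the helper name `to_ucsc` are assumptions made for this example; a real table would come from the per-genome alias files mentioned above.

```python
# Illustrative alias rows (Ensembl-style name -> UCSC-style name).
# These sample rows are assumptions for the sketch, not a complete mapping.
ENSEMBL_TO_UCSC = {
    "1": "chr1",
    "MT": "chrM",
    "KI270742.1": "chrUn_KI270742v1",
}


def to_ucsc(names, mapping):
    """Translate sequence names with a lookup table.

    Names without an entry are kept unchanged and also returned separately,
    so the caller can decide whether to drop them or raise an error.
    """
    translated = [mapping.get(name, name) for name in names]
    unmapped = [name for name in names if name not in mapping]
    return translated, unmapped


translated, unmapped = to_ucsc(
    ["1", "MT", "KI270742.1", "HSCHR1_CTG1"], ENSEMBL_TO_UCSC
)
```

Simply prepending "chr" would handle the first name only; the mitochondrial and unplaced sequences need the table, which is why the non-standard names are the hard part of this request.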