[WIP] Transition to UniProtKB #43

Draft. Wants to merge 1 commit into base: main

egonw (Member) commented Aug 26, 2023

No description provided.

egonw self-assigned this Aug 26, 2023

egonw (Member, Author) commented Oct 10, 2023

I tested this code base with an old Ensembl 105 ID mapping database. It loads fine with the QC (the database still carries the old full name) and is reported with UniProtKB as the full name in the output:

INFO: old database is EnsemblGenomes 49 (build: 20220621)
INFO: new database is EnsemblGenomes 49 (build: 20220621)
INFO: Number of ids in T (GeneOntology): 18841 (unchanged)
INFO: Number of ids in En (Ensembl): 61487 (unchanged)
INFO: Number of ids in Om (OMIM): 16002 (unchanged)
INFO: Number of ids in X (Affy): 946224 (unchanged)
INFO: Number of ids in H (HGNC): 39506 (unchanged)
INFO: Number of ids in Wg (WikiGenes): 25645 (unchanged)
INFO: Number of ids in Q (RefSeq): 248485 (unchanged)
INFO: Number of ids in Il (Illumina): 73313 (unchanged)
INFO: Number of ids in Uc (UCSC Genome Browser): 227239 (unchanged)
INFO: Number of ids in Pd (PDB): 47662 (unchanged)
INFO: Number of ids in L (Entrez Gene): 25645 (unchanged)
INFO: Number of ids in S (UniProtKB): 77949 (unchanged)
INFO: Number of ids in Hac (HGNC Accession number): 39506 (unchanged)
INFO: Number of ids in Mb (miRBase Sequence): 3692 (unchanged)
INFO: Number of ids in Ag (Agilent): 117826 (unchanged)
INFO: Number of ids in Rf (Rfam): 58 (unchanged)
INFO: Attribute provided: Type
INFO: Attribute provided: Description
INFO: Attribute provided: Symbol
INFO: Attribute provided: Chromosome
INFO: new size is 798 Mb (changed +0.0%)

@tabbassidaloii, when you have an Ensembl 109 release built with this code base, I would like to compare that with the above file too. We would then be comparing two Derby files with different 'full names'. It should work, but I would love to confirm that experimentally.
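
One way to check such a comparison by hand is to parse the `Number of ids in ...` lines of two QC reports and list the system codes whose full name differs. The following is only a minimal sketch: the helper names `parse_id_counts` and `renamed_sources` are hypothetical and not part of the code base, and the line format is assumed from the report above.

```python
import re

# Assumed line format, taken from the QC report above, e.g.:
#   INFO: Number of ids in S (UniProtKB): 77949 (unchanged)
LINE_RE = re.compile(
    r"INFO: Number of ids in (?P<code>\S+) \((?P<name>.+)\): (?P<count>\d+)"
)


def parse_id_counts(report: str) -> dict:
    """Map system code -> (full name, id count) for each matching line."""
    counts = {}
    for line in report.splitlines():
        match = LINE_RE.search(line)
        if match:
            counts[match["code"]] = (match["name"], int(match["count"]))
    return counts


def renamed_sources(old: dict, new: dict) -> dict:
    """Return system codes whose full name differs between two reports."""
    return {
        code: (old[code][0], new[code][0])
        for code in old.keys() & new.keys()
        if old[code][0] != new[code][0]
    }


# Two lines copied from the report above serve as sample input.
sample = """\
INFO: Number of ids in S (UniProtKB): 77949 (unchanged)
INFO: Number of ids in Hac (HGNC Accession number): 39506 (unchanged)
"""
counts = parse_id_counts(sample)
```

Calling `renamed_sources` on the parsed reports of the old and new Derby files would return an empty dict if the QC reports the same full names for both, and otherwise the pairs of names that differ.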

tabbassidaloii (Member) commented:

Great, I will check that.
But the file you checked is not from Ensembl itself but from Ensembl Plants or Fungi, so Ensembl v109 would not include it. I should compare it with Ensembl Plants or Fungi v56 instead. Which species did you run for this test?
