Ensembl-ID-Match

This R script reads old Ensembl gene IDs from an Excel file, maps them to their corresponding new Ensembl gene IDs using the biomaRt package, and saves the full mapping, including gene symbols, to a new Excel file. It is designed to handle Ensembl IDs from an older genome version (such as Rnor_6.0, Ensembl version 80), and it connects to both the old and the latest Ensembl databases to perform the mapping.

Input: an Excel file named "Unmatched Ensembl.xlsx" containing a column labeled "Given", where the old Ensembl IDs are stored. Update the file path in the script so that it points to this file.

Output: an Excel file named "Updated Ensembl.xlsx", which includes the old Ensembl IDs, the associated gene symbols, and the new Ensembl IDs. If an old ID cannot be mapped to a new one, the script inserts NA in the New Ensembl ID column for that row. The script also calculates and displays the percentage of IDs that were successfully mapped to gene symbols and to new Ensembl IDs.

The script requires the R packages biomaRt, readxl, and writexl, and installs any that are missing. To run it, open the script in R or RStudio and execute it. The updated Excel file will be saved in the same directory, providing a convenient way to update old gene annotation data with the latest Ensembl information.
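The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes that old IDs are first resolved to gene symbols in the archived release and that those symbols are then looked up in the current release. The function names (`fetch_old_symbols`, `fetch_new_ids`, `combine_mappings`, `run_mapping`) and the output column names are hypothetical. The two `fetch_*` functions and `run_mapping` need network access and the biomaRt, readxl, and writexl packages; `combine_mappings` is plain base R.

```r
# Sketch only. Joining old and new IDs through the gene symbol is an
# assumption; the real script may link the two releases differently.

# Look up gene symbols for old IDs in the archived mart (needs network).
fetch_old_symbols <- function(old_ids, old_mart) {
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "external_gene_name"),
    filters    = "ensembl_gene_id",
    values     = old_ids,
    mart       = old_mart
  )
}

# Look up current gene IDs for a set of symbols (needs network).
fetch_new_ids <- function(symbols, new_mart) {
  biomaRt::getBM(
    attributes = c("external_gene_name", "ensembl_gene_id"),
    filters    = "external_gene_name",
    values     = symbols,
    mart       = new_mart
  )
}

# Combine both lookups into one row per input ID, keeping the input order.
# Unmatched entries become NA. match() keeps only the first hit, so
# one-to-many mappings are not represented in this sketch.
combine_mappings <- function(old_ids, old_map, new_map) {
  symbol <- old_map$external_gene_name[match(old_ids, old_map$ensembl_gene_id)]
  symbol[!is.na(symbol) & symbol == ""] <- NA  # treat empty symbols as missing
  new_id <- new_map$ensembl_gene_id[match(symbol, new_map$external_gene_name)]
  new_id[is.na(symbol)] <- NA                  # never join on a missing symbol
  data.frame(
    Old_Ensembl_ID = old_ids,
    Gene_Symbol    = symbol,
    New_Ensembl_ID = new_id,
    stringsAsFactors = FALSE
  )
}

# End-to-end driver (hypothetical); file names follow the README.
run_mapping <- function(input_path  = "Unmatched Ensembl.xlsx",
                        output_path = "Updated Ensembl.xlsx") {
  given   <- readxl::read_excel(input_path)$Given
  old_ids <- unique(as.character(given[!is.na(given)]))

  # Archived release 80 and the current release of the rat gene dataset.
  old_mart <- biomaRt::useEnsembl(biomart = "ensembl",
                                  dataset = "rnorvegicus_gene_ensembl",
                                  version = 80)
  new_mart <- biomaRt::useEnsembl(biomart = "ensembl",
                                  dataset = "rnorvegicus_gene_ensembl")

  old_map <- fetch_old_symbols(old_ids, old_mart)
  symbols <- unique(old_map$external_gene_name)
  symbols <- symbols[!is.na(symbols) & symbols != ""]
  new_map <- fetch_new_ids(symbols, new_mart)

  result <- combine_mappings(old_ids, old_map, new_map)
  writexl::write_xlsx(result, output_path)
  invisible(result)
}
```

Keeping the join in its own function makes it possible to check the NA handling on small hand-made data frames without contacting the Ensembl servers.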

The script displays the success rate of symbol matching and of new Ensembl ID matching. Currently I'm getting approximately 95% success with symbol matching and 33% for new IDs. The low rate for new IDs could be due to sequences that were merged or separated between Ensembl database versions, but feel free to comment if you have suggestions.
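For reference, the two success rates can be computed along these lines. This is a small base R sketch under assumptions: the helper name `mapping_success` and the column names `Gene_Symbol` and `New_Ensembl_ID` are hypothetical, and an entry counts as matched when it is neither NA nor an empty string. The IDs and symbols in the toy table are made up for illustration.

```r
# Percentage of rows with a usable value in each mapped column.
# Column names are assumptions; adjust them to the real output table.
mapping_success <- function(result) {
  pct <- function(x) 100 * mean(!is.na(x) & x != "")
  c(symbol = pct(result$Gene_Symbol),
    new_id = pct(result$New_Ensembl_ID))
}

# Toy table: two of four IDs have a symbol, one of four has a new ID.
toy <- data.frame(
  Old_Ensembl_ID = c("ID1", "ID2", "ID3", "ID4"),
  Gene_Symbol    = c("GeneA", "GeneB", NA, ""),
  New_Ensembl_ID = c("NEW1", NA, NA, NA),
  stringsAsFactors = FALSE
)
rates <- mapping_success(toy)
cat(sprintf("Symbol match: %.1f%%, new ID match: %.1f%%\n",
            rates[["symbol"]], rates[["new_id"]]))
```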


lindseydruschel/Ensembl-ID-Match

Folders and files

Latest commit

History

Repository files navigation

Ensembl-ID-Match

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages