name2taxid includes some questionable material #27

Open
arendsee opened this issue Mar 17, 2018 · 4 comments

Comments

@arendsee
Collaborator

This happens:

> name2taxid("s2")
"164330"
> name2taxid("s2") %>% taxid2name
"Thauera aminoaromatica"

S2 is a strain name for this bacterium. See here.

The problem is that I allow matches against any name_class. Here are all the name classes in the database:

name_class            count(name_class)
acronym               1167
anamorph              302
authority             410075
blast name            229
common name           14204
equivalent name       25058
genbank acronym       486
genbank anamorph      107
genbank common name   28182
genbank synonym       2958
in-part               628
includes              36595
misnomer              1386
misspelling           35975
scientific name       1689025
synonym               168033
teleomorph            179
type material         11449

So the question is, which of these should we include?

Most of them seem pretty reasonable. The problematic ones are type material and acronym. Perhaps we should allow the user to select which name classes to allow?

@sckott
Contributor

sckott commented Mar 17, 2018

> Perhaps we should allow the user to select which name classes to allow?

that seems reasonable, by default excluding type material and acronym?
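
To make the proposal above concrete, here is a minimal sketch of a lookup that excludes type material and acronym by default while letting the caller choose other classes. It is an illustration only, not the package's implementation: `name2taxid_sketch`, the `exclude_classes` argument, the `ncbi_names` data frame, and its column names are all hypothetical, and the toy rows (including the class assigned to "S2") are invented for the example.

```r
# Toy stand-in for a names table; rows and column names are assumptions
# made up for this sketch, not data pulled from the real database.
ncbi_names <- data.frame(
  tax_id     = c("164330", "164330", "9606", "9606"),
  name_txt   = c("Thauera aminoaromatica", "S2", "Homo sapiens", "human"),
  name_class = c("scientific name", "type material",
                 "scientific name", "genbank common name"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up taxon IDs by name, skipping every row whose
# name_class is listed in `exclude_classes`.
name2taxid_sketch <- function(x, names_tbl,
                              exclude_classes = c("type material", "acronym")) {
  keep <- !(names_tbl$name_class %in% exclude_classes)
  allowed <- names_tbl[keep, , drop = FALSE]
  # Case-insensitive match, returning the first hit per query (NA if none).
  idx <- match(tolower(x), tolower(allowed$name_txt))
  unname(allowed$tax_id[idx])
}

# With the defaults, the strain-like name no longer resolves.
name2taxid_sketch("s2", ncbi_names)

# Passing an empty exclusion list restores the current match-anything behaviour.
name2taxid_sketch("s2", ncbi_names, exclude_classes = character(0))
```

An exclusion list keeps the default short and lets new name classes pass through unless explicitly blocked; an allow-list argument would be the stricter alternative.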

@arendsee
Collaborator Author

@sckott Sounds good. Is this an issue in taxize as well?

@sckott
Contributor

sckott commented Mar 18, 2018

this is what we get with get_uid

taxize::get_uid("s2")
#> 
#> Retrieving data for taxon 's2'
#> 
#> [1] "164330"
#> attr(,"class")
#> [1] "uid"
#> attr(,"match")
#> [1] "found"
#> attr(,"multiple_matches")
#> [1] FALSE
#> attr(,"pattern_match")
#> [1] FALSE
#> attr(,"uri")
#> [1] "https://www.ncbi.nlm.nih.gov/taxonomy/164330"

@maelle closed this as completed Sep 9, 2022
@maelle
Member

maelle commented Sep 9, 2022

unarchiving it thanks to @stitam 😸

@maelle reopened this Sep 9, 2022