Use @en language tag for all labels #293

Open
jamesaoverton opened this issue Jan 28, 2025 · 7 comments

@jamesaoverton
Member

The new release of COB caused a label conflict for RO oborel/obo-relations#826: "realizes" vs "realizes"@en.

I'm trying to use a ROBOT template src/ontology/cob-edit.tsv going forward, so the easy thing is for all labels to have a single type: xsd:string or @en. I opted for xsd:string and now I think that was a mistake. I want to change the template so that all labels use @en. That would fix the RO problem, but might cause similar problems for other projects.
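To illustrate why "realizes" and "realizes"@en collide, here is a minimal sketch using only the standard library. The `Literal` class is a toy stand-in rather than a real RDF library class, `ex:realizes` is a placeholder identifier, and which ontology carries which form is an assumption based on the description above. In RDF 1.1 a plain literal has datatype xsd:string while a language-tagged literal has datatype rdf:langString, so the two are different terms even when the text matches.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


@dataclass(frozen=True)
class Literal:
    """Toy model of an RDF literal (hypothetical, for illustration only)."""
    lexical: str
    lang: Optional[str] = None

    @property
    def datatype(self) -> str:
        # A language tag implies rdf:langString; otherwise xsd:string.
        return RDF_LANGSTRING if self.lang else XSD_STRING


TERM = "ex:realizes"  # placeholder, not the real RO IRI

# Assumption: COB ships the plain string, RO ships the @en form.
cob_labels = {(TERM, Literal("realizes"))}
ro_labels = {(TERM, Literal("realizes", lang="en"))}

# Merging the two ontologies keeps both literals because they are not equal.
merged = cob_labels | ro_labels
labels_for_term = {lit for term, lit in merged if term == TERM}
print(len(labels_for_term))
```

With both forms present, the merged term ends up with two labels, which is the kind of conflict a one-label-per-term check would flag.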

Despite the breakage, I think that @en throughout is "the right thing to do", and something we've been moving towards across OBO.

Alternatively, I could hack an exception for "realizes", which is the immediate cause of the RO problem. But I'd rather solve it properly.

@bpeters42
Contributor

I would support going with @en across all COB labels. We certainly at some point would want to have multi-lingual support for COB, so better to bite that bullet now.

@allenbaron

I know the issue of language tags has been discussed in OBO meetings and have done my best to follow decisions in OBOFoundry/OBOFoundry.github.io#479 but it is not clear to me whether @en tags should be added or removed.

Many of the report tests in ROBOT are affected by the presence or absence of a language tag, e.g. multiple labels test. I suspect this might lead to issues if there is not consistency across OBO or updates to these tests to handle language tags.
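As a rough sketch of how a multiple-labels check reacts to inconsistent tagging: the function below is a hypothetical simplification, not ROBOT's actual implementation, and the `ex:A` identifier and sample labels are made up. It counts distinct (text, language tag) pairs per term, so the same text with and without a tag counts as two labels.

```python
from collections import defaultdict


def terms_with_multiple_labels(label_triples):
    """Return terms that carry more than one distinct label literal.

    Each triple is (term, lexical_form, language_tag_or_None).
    """
    labels = defaultdict(set)
    for term, lexical, lang in label_triples:
        labels[term].add((lexical, lang))
    return sorted(term for term, lits in labels.items() if len(lits) > 1)


# Same label asserted twice with the same tag: collapses to one literal.
consistent = [("ex:A", "process", "en"), ("ex:A", "process", "en")]

# Same text, once tagged and once untagged: two distinct literals.
mixed = [("ex:A", "process", "en"), ("ex:A", "process", None)]

print(terms_with_multiple_labels(consistent))
print(terms_with_multiple_labels(mixed))
```

A check written this way passes when tagging is consistent across sources and fails as soon as one source drops or adds the tag.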

Some related issues I've seen:

I apologize if this goes beyond the intended discussion for COB but since it will be an upper ontology it seems like it will be relevant across all OBO.

@jamesaoverton
Member Author

That issue OBOFoundry/OBOFoundry.github.io#479 seemed to have clear consensus on this narrow point, then kinda fell off track at the end.

OBO Principle 12: Naming Conventions requires exactly one rdfs:label (for technical compatibility reasons, which ROBOT checks for) in English. (Then you can have any number of synonyms and other annotations in any languages you wish, and display them however you like.) So it seems reasonable to me to push for @en labels in COB.

@jamesaoverton
Member Author

@cmungall objected strongly to this proposal when I talked to him on Thursday, on the grounds that even small changes to labels like this could break existing queries and workflows.

@allenbaron

I can understand where Chris is coming from. Small changes can have unintended consequences and do often require modification of workflows and queries. That is often not fun.

In the case of COB, I was under the impression that it is still in development and not yet used in existing queries and workflows (please forgive my naivete here, I am a bit out of the loop on COB's latest progress). If COB is not in general use yet, it seems that choosing a robust and interoperable route would be in the best interest of the entire OBO Foundry and future users.

On the point of robustness and interoperability, the comments in obophenotype/human-phenotype-ontology#10559 give me the impression that using language tags will better support interoperability with the broader biomedical and semantic web communities. Bigger communities, like those supporting and using Wikidata, use language tags and their presence forces people to design more robust queries and workflows that won't break as more inclusive text information in the form of new languages is added. If you query Wikidata without a language tag specified, you get all the languages and know that on subsequent queries if you want a specific language you should query for it and if you don't want to see it, you'll need to remove it. The approach to use is immediately apparent. If you query something without language tags, assuming none will be present and even one sneaks in, it can throw a real wrench into things.
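One way to picture a lookup that does not assume tags are absent is sketched below. The `pick_label` helper and the sample labels are hypothetical; it mimics asking for a specific language explicitly and falling back to an untagged literal only when no tagged match exists.

```python
def pick_label(literals, lang="en"):
    """Pick one label: prefer the requested language, else an untagged literal.

    Each literal is a (text, language_tag_or_None) pair.
    """
    # Language tags are compared case-insensitively.
    tagged = [text for text, tag in literals
              if tag is not None and tag.lower() == lang.lower()]
    if tagged:
        return tagged[0]
    untagged = [text for text, tag in literals if tag is None]
    return untagged[0] if untagged else None


# Works whether the source uses plain strings or tagged ones.
print(pick_label([("realizes", None)]))
print(pick_label([("réalise", "fr"), ("realizes", "en")]))
```

A lookup that only matched untagged literals would return nothing for the second call, which is the failure mode described above when a tag sneaks in.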

At the moment, it's not like there's consistency across the OBO Foundry on language tags either. FoodOn has quite a few languages and uses the corresponding tags, HP has decided to drop tags for English only, and my personal queries to see how language tags are used in DO's imports from other OBO ontologies return at least some language-tagged labels in most. It seems to me that queries and workflows already have to deal with the issue of language tags, and making them consistent by adding language tags everywhere is the only real way to alleviate this problem.

@jamesaoverton
Member Author

> If COB is not in general use yet, it seems that choosing a robust and interoperable route would be in the best interest of the entire OBO Foundry and future users.

We can do what we want for new terms in the COB namespace. I'll switch them to @en, which is what I should have done on #288.

The problem is with imported terms. Here I'm proposing using @en for labels of imported terms, as a way of pushing the upstream ontologies to change, and that's what Chris is objecting to.

@allenbaron

Oh. I apologize. I misunderstood. Thanks for clarifying.
