From 70fd252b9f6522370559df2ae7e75b1f9b12ffee Mon Sep 17 00:00:00 2001
From: Nico Matentzoglu
Date: Thu, 16 Mar 2023 14:23:35 +0200
Subject: [PATCH] Add curation_rule to SSSOM (#258)

Fixes #166

- [X] `docs/` have been added/updated if necessary
- [X] `make test` has been run locally
- [X] tests have been added/updated (if applicable)

The need to represent specific curation rules comes up everywhere; see #166. Standardising curation rules will be very difficult, if not impossible, so I would advocate leaving this completely open for now. Representing the rules as resources effectively makes them an open-ended enum, which gives us more flexibility to add structure later. @saubin78 suggested creating a class "MappingRule" and making curation rules instances of it. We should also decide reasonably soon whether to rename "curation rule" to "mapping rule" altogether, if we agree with @saubin78's assertion in #166 that computational rules can also be curation rules (I think that's fair!).
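
As a rough illustration of what "rules as resources" buys a consumer, the sketch below reads a miniature copy of the `curation_rule.sssom.tsv` example from this patch and expands each `curation_rule` CURIE into a URI using the file's own `curie_map`. The function names (`read_curie_map`, `expand`, `curation_rules_by_mapping`) are hypothetical and not part of any SSSOM tooling; the header parsing is deliberately naive (no YAML library), and splitting multivalued cells on `|` is an assumption about the TSV serialisation.

```python
import csv
import io

# Hypothetical miniature of examples/schema/curation_rule.sssom.tsv.
HEADER = [
    "#curie_map:",
    "#  HP: http://purl.obolibrary.org/obo/HP_",
    "#  MP: http://purl.obolibrary.org/obo/MP_",
    "#  DISEASE_MAPPING_COMMONS_RULES: https://w3id.org/sssom/commons/disease/curation-rules/",
    "#mapping_set_id: https://w3id.org/sssom/commons/examples/curation_rule.sssom.tsv",
]
ROWS = [
    ["subject_id", "predicate_id", "object_id", "mapping_justification", "curation_rule"],
    ["HP:0009124", "skos:exactMatch", "MP:0000003",
     "semapv:ManualMappingCuration", "DISEASE_MAPPING_COMMONS_RULES:MPR2"],
    ["HP:0008551", "skos:exactMatch", "MP:0000018",
     "semapv:ManualMappingCuration", "DISEASE_MAPPING_COMMONS_RULES:MPR3"],
]
SSSOM_TEXT = "\n".join(HEADER + ["\t".join(row) for row in ROWS])


def read_curie_map(text):
    """Collect 'prefix: expansion' pairs from the commented curie_map block."""
    curie_map = {}
    in_map = False
    for line in text.splitlines():
        if not line.startswith("#"):
            break  # the metadata header is over
        body = line[1:]
        if body.strip() == "curie_map:":
            in_map = True
        elif in_map and body.startswith(" "):
            prefix, _, expansion = body.strip().partition(": ")
            curie_map[prefix] = expansion.strip('"')
        else:
            in_map = False  # any other top-level key ends the block
    return curie_map


def expand(curie, curie_map):
    """Expand a CURIE to a URI; return it unchanged if the prefix is unknown."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in curie_map:
        return curie_map[prefix] + local
    return curie


def curation_rules_by_mapping(text):
    """Map (subject_id, object_id) to the list of expanded curation rule URIs."""
    curie_map = read_curie_map(text)
    data = "\n".join(l for l in text.splitlines() if not l.startswith("#"))
    reader = csv.DictReader(io.StringIO(data), delimiter="\t")
    result = {}
    for row in reader:
        # Assumption: multivalued cells are pipe-separated.
        rules = [r for r in (row.get("curation_rule") or "").split("|") if r]
        key = (row["subject_id"], row["object_id"])
        result[key] = [expand(r, curie_map) for r in rules]
    return result


if __name__ == "__main__":
    for mapping, rules in curation_rules_by_mapping(SSSOM_TEXT).items():
        print(mapping, rules)
```

Because the rule is an identifier rather than free text, two mapping sets that cite the same rule URI can be compared directly, which is not possible with `curation_rule_text`.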
---
 examples/schema/curation_rule.sssom.tsv       | 14 +++++++++++
 examples/schema/curation_rule_text.sssom.tsv  | 14 +++++++++++
 examples/schema/curation_rule_text2.sssom.tsv | 18 +++++++++++++
 src/sssom_schema/schema/sssom_schema.yaml     | 25 +++++++++++++++++++
 4 files changed, 71 insertions(+)
 create mode 100644 examples/schema/curation_rule.sssom.tsv
 create mode 100644 examples/schema/curation_rule_text.sssom.tsv
 create mode 100644 examples/schema/curation_rule_text2.sssom.tsv

diff --git a/examples/schema/curation_rule.sssom.tsv b/examples/schema/curation_rule.sssom.tsv
new file mode 100644
index 00000000..3e9d7b7e
--- /dev/null
+++ b/examples/schema/curation_rule.sssom.tsv
@@ -0,0 +1,14 @@
+#curie_map:
+#  HP: http://purl.obolibrary.org/obo/HP_
+#  MP: http://purl.obolibrary.org/obo/MP_
+#  orcid: https://orcid.org/
+#  DISEASE_MAPPING_COMMONS_RULES: https://w3id.org/sssom/commons/disease/curation-rules/
+#mapping_set_id: https://w3id.org/sssom/commons/examples/curation_rule.sssom.tsv
+#license: "https://creativecommons.org/publicdomain/zero/1.0/"
+#creator_id: orcid:0000-0002-7356-1779
+#mapping_provider: "https://w3id.org/sssom/core_team"
+#comment: This is an example file for the SSSOM for illustration only. Its contents are entirely fabricated.
+subject_id predicate_id object_id mapping_justification curation_rule see_also
+HP:0009124 skos:exactMatch MP:0000003 semapv:ManualMappingCuration DISEASE_MAPPING_COMMONS_RULES:MPR2 https://github.com/mapping-commons/disease-mappings/issues/16
+HP:0008551 skos:exactMatch MP:0000018 semapv:ManualMappingCuration DISEASE_MAPPING_COMMONS_RULES:MPR3 https://github.com/mapping-commons/disease-mappings/issues/16
+HP:0000411 skos:exactMatch MP:0000021 semapv:ManualMappingCuration DISEASE_MAPPING_COMMONS_RULES:MPR3 https://github.com/mapping-commons/disease-mappings/issues/16
diff --git a/examples/schema/curation_rule_text.sssom.tsv b/examples/schema/curation_rule_text.sssom.tsv
new file mode 100644
index 00000000..2a2fc60f
--- /dev/null
+++ b/examples/schema/curation_rule_text.sssom.tsv
@@ -0,0 +1,14 @@
+#curie_map:
+#  HP: http://purl.obolibrary.org/obo/HP_
+#  MP: http://purl.obolibrary.org/obo/MP_
+#  orcid: https://orcid.org/
+#  DISEASE_MAPPING_COMMONS_RULES: https://w3id.org/sssom/commons/disease/curation-rules/
+#mapping_set_id: https://w3id.org/sssom/commons/examples/curation_rule_text.sssom.tsv
+#license: "https://creativecommons.org/publicdomain/zero/1.0/"
+#creator_id: orcid:0000-0002-7356-1779
+#mapping_provider: "https://w3id.org/sssom/core_team"
+#comment: This is an example file for the SSSOM for illustration only. Its contents are entirely fabricated.
+subject_id predicate_id object_id mapping_justification curation_rule_text see_also
+HP:0009124 skos:exactMatch MP:0000003 semapv:ManualMappingCuration The two phenotypes inhere in homologous structures and exhibit the same phenotypic quality https://github.com/mapping-commons/disease-mappings/issues/16
+HP:0008551 skos:exactMatch MP:0000018 semapv:ManualMappingCuration The two phenotypes inhere in homologous structures and exhibit the same phenotypic quality https://github.com/mapping-commons/disease-mappings/issues/16
+HP:0000411 skos:exactMatch MP:0000021 semapv:ManualMappingCuration The two phenotypes are associated with the exact same set of diseases https://github.com/mapping-commons/disease-mappings/issues/16
diff --git a/examples/schema/curation_rule_text2.sssom.tsv b/examples/schema/curation_rule_text2.sssom.tsv
new file mode 100644
index 00000000..3c5777ea
--- /dev/null
+++ b/examples/schema/curation_rule_text2.sssom.tsv
@@ -0,0 +1,18 @@
+#curie_map:
+#  WTO: http://purl.obolibrary.org/obo/WTO_
+#  CO321: "http://www.cropontology.org/rdf/CO_321:"
+#  ror: https://ror.org/
+#mapping_set_id: https://w3id.org/sssom/commons/examples/curation_rule_text2.sssom.tsv
+#license: "https://www.etalab.gouv.fr/licence-ouverte-open-licence/"
+#comment: This is an example file for the SSSOM for illustration only. This example was extracted from a real mapping set where the subject source (WTO) is an ontology used to annotate text (e.g. scientific literature) and the object source (CO321) is an ontology used to annotate the traits evaluated from observational data. The objective of the alignment is to allow information retrieval from both textual and experimental phenotypic datasets.
+#creator_id: ror:02kvxyf05
+#creator_label: "INRAE"
+subject_id subject_label predicate_id object_id object_label mapping_justification curation_rule_text comment
+WTO:0000304 cold resistance skos:closeMatch CO321:0000080 Cold tolerance semapv:ManualMappingCuration Rule 4: We consider that "tolerance" and "resistance" are almost equivalent when applied to abiotic environmental conditions.
+WTO:0000450 aluminium toxicity skos:closeMatch CO321:0000079 Aluminum tolerance semapv:ManualMappingCuration Rule 3: We consider that the user of the information retrieval function interested in plant traits related to metal toxicity (WTO) also wants to retrieve observational data measuring the plant tolerance to the same metal (CO_321). The rule metal + toxicity (WTO) <-> metal + tolerance (CO321) is valid for any kind of metal.
+WTO:0000065 anther extrusion skos:exactMatch CO321:0000982 Anther extrusion semapv:ManualMappingCuration
+WTO:0000296 aphid resistance skos:closeMatch CO321:0000085 Aphid damage semapv:ManualMappingCuration Rule 2: We consider that the user of the information retrieval function interested in plant traits related to damages caused by some animal, insect, nematode, etc. also wants to retrieve observational data mentioning resistance to the same living organism.
+WTO:0000281 Armyworm resistance skos:closeMatch CO321:0000086 Armyworm damage semapv:ManualMappingCuration Rule 2: We consider that the user of the information retrieval function interested in plant traits related to damages caused by some animal, insect, nematode, etc. also wants to retrieve observational data mentioning resistance to the same living organism.
+WTO:0000125 awn color skos:exactMatch CO321:0000960 Awn color semapv:ManualMappingCuration
+WTO:0000126 awn length skos:exactMatch CO321:0000026 Awn length semapv:ManualMappingCuration
+WTO:0000452 bacterial leaf blight resistance skos:closeMatch CO321:0000932 Bacterial leaf blight severity semapv:ManualMappingCuration Rule 1.3: We consider that the user of the information retrieval function, given a pathogen or a disease, would like to retrieve all data, independently of the way the affection is observed. In observational data, a severity score is represented by two digits representing the vertical disease progress and an estimate of severity. The capacity of resistance to a disease would be deduced from the severity of this one on the plant.
\ No newline at end of file
diff --git a/src/sssom_schema/schema/sssom_schema.yaml b/src/sssom_schema/schema/sssom_schema.yaml
index 33bccfb8..7ed412c4 100644
--- a/src/sssom_schema/schema/sssom_schema.yaml
+++ b/src/sssom_schema/schema/sssom_schema.yaml
@@ -446,6 +446,29 @@ slots:
     examples:
       - value: semapv:Stemming
       - value: semapv:StopWordRemoval
+  curation_rule:
+    description: A curation rule is a (potentially) complex condition executed by an agent that led to the establishment of a mapping.
+      Curation rules often involve complex domain-specific considerations, which are hard to capture in an automated fashion. The curation
+      rule is captured as a resource rather than a string, which enables higher levels of transparency and sharing across mapping sets.
+      The URI representation of the curation rule is expected to be a resolvable identifier which provides details about the nature of the curation rule.
+    range: EntityReference
+    multivalued: true
+    see_also:
+      - https://github.com/mapping-commons/sssom/issues/166
+      - https://github.com/mapping-commons/sssom/pull/258
+      - https://github.com/mapping-commons/sssom/blob/master/examples/schema/curation_rule.sssom.tsv
+  curation_rule_text:
+    description: A curation rule is a (potentially) complex condition executed by an agent that led to the establishment of a mapping.
+      Curation rules often involve complex domain-specific considerations, which are hard to capture in an automated fashion. The curation
+      rule should be captured as a resource (entity reference) rather than a string (see curation_rule element), which enables higher levels of transparency and sharing across mapping sets.
+      The textual representation of a curation rule is intended for use (1) where creating a resource is not practical from the
+      perspective of the mapping_provider, and (2) as an additional piece of metadata that augments the curation_rule element with human-readable text.
+    range: string
+    multivalued: true
+    see_also:
+      - https://github.com/mapping-commons/sssom/issues/166
+      - https://github.com/mapping-commons/sssom/pull/258
+      - https://github.com/mapping-commons/sssom/blob/master/examples/schema/curation_rule_text.sssom.tsv
   semantic_similarity_score:
     description: A score between 0 and 1 to denote the semantic similarity, where 1 denotes equivalence.
@@ -539,6 +562,8 @@ classes:
       - mapping_tool_version
       - mapping_date
       - confidence
+      - curation_rule
+      - curation_rule_text
       - subject_match_field
       - object_match_field
       - match_string