Add curation_rule to SSSOM (#258)
Fixes #166

- [X] `docs/` have been added/updated if necessary
- [X] `make test` has been run locally
- [X] tests have been added/updated (if applicable)

The need to represent specific curation rules comes up everywhere (see
#166). It will be very difficult, if not impossible, to standardise
curation rules, so I would advocate leaving this completely open for now.
Representing the rules as resources effectively makes them an open-ended
enum, which gives us more flexibility to add structure later.
@saubin78 suggested creating a class "MappingRule" and making curation
rules instances of it.

We should also decide reasonably soon whether to rename curation rule
to mapping rule altogether, if we agree with @saubin78's assertion in
#166 that computational rules can also be curation rules (I think that's
fair!).
matentzn authored Mar 16, 2023
1 parent 01db05c commit 70fd252
Showing 4 changed files with 71 additions and 0 deletions.
14 changes: 14 additions & 0 deletions examples/schema/curation_rule.sssom.tsv
@@ -0,0 +1,14 @@
#curie_map:
# HP: http://purl.obolibrary.org/obo/HP_
# MP: http://purl.obolibrary.org/obo/MP_
# orcid: https://orcid.org/
# DISEASE_MAPPING_COMMONS_RULES: https://w3id.org/sssom/commons/disease/curation-rules/
#mapping_set_id: https://w3id.org/sssom/commons/examples/curation_rule.sssom.tsv
#license: "https://creativecommons.org/publicdomain/zero/1.0/"
#creator_id: orcid:0000-0002-7356-1779
#mapping_provider: "https://w3id.org/sssom/core_team"
#comment: This is an example file for the SSSOM for illustration only. Its contents are entirely fabricated.
subject_id	predicate_id	object_id	mapping_justification	curation_rule	see_also
HP:0009124	skos:exactMatch	MP:0000003	semapv:ManualMappingCuration	DISEASE_MAPPING_COMMONS_RULES:MPR2	https://github.com/mapping-commons/disease-mappings/issues/16
HP:0008551	skos:exactMatch	MP:0000018	semapv:ManualMappingCuration	DISEASE_MAPPING_COMMONS_RULES:MPR3	https://github.com/mapping-commons/disease-mappings/issues/16
HP:0000411	skos:exactMatch	MP:0000021	semapv:ManualMappingCuration	DISEASE_MAPPING_COMMONS_RULES:MPR3	https://github.com/mapping-commons/disease-mappings/issues/16
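
The example file above stores each curation rule as a CURIE that is meant to resolve through the `curie_map` in the commented header. A minimal sketch of how a consumer might expand those CURIEs follows. It assumes tab-separated columns and `|` as the separator for multivalued cells; the helper names (`read_sssom`, `expand`, `curation_rule_uris`) are hypothetical and are not part of any SSSOM library, and the header parsing is a simplification that only understands the `curie_map` block.

```python
import csv
import io

# Shortened copy of the example file; tabs are assumed as the column separator.
SAMPLE = (
    "#curie_map:\n"
    "# HP: http://purl.obolibrary.org/obo/HP_\n"
    "# MP: http://purl.obolibrary.org/obo/MP_\n"
    "# DISEASE_MAPPING_COMMONS_RULES: https://w3id.org/sssom/commons/disease/curation-rules/\n"
    "#mapping_set_id: https://w3id.org/sssom/commons/examples/curation_rule.sssom.tsv\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\tcuration_rule\n"
    "HP:0009124\tskos:exactMatch\tMP:0000003\tsemapv:ManualMappingCuration\tDISEASE_MAPPING_COMMONS_RULES:MPR2\n"
    "HP:0008551\tskos:exactMatch\tMP:0000018\tsemapv:ManualMappingCuration\tDISEASE_MAPPING_COMMONS_RULES:MPR3\n"
)


def read_sssom(text):
    """Hypothetical helper: return (curie_map, rows) from SSSOM/TSV text."""
    curie_map = {}
    table_lines = []
    in_curie_map = False
    for line in text.splitlines():
        if line.startswith("#"):
            body = line[1:]
            if body.strip() == "curie_map:":
                in_curie_map = True
            elif in_curie_map and body.startswith(" "):
                # Indented "prefix: uri" entry inside the curie_map block.
                prefix, _, uri = body.strip().partition(": ")
                curie_map[prefix] = uri.strip('"')
            else:
                # Any other metadata line ends the curie_map block.
                in_curie_map = False
        elif line.strip():
            table_lines.append(line)
    reader = csv.DictReader(io.StringIO("\n".join(table_lines)), delimiter="\t")
    return curie_map, list(reader)


def expand(curie, curie_map):
    """Expand a CURIE to a full URI; return it unchanged if the prefix is unknown."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in curie_map:
        return curie_map[prefix] + local
    return curie


def curation_rule_uris(row, curie_map):
    """Split the (assumed pipe-separated) curation_rule cell and expand each entry."""
    value = row.get("curation_rule") or ""
    return [expand(c, curie_map) for c in value.split("|") if c]


curie_map, rows = read_sssom(SAMPLE)
for row in rows:
    print(row["subject_id"], curation_rule_uris(row, curie_map))
```

Returning unknown prefixes unchanged is a deliberate choice in this sketch: prefixes such as `skos` and `semapv` are not declared in the example header, so a stricter consumer would need its own policy for them.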
14 changes: 14 additions & 0 deletions examples/schema/curation_rule_text.sssom.tsv
@@ -0,0 +1,14 @@
#curie_map:
# HP: http://purl.obolibrary.org/obo/HP_
# MP: http://purl.obolibrary.org/obo/MP_
# orcid: https://orcid.org/
# DISEASE_MAPPING_COMMONS_RULES: https://w3id.org/sssom/commons/disease/curation-rules/
#mapping_set_id: https://w3id.org/sssom/commons/examples/curation_rule_text.sssom.tsv
#license: "https://creativecommons.org/publicdomain/zero/1.0/"
#creator_id: orcid:0000-0002-7356-1779
#mapping_provider: "https://w3id.org/sssom/core_team"
#comment: This is an example file for the SSSOM for illustration only. Its contents are entirely fabricated.
subject_id	predicate_id	object_id	mapping_justification	curation_rule_text	see_also
HP:0009124	skos:exactMatch	MP:0000003	semapv:ManualMappingCuration	The two phenotypes inhere in homologous structures and exhibit the same phenotypic quality	https://github.com/mapping-commons/disease-mappings/issues/16
HP:0008551	skos:exactMatch	MP:0000018	semapv:ManualMappingCuration	The two phenotypes inhere in homologous structures and exhibit the same phenotypic quality	https://github.com/mapping-commons/disease-mappings/issues/16
HP:0000411	skos:exactMatch	MP:0000021	semapv:ManualMappingCuration	The two phenotypes are associated with the exact same set of diseases	https://github.com/mapping-commons/disease-mappings/issues/16
18 changes: 18 additions & 0 deletions examples/schema/curation_rule_text2.sssom.tsv
@@ -0,0 +1,18 @@
#curie_map:
# WTO: http://purl.obolibrary.org/obo/WTO_
# CO321: "http://www.cropontology.org/rdf/CO_321:"
# ror: https://ror.org/
#mapping_set_id: https://w3id.org/sssom/commons/examples/curation_rule_text2.sssom.tsv
#license: "https://www.etalab.gouv.fr/licence-ouverte-open-licence/"
#comment: This is an example file for the SSSOM for illustration only. This example was extracted from a real mapping set where the subject source (WTO) is an ontology used to annotate text (e.g. scientific literature) and the object source (CO321) is an ontology used to annotate the traits evaluated from observational data. The objective of the alignment is to allow information retrieval from both textual and experimental phenotypic dataset.
#creator_id: ror:02kvxyf05
#creator_label: "INRAE"
subject_id	subject_label	predicate_id	object_id	object_label	mapping_justification	curation_rule_text	comment
WTO:0000304	cold resistance	skos:closeMatch	CO321:0000080	Cold tolerance	semapv:ManualMappingCuration	Rule 4: We consider that "tolerance" and "resistance" are almost equivalent when applied to abiotic environmental conditions.
WTO:0000450	aluminium toxicity	skos:closeMatch	CO321:0000079	Aluminum tolerance	semapv:ManualMappingCuration	Rule 3: We consider that the user of the information retrieval function interested in plant traits related to metal toxicity (WTO) also wants to retrieve observational data measuring the plant tolerance to the same metal (CO_321). The rule metal + toxicity (WTO) <-> metal + tolerance (CO321) is valid for any kind of metal.
WTO:0000065	anther extrusion	skos:exactMatch	CO321:0000982	Anther extrusion	semapv:ManualMappingCuration
WTO:0000296	aphid resistance	skos:closeMatch	CO321:0000085	Aphid damage	semapv:ManualMappingCuration	Rule 2: We consider that the user of the information retrieval function interested in plant traits related to damages caused by some animal, insect, nematode, etc. also wants to retrieve observational data mentioning resistance to the same living organism.
WTO:0000281	Armyworm resistance	skos:closeMatch	CO321:0000086	Armyworm damage	semapv:ManualMappingCuration	Rule 2: We consider that the user of the information retrieval function interested in plant traits related to damages caused by some animal, insect, nematode, etc. also wants to retrieve observational data mentioning resistance to the same living organism.
WTO:0000125	awn color	skos:exactMatch	CO321:0000960	Awn color	semapv:ManualMappingCuration
WTO:0000126	awn length	skos:exactMatch	CO321:0000026	Awn length	semapv:ManualMappingCuration
WTO:0000452	bacterial leaf blight resistance	skos:closeMatch	CO321:0000932	Bacterial leaf blight severity	semapv:ManualMappingCuration	Rule 1.3: We consider that the user of the information retrieval function, given a pathogen or a disease, would like to retrieve all data, independently of the way the affection is observed. In observational data, a severity score is represented by two digits representing the vertical disease progress and an estimate of severity. The capacity of resistance to a disease would be deduced from the severity of this one on the plant.
25 changes: 25 additions & 0 deletions src/sssom_schema/schema/sssom_schema.yaml
@@ -446,6 +446,29 @@ slots:
    examples:
      - value: semapv:Stemming
      - value: semapv:StopWordRemoval
  curation_rule:
    description: A curation rule is a (potentially) complex condition executed by an agent that led to the establishment of a mapping.
      Curation rules often involve complex domain-specific considerations, which are hard to capture in an automated fashion. The curation
      rule is captured as a resource rather than a string, which enables higher levels of transparency and sharing across mapping sets.
      The URI representation of the curation rule is expected to be a resolvable identifier which provides details about the nature of the curation rule.
    range: EntityReference
    multivalued: true
    see_also:
      - https://github.com/mapping-commons/sssom/issues/166
      - https://github.com/mapping-commons/sssom/pull/258
      - https://github.com/mapping-commons/sssom/blob/master/examples/schema/curation_rule.sssom.tsv
  curation_rule_text:
    description: A curation rule is a (potentially) complex condition executed by an agent that led to the establishment of a mapping.
      Curation rules often involve complex domain-specific considerations, which are hard to capture in an automated fashion. The curation
      rule should be captured as a resource (entity reference) rather than a string (see curation_rule element), which enables higher levels of transparency and sharing across mapping sets.
      The textual representation of curation rule is intended to be used in cases where (1) the creation of a resource is not practical from the
      perspective of the mapping_provider and (2) as an additional piece of metadata to augment the curation_rule element with a human readable text.
    range: string
    multivalued: true
    see_also:
      - https://github.com/mapping-commons/sssom/issues/166
      - https://github.com/mapping-commons/sssom/pull/258
      - https://github.com/mapping-commons/sssom/blob/master/examples/schema/curation_rule_text.sssom.tsv
  semantic_similarity_score:
    description: A score between 0 and 1 to denote the semantic similarity, where
      1 denotes equivalence.
@@ -539,6 +562,8 @@ classes:
- mapping_tool_version
- mapping_date
- confidence
- curation_rule
- curation_rule_text
- subject_match_field
- object_match_field
- match_string
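
Both new slots are declared `multivalued: true` and are added to the slot list of a class in the schema. As a rough illustration of what that means for consuming code, the sketch below models a mapping record with both slots as lists. `MappingSketch` and `from_tsv_row` are hypothetical names made up for this example, not the classes generated from the schema, and splitting cells on `|` is an assumption about how multivalued cells are serialised in the TSV.

```python
from dataclasses import dataclass, field


@dataclass
class MappingSketch:
    """Hypothetical, heavily reduced stand-in for a mapping record."""

    subject_id: str
    predicate_id: str
    object_id: str
    mapping_justification: str
    # Both slots are multivalued in the schema, so they are modelled as lists.
    curation_rule: list[str] = field(default_factory=list)
    curation_rule_text: list[str] = field(default_factory=list)

    def has_documented_rule(self) -> bool:
        """True if at least one rule is given, as a resource or as text."""
        return bool(self.curation_rule or self.curation_rule_text)


def from_tsv_row(row: dict[str, str]) -> MappingSketch:
    """Build a record from one TSV row; absent columns become empty lists."""

    def split(value):
        # Assumption: multivalued cells use "|" as the separator.
        return [v for v in (value or "").split("|") if v]

    return MappingSketch(
        subject_id=row["subject_id"],
        predicate_id=row["predicate_id"],
        object_id=row["object_id"],
        mapping_justification=row["mapping_justification"],
        curation_rule=split(row.get("curation_rule")),
        curation_rule_text=split(row.get("curation_rule_text")),
    )


example = from_tsv_row(
    {
        "subject_id": "HP:0009124",
        "predicate_id": "skos:exactMatch",
        "object_id": "MP:0000003",
        "mapping_justification": "semapv:ManualMappingCuration",
        "curation_rule": "DISEASE_MAPPING_COMMONS_RULES:MPR2",
    }
)
print(example.has_documented_rule())
```

Keeping the two slots separate mirrors the schema text: the resource form is preferred, and the text form either stands in for it or supplements it with a human-readable description.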