
Semantic Mapping Representation

Tiffany J. Callahan edited this page Aug 11, 2021 · 18 revisions


Purpose

Collaborators: Bill Baumgartner, Nicole Vasilevsky

This page describes how semantically rich representations of the mappings were created using the Resource Description Framework (RDF). This task consisted of several steps, guided by domain experts, to ensure that the resulting representations were both accurate and clinically meaningful.




Semantic Mapping Representation


The most important task is to develop semantic definitions for the OMOP2OBO mappings. To do this, a logical definition (or representation) of each mapping needed to be created. This required creating templates to represent the primary design patterns used by the mappings. Each pattern is built around a different combination of the Web Ontology Language (OWL) constructors owl:intersectionOf, owl:unionOf, and owl:complementOf. Examples of how each of these constructors is used are shown below.

Relevant GitHub Issues: issue #34

OWL Constructors used to Construct Classes

owl:complementOf
Class_Name: 'Skin appearance normal (OMOP_4021360)'
Class Expression Syntax: not('Abnormality of the skin')

New Triples:

omop2obo: <https://github.com/callahantiff/omop2obo/obo/ext/>
obo: <http://purl.obolibrary.org/obo/>
oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
owl: <http://www.w3.org/2002/07/owl#>
rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs: <http://www.w3.org/2000/01/rdf-schema#>

omop2obo:OMOP_4021360, oboInOwl:hasOBONamespace, OMOP2OBO
omop2obo:OMOP_4021360, oboInOwl:id, OMOP:4021360  
omop2obo:OMOP_4021360, rdf:type, owl:Class
omop2obo:OMOP_4021360, rdfs:label, 'Skin appearance normal'

omop2obo:OMOP_4021360, owl:equivalentClass, ec1

ec1, rdf:type, owl:Restriction
ec1, owl:onProperty, obo:BFO_0000051  # has part
ec1, owl:someValuesFrom, ec_not
ec_not, rdf:type, owl:Class

## Abnormality of the skin
ec_not, owl:complementOf, obo:HP_0000951
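
As a minimal sketch, the complementOf pattern above can be generated as plain (subject, predicate, object) tuples. The helper name complement_triples, the use of CURIE strings instead of full IRIs, and the reuse of ec1 and ec_not as stand-ins for blank nodes are illustrative assumptions, not part of the OMOP2OBO codebase.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def complement_triples(omop_id: str, label: str, negated_class: str) -> List[Triple]:
    """Builds the has-part / complementOf pattern as CURIE string triples."""
    cls = f"omop2obo:OMOP_{omop_id}"
    # ec1 and ec_not stand in for blank nodes, mirroring the names used above
    return [
        (cls, "oboInOwl:hasOBONamespace", "OMOP2OBO"),
        (cls, "oboInOwl:id", f"OMOP:{omop_id}"),
        (cls, "rdf:type", "owl:Class"),
        (cls, "rdfs:label", label),
        (cls, "owl:equivalentClass", "ec1"),
        ("ec1", "rdf:type", "owl:Restriction"),
        ("ec1", "owl:onProperty", "obo:BFO_0000051"),  # has part
        ("ec1", "owl:someValuesFrom", "ec_not"),
        ("ec_not", "rdf:type", "owl:Class"),
        ("ec_not", "owl:complementOf", negated_class),
    ]


triples = complement_triples("4021360", "Skin appearance normal", "obo:HP_0000951")
for triple in triples:
    print(", ".join(triple))
```

Keeping the triples as tuples makes them easy to inspect before loading them into an RDF library of choice.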

owl:unionOf
Class_Name: 'Longitudinal deficiency of tibia AND/OR fibula (OMOP_434473)'
Class Expression Syntax: ('Abnormality of fibula morphology' or 'Abnormality of tibia morphology')

New Triples:

omop2obo: <https://github.com/callahantiff/omop2obo/obo/ext/>
obo: <http://purl.obolibrary.org/obo/>
oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
owl: <http://www.w3.org/2002/07/owl#>
rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs: <http://www.w3.org/2000/01/rdf-schema#>

omop2obo:OMOP_434473, oboInOwl:hasOBONamespace, OMOP2OBO
omop2obo:OMOP_434473, oboInOwl:id, OMOP:434473  
omop2obo:OMOP_434473, rdfs:label, "Longitudinal deficiency of tibia AND/OR fibula"
omop2obo:OMOP_434473, rdf:type, owl:Class

omop2obo:OMOP_434473, owl:equivalentClass, ec1

ec1, rdf:type, owl:Restriction
ec1, owl:onProperty, obo:BFO_0000051  # has part
ec1, owl:someValuesFrom, ec_union
ec_union, rdf:type, owl:Class

## Abnormality of fibula morphology
ec_union, owl:unionOf, ec_union_member1 
ec_union_member1, rdf:first, obo:HP_0002991

## Abnormality of tibia morphology 
ec_union_member1, rdf:rest, ec_union_member2
ec_union_member2, rdf:first, obo:HP_0002992
ec_union_member2, rdf:rest, rdf:nil
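
The union members above are stored as an RDF collection: each list node points to one member with rdf:first and to the next node with rdf:rest, and the last node points to rdf:nil. The sketch below shows one way to generate that chain for any number of members. The function name rdf_list_triples and the node naming scheme are assumptions made for illustration, and the sketch assumes at least one member.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def rdf_list_triples(head_prefix: str, members: List[str]) -> List[Triple]:
    """Chains members into rdf:first/rdf:rest nodes terminated by rdf:nil."""
    triples: List[Triple] = []
    for i, member in enumerate(members, start=1):
        node = f"{head_prefix}_member{i}"
        is_last = i == len(members)
        next_node = "rdf:nil" if is_last else f"{head_prefix}_member{i + 1}"
        triples.append((node, "rdf:first", member))
        triples.append((node, "rdf:rest", next_node))
    return triples


# Link the class expression node to the first list node, then add the list itself
union = [("ec_union", "owl:unionOf", "ec_union_member1")]
union += rdf_list_triples("ec_union", ["obo:HP_0002991", "obo:HP_0002992"])
```

The same helper applies unchanged to owl:intersectionOf, since both constructors take an RDF collection as their object.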

owl:intersectionOf
Class_Name: 'Abnormal cervical smear (OMOP_434165)'
Class Expression Syntax: ('Abnormal cell morphology' and 'Abnormality of the uterine cervix')

New Triples:

omop2obo: <https://github.com/callahantiff/omop2obo/obo/ext/>
obo: <http://purl.obolibrary.org/obo/>
oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
owl: <http://www.w3.org/2002/07/owl#>
rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs: <http://www.w3.org/2000/01/rdf-schema#>

omop2obo:OMOP_434165, oboInOwl:hasOBONamespace, OMOP2OBO
omop2obo:OMOP_434165, oboInOwl:id, OMOP:434165
omop2obo:OMOP_434165, rdfs:label, "Abnormal cervical smear"
omop2obo:OMOP_434165, rdf:type, owl:Class

omop2obo:OMOP_434165, owl:equivalentClass, ec1
 
ec1, rdf:type, owl:Restriction
ec1, owl:onProperty, obo:BFO_0000051  # has part
ec1, owl:someValuesFrom, ec_intersection1
ec_intersection1, rdf:type, owl:Class

## Abnormality of the uterine cervix
ec_intersection1, owl:intersectionOf, ec_intersection_member1
ec_intersection_member1, rdf:first, obo:HP_0012888

## Abnormal cell morphology
ec_intersection_member1, rdf:rest, ec_intersection_member2
ec_intersection_member2, rdf:first, obo:HP_0025461
ec_intersection_member2, rdf:rest, rdf:nil
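
To check that a constructed class contains the intended members, the RDF collection can be read back by following rdf:first and rdf:rest until rdf:nil is reached. This is a rough sketch over the intersection triples above. The function name read_rdf_list is hypothetical, and the code assumes a well-formed list (no missing links and no cycles).

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def read_rdf_list(triples: List[Triple], head: str) -> List[str]:
    """Follows rdf:first/rdf:rest links from head until rdf:nil."""
    first: Dict[str, str] = {s: o for s, p, o in triples if p == "rdf:first"}
    rest: Dict[str, str] = {s: o for s, p, o in triples if p == "rdf:rest"}
    members: List[str] = []
    node = head
    while node != "rdf:nil":
        members.append(first[node])  # raises KeyError on a malformed list
        node = rest[node]
    return members


intersection_triples = [
    ("ec_intersection1", "owl:intersectionOf", "ec_intersection_member1"),
    ("ec_intersection_member1", "rdf:first", "obo:HP_0012888"),
    ("ec_intersection_member1", "rdf:rest", "ec_intersection_member2"),
    ("ec_intersection_member2", "rdf:first", "obo:HP_0025461"),
    ("ec_intersection_member2", "rdf:rest", "rdf:nil"),
]
members = read_rdf_list(intersection_triples, "ec_intersection_member1")
print(members)  # ['obo:HP_0012888', 'obo:HP_0025461']
```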


Multi-Ontology Mapping Definitions


Once the experiments described above were complete, a more complex representation was created that spans all ontologies for a given mapping, rather than creating ontology-specific mappings. Mappings that spanned multiple ontologies required additional content not currently included in each mapping, as well as new patterns specific to each clinical domain. The sections below describe each domain in more detail, followed by the steps taken to add mapping categories and evidence.

Conditions


Ontologies: HPO, MONDO

Assumptions

  • All classes are created in the OMOP2OBO namespace using the original OMOP concept identifier
  • Applies to all mappings for Concepts Used in Practice (annotated with a mapping category other than Unmapped) that had at least 1 HPO or MONDO annotation
  • All phenotypes are rdfs:subClassOf phenotypic abnormality (HP_0000118)
  • All diseases are rdfs:subClassOf disease or disorder (MONDO_0000001)
  • Use owl:equivalentClass for all 1:1 mappings where the HPO and MONDO annotations represented the same concept
  • Use RO relations for all mappings where the HPO and MONDO annotations represent different diseases/phenotypes

Intra-Ontology Relations:

Mapping Combinations
The following 3 mapping patterns were created:



Drug Exposure Ingredients


Ontologies: CHEBI, PRO, NCBITaxon, VO

Assumptions

  • All classes are created in the OMOP2OBO namespace using the original OMOP concept identifier
  • Applies to all drugs and ingredients that are Concepts Used in Clinical Practice and have at least 1 CHEBI annotation
  • Additional annotations are added to connect each ingredient to its RxNorm drug
  • All new OMOP2OBO classes for drug ingredients are rdfs:subClassOf chemical entity (CHEBI_24431)

Intra-Ontology Relations:
Drugs and Ingredients

Ingredients

Mapping Combinations

The following 13 mapping patterns were created:

Class Construction Heuristics

  • Assigning NCBITaxon from spreadsheet:
    • If PRO and VO don't have a taxon assignment → NCBITaxon to both
    • If only PRO and it doesn't have a taxon assignment → NCBITaxon to PRO
    • If only VO and it doesn't have a taxon assignment → NCBITaxon to VO
    • If CHEBI → NCBITaxon to CHEBI



Measurements


Ontologies: HPO, CHEBI, UBERON, NCBITaxon, CL, PRO

Assumptions

  • All classes are created in the OMOP2OBO namespace using the original OMOP concept identifier
  • Applies to all lab test results (annotated with a mapping category other than Unmapped) with at least 1 HPO annotation and at least 1 UBERON annotation
  • Mappings are made to lab test results rather than to the tests themselves. Because the LOINC code assigned to each test is known, additional annotations were added to connect each lab test result to its LOINC measurement_concept_id. To do this, the result type (i.e., _NORMAL, _LOW, _HIGH or _NEGATIVE, _POSITIVE) was appended to each original OMOP concept
  • All new OMOP2OBO classes for measurements are rdfs:subClassOf phenotypic abnormality (HP_0000118)
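
The result-type suffixing described in the assumptions above can be sketched as follows. The grouping of suffixes into "quantitative" and "qualitative" scales, the OMOP_<id>_<suffix> identifier format, the function name result_class_ids, and the concept id 1234567 are all assumptions made for illustration; this page only lists the suffixes themselves.

```python
from typing import Dict, List

# Result suffixes listed above; the two scale names are assumed labels
RESULT_TYPES: Dict[str, List[str]] = {
    "quantitative": ["NORMAL", "LOW", "HIGH"],
    "qualitative": ["NEGATIVE", "POSITIVE"],
}


def result_class_ids(measurement_concept_id: int, scale: str) -> List[str]:
    """Appends each result type for the given scale to the OMOP concept id."""
    return [
        f"OMOP_{measurement_concept_id}_{suffix}"
        for suffix in RESULT_TYPES[scale]
    ]


# 1234567 is a placeholder concept id, not a real LOINC measurement concept
ids = result_class_ids(1234567, "quantitative")
print(ids)  # ['OMOP_1234567_NORMAL', 'OMOP_1234567_LOW', 'OMOP_1234567_HIGH']
```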

Intra-Ontology Relations:
Lab Test and Results

Lab Test Results

Mapping Combinations
The following 20 mapping patterns were created. Note that a dashed line is used to indicate multiple patterns. There are two special cases of the patterns shown below: (1) IgE antibody tests and (2) IgA, IgD, IgG, and IgM (i.e., Antibody, but not IgE). These specific patterns are also demonstrated below.

Class Construction Heuristics

  • Assigning NCBITaxon from spreadsheet:
    • If only PRO and it doesn't have a taxon assignment → NCBITaxon to PRO
    • If CHEBI → NCBITaxon to CHEBI
    • All UBERON → NCBITaxon_9606
    • All CL → NCBITaxon_9606



OMOP2OBO Class Mapping Categories and Evidence


For the initial release, mapping categories and evidence are represented as class annotations, similar to how synonyms and dbxrefs are annotated to Open Biomedical Ontology Foundry ontology classes. Each annotation includes metadata for the original OMOP concepts, the original OBO concepts, the OMOP Common Data Model (CDM) version used, the ontology version date, and the URL for the current OMOP2OBO release. Examples for each major type of evidence are shown below:

Mapping Categories

Mapping categories are added as class annotations.

Mapping Evidence

Evidence can come in the following forms:

  • OBO DbXRef to OMOP Source Code

    • OBO_DbXRef-OMOP_CONCEPT_SOURCE_CODE:xxxxxxx
    • OBO_DbXRef-OMOP_ANCESTOR_SOURCE_CODE:xxxxxxx
  • OBO Label to OMOP Synonym or Label

    • OBO_LABEL-OMOP_CONCEPT_LABEL:xxxxxxx
    • OBO_LABEL-OMOP_ANCESTOR_LABEL:xxxxxxx
    • OBO_LABEL-OMOP_CONCEPT_SYNONYM:xxxxxxx
    • OBO_LABEL-OMOP_ANCESTOR_SYNONYM:xxxxxxx
  • OBO Synonym to OMOP Synonym or Label

    • OBO_hasSynonymType-OMOP_CONCEPT_LABEL:xxxxxxx
    • OBO_hasSynonymType-OMOP_ANCESTOR_LABEL:xxxxxxx
    • OBO_hasSynonymType-OMOP_CONCEPT_SYNONYM:xxxxxxx
    • OBO_hasSynonymType-OMOP_ANCESTOR_SYNONYM:xxxxxxx
  • Concept Similarity Score → CONCEPT_SIMILARITY:OBO_URI_x.x
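
The evidence strings listed above follow a regular shape, so they can be split into their parts with two regular expressions. The sketch below is one possible reading of that shape, not the parser used by OMOP2OBO. The function name parse_evidence, the field names in the returned dictionary, and the example similarity value 0.95 are assumptions.

```python
import re
from typing import Dict, Optional

# OBO_<field>-OMOP_<level>_<field>:<value>, e.g. OBO_LABEL-OMOP_CONCEPT_LABEL:xxxxxxx
EVIDENCE = re.compile(
    r"^OBO_(?P<obo_field>[A-Za-z]+)-OMOP_(?P<level>[A-Z]+)_"
    r"(?P<omop_field>SOURCE_CODE|LABEL|SYNONYM):(?P<value>.+)$"
)
# CONCEPT_SIMILARITY:<OBO URI>_<score>, with the score after the last underscore
SIMILARITY = re.compile(
    r"^CONCEPT_SIMILARITY:(?P<obo_uri>.+)_(?P<score>\d+(?:\.\d+)?)$"
)


def parse_evidence(evidence: str) -> Optional[Dict[str, str]]:
    """Splits an evidence string into named parts, or returns None if unrecognized."""
    for kind, pattern in (("match", EVIDENCE), ("similarity", SIMILARITY)):
        found = pattern.match(evidence)
        if found:
            return {"kind": kind, **found.groupdict()}
    return None


dbxref = parse_evidence("OBO_DbXRef-OMOP_ANCESTOR_SOURCE_CODE:ABC_1234567")
similarity = parse_evidence("CONCEPT_SIMILARITY:HP_0000951_0.95")
```

Here dbxref["level"] is "ANCESTOR" and similarity["score"] is the string "0.95"; converting the score to a float is left to the caller.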

REPRESENTATION

DbXRef Example 1: OBO_DbXRef-OMOP_CONCEPT_SOURCE_CODE:ABC_1234567
Pattern for all DbXref evidence to an OMOP concept.

class_id SKOS:exactMatch ABC_1234567

BNode owl:annotatedSource class_id
BNode owl:annotatedProperty SKOS:exactMatch
BNode owl:annotatedTarget ABC_1234567

BNode oboInOwl:source "Mapping Category"

BNode oboInOwl:source OBO_xxxxxxx
BNode oboInOwl:source "OBO:version date"
BNode oboInOwl:source OMOP_xxxxxxx
BNode oboInOwl:source "OMOP: common data model v5.0"
BNode oboInOwl:source http://omop2obo/wikiv1
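
The annotation pattern above (an asserted triple plus a blank node that points back at its source, property, and target and carries the oboInOwl:source metadata) can be sketched as a small helper. The function name annotated_axiom and the blank node label are hypothetical. The rdf:type owl:Axiom triple does not appear on this page; it is added here because the OWL 2 mapping to RDF types annotation nodes that way.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def annotated_axiom(class_id: str, prop: str, target: str,
                    sources: List[str], bnode: str = "_:evidence1") -> List[Triple]:
    """Returns the asserted triple plus its reified annotation with sources."""
    triples: List[Triple] = [
        (class_id, prop, target),
        # Not shown on this page, but OWL 2 types the annotation node as owl:Axiom
        (bnode, "rdf:type", "owl:Axiom"),
        (bnode, "owl:annotatedSource", class_id),
        (bnode, "owl:annotatedProperty", prop),
        (bnode, "owl:annotatedTarget", target),
    ]
    # One oboInOwl:source triple per piece of metadata
    triples.extend((bnode, "oboInOwl:source", source) for source in sources)
    return triples


axiom = annotated_axiom(
    "class_id", "SKOS:exactMatch", "ABC_1234567",
    ["Mapping Category", "OBO_xxxxxxx", "OBO:version date",
     "OMOP_xxxxxxx", "OMOP: common data model v5.0", "http://omop2obo/wikiv1"],
)
```

The same helper covers the other evidence patterns on this page by changing the property and target arguments.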

DbXRef Example 2: OBO_DbXRef-OMOP_ANCESTOR_SOURCE_CODE:ABC_1234567
Pattern for all DbXref evidence that includes an OMOP concept ancestor.

class_id oboInOwl:hasDbXref ABC_1234567

BNode owl:annotatedSource class_id
BNode owl:annotatedProperty oboInOwl:hasDbXref
BNode owl:annotatedTarget ABC_1234567

BNode oboInOwl:source "Mapping Category"

BNode oboInOwl:source OBO_xxxxxxx
BNode oboInOwl:source "OBO:version date"
BNode oboInOwl:source OMOP_xxxxxxx
BNode oboInOwl:source "OMOP: common data model v5.0"
BNode oboInOwl:source http://omop2obo/wikiv1

Label Example: OBO_LABEL-OMOP_CONCEPT_LABEL:xxxxxxx
All OBO-OMOP label matches (even those to concept ancestors) will utilize SKOS:exactMatch since this type of match only happens when the OBO and OMOP strings match exactly.

class_id SKOS:exactMatch OMOP_1234567

BNode owl:annotatedSource class_id
BNode owl:annotatedProperty SKOS:exactMatch
BNode owl:annotatedTarget OMOP_1234567

BNode oboInOwl:source "Mapping Category"

BNode oboInOwl:source OBO_xxxxxxx
BNode oboInOwl:source "OBO:version date"
BNode oboInOwl:source "LABEL STRING"
BNode oboInOwl:source "OMOP: common data model v5.0"
BNode oboInOwl:source http://omop2obo/wikiv1

OBO Synonym Example: OBO_hasSynonymType-OMOP_CONCEPT_LABEL:xxxxxxx
This is the pattern for all OBO Synonym matches. Note that this example uses a generic oboInOwl:hasSynonymType; the actual axioms will use the specific types recorded from each matched ontology.

class_id oboInOwl:hasSynonymType "Synonym string"

BNode owl:annotatedSource class_id
BNode owl:annotatedProperty oboInOwl:hasSynonymType
BNode owl:annotatedTarget "Synonym string"

BNode oboInOwl:source "Mapping Category"

BNode oboInOwl:source OBO_xxxxxxx
BNode oboInOwl:source "OBO:version date"
BNode oboInOwl:source OMOP_xxxxxxx
BNode oboInOwl:source "OMOP: common data model v5.0"
BNode oboInOwl:source http://omop2obo/wikiv1

Similarity Example: CONCEPT_SIMILARITY:OBO_URI_1.0
The pattern for all cosine similarity-generated evidence uses the RO property 'is evidence with support from' (RO_0002614) with the NCIT class Cosine Distance Method (NCIT_C27662). In addition, the metadata sources are extended to include the similarity score as a float value.

class_id obo:RO_0002614 NCIT_C27662

BNode owl:annotatedSource class_id
BNode owl:annotatedProperty obo:RO_0002614
BNode owl:annotatedTarget NCIT_C27662

BNode oboInOwl:source "Cosine similarity score of x.x derived from applying a Bag-Of-Words TF-IDF vector space model to all available OMOP and OBO labels and synonyms"

BNode oboInOwl:source "Mapping Category"

BNode oboInOwl:source OBO_xxxxxxx
BNode oboInOwl:source "OBO:version date"
BNode oboInOwl:source OMOP_xxxxxxx
BNode oboInOwl:source "OMOP: common data model v5.0"
BNode oboInOwl:source http://omop2obo/wikiv1
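
For readers unfamiliar with the score recorded in the similarity evidence, the sketch below computes a cosine similarity over bag-of-words TF-IDF vectors using only the Python standard library. It is not the OMOP2OBO implementation: the whitespace tokenizer, lowercasing, smoothed IDF formula, and function names are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Builds one sparse TF-IDF vector (term -> weight) per input string."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Number of documents each term appears in
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF keeps terms shared by every document from vanishing
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


labels = ["Abnormality of the skin", "abnormality of the skin", "Abnormal cell morphology"]
vectors = tfidf_vectors(labels)
same = cosine(vectors[0], vectors[1])       # identical after lowercasing
different = cosine(vectors[0], vectors[2])  # no shared tokens
```

With this tokenizer the first two labels score approximately 1.0 and the third shares no tokens with them, so it scores 0.0.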