Note: all comments in the following apply to both subject and object fields.
The normal mapping case is when one term in a source set (subject) is mapped to exactly one term in the target set (object).
There are, however, many cases where we need to map sets of terms (subject and/or object), for example:
UBERON:Eye+NCBITaxon:Xenopus->XAO:Eye
MP:adiposeTissuePhenotype+PATO:abnormal->HP:AbnormallyAdiposeTissue
MP:X due to DO:1 -> HP:Y due to MONDO:1
This can become a complicated mess, but I suggest the following:
We allow pipe-separated term lists for both subject_id and object_id. These lists are considered in the order given.
We introduce a new (optional) field called object_pattern, which is none by default (meaning that everything in subject_id is treated as a single identifier pertaining to one term). If someone wishes to create a complex mapping, they would write a complex expression like RO:001 some (%s and (RO:002 some %s)) or simply %s and %s (see how we did this in a different context using templates). The placeholders (%s) are filled one by one, in order, with terms from the pipe-separated list in subject_id, which materialises the expression, for example, as an owl_class_expression.
We introduce a new (optional) field called object_pattern_type, which, if not set, is interpreted as "class expression in Manchester syntax" (so there is no need to set it). This could be used in the future to accommodate other kinds of patterns as well (for example, there are complex expressions in the RBox that are not class expressions, and someone may want to use this to map non-OWL patterns too).
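To make the proposal concrete, here is a minimal sketch of how a pattern could be materialised from a pipe-separated term list. The function name `materialise` and its parameters are hypothetical and not part of any existing tooling; the sketch assumes that the number of `%s` placeholders must equal the number of terms, which the proposal does not state explicitly.

```python
from typing import Optional


def materialise(ids: str, pattern: Optional[str] = None) -> str:
    """Fill the %s placeholders of `pattern`, in order, with the terms
    from the pipe-separated list `ids` (hypothetical helper)."""
    terms = [t.strip() for t in ids.split("|")]

    if pattern is None:
        # Default case: the field holds a single identifier for one term.
        if len(terms) != 1:
            raise ValueError("a term list requires a pattern")
        return terms[0]

    # Split on the placeholder so each gap can be filled in order.
    parts = pattern.split("%s")
    if len(parts) - 1 != len(terms):
        # Assumption: placeholders and terms must match one to one.
        raise ValueError("number of %s placeholders must match number of terms")

    filled = parts[0]
    for term, rest in zip(terms, parts[1:]):
        filled += term + rest
    return filled


# Illustrative identifiers taken from the examples in this issue.
print(materialise("UBERON:Eye|NCBITaxon:Xenopus", "%s and %s"))
print(materialise("UBERON:Eye|NCBITaxon:Xenopus",
                  "RO:001 some (%s and (RO:002 some %s))"))
print(materialise("XAO:Eye"))
```

Because the list order determines which term lands in which placeholder, swapping two terms in subject_id would produce a different expression, so the order given really does matter.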
Does this make sense?
@cmungall @kshefchek @mellybelly @diatomsRcool @balhoff