diff --git a/scripts/README.md b/scripts/README.md
index 1332dc7..ffbd1a3 100644
--- a/scripts/README.md
+++ b/scripts/README.md
@@ -2,3 +2,10 @@
 This folder houses scripts that can be used to generate predicted mappings, typically through a lexical mapping workflow.
+
+Most of the lexical mappings in Biomappings were generated with a workflow that wraps Gilda and PyOBO.
+However, Biomappings is generic to any workflow that generates predictions, such as those
+coming from knowledge graph embedding models. More information about the helper functions
+for writing your own prediction generation workflow can be found
+at https://biomappings.readthedocs.io/en/latest/usage.html. This also has a summary of the data types that
+correspond to rows in the mappings (`MappingTuple`) and predictions files (`PredictionTuple`).
diff --git a/src/biomappings/resources/__init__.py b/src/biomappings/resources/__init__.py
index 591c0bd..fb4dd97 100644
--- a/src/biomappings/resources/__init__.py
+++ b/src/biomappings/resources/__init__.py
@@ -152,7 +152,7 @@ class PredictionTuple(NamedTuple):
     the rough standard that closer to 1 is more confident and closer to 0 is less confident.
     Most of the lexical mappings already in Biomappings were generated with Gilda.
-    they were generated using Gilda. Depending on the script, the score therefore refers to either:
+    Depending on the script, the score therefore refers to either:
     1. The Gilda match score, inspired by https://aclanthology.org/W15-3801/. Section 5.2 of the
        `supplementary material for the Gilda paper `_
@@ -160,7 +160,7 @@ class PredictionTuple(NamedTuple):
        https://github.com/biopragmatics/biomappings/blob/master/scripts/generate_agrovoc_mappings.py is an example that uses this variant.
     2. A high-level estimation of the precision of the scores generated by the given script.
-       For example, the CL-MeSH mappings were estimated to be 90% correct, so all of the mappings
+       For example, the CL-MeSH mappings were estimated to be 90% correct, so all the mappings
        generated by https://github.com/biopragmatics/biomappings/blob/master/scripts/generate_cl_mesh_mappings.py
        are marked with 0.9 as its score.