Discussion: Considerations for/against FML #18

Open
dkapitan opened this issue Dec 11, 2024 · 9 comments

Comments

@dkapitan

dkapitan commented Dec 11, 2024

For https://plugin.healthcare we are on the fence as to which mapping framework to use (more about our project here).

Termx looks very interesting, particularly because it provides a graphical UI that can be used by non-programmers (typically information analysts and clinical coding specialists). However, your clear warning that FML "...is a complicated tool that is hard to create, debug, and manage in along term" has made me also consider the approach you present here.

Our priority is to choose a stack with which we can reliably map legacy systems and custom data models as the first step of converting to FHIR; FHIR-to-FHIR mapping matters less to us. We have specified the first set of logical models (see here, in Dutch) and were considering using the Termx FML editor to create the mapping files.
image

If you can spare some time, I would love to hear your reflections on using Termx + FML vs. the FHIRPath approach.

@ruscoder
Member

Hi,

The primary goal of FPML is to provide an easy-to-use alternative for processing FHIR QuestionnaireResponse resources. So far we have used it only for QRs, but it is not limited to them: it can transform any JSON document into new JSON using a JSON template.
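To make the "JSON in, JSON template, JSON out" idea concrete, here is a minimal sketch in Python. It is not FPML and not its grammar: the `{{ ... }}` placeholder handling, the dotted-path `lookup` helper (a tiny stand-in for a real FHIRPath engine), and the sample `source` data are all assumptions made up for illustration.

```python
import re
from typing import Any

# Matches a string that consists of exactly one "{{ path }}" placeholder.
PLACEHOLDER = re.compile(r"^\{\{\s*(.+?)\s*\}\}$")


def lookup(data: Any, path: str) -> Any:
    """Resolve a dotted path such as 'person.first_names.0'.

    Hypothetical helper: a real implementation would evaluate FHIRPath here.
    """
    current = data
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # numeric segment indexes a list
        else:
            current = current[part]  # any other segment is a dict key
    return current


def render(template: Any, data: Any) -> Any:
    """Walk a JSON template and replace placeholders with values from data."""
    if isinstance(template, dict):
        return {key: render(value, data) for key, value in template.items()}
    if isinstance(template, list):
        return [render(item, data) for item in template]
    if isinstance(template, str):
        match = PLACEHOLDER.match(template)
        if match:
            return lookup(data, match.group(1))
    return template  # literals pass through unchanged


# Hypothetical non-FHIR input and a template that shapes it like a Patient.
source = {"person": {"last_name": "Jansen", "first_names": ["Anna"], "dob": "1980-02-29"}}
template = {
    "resourceType": "Patient",
    "name": [{"family": "{{ person.last_name }}", "given": "{{ person.first_names }}"}],
    "birthDate": "{{ person.dob }}",
}
patient = render(template, source)
```

The template keeps the shape of the target document, so the mapping reads like the output it produces. The real FPML examples linked in this comment show the actual syntax.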

When making a decision, I suggest you take a look at the examples:
https://github.com/beda-software/FHIRPathMappingLanguage/tree/main/examples/repeatable
https://github.com/beda-software/FHIRPathMappingLanguage/tree/main/examples/simple

They use qr.json as the input JSON that we want to transform, and two templates that produce exactly the same output:

  • fhirmapping.json - FML
  • fhirpathmap.json - FPML

FPML is not as well known as FML. It is a relatively new alternative that is actively used as part of beda-emr.

@ir4y maybe you can give your thoughts?

@ir4y
Member

ir4y commented Dec 17, 2024

Hi @dkapitan
Here you can find more details about my motivation for building the FHIRPath mapping language: https://www.youtube.com/watch?v=kr_3TFDw1Xo&t=1s

One of FPML's advantages is that it is LLM friendly.
So instead of using a visual interface, you can describe mappers in plain text and an LLM will produce a mapper for you.
We did some experiments and it worked well. Please let me know if this is something you are interested in.

@dkapitan
Author

dkapitan commented Dec 17, 2024

Thanks for the link @ir4y. Your approach appeals to me: Python-based, LLM friendly data DSL. Will investigate more and may get back to you.

@dkapitan
Author

@ir4y : I have a clarifying question. I see you have included FHIR types in /python/resources.py. In the past, I have used fhir.resources for working with FHIR types in Python and it served my needs.

We are also keen to leverage Pydantic v2, so the one issue I see with fhir.resources is that full pydantic v2 compliance is still in beta. Is this the reason why you opted for including your own FHIR types in this project?

@ruscoder
Member

This repository is more about the specification than the implementation. Only the TypeScript version is relevant and supports the full specification. The Python version is not relevant, so don't pay attention to it.

@ir4y
Member

ir4y commented Dec 17, 2024

@dkapitan we are using our own FHIR type generator: https://github.com/beda-software/fhir-py-types/tree/main
It fully supports Pydantic v2.
If you are interested in a Python implementation of FPML, we can discuss it.
It is pretty easy to port the current TypeScript implementation to Python.

@dkapitan
Author

@ir4y definitely interested to explore a Python implementation for this project. Looping in @yannick-vinkesteijn in her role as lead data engineer for our project.

To start and to make sure I understand the intention and scope of your approach:

  • We want a Data DSL, for the reasons you mentioned in the motivation
  • We want to leverage existing FHIR specifications, specifically:
    • FHIRPath for querying JSONs; good to note it works on non-FHIR json, too. Using fhirpath-py as Python implementation
    • FHIR Structure definitions, using fhir-py-types (based on pydantic v2) as Python implementation
    • x-fhir-query as the templating standard (which follows the liquid syntax) --> this is what we want to implement in Python

It is good to mention that in the PLUGIN stack, we stick to the 'Rusty Python' stack as much as possible, most notably:

  • polars as the dataframe library (we intend to replace all our pandas code in the future)
  • pandera for validating dataframe-like objects, which is also based on pydantic v2

Our main mapping use cases are to map:

i. from legacy (often csv or some other tabular format) to FHIR
ii. from FHIR to FHIR, similar to what is already explained at length here

One of the open questions I have is whether it is desirable and possible to combine fhir-py-types with polars/pandera such that we can efficiently implement the mapping from CSV to a (nested) dataframe that holds the target FHIR Resource. I think it is desirable, because this way of working is very close to what many data engineers are used to. In addition, I expect significant performance gains from using polars to do the heavy lifting of parsing and processing the CSVs.
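As a rough illustration of the CSV to nested FHIR step, here is a minimal sketch that uses only the standard-library `csv` module as a stand-in for polars, and plain dicts instead of fhir-py-types models. The column names, the `GENDER_MAP` codes, and the sample rows are hypothetical and not taken from any real legacy system.

```python
import csv
import io
from typing import Any

# Hypothetical legacy export; a real pipeline would read this from a file.
LEGACY_CSV = """patient_id,last_name,first_name,dob,sex
p-001,Jansen,Anna,1980-02-29,F
p-002,de Vries,Pieter,1975-11-03,M
"""

# Assumed mapping from the legacy 'sex' column to FHIR administrative gender codes.
GENDER_MAP = {"F": "female", "M": "male"}


def row_to_patient(row: dict[str, str]) -> dict[str, Any]:
    """Turn one flat CSV row into a nested, Patient-shaped dict."""
    return {
        "resourceType": "Patient",
        "id": row["patient_id"],
        "name": [{"family": row["last_name"], "given": [row["first_name"]]}],
        "gender": GENDER_MAP.get(row["sex"], "unknown"),
        "birthDate": row["dob"],
    }


def csv_to_patients(text: str) -> list[dict[str, Any]]:
    """Parse CSV text and map every row to a Patient-shaped dict."""
    reader = csv.DictReader(io.StringIO(text))
    return [row_to_patient(row) for row in reader]


patients = csv_to_patients(LEGACY_CSV)
```

The same row-to-nested-structure mapping is what a dataframe-based version would express with struct and list columns; validating the result against typed FHIR models would be a separate step.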

I will do some explorations along these lines. Please let me know how we could proceed with this collaboration.

@dkapitan
Author

@ir4y @yannick-vinkesteijn
Getting closer, but there are some issues to be solved.

Basic idea

We want a performant solution for tabular legacy --> FHIR mappings like this (note: this code doesn't work):

import pandera as pa
from pandera.engines.pandas_engine import PydanticModel # PydanticModel only available in pandas_engine
import polars as pl
from resources import Patient

class PatientSchema(pa.DataFrameModel):
    """Pandera schema using the pydantic model."""

    class Config:
        """Config with dataframe-level data type."""

        dtype = PydanticModel(Patient)
        coerce = True  # this is required, otherwise a SchemaInitError is raised

patient = pl.read_json("general-person-example.json")
PatientSchema.validate(patient)

Current issues

@dkapitan
Author

This looks interesting and relevant:
https://github.com/JakobGM/patito
