Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization: DontEncodeElements use results in invalid resource #6304

Open
qligier opened this issue Sep 23, 2024 · 3 comments
Open

Serialization: DontEncodeElements use results in invalid resource #6304

qligier opened this issue Sep 23, 2024 · 3 comments

Comments

@qligier
Copy link

qligier commented Sep 23, 2024

Describe the bug
When using the DontEncodeElements feature of the parser, the serialized resource is invalid if an element contains a single value, and that was ignored by DontEncodeElements. The element is added without content, which is not allowed by the FHIR specification.

To Reproduce
The code

final var patient = new Patient();
patient.setId("test-parser");
patient.addName().setFamily("Doe");

final var parser = FhirContext.forR4Cached().newJsonParser();
parser.setDontEncodeElements(Set.of("*.name.family"));
parser.encodeResourceToString(patient);

results in:

{"resourceType":"Patient","id":"test-parser","name":[{}]}
<Patient xmlns="http://hl7.org/fhir"><id value="test-parser"/><name/></Patient>

Expected behavior
The parser should ignore Patient.name, as the only value in name[0] is ignored. But the parser is only checking if the name list is empty, or if all values are empty (which it is not).

Screenshots
If applicable, add screenshots to help explain your problem.

Environment (please complete the following information):

  • HAPI FHIR Version: master branch
  • OS: [e.g. iOS]: Win10

Additional context
It looks like it's missing a check here:

for (IBase nextValue : values) {
if (nextValue == null || nextValue.isEmpty()) {
if (nextValue instanceof BaseContainedDt) {
if (theContainedResource || getContainedResources().isEmpty()) {
continue;
}
} else {
continue;
}
}

The EncodeContext contains the information about elements to ignore, but it's not checked at this point. It will only be checked when name[0] is serialized.
Same for the XML parser.

@jamesagnew
Copy link
Collaborator

Agree this is an issue - it would be hard to fix in a way that doesn't hurt performance though. Patches welcomed.

@qligier
Copy link
Author

qligier commented Sep 23, 2024

I can try to make a pull request for that, but my understanding of the parser is pretty low. I probably won't be able to come with an optimized patch.

About performance, what are the important use cases? I guess the performance of the parser without that option enabled is important because it most likely represents the huge majority of its usage.
Is it important to optimize the parser with that option enabled?

@jamesagnew
Copy link
Collaborator

setDontEncodeElements(...) is used extensively within the JPA server internally, so it needs to be fast since it will be a part of the code path for every write to the JPA database.

The main thing I'm worried about is that we're not running huge emptiness checks in cases where they aren't needed. So, e.g. if you've called setDontEncodeElements("Patient.name.family") we'd need some optimized code path that could test all instances of Patient.name and determine if they had a value in family only, skipping outputting even the name key if that was the case. And presumably then we'd have to test again for each individual repetition in the case where one name has other values but the other name has only a family.

This code would get complicated fast, and would need to be aware of extensions too I suppose - E.g. if you exclude Patient.name.family we'd presumably still want to include a patient name if all it had was an extension on the root of the HumanName.

One alternate approach that might conceivably work would be to rework JacksonWriter so that it buffered the output events and only flushed the buffer for a given block once it encountered actual meaningful contents in that block. This too could get complex fast though and would need to be very comprehensively unit tested and performance benchmarked to ensure no regressions occurred.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants