@LucasEby LucasEby commented Oct 1, 2025

Fixes #24804

Motivation

In ProtobufNativeSchemaTest.testSchema, neither the schema JSON nor the contents of its FileDescriptorSet field have a deterministic order, but the hardcoded string assertion assumes one. The JSON serializer does not guarantee attribute order, and the entries inside FileDescriptorSet can also appear in different orders, because different generation paths or environments may emit them differently even though the logical content is the same. Since the original test compared the raw strings/trees "as-is", harmless reordering could flip the test from pass to fail without any real schema change.

Modifications

We no longer compare raw strings/trees "as-is". Instead, the fileDescriptorSet field is base64-decoded into a FileDescriptorSet and then converted into a JsonNode. We compare the expected and actual descriptors with JSONAssert in NON_EXTENSIBLE mode, which ignores field ordering but forbids missing or extra fields. After replacing the original fileDescriptorSet values with the normalized JSON trees, we compare the entire schema JSON (including fileDescriptorSet) with the same order-agnostic but non-extensible assertion.

This two-stage approach is necessary because JSONAssert does not look inside JSON string values, and the original fileDescriptorSet value is interpreted as exactly such a string, so ordering differences within it are never normalized automatically. By decoding and normalizing fileDescriptorSet first, we ensure the test remains robust to nondeterministic ordering inside the descriptor set, while still validating the schema structure strictly. This change keeps the spirit of the original test while eliminating failures caused solely by allowed reordering.
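
The two-stage idea can be illustrated with a minimal, JDK-only sketch. It is not the code in this PR and uses neither protobuf nor JSONAssert: the base64 payload here is a hypothetical stand-in (newline-separated file names) for the serialized FileDescriptorSet, and the class name, helper names, and sample values are assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TwoStageComparisonSketch {

    // Hypothetical stand-in for a serialized descriptor set:
    // newline-separated entries, base64-encoded into one opaque string.
    static String encode(String... entries) {
        String joined = String.join("\n", entries);
        return Base64.getEncoder().encodeToString(joined.getBytes(StandardCharsets.UTF_8));
    }

    // Stage 1: decode the opaque string into an order-insensitive form.
    static Set<String> decode(String base64) {
        String joined = new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
        return new TreeSet<>(Arrays.asList(joined.split("\n")));
    }

    // Stage 2 preparation: replace the opaque field with its normalized form.
    static Map<String, Object> normalize(Map<String, String> schema) {
        Map<String, Object> copy = new HashMap<>(schema);
        copy.put("fileDescriptorSet", decode(schema.get("fileDescriptorSet")));
        return copy;
    }

    // Map.equals ignores key order but fails on missing or extra keys,
    // which loosely mirrors an order-agnostic, non-extensible comparison.
    static boolean sameSchema(Map<String, String> expected, Map<String, String> actual) {
        return normalize(expected).equals(normalize(actual));
    }

    public static void main(String[] args) {
        // Sample values are illustrative, not taken from the real test.
        Map<String, String> expected = Map.of(
                "rootMessageTypeName", "proto.TestMessage",
                "fileDescriptorSet", encode("a.proto", "b.proto"));
        Map<String, String> actual = Map.of(
                "rootMessageTypeName", "proto.TestMessage",
                "fileDescriptorSet", encode("b.proto", "a.proto"));

        System.out.println(expected.equals(actual));      // false: the encoded strings differ
        System.out.println(sameSchema(expected, actual)); // true: same content after decoding
    }
}
```

Comparing the maps directly fails because the nested value is an opaque string whose bytes depend on entry order. Decoding it first removes that dependence, while a missing or extra top-level key still makes the comparison fail.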

Verifying this change

  • Make sure that the change passes the CI checks.

This change is already covered by existing tests, such as org.apache.pulsar.client.impl.schema.ProtobufSchemaTest#testSchema.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: LucasEby#1

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Oct 1, 2025
@LucasEby LucasEby force-pushed the testSchemaNondeterministic branch from 3146821 to 1fbb84e Compare October 1, 2025 17:52
@LucasEby LucasEby force-pushed the testSchemaNondeterministic branch 2 times, most recently from 6b132ab to d39852c Compare October 3, 2025 16:57
@LucasEby LucasEby force-pushed the testSchemaNondeterministic branch from d39852c to c84ac39 Compare October 4, 2025 00:15
Development

Successfully merging this pull request may close these issues.

[Bug] ProtobufNativeSchemaTest.testSchema order-independent
2 participants