Bug: OPT_SORT_KEYS has no effect on OPT_SERIALIZE_PYDANTIC #312

Closed
awakim opened this issue Nov 25, 2024 · 2 comments

awakim commented Nov 25, 2024

Description

When using ormsgpack with the OPT_SORT_KEYS option in combination with OPT_SERIALIZE_PYDANTIC, the serialized output does not have its keys sorted as expected. As a result, two models that hold equivalent data can produce different serialized bytes, and therefore different hash values, depending on the order in which their fields are declared.

Steps to Reproduce

  1. Define two pydantic models with the same fields but in a different order.
  2. Serialize instances of these models using ormsgpack.packb with the OPT_SORT_KEYS | OPT_SERIALIZE_PYDANTIC options.
  3. Observe that the order of the keys in the serialized output differs, indicating that OPT_SORT_KEYS is not applied.

Reproducible Code

import hashlib
import ormsgpack
import pydantic


class MyModel(pydantic.BaseModel):
    foo: str
    bar: int


class MyModelSorted(pydantic.BaseModel):
    bar: int
    foo: str


def main():
    inst = MyModel(foo="hello", bar=123)
    packed = ormsgpack.packb(inst, option=ormsgpack.OPT_SORT_KEYS | ormsgpack.OPT_SERIALIZE_PYDANTIC)
    print(packed)
    print(hashlib.sha256(packed).hexdigest())

    inst_sorted = MyModelSorted(foo="hello", bar=123)
    packed = ormsgpack.packb(inst_sorted, option=ormsgpack.OPT_SORT_KEYS | ormsgpack.OPT_SERIALIZE_PYDANTIC)
    print(packed)
    print(hashlib.sha256(packed).hexdigest())


if __name__ == "__main__":
    main()

Actual Output

b'\x82\xa3foo\xa5hello\xa3bar{'
635270757b042ccd69da403ce4c6e414ef31d5bd36aa839b23aec504a4f90507
b'\x82\xa3bar{\xa3foo\xa5hello'
4ff8e8bdc8bfbd87c8183b3a0b28f45fc583c72a408be30ad75e84a34d5dad33
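
To make the key order in the two byte strings above easier to read, here is a minimal sketch of a decoder for just the MessagePack types that appear in them (fixmap, fixstr, positive fixint). The function name `decode_flat_map` is made up for this illustration and is not part of ormsgpack; it handles nothing beyond these three types.

```python
def decode_flat_map(data: bytes) -> list[tuple[object, object]]:
    """Decode a msgpack fixmap holding only fixstr / positive fixint items.

    Returns the (key, value) pairs in the order they appear on the wire.
    """
    pos = 0

    def read_item():
        nonlocal pos
        tag = data[pos]
        pos += 1
        if tag <= 0x7F:
            # positive fixint: the tag byte is the value itself
            return tag
        if 0xA0 <= tag <= 0xBF:
            # fixstr: the low 5 bits give the length in bytes
            length = tag & 0x1F
            text = data[pos:pos + length].decode("utf-8")
            pos += length
            return text
        raise ValueError(f"unsupported msgpack tag: {tag:#x}")

    header = data[pos]
    pos += 1
    if not 0x80 <= header <= 0x8F:
        raise ValueError("expected a fixmap")
    count = header & 0x0F  # fixmap: the low 4 bits give the number of pairs

    pairs = []
    for _ in range(count):
        key = read_item()
        value = read_item()
        pairs.append((key, value))
    return pairs


# The two byte strings from the "Actual Output" section.
my_model_bytes = b'\x82\xa3foo\xa5hello\xa3bar{'
my_model_sorted_bytes = b'\x82\xa3bar{\xa3foo\xa5hello'

print(decode_flat_map(my_model_bytes))
print(decode_flat_map(my_model_sorted_bytes))
```

Both payloads carry the same two pairs, but in field declaration order rather than sorted order. The second one is alphabetical only because MyModelSorted declares `bar` before `foo`.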

Expected Output

The serialized output for both instances should have keys in sorted order, as specified by the OPT_SORT_KEYS option. Therefore, the output and the resulting hash should be identical for both models.

Environment

•	ormsgpack version: 1.6.0
•	pydantic version: 2.9.2
•	Python version: 3.11.9
•	Operating System: Ubuntu 22.04.5 LTS (jammy)

Additional Context

The issue seems to occur only when OPT_SORT_KEYS is used alongside OPT_SERIALIZE_PYDANTIC. When OPT_SORT_KEYS is used independently, it works as expected.

This bug may lead to inconsistent behavior when using hashed serialized data for caching or deduplication. Please investigate and resolve this issue to ensure compatibility between the OPT_SORT_KEYS and OPT_SERIALIZE_PYDANTIC options.

Collaborator

exg commented Nov 26, 2024

Hi, that is expected as OPT_SORT_KEYS applies only to dict objects and is documented as such. I opened #316 to support it also for pydantic models.

Author

awakim commented Nov 27, 2024

Thank you for this!

@exg exg closed this as completed Dec 8, 2024