Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Mashumaro JSONEncoder/JSONDecoder behavior differs with to_json/from_json. #5588

Closed
2 tasks done
JackUrb opened this issue Jul 24, 2024 · 2 comments · Fixed by flyteorg/flytekit#2613
Closed
2 tasks done
Assignees
Labels
bug Something isn't working

Comments

@JackUrb
Copy link

JackUrb commented Jul 24, 2024

Describe the bug

Hi Team! We recently pulled master into our fork (where we're trying to find+fix bugs related to our use case), and found that a recent change (flyteorg/flytekit#2554) broke expected behavior. I'm very much a fan of not requiring the use of these mixins, but surprisingly the behavior of JSONEncoder(...).encode() ends up different from to_json. This may be considered a mashumaro bug (and I've opened Fatal1ty/mashumaro#239 there as well), but it breaks existing flytekit behavior.

This relates to the ability of mashumaro to encode/decode classes that use the Discriminator pattern in order to serialize/deserialize subclasses of a generic class. to_json correctly encodes to the target subclass, while JSONEncoder(BaseClass).encode() only encodes to the base class.

Expected behavior

Subclass encoding/decoding should still be able to expand to subclasses correctly. As such, if to_json and from_json exist they should be preferred over the JSONEncoder and JSONDecoder approaches.

Additional context to reproduce

# example_to_json.py
from dataclasses import dataclass
from mashumaro.mixins.json import DataClassJSONMixin
from mashumaro.codecs.json import JSONEncoder, JSONDecoder
from mashumaro.config import BaseConfig
from mashumaro.types import Discriminator

from enum import StrEnum, auto

class SubclassTypes(StrEnum):
    BASE = auto()
    CLASS_A = auto()
    CLASS_B = auto()

@dataclass(kw_only=True)
class BaseClass(DataClassJSONMixin):
    class Config(BaseConfig):
        discriminator = Discriminator(
            field="subclass_type",
            include_subtypes=True,
        )

    subclass_type: SubclassTypes = SubclassTypes.BASE
    base_attribute: int


@dataclass(kw_only=True)
class ClassA(BaseClass):
    subclass_type: SubclassTypes = SubclassTypes.CLASS_A
    class_a_attribute: str


@dataclass(kw_only=True)
class ClassB(BaseClass):
    subclass_type: SubclassTypes = SubclassTypes.CLASS_B
    class_b_attribute: float


if __name__ == "__main__":
    class_a = ClassA(
        base_attribute=1,
        class_a_attribute="a"
    )
    class_b = ClassB(
        base_attribute=2,
        class_b_attribute=2.0
    )
    print("Encoder(a)", JSONEncoder(BaseClass).encode(class_a))
    print("a.to_json()", class_a.to_json())

    print("Encoder(b)", JSONEncoder(BaseClass).encode(class_b))
    print("b.to_json()", class_b.to_json())

Outputs:

Encoder(a) {"subclass_type": "class_a", "base_attribute": 1}
a.to_json() {"subclass_type": "class_a", "base_attribute": 1, "class_a_attribute": "a"}
Encoder(b) {"subclass_type": "class_b", "base_attribute": 2}
b.to_json() {"subclass_type": "class_b", "base_attribute": 2, "class_b_attribute": 2.0}

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@JackUrb JackUrb added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Jul 24, 2024
Copy link

runllm bot commented Jul 24, 2024

The issue you're encountering seems to be related to the behavior of JSONEncoder and JSONDecoder from the Mashumaro library when dealing with subclass encoding/decoding. Specifically, the to_json and from_json methods handle subclass serialization correctly, while JSONEncoder(BaseClass).encode() does not.

To address this, you can ensure that Flytekit prefers the to_json and from_json methods over the JSONEncoder and JSONDecoder approaches when they exist. This will maintain the expected behavior for subclass encoding/decoding.

Would you like more details on how to implement this fix or further insights into similar reported problems?

Sources:


This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @RunLLM to ask follow-up questions. See runllm.com for more!

@Future-Outlier
Copy link
Member

But this works, right?

    print("Encoder(a)", JSONEncoder(ClassA).encode(class_a))
    print("a.to_json()", class_a.to_json())

    print("Encoder(b)", JSONEncoder(ClassB).encode(class_b))
    print("b.to_json()", class_b.to_json())

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

4 participants