Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suppress warning about Union type #221

Open
marcelveldt opened this issue May 10, 2024 · 4 comments
Open

Suppress warning about Union type #221

marcelveldt opened this issue May 10, 2024 · 4 comments

Comments

@marcelveldt
Copy link

  • mashumaro version: 3.13
  • Python version: 3.12
  • Operating System: Debian

Description

Since the 3.13 release a warning is being logged (on all user setups as well!) about coercing a value within a Union type.
This worked perfectly well for the last couple of years so now I'm confused why this change is needed.
If needed, is there a way to override or do you have tips how I could treat this differently ? A custom rule perhaps ?

UserWarning: music_assistant.common.models.config_entries.ConfigEntry.value (typing.Union[str, int, float, bool, tuple[int, int], list[str], list[int], list[tuple[int, int]], None]): In the next release, data marked with Union type containing 'str' and 'bool' will be coerced to the value of the type specified first instead of passing it as is

Reference to the model in my code:
https://github.com/music-assistant/server/blob/main/music_assistant/common/models/config_entries.py#L46

Amazing projects btw, love it and been on board since the very beginning. It has grown into my goto recommendation for serialization libs.

What I Did

Paste the code, command(s) you ran and the output.
If there was a crash, please include the traceback here.
@MichelleArk
Copy link

We're also seeing a similar warning in dbt-core, from these two lines:

UserWarning: dbt.contracts.graph.unparsed.UnparsedMetricTypeParams.expr (typing.Union[str, bool, None]): In the next release, data marked with Union type containing 'str' and 'bool' will be coerced to the value of the type specified first instead of passing it as is

Also curious what the recommended workaround here would be

@Fatal1ty
Copy link
Owner

Fatal1ty commented May 27, 2024

now I'm confused why this change is needed.

In short, this is closely related to the following issue:

When you define a field of type str or bool, the value is passed through as is during deserialization. So, if your Union type contains str or bool on the left, it is equivalent to having Any there instead. This is a historically accepted design decision to do this, due to the fact that, as it seemed to me, validation would not be needed, but over time I began to understand that such a clumsy method could easily lead to errors.

I added this warning just to check how many people would be affected by this change, but I didn't foresee that there would be so many of them. Now I'm going to take time out to come up with a smarter way to deserialize a Union containing str and bool. I'd like to thank all those who made me understand that it is better not to make destructive changes right now.

Right now you can remove this warning using custom deserialization methods for str and bool or for the the specific Union types:

@dataclass
class Foo(DataClassDictMixin):
    foo: str | bool | date

    class Config(BaseConfig):
        serialization_strategy = {
            bool: {"deserialize": ...},
            str: {"deserialize": ...},
        }

Another way it to keep the default implementation and suppress the warning using warnings.filterwarnings:

import warnings

warnings.filterwarnings('ignore', category=UserWarning, module='mashumaro')

I will leave this issue open for now to collect more feedback and discuss the behavior with Union. I'd love to hear about your deserialization expectations for the following examples:

@dataclass
class Foo(DataClassDictMixin):
    foo: Union[str, int]

Foo.from_dict({"foo": "1"})
Foo.from_dict({"foo": 1})
@dataclass
class Foo(DataClassDictMixin):
    foo: Union[int, str]

Foo.from_dict({"foo": "1"})
Foo.from_dict({"foo": 1})
@dataclass
class Foo(DataClassDictMixin):
    foo: Union[str, date]

Foo.from_dict({"foo": "2024-05-28"})
@dataclass
class Foo(DataClassDictMixin):
    foo: Union[date, str]

Foo.from_dict({"foo": "2024-05-28"})
@dataclass
class Foo(DataClassDictMixin):
    foo: Union[bool, int]

Foo.from_dict({"foo": 1})
Foo.from_dict({"foo": True})
Foo.from_dict({"foo": "1"})
@dataclass
class Foo(DataClassDictMixin):
    foo: Union[int, bool]

Foo.from_dict({"foo": 1})
Foo.from_dict({"foo": True})
Foo.from_dict({"foo": "1"})

@marcelveldt
Copy link
Author

marcelveldt commented May 28, 2024

Thanks for the detailed answer and the recommended workarounds.
I understand the reasoning and at the same time I think I also can give you the answer to your question about expectations.

To me at least, I don't expect mashumaro to automatically coerce values into a type. That would be a nice to have but only if everything else failed (and even then only as option).

So, some examples;

foo: Union[int, str]
Foo.from_json({"foo": "1"}). --> "1" and not 1
Foo.from_json({"foo": 1}) --> 1 and not "1"

foo: Union[int, bool]
Foo.from_json({"foo": "1"}). --> ValueError
Foo.from_json({"foo": 1}) --> 1 and not True
Foo.from_json({"foo": "true"}) --> ValueError
Foo.from_json({"foo": true}) --> True

Maybe, just maybe, accept coercing only for Non-union types with some config parameter.

@tlento
Copy link

tlento commented Jun 2, 2024

I came here to comment on the destructive coercion described in the warning, so I'm glad to see this discussion and it's wonderful that Mashumaro will back off of that approach.

To me at least, I don't expect mashumaro to automatically coerce values into a type.

I heartily agree with this sentiment - Mashumaro is a broadly used serialization/deserialization library, and the decision of how to handle Union input types in particular and type coercion more broadly is an application level concern. The way I think about it is to imagine this field:

is_permission_granted: Union[bool, str]

Or this one:

is_permission_granted: bool

What should happen to is_permission_granted values of "False", "0", "F", "falsa", or "거짓"(which Google assures me is Korean for "false")?

In my opinion the answer in all cases, at least from Mashumaro's perspective, is "how the heck should I know?" Presumably someone somewhere in the call stack knows, because they understand the expected boundaries of whatever produced this input value, so the best thing we can do for them is deserialize to type without coercion and let them switch through their union type values as they normally would. In the boolean primitive case, everybody involved would be best served by throwing an informative error that lets them make the appropriate fix for their use case.

I understand this may be tricky to deal with, especially if coercion of incorrectly (or ambiguously) typed primitive inputs has been the norm, but it seems to me like the simplest and most robust end stage to target.

Continuing to allow for custom coercion directives is important. The key is the user's model config clearly shows what is meant to happen within their application, and Mashumaro doesn't need to make any hidden assumptions about how to handle off-type values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants