Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Context and instantiated Netsted schemas. #1791

Open
Thomas-Z opened this issue Apr 11, 2021 · 4 comments · May be fixed by #1833
Open

Context and instantiated Netsted schemas. #1791

Thomas-Z opened this issue Apr 11, 2021 · 4 comments · May be fixed by #1833

Comments

@Thomas-Z
Copy link

Thomas-Z commented Apr 11, 2021

I like to use schema context to transmit information from one schema to another (child or parent) and the following behavior seems a little strange to me.

import marshmallow as mrsh

class SchemaA(mrsh.Schema):
    a = mrsh.fields.Integer()

    @mrsh.pre_load
    def _pre_load(self, data, **_):
        self.context["from_a"] = "a"

        return data

class SchemaB1(mrsh.Schema):
    ac = mrsh.fields.Nested(SchemaA)

    @mrsh.pre_load
    def _pre_load(self, data, **_):
        self.context["from_b1"] = "b1"

        return data

class SchemaB2(mrsh.Schema):
    ac = mrsh.fields.Nested(SchemaA())

    @mrsh.pre_load
    def _pre_load(self, data, **_):
        self.context["from_b2"] = "b2"

        return data

The only difference between SchemaB1 and SchemaB2 is that I provide the SchemaA class to the first one and an instance of SchemaA to the other.
This is the kind of things I usually do to provide a specific context to SchemaA when generated from SchemaB.
Something like that : ac = mrsh.fields.Nested(SchemaA(context={"x": 1))

sb1 = SchemaB1()
sb2 = SchemaB2()

sb1.load({"ac": {"a": 5}})
sb2.load({"ac": {"a": 5}})

print(sb1.context)
>> {'from_b1': 'b1', 'from_a': 'a'}

print(sb2.context)
>> {'from_b2': 'b2'}

Using an instance of SchemaA result in the context not being fully shared between the Nested schema and its parent.

sb1.fields["ac"].schema.context is sb1.context
>> True

Same object if using a class.

sb2.fields["ac"].schema.context is sb2.context
>> False

Different objects if using an instance.

sb2.fields["ac"].schema.context
>> {'from_b2': 'b2', 'from_a': 'a'}

SchemaA correctly inherited from its parent context.

I'm not sure if that's a bug or not but that behavior somehow feels inconsistent to me (or at least hard to predict).

Is this the expected behavior ?

Thanks for your great library!

# Name                    Version                   Build  Channel
marshmallow               3.11.1             pyhd3eb1b0_0    defaults
python                    3.8.5                h7579374_1    defaults
@Thomas-Z Thomas-Z changed the title Context and instantiated nNetsted schemas. Context and instantiated Netsted schemas. Apr 11, 2021
@lafrech
Copy link
Member

lafrech commented Jun 22, 2021

Thanks for reporting and sorry about the delay.

Indeed there is an inconsistency. See related discussion in #1617 and an implementation proposal for marshmallow 4 in #1826.

What would you expect?

  • {'from_b2': 'b2', 'from_a': 'a'}: parent updates nested schema context
  • {'from_a': 'a'} : parent overrides nested schema context

The former brings thread-safety issues, so I'm willing to chose the latter. This is what I do in #1833.

The update case is something I didn't think about when proposing #1826.

@Thomas-Z
Copy link
Author

Thomas-Z commented Jun 26, 2021

I'm not sure to fully understand your proposal and its implications but here is what I understand/expect.

I don't know what the original intent was when designing the "context" attribute but I use it a lot to provide different contexts to nested schemas:

import marshmallow as mm

DO_SOMETHING = "DO_SOMETHING"


class SchemaA(mm.Schema):
    a = mm.fields.Integer(required=True)

    @mm.post_load
    def _post_load(self, data, **_):
        do_smt = self.context.get(DO_SOMETHING, False)

        if do_smt:
            data["a"] = data["a"] * 4

        return data


class SchemaB(mm.Schema):
    ac1 = mm.fields.Nested(SchemaA(context={DO_SOMETHING: True}))
    ac2 = mm.fields.Nested(SchemaA(context={DO_SOMETHING: False}))
sch = SchemaB()

data = {"ac1": {"a": 1}, "ac2": {"a": 1}}

sch.load(data)
>> {'ac1': {'a': 4}, 'ac2': {'a': 1}}
sch = SchemaB(context={DO_SOMETHING: True})

data = {"ac1": {"a": 1}, "ac2": {"a": 1}}

sch.load(data)
>> {'ac1': {'a': 4}, 'ac2': {'a': 4}}

Both of these results seem consistent to me and are what I would expect.
I think your proposal is breaking this kind of things.

In my initial example, it was not about parents updating nested schema's context but nested schema updating their parents' context.
This behavior seems dangerous and hardly predictable.

The way I see it, each nested schema's context should inherit from its parent's context by making a copy of it (but not sharing it nor having the possibility to update it) and add its own context elements the way it's done today.

@lafrech
Copy link
Member

lafrech commented Jul 21, 2021

I don't know the original intent for this feature either. I've never used it.

To me, the point is to allow modifying the behaviour at runtime.

Your sample code uses it to modify the schema at init (import) time. This could easily be achieved by just subclassing Schema.

class SchemaA(mm.Schema):
    def __init__(self, do_smt=False, *args, **kwargs):
        self.do_smt = do_smt

    a = mm.fields.Integer(required=True)

    @mm.post_load
    def _post_load(self, data, **_):
        if self.do_smt:
            data["a"] = data["a"] * 4
        return data

class SchemaB(mm.Schema):
    ac1 = mm.fields.Nested(SchemaA(do_smt=True}))
    ac2 = mm.fields.Nested(SchemaA())

Is this correct? Or was your example too contrived and it wouldn't work in your real use case?

The rework I propose in #1826 indeed breaks your use case but I believe this is fine as you should be able to achieve what you want like I'm showing above and context would be used only to modify things at runtime, not init.


The question that remains is what to do about marshmallow 3. The fix in #1833 also breaks your use case. But it fixes a thread-safety issue that can catch users by surprise. We could argue that the way you're using it is not the way it is documented in the docs but to be fair, the doc for this feature is a bit light. We could also argue that the breaking change is likely to be caught in unit tests while the thread-safety issue is silent. We could merge the fix raise a warning when initiating a schema with a context in a Nested field. But I still don't like the idea of a breaking change in 3.x.

Or we go the other way, don't merge the fix but raise a warning to alert about the thread-safety issue. No breaking change in marshmallow 3.x. Users will have to use a schema class rather than an instance to avoid the thread-safety issue, which means they won't be able to use schema modifiers while passing a context. We could live with this corner case limitation until 4.0.

@sloria @deckar01, thoughts?

@Thomas-Z
Copy link
Author

Thomas-Z commented Aug 1, 2021

This is correct and looks cleaner than the way I'm doing it.

I think (not 100% sure) this would allow me to not use contexts this way anymore and not be too much affected by the #1833 fix.

Changing your code to this

import marshmallow as mm

DO_SOMETHING = "DO_SOMETHING"

class SchemaA(mm.Schema):
    def __init__(self, do_smt=False, *args, **kwargs):
        super().__init__(*args, **kwargs)

        self.do_smt = do_smt

    a = mm.fields.Integer(required=True)

    @mm.post_load
    def _post_load(self, data, **_):
        do_smt = self.context.get(DO_SOMETHING, self.do_smt)

        if do_smt:
            data["a"] = data["a"] * 4

        return data


class SchemaB(mm.Schema):
    ac1 = mm.fields.Nested(SchemaA(do_smt=True))
    ac2 = mm.fields.Nested(SchemaA())

Would still allow the following usage:

sch = SchemaB(context={DO_SOMETHING: True})

data = {"ac1": {"a": 1}, "ac2": {"a": 1}}

sch.load(data)
>> {'ac1': {'a': 4}, 'ac2': {'a': 4}}

#1833 would also fix the inconsistency I mentioned in my initial message.

Like you, I'm not sure a breaking change in 3.x would be a good thing (except if I'm the only one using contexts like this, I wouldn't mind).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants