Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Context is not thread-safe (at least) for nested scheme #1617

Open
hstadler opened this issue Jun 19, 2020 · 5 comments · May be fixed by #1833
Open

Context is not thread-safe (at least) for nested scheme #1617

hstadler opened this issue Jun 19, 2020 · 5 comments · May be fixed by #1833
Labels

Comments

@hstadler
Copy link

I'm developing a web application with Flask, having many marhsmallow schema classes with nested schemes. I also make use of marshmallows context.
Sometimes (e.g. first run of the application after a reboot) I noticed unexpected serialization results. It turned out that the context in one of the nested schemas differed from it's parent.

A simplified example, which shows the race condition for concurrent requests with a nested scheme:

from datetime import datetime
from flask import Flask
from marshmallow import Schema, fields, post_dump
import time
app = Flask(__name__)


class TestNestSchema(Schema):
    name = fields.String()

    @post_dump()
    def context_dump(self, data, **kwargs):
        data['before'] = self.context.get('test')
        # Do some work
        time.sleep(5)
        data['after'] = self.context.get('test')
        return data


class TestSchema(Schema):
    name = fields.String()
    nested = fields.Nested(TestNestSchema())


class TestNestedClass:
    name = 'TestNestedClass'


class TestClass:
    name = 'TestClass'
    nested = TestNestedClass()


@app.route('/')
def test_endpoint():
    test_schema = TestSchema(context={'test': str(datetime.now())})
    dump = test_schema.dump(TestClass())
    return {'data': dump}


if __name__ == "__main__":
    app.run()

When troubleshooting, I also took a brief look at the marhsmallow code and got the feeling that it wasn't just the context that might cause such problems.
Is there a way to use marhsmallow thread-safe?

@AndreiPashkin
Copy link

@hstadler, I'm not able to reproduce the race-condition. Could you please provide a more complete example?

@AndreiPashkin
Copy link

Nevermind. @hstadler, race-condition in your case if easily fixable by changing nested = fields.Nested(TestNestSchema()) to nested = fields.Nested(TestNestSchema).

@lafrech
Copy link
Member

lafrech commented Jun 14, 2021

Indeed, using a class rather than an instance should be safe. But it has the limitation that it doesn't allow the user to parametrize the nested schema with modifiers by instantiating it in the schema definition.

When using an instance, it seems the context is not replaced but updated.

if isinstance(nested, SchemaABC):
self._schema = copy.copy(nested)
self._schema.context.update(context)

    def test_context_thread_safe(self):
        class A(Schema):
            a = fields.Integer()

        class B(Schema):
            n = fields.Nested(A())  # <-- instance

        sb = B()
        sb.context = {"test_1": 1}
        sb.dump({"n": {"a": 1}})
        sb2 = B()
        sb2.context = {"test_2": 1}
        sb2.dump({"n": {"a": 1}})

In the latter case, the context is {"test_1": 1, "test_2": 1}.

I don't see the reason for the update, here. I'd change this to

            if isinstance(nested, SchemaABC):
                self._schema = copy.copy(nested)
                self._schema.context = context

I think this qualifies as a bug and could be fixed by the change above. IIUC, it only affects people using nested schema instances and contexts with different keys, such that the update differs from the assignment. (E.g. passing, {"user_id": xxx} should be safe because the next context overrides the old one.)

@sloria @deckar01 WDYT? Bug? CVE?

I'm proposing a change for marshmallow 4 that would solve this in a better way : #1826.

@sloria
Copy link
Member

sloria commented Jun 19, 2021

@lafrech Thanks for investigating. I think your proposed fix is correct. Do you have time to run with the fix/release?

@lafrech
Copy link
Member

lafrech commented Jun 22, 2021

I think the update in the original code is meant to allow a parent to update the nested schema context (as in #1791).

The fix I propose breaks this. I don't see any safe way to allow updating while being thread-safe but this might be a concern.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants