Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race Conditions in Marshmallow - Context Not thread safe? #1825

Closed
alnikolaj opened this issue Jun 11, 2021 · 3 comments
Closed

Race Conditions in Marshmallow - Context Not thread safe? #1825

alnikolaj opened this issue Jun 11, 2021 · 3 comments

Comments

@alnikolaj
Copy link

Hello everybody,

I am having a serious problem. We are serializing a list of objects up to 100 objects at a time/per request and we have some custom field methods.

In these field methods, we pass the user_id in the context and perform some queries specific to that user_id. Now... I have noticed that sometimes this extra object field is wrong and after much investigation I found out that in some cases (within the list of objects in the same request!) the user_id was wrong and suddenly changed. The user_id passed through the context changed mid serialization of these 100 objects... I checked this "wrong" user and saw that his/her last activity was the same as mine what leads me to the conclusion that for some reason Marshmallow is not thread-safe and there is a race condition? Here is an example of my code:

# Schema
class MySchema(SQLAlchemyAutoSchema):
    class Meta:
        model = Object

    extra = fields.Method('custom_field', dump_only=True)

    def custom_field(self, obj):
        user_id = self.context.get('user_id')

        # User Favorites
        favorited = False
        in_favorites = Favorite.query.filter_by(user_id = user_id).first()
        if in_favorites:
            favorited = True
        
        return {
            'user_id' : user_id, 
            'favorited' : favorited
        }

# API
objects = Object.query.all()
user = User.query.get(id=id)

many_schema = MySchema(many=True)
many_schema.context['user_id'] = user.id
return many_schema.dump(objects)

I am unsure how to proceed now. I would need to fix this as soon as possible.

Thanks in advance.

@lafrech
Copy link
Member

lafrech commented Jun 11, 2021

I suspect the same schema instance is shared among all requests. This shouldn't happen if the view is passed a schema class and instantiates the schema on each request.

Please post a sample code to show the way you use the schema in the view function.

@alnikolaj
Copy link
Author

Example:

#api.py
from flask_classful import FlaskView
from app.models import User, Object
from serializer.py import MySchema

schema = MySchema()

class ObjectListAPI(FlaskView):
    def get(self):
        # Query User
        user_id = request.args.get('user_id')
        user_obj = User.query.get(user_id)
        objects = Object.query.all()

        schema.context['user_id'] = user.id
        return schema.dump(objects)

I had the MySchema initialisation outside of the Get Function / FlaskView, which might have caused the problem?

@lafrech
Copy link
Member

lafrech commented Jun 14, 2021

Exactly.

This is not exactly a marshmallow thread-safety issue. If a object instance is shared by multiple threads, setting an attribute in one thread modifies the object for the other threads as well.

(Things can get tricky with nested schemas, though: #1617.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants