Deadlock when deploying in FastAPI + Gunicorn/Uvicorn + Docker #12746

dandiep · 2023-06-22T19:52:00Z

dandiep
Jun 22, 2023

I've got a FastAPI application where we're using spacy with various transformer models to parse text. Under concurrent load, the spacy (or the transformers spacy uses) seem to deadlock. Every thread gets deadlocked on this same torch.nn.Linear and nothing advances. I can only reproduce this in Docker, I haven't been able to reproduce it on my mac.

Here's a thread dump from pystack:

Traceback for thread 11 (gunicorn) [] (most recent call last):
    (Python) File "/usr/local/lib/python3.9/threading.py", line 937, in _bootstrap
        self._bootstrap_inner()
    (Python) File "/usr/local/lib/python3.9/threading.py", line 980, in _bootstrap_inner
        self.run()
    (Python) File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 807, in run
        result = context.run(func, *args)
    (Python) File "/app/squidgy_parser/endpoints.py", line 75, in hello
        pipeline("Hello")
    (Python) File "/usr/local/lib/python3.9/site-packages/spacy/language.py", line 1026, in __call__
        doc = proc(doc, **component_cfg.get(name, {}))  # type: ignore[call-arg]
    (Python) File "/usr/local/lib/python3.9/site-packages/spacy_transformers/pipeline_component.py", line 192, in __call__
        outputs = self.predict([doc])
    (Python) File "/usr/local/lib/python3.9/site-packages/spacy_transformers/pipeline_component.py", line 229, in predict
        activations = self.model.predict(docs)
    (Python) File "/usr/local/lib/python3.9/site-packages/thinc/model.py", line 315, in predict
        return self._func(self, X, is_train=False)[0]
    (Python) File "/usr/local/lib/python3.9/site-packages/spacy_transformers/layers/transformer_model.py", line 199, in forward
        model_output, bp_tensors = transformer(wordpieces, is_train)
    (Python) File "/usr/local/lib/python3.9/site-packages/thinc/model.py", line 291, in __call__
        return self._func(self, X, is_train=is_train)
    (Python) File "/usr/local/lib/python3.9/site-packages/thinc/layers/pytorchwrapper.py", line 219, in forward
        Ytorch, torch_backprop = model.shims[0](Xtorch, is_train)
    (Python) File "/usr/local/lib/python3.9/site-packages/thinc/shims/pytorch.py", line 92, in __call__
        return self.predict(inputs), lambda a: ...
    (Python) File "/usr/local/lib/python3.9/site-packages/thinc/shims/pytorch.py", line 110, in predict
        outputs = self._model(*inputs.args, **inputs.kwargs)
    (Python) File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
    (Python) File "/usr/local/lib/python3.9/site-packages/transformers/models/roberta/modeling_roberta.py", line 851, in forward
        encoder_outputs = self.encoder(
    (Python) File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
    (Python) File "/usr/local/lib/python3.9/site-packages/transformers/models/roberta/modeling_roberta.py", line 526, in forward
        layer_outputs = layer_module(
    (Python) File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
    (Python) File "/usr/local/lib/python3.9/site-packages/transformers/models/roberta/modeling_roberta.py", line 411, in forward
        self_attention_outputs = self.attention(
    (Python) File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
    (Python) File "/usr/local/lib/python3.9/site-packages/transformers/models/roberta/modeling_roberta.py", line 338, in forward
        self_outputs = self.self(
    (Python) File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
    (Python) File "/usr/local/lib/python3.9/site-packages/transformers/models/roberta/modeling_roberta.py", line 218, in forward
        value_layer = self.transpose_for_scores(self.value(hidden_states))
    (Python) File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
    (Python) File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
        return F.linear(input, self.weight, self.bias)

The code itself is pretty simple. I have a function that loads spacy and caches it in a dict:

def get_spacy_pipeline(language):
    if language not in nlp_pipelines:
        threadLock.acquire()
        try:
            if language not in nlp_pipelines:
                Language.get_language(language)
                
                if language not in models:
                    raise Exception("Unsupported language %s" % language)
                
                logger.info("Loading spacy model for %s" % language)
                nlp = spacy.load(model_dir + f"/spacy-{spacy_version}/{models[language]}/{models[language]}-{spacy_version}", disable=['ner', 'textcat'])
                nlp.add_pipe("emoji", first=True)
                nlp_pipelines[language] = nlp
        finally:
            threadLock.release()

    return nlp_pipelines[language]

Then my FastAPI request is this:

@router.get('/api/{language}/hello', response_model=str)
def hello(language: str) -> str:
    pipeline = get_spacy_pipeline(language)
    pipeline("Hello")
    return f"Hello {language}"

A single request at a time will execute, but if I fire a bunch of requests, I inevitably get the deadlock above.

I've set Gunicorn workers to 1, and the worker type is Uvicorn. I cannot have more than 1 worker because that will create multiple instances of each transformer model and I will run out of memory.

If, I add locks around the spacy code, I don't appear to be able to reproduce the issue. To do that, I create a wrapper:


class SpacyWrapper:
    def __init__(self, spacy):
        self.spacy = spacy

    def __call__(self, *args: Any, **kwds: Any) -> Any:
        threadLock.acquire()
        try:
            return self.spacy(*args, **kwds)
        finally:
            threadLock.release()

    def pipe(self, *args, **kwargs):
        threadLock.acquire()
        try:
            return self.spacy.pipe(*args, **kwargs)
        finally:
            threadLock.release()

and then just do SpacyWrapper(nlp)

Any ideas on what could be going on? Or investigative paths to proceed down?

kadarakos · 2023-06-23T10:31:39Z

kadarakos
Jun 23, 2023

Hey dandiep,

We have a few issues related to the deadlock you are experiencing. Please check out this one and let us know if you need more information: #4667

0 replies

dandiep · 2023-06-23T21:46:07Z

dandiep
Jun 23, 2023
Author

OK, can confirm. torch.set_num_threads(1) made all my problems go away.

@kadarakos can I recommend that this get added to the docs somewhere? I think deploying spacy for inference behind an API isn't that uncommon... I lost a good week figuring out what in the world was going on here and building a reproducible test case.

2 replies

dandiep Jun 23, 2023
Author

Also, forgot to say, thanks for the help! I'm not sure I would have found that otherwise.

svlandeg Jun 26, 2023

Hi Dan, this is in fact mentioned in the docs as a warning: https://spacy.io/usage/processing-pipelines#multiprocessing ;-)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Deadlock when deploying in FastAPI + Gunicorn/Uvicorn + Docker #12746

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Replies: 2 comments 2 replies

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Uh oh!

Deadlock when deploying in FastAPI + Gunicorn/Uvicorn + Docker #12746

Uh oh!

Uh oh!

dandiep Jun 22, 2023

Replies: 2 comments · 2 replies

Uh oh!

kadarakos Jun 23, 2023

Uh oh!

dandiep Jun 23, 2023 Author

Uh oh!

dandiep Jun 23, 2023 Author

Uh oh!

svlandeg Jun 26, 2023

dandiep
Jun 22, 2023

Replies: 2 comments 2 replies

kadarakos
Jun 23, 2023

dandiep
Jun 23, 2023
Author

dandiep Jun 23, 2023
Author