
Prediction is not thread-safe #782

Open
knkski opened this issue Aug 16, 2022 · 4 comments
Labels: bug (Something isn't working)

knkski commented Aug 16, 2022

Describe the bug
I have a Flask app with an endpoint that looks something like this, where model is a FullBayesianForecaster:

@app.route("/predict", methods=["POST"])
def predict():
    df = get_df()
    result = model.predict(df, seed=1234)
    return result.to_dict(orient="records")

I tested it locally with curl and it worked great: the Flask app returned the same predictions for the same input. I then deployed it to a test environment, and another service that talks to it immediately started getting non-deterministic results for requests with identical parameters. When I checked with another curl request against the test environment, I got the same result back every time, as expected.

After scratching my head a bit, I tried serializing the call with threading.Lock like this:

lock = threading.Lock()  # created once, shared across all request threads

with lock:
    result = model.predict(df, seed=1234)

With the lock in place, the non-determinism went away. This leads me to believe that model prediction is not thread-safe. I haven't yet dug down far enough to know for sure what's causing the issue, but this seems to be at least one likely culprit:

https://github.com/uber/orbit/blob/27371ec/orbit/forecaster/full_bayes.py#L96-L97

If multiple threads call model.predict concurrently and race on the value of self._prediction_meta here, that seems likely to make predictions non-deterministic:

https://github.com/uber/orbit/blob/c232980/orbit/forecaster/forecaster.py#L389
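To illustrate the suspected failure mode, here is a minimal sketch. The class and names are hypothetical stand-ins, not orbit's actual code; a Barrier plays the role of unlucky scheduling so the race shows up deterministically. The key pattern is the same as the linked lines: predict() writes per-call state to a shared instance attribute before reading it back.

```python
import threading

class ToyForecaster:
    """Toy stand-in (hypothetical) for a forecaster whose predict()
    stashes per-call state on the instance before using it."""

    def __init__(self):
        # Barrier(2) forces both threads to finish writing before either
        # reads, making the race deterministic for demonstration purposes.
        self._barrier = threading.Barrier(2)

    def predict(self, x):
        self._prediction_meta = x   # shared mutable instance state
        self._barrier.wait()        # both writes land before any read
        return self._prediction_meta * 2

model = ToyForecaster()
results = {}

def worker(x):
    results[x] = model.predict(x)

threads = [threading.Thread(target=worker, args=(x,)) for x in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads read whichever write landed last, so they return the same
# value and at least one of them is wrong (correct output would be {1: 2, 2: 4}).
print(results)  # e.g. {1: 4, 2: 4} or {1: 2, 2: 2}
```

Under a lock (or with one model instance per thread), each call would see only its own write and the results would be 2 and 4 respectively.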

To Reproduce

Call model.predict from multiple threads with different parameters, setting seed=a_fixed_number.

Expected behavior

Calling model.predict with seed=a_fixed_number should return deterministic predictions even when multiple threads are calling it.

Screenshots

N/A

Environment (please complete the following information):

  • OS: Ubuntu
  • Python Version: 3.9
  • Versions of Major Dependencies: pandas==1.3.5, scikit-learn (not installed), cython==0.29.30

Additional context

N/A

@knkski knkski added the bug Something isn't working label Aug 16, 2022
@edwinnglabs (Collaborator):

@knkski are you saying you get the same result across threads even with seed=None?


knkski commented Aug 19, 2022

@edwinnglabs I haven't tested that; I'd assume I'd get different results each time in that case. This issue arose after I set the random seed, assuming that would fix the non-deterministic predictions I was seeing. Since the thread lock gave me deterministic predictions, I didn't try using the lock without a fixed seed.


edwinnglabs commented Aug 19, 2022

So you want to see the same behavior on each thread where you call model.predict? If so, passing seed=a_fixed_number as you described should work. Please let me know if I understand the problem correctly...


knkski commented Aug 22, 2022

Yeah, I'm looking to get deterministic predictions on every thread, given that I pass in seed=a_fixed_number. Right now I'm not seeing that: calling from multiple threads returns non-deterministic predictions even with seed=a_fixed_number, and that's what I think the bug is.
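For anyone hitting this in the meantime, the lock workaround can be wrapped once so every caller goes through it. This is a hypothetical sketch (SerializedPredictor is not part of orbit), and it trades throughput for correctness by serializing all predict calls behind one lock:

```python
import threading

class SerializedPredictor:
    """Hypothetical wrapper: serialize all calls to a model whose
    predict() mutates shared instance state, so callers can't race."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def predict(self, *args, **kwargs):
        # Only one thread at a time may run the underlying predict().
        with self._lock:
            return self._model.predict(*args, **kwargs)

# Usage with a dummy model standing in for FullBayesianForecaster:
class DummyModel:
    def predict(self, x, seed=None):
        return [x, seed]

safe_model = SerializedPredictor(DummyModel())
print(safe_model.predict(3, seed=1234))  # [3, 1234]
```

An alternative that avoids the lock (and its throughput cost) is to give each worker thread or process its own model instance, so there is no shared state to race on.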
