
Prediction is not thread-safe #782

Open
knkski opened this issue Aug 16, 2022 · 4 comments
Labels: bug (Something isn't working)

knkski commented Aug 16, 2022

Describe the bug
I have a Flask app with an endpoint that looks something like this, where model is a FullBayesianForecaster:

@app.route("/predict", methods=["POST"])
def predict():
    df = get_df()
    result = model.predict(df, seed=1234)
    return result.to_dict(orient="records")

I tested it locally with curl and it worked great: the Flask app returned the same predictions for the same input. I then deployed it to a test environment, and another service that talks to it immediately started getting non-deterministic results for requests with identical parameters. When I checked with another curl request against the test environment, I got the same result back every time, as expected.

After scratching my head a bit, I tried serializing the call with threading.Lock like this:

lock = threading.Lock()  # created once, shared across all request threads

with lock:
    result = model.predict(df, seed=1234)

With the lock in place, the non-determinism went away. This leads me to believe that model prediction is not thread-safe. I haven't yet dug down far enough to know for sure what's causing the issue, but this seems to be at least one likely culprit:

https://github.com/uber/orbit/blob/27371ec/orbit/forecaster/full_bayes.py#L96-L97

If multiple threads call model.predict concurrently and race on the value of self._prediction_meta here, that seems likely to make predictions non-deterministic:

https://github.com/uber/orbit/blob/c232980/orbit/forecaster/forecaster.py#L389
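To illustrate the suspected failure mode, here is a minimal sketch. The class and names are hypothetical stand-ins, not orbit's actual code; a Barrier plays the role of unlucky scheduling so the race shows up deterministically. The key pattern is the same as the linked lines: predict() writes per-call state to a shared instance attribute before reading it back.

```python
import threading

class ToyForecaster:
    """Toy stand-in (hypothetical) for a forecaster whose predict()
    stashes per-call state on the instance before using it."""

    def __init__(self):
        # Barrier(2) forces both threads to finish writing before either
        # reads, making the race deterministic for demonstration purposes.
        self._barrier = threading.Barrier(2)

    def predict(self, x):
        self._prediction_meta = x   # shared mutable instance state
        self._barrier.wait()        # both writes land before any read
        return self._prediction_meta * 2

model = ToyForecaster()
results = {}

def worker(x):
    results[x] = model.predict(x)

threads = [threading.Thread(target=worker, args=(x,)) for x in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads read whichever write landed last, so they return the same
# value and at least one of them is wrong (correct output would be {1: 2, 2: 4}).
print(results)  # e.g. {1: 4, 2: 4} or {1: 2, 2: 2}
```

Under a lock (or with one model instance per thread), each call would see only its own write and the results would be 2 and 4 respectively.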

To Reproduce

Call model.predict from multiple threads with different parameters, setting seed=a_fixed_number.

Expected behavior

Calling model.predict with seed=a_fixed_number should return deterministic predictions even when multiple threads are calling it.

Screenshots

N/A

Environment (please complete the following information):

  • OS: Ubuntu
  • Python Version: 3.9
  • Versions of Major Dependencies: pandas==1.3.5, scikit-learn (not installed), cython==0.29.30

Additional context

N/A

@knkski knkski added the bug Something isn't working label Aug 16, 2022
@edwinnglabs (Collaborator):

@knkski are you saying you get the same result across threads even with seed=None?


knkski commented Aug 19, 2022

@edwinnglabs I haven't tested that; I'd assume I'd get different results each time in that case. This issue arose after I set the random seed, assuming that would fix the non-deterministic predictions I was seeing. Since the thread lock gave me deterministic predictions, I didn't try using the lock without a fixed seed.


edwinnglabs commented Aug 19, 2022

So you want to see the same behavior on each thread where you call model.predict? If so, passing seed=a_fixed_number as you described should work. Please let me know if I understand the problem correctly...


knkski commented Aug 22, 2022

Yeah, I'm looking to get deterministic predictions on every thread, given that I pass in seed=a_fixed_number. Right now I'm not seeing that: calling from multiple threads returns non-deterministic predictions even with seed=a_fixed_number, and that's what I think the bug is.
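For anyone hitting this in the meantime, the lock workaround can be wrapped once so every caller goes through it. This is a hypothetical sketch (SerializedPredictor is not part of orbit), and it trades throughput for correctness by serializing all predict calls behind one lock:

```python
import threading

class SerializedPredictor:
    """Hypothetical wrapper: serialize all calls to a model whose
    predict() mutates shared instance state, so callers can't race."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def predict(self, *args, **kwargs):
        # Only one thread at a time may run the underlying predict().
        with self._lock:
            return self._model.predict(*args, **kwargs)

# Usage with a dummy model standing in for FullBayesianForecaster:
class DummyModel:
    def predict(self, x, seed=None):
        return [x, seed]

safe_model = SerializedPredictor(DummyModel())
print(safe_model.predict(3, seed=1234))  # [3, 1234]
```

An alternative that avoids the lock (and its throughput cost) is to give each worker thread or process its own model instance, so there is no shared state to race on.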
