
NHITS does not give consistent results on GPU #1217

Closed
Mickailkhadhar opened this issue Nov 26, 2024 · 3 comments
Mickailkhadhar commented Nov 26, 2024

What happened + What you expected to happen [EDIT: also fails with 2.0.1]

Hello, while tuning an NHITS model I discovered that identical experiments gave inconsistent results.
I replicated the bug using utilsforecast to generate the series.
In my code I generate the series once, then create the exact same model twice.
Using a GPU on Databricks, the predictions are always different.
The bug never seems to occur when using a CPU on Databricks. I tested neuralforecast versions 1.7.3, 1.7.5, and 2.0.1.

The predictions are inconsistent regardless of the hyperparameter values, even with an equal random_seed.
PS: I am also using NBEATSx and TFT and have no problem with either!

Thanks in advance, I appreciate your help!

Versions / Dependencies

Within the Databricks environment.
Using neuralforecast 1.7.3, 1.7.5, and 2.0.1.
Bug reproducible on GPU but not on CPU.

Reproduction script [EDIT: simplified the code, but it still fails on GPU]

import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS
from utilsforecast.data import generate_series

n_series = 100
freq = "W"

df = generate_series(
    n_series=n_series,
    freq=freq,
    min_length=156,
    max_length=400,
    equal_ends=True,
)

nf_1 = NeuralForecast(
    models=[
        NHITS(
            h=52,
            input_size=104,
            max_steps=100,
            random_seed=42,
        )
    ],
    freq=freq,
)
nf_2 = NeuralForecast(
    models=[
        NHITS(
            h=52,
            input_size=104,
            max_steps=100,
            random_seed=42,
        )
    ],
    freq=freq,
)

nf_1.fit(df=df)
nf_2.fit(df=df)

pred_1 = nf_1.predict(df=df)
pred_2 = nf_2.predict(df=df)

pd.testing.assert_frame_equal(pred_1, pred_2)


Issue Severity

High: It blocks me from completing my task.

@elephaint
Contributor

elephaint commented Feb 21, 2025

It's a known issue that, when using CUDA, completely deterministic results are not possible.

See here: https://pytorch.org/docs/stable/notes/randomness.html

You could try prepending the following (so start with this code):

import torch
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.enabled = False

However, that still leaves a small difference, albeit a slightly smaller one. It also negatively impacts speed, so I'd definitely not recommend it unless reproducibility matters more than performance (e.g. in an academic exercise).

Furthermore, you can enforce deterministic behaviour even more by doing the following:

torch.use_deterministic_algorithms(True)

However, NHITS uses a function that (apparently) has no deterministic CUDA implementation (on my machine and PyTorch version), so enabling the latter will raise an error. Ultimately, I believe this is the underlying issue that causes these discrepancies.

I'd personally not worry about this too much. Getting completely deterministic results is more or less impossible in any computing environment that uses limited-precision floating-point numbers with a high degree of parallelization. For example, if you ran your algorithm on a different machine tomorrow, you'd get different results, and that is near impossible to prevent. The delta should be small up to a tolerance, but an exact match is basically impossible.
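The root cause is easy to see in a tiny pure-Python sketch: floating-point addition is not associative, so a different summation order (exactly what happens across parallel GPU reductions) yields a slightly different result.

```python
# Floating-point addition is not associative: 1e16 is large enough that
# adding 1.0 to it is lost to rounding, so the grouping of the sum matters.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c   # (1e16 - 1e16) + 1.0 -> 1.0
right = a + (b + c)  # 1e16 + (-1e16 + 1.0) -> 0.0, the 1.0 was rounded away

print(left)   # 1.0
print(right)  # 0.0
```

A parallel reduction is free to regroup terms like this, which is why two GPU runs can disagree in the last few bits even with identical seeds.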

So, look at the actual differences that you measure. Are they within a small tolerance? Then I wouldn't worry about it too much.
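As a sketch of such a tolerance-based check (the frames and the NHITS column name here are made up for illustration; pandas' assert_frame_equal accepts rtol/atol once check_exact=False):

```python
import numpy as np
import pandas as pd

# Two hypothetical prediction frames that differ only by floating-point
# noise, mimicking two GPU runs of the same model with the same seed.
pred_1 = pd.DataFrame({"unique_id": [0, 0], "NHITS": [10.0, 20.0]})
pred_2 = pd.DataFrame({"unique_id": [0, 0], "NHITS": [10.000001, 19.999998]})

# An exact comparison would fail, but a tolerance-based one passes.
pd.testing.assert_frame_equal(pred_1, pred_2, check_exact=False, rtol=1e-4)

# Or inspect the actual deltas directly.
max_delta = np.max(np.abs(pred_1["NHITS"] - pred_2["NHITS"]))
print(max_delta < 1e-4)  # True: the two runs agree within tolerance
```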

@elephaint
Contributor

elephaint commented Feb 21, 2025

The NHITS function that gives the non-deterministic behavior is F.interpolate when used with linear interpolation. Hence, the solution (when you really desire deterministic behavior) is to use a deterministic interpolation method, e.g.:

from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS
from utilsforecast.data import generate_series
import pandas as pd


n_series = 100
freq = "W"

df = generate_series(
    n_series=n_series,
    freq=freq,
    min_length=156,
    max_length=400,
    equal_ends=True,
)

nf_1 = NeuralForecast(
    models=[
        NHITS(
            interpolation_mode="nearest",
            h=52,
            input_size=104,
            max_steps=100,
            random_seed=42,
        )
    ],
    freq=freq,
)
nf_2 = NeuralForecast(
    models=[
        NHITS(
            interpolation_mode="nearest",
            h=52,
            input_size=104,
            max_steps=100,
            random_seed=42,
        )
    ],
    freq=freq,
)

nf_1.fit(df=df)
nf_2.fit(df=df)

pred_1 = nf_1.predict(df=df)
pred_2 = nf_2.predict(df=df)

pd.testing.assert_frame_equal(pred_1, pred_2)

which should give no errors.

@Mickailkhadhar
Author

Thanks for your comments! It works just fine! I'll close the issue.
Also, thanks for the information about torch and CUDA deterministic behaviour; I will look into it.
