Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] BORF failing without numba #2245

Closed
TonyBagnall opened this issue Oct 24, 2024 · 4 comments · Fixed by #2249
Closed

[BUG] BORF failing without numba #2245

TonyBagnall opened this issue Oct 24, 2024 · 4 comments · Fixed by #2249
Assignees
Labels
bug Something isn't working transformations Transformations package

Comments

@TonyBagnall
Copy link
Contributor

Describe the bug

the BORF transformer fails when numpy turned off, in for example, the overnight codecov tests here

https://github.com/aeon-toolkit/aeon/actions/runs/11491231670

its standard numba stuff, should be quite easy to track down

Steps/Code to reproduce the bug

set

Run numba-disabled codecov tests

on any PR

Expected results

pass test

Actual results

FAILED aeon/testing/tests/test_all_estimators.py::test_all_estimators[check_fit_updates_state(estimator=BORF(),datatype=EqualLengthUnivariate-Classification-numpy3D)] - numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name '_erfinv': Cannot determine Numba type of <class 'function'>

File "aeon/transformations/collection/dictionary_based/_borf.py", line 617:
def _ppf(x, mu=0, std=1):
    return mu + math.sqrt(2) * _erfinv(2 * x - 1) * std
    ^
FAILED aeon/testing/tests/test_all_estimators.py::test_all_estimators[check_non_state_changing_method(estimator=BORF(),datatype=EqualLengthUnivariate-Classification-numpy3D)] - numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name '_erfinv': Cannot determine Numba type of <class 'function'>

File "aeon/transformations/collection/dictionary_based/_borf.py", line 617:
def _ppf(x, mu=0, std=1):
    return mu + math.sqrt(2) * _erfinv(2 * x - 1) * std
    ^
= 2 failed, 8705 passed, 10 skipped, 61 xfailed, 150 xpassed, 636 warnings in 1781.64s (0:29:41) =

Versions

No response

@TonyBagnall TonyBagnall added bug Something isn't working transformations Transformations package labels Oct 24, 2024
@Cyril-Meyer
Copy link
Contributor

I've investigated this bug and here is some information I found useful :

  • You can reproduce the bug easier
    • Set NUMBA_DISABLE_JIT=1
    • Run a copy of _borf.py with a call like _get_norm_bins(16)
  • _ppf is @nb.vectorize (and its the only one in all aeon)
  • If you change _erfinv decorator from njit into vectorize, the error disappears (this solution looks good to me)
    • Performance difference is negligible, here is code for test
      • vectorize = 3.88
      • njit = 3.76
import time
_get_norm_bins(5)
t0 = time.time()
for i in range(100000):
    _get_norm_bins(256)
t1 = time.time()
print(t1 - t0)
  • If you put _erfinv code into _ppf, the error also disappears.
@nb.vectorize(cache=True)
def _ppf(x, mu=0, std=1):

    w = -math.log((1 - (2 * x - 1)) * (1 + (2 * x - 1)))
    if w < 5:
        w = w - 2.5
        p = 2.81022636e-08
        p = 3.43273939e-07 + p * w
        p = -3.5233877e-06 + p * w
        p = -4.39150654e-06 + p * w
        p = 0.00021858087 + p * w
        p = -0.00125372503 + p * w
        p = -0.00417768164 + p * w
        p = 0.246640727 + p * w
        p = 1.50140941 + p * w
    else:
        w = math.sqrt(w) - 3
        p = -0.000200214257
        p = 0.000100950558 + p * w
        p = 0.00134934322 + p * w
        p = -0.00367342844 + p * w
        p = 0.00573950773 + p * w
        p = -0.0076224613 + p * w
        p = 0.00943887047 + p * w
        p = 1.00167406 + p * w
        p = 2.83297682 + p * w

    p = p * x
    return mu + math.sqrt(2) * p * std

Cyril-Meyer added a commit to Cyril-Meyer/aeon that referenced this issue Oct 25, 2024
@Cyril-Meyer
Copy link
Contributor

@aeon-actions-bot assign @Cyril-Meyer

@MatthewMiddlehurst
Copy link
Member

was also looking at this but came to a different solution 🙂.

@nb.njit(cache=True)
def _get_norm_bins(alphabet_size: int, mu=0, std=1):
    b = np.linspace(0, 1, alphabet_size + 1)[1:-1]
    for i, v in enumerate(b):
        b[i] = _ppf(v, mu, std)
    return b

@nb.njit(cache=True)
def _ppf(x, mu=0, std=1):
    return mu + math.sqrt(2) * _erfinv(2 * x - 1) * std

Issue is that our test disabling numba jit does not work on vectorise for whatever reason. Either can work really, this one will allow the functions to be tracked in test coverage, but not sure if it will be slower.

@MatthewMiddlehurst
Copy link
Member

though mine still rasises an error...

https://github.com/aeon-toolkit/aeon/actions/runs/11517193530/job/32061397931?pr=2249#step:7:1636

Will test yours (added the tag to your PR), would just go with that if it works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working transformations Transformations package
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants