@NimaSarajpoor
Collaborator

@NimaSarajpoor NimaSarajpoor commented Jan 22, 2026

See #1111

Pull Request Checklist

Below is a simple checklist but please do not hesitate to ask for assistance!

  • Fork, clone, and checkout the newest version of the code
  • Create a new branch
  • Make necessary code changes
  • Install black (i.e., python -m pip install black or conda install -c conda-forge black)
  • Install flake8 (i.e., python -m pip install flake8 or conda install -c conda-forge flake8)
  • Install pytest-cov (i.e., python -m pip install pytest-cov or conda install -c conda-forge pytest-cov)
  • Run black --exclude=".*\.ipynb" --extend-exclude=".venv" --diff ./ in the root stumpy directory
  • Run flake8 --extend-exclude=.venv ./ in the root stumpy directory
  • Run ./setup.sh dev && ./test.sh in the root stumpy directory
  • Reference a GitHub issue (and create one if one doesn't already exist)

@gitnotebooks

gitnotebooks bot commented Jan 22, 2026

Review these changes at https://app.gitnotebooks.com/stumpy-dev/stumpy/pull/1118

@NimaSarajpoor
Collaborator Author

@seanlaw
As suggested in this comment, I will create a module for sliding dot product in this PR, and put the functions/classes there. I've started with changing the current "sliding dot product" functions in core.py to help us follow the changes more easily.

Contributor

@seanlaw seanlaw left a comment

Your changes look good to me

@NimaSarajpoor
Collaborator Author

I will create a module for sliding dot product in this PR,

seanlaw approved these changes

To make sure we are on the same page, I assume it is still okay to push changes here in this PR.

@seanlaw
Contributor

seanlaw commented Jan 23, 2026

To make sure we are on the same page, I assume it is still okay to push changes here in this PR.

Yes, should be fine. If you feel like there are multiple distinct pieces or logical checkpoints then you may consider splitting it into multiple PRs. But if you are able to keep it simple, then we can do it here

@seanlaw
Contributor

seanlaw commented Jan 27, 2026

@NimaSarajpoor It's not clear why we need to change docstring.py and how it relates to sdp. Perhaps we can handle anything that is wrong with docstring.py as a separate issue along with a unit test or clear example where it is currently failing?

The if-condition was changed from class_name is None to len(re.findall(r"Returns", docstring)) > 0 because, with the old condition, this line produced the wrong result when capturing params_section from a docstring that has at least one output in its "Returns" section.

Frankly, I'm not following this comment and why it needs to be changed.

@NimaSarajpoor
Collaborator Author

NimaSarajpoor commented Jan 27, 2026

The if-condition was changed from class_name is None to len(re.findall(r"Returns", docstring)) > 0 because, with the old condition, this line produced the wrong result when capturing params_section from a docstring that has at least one output in its "Returns" section.

Frankly, I'm not following this comment and why it needs to be changed

and how it relates to sdp

The issue was exposed when I added the pyfftw-based sdp, which is implemented as a class. The class has the method __call__, which returns a value. Note that NO method in OTHER (EXISTING) classes in STUMPY returns a value, and that's why the issue was not exposed before. When a method returns a value, its corresponding variable name (mentioned in the "Returns" section) is MISTAKENLY treated as one of the "parameters" by docstring.py. See the else part in the following if-else block:

stumpy/docstring.py

Lines 26 to 33 in ce05903

if class_name is None:
    params_section = re.findall(
        r"(?<=Parameters)(.*)(?=Returns)", docstring, re.DOTALL
    )[0]
else:
    params_section = re.findall(r"(?<=Parameters)(.*)", docstring, re.DOTALL)[0]
args = re.findall(r"(\w+)\s+\:", params_section)
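
To make the failure mode concrete, here is a minimal sketch that applies the two regex branches quoted above to a made-up docstring. The docstring text and the variable names are illustrative assumptions, not taken from STUMPY. With the else branch, the name listed under "Returns" is picked up as if it were a parameter:

```python
import re

# Made-up docstring for illustration only (not copied from STUMPY)
docstring = """
    Compute the sliding dot product

    Parameters
    ----------
    Q : numpy.ndarray
        Query array

    T : numpy.ndarray
        Time series

    Returns
    -------
    QT : numpy.ndarray
        Sliding dot product
"""

# Branch used when class_name is None: stop capturing at "Returns"
params_func = re.findall(
    r"(?<=Parameters)(.*)(?=Returns)", docstring, re.DOTALL
)[0]
args_func = re.findall(r"(\w+)\s+\:", params_func)

# Branch used for class methods: capture everything after "Parameters"
params_method = re.findall(r"(?<=Parameters)(.*)", docstring, re.DOTALL)[0]
args_method = re.findall(r"(\w+)\s+\:", params_method)

print(args_func)    # ['Q', 'T']
print(args_method)  # ['Q', 'T', 'QT']
```

The extra 'QT' in the second list is the return value being treated as a parameter.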

Perhaps we can handle anything that is wrong with docstring.py as a separate issue along with a unit test or clear example where it is currently failing?

Sounds good. I will create an issue for docstring.py.

@seanlaw
Contributor

seanlaw commented Jan 27, 2026

The issue was exposed when I added the pyfftw-based sdp, which is implemented as a class. The class has the method __call__, which returns a value. Note that NO method in OTHER (EXISTING) classes in STUMPY returns a value, and that's why the issue was not exposed before. When a method returns a value, its corresponding variable name (mentioned in the "Returns" section) is MISTAKENLY treated as one of the "parameters" by docstring.py.

Given that this is an extreme case, it almost feels like it merits its own elif condition that specifically handles the __call__ case? Then, if the code looks very similar to the existing branch that handles classes, we can combine them afterwards. But you should demonstrate the similarity first rather than getting too cute and obfuscating the logic. Considering how infrequently this function gets used and that it is only a helper script, speed is NOT the goal here. Clarity and ease of making modifications are more important!

P.S. I can also accept the criticism (for myself) if this file was written in a way that lacked clarity and if it felt hard to modify 😅
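
For discussion purposes, one way the explicit branch could look is sketched below. This is only a rough idea under assumptions: `extract_args`, its `method_name` argument, and the example docstrings are hypothetical and do not reflect how docstring.py is actually organized.

```python
import re


def extract_args(docstring, class_name=None, method_name=None):
    """Hypothetical helper: return the argument names documented in `docstring`."""
    if class_name is None:
        # Plain function: parameters sit between "Parameters" and "Returns"
        pattern = r"(?<=Parameters)(.*)(?=Returns)"
    elif method_name == "__call__":
        # Explicit special case: a class method that documents a return value
        pattern = r"(?<=Parameters)(.*)(?=Returns)"
    else:
        # Other class methods: no "Returns" section is expected
        pattern = r"(?<=Parameters)(.*)"
    params_section = re.findall(pattern, docstring, re.DOTALL)[0]
    return re.findall(r"(\w+)\s+\:", params_section)


# Made-up docstrings for illustration only
call_doc = """
    Parameters
    ----------
    Q : numpy.ndarray
        Query array

    Returns
    -------
    QT : numpy.ndarray
        Sliding dot product
"""

update_doc = """
    Parameters
    ----------
    t : float
        New data point
"""

call_args = extract_args(call_doc, class_name="SDP", method_name="__call__")
update_args = extract_args(update_doc, class_name="SDP", method_name="update")
```

Written this way, the first two branches are visibly identical, which would be the evidence needed before merging them into a single condition.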

@NimaSarajpoor
Collaborator Author

The issue was exposed when I added the pyfftw-based sdp, which is implemented as a class. The class has the method __call__, which returns a value. Note that NO method in OTHER (EXISTING) classes in STUMPY returns a value, and that's why the issue was not exposed before. When a method returns a value, its corresponding variable name (mentioned in the "Returns" section) is MISTAKENLY treated as one of the "parameters" by docstring.py.

Given that this is an extreme case, it almost feels like it merits its own elif condition that specifically handles the __call__ case? Then, if the code looks very similar to the existing branch that handles classes, we can combine them afterwards. But you should demonstrate the similarity first rather than getting too cute and obfuscating the logic. Considering how infrequently this function gets used and that it is only a helper script, speed is NOT the goal here. Clarity and ease of making modifications are more important!

P.S. I can also accept the criticism (for myself) if this file was written in a way that lacked clarity and if it felt hard to modify 😅

Let's continue this conversation in the new issue #1123.

@NimaSarajpoor
Collaborator Author

NimaSarajpoor commented Feb 1, 2026

The following error was raised in tests/test_snippets.py, which is similar to the issue reported in #1061 (see #1061 (comment)). Will rerun the tests.

=================================== FAILURES ===================================
_______________ test_mpdist_snippets_s_with_isconstant[3-3-9-T0] _______________

T = array([ -89.29236339, -813.25166768,  851.09361758, -815.13544742,
       -898.94521221,  247.21160751, -791.30786228,...33724, -223.65196544, -259.36417272,  450.28819357,
       -271.65582493,  447.69794613,   11.59767615, -704.13251263])
m = 9, k = 3, s = 3

    @pytest.mark.parametrize("T", test_data)
    @pytest.mark.parametrize("m", m)
    @pytest.mark.parametrize("k", k)
    @pytest.mark.parametrize("s", s)
    def test_mpdist_snippets_s_with_isconstant(T, m, k, s):
        isconstant_custom_func = functools.partial(
            naive.isconstant_func_stddev_threshold, quantile_threshold=0.05
        )
        (
            ref_snippets,
            ref_indices,
            ref_profiles,
            ref_fractions,
            ref_areas,
            ref_regimes,
        ) = naive.mpdist_snippets(
            T, m, k, s=s, mpdist_T_subseq_isconstant=isconstant_custom_func
        )
        (
            cmp_snippets,
            cmp_indices,
            cmp_profiles,
            cmp_fractions,
            cmp_areas,
            cmp_regimes,
        ) = snippets(T, m, k, s=s, mpdist_T_subseq_isconstant=isconstant_custom_func)
    
>       npt.assert_almost_equal(
            ref_snippets, cmp_snippets, decimal=config.STUMPY_TEST_PRECISION
        )
E       AssertionError: 
E       Arrays are not almost equal to 5 decimals
E       
E       Mismatched elements: 9 / 27 (33.3%)
E       Max absolute difference among violations: 1390.94241219
E       Max relative difference among violations: 42.54441072
E        ACTUAL: array([[ 674.58752, -479.97183,  661.2137 ,  942.5773 ,  584.52497,
E               -294.49651, -739.9214 , -734.58463,  357.05428],
E              [ 146.43668,  775.21573,  -83.12796,   40.08262,  592.93934,...
E        DESIRED: array([[ 674.58752, -479.97183,  661.2137 ,  942.5773 ,  584.52497,
E               -294.49651, -739.9214 , -734.58463,  357.05428],
E              [ 146.43668,  775.21573,  -83.12796,   40.08262,  592.93934,...

tests/test_snippets.py:211: AssertionError
=================== 1 failed, 146 passed in 84.01s (0:01:24) ===================
Error: Test execution encountered exit code 1

@NimaSarajpoor
Collaborator Author

This test failed and the error is provided below. It shows asyncio.exceptions.CancelledError. @seanlaw, should we re-open #1115?

tests\test_aampdist.py ......................................E           [100%]

=================================== ERRORS ====================================
_______________ ERROR at teardown of test_aampdisted[T_A1-T_B1] _______________

fut = <Future cancelled>, timeout = 0.8891491889953613

    async def wait_for(fut: Awaitable[T], timeout: float) -> T:
        async with asyncio.timeout(timeout):
>           return await fut
                   ^^^^^^^^^
E           asyncio.exceptions.CancelledError

C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\utils.py:1928: CancelledError

The above exception was the direct cause of the following exception:

    @pytest.fixture(scope="module")
    def dask_cluster():
        cluster = LocalCluster(
            n_workers=2,
            threads_per_worker=2,
            dashboard_address=None,
            worker_dashboard_address=None,
        )
        yield cluster.scheduler_address
>       cluster.close()

tests\test_aampdist.py:19: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\deploy\spec.py:299: in close
    aw = super().close(timeout)
         ^^^^^^^^^^^^^^^^^^^^^^
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\deploy\cluster.py:224: in close
    return self.sync(self._close, callback_timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\utils.py:381: in sync
    return sync(
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\utils.py:457: in sync
    raise error
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\utils.py:431: in f
    result = yield future
             ^^^^^^^^^^^^
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\tornado\gen.py:783: in run
    value = future.result()
            ^^^^^^^^^^^^^^^
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\deploy\spec.py:454: in _close
    await self._correct_state()
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\deploy\spec.py:365: in _correct_state_internal
    await asyncio.gather(*tasks)
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\nanny.py:619: in close
    await self.kill(timeout=timeout, reason=reason)
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\nanny.py:400: in kill
    await self.process.kill(reason=reason, timeout=timeout)
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\nanny.py:883: in kill
    await process.join(max(0, deadline - time()))
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\process.py:330: in join
    await wait_for(asyncio.shield(self._exit_future), timeout)
C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\utils.py:1927: in wait_for
    async with asyncio.timeout(timeout):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <Timeout [expired]>
exc_type = <class 'asyncio.exceptions.CancelledError'>
exc_val = CancelledError(), exc_tb = <traceback object at 0x000001FDBE426740>

    async def __aexit__(
        self,
        exc_type: Optional[Type[BaseException]],
        exc_val: Optional[BaseException],
        exc_tb: Optional[TracebackType],
    ) -> Optional[bool]:
        assert self._state in (_State.ENTERED, _State.EXPIRING)
    
        if self._timeout_handler is not None:
            self._timeout_handler.cancel()
            self._timeout_handler = None
    
        if self._state is _State.EXPIRING:
            self._state = _State.EXPIRED
    
            if self._task.uncancel() <= self._cancelling and exc_type is exceptions.CancelledError:
                # Since there are no new cancel requests, we're
                # handling this.
>               raise TimeoutError from exc_val
E               TimeoutError

C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\asyncio\timeouts.py:115: TimeoutError
------------------------------ Captured log call ------------------------------
INFO     distributed.scheduler:scheduler.py:5958 Receive client connection: Client-ce9dba58-00ad-11f1-84dc-002248c3d637
INFO     distributed.core:core.py:883 Starting established connection to tcp://127.0.0.1:55312
INFO     distributed.scheduler:scheduler.py:6003 Remove client Client-ce9dba58-00ad-11f1-84dc-002248c3d637
INFO     distributed.core:core.py:908 Received 'close-stream' from tcp://127.0.0.1:55312; closing.
INFO     distributed.scheduler:scheduler.py:6003 Remove client Client-ce9dba58-00ad-11f1-84dc-002248c3d637
INFO     distributed.scheduler:scheduler.py:5995 Close client connection: Client-ce9dba58-00ad-11f1-84dc-002248c3d637
---------------------------- Captured log teardown ----------------------------
INFO     distributed.scheduler:scheduler.py:7614 Retire worker addresses (stimulus_id='retire-workers-1770088192.6515667') (0, 1)
INFO     distributed.nanny:nanny.py:611 Closing Nanny at 'tcp://127.0.0.1:55273'. Reason: nanny-close
INFO     distributed.nanny:nanny.py:858 Nanny asking worker to close. Reason: nanny-close
INFO     distributed.nanny:nanny.py:611 Closing Nanny at 'tcp://127.0.0.1:55275'. Reason: nanny-close
INFO     distributed.nanny:nanny.py:858 Nanny asking worker to close. Reason: nanny-close
INFO     distributed.core:core.py:908 Received 'close-stream' from tcp://127.0.0.1:55283; closing.
INFO     distributed.scheduler:scheduler.py:5444 Remove worker addr: tcp://127.0.0.1:55281 name: 0 (stimulus_id='handle-worker-cleanup-1770088192.6547666')
INFO     distributed.core:core.py:893 Connection to tcp://127.0.0.1:55286 has been closed.
INFO     distributed.scheduler:scheduler.py:5444 Remove worker addr: tcp://127.0.0.1:55284 name: 1 (stimulus_id='handle-worker-cleanup-1770088192.6560135')
INFO     distributed.scheduler:scheduler.py:5582 Lost all workers
WARNING  distributed.nanny:nanny.py:879 Worker process still alive after 4.0 seconds, killing
WARNING  distributed.nanny:nanny.py:879 Worker process still alive after 4.0 seconds, killing
ERROR    tornado.application:ioloop.py:778 Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOMainLoop object at 0x000001FDC1E5D610>>, <Task finished name='Task-5628' coro=<SpecCluster._correct_state_internal() done, defined at C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\deploy\spec.py:352> exception=TimeoutError()>)
Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\distributed\utils.py", line 1928, in wait_for
    return await fut
           ^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\tornado\ioloop.py", line 758, in _run_callback
    ret = callback()
          ^^^^^^^^^^
  File "C:\hostedtoolcache\windows\Python\3.11.9\x64\Lib\site-packages\tornado\ioloop.py", line 782, in _discard_future_result
    future.result()
TimeoutError
======================== 38 passed, 1 error in 58.01s =========================
Error: Test execution encountered exit code 1

@seanlaw
Contributor

seanlaw commented Feb 3, 2026

This test failed and the error is provided below. It shows asyncio.exceptions.CancelledError. @seanlaw, should we re-open #1115?

@NimaSarajpoor I am seeing this error more frequently. Just re-run the test and ignore it. Somehow, it is timing out when it is trying to close the cluster. I will look into it, but it is very hard to reproduce because it seems to be related to the unpredictable environment in GitHub Actions.
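
For anyone reading the traceback, the failing call is `wait_for(asyncio.shield(self._exit_future), timeout)` in distributed\process.py: the join waits on a future that only resolves once the worker process exits, and the timeout fires first. A minimal stdlib-only sketch of that pattern, assuming a future that never resolves stands in for the worker process (`join_with_timeout` and `exit_future` are hypothetical names, and `asyncio.wait_for` is used here in place of distributed's own `wait_for` helper):

```python
import asyncio


async def join_with_timeout(timeout):
    loop = asyncio.get_running_loop()
    # Stand-in for a process exit future that is never resolved in time
    exit_future = loop.create_future()
    try:
        # shield() keeps the inner future alive when the wait is cancelled
        await asyncio.wait_for(asyncio.shield(exit_future), timeout)
    except asyncio.TimeoutError:
        return "timed out"
    finally:
        # Clean up the stand-in future so nothing is left pending
        exit_future.cancel()
    return "exited"


result = asyncio.run(join_with_timeout(0.05))
print(result)  # timed out
```

This only illustrates why the error surfaces as a TimeoutError chained from a CancelledError; it says nothing about why the worker is slow to exit on the CI runner.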

Collaborator Author

@NimaSarajpoor NimaSarajpoor left a comment

@seanlaw
I think there is nothing more to add to this PR. I left some comments to bring your attention to a few things. Can you please take a look at your convenience?

_pyfftw_sliding_dot_product = _PYFFTW_SLIDING_DOT_PRODUCT(max_n=2**20)


def _sliding_dot_product(
Collaborator Author

@NimaSarajpoor NimaSarajpoor Feb 4, 2026

A couple of notes for this function:

(1) I do not check FFTW_IS_AVAILABLE in this function. Note that if FFTW_IS_AVAILABLE==False, there will be no function _pyfftw_sliding_dot_product in this module. Whatever functions a user provides in boundaries, the assumption is that they work and return the correct output.

(2) The default_sdp is set to the function _convolve_sliding_dot_product, which is core.sliding_dot_product in the main branch.
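
To illustrate the dispatch described in these notes, here is a rough sketch. It assumes each entry in boundaries is a [Q_boundaries, T_boundaries, sdp_func] triple that is checked in order with half-open ranges, as in the condition that appears later in this diff. `dispatch`, `small_sdp`, and `default_sdp` are hypothetical placeholders, not functions from this PR; the placeholders only report which path was taken.

```python
import math


def small_sdp(Q, T):
    # Placeholder for an implementation meant for short queries
    return "small"


def default_sdp(Q, T):
    # Placeholder for the fallback implementation
    return "default"


def dispatch(Q, T, boundaries, default=default_sdp):
    m, n = len(Q), len(T)
    for Q_boundaries, T_boundaries, sdp_func in boundaries:
        # Half-open ranges: lower bound inclusive, upper bound exclusive
        if (
            Q_boundaries[0] <= m < Q_boundaries[1]
            and T_boundaries[0] <= n < T_boundaries[1]
        ):
            return sdp_func(Q, T)
    # No entry matched, so fall back to the default
    return default(Q, T)


boundaries = [[(2, 2**7 + 1), (2, math.inf), small_sdp]]
T = [0.0] * 1000

print(dispatch([0.0] * 128, T, boundaries))  # small   (128 < 129)
print(dispatch([0.0] * 129, T, boundaries))  # default (129 is excluded)
print(dispatch([0.0] * 10, T, []))           # default (empty boundaries)
```

With a lower bound of 2, a query of length 1 also falls through to the default, whereas a lower bound of -inf would route it to the first entry.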

Q,
T,
boundaries=[
[(-np.inf, 2**7 + 1), (-np.inf, np.inf), _njit_sliding_dot_product],
Collaborator Author

@NimaSarajpoor

Suggested change
[(-np.inf, 2**7 + 1), (-np.inf, np.inf), _njit_sliding_dot_product],
[(2, 2**7 + 1), (2, np.inf), _njit_sliding_dot_product],

        Q_boundaries[0] <= m < Q_boundaries[1]
        and T_boundaries[0] <= n < T_boundaries[1]
    ):
        return sdp_func(Q, T)
Collaborator Author

@NimaSarajpoor NimaSarajpoor Feb 4, 2026

What if a user wants to use the function sdp._pocketfft_sliding_dot_product, which uses some private functions of scipy, and a change in scipy breaks it? Should we care if the PyPI wheel is checked on a daily basis?

QT = convolve(Qr, T)

return QT.real[m - 1 : n]
return sdp._sliding_dot_product(Q, T)
Collaborator Author

@NimaSarajpoor NimaSarajpoor Feb 4, 2026

FYI: If we pass boundaries=[], then this will behave the same as the function in the main branch. However, I think it is safe to use the default value, which is:

boundaries=[
        [(-np.inf, 2**7 + 1), (-np.inf, np.inf), _njit_sliding_dot_product],
]
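
For reference, the slice `QT.real[m - 1 : n]` in the diff context above keeps only the part of the full convolution where the reversed query fully overlaps T. A small pure-Python sketch of that idea, assuming a naive loop as a stand-in for the `convolve` call (`full_convolve`, `convolve_sdp`, and `naive_sdp` are hypothetical names used only for this illustration):

```python
def full_convolve(a, b):
    # Naive full convolution, output length len(a) + len(b) - 1
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def convolve_sdp(Q, T):
    m, n = len(Q), len(T)
    Qr = Q[::-1]  # Reverse Q so that convolution becomes correlation
    QT = full_convolve(Qr, T)
    return QT[m - 1 : n]  # Keep only the fully overlapping positions


def naive_sdp(Q, T):
    # Direct definition: dot product of Q with every length-m window of T
    m = len(Q)
    return [
        sum(q * t for q, t in zip(Q, T[i : i + m]))
        for i in range(len(T) - m + 1)
    ]


Q = [1.0, 2.0, 3.0]
T = [1.0, 0.0, 2.0, 5.0, 4.0]

print(convolve_sdp(Q, T))  # [7.0, 19.0, 24.0]
print(naive_sdp(Q, T))     # [7.0, 19.0, 24.0]
```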
