Merge remote-tracking branch 'origin/feat/slice_intersect_multi_series' into feat/slice_intersect_multi_series
ymatzkevich committed Dec 13, 2024
2 parents 05a1667 + 58df84a commit 9510761
Showing 26 changed files with 1,407 additions and 552 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -21,6 +21,7 @@ darts_logs/
docs_env
.DS_Store
.gradle
.venv

# used by CI to build with latest versions of dependencies
requirements-latest.txt
7 changes: 6 additions & 1 deletion CHANGELOG.md
@@ -12,9 +12,13 @@ but cannot always guarantee backwards compatibility. Changes that may **break co
**Improved**

- Improvements to `ForecastingModel`: Improved `start` handling for historical forecasts, backtest, residuals, and gridsearch. If `start` is not within the trainable / forecastable points, uses the closest valid start point that is a round multiple of `stride` ahead of start. Raises a ValueError, if no valid start point exists. This guarantees that all historical forecasts are `n * stride` points away from start, and will simplify many downstream tasks. [#2560](https://github.com/unit8co/darts/issues/2560) by [Dennis Bader](https://github.com/dennisbader).
- Added `data_transformers` argument to `historical_forecasts`, `backtest`, `residuals`, and `gridsearch` that allows `DataTransformer` and/or `Pipeline` to be applied automatically to the input series without data leakage (fit on historic window of input series, transform the input series, and inverse transform the forecasts). [#2529](https://github.com/unit8co/darts/pull/2529) by [Antoine Madrona](https://github.com/madtoinou) and [Jan Fidor](https://github.com/JanFidor)
- Added `series_idx` argument to `DataTransformer` that allows users to use only a subset of the transformers when `global_fit=False` and several series are used. [#2529](https://github.com/unit8co/darts/pull/2529) by [Antoine Madrona](https://github.com/madtoinou)
- Updated the Documentation URL of `Statsforecast` models. [#2610](https://github.com/unit8co/darts/pull/2610) by [He Weilin](https://github.com/cnhwl).
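
The `start` adjustment described in the first entry of this list can be sketched with integer positions. This is a minimal illustration under stated assumptions, not the Darts implementation: `align_start`, `first_valid`, and `last_valid` are hypothetical names, and plain integers stand in for time indices.

```python
def align_start(start: int, first_valid: int, last_valid: int, stride: int) -> int:
    """Return the closest valid start that is `n * stride` points ahead of `start`.

    Hypothetical helper: `first_valid` and `last_valid` bound the
    forecastable positions (both inclusive).
    """
    if stride < 1:
        raise ValueError("`stride` must be a positive integer.")
    if start >= first_valid:
        # the requested start is already at or after the first valid point
        candidate = start
    else:
        # smallest n with start + n * stride >= first_valid (integer ceiling)
        n_steps = (first_valid - start + stride - 1) // stride
        candidate = start + n_steps * stride
    if candidate > last_valid:
        raise ValueError("No valid start point exists for this `start` and `stride`.")
    return candidate


# start=3 is too early; with stride=4 the candidates are 3, 7, 11, ...
print(align_start(start=3, first_valid=10, last_valid=20, stride=4))
```

Because the adjusted start is always `start + n * stride`, every forecast produced afterwards stays on the same stride grid as the requested `start`.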

**Fixed**

- Fixed a bug which raised an error when computing residuals (or backtest with "per time step" metrics) on multiple series with corresponding historical forecasts of different lengths. [#2604](https://github.com/unit8co/darts/pull/2604) by [Dennis Bader](https://github.com/dennisbader).
- Fixed a bug when using `darts.utils.data.tabularization.create_lagged_component_names()` with target `lags=None`, that did not return any lagged target label component names. [#2576](https://github.com/unit8co/darts/pull/2576) by [Dennis Bader](https://github.com/dennisbader).
- Fixed a bug when using `num_samples > 1` with a deterministic regression model and the optimized `historical_forecasts()` method, an exception was not raised. [#2576](https://github.com/unit8co/darts/pull/2588) by [Antoine Madrona](https://github.com/madtoinou).

@@ -31,6 +35,7 @@ but cannot always guarantee backwards compatibility. Changes that may **break co
- fixed failing docker deployment
- removed `gradle` dependency in favor of native GitHub action plugins.
- Updated ruff to v0.7.2 and target-version to python39, also fixed various typos [#2589](https://github.com/unit8co/darts/pull/2589) by [Greg DeVosNouri](https://github.com/gdevos010) and [Antoine Madrona](https://github.com/madtoinou).
- Replaced the deprecated `torch.nn.utils.weight_norm` function with `torch.nn.utils.parametrizations.weight_norm` [#2593](https://github.com/unit8co/darts/pull/2593) by [Saeed Foroutan](https://github.com/SaeedForoutan).

## [0.31.0](https://github.com/unit8co/darts/tree/0.31.0) (2024-10-13)

@@ -40,7 +45,7 @@ but cannot always guarantee backwards compatibility. Changes that may **break co

- Improvements to `metrics`:
- Added support for computing metrics on one or multiple quantiles `q`, either from probabilistic or quantile forecasts. [#2530](https://github.com/unit8co/darts/pull/2530) by [Dennis Bader](https://github.com/dennisbader).
-  - Added quantile interval metrics `miw` (Mean Interval Width, time aggregated) and `iw` (Interval Width, per time step / non-aggregated) which compute the width of quantile intervals `q_intervals` (expected to be a tuple or sequence of tuples with (lower quantile, upper quantile). [#2530](https://github.com/unit8co/darts/pull/2530) by [Dennis Bader](https://github.com/dennisbader).
+  - Added quantile interval metrics `miw` (Mean Interval Width, time aggregated) and `iw` (Interval Width, per time step / non-aggregated) which compute the width of quantile intervals `q_intervals` (expected to be a tuple or sequence of tuples with (lower quantile, upper quantile)). [#2530](https://github.com/unit8co/darts/pull/2530) by [Dennis Bader](https://github.com/dennisbader).
- Improvements to `backtest()` and `residuals()`:
- Added support for computing backtest and residuals on one or multiple quantiles `q` in the `metric_kwargs`, either from probabilistic or quantile forecasts. [#2530](https://github.com/unit8co/darts/pull/2530) by [Dennis Bader](https://github.com/dennisbader).
- Added support for parameters `enable_optimization` and `predict_likelihood_parameters`. [#2530](https://github.com/unit8co/darts/pull/2530) by [Dennis Bader](https://github.com/dennisbader).
4 changes: 2 additions & 2 deletions darts/ad/detectors/threshold_detector.py
@@ -80,8 +80,8 @@ def _detect_core(self, series: TimeSeries, name: str = "series") -> TimeSeries:

def _detect_fn(x, lo, hi):
# x of shape (time,) for 1 component
- return (x < (np.NINF if lo is None else lo)) | (
- x > (np.Inf if hi is None else hi)
+ return (x < (-np.inf if lo is None else lo)) | (
+ x > (np.inf if hi is None else hi)
)

detected = np.zeros_like(np_series, dtype=int)
65 changes: 59 additions & 6 deletions darts/dataprocessing/pipeline.py
@@ -5,7 +5,7 @@

from collections.abc import Iterator, Sequence
from copy import deepcopy
- from typing import Union
+ from typing import Optional, Union

from darts import TimeSeries
from darts.dataprocessing.transformers import (
@@ -90,6 +90,16 @@ def __init__(
isinstance(t, InvertibleDataTransformer) for t in self._transformers
)

self._fittable = any(
isinstance(t, FittableDataTransformer) for t in self._transformers
)

self._global_fit = all(
t._global_fit
for t in self._transformers
if isinstance(t, FittableDataTransformer)
)

if verbose is not None:
for transformer in self._transformers:
transformer.set_verbose(verbose)
@@ -149,7 +159,9 @@ def fit_transform(
return data

def transform(
- self, data: Union[TimeSeries, Sequence[TimeSeries]]
+ self,
+ data: Union[TimeSeries, Sequence[TimeSeries]],
+ series_idx: Optional[Union[int, Sequence[int]]] = None,
) -> Union[TimeSeries, Sequence[TimeSeries]]:
"""
For each data transformer in pipeline transform data. Then transformed data is passed to next transformer.
@@ -158,18 +170,24 @@ def transform(
----------
data
(`Sequence` of) `TimeSeries` to be transformed.
series_idx
Optionally, the index(es) of each series corresponding to their positions within the series used to fit
the transformer (to retrieve the appropriate transformer parameters).
Returns
-------
Union[TimeSeries, Sequence[TimeSeries]]
Transformed data.
"""
for transformer in self._transformers:
- data = transformer.transform(data)
+ data = transformer.transform(data, series_idx=series_idx)
return data

def inverse_transform(
- self, data: Union[TimeSeries, Sequence[TimeSeries]], partial: bool = False
+ self,
+ data: Union[TimeSeries, Sequence[TimeSeries]],
+ partial: bool = False,
+ series_idx: Optional[Union[int, Sequence[int]]] = None,
) -> Union[TimeSeries, Sequence[TimeSeries]]:
"""
For each data transformer in the pipeline, inverse-transform data. Then inverse transformed data is passed to
@@ -184,6 +202,9 @@ def inverse_transform(
partial
If set to `True`, the inverse transformation is applied even if the pipeline is not fully invertible,
calling `inverse_transform()` only on the `InvertibleDataTransformer`s
series_idx
Optionally, the index(es) of each series corresponding to their positions within the series used to fit
the transformer (to retrieve the appropriate transformer parameters).
Returns
-------
@@ -198,14 +219,18 @@ def inverse_transform(
)

for transformer in reversed(self._transformers):
- data = transformer.inverse_transform(data)
+ data = transformer.inverse_transform(data, series_idx=series_idx)
return data
else:
for transformer in reversed(self._transformers):
if isinstance(transformer, InvertibleDataTransformer):
- data = transformer.inverse_transform(data)
+ data = transformer.inverse_transform(
+ data,
+ series_idx=series_idx,
+ )
return data

@property
def invertible(self) -> bool:
"""
Returns whether the pipeline is invertible or not.
@@ -218,6 +243,34 @@ def invertible(self) -> bool:
"""
return self._invertible

@property
def fittable(self) -> bool:
"""
Returns whether the pipeline is fittable or not.
A pipeline is fittable if at least one of the transformers in the pipeline is fittable.
Returns
-------
bool
`True` if the pipeline is fittable, `False` otherwise
"""
return self._fittable

@property
def _fit_called(self) -> bool:
"""
Returns whether all the transformers in the pipeline were fitted (when applicable).
Returns
-------
bool
`True` if all the fittable transformers are fitted, `False` otherwise
"""
return all(
(not isinstance(t, FittableDataTransformer)) or t._fit_called
for t in self._transformers
)
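
The three pipeline-level flags added in this file (`_fittable`, `_global_fit`, `_fit_called`) all reduce to `any` / `all` over the transformer list. The sketch below reproduces that logic with stand-in classes so it can be run without Darts; `Transformer`, `FittableTransformer`, and `pipeline_flags` are hypothetical names used only for illustration.

```python
class Transformer:
    """Stand-in for a non-fittable transformer (hypothetical, not the Darts class)."""


class FittableTransformer(Transformer):
    """Stand-in for a fittable transformer (hypothetical, not the Darts class)."""

    def __init__(self, global_fit: bool = False):
        self._global_fit = global_fit
        self._fit_called = False

    def fit(self) -> "FittableTransformer":
        self._fit_called = True
        return self


def pipeline_flags(transformers):
    """Return (fittable, global_fit, fit_called) using the same any/all logic as the diff."""
    # fittable: at least one transformer is fittable
    fittable = any(isinstance(t, FittableTransformer) for t in transformers)
    # global_fit: every fittable transformer was created with a global fit
    global_fit = all(
        t._global_fit for t in transformers if isinstance(t, FittableTransformer)
    )
    # fit_called: non-fittable transformers count as "fitted"
    fit_called = all(
        (not isinstance(t, FittableTransformer)) or t._fit_called
        for t in transformers
    )
    return fittable, global_fit, fit_called


steps = [Transformer(), FittableTransformer()]
print(pipeline_flags(steps))   # before fitting
steps[1].fit()
print(pipeline_flags(steps))   # after fitting
```

One consequence of using `all` is worth noting: for a pipeline with no fittable transformers, both `global_fit` and `fit_called` evaluate to `True`, because `all` over an empty iterable is `True`.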

def __getitem__(self, key: Union[int, slice]) -> "Pipeline":
"""
Gets subset of Pipeline based either on index or slice with indexes.
22 changes: 20 additions & 2 deletions darts/dataprocessing/transformers/base_data_transformer.py
@@ -304,6 +304,7 @@ def transform(
series: Union[TimeSeries, Sequence[TimeSeries]],
*args,
component_mask: Optional[np.array] = None,
series_idx: Optional[Union[int, Sequence[int]]] = None,
**kwargs,
) -> Union[TimeSeries, list[TimeSeries]]:
"""Transforms a (sequence of) series by calling the user-implemented `ts_transform` method.
@@ -328,6 +329,9 @@ def transform(
attribute was set to `True` when instantiating `BaseDataTransformer`, then the component mask
will be automatically applied to each `TimeSeries` input. Otherwise, `component_mask` will be
provided as an additional keyword argument to `ts_transform`. See 'Notes' for further details.
series_idx
Optionally, the index(es) of each series corresponding to their positions within the series used to fit
the transformer (to retrieve the appropriate transformer parameters).
kwargs
Additional keyword arguments for each :func:`ts_transform()` method call
@@ -360,10 +364,16 @@ def transform(
# Take note of original input for unmasking purposes:
if isinstance(series, TimeSeries):
data = [series]
- transformer_selector = [0]
+ if series_idx:
+ transformer_selector = self._process_series_idx(series_idx)
+ else:
+ transformer_selector = [0]
else:
data = series
- transformer_selector = range(len(series))
+ if series_idx:
+ transformer_selector = self._process_series_idx(series_idx)
+ else:
+ transformer_selector = range(len(series))

input_iterator = _build_tqdm_iterator(
zip(data, self._get_params(transformer_selector=transformer_selector)),
@@ -439,6 +449,14 @@ def _check_fixed_params(self, transformer_selector: Iterable) -> None:
)
return None

@staticmethod
def _process_series_idx(series_idx: Union[int, Sequence[int]]) -> Sequence[int]:
"""Convert the `series_idx` to a Sequence[int].
Note: the validity of the entries in series_idx is checked in _get_params().
"""
return [series_idx] if isinstance(series_idx, int) else series_idx
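
To illustrate how a normalized `series_idx` picks the fitted parameters for each input series, here is a minimal sketch. `select_params` and `fitted_params` are hypothetical names, and strings stand in for whatever a transformer stores per series. Unlike the diff, which tests `if series_idx:` (so `series_idx=0` takes the default branch), the sketch checks `is None` explicitly.

```python
from collections.abc import Sequence
from typing import Optional, Union


def select_params(
    fitted_params: Sequence,
    n_series: int,
    series_idx: Optional[Union[int, Sequence[int]]] = None,
) -> list:
    """Pick one fitted parameter set per input series (hypothetical helper)."""
    if series_idx is None:
        # default: the i-th input series uses the i-th fitted parameter set
        selector = range(n_series)
    elif isinstance(series_idx, int):
        # a single index is wrapped into a one-element list
        selector = [series_idx]
    else:
        selector = series_idx
    return [fitted_params[i] for i in selector]


params = ["params_series_0", "params_series_1", "params_series_2"]
print(select_params(params, n_series=2))                     # default order
print(select_params(params, n_series=1, series_idx=2))       # third series only
print(select_params(params, n_series=2, series_idx=[2, 0]))  # custom order
```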

@staticmethod
def apply_component_mask(
series: TimeSeries,
16 changes: 16 additions & 0 deletions darts/dataprocessing/transformers/fittable_data_transformer.py
@@ -293,6 +293,22 @@ def fit(
)
return self

def transform(
self,
series: Union[TimeSeries, Sequence[TimeSeries]],
*args,
component_mask: Optional[np.array] = None,
series_idx: Optional[Union[int, Sequence[int]]] = None,
**kwargs,
) -> Union[TimeSeries, list[TimeSeries]]:
return super().transform(
series=series,
*args,
component_mask=component_mask,
series_idx=series_idx if not self._global_fit else None,
**kwargs,
)

def fit_transform(
self,
series: Union[TimeSeries, Sequence[TimeSeries]],
20 changes: 17 additions & 3 deletions darts/dataprocessing/transformers/invertible_data_transformer.py
@@ -257,6 +257,7 @@ def inverse_transform(
series: Union[TimeSeries, Sequence[TimeSeries], Sequence[Sequence[TimeSeries]]],
*args,
component_mask: Optional[np.array] = None,
series_idx: Optional[Union[int, Sequence[int]]] = None,
**kwargs,
) -> Union[TimeSeries, list[TimeSeries], list[list[TimeSeries]]]:
"""Inverse transforms a (sequence of) series by calling the user-implemented `ts_inverse_transform` method.
@@ -285,6 +286,9 @@ def inverse_transform(
component_mask : Optional[np.ndarray] = None
Optionally, a 1-D boolean np.ndarray of length ``series.n_components`` that specifies
which components of the underlying `series` the inverse transform should consider.
series_idx
Optionally, the index(es) of each series corresponding to their positions within the series used to fit
the transformer (to retrieve the appropriate transformer parameters).
kwargs
Additional keyword arguments for the :func:`ts_inverse_transform()` method
@@ -324,16 +328,26 @@ def inverse_transform(
called_with_sequence_series = False
if isinstance(series, TimeSeries):
data = [series]
- transformer_selector = [0]
+ if series_idx:
+ transformer_selector = self._process_series_idx(series_idx)
+ else:
+ transformer_selector = [0]
called_with_single_series = True
elif isinstance(series[0], TimeSeries): # Sequence[TimeSeries]
data = series
- transformer_selector = range(len(series))
+ if series_idx:
+ transformer_selector = self._process_series_idx(series_idx)
+ else:
+ transformer_selector = range(len(series))
called_with_sequence_series = True
else: # Sequence[Sequence[TimeSeries]]
data = []
transformer_selector = []
- for idx, series_list in enumerate(series):
+ if series_idx:
+ iterator_ = zip(self._process_series_idx(series_idx), series)
+ else:
+ iterator_ = enumerate(series)
+ for idx, series_list in iterator_:
data.extend(series_list)
transformer_selector += [idx] * len(series_list)
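
For the nested `Sequence[Sequence[TimeSeries]]` case, the loop above flattens the input and repeats each outer index once per inner series, so every forecast is paired with the parameters of the series it came from. A stand-alone sketch of that bookkeeping follows; `flatten_with_selector` is a hypothetical name, and strings stand in for `TimeSeries` objects.

```python
def flatten_with_selector(series, series_idx=None):
    """Flatten nested series and build the matching selector (hypothetical helper)."""
    data, selector = [], []
    if series_idx is not None:
        # pair each inner list with the caller-supplied transformer index
        idxs = [series_idx] if isinstance(series_idx, int) else series_idx
        iterator = zip(idxs, series)
    else:
        # default: the position of the inner list is the transformer index
        iterator = enumerate(series)
    for idx, series_list in iterator:
        data.extend(series_list)
        # repeat the index once for every series in the inner list
        selector += [idx] * len(series_list)
    return data, selector


nested = [["a_fc1", "a_fc2"], ["b_fc1"]]
print(flatten_with_selector(nested))
print(flatten_with_selector(nested, series_idx=[2, 0]))
```

Since `zip` stops at the shorter input, a `series_idx` with fewer entries than `series` would silently drop the remaining inner lists in this sketch.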

2 changes: 1 addition & 1 deletion darts/models/forecasting/fft.py
@@ -356,7 +356,7 @@ def fit(self, series: TimeSeries):
]

# set all other values in the frequency domain to 0
- self.fft_values_filtered = np.zeros(len(self.fft_values), dtype=np.complex_)
+ self.fft_values_filtered = np.zeros(len(self.fft_values), dtype=np.complex128)
self.fft_values_filtered[self.filtered_indices] = self.fft_values[
self.filtered_indices
]