This library is intended as an alternative to `pd.Series.rolling` and `pd.Series.expanding`. It gains a speedup by using numba-optimized functions that operate on numpy arrays. There are also online classes for more efficient updates of window statistics.
pip install window-ops
conda install -c conda-forge window-ops
For transformations that map `n_samples` -> `n_samples`, you can apply `[seasonal_](rolling|expanding)_(mean|max|min|std)` to an array.
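To make the naming scheme concrete, here is a minimal reference sketch in plain Python of what a rolling mean and its seasonal variant compute. It is not the library's numba implementation. The function names `reference_rolling_mean` and `reference_seasonal_rolling_mean` are hypothetical, and the sketch assumes that positions without a full window yield NaN and that the seasonal variant applies the statistic to every `season_length`-th sample.

```python
import math


def reference_rolling_mean(x, window_size):
    """Mean of the last `window_size` values; NaN until the window is full."""
    out = []
    for i in range(len(x)):
        if i + 1 < window_size:
            out.append(math.nan)
        else:
            window = x[i + 1 - window_size : i + 1]
            out.append(sum(window) / window_size)
    return out


def reference_seasonal_rolling_mean(x, season_length, window_size):
    """Apply the rolling mean separately to each season's subsequence."""
    out = [math.nan] * len(x)
    for season in range(season_length):
        # Samples season, season + season_length, ... belong to the same season.
        out[season::season_length] = reference_rolling_mean(
            x[season::season_length], window_size
        )
    return out


x = [1.0, 10.0, 3.0, 30.0, 5.0, 50.0]
print(reference_rolling_mean(x, window_size=2))
print(reference_seasonal_rolling_mean(x, season_length=2, window_size=2))
```

In both cases the output has the same length as the input, which is the `n_samples` -> `n_samples` shape described above.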
pd.__version__
'1.3.5'
n_samples = 10_000 # array size
window_size = 8 # for rolling operations
season_length = 7 # for seasonal operations
execute_times = 10 # number of times each function will be executed
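The benchmarking code itself is not shown here, so the following is only a sketch of how an average time per call could be measured with the standard library. The helper `average_time_ms` is hypothetical, and the built-in `sum` stands in for the function under test. The warm-up call matters for numba-compiled functions, because their first call includes compilation time.

```python
import timeit


def average_time_ms(fn, *args, execute_times=10):
    """Average wall-clock time of fn(*args) in milliseconds."""
    fn(*args)  # warm-up call so one-off costs (e.g. JIT compilation) are not timed
    total_seconds = timeit.timeit(lambda: fn(*args), number=execute_times)
    return 1000 * total_seconds / execute_times


n_samples = 10_000  # array size
execute_times = 10  # number of times each function will be executed
data = [float(i) for i in range(n_samples)]

# `sum` is a placeholder for the rolling/expanding function being timed.
elapsed_ms = average_time_ms(sum, data, execute_times=execute_times)
print(f"sum over {n_samples} values: {elapsed_ms:.2f} ms")
```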
Average times in milliseconds.
times.applymap('{:.2f}'.format)
function | window_ops (ms) | pandas (ms) |
---|---|---|
rolling_mean | 0.03 | 0.43 |
rolling_max | 0.14 | 0.57 |
rolling_min | 0.14 | 0.58 |
rolling_std | 0.06 | 0.54 |
expanding_mean | 0.03 | 0.31 |
expanding_max | 0.05 | 0.76 |
expanding_min | 0.05 | 0.47 |
expanding_std | 0.09 | 0.41 |
seasonal_rolling_mean | 0.05 | 3.89 |
seasonal_rolling_max | 0.18 | 4.27 |
seasonal_rolling_min | 0.18 | 3.75 |
seasonal_rolling_std | 0.08 | 4.38 |
seasonal_expanding_mean | 0.04 | 3.18 |
seasonal_expanding_max | 0.06 | 3.29 |
seasonal_expanding_min | 0.06 | 3.28 |
seasonal_expanding_std | 0.12 | 3.89 |
speedups = times['pandas'] / times['window_ops']
speedups = speedups.to_frame('times faster')
speedups.applymap('{:.0f}'.format)
function | times faster |
---|---|
rolling_mean | 15 |
rolling_max | 4 |
rolling_min | 4 |
rolling_std | 9 |
expanding_mean | 12 |
expanding_max | 15 |
expanding_min | 9 |
expanding_std | 4 |
seasonal_rolling_mean | 77 |
seasonal_rolling_max | 23 |
seasonal_rolling_min | 21 |
seasonal_rolling_std | 52 |
seasonal_expanding_mean | 78 |
seasonal_expanding_max | 52 |
seasonal_expanding_min | 51 |
seasonal_expanding_std | 33 |
If you have an array for which you want to compute a window statistic and then keep updating it as more samples come in, you can use the classes in the `window_ops.online` module. They all have a `fit_transform` method, which takes the array and returns the transformations defined above, and an `update` method, which takes a single value and returns the new statistic.
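The idea behind the online classes can be sketched with a small rolling-mean example in plain Python. This is an illustration of the `fit_transform` / `update` pattern under stated assumptions, not the library's implementation: the class name `OnlineRollingMeanSketch` is hypothetical, and it assumes NaN is returned until the window is full.

```python
import math
from collections import deque


class OnlineRollingMeanSketch:
    """Keeps only the last `window_size` values and their running sum."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque(maxlen=window_size)
        self.total = 0.0

    def _push(self, value):
        # Remove the contribution of the value that is about to be evicted.
        if len(self.window) == self.window_size:
            self.total -= self.window[0]
        self.window.append(value)  # deque(maxlen=...) drops the oldest value
        self.total += value
        if len(self.window) < self.window_size:
            return math.nan
        return self.total / self.window_size

    def fit_transform(self, x):
        """Return the rolling mean for every sample and keep the final state."""
        self.window.clear()
        self.total = 0.0
        return [self._push(value) for value in x]

    def update(self, value):
        """Consume one new sample and return the updated rolling mean."""
        return self._push(value)


roll = OnlineRollingMeanSketch(window_size=3)
print(roll.fit_transform([1.0, 2.0, 3.0, 4.0]))  # [nan, nan, 2.0, 3.0]
print(roll.update(5.0))  # 4.0
```

Each update costs constant time because only the stored window and its sum are touched, instead of recomputing the statistic over the whole array.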
Average time in milliseconds it takes to transform the array and perform 100 updates.
times.to_frame().applymap('{:.2f}'.format)
class | average time (ms) |
---|---|
RollingMean | 0.12 |
RollingMax | 0.23 |
RollingMin | 0.22 |
RollingStd | 0.32 |
ExpandingMean | 0.10 |
ExpandingMax | 0.07 |
ExpandingMin | 0.07 |
ExpandingStd | 0.17 |
SeasonalRollingMean | 0.28 |
SeasonalRollingMax | 0.35 |
SeasonalRollingMin | 0.38 |
SeasonalRollingStd | 0.42 |
SeasonalExpandingMean | 0.17 |
SeasonalExpandingMax | 0.14 |
SeasonalExpandingMin | 0.15 |
SeasonalExpandingStd | 0.23 |