Imp/tsmixer basic #2555
base: master
Conversation
Hi @eschibli, First of all, thanks for opening this PR! For the linting, it will make your life much easier if you follow these instructions, or you can also run it manually.
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master    #2555      +/-   ##
==========================================
- Coverage   94.15%   94.10%   -0.05%
==========================================
  Files         139      139
  Lines       14992    15006      +14
==========================================
+ Hits        14116    14122       +6
- Misses        876      884       +8
```
Thanks @madtoinou. I was not able to get Gradle running on my machine and didn't realize ruff was that easy to set up, so sorry for spamming your test pipeline. I don't believe the failing mac build is a result of my changes, so it should be good for review now.
Hi @eschibli, thanks for the PR. Yes, the failing mac tests are unrelated to your PR, we're working on it :).
Understood, Dennis.
It looks great, thank you for this nice PR @eschibli! Some minor comments about the order of the operations/projections to make the flow more intuitive.
Could you also extend the TSMixer notebook to include a section where the difference in performance with `project_first_layer=True/False` and future covariates can be visualized?
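A rough sketch of what that notebook section could look like; `train`, `val`, and `future_cov` are assumed to be existing notebook variables, and the `project_first_layer` argument is the option under review in this PR rather than an established darts parameter:

```python
from darts.metrics import mape
from darts.models import TSMixerModel

results = {}
for project_first in (True, False):
    model = TSMixerModel(
        input_chunk_length=48,
        output_chunk_length=12,
        project_first_layer=project_first,  # option added in this PR (assumed name)
        n_epochs=20,
    )
    model.fit(train, future_covariates=future_cov)
    pred = model.predict(n=len(val), future_covariates=future_cov)
    results[project_first] = mape(val, pred)

# compare accuracy with and without first-layer temporal projection
print(results)
```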
```python
x = _time_to_feature(x)

# Otherwise, encoder-style model with residual blocks in input time dimension
# In the original paper this was not implimented for future covariates,
```
Suggested change:

```diff
- # In the original paper this was not implimented for future covariates,
+ # In the original paper this was not implemented for future covariates,
```
```python
# In the original paper this was not implimented for future covariates,
# but rather than ignoring them or raising an error we remap them to the input time dimension.
# Suboptimal but may be useful in some cases.
elif self.future_cov_dim:
```
To make it a bit more intuitive, I would move this code below, inside the `if self.future_cov_dim`, and change the condition to `if not self.project_first_layer`, in order to group the operations on each kind of feature (see the sketch after this list):
- "target": project to the output time dimension in the first layer if `project_first_layer=True`, otherwise stay in the input time dimension
- "target": do the `feature_mixing_hist` (not changed)
- "fut_cov": project the future covariates to the input time dimension if `project_first_layer=False` (the logic you added)
- concatenate the future covariates to the target features (not changed)
- static covariates (not changed)
- "target": projection to the output time dimension if it did not occur earlier
- "target": application of `fc_out`, critical for probabilistic forecasts
```python
x = mixing_layer(x, x_static=x_static)

# If we are in the input time dimension, we need to project to the output time dimension.
# The original paper did not a fc_out layer (as hidden_size == output_dim)
```
Suggested change:

```diff
- # The original paper did not a fc_out layer (as hidden_size == output_dim)
+ # The original paper did not use a fc_out layer (as hidden_size == output_dim)
```
```python
if project_first_layer:
    assert model.model.sequence_length == output_len
else:
    assert model.model.sequence_length == input_len
```
Can the test also include a call to `predict()` to make sure it works as well (even if the forward pass is already occurring in the call to `fit()`)?
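A minimal sketch of what the extended test could look like. `TSMixerModel.fit()`/`predict()` are the real darts API; the `project_first_layer` argument is the option under review, and `series`, `input_len`, and `output_len` are assumed test parameters from the quoted snippet:

```python
from darts.models import TSMixerModel

# `series` is assumed to be a darts TimeSeries fixture long enough to train on
model = TSMixerModel(
    input_chunk_length=input_len,
    output_chunk_length=output_len,
    project_first_layer=project_first_layer,  # option under review in this PR
    n_epochs=1,
)
model.fit(series)

if project_first_layer:
    assert model.model.sequence_length == output_len
else:
    assert model.model.sequence_length == input_len

# exercise the full inference path, not just the forward pass inside fit()
pred = model.predict(n=output_len)
assert len(pred) == output_len
```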
Implements #2510
Summary
Adds the option to project to the output temporal space at the end of TS-Mixer rather than at the beginning. This is how most of the results in the original google-research paper were achieved (i.e., the architecture in Fig. 1 of the paper). It may allow higher performance in cases where past covariates are important, by allowing a more direct series of residual connections along the input time dimension.
I added support for future covariates by instead projecting them into the lookback temporal space, but this probably won't perform well in cases where they are more important than the historical targets and past covariates.
Other Information
The original paper and source code do not clarify whether the final temporal projection should go before or after the final feature projection, as they hardcoded `hidden_size` to `output_dim` and therefore did not need a final feature projection. I erred on the side of putting the temporal projection first, as otherwise the common `output_dim == 1` could lead to unexpected, catastrophic compression before the temporal projection step.
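A small shape demonstration of that ordering argument. The dimensions are illustrative, and `time_proj`/`fc_out` are stand-ins for the model's temporal and final feature projections:

```python
import torch

batch, input_len, output_len, hidden, output_dim = 8, 48, 12, 64, 1

x = torch.randn(batch, input_len, hidden)
time_proj = torch.nn.Linear(input_len, output_len)  # temporal projection
fc_out = torch.nn.Linear(hidden, output_dim)        # final feature projection

# temporal projection first: all hidden channels survive the time remap
y = time_proj(x.transpose(1, 2)).transpose(1, 2)  # (8, 12, 64)
y = fc_out(y)                                     # (8, 12, 1)

# reversed order: with output_dim == 1, everything is squeezed through a
# single channel before the temporal projection can mix time steps
z = fc_out(x)                                     # (8, 48, 1)
z = time_proj(z.transpose(1, 2)).transpose(1, 2)  # (8, 12, 1)
```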