
[FEAT] Evaluation to utils #311

Merged
merged 6 commits into main from feat/eval_to_utils on Dec 12, 2024

Conversation

@elephaint (Contributor) commented Dec 2, 2024

PR:

  • Softly deprecates the HierarchicalEvaluation class in favor of a function evaluate, which is a small wrapper around utilsforecast's evaluate and works with utilsforecast's evaluation functions. This further unifies the HF API across our packages (see the usage sketch after this list).
  • The deprecation is soft in the sense that, for now, we keep the HierarchicalEvaluation class and loss functions but remove them from the examples and the docs, and I added a deprecation notice to each.
  • Updates all examples to show the new evaluation method.
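For concreteness, here is a minimal sketch of how the new function-based API is meant to be used. The toy data, the tags argument, and the exact signature are assumptions based on this description and utilsforecast's evaluate, not a verbatim copy of the merged code:

```python
# Hedged sketch of the new function-based evaluation API. Assumes
# hierarchicalforecast.evaluation.evaluate mirrors utilsforecast's
# evaluate with an extra `tags` argument for the hierarchy levels;
# check the merged code for the exact signature.
import numpy as np
import pandas as pd
from utilsforecast.losses import mse

from hierarchicalforecast.evaluation import evaluate

# Toy long-format frame: actuals `y` plus one column per model.
df = pd.DataFrame({
    'unique_id': ['total', 'total', 'a', 'a', 'b', 'b'],
    'ds': [1, 2] * 3,
    'y': [10.0, 12.0, 6.0, 7.0, 4.0, 5.0],
    'Naive': [9.0, 10.0, 5.0, 6.0, 4.0, 4.0],
    'MinTrace': [10.5, 11.5, 6.2, 6.8, 4.1, 4.9],
})
# Hierarchy levels -> member series, as produced by aggregate().
tags = {'Total': np.array(['total']), 'Bottom': np.array(['a', 'b'])}

evaluation = evaluate(df=df, metrics=[mse], tags=tags)
print(evaluation)  # one row per (hierarchy level, metric), one column per model
```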

Note:
utilsforecast's scaled_crps is different from what is currently in hierarchicalforecast: the former normalizes on a per-series basis, whereas the GluonTS implementation (which hierarchicalforecast currently follows) normalizes the CRPS using a norm computed over all time series. Therefore, when recalculating our examples using the scaled CRPS of utilsforecast, we get different error metrics (and different conclusions!). The toy example below illustrates the difference.
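This is plain numpy, not the library formulas: with one small and one large series, per-series normalization weights both series equally, while the pooled norm lets the large series dominate.

```python
# Illustration (not library code) of the two normalizations, using an
# unnormalized CRPS value per series and per-series absolute-target sums.
import numpy as np

crps = np.array([2.0, 50.0])           # unnormalized CRPS per series
abs_scale = np.array([10.0, 1000.0])   # sum of |y| per series

# Per-series normalization (utilsforecast-style): scale each series
# by its own norm, then average across series.
per_series = np.mean(crps / abs_scale)   # (0.2 + 0.05) / 2 = 0.125

# Global normalization (GluonTS-style, current hierarchicalforecast):
# pool numerator and denominator across all series.
pooled = crps.sum() / abs_scale.sum()    # 52 / 1010 ≈ 0.0515

print(per_series, pooled)  # the two scores (and rankings) can diverge
```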

Todo / to solve:

  • Using utilsforecast, we evaluate slightly differently than we currently do in HF, in particular when using benchmark models in the evaluation metric. We previously computed a relative score as (overall_scalar_loss / overall_scalar_loss_benchmark), whereas with utilsforecast's loss functions we compute the relative loss per time series and then take the mean across those relative losses. Added a benchmark option to evaluate to obtain the original behavior; the toy example below shows how the two aggregations can disagree.
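A toy illustration of why the two aggregations differ:

```python
# Toy illustration of the two ways of computing a relative score.
import numpy as np

model_loss = np.array([1.0, 10.0])       # per-series loss of the model
benchmark_loss = np.array([2.0, 5.0])    # per-series loss of the benchmark

# Old HierarchicalEvaluation behavior: ratio of overall (aggregated) losses.
overall_relative = model_loss.mean() / benchmark_loss.mean()   # 5.5 / 3.5 ≈ 1.571

# utilsforecast-style: per-series ratios, then the mean across series.
per_series_relative = np.mean(model_loss / benchmark_loss)     # (0.5 + 2.0) / 2 = 1.25

print(overall_relative, per_series_relative)  # the two can disagree materially
```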

The following loss functions should be added to utilsforecast to achieve parity with current HF functionality:

  • log_score: requires a set of inputs that's difficult to provide in utilsforecast without some fiddling. Need to think about what this score adds beyond scaled_crps and energy_score; I don't think it adds much, to be honest.
  • energy_score: deferred for now. The loss function can still be used via the old API; it's removed from the examples, keeping only scaled_crps.
  • msse: already in utilsforecast.
  • rel_mse: not including this one in utilsforecast; users should use mse in conjunction with the benchmark argument of evaluate to get the proper relative MSE (see the sketch below).
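A hedged sketch of the rel_mse replacement, reusing the toy df and tags from the first sketch above. The benchmark keyword and the output naming are taken from this description and may differ from the merged code:

```python
# Sketch of a relative MSE via the benchmark option added in this PR.
# `benchmark` is assumed to take the column name of the reference model;
# the exact keyword and output column naming are not verified against
# the merged signature.
from utilsforecast.losses import mse

from hierarchicalforecast.evaluation import evaluate

# df and tags as in the earlier sketch: long-format actuals/forecasts
# with a 'Naive' column to act as the benchmark model.
evaluation = evaluate(
    df=df,
    metrics=[mse],
    tags=tags,
    benchmark='Naive',  # divide each model's overall MSE by Naive's MSE
)
# Scores are then relative: values below 1 beat the benchmark.
```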


@elephaint marked this pull request as ready for review December 2, 2024 14:58
@elephaint changed the title [FEAT] Evalutation to utils [FEAT] Evaluation to utils Dec 2, 2024
@elephaint marked this pull request as draft December 2, 2024 19:37
@elephaint marked this pull request as ready for review December 11, 2024 14:33
@elephaint requested a review from jmoralez December 11, 2024 14:35
@jmoralez previously approved these changes Dec 11, 2024
hierarchicalforecast/evaluation.py: review comments (outdated, resolved)
@elephaint merged commit 0cad0a1 into main Dec 12, 2024
17 checks passed
@elephaint deleted the feat/eval_to_utils branch December 12, 2024 19:31