Ability to specify fold indices for k-folds cross-validation #102

hihosilvers · 2023-07-03T18:58:18Z

Hi, with time series classification or regression (particularly regression) you often want to train your model on subsequences of a longer time series. In that case, consecutive examples are shifted relative to each other by one time unit.

E.g. Example 1 might have timestamps:
[1, 2, 3, 4, 5]
and Example 2 might have timestamps:
[2, 3, 4, 5, 6]
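
A minimal sketch of how such overlapping examples arise, assuming a window length of 5 and a stride of 1. The helper name `sliding_windows` is hypothetical and only for illustration; it is not part of HyperTS.

```python
def sliding_windows(timestamps, length, stride=1):
    """Return every window of `length` consecutive items, advancing by `stride`."""
    return [
        timestamps[start:start + length]
        for start in range(0, len(timestamps) - length + 1, stride)
    ]


# Hypothetical series with timestamps 1..7.
timestamps = list(range(1, 8))
windows = sliding_windows(timestamps, length=5)

# The first two windows correspond to Example 1 and Example 2 above.
example_1, example_2 = windows[0], windows[1]

# Timestamps that appear in both examples.
shared = sorted(set(example_1) & set(example_2))
```

With these settings, Examples 1 and 2 share four of their five timestamps, which is why they are nearly the same sample.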

In these cases, it is important to ensure that Examples 1 and 2 are either both used for training or both used for validation in each split, since cross-validation results are easily inflated if one is used for training and the other for validation.

To solve this problem, we might specify that Examples 1 and 2 belong to the same cross-validation fold. In practice this means drawing fold boundaries along some fundamental grouping of the samples (e.g. assigning different patients in an ECG dataset to different folds). This is currently not possible with HyperTS's make_experiment function, since you can specify only the number of folds, not where the fold boundaries fall (a common real-world need, because the folds often have to be of different lengths).
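
To illustrate the grouping idea, here is a rough sketch that derives one fold id per example from a group label such as a patient ID, so that no patient is split across folds. The function name `fold_ids_from_groups`, the greedy balancing rule, and the patient labels are all assumptions made for this example; scikit-learn's `GroupKFold` offers group-aware splitting of this kind.

```python
from collections import Counter


def fold_ids_from_groups(groups, n_folds):
    """Map each example to a fold so that every group lands in exactly one fold.

    Greedy heuristic (an assumption of this sketch): take groups from largest
    to smallest and put each one into the fold that is currently smallest.
    """
    sizes = Counter(groups)
    fold_sizes = [0] * n_folds
    group_to_fold = {}
    for group, size in sorted(sizes.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        fold = fold_sizes.index(min(fold_sizes))
        group_to_fold[group] = fold
        fold_sizes[fold] += size
    return [group_to_fold[g] for g in groups]


# Hypothetical patient label for each of 10 examples.
patients = ["p1", "p1", "p1", "p2", "p2", "p3", "p3", "p3", "p3", "p4"]
fold_ids = fold_ids_from_groups(patients, n_folds=3)
```

The resulting folds have different sizes (4, 3 and 3 examples here), which is the situation a plain "number of folds" setting cannot express.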

This can be remedied by allowing the user to specify the fold for each training example by passing an array of length n (where n is the number of examples in the dataset) to the make_experiment function.
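
As a sketch of what such a per-example fold array could drive, the snippet below turns it into train/validation index pairs. The name `splits_from_fold_ids` is hypothetical, and how this would be wired into make_experiment is not shown because no such parameter exists yet. scikit-learn's `PredefinedSplit` accepts a similar per-sample `test_fold` array.

```python
def splits_from_fold_ids(fold_ids):
    """Yield (train_indices, val_indices) once per distinct fold id."""
    for fold in sorted(set(fold_ids)):
        val = [i for i, f in enumerate(fold_ids) if f == fold]
        train = [i for i, f in enumerate(fold_ids) if f != fold]
        yield train, val


# Hypothetical fold assignment for n = 6 examples; folds have unequal sizes.
fold_ids = [0, 0, 1, 1, 1, 2]
splits = list(splits_from_fold_ids(fold_ids))

for train, val in splits:
    # Examples sharing a fold id are never split between train and validation.
    print("train:", train, "val:", val)
```

Each example appears in the validation set exactly once, and examples that share a fold id always stay on the same side of the split.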

Replies: 0 comments

Select a reply

Ability to specify fold indices for k-folds cross-validation #102

hihosilvers Jul 3, 2023

Replies: 0 comments

hihosilvers
Jul 3, 2023