Ch04 - Baseline Forecasts using darts, 'MAC000193' not in data set #12

hmzls · 2023-03-04T20:23:19Z

Hi,

I ran into a problem where the household 'MAC000193' was not available in 'selected_blocks_train_missing_imputed.parquet'.

This is probably due to a bug in Ch02 - 02-Preprocessing. Here, block_data_path.glob("*.csv") yields the files in an order that is not guaranteed; it depends on the operating system and file system. On a Windows system, the files came back in alphabetical order, but on a Linux system the order was different.

In Ch04 - 01-Setting up Experiment Harness, 50 households are sampled. Even though random_state is specified explicitly, the results may differ because the incoming lists are ordered differently.
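To illustrate why a fixed seed is not enough, here is a minimal sketch. It uses Python's standard random module as a stand-in for the notebook's sampling step, and the household IDs and the sample_households helper are made up for the example. The only point is that the same seed picks the same positions, so a different input order gives a different selection.

```python
import random

# Hypothetical household IDs; only their ordering matters for this demo.
ids_sorted = [f"MAC{i:06d}" for i in range(10)]
ids_other_order = list(reversed(ids_sorted))  # same IDs, different order


def sample_households(ids, k, seed):
    """Stand-in for the notebook's sampling step (an assumption, not the book's code)."""
    # A fresh generator with a fixed seed always picks the same positions.
    return random.Random(seed).sample(ids, k)


sample_a = sample_households(ids_sorted, 3, seed=42)
sample_a_again = sample_households(ids_sorted, 3, seed=42)
sample_b = sample_households(ids_other_order, 3, seed=42)

print(sample_a == sample_a_again)      # same order in, same sample out
print(set(sample_a) == set(sample_b))  # different order in, different households
```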

A quick fix is to wrap block_data_path.glob("*.csv") in the Ch02 - 02-Preprocessing notebook with sorted().
Hope this helps somebody who runs into the same problem.
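A minimal sketch of that fix, assuming block_data_path is a pathlib.Path. The list_block_files helper and the file names in the demo are hypothetical and only there to make the snippet runnable on its own.

```python
import tempfile
from pathlib import Path


def list_block_files(block_data_path: Path) -> list[Path]:
    """Return the block CSV files in a stable, sorted order."""
    # Path.glob yields entries in whatever order the file system reports them,
    # so sort before any order-sensitive step such as sampling.
    return sorted(block_data_path.glob("*.csv"))


# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    for name in ["block_2.csv", "block_10.csv", "block_1.csv"]:
        (demo_dir / name).touch()
    demo_names = [p.name for p in list_block_files(demo_dir)]

print(demo_names)
```

Note that the sort is lexicographic, so a name like block_10.csv comes before block_2.csv. That is fine here, because all that matters is that the order is the same on every system.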


manujosephv · 2023-03-05T06:36:39Z

That's awesome! I've edited the code to reflect this as well.

If everybody reports issues and suggests possible solutions, we can make the book and the associated code better for everyone!

hmzls changed the title ~~Ch04 - Baseline Forecasts, 'MAC000193' not in data set~~ Ch04 - Baseline Forecasts using darts, 'MAC000193' not in data set Mar 4, 2023
