Forecast Checks

Jarad Niemi edited this page Jun 4, 2020 · 18 revisions

Current Validation checks

Validation of data in the data-processed directory is performed by the script test-formatting.py, which uses the zoltpy function validate_quantile_csv_file(). That function performs these checks:

  1. metadata checks (In progress)

    • metadata file must have proper yaml format
    • metadata must include team_name, team_abbr, model_name, model_abbr, methods
    • team_name needs to be distinct from any already existing team_name
    • team_abbr needs to be distinct from any already existing team_abbr
    • model_name needs to be distinct from any already existing model_name
    • model_abbr needs to be distinct from any already existing model_abbr
    • methods is under 200 characters
    • forecast_startdate is a date
    • this_model_is_an_ensemble and this_model_is_unconditional are lowercase booleans
  2. forecast checks

    • header must include location, target, type, quantile, value (required for zoltpy) and forecast_date, target_end_date
    • each row must have the same number of columns as header
    • location must be in locations.csv
    • target must be in
      paste(0:130, "day ahead inc death")
      paste(0:130, "day ahead cum death")
      paste(1:20,  "wk ahead inc death")
      paste(1:20,  "wk ahead cum death")
      paste(0:130, "day ahead inc hosp")
    • the # in "# day ahead" or "# wk ahead" must be an integer (redundant)
    • forecast_date and target_end_date must be in YYYY-MM-DD format
    • quantile must be in
      c(0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99)
    • quantile must be an int or float in [0, 1]
    • value must be an int or float
  3. validates date alignment, as documented in the issue "add additional validations"

  4. validates quantiles and values (i.e., at the prediction level):

    • entries in value must be non-decreasing as quantiles increase
    • elements in quantile must be unique
  5. validates quantiles as a group:

    • there must be exactly one point prediction for each location/target pair
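
The target, quantile, and prediction-level rules above can be sketched in Python roughly as follows. This is an illustrative sketch only, not the zoltpy implementation: the names VALID_TARGETS, VALID_QUANTILES, and check_quantile_rows are hypothetical, and it assumes each row is a dict keyed by the header columns, with type equal to "point" for point predictions.

```python
from collections import defaultdict

# Allowed targets, mirroring the paste() expressions in the list above.
VALID_TARGETS = set(
    [f"{n} day ahead inc death" for n in range(0, 131)]
    + [f"{n} day ahead cum death" for n in range(0, 131)]
    + [f"{n} wk ahead inc death" for n in range(1, 21)]
    + [f"{n} wk ahead cum death" for n in range(1, 21)]
    + [f"{n} day ahead inc hosp" for n in range(0, 131)]
)

# Allowed quantiles: 0.01, 0.025, 0.05 to 0.95 in steps of 0.05, 0.975, 0.99.
VALID_QUANTILES = set(
    [0.01, 0.025] + [round(0.05 * k, 2) for k in range(1, 20)] + [0.975, 0.99]
)


def check_quantile_rows(rows):
    """Return a list of error strings for an iterable of row dicts.

    Hypothetical helper: assumes keys location, target, type, quantile, value.
    """
    errors = []
    quantile_rows = defaultdict(list)  # (location, target) -> [(quantile, value)]
    point_counts = defaultdict(int)    # (location, target) -> number of point rows

    for row in rows:
        key = (row["location"], row["target"])
        if row["target"] not in VALID_TARGETS:
            errors.append(f"invalid target: {row['target']}")
        if row["type"] == "point":
            point_counts[key] += 1
        else:
            q = float(row["quantile"])
            if round(q, 3) not in VALID_QUANTILES:
                errors.append(f"invalid quantile {q} for {key}")
            quantile_rows[key].append((q, float(row["value"])))

    for key, pairs in quantile_rows.items():
        quantiles = [q for q, _ in pairs]
        if len(set(quantiles)) != len(quantiles):
            errors.append(f"duplicate quantiles for {key}")
        # Sort by quantile, then require values to be non-decreasing.
        values = [v for _, v in sorted(pairs)]
        if any(a > b for a, b in zip(values, values[1:])):
            errors.append(f"values decrease as quantiles increase for {key}")

    for key in set(quantile_rows) | set(point_counts):
        if point_counts.get(key, 0) != 1:
            errors.append(f"expected exactly one point prediction for {key}")

    return errors
```

An empty list means the rows passed every check in the sketch. Returning all problems in one list, rather than stopping at the first, lets a submitter fix a file in a single pass.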