Skip to content

fix(climate): replace 30-day rolling baseline with WMO 30-year normals#2561

Open
fuleinist wants to merge 4 commits intokoala73:mainfrom
fuleinist:fix/climate-wmo-normals
Open

fix(climate): replace 30-day rolling baseline with WMO 30-year normals#2561
fuleinist wants to merge 4 commits intokoala73:mainfrom
fuleinist:fix/climate-wmo-normals

Conversation

@fuleinist
Copy link
Copy Markdown
Contributor

Summary

This PR replaces the climatologically meaningless 30-day rolling baseline with proper WMO 30-year climatological normals (1991-2020) for climate anomaly detection.

Problem

The current implementation in seed-climate-anomalies.mjs compares the last 7 days against the previous 23 days of the same 30-day window. This is wrong because:

  • A sustained heat wave during a uniformly hot month will NOT appear anomalous
  • The baseline is equally hot, masking real climate anomalies

Solution

Step 1: New seed-climate-zone-normals.mjs

  • Fetches Open-Meteo archive API for 1991-2020 for each climate zone
  • Aggregates monthly means (temperature, precipitation) per zone
  • Writes to climate:zone-normals:v1 (TTL 30 days)
  • Runs monthly (designed for Railway cron)

Step 2: Updated seed-climate-anomalies.mjs

  • Reads climate:zone-normals:v1 from Redis as baseline
  • Computes anomaly = current 7-day mean minus historical monthly mean
  • Falls back to rolling 30-day baseline if normals not yet cached (backwards-compatible)
  • No change to climate:anomalies:v1 cache key — fix in place

Step 3: Added 7 new climate zones

New zones for climate-specific monitoring:

  • Arctic (70N, 0E) — sea ice proxy
  • Greenland (72N, -42W) — ice sheet melt
  • West Antarctic (-78S, -100W) — Antarctic Ice Sheet
  • Tibetan Plateau (31N, 91E) — third pole
  • Congo Basin (-1N, 24E) — tropical forest
  • Coral Triangle (-5S, 128E) — reef bleaching proxy
  • North Atlantic (55N, -30W) — AMOC slowdown signal

Step 4: Cache key registration

Added climateZoneNormals: 'climate:zone-normals:v1' to server/_shared/cache-keys.ts.

Testing

  • New seeder file created with proper structure
  • Existing anomaly seeder updated with WMO normal lookup
  • Backwards-compatible fallback for initial run
  • All 22 zones (15 original + 7 new) included

Related Issue

Fixes #2467

Subagent added 2 commits March 31, 2026 05:55
- Create seed-climate-zone-normals.mjs to fetch 1991-2020 historical
  monthly means from Open-Meteo archive API per zone
- Update seed-climate-anomalies.mjs to use WMO normals as baseline
  instead of climatologically meaningless 30-day rolling window
- Add 7 new climate-specific zones: Arctic, Greenland, WestAntarctic,
  TibetanPlateau, CongoBasin, CoralTriangle, NorthAtlantic
- Register climateZoneNormals cache key in cache-keys.ts
- Add fallback to rolling baseline if normals not yet cached

Fixes: koala73#2467
- seed-climate-zone-normals.mjs: Now fetches normals for ALL 22 zones
  (15 original geopolitical + 7 new climate zones) instead of just
  the 7 new climate zones. The 15 original zones were falling through
  to the broken rolling fallback.

- seed-climate-anomalies.mjs: Fixed rolling fallback to fetch 30 days
  of data when WMO normals are not yet cached. Previously fetched only
  7 days, causing baselineTemps slice to be empty and returning null
  for all zones. Now properly falls back to 30-day rolling baseline
  (last 7 days vs. prior 23 days) when normals seeder hasn't run.

- cache-keys.ts: Removed climateZoneNormals from BOOTSTRAP_CACHE_KEYS.
  This is an internal seed-pipeline artifact (used by the anomaly
  seeder to read cached normals) and is not meant for the bootstrap
  endpoint. Only climate:anomalies:v1 (the final computed output)
  should be exposed to clients.

Fixes greptile-apps P1 comments on PR koala73#2504.
@vercel
Copy link
Copy Markdown

vercel bot commented Mar 30, 2026

Someone is attempting to deploy a commit to the Elie Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions github-actions bot added the trust:safe Brin: contributor trust score safe label Mar 30, 2026
@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Mar 30, 2026

Greptile Summary

This PR replaces the climatologically flawed 30-day rolling baseline in seed-climate-anomalies.mjs with proper WMO 30-year climatological normals (1991-2020) and adds a new monthly seeder (seed-climate-zone-normals.mjs) to compute and cache those normals. The approach and architecture are sound — the backwards-compatible fallback, Redis cache structure, and new zone additions are all well-designed. However, two P1 bugs need to be addressed before this is production-safe:

  • Archive API lag breaks the WMO-normals path every run (seed-climate-anomalies.mjs:215): When normals are available, only 7 days of archive data are fetched. Open-Meteo's archive API lags by 3-5 days, so the window typically contains only 2-4 valid data points. This causes every zone to fail the temps.length < 7 guard, and the seeder throws its MIN_ZONES error on every run — meaning the new WMO path is effectively disabled in production. Fetching 14 days instead of 7 and continuing to slice(-7) would fix this.
  • 660 sequential API calls in the normals seeder will exceed Railway's cron timeout (seed-climate-zone-normals.mjs:73): The seeder loops over 30 years × 22 zones = 660 requests, even though Open-Meteo supports a single request covering the full 1991-01-01 to 2020-12-31 range. Batching into 22 per-zone requests would cut runtime from ~45 minutes to ~2 minutes.

Additional P2 observations:

  • The zone list is duplicated verbatim in both files with a "must be kept in sync" comment — a shared _climate-zones.mjs module would eliminate silent divergence risk.
  • The PR description describes adding climateZoneNormals: 'climate:zone-normals:v1' to server/_shared/cache-keys.ts (Step 4), but that file was not modified in this PR.

Confidence Score: 4/5

  • Not safe to merge as-is — the WMO-normals path will silently fail on every production run and the normals seeder is likely to time out on Railway.
  • Two P1 defects exist on the primary changed paths: the 7-day archive fetch window guarantees < 7 valid data points on every run (causing MIN_ZONES error), and the year-by-year loop in the normals seeder will exceed Railway cron timeouts. Both are straightforward to fix (one-line change for the fetch window, restructuring the loop for the seeder). The fallback/backwards-compat logic is solid, and the overall architecture is correct — once these two issues are resolved the PR should be safe to merge.
  • Both scripts/seed-climate-anomalies.mjs (line 215) and scripts/seed-climate-zone-normals.mjs (line 73) require fixes before merging.

Important Files Changed

Filename Overview
scripts/seed-climate-anomalies.mjs Updated to use WMO 30-year normals from Redis instead of a rolling 30-day baseline; adds 7 new climate zones; includes fallback logic — but the 7-day archive fetch window is too narrow given Open-Meteo's 3-5 day lag, causing the WMO-normals path to fail on every production run.
scripts/seed-climate-zone-normals.mjs New seeder that fetches 1991-2020 historical data and computes monthly means per zone — correct approach, but fetches 30 years one year at a time (660 API calls) when a single multi-year request per zone would suffice, risking Railway cron timeout.

Sequence Diagram

sequenceDiagram
    participant Cron as Railway Cron (monthly)
    participant NS as seed-climate-zone-normals.mjs
    participant OM as Open-Meteo Archive API
    participant Redis as Upstash Redis

    Cron->>NS: trigger (1st of month)
    loop For each of 22 zones (currently: 30 calls/zone)
        NS->>OM: GET /v1/archive (1 year per request × 30)
        OM-->>NS: daily temp + precip data
    end
    NS->>NS: aggregate to monthly means (12 normals/zone)
    NS->>Redis: SET climate:zone-normals:v1 (TTL 30d)

    participant AS as seed-climate-anomalies.mjs (every 3h)
    participant OM2 as Open-Meteo Archive API

    AS->>Redis: GET climate:zone-normals:v1
    alt Normals cached
        Redis-->>AS: { zones: [...22 zones × 12 months] }
        AS->>OM2: GET /v1/archive (last 7 days per zone)
        OM2-->>AS: daily data (may be < 7 due to archive lag ⚠️)
        AS->>AS: anomaly = currentMean − monthNormal.tempMean
        AS->>Redis: SET climate:anomalies:v1 (TTL 3h)
    else Normals not yet cached (fallback)
        Redis-->>AS: null
        AS->>OM2: GET /v1/archive (last 30 days per zone)
        OM2-->>AS: daily data
        AS->>AS: anomaly = last7d mean − prev23d mean (old logic)
        AS->>Redis: SET climate:anomalies:v1 (TTL 3h)
    end
Loading

Reviews (1): Last reviewed commit: "fix(climate): address greptile-apps revi..." | Re-trigger Greptile

Comment on lines +215 to +216
const daysToFetch = hasNormals ? 7 : 30;
const startDate = new Date(Date.now() - daysToFetch * 24 * 60 * 60 * 1000).toISOString().slice(0, 10);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 7-day fetch window is too narrow for the archive API lag

When hasNormals = true, daysToFetch is set to 7, which means the request window is exactly today-minus-7 to today. The Open-Meteo archive API (archive-api.open-meteo.com) typically lags by 3-5 days — values for the last few days return as null and are filtered out by the null-check loop (lines 125–130).

On any given run this means temps will contain only 2–4 valid entries instead of 7, causing temps.length < 7 (line 132) to trigger and fetchZone to return null for every single zone. All 22 zones return null, the MIN_ZONES check fires, and the seeder throws — meaning the WMO-normals path always fails in production until this is fixed.

The existing fallback (no-normals, daysToFetch = 30) was immune to this because slicing temps.slice(-7) on 25+ valid days is safe even when the last few are unavailable.

Suggested change
const daysToFetch = hasNormals ? 7 : 30;
const startDate = new Date(Date.now() - daysToFetch * 24 * 60 * 60 * 1000).toISOString().slice(0, 10);
const daysToFetch = hasNormals ? 14 : 30;

Then keep the temps.slice(-7) call unchanged — it will naturally pick the 7 most-recent valid days even if the last few days are missing from the archive.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your comment has been addressed: The 7-day fetch window has been changed to 14 days when hasNormals=true (daysToFetch = hasNormals ? 14 : 30). This ensures at least 7 valid data points are available for the slice(-7) call even when the Open-Meteo archive API lags by 3-5 days.

Comment on lines +73 to +76
for (let year = startYear; year <= endYear; year++) {
const yearStart = `${year}-01-01`;
const yearEnd = `${year}-12-31`;
const url = `https://archive-api.open-meteo.com/v1/archive?latitude=${zone.lat}&longitude=${zone.lon}&start_date=${yearStart}&end_date=${yearEnd}&daily=temperature_2m_mean,precipitation_sum&timezone=UTC`;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 660 sequential API calls will likely exceed Railway cron timeout

The loop fetches one year at a time for each of 22 zones: 30 years × 22 zones = 660 sequential HTTP requests. With a 100 ms delay between requests plus Open-Meteo response latency (typically 1-3 s), the total wall-clock time is roughly 30–60 minutes per run. Railway's default cron job timeout is ~10 minutes.

The comment on line 68 even notes that Open-Meteo supports date ranges, but then contradicts itself by fetching year-by-year. Open-Meteo's archive endpoint fully supports a 30-year span in a single call (start_date=1991-01-01&end_date=2020-12-31).

Change the loop body to a single per-zone fetch:

// Replace the year-by-year loop with a single request covering the full 30-year range
const url = `https://archive-api.open-meteo.com/v1/archive?latitude=${zone.lat}&longitude=${zone.lon}&start_date=1991-01-01&end_date=2020-12-31&daily=temperature_2m_mean,precipitation_sum&timezone=UTC`;

This reduces 660 calls to 22 calls (one per zone), cutting runtime from ~45 minutes to ~2 minutes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your comment has been addressed: The year-by-year loop has been replaced with a single per-zone API call covering the full 1991-2020 range. This reduces 660 sequential calls to 22 calls, cutting runtime from ~45 minutes to ~2 minutes and resolving the Railway cron timeout issue.

Comment on lines +50 to +51
// All 22 zones — must match ALL_ZONES in seed-climate-anomalies.mjs
const ALL_ZONES = [...ZONES, ...CLIMATE_ZONES];
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Duplicated zone list requires manual synchronisation

Both seed-climate-zone-normals.mjs and seed-climate-anomalies.mjs define ZONES, CLIMATE_ZONES, and ALL_ZONES independently (with a comment saying "must be kept in sync"). If a zone is renamed in one file but not the other, normals?.find((n) => n.zone === zone.name) in the anomaly seeder returns undefined for that zone — silently falling back to the rolling 30-day baseline with no error or warning.

Consider extracting the zone definitions into scripts/_climate-zones.mjs and importing from both seeders. This removes the synchronisation burden entirely.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your comment has been addressed: Zone definitions have been extracted into scripts/_climate-zones.mjs which exports ZONES, CLIMATE_ZONES, ALL_ZONES, and MIN_ZONES. Both seed-climate-zone-normals.mjs and seed-climate-anomalies.mjs now import from this shared file, eliminating the synchronisation burden.

Comment on lines +178 to +186
runSeed('climate', 'zone-normals', CANONICAL_KEY, fetchAllZoneNormals, {
validateFn: validate,
ttlSeconds: CACHE_TTL,
sourceVersion: 'open-meteo-archive-wmo-normals',
}).catch((err) => {
const _cause = err.cause ? ` (cause: ${err.cause.message || err.cause.code || err.cause})` : '';
console.error('FATAL:', (err.message || err) + _cause);
process.exit(1);
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 cache-keys.ts not updated as described in the PR

The PR description (Step 4) states that climateZoneNormals: 'climate:zone-normals:v1' will be added to server/_shared/cache-keys.ts, but cache-keys.ts was not modified in this PR. The key string 'climate:zone-normals:v1' is currently only defined as local constants in the two seed scripts.

While this doesn't break runtime behaviour (the seed scripts define their own constant), it means any server-side code that tries to read this key using BOOTSTRAP_CACHE_KEYS.climateZoneNormals would fail at the TypeScript level. Per the project conventions in AGENTS.md, cache key strings should be registered in cache-keys.ts as a single source of truth.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your comment has been addressed: climateZoneNormals: 'climate:zone-normals:v1' has been added to BOOTSTRAP_CACHE_KEYS in server/_shared/cache-keys.ts, aligning with the project convention that cache key strings should be registered there as a single source of truth.

@fuleinist fuleinist force-pushed the fix/climate-wmo-normals branch from 8f996bf to cc576f5 Compare March 31, 2026 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

trust:safe Brin: contributor trust score safe

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(climate): replace 30-day rolling baseline with 30-year WMO normals in anomaly seeder

1 participant