feat(climate+health):add shared air quality seed and mirrored health#2634
Greptile Summary

This PR introduces a shared air-quality ingestion pipeline backed by OpenAQ v3 (with optional WAQI supplementation). A single hourly seed script writes the same station payload to both `health:air-quality:v1` and `climate:air-quality:v1`.

Key findings from the review:
Confidence Score: 3/5

Mergeable with caution — the core data flow is correct, but the seed script has observability and concurrency gaps that should be addressed before relying on it in production. The handler, proto, generated-stub, and frontend service changes are clean and low risk. The seed script is the main concern: the silent `exit(0)` on a missing API key will make production misconfiguration invisible to Railway monitoring, the lock TTL is shorter than the realistic worst-case runtime (allowing concurrent writes), and the misleading parameter name creates a future maintainability hazard.

Important file changed: `scripts/seed-health-air-quality.mjs` — lock timeout, silent misconfiguration exit, and parameter naming issues.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Railway as Railway Cron (hourly)
    participant Seed as seed-health-air-quality.mjs
    participant Lock as Redis Lock
    participant OAQ as OpenAQ v3 API
    participant WAQI as WAQI API (optional)
    participant Redis as Redis (Upstash)
    participant Bootstrap as /api/bootstrap
    participant Health as /api/health/v1/list-air-quality-alerts
    participant Climate as /api/climate/v1/list-air-quality-data
    Railway->>Seed: node seed-health-air-quality.mjs
    Seed->>Lock: acquireLock(health:air-quality, 120s)
    alt Lock acquired
        par Parallel fetches
            Seed->>OAQ: GET /v3/locations?parameters=pm25 (paginated)
            Seed->>OAQ: GET /v3/parameters/2/latest (paginated)
            Seed->>WAQI: GET /map/bounds/ x6 tiles
        end
        Seed->>Seed: buildOpenAqStations() + buildWaqiStations()
        Seed->>Seed: mergeAirQualityStations()
        Seed->>Redis: PIPELINE SET health:air-quality:v1 EX 3600
        Seed->>Redis: PIPELINE SET climate:air-quality:v1 EX 3600
        Seed->>Redis: PIPELINE SET seed-meta:health:air-quality EX 604800
        Seed->>Redis: PIPELINE SET seed-meta:climate:air-quality EX 604800
        Seed->>Lock: releaseLock()
    else Lock skipped
        Seed->>Seed: process.exit(0)
    end
    Bootstrap->>Redis: MGET health:air-quality:v1, climate:air-quality:v1
    Bootstrap-->>Bootstrap: hydrate client
    Health->>Redis: GET health:air-quality:v1
    Health-->>Health: normalizeAirQualityStations() to AirQualityAlert[]
    Climate->>Redis: GET climate:air-quality:v1
    Climate-->>Climate: normalizeAirQualityStations() to AirQualityStation[]
```
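The mirrored write step in the flow above can be sketched as a pure command builder. The key names and TTLs are taken from the diagram; `buildSeedCommands` and the ioredis-style command tuples are illustrative assumptions, not the script's actual API.

```javascript
// Sketch of the mirrored dual-key write (illustrative; command tuples assume
// an ioredis-style pipeline client). Key names and TTLs match the seed flow.
const CACHE_TTL = 3600;    // payload TTL in seconds (1h)
const META_TTL = 604800;   // seed-meta TTL in seconds (7d)

function buildSeedCommands(payload, nowMs = Date.now()) {
  const body = JSON.stringify(payload);
  const meta = JSON.stringify({ seededAt: nowMs, stations: payload.stations.length });
  return [
    // Same serialized body goes to both domains, so they cannot diverge in a run.
    ['set', 'health:air-quality:v1', body, 'EX', CACHE_TTL],
    ['set', 'climate:air-quality:v1', body, 'EX', CACHE_TTL],
    ['set', 'seed-meta:health:air-quality', meta, 'EX', META_TTL],
    ['set', 'seed-meta:climate:air-quality', meta, 'EX', META_TTL],
  ];
}
```

Batching all four SETs into one pipeline keeps the write step a single round trip.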
Reviews (1): Last reviewed commit: "feat(climate+health):add shared air qual..."
scripts/seed-health-air-quality.mjs (outdated)

```js
export function buildAirQualityPayload({ locations = [], latestMeasurements = [], waqiEntries = [], nowMs = Date.now() }) {
  const openAqStations = buildOpenAqStations(locations, latestMeasurements, nowMs);
  const mergedStations = mergeAirQualityStations(openAqStations, waqiEntries);
  return {
    stations: mergedStations.map(toOutputStation),
    fetchedAt: nowMs,
  };
}
```
Misleading: `waqiEntries` parameter accepts pre-normalized stations, not raw API entries
The parameter is named `waqiEntries`, implying raw WAQI API payloads (which have fields like `lat`, `lon`, `aqi` as a string, `station.name`, `iaqi.pm25.v`). However, the implementation passes the value directly to `mergeAirQualityStations`, which expects already-normalized station objects with `city`, `countryCode`, `lng`, `measuredAt`, etc.
The actual caller — `fetchAirQualityPayload` — correctly passes pre-normalized objects (output of `buildWaqiStations`), so there is no runtime bug today. However, the exported signature is misleading:
- The tests use `waqiEntries: []`, masking the discrepancy.
- A future caller passing raw WAQI entries would silently produce wrong results: `stationIdentity` would read `station.city` (undefined on raw entries), collapsing all stations to the same key or dropping them entirely.

Rename the parameter to `waqiStations` to make the contract explicit:
```diff
-export function buildAirQualityPayload({ locations = [], latestMeasurements = [], waqiEntries = [], nowMs = Date.now() }) {
+export function buildAirQualityPayload({ locations = [], latestMeasurements = [], waqiStations = [], nowMs = Date.now() }) {
   const openAqStations = buildOpenAqStations(locations, latestMeasurements, nowMs);
-  const mergedStations = mergeAirQualityStations(openAqStations, waqiEntries);
+  const mergedStations = mergeAirQualityStations(openAqStations, waqiStations);
   return {
     stations: mergedStations.map(toOutputStation),
     fetchedAt: nowMs,
   };
 }
```
Also update the call site in `fetchAirQualityPayload` (line 392) to pass `waqiStations` instead of `waqiEntries: waqiStations`.
scripts/seed-health-air-quality.mjs (outdated)

```js
  payload = await withRetry(() => fetchAirQualityPayload(), 2, 1_000);
} catch (error) {
  await releaseLock(`${domain}:${resource}`, runId);
  const durationMs = Date.now() - startMs;
  const cause = error?.cause ? ` (cause: ${error.cause.message || error.cause.code || error.cause})` : '';
  console.error(`  FETCH FAILED: ${error?.message ?? error}${cause}`);
  await extendExistingTtl([
    HEALTH_AIR_QUALITY_KEY,
    CLIMATE_AIR_QUALITY_KEY,
    OPENAQ_META_KEY,
    CLIMATE_META_KEY,
  ], CACHE_TTL).catch(() => {});
  console.log(`\n=== Failed gracefully (${Math.round(durationMs)}ms) ===`);
  process.exit(0);
```
Missing API key causes silent `exit(0)` — Railway cron reports success on fatal misconfiguration
When OPENAQ_API_KEY is absent, buildOpenAqHeaders() throws immediately. This propagates through fetchPagedResults → fetchAirQualityPayload → the withRetry wrapper (2 retries, both instantly fail) → the catch block below.
The catch block releases the lock, extends existing TTL with stale data, then calls process.exit(0). Railway sees a successful exit code and reports the cron run as successful. A missing API key will therefore pass silently every hour, serve ever-more-stale data, and never surface as an alert.
Consider distinguishing configuration errors from transient network errors and exiting with a non-zero code for the former, so Railway cron failure alerts fire when the environment is misconfigured.
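One way to make that distinction, sketched here with a hypothetical `ConfigError` class (the real script has no such type, and this `buildOpenAqHeaders` is an illustrative reimplementation): tag configuration failures at the throw site and map them to exit codes in the catch block.

```javascript
// Hypothetical sketch: distinguish fatal misconfiguration from transient
// failures so Railway's cron monitoring sees a non-zero exit for the former.
class ConfigError extends Error {}

function buildOpenAqHeaders(env = process.env) {
  if (!env.OPENAQ_API_KEY) {
    // Throw a typed error instead of a generic one.
    throw new ConfigError('OPENAQ_API_KEY is not set');
  }
  return { 'X-API-Key': env.OPENAQ_API_KEY };
}

function exitCodeFor(error) {
  // Config errors: exit 1 so the cron run is reported as failed.
  // Transient errors (network, upstream 5xx): exit 0 after extending TTLs.
  return error instanceof ConfigError ? 1 : 0;
}
```

The catch block would then call `process.exit(exitCodeFor(error))` instead of a hard-coded `process.exit(0)`.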
```js
async function fetchPagedResults(fetchPage, label) {
  const results = [];
  let expectedFound = 0;

  for (let page = 1; page <= OPENAQ_MAX_PAGES; page++) {
    const payload = await fetchPage(page);
    const pageResults = Array.isArray(payload?.results) ? payload.results : [];
    results.push(...pageResults);

    const found = toFiniteNumber(payload?.meta?.found);
    const effectiveLimit = toFiniteNumber(payload?.meta?.limit) ?? OPENAQ_PAGE_LIMIT;
    if (found != null && found > 0) expectedFound = found;

    if (pageResults.length < effectiveLimit) break;
    if (expectedFound > 0 && results.length >= expectedFound) break;
  }

  if (results.length === 0) {
    throw new Error(`${label}: no results returned`);
  }

  return results;
}
```
Sequential pagination with a 120 s lock could time out under API pressure
fetchPagedResults fetches pages one-by-one up to OPENAQ_MAX_PAGES = 20. Both fetchOpenAqLocationsPage and fetchOpenAqLatestPage are limited to 20 pages each. While the two calls are parallelized via Promise.all, within each branch pages are sequential.
With AbortSignal.timeout(30_000) per request and 2 retries (1 s delay), a single page can take up to ~63 s in the worst case. With just a couple of slow pages, the total runtime can exceed the 120 s lock lifetime. Once the lock expires, a second Railway cron instance can acquire it and run concurrently — both writing the same payload, but doubling external API traffic and Redis pipeline pressure.
Consider increasing the lock TTL to at least 600_000 (10 minutes) to match a realistic worst-case runtime:
```js
const lockResult = await acquireLockSafely(`${domain}:${resource}`, runId, 600_000, {
```
```js
export const OPENAQ_META_KEY = 'seed-meta:health:air-quality';
export const CLIMATE_META_KEY = 'seed-meta:climate:air-quality';
```
Inconsistent constant naming: source-based vs destination-based
OPENAQ_META_KEY is named after its data source (OpenAQ), while CLIMATE_META_KEY is named after its destination namespace. Both meta entries originate from OpenAQ, so mixing naming conventions makes the pair harder to reason about. Renaming OPENAQ_META_KEY to HEALTH_META_KEY would make the two constants symmetric and clarify that one covers the health domain and the other the climate domain. Also update the import in tests/air-quality-seed.test.mjs (line 12) and all usages in main.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Did the changes the bot highlighted.
koala73 left a comment
Code Review — air quality seeder (shared health + climate)
Good work on this one — the mirrored dual-key pattern is the right approach, the proto and generated stubs look correct, and the isMain guard is properly implemented. A few issues before merge.
🔴 P1 — CACHE_TTL = 3600 violates the 3× gold standard
File: scripts/seed-health-air-quality.mjs:19
```js
export const CACHE_TTL = 3600; // equals the cron interval exactly
```

The Railway cron runs every 1h = 3600s. The gold standard for all seeders is TTL ≥ 3× interval, so the minimum is 10800s. With TTL matching the cadence exactly, a single missed run (deploy, Railway restart, lock collision) causes the data to expire before the next run executes. The `extendExistingTtl` call on failure also uses this same value, meaning repeated failures don't buy extra buffer.
Fix: `export const CACHE_TTL = 10800;` (3h)
🔴 P1 — api/seed-health.js SEED_DOMAINS not updated
File: api/seed-health.js (unchanged in this PR)
The PR updates api/health.js with SEED_META entries for both keys — good. But api/seed-health.js has its own separate SEED_DOMAINS object (used by the /api/seed-health monitoring endpoint) that is not updated. The health dashboard will never report this seeder's freshness status; it will silently stay absent.
Fix: Add to SEED_DOMAINS in api/seed-health.js:
```js
'climate:air-quality': { key: 'seed-meta:climate:air-quality', intervalMin: 60 },
'health:air-quality': { key: 'seed-meta:health:air-quality', intervalMin: 60 },
```

(Or a single shared entry if preferred, since they're the same seeder run.)
🟡 P2 — Both keys in FAST_KEYS — payload size risk
File: api/bootstrap.js:48
```js
'socialVelocity', 'climateAirQuality', 'healthAirQuality',
```

FAST_KEYS are fetched in the critical-path first batch of the bootstrap response. OpenAQ fetches up to 20 pages × 1000 measurements, filtered to the last 2h. This can easily be 5,000–15,000 stations. Putting both keys in FAST_KEYS adds that payload twice (once per domain) to the initial page load.
Consider either: removing them from FAST_KEYS (they load in the second batch), or explicitly bounding the station count in the seeder before writing to Redis (e.g., top 2000 by AQI).
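The bounding option might look like the sketch below. `MAX_STATIONS`, `boundStations`, and the `aqi` field name are assumptions for illustration, not names from the PR.

```javascript
// Hypothetical sketch: cap the station list by worst AQI before writing to
// Redis, so the bootstrap payload stays bounded regardless of upstream size.
const MAX_STATIONS = 2000;

function boundStations(stations, limit = MAX_STATIONS) {
  return [...stations]
    // Highest AQI (worst air) first; stations missing a reading sort last.
    .sort((a, b) => (b.aqi ?? -Infinity) - (a.aqi ?? -Infinity))
    .slice(0, limit);
}
```

The copy via `[...stations]` keeps the sort from mutating the caller's array.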
🟡 P2 — Shared seeder writes two separate SEED_META keys
Files: scripts/seed-health-air-quality.mjs, api/health.js
The seeder writes seed-meta:health:air-quality AND seed-meta:climate:air-quality as separate keys, and api/health.js registers both in SEED_META independently. Since it's a single run, both meta keys always have the same timestamp — but the health dashboard will show two separate staleness entries that can only diverge if a Redis write fails mid-run.
This isn't broken, but consider: a single canonical meta key (e.g., seed-meta:air-quality) read by both SEED_META entries would be simpler and eliminate the divergence risk. Not a blocker, but worth a conscious decision.
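A minimal sketch of that canonical-key option (names illustrative; the `{ key, intervalMin }` entry shape is assumed to mirror the SEED_DOMAINS pattern quoted above):

```javascript
// Hypothetical sketch: one canonical meta key written by the seeder, read by
// both registry entries, so the two staleness readings can never diverge.
const AIR_QUALITY_META_KEY = 'seed-meta:air-quality';

const SEED_META = {
  'health:air-quality': { key: AIR_QUALITY_META_KEY, intervalMin: 60 },
  'climate:air-quality': { key: AIR_QUALITY_META_KEY, intervalMin: 60 },
};
```

The seeder then writes a single meta SET per run instead of two.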
…YS, shared meta
- Raise CACHE_TTL from 3600 to 10800 (3× the 1h cron cadence; gold standard)
- Add health:air-quality to api/seed-health.js SEED_DOMAINS so monitoring dashboard tracks freshness
- Remove climateAirQuality and healthAirQuality from FAST_KEYS (large station payloads; load in slow batch)
- Point climateAirQuality SEED_META to same meta key as healthAirQuality (same seeder run, one source of truth)
…yloads avoid critical-path batch
…until panel exists
- Drop deprecated first URL attempt (parameters=pm25, order_by=lastUpdated, sort=desc); use correct v3 params (parameters_id=2, sort_order=desc) directly — eliminates guaranteed 4xx retry cycle per page on 20-page crawl
- Remove climateAirQuality and healthAirQuality from BOOTSTRAP_CACHE_KEYS, SLOW_KEYS, and BOOTSTRAP_TIERS — no panel consumes these yet; adding thousands of station records to every startup bootstrap is pure payload bloat
- Remove normalizeAirQualityPayload helpers from bootstrap.js (no longer called)
- Update service wrappers to fetch via RPC directly; re-add bootstrap hydration when a panel actually needs it
…t case 2 OpenAQ calls × 20 pages × (30s timeout × 3 attempts) = 3600s max runtime. Previous 600s TTL allowed concurrent cron runs on any degraded upstream.
Summary
Adds a shared air-quality ingestion path backed by OpenAQ v3 with optional WAQI supplementation. The new seed writes the same station payload to both `health:air-quality:v1` and `climate:air-quality:v1`, adds mirrored health/climate RPCs for that data, updates generated API clients/server stubs and OpenAPI docs, and registers the new cache/bootstrap keys plus lightweight frontend hydration consumers.

NOTE:
You mentioned that no API key is required for OpenAQ v3, but it does require one.
Hourly execution is an external infra dependency, not repo-managed. This PR assumes an hourly cron for node scripts/seed-health-air-quality.mjs; repo config does not enforce it (see vercel.json (line 3), crons: []). Ensure Railway cron is set to run every 1 hour.
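For reference, Railway cron schedules use standard five-field crontab syntax, so the hourly run would be configured along these lines in the service settings (an illustrative fragment; this configuration is not repo-managed):

```
Schedule: 0 * * * *
Command:  node scripts/seed-health-air-quality.mjs
```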
Fixes #2471
Type of change
Affected areas (`/api/*`)
Checklist (`npm run typecheck`)