chore(railway): sync regulatory seed cron schedule #2569

Open
lspassos1 wants to merge 12 commits into koala73:main from lspassos1:chore/regulatory-railway-cron

@lspassos1
Collaborator

Summary

This extends scripts/railway-set-watch-paths.mjs so the seed-service sync also enforces the expected Railway cronSchedule for seed-regulatory-actions, alongside the existing watchPatterns and startCommand checks.

Root cause

The repository had no codified check for the regulatory seeder's 2-hour cron cadence, so the Railway scheduler state for seed-regulatory-actions could drift from the intended 7200s Redis TTL contract.

Changes

  • rename the script intent from watch-pattern-only sync to watch patterns + start command + cron schedule sync
  • require the seed-regulatory-actions service to exist before applying settings
  • query and update cronSchedule through serviceInstanceUpdate
  • define the expected regulatory cron as 0 */2 * * *
  • add tests/railway-set-watch-paths.test.mjs covering the required service, expected schedule, and cronSchedule update path
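
As a rough illustration of the checks listed above, the sketch below fails fast when a required service is missing and reports only the services whose live cronSchedule has drifted from the expected value. REQUIRED_SEED_SERVICES and EXPECTED_CRON_SCHEDULES are names used in this PR; the planCronUpdates helper and the shape of the services array are assumptions made for the example, not the script's actual code.

```javascript
// REQUIRED_SEED_SERVICES and EXPECTED_CRON_SCHEDULES are names from the PR;
// planCronUpdates and the input shape ({ name, cronSchedule }) are hypothetical.
const REQUIRED_SEED_SERVICES = ['seed-regulatory-actions'];
const EXPECTED_CRON_SCHEDULES = {
  'seed-regulatory-actions': '0 */2 * * *', // every 2 hours
};

function planCronUpdates(services) {
  const byName = new Map(services.map((s) => [s.name, s]));

  // Fail fast when a required service does not exist yet.
  for (const name of REQUIRED_SEED_SERVICES) {
    if (!byName.has(name)) {
      throw new Error(`Required seed service missing: ${name}`);
    }
  }

  // Collect only the services whose live schedule differs from the expected one.
  const updates = [];
  for (const [name, expected] of Object.entries(EXPECTED_CRON_SCHEDULES)) {
    const service = byName.get(name);
    if (!service) continue; // not required and not present: nothing to update
    if (service.cronSchedule !== expected) {
      updates.push({ name, cronSchedule: expected });
    }
  }
  return updates;
}
```

In a dry run the returned list could simply be printed; in a live run each entry would presumably be sent through serviceInstanceUpdate.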

Validation

  • node --test tests/railway-set-watch-paths.test.mjs
  • node scripts/railway-set-watch-paths.mjs --dry-run -> fails locally with "No Railway token found. Set RAILWAY_TOKEN or run `railway login`."

Risk

Low code risk. The script change is narrow and test-covered, but the live Railway schedule was not applied from this environment because no Railway auth is available.

Note

This branch is currently stacked on #2568, which depends on #2567 and #2564. Until those PRs merge, this PR includes the parent commits in the diff.

Depends on #2568
Refs #2495

Add a standalone seeder that fetches and normalizes SEC, CFTC, Federal Reserve, FDIC, and FINRA regulatory feeds without introducing new dependencies.

The script stays import-safe, tolerates partial feed failure, and emits JSON for the fetch/parse-only phase of the pipeline. Unit tests cover RSS/Atom parsing, deduplication, ordering, and degraded-feed behavior.
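
The partial-failure, deduplication, and ordering behavior can be pictured roughly as follows. This is a minimal sketch rather than the seeder's code: mergeFeedResults, the link-based dedupe key, and the item shape are assumptions; the input is the array that Promise.allSettled resolves to.

```javascript
// Hypothetical helper: merge the per-feed results returned by Promise.allSettled.
function mergeFeedResults(settled) {
  const seen = new Set();
  const actions = [];
  let failedFeeds = 0;

  for (const result of settled) {
    if (result.status !== 'fulfilled') {
      failedFeeds += 1; // degraded feed: count it and keep going
      continue;
    }
    for (const item of result.value) {
      if (seen.has(item.link)) continue; // dedupe by link (assumed key)
      seen.add(item.link);
      actions.push(item);
    }
  }

  // Newest first.
  actions.sort((a, b) => Date.parse(b.publishedAt) - Date.parse(a.publishedAt));
  return { actions, failedFeeds };
}
```

A caller would typically pass `await Promise.allSettled(feeds.map(fetchFeed))`, where fetchFeed is whatever per-feed fetcher is in use, and could treat a non-zero failedFeeds as a degraded but still usable run.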

Refs koala73#2492
Refs koala73#2493
Refs koala73#2494
Refs koala73#2495

Build on the standalone RSS fetcher by adding keyword-based tier classification, aggregate payload counts, and runSeed integration for regulatory:actions:v1.

The updated tests cover matched keywords, payload stats, and the runSeed wiring needed for Redis publication.
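
To make the keyword-based tiering concrete, here is a small sketch. The keyword lists are invented for illustration (the PR's real vocabulary is not shown in this thread), and classifyAction and countByTier are hypothetical names; only the high/medium/low tiers and the idea of reporting matched keywords and per-tier counts come from the description above.

```javascript
// Hypothetical keyword lists; the real vocabulary lives in the seed script.
const TIER_KEYWORDS = {
  high: ['enforcement action', 'cease and desist', 'fraud'],
  medium: ['proposed rule', 'final rule', 'guidance'],
};

// Compile each keyword once into a case-insensitive, word-bounded pattern.
const TIER_PATTERNS = Object.entries(TIER_KEYWORDS).map(([tier, words]) => ({
  tier,
  patterns: words.map((keyword) => ({ keyword, re: new RegExp(`\\b${keyword}\\b`, 'i') })),
}));

function classifyAction(title) {
  // Tiers are checked in declaration order, so "high" wins over "medium".
  for (const { tier, patterns } of TIER_PATTERNS) {
    const matchedKeywords = patterns.filter((p) => p.re.test(title)).map((p) => p.keyword);
    if (matchedKeywords.length > 0) return { tier, matchedKeywords };
  }
  return { tier: 'low', matchedKeywords: [] };
}

function countByTier(actions) {
  const counts = { high: 0, medium: 0, low: 0 };
  for (const action of actions) counts[classifyAction(action.title).tier] += 1;
  return counts;
}
```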

Refs koala73#2493
Depends on koala73#2564

Add regulatory:actions:v1 as a new cross-source input, map regulatory actions into the policy category, and emit CROSS_SOURCE_SIGNAL_TYPE_REGULATORY_ACTION signals for recent high/medium items.

The new test covers severity scoring and composite escalation when policy, financial, and economic signals co-fire in Global Markets.

Refs koala73#2494
Depends on koala73#2567

Extend the Railway seed-service sync script to enforce the expected cronSchedule for seed-regulatory-actions, while continuing to validate watch patterns and start commands.

Add a focused test for the new cronSchedule path and fail fast when the required seed-regulatory-actions service is missing.

Refs koala73#2495
Depends on koala73#2568
@vercel

vercel bot commented Mar 30, 2026

@lspassos1 is attempting to deploy a commit to the Elie Team on Vercel.

A member of the Team first needs to authorize it.

@greptile-apps

greptile-apps bot commented Mar 30, 2026

Greptile Summary

This PR introduces a new seed-regulatory-actions service that polls five regulatory RSS/Atom feeds (SEC, CFTC, Federal Reserve, FDIC, FINRA), classifies actions by keyword tier, and writes results to Redis under regulatory:actions:v1 with a 7200s TTL. It wires the new source into the cross-source signals aggregator, enforces the expected 2-hour Railway cron schedule via railway-set-watch-paths.mjs, and ships comprehensive unit tests for all new logic.

Key findings:

  • Missing bootstrap hydration (api/bootstrap.js): regulatory:actions:v1 is not registered in BOOTSTRAP_CACHE_KEYS. Per AGENTS.md Critical Conventions, every new data source must be wired into api/bootstrap.js. Without this, the cross-source signals panel will load with an empty regulatory extractor for up to 2 hours after a cold deploy or cache expiry.
  • FINRA feed uses plain HTTP (http://feeds.finra.org/FINRANotices) while every other feed in the list uses HTTPS. Financial regulatory notices transmitted over HTTP are susceptible to interception or tampering.
  • currentPatterns.sort() mutates in place in railway-set-watch-paths.mjs — a minor defensive improvement would be to spread before sorting, consistent with the right-hand side of the same comparison.
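
The in-place sort concern in the last finding can be shown with a short sketch. samePatterns and the example paths are hypothetical; the point is only that Array.prototype.sort reorders the array it is called on, so copying first keeps the comparison free of side effects.

```javascript
// Hypothetical helper showing a side-effect-free comparison of pattern lists.
function samePatterns(currentPatterns, expectedPatterns) {
  // Array.prototype.sort sorts in place, so copy both arrays before sorting.
  const a = [...currentPatterns].sort();
  const b = [...expectedPatterns].sort();
  return a.length === b.length && a.every((value, i) => value === b[i]);
}

// Example paths are made up for the illustration.
const current = ['scripts/seed-b.mjs', 'scripts/seed-a.mjs'];
const same = samePatterns(current, ['scripts/seed-a.mjs', 'scripts/seed-b.mjs']);
// `same` is true, and `current` still has its original order.
```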

Confidence Score: 4/5

  • Safe to merge after adding regulatory:actions:v1 to api/bootstrap.js — the missing bootstrap registration causes a silent 2-hour data gap on first deploy.
  • One P1 finding remains: the regulatory:actions:v1 key is absent from api/bootstrap.js, violating an explicit AGENTS.md critical convention and causing the regulatory extractor to return empty results for up to one full cron interval after deployment. All other findings are P2 (HTTPS on FINRA feed, defensive sort copy). The core script logic, cron sync, and test coverage are solid.
  • api/bootstrap.js needs a regulatoryActions: 'regulatory:actions:v1' entry; scripts/seed-regulatory-actions.mjs line 30 should switch FINRA to HTTPS.

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/railway-set-watch-paths.mjs | Extended to enforce cronSchedule for seed-regulatory-actions via a new REQUIRED_SEED_SERVICES guard and EXPECTED_CRON_SCHEDULES map; watch-pattern logic refactored into buildExpectedPatterns. One cosmetic issue: currentPatterns.sort() mutates in place. |
| scripts/seed-regulatory-actions.mjs | New seed script polling 5 regulatory RSS/Atom feeds (SEC, CFTC, Fed, FDIC, FINRA), classifying actions by keyword tier, and writing to Redis under regulatory:actions:v1 with a 7200s TTL. Two issues: FINRA feed URL uses HTTP, and bootstrap hydration in api/bootstrap.js is absent (AGENTS.md violation). |
| scripts/seed-cross-source-signals.mjs | Adds regulatory:actions:v1 to SOURCE_KEYS, a new extractRegulatoryAction extractor (48-hour lookback, top 3 non-low-tier actions), and registers it in the extractor pipeline. Logic is consistent with existing extractors. |
| tests/railway-set-watch-paths.test.mjs | New test file verifying the required-service guard, expected 2-hour cron expression, and cronSchedule mutation path via source-text assertions. |
| tests/regulatory-seed-unit.test.mjs | Comprehensive unit tests for the new seed script covering entity decoding, RSS/Atom parsing, deduplication, classification, and the main wiring; good coverage. |
| tests/cross-source-signals-regulatory.test.mjs | Unit tests for the regulatory extractor and composite-escalation integration; verifies tier filtering, 48h cutoff, 3-item limit, and score math. |

Sequence Diagram

sequenceDiagram
    participant Cron as Railway Cron<br/>(0 */2 * * *)
    participant Seed as seed-regulatory-actions
    participant Feeds as RSS Feeds<br/>(SEC, CFTC, Fed, FDIC, FINRA)
    participant Redis as Redis
    participant CSS as seed-cross-source-signals
    participant Bootstrap as api/bootstrap.js
    participant Client as Dashboard Client

    Cron->>Seed: trigger every 2h
    Seed->>Feeds: fetch RSS/Atom (parallel, 15s timeout)
    Feeds-->>Seed: XML responses
    Seed->>Seed: parse → normalize → dedupe → classify (high/medium/low)
    Seed->>Redis: SET regulatory:actions:v1 (TTL 7200s)
    Seed->>Redis: SET seed-meta:regulatory:actions

    Cron->>CSS: trigger (separate schedule)
    CSS->>Redis: GET regulatory:actions:v1 + other source keys
    Redis-->>CSS: payload
    CSS->>CSS: extractRegulatoryAction (48h cutoff, top-3 non-low)
    CSS->>Redis: SET intelligence:cross-source-signals:v1

    Client->>Bootstrap: GET /api/bootstrap
    Bootstrap->>Redis: MGET [...keys] ⚠️ regulatory:actions:v1 missing
    Redis-->>Bootstrap: bulk data (no regulatory actions)
    Bootstrap-->>Client: hydration payload (regulatory gap)

Reviews (1): Last reviewed commit: "chore(railway): sync regulatory seed cro..."

@lspassos1
Collaborator Author

lspassos1 commented Mar 31, 2026

Applied this live in Railway.

seed-regulatory-actions now exists in magnificent-recreation with the settings this PR expects: /scripts root directory, node seed-regulatory-actions.mjs as the start command, the matching watch paths, and cron 0 */2 * * *.

Refs #2495.

Use the repository-standard fetch wrapper in the seeder defaults, keep the documented FINRA HTTP exception in place, and include publish time in generated action ids to avoid same-day collisions.
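
One way to picture the id change is sketched below. The actual id format is not shown in this thread, so buildActionId and its agency-slug-timestamp layout are assumptions; the sketch only illustrates why including the full publish time, rather than just the date, avoids same-day collisions.

```javascript
// Hypothetical id scheme: agency + title slug + full publish timestamp.
function buildActionId(agency, title, publishedAt) {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse non-alphanumerics into dashes
    .replace(/^-+|-+$/g, '');    // trim leading and trailing dashes
  // Keep the full timestamp (not only the date) so two items with the same
  // title on the same day still get distinct ids. Note that toISOString()
  // throws a RangeError for an invalid date, so callers should validate first.
  const stamp = new Date(publishedAt).toISOString().replace(/[^0-9]/g, '');
  return `${agency.toLowerCase()}-${slug}-${stamp}`;
}
```

Because the id is derived only from the item's own fields, re-running the seeder on the same feed content yields the same ids.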

Validated with: node --test tests/regulatory-seed-unit.test.mjs; node scripts/seed-regulatory-actions.mjs | head -n 20

Clean up the leftover cherry-pick marker after carrying the shared seeder hardening changes onto this branch.

Validated with: node --test tests/regulatory-seed-unit.test.mjs and a local fetchRegulatoryActionPayload smoke check.

Explicitly sort regulatory actions by publishedAt inside the cross-source extractor before applying the 3-item limit, and cover the behavior with an out-of-order payload test.

Validated with: node --test tests/regulatory-seed-unit.test.mjs and node --test tests/cross-source-signals-regulatory.test.mjs.

Compare Railway watchPatterns using a copied array so the validation path stays side-effect free.

Validated with: node --test tests/regulatory-seed-unit.test.mjs; node --test tests/cross-source-signals-regulatory.test.mjs; node --test tests/railway-set-watch-paths.test.mjs; node --test tests/bootstrap.test.mjs.
@lspassos1
Collaborator Author

Follow-up on the bootstrap thread: regulatory:actions:v1 stays out of api/bootstrap.js intentionally. It is an internal Redis input for the cross-source aggregation job, not a frontend hydration key. The client-facing bootstrap key is intelligence:cross-source-signals:v1, which is already wired. Bootstrapping the intermediate feed would add unused data to every bootstrap response without affecting panel behavior.

@koala73
Owner

koala73 commented Apr 1, 2026

Review — PR #2569 (Railway cron schedule)

Why this PR? Configures Railway cron for the regulatory seed and includes hardening fixes from the chain.

Blocking

1. TTL = cron interval (reiterated from #2567).
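TTL_SECONDS = 7200 with a 0 */2 * * * cron means the TTL is only 1x the cron interval, so the key can expire as soon as a single run is late or fails. It must be 3x the interval (21600s = 6h). See #2567 review.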

2. Missing api/health.js SEED_META registration.
No entry for regulatory:actions in SEED_META across the entire 4-PR chain. Without this, the health dashboard can't monitor staleness. Fix:

regulatoryActions: { key: 'seed-meta:regulatory:actions', maxStaleMin: 240 },

3. Missing STANDALONE_KEYS registration in api/health.js.
regulatory:actions:v1 needs to be in STANDALONE_KEYS for data presence monitoring:

regulatoryActions: 'regulatory:actions:v1',
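
The arithmetic behind items 1 and 2 can be sketched as follows. The 3x multiplier, the 21600s target, and the maxStaleMin value of 240 come from the review above; the constant names other than TTL_SECONDS and the isStale helper are assumptions for illustration, not the health endpoint's real code.

```javascript
// Derive the TTL from the cron interval instead of hard-coding it.
const CRON_INTERVAL_SECONDS = 2 * 60 * 60; // "0 */2 * * *" fires every 2 hours
const TTL_MULTIPLIER = 3;                  // multiple requested in the review
const TTL_SECONDS = CRON_INTERVAL_SECONDS * TTL_MULTIPLIER; // 21600s = 6h

// Hypothetical staleness check against a SEED_META-style entry.
// maxStaleMin: 240 is 4 hours, i.e. two cron intervals.
const seedMetaEntry = { key: 'seed-meta:regulatory:actions', maxStaleMin: 240 };

function isStale(lastSeededAtMs, nowMs, entry) {
  const ageMin = (nowMs - lastSeededAtMs) / 60000;
  return ageMin > entry.maxStaleMin;
}
```

With these numbers a single missed run neither expires the key nor trips the staleness alarm; two consecutive missed runs push the age past 240 minutes while the data is still within its 6-hour TTL.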

Suggestions

  1. REQUIRED_SEED_SERVICES only contains seed-regulatory-actions. No other seed service uses this guard. Either add all seed services or remove the guard for consistency.
  2. buildExpectedPatterns() is a good refactoring. Clean extraction of watch-path logic.
  3. Pre-compiled keyword patterns and [...currentPatterns].sort() spread-before-mutate are both good fixes.

Across the chain

The code quality is strong. @lspassos1 clearly studied the project patterns (deferred fetch lambda, isDirectRun guard, vm.createContext tests, Promise.allSettled, deterministic IDs). The blocking issues are:

  • TTL constant (quick fix)
  • Health/cache registration (checklist items)
  • Keyword vocabulary gaps (needs thought, most important)
  • Signal selection order (sort by importance, not just recency)

Once these are addressed, the chain is merge-ready. Happy to re-review.

Extract RSS and Atom descriptions into the normalized action payload so later classifier work can use the same parsed feed output. Also adds @ts-check and documents the FINRA HTTP feed constraint.
Raise the Redis retention window, classify against combined title and description text, reserve low for routine notices, and export the shared regulatory cache key for downstream health wiring.
Reject malformed timestamps, sort regulatory actions by tier before recency, and keep only high/medium signals in the cross-source extractor.
Register the regulatory actions Redis key and its seed-meta freshness window in the health endpoint without expanding bootstrap coverage.
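
The selection rules described in these commits (reject malformed timestamps, keep only high/medium items, order by tier before recency, then apply the 3-item limit within the 48-hour lookback) could look roughly like the sketch below. selectRegulatorySignals, TIER_RANK, and the item shape are assumptions for illustration, not the extractor's actual code.

```javascript
// Hypothetical selection helper for the cross-source extractor.
const TIER_RANK = { high: 0, medium: 1 };  // low-tier items are excluded
const LOOKBACK_MS = 48 * 60 * 60 * 1000;   // 48-hour cutoff

function selectRegulatorySignals(actions, nowMs, limit = 3) {
  return actions
    .map((action) => ({ ...action, publishedMs: Date.parse(action.publishedAt) }))
    // Drop malformed timestamps (Date.parse returns NaN for them).
    .filter((action) => Number.isFinite(action.publishedMs))
    // Drop items older than the lookback window.
    .filter((action) => nowMs - action.publishedMs <= LOOKBACK_MS)
    // Keep only high and medium tiers.
    .filter((action) => Object.hasOwn(TIER_RANK, action.tier))
    // Tier first, then newest first within a tier.
    .sort((x, y) => TIER_RANK[x.tier] - TIER_RANK[y.tier] || y.publishedMs - x.publishedMs)
    .slice(0, limit);
}
```

Sorting before slicing matters here: applying the limit first would let three recent medium items crowd out an older high-tier one.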
@lspassos1
Collaborator Author

@koala73 Thanks for the review. I applied the requested changes across the chain: description parsing in the feed payload, the TTL/cache-key/health wiring, the classifier updates, and the cross-source selection fixes. When you have a moment, could you take another look?
