feat(regulatory): classify and publish regulatory actions #2567

Open
lspassos1 wants to merge 4 commits into koala73:main from lspassos1:feat/regulatory-rss-classify-write

@lspassos1
Collaborator

Summary

This adds the second regulatory RSS step by classifying normalized actions into high / medium / low, computing aggregate counts, and publishing the final payload to regulatory:actions:v1 via runSeed.

Root cause

The fetch/parse seeder from #2564 produces stable normalized actions, but the pipeline still needed tiering logic and Redis publication before cross-source signals could consume regulatory events.

Changes

  • add the HIGH_KEYWORDS and MEDIUM_KEYWORDS classification rules from feat(regulatory): seed-regulatory-actions.mjs — classify + Redis write #2493
  • classify each action with tier and matchedKeywords, using word-boundary matching to avoid false positives like ban inside bank
  • build the final seed payload with actions, fetchedAt, recordCount, highCount, and mediumCount
  • wire main() through runSeed('regulatory', 'actions', 'regulatory:actions:v1', ...) with a TTL of 7200 seconds (2 hours) and an empty-array-safe validateFn
  • expand tests/regulatory-seed-unit.test.mjs to cover classification, payload counts, fetch-to-payload flow, and the runSeed wiring
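
The classification and payload steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/seed-regulatory-actions.mjs: the keyword lists are placeholders (the real HIGH_KEYWORDS and MEDIUM_KEYWORDS come from #2493), the helper names findMatches and validatePayload are hypothetical, the choice to match against title plus description is an assumption, and "empty-array-safe" is read here as "an empty actions array still validates".

```javascript
// Placeholder keyword lists (assumption); the real rules are defined in #2493.
const HIGH_KEYWORDS = ['ban', 'enforcement action', 'emergency order'];
const MEDIUM_KEYWORDS = ['proposed rule', 'guidance'];

// Escape regex metacharacters so each keyword is matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// \b on both sides: "ban" matches "trading ban" but not "bank".
const toPattern = (kw) => new RegExp(`\\b${escapeRegExp(kw)}\\b`, 'i');
const HIGH_PATTERNS = HIGH_KEYWORDS.map((kw) => [kw, toPattern(kw)]);
const MEDIUM_PATTERNS = MEDIUM_KEYWORDS.map((kw) => [kw, toPattern(kw)]);

// Return the keywords whose pattern matches the text.
const findMatches = (text, patterns) =>
  patterns.filter(([, re]) => re.test(text)).map(([kw]) => kw);

// Attach tier and matchedKeywords; high takes precedence over medium.
function classifyAction(action) {
  const text = `${action.title ?? ''} ${action.description ?? ''}`;
  const high = findMatches(text, HIGH_PATTERNS);
  if (high.length > 0) return { ...action, tier: 'high', matchedKeywords: high };
  const medium = findMatches(text, MEDIUM_PATTERNS);
  if (medium.length > 0) return { ...action, tier: 'medium', matchedKeywords: medium };
  return { ...action, tier: 'low', matchedKeywords: [] };
}

// Build the payload shape named in the PR description.
function buildSeedPayload(actions, fetchedAt = Date.now()) {
  const classified = actions.map(classifyAction);
  return {
    actions: classified,
    fetchedAt,
    recordCount: classified.length,
    highCount: classified.filter((a) => a.tier === 'high').length,
    mediumCount: classified.filter((a) => a.tier === 'medium').length,
  };
}

// Hypothetical validator: an empty actions array is still a valid payload.
const validatePayload = (data) => Array.isArray(data?.actions);
```

With these placeholder lists, an item titled "Bank holding company update" stays in the low tier, while "SEC issues trading ban" is classified high with matchedKeywords containing "ban".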

Validation

  • node --test tests/regulatory-seed-unit.test.mjs
  • node -e "import('./scripts/seed-regulatory-actions.mjs').then(async (m) => { const data = await m.fetchRegulatoryActionPayload(); process.stdout.write(JSON.stringify({recordCount: data.recordCount, highCount: data.highCount, mediumCount: data.mediumCount, first: data.actions[0]}, null, 2) + '\\n'); })"
  • node -e "import('./scripts/seed-regulatory-actions.mjs').then(() => process.stdout.write('import-ok\\n'))"

Risk

Low risk. This only evolves the new regulatory seeder introduced in #2564 and adds focused test coverage around the new classification/publish behavior.

Note

This branch is currently stacked on #2564. Until #2564 merges, this PR includes the parent fetch/parse commit in the diff.

Depends on #2564
Closes #2493
Refs #2494
Refs #2495

Add a standalone seeder that fetches and normalizes SEC, CFTC, Federal Reserve, FDIC, and FINRA regulatory feeds without introducing new dependencies.

The script stays import-safe, tolerates partial feed failure, and emits JSON for the fetch/parse-only phase of the pipeline. Unit tests cover RSS/Atom parsing, deduplication, ordering, and degraded-feed behavior.
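
As a rough sketch of what "tolerates partial feed failure" plus deduplication and ordering could look like: the function name dedupeAndSortActions appears in the review's sequence diagram, but its body here is a guess. The dedup key (link), the newest-first ordering, the numeric publishedAt field, and the fetchAllFeeds helper are all assumptions made for illustration.

```javascript
// Fetch every feed; a rejected feed is recorded and skipped instead of
// failing the whole run. fetchFeed(feed) is assumed to resolve to an
// array of normalized actions.
async function fetchAllFeeds(feeds, fetchFeed) {
  const results = await Promise.allSettled(feeds.map((feed) => fetchFeed(feed)));
  const actions = [];
  const failed = [];
  results.forEach((result, i) => {
    if (result.status === 'fulfilled') actions.push(...result.value);
    else failed.push(feeds[i].agency);
  });
  return { actions, failed };
}

// Keep the first action seen per link (assumed dedup key), then sort
// newest first by a numeric publishedAt timestamp (assumed field).
function dedupeAndSortActions(actions) {
  const byLink = new Map();
  for (const action of actions) {
    if (!byLink.has(action.link)) byLink.set(action.link, action);
  }
  return [...byLink.values()].sort((a, b) => b.publishedAt - a.publishedAt);
}
```

Promise.allSettled never rejects, so one unreachable agency feed only shrinks the result set; the failed list gives the caller something to log.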

Refs koala73#2492
Refs koala73#2493
Refs koala73#2494
Refs koala73#2495
Build on the standalone RSS fetcher by adding keyword-based tier classification, aggregate payload counts, and runSeed integration for regulatory:actions:v1.

The updated tests cover matched keywords, payload stats, and the runSeed wiring needed for Redis publication.

Refs koala73#2493
Depends on koala73#2564
@vercel

vercel bot commented Mar 30, 2026

@lspassos1 is attempting to deploy a commit to the Elie Team on Vercel.

A member of the Team first needs to authorize it.

lspassos1 added a commit to lspassos1/worldmonitor that referenced this pull request Mar 30, 2026
Add regulatory:actions:v1 as a new cross-source input, map regulatory actions into the policy category, and emit CROSS_SOURCE_SIGNAL_TYPE_REGULATORY_ACTION signals for recent high/medium items.

The new test covers severity scoring and composite escalation when policy, financial, and economic signals co-fire in Global Markets.

Refs koala73#2494
Depends on koala73#2567
@greptile-apps

greptile-apps bot commented Mar 30, 2026

Greptile Summary

This PR completes the second step of the regulatory RSS pipeline by adding keyword-based tiering (high/medium/low) to normalized regulatory actions and publishing the resulting payload to regulatory:actions:v1 via runSeed with a 2-hour TTL. Classification uses word-boundary regex matching to avoid false positives (e.g. ban inside bank), and the final payload includes highCount/mediumCount aggregate fields for downstream consumers. The implementation is well-structured and the test coverage is thorough.

Key findings:

  • Missing bootstrap hydration (api/bootstrap.js) — AGENTS.md requires all new data sources to be wired for bootstrap hydration; regulatory:actions:v1 is absent from api/bootstrap.js, meaning first page loads will not receive regulatory data until a cache-miss fetch cycle completes.
  • fetch convention violation — All four functions that accept a fetchImpl parameter default to globalThis.fetch directly (fetchImpl = globalThis.fetch). AGENTS.md explicitly requires the (...args) => globalThis.fetch(...args) arrow-wrapper pattern instead. This is consistent with how other seed scripts in the repo invoke fetch inline via globalThis.fetch(url, opts).
  • FINRA feed over plain HTTP — All other feed URLs use HTTPS; the FINRA entry uses http://feeds.finra.org/FINRANotices.
  • Regex pre-compilation: findMatchedKeywords compiles a new RegExp for each keyword on every classifyAction call; since the keyword lists are constant, pre-compiling at module load would be a minor efficiency improvement.
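
For the fetch convention finding, the arrow-wrapper default might look like the sketch below. The function name fetchFeedText, the Accept header, and the error message are illustrative assumptions, and this thread does not quote AGENTS.md's rationale. One property of the wrapper is that globalThis.fetch is looked up, and invoked as a method of globalThis, at the moment of each call.

```javascript
// Arrow wrapper: resolves globalThis.fetch at call time rather than
// holding on to a function reference.
const defaultFetch = (...args) => globalThis.fetch(...args);

// Hypothetical feed fetcher with an injectable fetch implementation.
async function fetchFeedText(url, fetchImpl = defaultFetch) {
  const response = await fetchImpl(url, {
    headers: { Accept: 'application/rss+xml, application/atom+xml' },
  });
  if (!response.ok) {
    throw new Error(`Feed request failed with status ${response.status}`);
  }
  return response.text();
}
```

Tests can either pass a stub as fetchImpl or replace globalThis.fetch temporarily; both routes reach the same code path.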

Confidence Score: 4/5

  • Safe to merge after wiring bootstrap hydration in api/bootstrap.js; all other findings are style/convention issues.
  • One P1 finding: regulatory:actions:v1 is not wired in api/bootstrap.js, which AGENTS.md marks as a MUST for new data sources. Without it, the key is invisible to first-load hydration. The remaining findings (fetch convention, HTTP URL, regex pre-compilation) are all P2 style/best-practice items that do not affect runtime correctness in the Node.js Railway environment where these scripts execute.
  • scripts/seed-regulatory-actions.mjs — bootstrap wiring and fetch convention; api/bootstrap.js — needs regulatory:actions:v1 entry added.

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/seed-regulatory-actions.mjs | New regulatory seed script: fetches 5 agency RSS/Atom feeds, classifies actions as high/medium/low via word-boundary keyword matching, and publishes to regulatory:actions:v1 via runSeed. Issues: globalThis.fetch used as raw default parameter (AGENTS.md convention), FINRA feed uses plain HTTP, bootstrap hydration not wired in api/bootstrap.js. |
| tests/regulatory-seed-unit.test.mjs | Thorough unit tests covering entity decoding, HTML stripping, RSS/Atom parsing, deduplication, classification tiers, payload construction, and runSeed wiring; uses vm.runInContext to test pure functions without live network or Redis calls. |

Sequence Diagram

sequenceDiagram
    participant Cron as Railway Cron
    participant Main as main()
    participant Utils as _seed-utils / runSeed
    participant Feeds as Regulatory Feeds (5)
    participant Redis as Upstash Redis

    Cron->>Main: trigger seed-regulatory-actions.mjs
    Main->>Utils: runSeed('regulatory','actions','regulatory:actions:v1', fetchFn, opts)
    Utils->>Utils: acquireLock('regulatory:actions')
    Utils->>Main: invoke fetchFn()
    Main->>Main: fetchRegulatoryActionPayload()
    Main->>Feeds: Promise.allSettled(fetchFeed × 5)
    Feeds-->>Main: RSS/Atom XML responses
    Main->>Main: parseFeed() → normalizeFeedItems()
    Main->>Main: dedupeAndSortActions()
    Main->>Main: classifyAction() × N (high/medium/low)
    Main->>Main: buildSeedPayload() → {actions, fetchedAt, recordCount, highCount, mediumCount}
    Main-->>Utils: resolved payload
    Utils->>Redis: atomicPublish('regulatory:actions:v1', payload, TTL=7200s)
    Utils->>Redis: writeFreshnessMetadata('regulatory','actions')
    Utils->>Redis: releaseLock()

Reviews (1): Last reviewed commit: "feat(regulatory): classify and publish r..."

Use the repository-standard fetch wrapper in the seeder defaults, keep the documented FINRA HTTP exception in place, and include publish time in generated action ids to avoid same-day collisions.

Validated with: node --test tests/regulatory-seed-unit.test.mjs; node scripts/seed-regulatory-actions.mjs | head -n 20
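
To illustrate why including publish time in an action id avoids same-day collisions, here is a hypothetical id builder. The id format, the buildActionId and slugify helpers, and the field order are all assumptions; the commit message does not show the seeder's actual id scheme.

```javascript
// Lowercase, collapse non-alphanumerics to hyphens, trim stray hyphens.
const slugify = (s) =>
  s.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');

// Hypothetical id: agency + UTC timestamp to the second + title slug.
// A date-only stamp would give identical ids to two same-titled items
// published on the same day.
function buildActionId(agency, title, publishedAt) {
  const iso = new Date(publishedAt).toISOString(); // 2026-03-30T14:05:00.000Z
  const stamp = iso.replace(/[-:]/g, '').slice(0, 15); // 20260330T140500
  return `${slugify(agency)}-${stamp}-${slugify(title)}`;
}
```

Under this scheme, two notices titled "Final Rule" from the same agency on the same day get distinct ids as long as their publish times differ by at least a second.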
Clean up the leftover cherry-pick marker after carrying the shared seeder hardening changes onto this branch.

Validated with: node --test tests/regulatory-seed-unit.test.mjs and a local fetchRegulatoryActionPayload smoke check.
@lspassos1
Collaborator Author

Follow-up on the bootstrap thread: I am intentionally leaving regulatory:actions:v1 out of api/bootstrap.js here. That key is an internal seed input for seed-cross-source-signals.mjs, not a client-facing bootstrap source. The UI hydrates crossSourceSignals directly, and that key is already in bootstrap. Adding regulatory:actions:v1 would only increase bootstrap payload size with unused data; it would not change first-load behavior for the cross-source panel.
