feat(regulatory): add regulatory RSS fetch seeder #2564
lspassos1 wants to merge 2 commits into koala73:main
Add a standalone seeder that fetches and normalizes SEC, CFTC, Federal Reserve, FDIC, and FINRA regulatory feeds without introducing new dependencies. The script stays import-safe, tolerates partial feed failure, and emits JSON for the fetch/parse-only phase of the pipeline. Unit tests cover RSS/Atom parsing, deduplication, ordering, and degraded-feed behavior.
Refs koala73#2492
Refs koala73#2493
Refs koala73#2494
Refs koala73#2495
@lspassos1 is attempting to deploy a commit to the Elie Team on Vercel. A member of the Team first needs to authorize it.
Greptile Summary
This PR adds the first step of the regulatory-actions pipeline:
Key findings:
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant CLI as CLI / Importer
participant main
participant fetchAllFeeds
participant fetchFeed
participant Agency as Agency Feed (SEC/CFTC/Fed/FDIC/FINRA)
CLI->>main: node seed-regulatory-actions.mjs
main->>fetchAllFeeds: fetchAllFeeds(globalThis.fetch)
fetchAllFeeds->>fetchAllFeeds: Promise.allSettled(feeds.map(fetchFeed))
par Concurrent fetch
fetchAllFeeds->>fetchFeed: fetchFeed(SEC, fetch)
fetchFeed->>Agency: GET pressreleases.rss (WorldMonitor UA)
Agency-->>fetchFeed: RSS XML
fetchFeed->>fetchFeed: parseFeed → normalizeFeedItems
fetchFeed-->>fetchAllFeeds: RegulatoryAction[]
and
fetchAllFeeds->>fetchFeed: fetchFeed(CFTC, fetch)
fetchFeed->>Agency: GET rssenf.xml (Chrome UA)
Agency-->>fetchFeed: RSS XML
fetchFeed-->>fetchAllFeeds: RegulatoryAction[]
and
fetchAllFeeds->>fetchFeed: fetchFeed(FINRA, fetch)
fetchFeed->>Agency: GET FINRANotices (http⚠️)
Agency-->>fetchFeed: RSS XML
fetchFeed-->>fetchAllFeeds: RegulatoryAction[]
end
fetchAllFeeds->>fetchAllFeeds: dedupeAndSortActions (by URL, newest first)
alt successCount === 0
fetchAllFeeds-->>main: throw "All regulatory feeds failed"
main-->>CLI: process.exit(1)
else at least one succeeded
fetchAllFeeds-->>main: RegulatoryAction[] (sorted)
main-->>CLI: stdout JSON
end
Reviews (1): Last reviewed commit: "feat(regulatory): add regulatory RSS fet..."
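For readers following the diagram, here is a minimal sketch of the partial-failure flow it describes: fetch every feed concurrently, keep whatever succeeded, dedupe by URL, sort newest first, and throw only when nothing succeeded. The function names mirror the diagram, but the bodies, the simplified `fetchAllFeeds(feeds, fetchFeed)` signature, and the `url` / `publishedAt` field names are assumptions for illustration, not the code in this PR.

```javascript
// Sketch only: names follow the diagram, implementation details are assumed.
function dedupeAndSortActions(actions) {
  const byUrl = new Map();
  for (const action of actions) {
    // Keep the first action seen for each URL.
    if (!byUrl.has(action.url)) byUrl.set(action.url, action);
  }
  // Newest first, comparing parsed timestamps.
  return [...byUrl.values()].sort(
    (a, b) => Date.parse(b.publishedAt) - Date.parse(a.publishedAt),
  );
}

// `fetchFeed` is injected so tests can pass a fake (hypothetical signature).
async function fetchAllFeeds(feeds, fetchFeed) {
  // allSettled never rejects, so one failing feed cannot sink the others.
  const results = await Promise.allSettled(feeds.map((feed) => fetchFeed(feed)));
  const fulfilled = results.filter((result) => result.status === 'fulfilled');
  if (fulfilled.length === 0) {
    throw new Error('All regulatory feeds failed');
  }
  return dedupeAndSortActions(fulfilled.flatMap((result) => result.value));
}
```

Injecting `fetchFeed` keeps the sketch testable without network access, which is the same reason the diagram shows `fetch` being passed in from `main`.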
Build on the standalone RSS fetcher by adding keyword-based tier classification, aggregate payload counts, and runSeed integration for regulatory:actions:v1. The updated tests cover matched keywords, payload stats, and the runSeed wiring needed for Redis publication.
Refs koala73#2493
Depends on koala73#2564
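A rough sketch of what keyword-based tier classification with aggregate counts might look like. The tier names, the keyword lists, and the `classifyAction` / `countByTier` helpers are all hypothetical; the commit message does not show the real ones.

```javascript
// Hypothetical keyword lists; the actual lists are not shown in this PR.
const TIER_KEYWORDS = {
  high: ['enforcement', 'fraud', 'penalty'],
  medium: ['proposed rule', 'guidance'],
};

// Returns the first tier (in declaration order) with at least one keyword hit.
function classifyAction(title) {
  const text = title.toLowerCase();
  for (const [tier, keywords] of Object.entries(TIER_KEYWORDS)) {
    const matchedKeywords = keywords.filter((keyword) => text.includes(keyword));
    if (matchedKeywords.length > 0) return { tier, matchedKeywords };
  }
  return { tier: 'low', matchedKeywords: [] };
}

// Aggregate counts per tier, suitable for a payload summary.
function countByTier(actions) {
  const counts = { high: 0, medium: 0, low: 0 };
  for (const action of actions) {
    counts[classifyAction(action.title).tier] += 1;
  }
  return counts;
}
```

Returning the matched keywords alongside the tier makes the classification auditable, which lines up with the tests mentioned above covering matched keywords.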
Use the repository-standard fetch wrapper in the seeder defaults, keep the documented FINRA HTTP exception in place, and include publish time in generated action ids to avoid same-day collisions. Validated with: node --test tests/regulatory-seed-unit.test.mjs; node scripts/seed-regulatory-actions.mjs | head -n 20
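To illustrate why including the publish time helps, here is a minimal sketch of a deterministic id builder. The id layout, the `buildActionId` name, and the FNV-1a hash (used here only as a dependency-free stand-in for whatever digest the script really uses) are assumptions.

```javascript
// 32-bit FNV-1a over UTF-16 code units; a stand-in hash, not the PR's choice.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Hypothetical id layout: agency, compact UTC timestamp, title hash.
function buildActionId(agency, title, publishedAt) {
  // toISOString() throws a RangeError on unparseable dates.
  const iso = new Date(publishedAt).toISOString();
  // "2024-01-02T03:04:05.000Z" becomes "20240102T030405".
  const stamp = iso.replace(/[-:]/g, '').slice(0, 15);
  return `${agency.toLowerCase()}-${stamp}-${fnv1a(title)}`;
}
```

With a date-only stamp, two items from the same agency with the same title on the same day would share an id; keeping hours, minutes, and seconds separates them while the id stays stable across reruns.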
Summary
This adds the first regulatory RSS pipeline step for seed-regulatory-actions.mjs: fetch live SEC, CFTC, Federal Reserve, FDIC, and FINRA feeds concurrently, parse RSS/Atom natively, normalize the output, and emit JSON for the fetch/parse-only phase.
Root cause
The repository did not have a regulatory-actions seeder yet, and several URLs from the initial issue description are no longer the live official feed endpoints. Without a dedicated fetch/parse layer, the rest of the regulatory pipeline cannot be built on stable input.
Changes
- Add scripts/seed-regulatory-actions.mjs as an import-safe standalone seeder with concurrent fetch, partial-failure tolerance, native RSS/Atom parsing, deterministic IDs, deduplication, and sorted normalized output
- Use the WorldMonitor User-Agent only for SEC, because the current SEC endpoint rejects generic browser spoofing
- Add tests/regulatory-seed-unit.test.mjs covering RSS/Atom parsing, href extraction, HTML cleanup, deduplication, ordering, partial failure, and all-feeds-fail behavior
Validation
- node --test tests/regulatory-seed-unit.test.mjs
- node scripts/seed-regulatory-actions.mjs | head -n 40
- node -e "import('./scripts/seed-regulatory-actions.mjs').then(() => process.stdout.write('import-ok\\n'))"
Risk
Low risk. This PR only adds a new standalone script and a focused unit test; it does not write to Redis or change runtime application behavior yet.
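For context on the "native RSS/Atom parsing, href extraction, HTML cleanup" items listed under Changes, a dependency-free parser can be sketched with regular expressions as below. This is an illustration under assumptions, not the script's actual implementation: `parseFeed` appears in the sequence diagram, but `cleanText`, `tagText`, `decodeEntities`, and the output field names are hypothetical, and regex parsing only copes with reasonably well-formed feeds.

```javascript
function decodeEntities(text) {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&'); // last, so "&amp;lt;" becomes "&lt;", not "<"
}

// Unwrap CDATA, decode entities, strip HTML tags, collapse whitespace.
function cleanText(raw) {
  const unwrapped = raw.replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, '$1');
  return decodeEntities(unwrapped)
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

// Text content of the first <tag>...</tag> in a block, or '' if absent.
function tagText(block, tag) {
  const pattern = new RegExp(`<${tag}\\b[^>]*>([\\s\\S]*?)</${tag}>`, 'i');
  const match = block.match(pattern);
  return match ? cleanText(match[1]) : '';
}

function parseFeed(xml) {
  // Atom feeds use <entry>; RSS feeds use <item>.
  const isAtom = /<entry\b/i.test(xml);
  const blockPattern = isAtom
    ? /<entry\b[\s\S]*?<\/entry>/gi
    : /<item\b[\s\S]*?<\/item>/gi;
  const blocks = xml.match(blockPattern) ?? [];
  return blocks.map((block) => {
    let link = tagText(block, 'link');
    if (isAtom) {
      // Atom carries the URL in an href attribute rather than element text.
      const href = block.match(/<link\b[^>]*\bhref="([^"]+)"/i);
      if (href) link = decodeEntities(href[1]);
    }
    return {
      title: tagText(block, 'title'),
      link,
      publishedAt: tagText(block, isAtom ? 'updated' : 'pubDate'),
    };
  });
}
```

The trade-off is the usual one: no XML dependency to install, at the cost of not handling namespaces, nested CDATA, or malformed markup the way a real parser would.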
Closes #2492
Refs #2493
Refs #2494
Refs #2495