
feat(simulation): add keyActorRoles to fix actor overlap bonus vocabulary mismatch #2582

Open
koala73 wants to merge 1 commit into main from feat/sim-actor-role-overlap

koala73 (Owner) commented Mar 31, 2026

Summary

  • Root cause: The +0.04 actor overlap bonus in computeSimulationAdjustment has never reliably fired in production. stateSummary.actors uses role-category strings ('Commodity traders', 'Policy officials') while simulation keyActors uses named geo-political entities ('Iran', 'Houthi'). 53 production runs audited showed the bonus fired once.
  • Fix: Add keyActorRoles?: string[] to SimulationTopPath. The Round 2 prompt now includes a CANDIDATE ACTOR ROLES section with theater-local role vocab seeded from candidatePacket.stateSummary.actors. Overlap is scored against keyActorRoles when actorSource=stateSummary; the existing keyActors entity-overlap path is preserved for the affectedAssets fallback.
  • Backwards compat: Old sim output (no keyActorRoles) → roleOverlapCount=0 → no bonus → same as before. affectedAssets fallback path unchanged.
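To make the mismatch concrete, here is a minimal sketch using the example strings quoted above. The `normalise` and `overlapCount` helpers are hypothetical stand-ins, not the project's code; the real normalisation rules are not shown in this description.

```javascript
// Hypothetical normaliser (assumption: trim + lowercase is enough for this demo).
function normalise(name) {
  return String(name).trim().toLowerCase();
}

// Count distinct normalised strings that appear in both lists.
function overlapCount(a, b) {
  const setB = new Set(b.map(normalise));
  return new Set(a.map(normalise).filter((x) => setB.has(x))).size;
}

// Example values taken from the PR summary.
const candidateActors = ['Commodity traders', 'Policy officials'];
const simKeyActors = ['Iran', 'Houthi'];
// Assumed Round 2 output once roles are copied from the candidate list.
const simKeyActorRoles = ['Commodity traders', 'Policy officials'];

console.log(overlapCount(candidateActors, simKeyActors)); // 0: different vocabularies
console.log(overlapCount(candidateActors, simKeyActorRoles)); // 2: same vocabulary
```

Role categories and entity names never intersect, so an overlap threshold can only be met once both sides draw from the same vocabulary.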

Changes

| File | What changed |
| --- | --- |
| scripts/seed-forecasts.types.d.ts | keyActorRoles?: string[] on SimulationTopPath; roleOverlapCount/keyActorsOverlapCount on SimulationAdjustmentDetail and ScorecardSimDetail |
| scripts/seed-forecasts.mjs | buildSimulationPackageFromDeepSnapshot: add actorRoles[] per theater; buildSimulationRound2SystemPrompt: inject CANDIDATE ACTOR ROLES section; tryParseSimulationRoundPayload: extract keyActorRoles; mergedPaths.map(): filter against theater.actorRoles guardrail; computeSimulationAdjustment: dual-path overlap scoring; summarizeImpactPathScore: project new fields into simDetail |
| tests/forecast-trace-export.test.mjs | 308 tests (was 301): 7 fixture updates (T2/T-F/T-G/T-J/T-K/T-N2/T-SC-4), new tests T-P1/T-P2/T-P3 (prompt/parser), T-RO1/T-RO2/T-RO3 (role overlap logic), T-PKG1 (pkg builder actorRoles) |

Design decisions

  1. Theater-scoped roles: actorRoles for each theater comes only from that theater's candidate — no cross-theater aggregation (avoids ECB/Germany roles bleeding into Red Sea theater)
  2. Conditional bonus path: actorSource=stateSummary → role overlap via keyActorRoles; actorSource=affectedAssets → entity overlap via keyActors (backwards compat)
  3. actorOverlapCount preserved as alias pointing to whichever overlap drove the bonus decision
  4. Guardrail in mergedPaths.map(): keyActorRoles filtered against theater.actorRoles at merge time — single enforcement point with both LLM output and allowed set in scope
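The guardrail in decision 4 could look roughly like the sketch below. `filterKeyActorRoles` and its normalisation are hypothetical; in particular, how the real code treats an empty allowed set is not shown in this PR description, and this sketch simply returns an empty array in that case.

```javascript
// Keep only LLM-emitted roles that appear in the theater's allowed role list.
// Assumption: comparison is case- and whitespace-insensitive.
function filterKeyActorRoles(keyActorRoles, allowedRoles) {
  const norm = (s) => String(s).trim().toLowerCase();
  const allowed = new Set((allowedRoles ?? []).map(norm));
  if (!Array.isArray(keyActorRoles)) return []; // old sim output has no field
  return keyActorRoles.filter(
    (role) => typeof role === 'string' && allowed.has(norm(role)),
  );
}

// Hypothetical theater and LLM output for illustration.
const theater = { actorRoles: ['Commodity traders', 'Policy officials'] };
const llmRoles = ['Commodity traders', 'Shipping insurers', 42];

console.log(filterKeyActorRoles(llmRoles, theater.actorRoles));
// [ 'Commodity traders' ]
```

Filtering at merge time means invented or cross-theater roles are dropped before they can contribute to the overlap count.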

Testing

  • node --test tests/forecast-trace-export.test.mjs → 308/308 pass
  • npm run test:data → 2676/2676 pass
  • npm run typecheck → clean
  • npm run typecheck:api → clean

Post-Deploy Monitoring & Validation

  • What to monitor
    • Logs: search actorOverlapCount, roleOverlapCount in sim-log artifacts — expect roleOverlapCount >= 2 in a non-trivial fraction of runs where stateSummary.actors is populated
    • Redis: forecast:simulation-outcome:latest → theaterResults[*].topPaths[*].keyActorRoles should be present and non-empty for theaters with populated candidate stateSummary.actors
  • Validation checks
    • redis-cli get forecast:simulation-outcome:latest | jq '.theaterResults[0].topPaths[0].keyActorRoles' — expect non-empty array after first new sim run
    • redis-cli get forecast:simulation-outcome:latest | jq '[.theaterResults[] | .topPaths[] | select(.keyActorRoles | length > 0)] | length' — expect > 0
  • Expected healthy behavior
    • roleOverlapCount >= 2 fires for theaters whose candidates have populated stateSummary.actors (role categories present)
    • actorOverlapCount alias equals roleOverlapCount for stateSummary path, keyActorsOverlapCount for affectedAssets path
    • Old sim artifacts (no keyActorRoles) continue to produce adjustment=0.08 (bucket+channel only) — no regression
  • Failure signals / rollback trigger
    • keyActorRoles consistently empty despite actorRoles being populated in pkg → investigate prompt injection in buildSimulationRound2SystemPrompt
    • roleOverlapCount never fires despite matching vocab → check normalization in computeSimulationAdjustment
    • Any crash in mergedPaths.map() → check theater.actorRoles availability in pkg
  • Validation window: first 3 simulation runs after deploy
  • Owner: @koala73
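
For anyone who prefers a script over jq, the second validation check can be approximated in Node as below. This is a sketch only: `countPathsWithRoles` is a hypothetical helper and `sampleOutcome` is made-up data that mirrors the theaterResults/topPaths shape referenced above.

```javascript
// Count top paths whose keyActorRoles is a non-empty array,
// mirroring: [.theaterResults[] | .topPaths[] | select(.keyActorRoles | length > 0)] | length
function countPathsWithRoles(outcome) {
  return (outcome?.theaterResults ?? [])
    .flatMap((t) => t?.topPaths ?? [])
    .filter((p) => Array.isArray(p?.keyActorRoles) && p.keyActorRoles.length > 0)
    .length;
}

// Made-up payload: one path with roles, one old-style path, one empty list.
const sampleOutcome = {
  theaterResults: [
    { topPaths: [{ keyActorRoles: ['Commodity traders'] }, { keyActors: ['Iran'] }] },
    { topPaths: [{ keyActorRoles: [] }] },
  ],
};

console.log(countPathsWithRoles(sampleOutcome)); // 1
```

In practice the argument would be the parsed JSON returned for forecast:simulation-outcome:latest; a result of 0 after the first new simulation run matches the first failure signal listed above.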

Compound Engineering v2.49.0
🤖 Generated with Claude Sonnet 4.6 (200K context) via Claude Code

…vocabulary mismatch

The +0.04 actor overlap bonus never reliably fired in production because
stateSummary.actors uses role-category strings ('Commodity traders',
'Policy officials') while simulation keyActors uses named geo-political
entities ('Iran', 'Houthi'). An audit of 53 production runs showed the bonus
fired only once.

Fix: add keyActorRoles?: string[] to SimulationTopPath. The Round 2 prompt
now includes a CANDIDATE ACTOR ROLES section with theater-local role vocab
seeded from candidatePacket.stateSummary.actors. The LLM copies matching
roles into keyActorRoles. applySimulationMerge scores overlap against
keyActorRoles when actorSource=stateSummary, preserving the existing
keyActors entity-overlap path for the affectedAssets fallback.

- buildSimulationPackageFromDeepSnapshot: add actorRoles[] to each theater
  from candidate.stateSummary.actors (theater-scoped, no cross-theater noise)
- buildSimulationRound2SystemPrompt: inject CANDIDATE ACTOR ROLES section
  with exact-copy instruction and keyActorRoles in JSON template
- tryParseSimulationRoundPayload: extract keyActorRoles from round 2 output
- mergedPaths.map(): filter keyActorRoles against theater.actorRoles guardrail
- computeSimulationAdjustment: dual-path overlap — roleOverlapCount for
  stateSummary, keyActorsOverlapCount for affectedAssets (backwards compat)
- summarizeImpactPathScore: project roleOverlapCount + keyActorsOverlapCount
  into path-scorecards.json simDetail

New fields: roleOverlapCount, keyActorsOverlapCount in SimulationAdjustmentDetail
and ScorecardSimDetail. actorOverlapCount preserved as backwards-compat alias.

Tests: 308 pass (was 301 before). New tests T-P1/T-P2/T-P3 (prompt/parser),
T-RO1/T-RO2/T-RO3 (role overlap logic), T-PKG1 (pkg builder actorRoles),
plus fixture updates for T2/T-F/T-G/T-J/T-K/T-N2/T-SC-4.

🤖 Generated with Claude Sonnet 4.6 via Claude Code (https://claude.ai/claude-code) + Compound Engineering v2.49.0

Co-Authored-By: Claude Sonnet 4.6 (200K context) <noreply@anthropic.com>
vercel bot commented Mar 31, 2026

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| worldmonitor | Ignored | Ignored | Mar 31, 2026 4:54pm |

greptile-apps bot commented Mar 31, 2026

Greptile Summary

This PR fixes a long-standing bug where the +0.04 actor-overlap bonus in computeSimulationAdjustment never fired for macro-financial theaters because stateSummary.actors uses role-category strings (e.g. "Commodity traders") while keyActors uses geo-political entity names (e.g. "Iran"). The fix introduces a parallel keyActorRoles vocabulary on SimulationTopPath, seeds it from the candidate's stateSummary.actors via the Round 2 prompt, and routes the bonus decision through role overlap for the stateSummary path while preserving entity-overlap for the affectedAssets fallback path. Backwards compatibility for old sim artifacts (no keyActorRoles) is correctly handled — the bonus stays at zero via an empty simRoles set.

  • Core logic (computeSimulationAdjustment): dual-path scoring is clean and the actorOverlapCount backwards-compat alias correctly points to whichever overlap drove the bonus decision.
  • Prompt injection (buildSimulationRound2SystemPrompt): the CANDIDATE ACTOR ROLES section is only injected when theater.actorRoles is non-empty and the copy-verbatim instruction is clear; the omit-when-absent path is validated by T-P2.
  • Guardrail (mergedPaths.map()): keyActorRoles is filtered against theater.actorRoles at merge time, a single enforcement point that correctly handles the allowed.length === 0 no-op case safely (empty actorRoles implies empty stateSummary.actors → actorSrc ≠ stateSummary → role overlap path never executes).
  • Minor telemetry concern: the ?? d.actorOverlapCount fallback in summarizeImpactPathScore would misattribute old entity-overlap counts as roleOverlapCount if old stored simulationAdjustmentDetail objects are reprocessed. Changing to ?? 0 would match the documented expected behaviour for old data.
  • Test coverage: 7 new tests (T-PKG1/T-P1/T-P2/T-P3/T-RO1/T-RO2/T-RO3) and 7 fixture updates are thorough and correctly cover the graceful-degradation, backwards-compat, and role-category vocabulary paths.
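
The conditional prompt injection described in the second bullet might be sketched as follows. The function name, heading layout, and instruction wording are all assumptions for illustration; the actual text emitted by buildSimulationRound2SystemPrompt is not reproduced in this review.

```javascript
// Build the optional role section; return '' when the theater has no roles
// so the prompt is unchanged for theaters without stateSummary actors.
function buildActorRolesSection(theater) {
  const roles = Array.isArray(theater?.actorRoles)
    ? theater.actorRoles.filter((r) => typeof r === 'string' && r.trim() !== '')
    : [];
  if (roles.length === 0) return '';
  return [
    'CANDIDATE ACTOR ROLES', // section title taken from the PR; wording below is hypothetical
    'Copy matching roles verbatim into keyActorRoles:',
    ...roles.map((r) => `- ${r}`),
  ].join('\n');
}

console.log(buildActorRolesSection({ actorRoles: ['Commodity traders'] }));
console.log(buildActorRolesSection({ actorRoles: [] }) === ''); // true
```

Returning an empty string for theaters without roles keeps the omit-when-absent path trivially testable.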

Confidence Score: 5/5

Safe to merge — all remaining findings are P2 style/telemetry quality improvements with no impact on scoring correctness or runtime behaviour.

The scoring logic is correct and fully backwards-compatible. Old sim artifacts (no keyActorRoles) correctly produce roleOverlapCount=0 and no bonus change at decision time. The single P2 comment (roleOverlapCount fallback) affects only scorecard telemetry for reprocessed old data — not any live scoring path. Test coverage is comprehensive with 308/308 passing.

No files require special attention — the one flagged line (seed-forecasts.mjs:4609) is a minor telemetry edge case.

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/seed-forecasts.mjs | Core logic changes: adds actorRoles to simulation package theaters, injects CANDIDATE ACTOR ROLES section in Round 2 prompt, introduces dual-path overlap scoring (role vs entity), and adds guardrail filter for keyActorRoles at merge time. All changes are well-structured with correct backwards-compatibility handling. |
| scripts/seed-forecasts.types.d.ts | Adds keyActorRoles? to SimulationTopPath and roleOverlapCount/keyActorsOverlapCount to SimulationAdjustmentDetail and ScorecardSimDetail with accurate backwards-compat documentation. Missing a SimulationPackageTheater type to capture the new actorRoles field on the selectedTheaters entries. |
| tests/forecast-trace-export.test.mjs | 7 fixture updates and 7 new tests (T-PKG1, T-P1, T-P2, T-P3, T-RO1, T-RO2, T-RO3) covering prompt injection, parser extraction, and role overlap logic including the graceful-degradation and affectedAssets backwards-compat paths. Coverage is thorough. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[candidatePacket.stateSummary.actors] -->|"non-empty → actorSrc=stateSummary"| B[candidateActors normalised]
    A -->|"empty → actorSrc=affectedAssets/none"| C[affectedAssets from expandedPath hops]

    D[sim topPath.keyActorRoles] -->|"normaliseActorName each"| E[simRoles Set]
    F[sim topPath.keyActors] -->|"normaliseActorName each"| G[simEntities Set]

    B -->|"actorSrc=stateSummary"| H["roleOverlap = candidateActors ∩ simRoles"]
    B -->|"always"| I["keyActorsOverlapCount = candidateActors ∩ simEntities"]
    C -->|"actorSrc=affectedAssets"| I

    H --> J{"bonusOverlap ≥ 2?"}
    I -->|"actorSrc=affectedAssets"| J

    J -->|"yes"| K["+0.04 × simConf bonus"]
    J -->|"no"| L["no actor bonus"]

    K --> M["details.actorOverlapCount = bonusOverlap (alias)"]
    L --> M
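
The flowchart can be read as the following minimal sketch. It assumes the threshold of 2 and the 0.04 × simConf bonus shown in the diagram; `actorOverlapBonus`, its parameter shape, and the normaliser are hypothetical and simplify whatever computeSimulationAdjustment actually does.

```javascript
const ACTOR_BONUS = 0.04; // from the flowchart
const MIN_OVERLAP = 2; // from the flowchart

// Hypothetical stand-in for normaliseActorName.
const norm = (s) => String(s).trim().toLowerCase();
const countIn = (names, set) => names.filter((n) => set.has(n)).length;

function actorOverlapBonus({ candidateActors, actorSource, topPath, simConf }) {
  const cand = [...new Set((candidateActors ?? []).map(norm))];
  const simRoles = new Set((topPath.keyActorRoles ?? []).map(norm));
  const simEntities = new Set((topPath.keyActors ?? []).map(norm));

  // Role overlap is only meaningful when candidates came from stateSummary.
  const roleOverlapCount = actorSource === 'stateSummary' ? countIn(cand, simRoles) : 0;
  const keyActorsOverlapCount = countIn(cand, simEntities);

  // Pick the overlap that drives the bonus decision for this actor source.
  let bonusOverlap = 0;
  if (actorSource === 'stateSummary') bonusOverlap = roleOverlapCount;
  else if (actorSource === 'affectedAssets') bonusOverlap = keyActorsOverlapCount;

  const bonus = bonusOverlap >= MIN_OVERLAP ? ACTOR_BONUS * simConf : 0;
  return { roleOverlapCount, keyActorsOverlapCount, actorOverlapCount: bonusOverlap, bonus };
}

// Old sim output (no keyActorRoles) on the stateSummary path: no bonus.
console.log(actorOverlapBonus({
  candidateActors: ['Commodity traders', 'Policy officials'],
  actorSource: 'stateSummary',
  topPath: { keyActors: ['Iran', 'Houthi'] },
  simConf: 0.5,
}).bonus); // 0
```

With an old artifact the role set is empty, so roleOverlapCount stays at 0 and the bonus does not fire, which is the backwards-compatible behaviour described in the summary.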

Reviews (1): Last reviewed commit: "feat(simulation): add keyActorRoles fiel..."

stabilizerHit: Boolean(d.stabilizerHit),
bucketChannelMatch: Boolean(d.bucketChannelMatch),
actorOverlapCount: Number(d.actorOverlapCount),
roleOverlapCount: Number(d.roleOverlapCount ?? d.actorOverlapCount),

P2 roleOverlapCount fallback misattributes old entity-overlap data

The fallback ?? d.actorOverlapCount is intended to be defensive for old simulationAdjustmentDetail objects that pre-date this PR, but it has the opposite effect: pre-PR, actorOverlapCount was the entity-space overlap count. Any reprocessed old scorecard would report that value as roleOverlapCount, incorrectly classifying entity overlap as role-category overlap.

This conflicts with the PR's own monitoring guidance ("Old sim output → roleOverlapCount=0"). Since computeSimulationAdjustment now always initialises roleOverlapCount: 0, the fallback is only reached for genuinely old stored artifacts. Using ?? 0 is both more accurate and matches the documented expected behaviour:

Suggested change
- roleOverlapCount: Number(d.roleOverlapCount ?? d.actorOverlapCount),
+ roleOverlapCount: Number(d.roleOverlapCount ?? 0),
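
The difference between the two fallbacks is easy to see on a pre-PR detail object. The sketch below uses a made-up `oldDetail` value and two hypothetical wrapper functions around the expressions from the suggestion.

```javascript
// Fallback as written in the PR: reuses the old entity-overlap count.
function roleOverlapCurrent(d) {
  return Number(d.roleOverlapCount ?? d.actorOverlapCount);
}

// Fallback as suggested: old artifacts report zero role overlap.
function roleOverlapSuggested(d) {
  return Number(d.roleOverlapCount ?? 0);
}

// Made-up pre-PR artifact: only the entity-space count exists.
const oldDetail = { actorOverlapCount: 3 };
// Post-PR artifact: roleOverlapCount is always initialised.
const newDetail = { actorOverlapCount: 2, roleOverlapCount: 2 };

console.log(roleOverlapCurrent(oldDetail)); // 3 (entity overlap reported as role overlap)
console.log(roleOverlapSuggested(oldDetail)); // 0
console.log(roleOverlapSuggested(newDetail)); // 2
```

Both variants agree on new artifacts, including a stored value of 0, because `??` only falls through on null or undefined; they differ only when the field is absent.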
