feat(simulation): add keyActorRoles to fix actor overlap bonus vocabulary mismatch #2582
The +0.04 actor overlap bonus never reliably fired in production because
stateSummary.actors uses role-category strings ('Commodity traders',
'Policy officials') while simulation keyActors uses named geo-political
entities ('Iran', 'Houthi'). An audit of 53 production runs showed the
bonus fired only once.
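
A minimal sketch of why the two vocabularies cannot overlap, using only the example strings quoted above. The `normalise` and `overlapCount` helpers are hypothetical stand-ins written for this illustration (the real code uses normaliseActorName, whose exact behaviour is not shown here and is assumed to lowercase and trim).

```javascript
// Hypothetical stand-in for the real normaliser; assumed to lowercase and trim.
function normalise(name) {
  return String(name).trim().toLowerCase();
}

// Count distinct candidate actors that also appear in the simulation's actor list.
function overlapCount(candidateActors, simActors) {
  const simSet = new Set(simActors.map(normalise));
  const matched = new Set();
  for (const actor of candidateActors.map(normalise)) {
    if (simSet.has(actor)) matched.add(actor);
  }
  return matched.size;
}

// Example values taken from the description above.
const stateSummaryActors = ['Commodity traders', 'Policy officials'];
const simKeyActors = ['Iran', 'Houthi'];

// Role categories and entity names share no strings, so the count stays at 0.
const mismatchOverlap = overlapCount(stateSummaryActors, simKeyActors);
console.log(mismatchOverlap); // 0
```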
Fix: add keyActorRoles?: string[] to SimulationTopPath. The Round 2 prompt
now includes a CANDIDATE ACTOR ROLES section with theater-local role vocab
seeded from candidatePacket.stateSummary.actors. The LLM copies matching
roles into keyActorRoles. applySimulationMerge scores overlap against
keyActorRoles when actorSource=stateSummary, preserving the existing
keyActors entity-overlap path for the affectedAssets fallback.
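
One possible shape for the prompt section, sketched under assumptions: the function name `renderCandidateActorRolesSection` and the instruction wording are hypothetical, and only the section title and the keyActorRoles field name come from this change.

```javascript
// Hypothetical helper: renders one theater's role vocabulary as a prompt section.
// The instruction wording is illustrative, not the text used in the real prompt.
function renderCandidateActorRolesSection(actorRoles) {
  const roles = (actorRoles ?? []).filter(
    (role) => typeof role === 'string' && role.trim() !== '',
  );
  // No roles for this theater: emit nothing rather than an empty section.
  if (roles.length === 0) return '';
  return [
    'CANDIDATE ACTOR ROLES',
    'Copy any matching roles into keyActorRoles exactly as written:',
    ...roles.map((role) => `- ${role}`),
  ].join('\n');
}

// Assumed package shape: each theater carries its own actorRoles array.
const theaterPkg = { actorRoles: ['Commodity traders', 'Policy officials'] };
const roleSection = renderCandidateActorRolesSection(theaterPkg.actorRoles);
console.log(roleSection);
```

Returning an empty string for a theater without roles keeps the prompt unchanged for candidates whose stateSummary.actors is empty.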
- buildSimulationPackageFromDeepSnapshot: add actorRoles[] to each theater
from candidate.stateSummary.actors (theater-scoped, no cross-theater noise)
- buildSimulationRound2SystemPrompt: inject CANDIDATE ACTOR ROLES section
with exact-copy instruction and keyActorRoles in JSON template
- tryParseSimulationRoundPayload: extract keyActorRoles from round 2 output
- mergedPaths.map(): filter keyActorRoles against theater.actorRoles guardrail
- computeSimulationAdjustment: dual-path overlap — roleOverlapCount for
stateSummary, keyActorsOverlapCount for affectedAssets (backwards compat)
- summarizeImpactPathScore: project roleOverlapCount + keyActorsOverlapCount
into path-scorecards.json simDetail
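
The merge-time guardrail in the list above can be sketched as a filter against the theater's allowed roles. This is an illustration under assumptions: `filterKeyActorRoles` and `normaliseRole` are hypothetical names, and whether the real filter compares normalised or exact strings is not stated here.

```javascript
// Hypothetical normaliser; the real comparison rule is an assumption.
function normaliseRole(role) {
  return String(role).trim().toLowerCase();
}

// Keep only LLM-emitted roles that appear in the theater's allowed vocabulary.
function filterKeyActorRoles(keyActorRoles, allowedRoles) {
  const allowed = new Set((allowedRoles ?? []).map(normaliseRole));
  return (keyActorRoles ?? []).filter(
    (role) => typeof role === 'string' && allowed.has(normaliseRole(role)),
  );
}

const theater = { actorRoles: ['Commodity traders', 'Policy officials'] };
// The LLM output may contain invented roles or non-string values.
const llmPath = { keyActorRoles: ['Commodity traders', 'Central bankers', 42] };

const guardedRoles = filterKeyActorRoles(llmPath.keyActorRoles, theater.actorRoles);
console.log(guardedRoles); // [ 'Commodity traders' ]
```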
New fields: roleOverlapCount, keyActorsOverlapCount in SimulationAdjustmentDetail
and ScorecardSimDetail. actorOverlapCount preserved as backwards-compat alias.
Tests: 308 pass (was 301 before). New tests T-P1/T-P2/T-P3 (prompt/parser),
T-RO1/T-RO2/T-RO3 (role overlap logic), T-PKG1 (pkg builder actorRoles),
plus fixture updates for T2/T-F/T-G/T-J/T-K/T-N2/T-SC-4.
🤖 Generated with Claude Sonnet 4.6 via Claude Code (https://claude.ai/claude-code) + Compound Engineering v2.49.0
Co-Authored-By: Claude Sonnet 4.6 (200K context) <noreply@anthropic.com>
Greptile Summary
This PR fixes a long-standing bug where the
Confidence Score: 5/5
Safe to merge: all remaining findings are P2 style/telemetry quality improvements with no impact on scoring correctness or runtime behaviour.
The scoring logic is correct and fully backwards-compatible. Old sim artifacts (no keyActorRoles) correctly produce roleOverlapCount=0 and no bonus change at decision time. The single P2 comment (roleOverlapCount fallback) affects only scorecard telemetry for reprocessed old data, not any live scoring path. Test coverage is comprehensive with 308/308 passing.
No files require special attention; the one flagged line (seed-forecasts.mjs:4609) is a minor telemetry edge case.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[candidatePacket.stateSummary.actors] -->|"non-empty → actorSrc=stateSummary"| B[candidateActors normalised]
A -->|"empty → actorSrc=affectedAssets/none"| C[affectedAssets from expandedPath hops]
D[sim topPath.keyActorRoles] -->|"normaliseActorName each"| E[simRoles Set]
F[sim topPath.keyActors] -->|"normaliseActorName each"| G[simEntities Set]
B -->|"actorSrc=stateSummary"| H["roleOverlap = candidateActors ∩ simRoles"]
B -->|"always"| I["keyActorsOverlapCount = candidateActors ∩ simEntities"]
C -->|"actorSrc=affectedAssets"| I
H --> J{"bonusOverlap ≥ 2?"}
I -->|"actorSrc=affectedAssets"| J
J -->|"yes"| K["+0.04 × simConf bonus"]
J -->|"no"| L["no actor bonus"]
K --> M["details.actorOverlapCount = bonusOverlap (alias)"]
L --> M
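
The flowchart can be read as the following sketch of the dual-path decision. It is an illustration under assumptions, not the body of computeSimulationAdjustment: `scoreActorOverlap`, `countOverlap`, and `normaliseName` are hypothetical names, and the normaliser is assumed to lowercase and trim. The 0.04 weight, the threshold of 2, and the field names come from the flowchart and description.

```javascript
const ACTOR_BONUS = 0.04; // bonus weight from the description
const MIN_OVERLAP = 2;    // threshold from the flowchart

// Hypothetical stand-in for normaliseActorName.
function normaliseName(name) {
  return String(name).trim().toLowerCase();
}

// Count distinct candidates present in the given simulation set.
function countOverlap(candidates, simSet) {
  return new Set(candidates.map(normaliseName).filter((a) => simSet.has(a))).size;
}

function scoreActorOverlap({ actorSource, candidateActors, topPath, simConf }) {
  const simRoles = new Set((topPath.keyActorRoles ?? []).map(normaliseName));
  const simEntities = new Set((topPath.keyActors ?? []).map(normaliseName));

  // Role overlap only applies when candidates come from stateSummary.
  const roleOverlapCount =
    actorSource === 'stateSummary' ? countOverlap(candidateActors, simRoles) : 0;
  // Entity overlap is always computed, but only drives the bonus on the fallback path.
  const keyActorsOverlapCount = countOverlap(candidateActors, simEntities);

  let bonusOverlap = 0;
  if (actorSource === 'stateSummary') bonusOverlap = roleOverlapCount;
  else if (actorSource === 'affectedAssets') bonusOverlap = keyActorsOverlapCount;

  const bonus = bonusOverlap >= MIN_OVERLAP ? ACTOR_BONUS * simConf : 0;
  // actorOverlapCount is kept as an alias of whichever count drove the decision.
  return { bonus, roleOverlapCount, keyActorsOverlapCount, actorOverlapCount: bonusOverlap };
}

const roleResult = scoreActorOverlap({
  actorSource: 'stateSummary',
  candidateActors: ['Commodity traders', 'Policy officials'],
  topPath: { keyActors: ['Iran', 'Houthi'], keyActorRoles: ['Commodity traders', 'Policy officials'] },
  simConf: 1,
});
console.log(roleResult.roleOverlapCount, roleResult.bonus > 0); // 2 true
```

An old top path without keyActorRoles yields an empty simRoles set, so the stateSummary path reports roleOverlapCount 0 and adds no bonus.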
Reviews (1): Last reviewed commit: "feat(simulation): add keyActorRoles fiel..."
    stabilizerHit: Boolean(d.stabilizerHit),
    bucketChannelMatch: Boolean(d.bucketChannelMatch),
    actorOverlapCount: Number(d.actorOverlapCount),
    roleOverlapCount: Number(d.roleOverlapCount ?? d.actorOverlapCount),
roleOverlapCount fallback misattributes old entity-overlap data
The fallback ?? d.actorOverlapCount is intended to be defensive for old simulationAdjustmentDetail objects that pre-date this PR, but it has the opposite effect: pre-PR, actorOverlapCount was the entity-space overlap count. Any reprocessed old scorecard would report that value as roleOverlapCount, incorrectly classifying entity overlap as role-category overlap.
This conflicts with the PR's own monitoring guidance ("Old sim output → roleOverlapCount=0"). Since computeSimulationAdjustment now always initialises roleOverlapCount: 0, the fallback is only reached for genuinely old stored artifacts. Using ?? 0 is both more accurate and matches the documented expected behaviour:
    - roleOverlapCount: Number(d.roleOverlapCount ?? d.actorOverlapCount),
    + roleOverlapCount: Number(d.roleOverlapCount ?? 0),
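
To make the difference between the two fallbacks concrete, here is a small sketch applied to a hypothetical pre-PR detail object. The object shapes and the value 3 are invented for illustration; only the two expressions come from the suggestion above.

```javascript
// Hypothetical detail object stored before this change: no roleOverlapCount field,
// and actorOverlapCount holds an entity-space overlap count.
const oldDetail = { actorOverlapCount: 3 };

// Original fallback: the entity overlap is reported as role overlap.
const withAliasFallback = Number(oldDetail.roleOverlapCount ?? oldDetail.actorOverlapCount);

// Suggested fallback: missing role overlap is reported as 0.
const withZeroFallback = Number(oldDetail.roleOverlapCount ?? 0);

console.log(withAliasFallback); // 3
console.log(withZeroFallback);  // 0

// For new detail objects the fallback is never reached, because ?? only
// replaces null or undefined and an explicit 0 is kept as-is.
const newDetail = { actorOverlapCount: 0, roleOverlapCount: 0 };
const newRoleCount = Number(newDetail.roleOverlapCount ?? newDetail.actorOverlapCount);
console.log(newRoleCount); // 0
```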
Summary
- The +0.04 actor overlap bonus in computeSimulationAdjustment has never reliably fired in production. stateSummary.actors uses role-category strings ('Commodity traders', 'Policy officials') while simulation keyActors uses named geo-political entities ('Iran', 'Houthi'). 53 production runs audited showed the bonus fired once.
- Fix: add keyActorRoles?: string[] to SimulationTopPath. The Round 2 prompt now includes a CANDIDATE ACTOR ROLES section with theater-local role vocab seeded from candidatePacket.stateSummary.actors. Overlap is scored against keyActorRoles when actorSource=stateSummary; the existing keyActors entity-overlap path is preserved for the affectedAssets fallback.
- Old sim output (no keyActorRoles) → roleOverlapCount=0 → no bonus → same as before. affectedAssets fallback path unchanged.

Changes
- scripts/seed-forecasts.types.d.ts: keyActorRoles?: string[] on SimulationTopPath; roleOverlapCount / keyActorsOverlapCount on SimulationAdjustmentDetail and ScorecardSimDetail
- scripts/seed-forecasts.mjs: buildSimulationPackageFromDeepSnapshot: add actorRoles[] per theater; buildSimulationRound2SystemPrompt: inject CANDIDATE ACTOR ROLES section; tryParseSimulationRoundPayload: extract keyActorRoles; mergedPaths.map(): filter against theater.actorRoles guardrail; computeSimulationAdjustment: dual-path overlap scoring; summarizeImpactPathScore: project new fields into simDetail
- tests/forecast-trace-export.test.mjs

Design decisions
- actorRoles for each theater comes only from that theater's candidate: no cross-theater aggregation (avoids ECB/Germany roles bleeding into Red Sea theater)
- actorSource=stateSummary → role overlap via keyActorRoles; actorSource=affectedAssets → entity overlap via keyActors (backwards compat)
- actorOverlapCount preserved as alias pointing to whichever overlap drove the bonus decision
- mergedPaths.map(): keyActorRoles filtered against theater.actorRoles at merge time, a single enforcement point with both LLM output and allowed set in scope

Testing
- node --test tests/forecast-trace-export.test.mjs → 308/308 pass
- npm run test:data → 2676/2676 pass
- npm run typecheck → clean
- npm run typecheck:api → clean

Post-Deploy Monitoring & Validation
- actorOverlapCount, roleOverlapCount in sim-log artifacts → expect roleOverlapCount >= 2 in a non-trivial fraction of runs where stateSummary.actors is populated
- forecast:simulation-outcome:latest → theaterResults[*].topPaths[*].keyActorRoles should be present and non-empty for theaters with populated candidate stateSummary.actors
- redis-cli get forecast:simulation-outcome:latest | jq '.theaterResults[0].topPaths[0].keyActorRoles' → expect non-empty array after first new sim run
- redis-cli get forecast:simulation-outcome:latest | jq '[.theaterResults[] | .topPaths[] | select(.keyActorRoles | length > 0)] | length' → expect > 0
- roleOverlapCount >= 2 fires for theaters whose candidates have populated stateSummary.actors (role categories present)
- actorOverlapCount alias equals roleOverlapCount for stateSummary path, keyActorsOverlapCount for affectedAssets path
- Old sim artifacts (no keyActorRoles) continue to produce adjustment=0.08 (bucket+channel only): no regression
- keyActorRoles consistently empty despite actorRoles being populated in pkg → investigate prompt injection in buildSimulationRound2SystemPrompt
- roleOverlapCount never fires despite matching vocab → check normalization in computeSimulationAdjustment
- mergedPaths.map() → check theater.actorRoles availability in pkg
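
The jq count in the monitoring checks can also be expressed as a small script, for example when validating a saved copy of the payload. This is a sketch under assumptions: `countPathsWithRoles` is a hypothetical helper and `sampleOutcome` is invented data shaped after the jq paths quoted above, not real production output.

```javascript
// Count top paths that carry a non-empty keyActorRoles array, mirroring
// [.theaterResults[] | .topPaths[] | select(.keyActorRoles | length > 0)] | length
function countPathsWithRoles(outcome) {
  return (outcome.theaterResults ?? [])
    .flatMap((theaterResult) => theaterResult.topPaths ?? [])
    .filter((path) => Array.isArray(path.keyActorRoles) && path.keyActorRoles.length > 0)
    .length;
}

// Invented sample: one new-style path, one empty list, one old-style path.
const sampleOutcome = {
  theaterResults: [
    {
      topPaths: [
        { keyActors: ['Iran'], keyActorRoles: ['Commodity traders', 'Policy officials'] },
        { keyActors: ['Houthi'], keyActorRoles: [] },
      ],
    },
    { topPaths: [{ keyActors: ['Iran'] }] }, // old output without keyActorRoles
  ],
};

const pathsWithRoles = countPathsWithRoles(sampleOutcome);
console.log(pathsWithRoles); // 1
```

Unlike the jq expression, this version treats a missing keyActorRoles field the same as an empty array, which matches the expectation that old sim output simply contributes nothing.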