
feat: MiroFish Phase 1+2 improvements - sim quality, ops, report quality #1

Open
lilubot wants to merge 5 commits into main from atlas-hardening

Conversation


@lilubot lilubot commented Mar 20, 2026

Summary

Four commits of improvements across simulation quality, ops reliability, and report quality. All changes are additive - zero existing functionality broken.


Perseus - Phase 1 Sim Quality (commit ade8ab8)

  • NER_MODEL config option (default: minimax-m2.7:cloud)
  • _deduplicate_entities() in entity_reader.py - fuzzy dedup via SequenceMatcher (threshold 0.85)
  • PlatformSimulator ABC + TwitterSimulator + RedditSimulator (platform_simulators.py)
  • SimulationOutcome dataclass + OutcomeScorer class (outcome_scorer.py)
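The fuzzy dedup step above can be sketched with `difflib` alone. This is a minimal illustration, not the PR's code: only `SequenceMatcher` and the 0.85 threshold come from the bullet; the function name and the greedy first-occurrence-wins strategy are assumptions.

```python
from difflib import SequenceMatcher

def deduplicate_entities(names, threshold=0.85):
    """Greedy fuzzy dedup: the first spelling seen becomes canonical;
    later names whose similarity ratio meets the threshold are dropped."""
    canonical = []
    for name in names:
        duplicate = any(
            SequenceMatcher(None, name.lower(), kept.lower()).ratio() >= threshold
            for kept in canonical
        )
        if not duplicate:
            canonical.append(name)
    return canonical

print(deduplicate_entities(["OpenAI", "Open AI", "Neo4j"]))  # → ['OpenAI', 'Neo4j']
```

A real implementation would also merge the labels, summaries, and attributes of each dropped duplicate into its canonical record rather than discarding them.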

Prometheus - Phase 1 Ops & Reliability (commit b509ecb)

  • /api/health - Neo4j + Ollama + uploads checks, healthy/degraded/unhealthy
  • /api/health/ready - strict readiness gate for Prometheus before starting sims
  • preflight.py - 5 startup checks, logs warnings, never blocks startup
  • /api/ops/simulation/<id>/poll - machine-readable: terminal, success, stalled, percent_complete
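The poll endpoint's fields reduce to a small pure function. A sketch under stated assumptions: the six response fields come from the bullet above, but the terminal state names and the stall timeout are invented here.

```python
from datetime import datetime, timedelta

TERMINAL_STATES = {"completed", "failed", "cancelled"}  # assumed state names
STALL_TIMEOUT = timedelta(minutes=10)                   # assumed stall threshold

def poll_payload(status, rounds_done, rounds_total, last_progress_at, now):
    """Derive the machine-readable poll fields from raw simulation state."""
    terminal = status in TERMINAL_STATES
    stalled = not terminal and (now - last_progress_at) > STALL_TIMEOUT
    return {
        "terminal": terminal,
        "success": status == "completed",
        "status": status,
        "percent_complete": round(100.0 * rounds_done / max(rounds_total, 1), 1),
        "stalled": stalled,
        "stalled_since": last_progress_at.isoformat() if stalled else None,
    }
```

Keeping this logic out of the Flask handler makes the terminal/stalled semantics trivially unit-testable.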

Calypso - Phase 1 Report Quality (commit becc6e5)

  • English-only enforcement in PLAN_SYSTEM_PROMPT + SECTION_SYSTEM_PROMPT_TEMPLATE
  • _validate_section_output() - strips corruption, warns on high non-ASCII ratio
  • _run_report_sanity_check() - post-generation integrity check
  • executive_summary.py - ExecutiveSummary dataclass + LLM extraction → JSON + Markdown card
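The section-validation pass lends itself to a short sketch. The 20% non-ASCII warning threshold is stated in the commit message below; the exact control-character policy here is an assumption.

```python
def _non_ascii_ratio(text):
    return sum(1 for ch in text if ord(ch) > 127) / len(text) if text else 0.0

def validate_section_output(text, warn_threshold=0.20):
    """Strip null bytes and control characters, normalize line endings,
    and flag sections whose non-ASCII ratio exceeds the threshold."""
    cleaned = text.replace("\x00", "").replace("\r\n", "\n").replace("\r", "\n")
    cleaned = "".join(ch for ch in cleaned if ch in "\n\t" or ord(ch) >= 32)
    return cleaned, _non_ascii_ratio(cleaned) > warn_threshold
```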

Perseus - Phase 2 Sim Quality (commit 345acae)

  • Behavioral variance fields on OasisAgentProfile: conformity, reactiveness, persuadability, behavioral_noise
  • FatigueState - per-agent fatigue tracking with decay
  • ArchetypePersonaGenerator - 5 archetypes with Gaussian jitter + LLM bio generation
  • NarrativeFrame + MultiFrameInitializer - competing narrative frames at t=0 with agent distribution
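The Gaussian-jitter idea behind ArchetypePersonaGenerator can be sketched in a few lines. The archetype names come from the PR; the seed means, field subset, and sigma are illustrative assumptions.

```python
import random

# Seed means per archetype; values here are illustrative, not from the PR.
ARCHETYPE_MEANS = {
    "Enthusiast": {"conformity": 0.6, "reactiveness": 0.8, "persuadability": 0.7},
    "Skeptic":    {"conformity": 0.3, "reactiveness": 0.5, "persuadability": 0.2},
}

def jitter_profile(archetype, sigma=0.1, rng=random):
    """Per-field Gaussian jitter around the archetype means, clamped to [0, 1]."""
    return {
        field: min(1.0, max(0.0, rng.gauss(mean, sigma)))
        for field, mean in ARCHETYPE_MEANS[archetype].items()
    }
```

Clamping after the jitter keeps every behavioral field a valid probability-like weight while still giving each generated agent a distinct personality vector.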

lilubot added 2 commits March 19, 2026 16:12
…ators, outcome scorer

- Add NER_MODEL config option to Config (default: minimax-m2.7:cloud)
- Add _deduplicate_entities() to EntityReader using SequenceMatcher (threshold 0.85)
- Add PlatformSimulator base class + TwitterSimulator + RedditSimulator (platform_simulators.py)
- Add SimulationOutcome dataclass + OutcomeScorer class (outcome_scorer.py)

All changes are additive only. No existing functionality modified.
Implements Phase 1 of MIROFISH_MEMO_SIM_QUALITY.md roadmap.

coderabbitai bot commented Mar 20, 2026

📝 Walkthrough

Adds health/ops endpoints and startup preflight checks; introduces entity deduplication, platform simulators, outcome scoring, executive-summary and persona generators; changes report endpoint to require Neo4j extension and to pass GraphToolsService into ReportAgent; removes Ollama from docker-compose; adds NER model config.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| App entry, preflight & routes<br>`backend/app/__init__.py`, `backend/app/preflight.py`, `backend/app/routes/__init__.py`, `backend/app/routes/health.py`, `backend/app/routes/ops.py` | Register new health and ops blueprints; run and store startup preflight checks; add `/api/health`, `/api/health/ready`, and `/api/ops/simulation/<id>/poll` handlers with component probes, readiness/stall logic, and preflight aggregation. |
| Report API & ReportAgent changes<br>`backend/app/api/report.py`, `backend/app/services/report_agent.py` | `generate_report()` now obtains `neo4j_storage` from `current_app.extensions`, returns 500 if missing, and constructs `GraphToolsService(storage=storage)` passed into ReportAgent via `graph_tools`. ReportAgent adds per-section and full-report sanitization and sanity checks. |
| Configuration & exports<br>`backend/app/config.py`, `backend/app/services/__init__.py` | Added `Config.NER_MODEL` env-backed setting. Export list expanded to include new simulation/outcome and platform simulator symbols. |
| Entity reading & deduplication<br>`backend/app/services/entity_reader.py` | `EntityReader.__init__` signature extended with `ner_model` and `deduplication_threshold`. Added deduplication pass using SequenceMatcher, canonical merging of labels/summaries/attributes, and unique merging of related nodes/edges. |
| Platform simulators<br>`backend/app/services/platform_simulators.py` | New `PlatformSimulator` base class and concrete `TwitterSimulator` and `RedditSimulator` with visibility scoring and engagement-multiplier formulas. |
| Outcome scoring<br>`backend/app/services/outcome_scorer.py` | New `SimulationOutcome` dataclass and `OutcomeScorer` with `score_simulation()` computing totals, engagement, reach, viral counts, sentiment SD, platform distribution, top influencers, and peak activity round. |
| Executive summary generator<br>`backend/app/services/executive_summary.py` | New `ExecutiveSummary` dataclass and `ExecutiveSummaryGenerator` that calls the LLM (`chat_json`/`chat` fallback), normalizes fields, and writes JSON/MD artifacts. |
| Persona & fatigue tooling<br>`backend/app/services/oasis_profile_generator.py` | Extended `OasisAgentProfile` with phase-2 behavioral fields; added `FatigueState` and `ArchetypePersonaGenerator` to synthesize personas with behavioral variance. |
| Simulation framing<br>`backend/app/services/simulation_config_generator.py` | Added `NarrativeFrame` dataclass and `MultiFrameInitializer` to generate/distribute competing narrative frames (LLM or rule-based) and assign agents to frames. |
| Docker Compose cleanup<br>`docker-compose.yml` | Removed `ollama` service and `ollama_data` volume; dropped mirofish dependency on Ollama; kept Neo4j service/volumes and adjusted commented lines. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop through code with nimble paws,
I stitch the graphs and check the laws,
Deduped names and platforms sing,
Reports now call the Neo4j ring,
Ollama left — but oh, what paws! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 61.33%, below the required 80.00% threshold. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped; CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main changes across three phases (sim quality, ops, report quality), aligning with the detailed PR objectives. |



@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request marks Phase 1 of the simulation quality improvements, focusing on enhancing the realism and analytical capabilities of the MiroFish simulation. It introduces a modular system for platform-specific content behavior and comprehensive outcome scoring, alongside refining entity processing to improve data quality. These changes are designed to be additive and pluggable, setting the foundation for future simulation enhancements without altering existing core logic.

Highlights

  • Platform-Specific Simulation: Introduced an abstract base class PlatformSimulator with concrete implementations for Twitter and Reddit, enabling platform-specific post visibility and engagement scoring based on distinct platform behaviors.
  • Simulation Outcome Scoring: Added OutcomeScorer to compute key simulation metrics such as total posts, engagement, average reach, viral posts, sentiment deviation, platform distribution, and top influencers, providing a quantitative overview of simulation runs.
  • Entity Deduplication: Implemented fuzzy entity deduplication in EntityReader using SequenceMatcher to consolidate similar entities, improving data quality and consistency.
  • Configuration Update: Configured a new NER_MODEL option in the application settings to specify the Named Entity Recognition model.
  • Infrastructure Streamlining: Removed the Ollama service and its related configurations from docker-compose.yml, streamlining the local development environment.
  • Report Agent Enhancement: Integrated GraphToolsService into the ReportAgent for enhanced graph interaction during report generation, ensuring proper context for background tasks.


@sourcery-ai sourcery-ai bot left a comment


Sorry @lilubot, you have reached your weekly rate limit of 1500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@gemini-code-assist

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docker-compose.yml (1)

2-15: ⚠️ Potential issue | 🟠 Major

Activate Docker-mode LLM/embedding endpoints in docker-compose.yml.

The mirofish service defaults to localhost:11434 for both LLM and embedding endpoints (from backend/app/config.py), but inside a container, localhost resolves to the container itself, not the host. After removing the in-compose ollama service, these defaults will fail unless explicitly overridden.

The project already documents Docker overrides in .env.example (commented out with the note "Uncomment these when running ALL services inside Docker"):

LLM_BASE_URL=http://ollama:11434/v1
EMBEDDING_BASE_URL=http://ollama:11434

However, since the ollama service no longer exists in the compose stack, activate these overrides using extra_hosts to map the container hostname, or update .env.example to reflect the new offline-only setup.

Suggested wiring for host.docker.internal
   mirofish:
     build: .
     container_name: mirofish-offline
     env_file:
       - .env
+    extra_hosts:
+      - "host.docker.internal:host-gateway"
+    environment:
+      LLM_BASE_URL: ${LLM_BASE_URL:-http://host.docker.internal:11434/v1}
+      EMBEDDING_BASE_URL: ${EMBEDDING_BASE_URL:-http://host.docker.internal:11434}
     ports:
       - "3000:3000"
       - "5001:5001"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 2 - 15, The mirofish service is using
host-local LLM/embedding defaults that won't resolve inside the container; add
an extra_hosts mapping under the service to point the ollama hostname to the
Docker host (e.g., add extra_hosts: - "ollama:host-gateway" to the mirofish
service) and ensure the .env or .env.example contains the override variables
LLM_BASE_URL=http://ollama:11434/v1 and EMBEDDING_BASE_URL=http://ollama:11434
so the app (config.py) uses those container-resolvable endpoints.
🧹 Nitpick comments (1)
backend/app/services/outcome_scorer.py (1)

129-134: Stabilize tied influencer rankings.

Equal engagement scores currently inherit input order, so the same simulation can emit a different top_influencers list when records arrive in a different order. A secondary name sort keeps reports reproducible.

♻️ Proposed refactor
         ranked = sorted(
             influence_scores.items(),
-            key=lambda item: item[1],
-            reverse=True,
+            key=lambda item: (-item[1], item[0].lower()),
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/outcome_scorer.py` around lines 129 - 134, The current
sort of influence_scores in outcome_scorer.py only keys by score so ties
preserve unpredictable input order; update the sorted call that produces ranked
(the block using influence_scores.items() and influencer_limit) to use a
deterministic secondary key of the influencer name (e.g., sort by score desc
then name asc) so tied engagement scores are reproducibly ordered in
top_influencers.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 54cfec22-5bf5-4934-ae58-2923d075ad18

📥 Commits

Reviewing files that changed from the base of the PR and between f47fa5c and ade8ab8.

📒 Files selected for processing (7)
  • backend/app/api/report.py
  • backend/app/config.py
  • backend/app/services/__init__.py
  • backend/app/services/entity_reader.py
  • backend/app/services/outcome_scorer.py
  • backend/app/services/platform_simulators.py
  • docker-compose.yml

Comment on lines +79 to +87
def __init__(
    self,
    storage: GraphStorage,
    ner_model: Optional[str] = None,
    deduplication_threshold: float = 0.85,
):
    self.storage = storage
    self.ner_model = ner_model or Config.NER_MODEL
    self.deduplication_threshold = deduplication_threshold

⚠️ Potential issue | 🟡 Minor

Reject out-of-range similarity thresholds early.

An invalid deduplication_threshold silently changes dedup behavior: values above 1 prevent merges, and values below 0 merge every same-type pair.

💡 Suggested fix
     def __init__(
         self,
         storage: GraphStorage,
         ner_model: Optional[str] = None,
         deduplication_threshold: float = 0.85,
     ):
         self.storage = storage
         self.ner_model = ner_model or Config.NER_MODEL
+        if not 0.0 <= deduplication_threshold <= 1.0:
+            raise ValueError("deduplication_threshold must be between 0.0 and 1.0")
         self.deduplication_threshold = deduplication_threshold
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/entity_reader.py` around lines 79 - 87, Validate
deduplication_threshold in EntityReader.__init__: check that the provided
deduplication_threshold is between 0.0 and 1.0 (inclusive or exclusive per
project convention) and raise a ValueError with a clear message if out of range;
update the __init__ signature in entity_reader.py to perform this early
validation (before assigning to self.deduplication_threshold) so invalid values
cannot silently alter dedup behavior.

Comment on lines +408 to +410
def _merge_entity_into_canonical(self, canonical: EntityNode, duplicate: EntityNode) -> None:
    """Merge duplicate entity details into the canonical entity in place."""
    canonical.labels = sorted(set(canonical.labels) | set(duplicate.labels))

⚠️ Potential issue | 🟠 Major

Preserve the canonical label order here.

get_entity_type() is order-dependent, so sorting the merged labels can change an entity from its original filtered type to some other label mid-dedup. That leaks into the recomputed entity_types_found on Lines 247-250 and can also stop later same-type matches from merging into the same canonical record.

💡 Suggested fix
-        canonical.labels = sorted(set(canonical.labels) | set(duplicate.labels))
+        canonical.labels = list(dict.fromkeys(canonical.labels + duplicate.labels))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:
def _merge_entity_into_canonical(self, canonical: EntityNode, duplicate: EntityNode) -> None:
    """Merge duplicate entity details into the canonical entity in place."""
    canonical.labels = sorted(set(canonical.labels) | set(duplicate.labels))

After:
def _merge_entity_into_canonical(self, canonical: EntityNode, duplicate: EntityNode) -> None:
    """Merge duplicate entity details into the canonical entity in place."""
    canonical.labels = list(dict.fromkeys(canonical.labels + duplicate.labels))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/entity_reader.py` around lines 408 - 410, In
_merge_entity_into_canonical, avoid sorting merged labels because
get_entity_type is order-dependent; instead preserve canonical.labels order and
append any labels from duplicate.labels that are not already present (dedupe
while keeping canonical order), so update canonical.labels to canonical.labels +
[l for l in duplicate.labels if l not in canonical.labels]; this ensures
get_entity_type and downstream entity_types_found logic remain stable and later
same-type merges still match the canonical record.

Comment on lines +417 to +421
merged_attributes = dict(canonical.attributes)
for key, value in duplicate.attributes.items():
    if key not in merged_attributes or not merged_attributes[key]:
        merged_attributes[key] = value
canonical.attributes = merged_attributes

⚠️ Potential issue | 🟠 Major

Don't treat falsy attribute values as missing.

not merged_attributes[key] overwrites valid canonical values like 0 and False during deduplication, which silently corrupts merged facts.

💡 Suggested fix
         merged_attributes = dict(canonical.attributes)
         for key, value in duplicate.attributes.items():
-            if key not in merged_attributes or not merged_attributes[key]:
+            current_value = merged_attributes.get(key)
+            if key not in merged_attributes or current_value is None or current_value == "":
                 merged_attributes[key] = value
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/entity_reader.py` around lines 417 - 421, The merge loop
treats falsy canonical values as missing; update the condition in the block that
assigns merged_attributes from duplicate.attributes (where canonical and
duplicate are used) to only treat truly absent/undefined values as missing
(e.g., check for key not in merged_attributes or merged_attributes[key] is None)
instead of using a truthiness test like not merged_attributes[key], so valid
values like 0 or False are preserved.

Comment on lines +62 to +71
components = (
    self._value(post, "likes", 0),
    self._value(post, "upvotes", 0),
    self._value(post, "comments", 0),
    self._value(post, "replies", 0),
    self._value(post, "shares", 0),
    self._value(post, "retweets", 0),
    self._value(post, "reposts", 0),
)
return int(sum(int(component or 0) for component in components))

⚠️ Potential issue | 🟠 Major

Avoid double-counting alias metrics in engagement totals.

backend/app/services/platform_simulators.py lines 47-48 and 77-80 treat retweets/reposts and comments/replies as alternate names for the same interaction. Summing both here will inflate total_engagement, viral_posts_count, average_post_reach when it falls back to engagement, and top_influencers as soon as a post carries both aliases.

🛠️ Proposed fix
     def _engagement_value(self, post: Any) -> int:
         direct_value = self._value(post, "engagement", None)
         if direct_value is not None:
             return int(direct_value)

+        comment_count = max(
+            int(self._value(post, "comments", 0) or 0),
+            int(self._value(post, "replies", 0) or 0),
+        )
+        repost_count = max(
+            int(self._value(post, "retweets", 0) or 0),
+            int(self._value(post, "reposts", 0) or 0),
+        )
+
         components = (
             self._value(post, "likes", 0),
             self._value(post, "upvotes", 0),
-            self._value(post, "comments", 0),
-            self._value(post, "replies", 0),
+            comment_count,
             self._value(post, "shares", 0),
-            self._value(post, "retweets", 0),
-            self._value(post, "reposts", 0),
+            repost_count,
         )
         return int(sum(int(component or 0) for component in components))
📝 Committable suggestion


Suggested change
Before:
components = (
    self._value(post, "likes", 0),
    self._value(post, "upvotes", 0),
    self._value(post, "comments", 0),
    self._value(post, "replies", 0),
    self._value(post, "shares", 0),
    self._value(post, "retweets", 0),
    self._value(post, "reposts", 0),
)
return int(sum(int(component or 0) for component in components))

After:
comment_count = max(
    int(self._value(post, "comments", 0) or 0),
    int(self._value(post, "replies", 0) or 0),
)
repost_count = max(
    int(self._value(post, "retweets", 0) or 0),
    int(self._value(post, "reposts", 0) or 0),
)
components = (
    self._value(post, "likes", 0),
    self._value(post, "upvotes", 0),
    comment_count,
    self._value(post, "shares", 0),
    repost_count,
)
return int(sum(int(component or 0) for component in components))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/outcome_scorer.py` around lines 62 - 71, The engagement
sum in outcome_scorer.py currently adds both alias fields (e.g., "retweets" and
"reposts", "comments" and "replies") causing double-counting; update the
aggregation logic around components/_value so aliases are deduplicated by
mapping aliases to a canonical metric (or taking the max of alias pairs) before
summing—e.g., treat "retweets"/"reposts" as one metric and "comments"/"replies"
as one metric, compute a single value for each canonical interaction, then
return the int(sum(...)) of those canonical values.

lilubot added 3 commits March 20, 2026 06:01
- backend/app/routes/health.py: /api/health (Neo4j + Ollama + uploads
  component check, healthy/degraded/unhealthy), /api/health/ready
  (strict readiness gate for Prometheus pre-simulation checks)
- backend/app/preflight.py: run_preflight_checks() — checks Neo4j,
  Ollama, required model, uploads dir, env vars on startup; logs
  warnings but never blocks startup
- backend/app/routes/ops.py: /api/ops/simulation/<id>/poll — returns
  terminal, success, status, percent_complete, stalled, stalled_since
  fields for machine-readable Prometheus polling
- backend/app/routes/__init__.py: package init exporting health_bp, ops_bp
- backend/app/__init__.py: register health_bp and ops_bp; call
  run_preflight_checks() at startup and store result in
  app.extensions['preflight'] for /api/health to surface

Additive only — no existing endpoints modified.
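The healthy/degraded/unhealthy roll-up described in this commit can be expressed as a tiny aggregation function. A sketch under assumptions: which components count as critical, and the rule that any critical failure means unhealthy, are not specified in the commit message.

```python
CRITICAL_COMPONENTS = {"neo4j"}  # assumption: the graph DB is the only hard dependency

def overall_status(checks):
    """Roll per-component booleans up into the three-state health verdict."""
    if all(checks.values()):
        return "healthy"
    if any(not ok for name, ok in checks.items() if name in CRITICAL_COMPONENTS):
        return "unhealthy"
    return "degraded"
```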
…lidation, executive summary

Changes to backend/app/services/report_agent.py:
- Confirmed all open() file writes already use encoding='utf-8'
- Added CRITICAL LANGUAGE REQUIREMENT to PLAN_SYSTEM_PROMPT (English-only output)
- Added CRITICAL LANGUAGE REQUIREMENT to SECTION_SYSTEM_PROMPT_TEMPLATE (English-only output)
- Added ReportAgent._validate_section_output(): strips null bytes, removes
  control chars, normalizes line endings, warns if non-ASCII ratio > 20%
- Added ReportAgent._run_report_sanity_check(): checks section min length,
  unicode replacement chars, and non-ASCII ratio; returns issues dict
- Wired _validate_section_output() into generate_report() after each section
- Wired _run_report_sanity_check() into generate_report() after assembly

New file backend/app/services/executive_summary.py:
- ExecutiveSummary dataclass with all structured fields (outcome, actors,
  metrics, timeline, platform sentiment, recommendations)
- ExecutiveSummaryGenerator class: calls LLM to extract structured JSON from
  full report, saves executive_summary.json and executive_summary.md
- English-only system prompt for LLM extraction
- Graceful fallback on malformed LLM responses (safe_float, safe_list, etc.)
- _render_markdown() produces clean Markdown card with emoji risk indicators

Additive changes only — no existing logic modified.
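The `safe_float`/`safe_list` helpers named above are standard defensive coercers for malformed LLM JSON; the bodies below are plausible sketches, not the PR's actual implementations.

```python
def safe_float(value, default=0.0):
    """Coerce a possibly malformed LLM field to float, falling back quietly."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default

def safe_list(value):
    """Normalize a field to a list of strings; scalars wrap, None/empty vanish."""
    if isinstance(value, list):
        return [str(item) for item in value]
    return [str(value)] if value not in (None, "") else []
```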
… MultiFrameInitializer

oasis_profile_generator.py:
- Add behavioral variance fields to OasisAgentProfile (conformity,
  reactiveness, persuadability, behavioral_noise)
- Add FatigueState dataclass with get_current_activity_level(),
  record_activity(), and record_rest() methods
- Add ArchetypePersonaGenerator class: generates synthetic OASIS profiles
  from five seed archetypes (Enthusiast, Skeptic, Neutral, Influencer,
  Contrarian) with per-field Gaussian variance; supports both LLM-assisted
  and rule-based identity generation
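The FatigueState described above might look roughly like this; the three method names come from the commit message, while the gain and decay rates are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class FatigueState:
    """Per-agent fatigue: activity raises it, rest decays it."""
    fatigue: float = 0.0
    gain: float = 0.2    # fatigue added per activity (assumed rate)
    decay: float = 0.5   # multiplicative decay per rest step (assumed rate)

    def record_activity(self) -> None:
        self.fatigue = min(1.0, self.fatigue + self.gain)

    def record_rest(self) -> None:
        self.fatigue *= self.decay

    def get_current_activity_level(self) -> float:
        # Tired agents act less often; fully fatigued agents go silent.
        return 1.0 - self.fatigue
```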

simulation_config_generator.py:
- Add NarrativeFrame dataclass: stores frame_id, label, description,
  initial_sentiment, talking_points, and assigned agent_ids
- Add MultiFrameInitializer class: generates N competing narrative frames
  at t=0 via LLM (with rule-based fallback), and distributes agents to
  frames by minimum-distance matching on sentiment_bias

All changes are strictly additive; no existing code was modified.
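The minimum-distance matching on sentiment_bias described above can be sketched as follows; the function signature and data shapes are assumptions, the assignment rule is from the commit message.

```python
def assign_agents_to_frames(agent_biases, frame_sentiments):
    """Give each agent the frame whose initial_sentiment is nearest
    (minimum absolute distance) to the agent's sentiment_bias."""
    assignments = {frame_id: [] for frame_id in frame_sentiments}
    for agent_id, bias in agent_biases.items():
        nearest = min(frame_sentiments, key=lambda fid: abs(frame_sentiments[fid] - bias))
        assignments[nearest].append(agent_id)
    return assignments
```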
@lilubot lilubot changed the title from "feat(sim-quality): Phase 1 simulation quality improvements" to "feat: MiroFish Phase 1+2 improvements - sim quality, ops, report quality" on Mar 20, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (3)
backend/app/services/oasis_profile_generator.py (2)

1525-1556: Remove unnecessary f prefix from string literals.

Several strings in bio_templates don't contain placeholders but use f-string syntax:

  • Line 1527: "Skeptic": f"I question everything..."
  • Line 1528: "Neutral": f"Just here to read..."
  • Line 1530: "Contrarian": f"If everyone agrees..."
♻️ Suggested fix
         bio_templates = {
             "Enthusiast": f"Passionate about innovation and progress. Always excited to discuss new ideas. #{archetype}",
-            "Skeptic": f"I question everything. Critical thinker. Not buying the hype.",
-            "Neutral": f"Just here to read and stay informed. Opinions are my own.",
+            "Skeptic": "I question everything. Critical thinker. Not buying the hype.",
+            "Neutral": "Just here to read and stay informed. Opinions are my own.",
             "Influencer": f"Content creator | thought leader | {random.randint(1, 50)}K followers. Let's connect!",
-            "Contrarian": f"If everyone agrees, someone is wrong. Usually them.",
+            "Contrarian": "If everyone agrees, someone is wrong. Usually them.",
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/oasis_profile_generator.py` around lines 1525 - 1556,
Three entries in bio_templates ("Skeptic", "Neutral", "Contrarian") use
f-strings without any interpolation; remove the unnecessary f prefix from those
string literals so they become normal strings (leave the f prefix on entries
that do interpolate like "Enthusiast" and "Influencer"); update the
bio_templates definition accordingly and run tests to ensure no behavioral
change.

1332-1332: Use tuple instead of list for class-level constant.

DEFAULT_ARCHETYPES is a class attribute that should be immutable. Using list() creates a mutable object that could be accidentally modified, affecting all instances.

♻️ Suggested fix
-    DEFAULT_ARCHETYPES = list(_ARCHETYPE_DEFAULTS.keys())
+    DEFAULT_ARCHETYPES = tuple(_ARCHETYPE_DEFAULTS.keys())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/oasis_profile_generator.py` at line 1332, Replace the
mutable class-level list DEFAULT_ARCHETYPES with an immutable tuple to avoid
accidental mutation: change the assignment that currently uses
list(_ARCHETYPE_DEFAULTS.keys()) to use tuple(_ARCHETYPE_DEFAULTS.keys()) (or
tuple(_ARCHETYPE_DEFAULTS)) in the oasis_profile_generator module so
DEFAULT_ARCHETYPES is immutable at class level and cannot be modified by
instances; update any code that relies on list-specific methods to convert to
list locally if needed.
backend/app/services/simulation_config_generator.py (1)

1066-1072: Either implement the documented neutral split or trim the docstring.

Lines 1068-1070 promise neutral agents are spread across middle frames, but Lines 1303-1306 always pick a single closest frame. That makes the API description misleading and will concentrate neutral agents more than callers expect.

Also applies to: 1294-1307

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/simulation_config_generator.py` around lines 1066 -
1072, The docstring promises neutral agents are split across middle frames but
the current selection logic always assigns each agent to the single closest
frame; update the assignment logic (where agent sentiment_bias is mapped to a
frame and frame.agent_ids are populated) so neutral agents (bias near 0) are
distributed evenly across the middle frames instead of always choosing the
single closest frame—identify the code that finds the "closest frame" and
replace it with a neutral-handling branch that computes the middle-frame subset
(e.g., the central one or two frames of the frames list sorted by positivity)
and round-robin or evenly assigns neutral agent ids into those
NarrativeFrame.agent_ids; alternatively, if you prefer the simpler fix, edit the
docstring to remove the claim about neutral splitting and state that agents are
assigned to the single closest frame based on sentiment_bias.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b16abba3-aee8-4607-8e62-70363a9e30db

📥 Commits

Reviewing files that changed from the base of the PR and between ade8ab8 and 345acae.

📒 Files selected for processing (9)
  • backend/app/__init__.py
  • backend/app/preflight.py
  • backend/app/routes/__init__.py
  • backend/app/routes/health.py
  • backend/app/routes/ops.py
  • backend/app/services/executive_summary.py
  • backend/app/services/oasis_profile_generator.py
  • backend/app/services/report_agent.py
  • backend/app/services/simulation_config_generator.py

Comment on lines +16 to +22
# Required environment variable names (as strings, checked against os.environ)
_REQUIRED_ENV_VARS = [
    'NEO4J_URI',
    'NEO4J_PASSWORD',
    'LLM_BASE_URL',
    'LLM_MODEL_NAME',
]

⚠️ Potential issue | 🟠 Major

Make env_vars check the environment, not the defaulted config.

This currently falls back to backend/app/config.py::Config, so NEO4J_PASSWORD, LLM_BASE_URL, and LLM_MODEL_NAME read as “present” even when nothing is set in os.environ. LLM_API_KEY is also missing from _REQUIRED_ENV_VARS, so the preflight can report success while backend/app/config.py::Config.validate() would still fail.

🛠️ Suggested fix
 _REQUIRED_ENV_VARS = [
+    'LLM_API_KEY',
     'NEO4J_URI',
     'NEO4J_PASSWORD',
     'LLM_BASE_URL',
     'LLM_MODEL_NAME',
 ]
@@
 def _check_required_env_vars() -> dict:
     """Check that required environment variables are set."""
-    missing = []
-    for var in _REQUIRED_ENV_VARS:
-        # Accept values set either via os.environ or via Config class attributes
-        env_val = os.environ.get(var)
-        config_attr = var.replace('LLM_BASE_URL', 'LLM_BASE_URL') \
-                        .replace('LLM_MODEL_NAME', 'LLM_MODEL_NAME')
-        config_val = getattr(Config, config_attr, None)
-        if not env_val and not config_val:
-            missing.append(var)
+    missing = [var for var in _REQUIRED_ENV_VARS if not os.environ.get(var)]
     return {"ok": len(missing) == 0, "missing": missing}

Also applies to: 91-102

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/preflight.py` around lines 16 - 22, The preflight currently
treats config defaults as "present" so update the check to read from the actual
environment: change logic that builds env_vars to inspect os.environ (e.g., use
k in os.environ or os.environ.get(k)) instead of importing values from
backend.app.config.Config; add 'LLM_API_KEY' to _REQUIRED_ENV_VARS so it is
validated too; apply the same fix to the later duplicate check around lines
91-102 (ensure both places reference os.environ directly and not Config) and
keep using the same unique symbols _REQUIRED_ENV_VARS and env_vars so the
preflight reports true missing env vars.

Comment on lines +56 to +68
def _check_uploads_dir() -> bool:
    """Verify the uploads directory exists and is writable."""
    try:
        folder = os.path.abspath(Config.UPLOAD_FOLDER)
        os.makedirs(folder, exist_ok=True)
        test_path = os.path.join(folder, '.health_check')
        with open(test_path, 'w') as f:
            f.write('ok')
        os.remove(test_path)
        return True
    except Exception as e:
        logger.debug(f"Uploads dir check failed: {e}")
        return False

⚠️ Potential issue | 🟠 Major

Use a unique temp file for the writability probe.

Every request touches the same .health_check path. Two concurrent /api/health or /api/health/ready calls can race on os.remove() and flap the endpoint to false-negative failures.

🛠️ Suggested fix
-        test_path = os.path.join(folder, '.health_check')
-        with open(test_path, 'w') as f:
-            f.write('ok')
-        os.remove(test_path)
+        with tempfile.NamedTemporaryFile(
+            mode='w',
+            dir=folder,
+            prefix='.health_check_',
+            delete=True,
+        ) as f:
+            f.write('ok')
         return True

Also add import tempfile at the top of the file.

🧰 Tools
🪛 Ruff (0.15.6)

[warning] 66-66: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/routes/health.py` around lines 56 - 68, _check_uploads_dir
currently uses a fixed ".health_check" filename which causes races on os.remove;
change the probe to create a unique temp file inside the uploads folder (use
tempfile.NamedTemporaryFile or tempfile.mkstemp with dir=folder) write/read as
needed, close and unlink that unique path in a finally block to ensure cleanup,
and keep returning True/False on success/exception; also add import tempfile at
the top of the file.

Comment on lines +119 to +128
# Surface any stored preflight results
preflight = current_app.extensions.get('preflight')

body = {
    "status": overall,
    "timestamp": datetime.now().isoformat(),
    "components": components,
}
if preflight is not None:
    body["preflight"] = preflight

⚠️ Potential issue | 🟠 Major

Avoid returning raw preflight diagnostics from /api/health.

preflight includes internal URLs, filesystem paths, model inventory, and missing env var names. Exposing that verbatim on an unauthenticated health endpoint is unnecessary information leakage; return coarse ok flags only, or gate the detailed payload behind authenticated ops access.

🛠️ Suggested fix
-    if preflight is not None:
-        body["preflight"] = preflight
+    if preflight is not None:
+        body["preflight"] = {
+            name: {"ok": detail.get("ok", False)}
+            for name, detail in preflight.items()
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/routes/health.py` around lines 119 - 128, The route currently
attaches the raw preflight payload from current_app.extensions.get('preflight')
to the health response (variable preflight and body) which leaks internal
paths/URLs/model lists; instead, replace that with a sanitized summary or gate
detailed data behind authenticated ops access: when preflight is present set
body["preflight"] to a coarse object (e.g. {"ok": bool(preflight.passed or all
checks), "checks": {name: bool}}) or simply {"ok": True/False}, and only include
the full preflight payload if the requester is authenticated/authorized for ops
(check your auth helper in the health route) — update the logic around
current_app.extensions.get('preflight') and the body assignment to implement
this sanitization/gating.

Comment on lines +1233 to +1244
def record_rest(self, current_round: int) -> None:
    """
    Passive fatigue decay when the agent skips a round.

    Args:
        current_round: The simulation round being skipped.
    """
    if self.last_activity_round >= 0:
        rounds_idle = max(0, current_round - self.last_activity_round)
        self.fatigue_accumulation = max(
            0.0, self.fatigue_accumulation - rounds_idle * self.fatigue_decay_rate
        )

⚠️ Potential issue | 🟠 Major

Bug: record_rest double-counts decay when called consecutively.

The method calculates rounds_idle from last_activity_round, but doesn't update any state to track when rest was last recorded. Calling record_rest multiple times causes cumulative over-decay.

Example:

  1. Agent acts at round 0 → fatigue_accumulation=0.5, last_activity_round=0
  2. record_rest(5) → rounds_idle=5, decays by 0.25 → fatigue=0.25
  3. record_rest(6) → rounds_idle=6 (still from round 0!), decays by 0.30 → fatigue≈0 (over-decayed)

The second call should only decay for 1 round (round 5→6), not 6 rounds.

🔧 Proposed fix: track last decay point
+    # Round number when fatigue was last updated (via rest or activity)
+    last_decay_round: int = -1
+
     def record_rest(self, current_round: int) -> None:
         """
         Passive fatigue decay when the agent skips a round.

         Args:
             current_round: The simulation round being skipped.
         """
-        if self.last_activity_round >= 0:
-            rounds_idle = max(0, current_round - self.last_activity_round)
+        reference_round = max(self.last_activity_round, self.last_decay_round)
+        if reference_round >= 0:
+            rounds_idle = max(0, current_round - reference_round)
             self.fatigue_accumulation = max(
                 0.0, self.fatigue_accumulation - rounds_idle * self.fatigue_decay_rate
             )
+        self.last_decay_round = current_round

Alternatively, update record_activity to also set last_decay_round = current_round after it finishes.
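The patched behavior can be checked with a minimal standalone model of the suggested fix; this is a self-contained reproduction using the field names from the diff, not the actual module:

```python
from dataclasses import dataclass


@dataclass
class FatigueState:
    fatigue_accumulation: float = 0.0
    fatigue_decay_rate: float = 0.05
    last_activity_round: int = -1
    last_decay_round: int = -1  # the fix: remember when decay last ran

    def record_rest(self, current_round: int) -> None:
        # Decay only for rounds since the later of last activity / last decay.
        reference = max(self.last_activity_round, self.last_decay_round)
        if reference >= 0:
            rounds_idle = max(0, current_round - reference)
            self.fatigue_accumulation = max(
                0.0,
                self.fatigue_accumulation - rounds_idle * self.fatigue_decay_rate,
            )
        self.last_decay_round = current_round


s = FatigueState(fatigue_accumulation=0.5, last_activity_round=0)
s.record_rest(5)  # decays 5 rounds: 0.5 - 0.25 = 0.25
s.record_rest(6)  # decays only 1 more round: 0.25 - 0.05 = 0.20
```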

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/oasis_profile_generator.py` around lines 1233 - 1244,
record_rest double-counts fatigue decay — modify record_rest in
oasis_profile_generator.py to track the last round when decay was applied (e.g.,
add a new attribute like last_decay_round or last_rest_round initialized in the
profile constructor) and compute rounds_idle using current_round minus that
last_decay_round (or the later of last_activity_round and last_decay_round) so
repeated calls only apply decay for new rounds; after applying decay set
last_decay_round = current_round (and also ensure record_activity sets
last_decay_round = last_activity_round when activity happens).

Comment on lines +622 to +626
CRITICAL LANGUAGE REQUIREMENT: You MUST write ENTIRELY in English.
Do NOT use Chinese, Japanese, Korean, or any non-English language anywhere in your output.
Even if the source documents or simulation data contain non-English text, you must translate everything to English.
All section content, analysis, quotes (translated), and conclusions must be in English.


⚠️ Potential issue | 🟠 Major

Remove the stale Chinese-output rule from the section prompt.

Line 622 now mandates English-only output, but the same backend/app/services/report_agent.py::SECTION_SYSTEM_PROMPT_TEMPLATE later still tells the model to switch to Chinese when the source material is Chinese. That contradiction means the English-only hardening here is not actually enforced for section generation.

🛠️ Suggested fix
-3. [Language Consistency - Quoted Content Must Be Translated to Report Language]
-   - Tool returned content may contain English or mixed Chinese-English expressions
-   - If the simulation requirement and source material are in Chinese, the report must be entirely in Chinese
-   - When you quote English or mixed Chinese-English content from tools, you must translate it to fluent Chinese before including it in the report
-   - When translating, preserve the original meaning and ensure natural expression
+3. [Language Consistency - Output Must Remain in English]
+   - Tool returned content may contain non-English expressions
+   - Translate all quoted or paraphrased material into fluent English before including it in the report
+   - When translating, preserve the original meaning and ensure natural English expression
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/report_agent.py` around lines 622 - 626, The top-level
English-only enforcement conflicts with a later instruction inside
SECTION_SYSTEM_PROMPT_TEMPLATE in report_agent.py that tells the model to switch
to Chinese for Chinese source material; remove the stale Chinese-output rule
from SECTION_SYSTEM_PROMPT_TEMPLATE (or make it conditional on a dedicated
locale/config flag) so the section prompt no longer overrides the global
English-only requirement and the model output remains consistent with the
earlier enforcement.

Comment on lines +1081 to +1094
sorted_frames = sorted(self.frames, key=lambda f: f.initial_sentiment)
frame_assignments: Dict[str, List[int]] = {f.frame_id: [] for f in sorted_frames}

# Clear any prior assignments on frame objects
for frame in sorted_frames:
    frame.agent_ids = []

num_frames = len(sorted_frames)

for agent in agent_configs:
    bias = getattr(agent, "sentiment_bias", 0.0)
    frame = self._find_best_frame(bias, sorted_frames)
    frame_assignments[frame.frame_id].append(agent.agent_id)
    frame.agent_ids.append(agent.agent_id)

⚠️ Potential issue | 🟠 Major

Make frame_id unique before using it as the assignment key.

Line 1082 keys the returned mapping by frame_id, so duplicate or blank ids from the LLM will silently collapse multiple frames into one bucket. Please normalize and de-duplicate ids when constructing NarrativeFrames.

Suggested fix
+        seen_frame_ids: Dict[str, int] = {}
         frames: List[NarrativeFrame] = []
         for item in frames_data[: self.num_frames]:
+            raw_frame_id = str(item.get("frame_id") or "").strip() or f"frame_{len(frames)}"
+            collision_count = seen_frame_ids.get(raw_frame_id, 0)
+            seen_frame_ids[raw_frame_id] = collision_count + 1
+            frame_id = raw_frame_id if collision_count == 0 else f"{raw_frame_id}_{collision_count}"
             frames.append(
                 NarrativeFrame(
-                    frame_id=item.get("frame_id", f"frame_{len(frames)}"),
+                    frame_id=frame_id,
                     label=item.get("label", item.get("frame_id", "Frame")),
                     description=item.get("description", ""),
                     initial_sentiment=float(item.get("initial_sentiment", 0.0)),
                     talking_points=item.get("talking_points", []),
                 )

Also applies to: 1154-1163

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/simulation_config_generator.py` around lines 1081 -
1094, The mapping uses frame.frame_id as a key but frame_id can be blank or
duplicated from the LLM; ensure frame_id is normalized and unique before using
it as a dictionary key (either during NarrativeFrame construction or immediately
before building frame_assignments). Update the code that creates NarrativeFrame
instances (and any place that sets frame.frame_id) to trim/normalize the id,
generate a fallback unique id when blank, and de-duplicate collisions (e.g.,
append a counter or uuid). Then use those guaranteed-unique frame.frame_id
values in sorted_frames, frame_assignments, and when calling _find_best_frame so
no frames collapse into the same bucket.

Comment on lines +1154 to +1167
frames: List[NarrativeFrame] = []
for item in frames_data[: self.num_frames]:
    frames.append(
        NarrativeFrame(
            frame_id=item.get("frame_id", f"frame_{len(frames)}"),
            label=item.get("label", item.get("frame_id", "Frame")),
            description=item.get("description", ""),
            initial_sentiment=float(item.get("initial_sentiment", 0.0)),
            talking_points=item.get("talking_points", []),
        )
    )

if not frames:
    raise ValueError("LLM returned no valid frames")

⚠️ Potential issue | 🟠 Major

Preserve the num_frames contract on both generation paths.

The class exposes num_frames, but the current helpers can return fewer: the LLM path accepts any non-empty result, and the rule-based fallback tops out at the six templates in template_pool. A caller asking for 7 or 8 frames can therefore silently get less after an LLM failure.

Suggested fix
-        if not frames:
+        if not frames:
             raise ValueError("LLM returned no valid frames")
+        if len(frames) != self.num_frames:
+            raise ValueError(
+                f"LLM returned {len(frames)} frames for requested {self.num_frames}"
+            )
 
         logger.info(f"MultiFrameInitializer: generated {len(frames)} frames via LLM")
         return frames
@@
-        logger.info(f"MultiFrameInitializer: generated {len(frames)} frames via rule-based fallback")
+        if len(frames) != self.num_frames:
+            raise ValueError(
+                f"Rule-based fallback produced {len(frames)} frames for requested {self.num_frames}"
+            )
+        logger.info(f"MultiFrameInitializer: generated {len(frames)} frames via rule-based fallback")
         return frames

Also applies to: 1266-1288

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/simulation_config_generator.py` around lines 1154 -
1167, The generation paths currently can return fewer frames than
self.num_frames (LLM path accepts any non-empty frames_data and the rule-based
path is limited by template_pool); update both generation branches so the final
list of NarrativeFrame instances has exactly self.num_frames: in the LLM path
(where frames_data is consumed into frames) validate len(frames) ==
self.num_frames and raise a ValueError if fewer, and in the rule-based fallback
(where template_pool is used) pad or repeat templates deterministically to
produce exactly self.num_frames (ensuring unique frame_id/label generation)
before returning; reference the NarrativeFrame construction loop (frames_data ->
NarrativeFrame) and the template_pool usage to locate where to apply
truncation/padding or the strict validation.
