Skip to content

feat(harness): complete resume loop + event observability#13

Merged
ranvier2d2 merged 6 commits intomainfrom
feat/resume-loop-observability
Mar 26, 2026
Merged

feat(harness): complete resume loop + event observability#13
ranvier2d2 merged 6 commits intomainfrom
feat/resume-loop-observability

Conversation

@ranvier2d2
Copy link
Collaborator

@ranvier2d2 ranvier2d2 commented Mar 26, 2026

Summary

  • Complete the resume loopSessionManager now reads Storage.get_binding on session start and injects resumeCursor into params when the provider matches. The bindings table (PR feat(harness): SQLite bindings table for thread resume #12) was write-only; this closes the read side.
  • Provider mismatch guard — if a thread has a Codex binding but Cursor tries to start on it, the stale cursor is ignored (not injected, not deleted). Caller-supplied resumeCursor is never overwritten.
  • Document agent_message_delta silencing — confirmed codex/event/agent_message_* events are redundant with item/agentMessage/delta (already mapped). Mapping both would cause duplicate deltas. Added explanatory comment.
  • Log unmapped harness eventsmapHarnessEventToRuntimeEvents now logs unmapped events with provider context instead of silently returning []. Surfaces mapping gaps for new event types.

Commits

  1. feat(harness): complete resume loop — SessionManager reads bindings
  2. docs(codex): document why agent_message_delta events are silenced
  3. fix(harness): log unmapped events + document agent_message silencing

Test plan

  • mix test test/harness/storage_test.exs — 38/38 pass (7 new resume cursor injection tests)
  • mix compile --warnings-as-errors — clean
  • tsc --noEmit — clean (pre-existing ProviderHealth lint only)
  • Provider match: binding exists + provider matches → resumeCursor injected
  • Provider mismatch: binding exists + wrong provider → NOT injected
  • No binding: params unchanged
  • Caller-supplied cursor: never overwritten
  • Multi-session concurrent: 3 providers, different threads, no cross-contamination
  • Multi-session same thread: Codex binding ignored when Cursor starts on same thread
  • Manual: Codex session → close → new session same thread → logs show thread/resume

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Session resumption: automatically injects resume cursors when a matching stored binding exists for the same provider and thread (caller-supplied cursor is preserved).
  • Improvements

    • Clarified handling of unmapped provider events; certain agent-message event types are now treated as quiet to reduce logging noise.
  • Tests

    • Added tests covering resume-cursor injection, provider/thread isolation, and edge cases.

ranvier2d2 and others added 3 commits March 25, 2026 21:27
SessionManager now reads Storage.get_binding on session start and
injects resumeCursor into params when the provider matches. Completes
the bindings table feature: write path (PR #12) + read path (this).

Provider mismatch check prevents injecting a Codex cursor into a
Cursor session. Caller-supplied resumeCursor is never overwritten.

7 new tests including multi-session concurrent and cross-provider.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These raw codex/event/agent_message_* events are duplicates of the
item-level item/agentMessage/delta (already mapped to content.delta).
Mapping both would cause double deltas in the UI.

Confirmed via stress-test-real-subagent.ts which treats them as
equivalent, and by tracing both paths through the event pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two observability improvements:

1. mapHarnessEventToRuntimeEvents now logs unmapped events with
   provider context instead of silently returning []. Surfaces
   mapping gaps for new event types from any provider.

2. Document why codex/event/agent_message_* events are in
   QUIET_UNMAPPED_EVENTS: they're redundant with item/agentMessage/delta
   (already mapped). Mapping both would cause duplicate deltas.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link

coderabbitai bot commented Mar 26, 2026

Warning

Rate limit exceeded

@ranvier2d2 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 30 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d668403d-a902-4d3c-83aa-cf94d5ee2785

📥 Commits

Reviewing files that changed from the base of the PR and between 3c20bc1 and 138cf20.

📒 Files selected for processing (1)
  • apps/server/src/provider/Layers/codexEventMapping.test.ts
📝 Walkthrough

Walkthrough

The PR injects stored per-thread resume cursors into session start params via a new internal helper, adds ExUnit tests for that behavior, and adjusts provider-layer unmapped-event handling by marking specific Codex agent-message methods as quiet and clarifying logging attribution.

Changes

Cohort / File(s) Summary
Harness Session Resume Logic
apps/harness/lib/harness/session_manager.ex
Added maybe_inject_resume_cursor/3 and normalize_resume_cursor/1; preprocesses params to inject "resumeCursor" from Harness.Storage.get_binding/1 when provider matches and JSON decodes successfully.
Harness Session Tests
apps/harness/test/harness/storage_test.exs
Added tests verifying injection only on matching provider and valid stored cursor, no overwrite of existing resumeCursor, handling of nil/missing bindings, and multi-binding isolation.
Provider Event Logging
apps/server/src/provider/Layers/HarnessClientAdapter.ts, apps/server/src/provider/Layers/codexEventMapping.ts
Clarified fallback attribution in mapHarnessEventToRuntimeEvents and added three codex/event agent-message methods to QUIET_UNMAPPED_EVENTS to suppress noisy debug logs.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Client
    participant SM as SessionManager
    participant Storage as Harness.Storage
    participant Supervisor as SessionSupervisor

    Caller->>SM: start_session_with_module(params, provider, thread_id, ...)
    SM->>SM: maybe_inject_resume_cursor(params, provider, thread_id)
    SM->>Storage: get_binding(thread_id)
    alt binding exists and provider matches and cursor is binary
        Storage-->>SM: %{provider: p, resume_cursor_json: json}
        SM->>SM: Jason.decode(json) -> cursor_map
        SM->>SM: normalize_resume_cursor(cursor_map)
        SM-->>Caller: params with "resumeCursor" injected
    else no binding / provider mismatch / decode fails
        Storage-->>SM: nil or non-matching provider or bad json
        SM-->>Caller: original params (unchanged)
    end
    Caller->>Supervisor: start_child(session_spec(params,...))
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

size:L

Poem

🐰 I sniff the threads where cursors hide,
I tuck your place into params wide.
A nibble, a hop, the session wakes,
Quiet notes for noisy wakes,
The rabbit hops—resumes the ride.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the two main changes: completing the resume loop in the harness session manager and improving event observability through logging.
Description check ✅ Passed The description provides comprehensive information covering what changed, why it was needed, test results, and implementation details, matching the template's core requirements despite not strictly following its exact section structure.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/resume-loop-observability

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot added size:M vouch:trusted PR author is trusted by repo permissions or the VOUCHED list. labels Mar 26, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
apps/harness/lib/harness/session_manager.ex (1)

278-280: Log decode failures for better observability.

The decode error path currently returns silently; a warning log here would make corrupted bindings diagnosable.

Proposed refactor
-          _ ->
+          {:error, reason} ->
+            Logger.warning(
+              "Invalid resume_cursor_json for #{thread_id} (#{provider}): #{inspect(reason)}"
+            )
             params
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/harness/lib/harness/session_manager.ex` around lines 278 - 280, The
fallback clause that currently returns params on decode failure (the "_ ->
params" branch) should log a warning so corrupt or un-decodable bindings are
observable; update that branch to call Logger.warn (or the module's logger) with
a concise message including the failing input/params and the decode error or
reason before returning params, so you still preserve behavior but surface the
failure for diagnostics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/harness/lib/harness/session_manager.ex`:
- Around line 270-277: The decoded resume cursor is injected verbatim in
maybe_inject_resume_cursor, which causes a nested "threadId" object for Codex;
update maybe_inject_resume_cursor to normalize the decoded cursor before
Map.put: if provider matches the Codex binding (reference
Harness.Providers.CodexSession) and the decoded cursor is a map with a
"threadId" key whose value is a string, replace the cursor with that string (the
actual id); otherwise leave the decoded cursor unchanged—then inject the
normalized cursor into params as "resumeCursor".

In `@apps/server/src/provider/Layers/HarnessClientAdapter.ts`:
- Around line 920-928: The fallback debug log in HarnessClientAdapter.ts noisily
re-logs events that Codex intentionally silences and bypasses centralized
logging; update the unmapped-event path in mapHarnessEventToRuntimeEvents (the
block that checks if (event.method)) to first consult the Codex quiet set
(QUIET_UNMAPPED_EVENTS / codexEventMapping helper) and skip logging for those
methods, and for non-quiet unmapped events route the log through the centralized
providerManager logging API (or call the providerManager helper used for
provider/dispatch event logging) instead of using console.debug; this ensures
quiet events remain silent and all other unmapped events are logged in the
coordinated place.

---

Nitpick comments:
In `@apps/harness/lib/harness/session_manager.ex`:
- Around line 278-280: The fallback clause that currently returns params on
decode failure (the "_ -> params" branch) should log a warning so corrupt or
un-decodable bindings are observable; update that branch to call Logger.warn (or
the module's logger) with a concise message including the failing input/params
and the decode error or reason before returning params, so you still preserve
behavior but surface the failure for diagnostics.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ac9fe446-0cbb-4798-bb64-61e6d97d22b4

📥 Commits

Reviewing files that changed from the base of the PR and between c452e73 and 112adc7.

📒 Files selected for processing (4)
  • apps/harness/lib/harness/session_manager.ex
  • apps/harness/test/harness/storage_test.exs
  • apps/server/src/provider/Layers/HarnessClientAdapter.ts
  • apps/server/src/provider/Layers/codexEventMapping.ts

ranvier2d2 and others added 2 commits March 25, 2026 22:05
Three fixes from review:

1. Normalize Codex resume cursor: {"threadId":"abc"} → "abc" (raw
   string). CodexSession sends resumeCursor directly as the threadId
   RPC param — a nested map would be rejected by the app-server.

2. Remove redundant fallback log in mapHarnessEventToRuntimeEvents.
   codexMapToRuntimeEvents already handles both QUIET_UNMAPPED_EVENTS
   suppression and console.debug for unknown methods.

3. Log warning on corrupt/un-decodable binding JSON instead of
   silently returning params.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/harness/lib/harness/session_manager.ex`:
- Around line 270-293: The decode of cursor_json should be moved out of
maybe_inject_resume_cursor and into normalize_resume_cursor so each provider
receives the same raw-format value it expects; change maybe_inject_resume_cursor
to pass the stored cursor_json (not the decoded map) into
normalize_resume_cursor, and update normalize_resume_cursor to perform
Jason.decode(cursor_json) internally, returning the extracted thread id string
for Codex (like current %{"threadId" => tid} -> tid), or the raw decoded
JSON/string for Cursor and others, and log/return params on decode errors.
Update references to maybe_inject_resume_cursor and normalize_resume_cursor
accordingly so Cursor's cursor_session guard (cursor when is_binary(cursor))
receives a JSON string as intended.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 54158ac7-6129-4aef-8f9b-8556b314131e

📥 Commits

Reviewing files that changed from the base of the PR and between 112adc7 and 3c20bc1.

📒 Files selected for processing (3)
  • apps/harness/lib/harness/session_manager.ex
  • apps/harness/test/harness/storage_test.exs
  • apps/server/src/provider/Layers/HarnessClientAdapter.ts
✅ Files skipped from review due to trivial changes (1)
  • apps/server/src/provider/Layers/HarnessClientAdapter.ts

Comment on lines +270 to +293
def maybe_inject_resume_cursor(params, thread_id, provider) do
case Harness.Storage.get_binding(thread_id) do
%{provider: ^provider, resume_cursor_json: cursor_json} when is_binary(cursor_json) ->
case Jason.decode(cursor_json) do
{:ok, cursor} ->
normalized = normalize_resume_cursor(cursor)
Logger.info("Injecting resumeCursor from binding for #{thread_id} (#{provider})")
Map.put(params, "resumeCursor", normalized)

{:error, reason} ->
Logger.warning("Failed to decode binding cursor for #{thread_id}: #{inspect(reason)}")
params
end

_ ->
# No binding, provider mismatch, or nil cursor — start fresh
params
end
end

# Codex stores {"threadId": "..."} but CodexSession expects the raw string.
# Other providers store richer objects their session modules know how to parse.
defp normalize_resume_cursor(%{"threadId" => tid}) when is_binary(tid), do: tid
defp normalize_resume_cursor(cursor), do: cursor
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify how each session module handles resumeCursor

echo "=== cursor_session.ex: extract_resume_session_id expects binary ==="
rg -n -A 15 'defp extract_resume_session_id' apps/harness/lib/harness/providers/cursor_session.ex

echo ""
echo "=== claude_session.ex: parse_resume_state expects binary ==="
rg -n -A 20 'defp parse_resume_state' apps/harness/lib/harness/providers/claude_session.ex

echo ""
echo "=== opencode_session.ex: check resumeCursor handling ==="
rg -n -B 2 -A 10 'resumeCursor' apps/harness/lib/harness/providers/opencode_session.ex || echo "No resumeCursor handling found"

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 1794


🏁 Script executed:

# Get the full parse_resume_state function from claude_session
sed -n '1021,1080p' apps/harness/lib/harness/providers/claude_session.ex

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 1745


🏁 Script executed:

# Get the full extract_resume_session_id function from cursor_session
sed -n '374,410p' apps/harness/lib/harness/providers/cursor_session.ex

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 1270


🏁 Script executed:

# Check codex_session for resumeCursor handling
rg -n -B 2 -A 15 'resumeCursor' apps/harness/lib/harness/providers/codex_session.ex

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 671


🏁 Script executed:

# List all session provider files to see what modules exist
fd -e ex$ 'session\.ex$' apps/harness/lib/harness/providers/

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 57


🏁 Script executed:

# Check mock_session for resumeCursor handling
rg -n -B 2 -A 10 'resumeCursor' apps/harness/lib/harness/providers/mock_session.ex

# Check if there's an opencode_session file and its resumeCursor handling
fd -e ex$ session apps/harness/lib/harness/providers/ | sort

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 57


🏁 Script executed:

# Check if mock_session even exists
fd -type f -name '*session.ex' apps/harness/lib/harness/providers/ | xargs -I {} basename {}

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 245


🏁 Script executed:

# Verify the current binding storage format to understand if cursor_json is stored as JSON string or map
rg -n 'resume_cursor_json' apps/harness/lib/harness/storage.ex | head -20

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 1240


🏁 Script executed:

# List all session provider files
fd -name '*session.ex' apps/harness/lib/harness/providers/

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 305


🏁 Script executed:

# Check how cursor_session builds its resume cursor to understand what format it stores
rg -n -B 2 -A 10 'build_resume_cursor' apps/harness/lib/harness/providers/cursor_session.ex

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 935


🏁 Script executed:

# List all session provider files
fd '*session.ex' apps/harness/lib/harness/providers/

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 365


🏁 Script executed:

# Verify if there's a mock_session file
ls -la apps/harness/lib/harness/providers/ | grep session

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 413


🏁 Script executed:

# Check mock_session for resumeCursor handling
rg 'resumeCursor' apps/harness/lib/harness/providers/mock_session.ex

# Check opencode_session for resumeCursor handling
rg 'resumeCursor' apps/harness/lib/harness/providers/opencode_session.ex

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 114


🏁 Script executed:

# Now verify the actual issue: let's trace through what normalize_resume_cursor does
# Get the exact implementation from session_manager
sed -n '270,330p' apps/harness/lib/harness/session_manager.ex

Repository: Ranvier-Technologies/t3code-OTP

Length of output: 1523


Type mismatch in maybe_inject_resume_cursor: Cursor session receives decoded map instead of JSON string and silently ignores resume.

The function decodes cursor_json from storage and passes the decoded map through normalize_resume_cursor. For Codex, this correctly extracts the raw string. However, for Cursor, the decoded map is returned as-is.

cursor_session.ex:379 guards with cursor when is_binary(cursor) before decoding. When passed a map instead of a JSON string, this guard fails and extract_resume_session_id/1 returns nil, causing resume to be silently ignored.

claude_session.ex handles this correctly via an explicit map pattern at line 1041, so it works with decoded data. Cursor does not have this fallback.

The fix: move JSON decoding into normalize_resume_cursor so each provider receives the same format it stores (raw JSON string for Cursor and Claude, extracted string for Codex).

Proposed fix
 def maybe_inject_resume_cursor(params, thread_id, provider) do
   case Harness.Storage.get_binding(thread_id) do
     %{provider: ^provider, resume_cursor_json: cursor_json} when is_binary(cursor_json) ->
-      case Jason.decode(cursor_json) do
-        {:ok, cursor} ->
-          normalized = normalize_resume_cursor(cursor)
-          Logger.info("Injecting resumeCursor from binding for #{thread_id} (#{provider})")
-          Map.put(params, "resumeCursor", normalized)
-
-        {:error, reason} ->
-          Logger.warning("Failed to decode binding cursor for #{thread_id}: #{inspect(reason)}")
-          params
-      end
+      case normalize_resume_cursor(provider, cursor_json) do
+        {:ok, normalized} ->
+          Logger.info("Injecting resumeCursor from binding for #{thread_id} (#{provider})")
+          Map.put(params, "resumeCursor", normalized)
+
+        :error ->
+          Logger.warning("Failed to decode binding cursor for #{thread_id}")
+          params
+      end

     _ ->
       # No binding, provider mismatch, or nil cursor — start fresh
       params
   end
 end

-# Codex stores {"threadId": "..."} but CodexSession expects the raw string.
-# Other providers store richer objects their session modules know how to parse.
-defp normalize_resume_cursor(%{"threadId" => tid}) when is_binary(tid), do: tid
-defp normalize_resume_cursor(cursor), do: cursor
+defp normalize_resume_cursor("codex", cursor_json) do
+  case Jason.decode(cursor_json) do
+    {:ok, %{"threadId" => tid}} when is_binary(tid) -> {:ok, tid}
+    _ -> :error
+  end
+end
+defp normalize_resume_cursor(_provider, cursor_json), do: {:ok, cursor_json}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/harness/lib/harness/session_manager.ex` around lines 270 - 293, The
decode of cursor_json should be moved out of maybe_inject_resume_cursor and into
normalize_resume_cursor so each provider receives the same raw-format value it
expects; change maybe_inject_resume_cursor to pass the stored cursor_json (not
the decoded map) into normalize_resume_cursor, and update
normalize_resume_cursor to perform Jason.decode(cursor_json) internally,
returning the extracted thread id string for Codex (like current %{"threadId" =>
tid} -> tid), or the raw decoded JSON/string for Cursor and others, and
log/return params on decode errors. Update references to
maybe_inject_resume_cursor and normalize_resume_cursor accordingly so Cursor's
cursor_session guard (cursor when is_binary(cursor)) receives a JSON string as
intended.

CI runs vitest, not bun. bun:test is not available in the GitHub
Actions runner.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ranvier2d2 ranvier2d2 merged commit 5552cc2 into main Mar 26, 2026
7 checks passed
@ranvier2d2 ranvier2d2 deleted the feat/resume-loop-observability branch March 26, 2026 01:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:M vouch:trusted PR author is trusted by repo permissions or the VOUCHED list.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant