Support streaming tool output and deduplication #343

Draft
timvisher-dd wants to merge 14 commits into xenodium:main from timvisher-dd:streaming-dedup

timvisher-dd (Contributor) commented Feb 26, 2026

Fixes #342

See that issue for the full problem description, root cause analysis, and perf measurements.

Depends on xenodium/acp.el#15 (adds :terminal-capability and :meta-capabilities to acp-make-initialize-request).

Checklist

  • I agree to communicate (PR description and comments) with the author myself (not AI-generated).
  • I've reviewed all code in PR myself and will vouch for its quality.
  • I've read and followed the Contributing guidelines.
  • I've filed a feature request/discussion for a new feature.
  • I've added tests where applicable.
  • I've run M-x checkdoc and M-x byte-compile-file.

Implementation

New files

agent-shell-meta.el — extractors for ACP _meta payloads:

  • agent-shell--meta-lookup — key lookup handling both symbol and string keys in alists.
  • agent-shell--meta-find-tool-response — walks any _meta namespace to find a toolResponse value (used by claude-agent-acp).
  • agent-shell--tool-call-meta-response-text — extracts stdout text from _meta.*.toolResponse in its various shapes (string, alist with stdout key, vector of content blocks).
  • agent-shell--tool-call-terminal-output-data — extracts _meta.terminal_output.data (used by codex-acp for incremental streaming chunks).
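
As a rough illustration of the extractors listed above, the following sketch shows one way to look up keys that may arrive as symbols or strings, and to walk `_meta` for either payload shape. The `my/` names are hypothetical and are not the functions in agent-shell-meta.el. The example payloads are assumptions based only on the shapes described in this PR.

```emacs-lisp
;; Hypothetical sketch; not the code from agent-shell-meta.el.
(require 'seq)

(defun my/meta-lookup (alist key)
  "Return the value for KEY in ALIST, matching symbol or string keys.
KEY is a symbol; an entry keyed by the equivalent string also matches."
  (when (listp alist)
    (let ((name (symbol-name key)))
      (cdr (seq-find (lambda (entry)
                       (and (consp entry)
                            (or (eq (car entry) key)
                                (equal (car entry) name))))
                     alist)))))

(defun my/terminal-output-data (update)
  "Return _meta.terminal_output.data from UPDATE, or nil when absent."
  (let* ((meta (my/meta-lookup update '_meta))
         (terminal (my/meta-lookup meta 'terminal_output)))
    (my/meta-lookup terminal 'data)))

(defun my/meta-find-tool-response (meta)
  "Return the first toolResponse found under any namespace in META."
  (seq-some (lambda (entry)
              (and (consp entry)
                   (my/meta-lookup (cdr entry) 'toolResponse)))
            meta))
```

With this sketch, `(my/terminal-output-data '(("_meta" . (("terminal_output" . (("data" . "x")))))))` and the symbol-keyed equivalent both return the chunk, and an update without `_meta` returns nil.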

agent-shell-streaming.el — streaming tool call update handler:

  • agent-shell--tool-call-normalize-output — strips markdown fences, strips <persisted-output> XML tags (rendering the preview with font-lock-comment-face), and ensures trailing newlines.
  • agent-shell--append-tool-call-output — accumulates streamed output in the state's :tool-calls hash under an :accumulated key per tool call ID.
  • agent-shell--handle-tool-call-update-streaming — the main handler, replacing the inline tool_call_update block in agent-shell.el. Three branches:
    1. Terminal data (_meta.terminal_output.data): normalize the chunk, accumulate it, and immediately append it to the fragment body for live streaming.
    2. Meta response (_meta.*.toolResponse): normalize and accumulate silently (rendered only on final update to avoid duplication).
    3. Final update (status is "completed" or "failed"): render accumulated output (or fall back to content text), log to transcript, clean up permission dialogs, and apply title/label updates.
  • agent-shell--mark-tool-calls-cancelled — marks all in-progress tool calls as cancelled (called from agent-shell-interrupt).
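
A minimal sketch of the normalize-then-accumulate idea described above, under stated assumptions: the `my/` functions are hypothetical, the per-call entry is assumed to be a plist, and `<persisted-output>` handling is omitted. It also trims surrounding whitespace, which the real normalizer may not do.

```emacs-lisp
;; Hypothetical sketch; not the code from agent-shell-streaming.el.
(require 'subr-x)

(defun my/normalize-output (text)
  "Return TEXT without a wrapping markdown fence, ending in one newline.
Return the empty string when nothing but whitespace remains."
  (let* ((lines (split-string (string-trim text) "\n"))
         ;; Treat TEXT as fenced only when both fence lines are present.
         (fenced (and (cdr lines)
                      (string-prefix-p "```" (car lines))
                      (string= "```" (car (last lines)))))
         (body (string-join (if fenced (butlast (cdr lines)) lines) "\n")))
    (if (string-empty-p body) "" (concat body "\n"))))

(defun my/append-output (tool-calls id chunk)
  "Append CHUNK to the text accumulated for tool call ID in TOOL-CALLS.
TOOL-CALLS is a hash table (test `equal') mapping IDs to plists.
Return the accumulated text."
  (let* ((entry (gethash id tool-calls))
         (text (concat (plist-get entry :accumulated) chunk)))
    (puthash id (plist-put entry :accumulated text) tool-calls)
    text))
```

In this sketch, a streaming branch would call `my/append-output` and also append the chunk to the visible fragment, while a silent branch would call it alone and leave rendering to the final update, so the text is shown only once.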

Changes to agent-shell.el

  • (require 'agent-shell-streaming) added.
  • The ~50-line inline tool_call_update rendering block (lines 1290-1346 on main) is replaced by a single call to agent-shell--handle-tool-call-update-streaming. The metadata save (title/description/command/raw-input/diff) remains inline before the handler call.
  • The initialize request now passes :terminal-capability t and :meta-capabilities '((terminal_output . t)) to acp-make-initialize-request.
  • agent-shell-interrupt calls agent-shell--mark-tool-calls-cancelled after sending the cancel notification.
  • shell-maker-define-major-mode call passes 'agent-shell-mode-map (quoted symbol) instead of the bare variable.

Tests

7 new tests in tests/agent-shell-streaming-tests.el:

  • agent-shell--tool-call-meta-response-text-test — extracts text from _meta.claudeCode.toolResponse.stdout.
  • agent-shell--tool-call-normalize-output-test — strips fences and ensures trailing newline.
  • agent-shell--tool-call-normalize-output-persisted-output-test — strips <persisted-output> tags.
  • agent-shell--tool-call-update-writes-output-test — verifies accumulated output is written to the fragment body.
  • agent-shell--tool-call-meta-response-no-duplication-test — meta response text is rendered once, not duplicated with content.
  • agent-shell-initialize-request-meta-capabilities-test — the initialize request includes _meta.terminal_output.
  • agent-shell--tool-call-terminal-output-data-streaming-test — codex-style _meta.terminal_output.data chunks are accumulated and rendered incrementally.

timvisher-dd changed the title from "# Support streaming tool output and deduplication" to "Support streaming tool output and deduplication" on Feb 26, 2026

xenodium (Owner) commented

I appreciate the contribution, but please be mindful there's a lot being sent my way here, from long issue descriptions to chunky PRs. When I ask folks to file a feature request before sending a PR, I don't mean it to be just a checkbox item so folks send the PR soon after that.

From CONTRIBUTING.org:

> Before implementing new features, please file a feature request first to discuss the proposal.

There's a span of minutes between the feature request/issue and the PRs submitted for review.

According to commit logs, I see this feature was all built today. Being a larger contribution, please use it for some time before sending it my way. Let it settle for some time. Use it. It takes me a lot of concentration, energy, and time to parse all of this. I am maintaining this package, so I need to understand the contributions sent my way and that takes a significant effort. Please be mindful.

timvisher-dd (Contributor, Author) commented

> I appreciate the contribution, but please be mindful there's a lot being sent my way here, from long issue descriptions to chunky PRs. When I ask folks to file a feature request before sending a PR, I don't mean it to be just a checkbox item so folks send the PR soon after that.
>
> From CONTRIBUTING.org
>
> > Before implementing new features, please file a feature request first to discuss the proposal.
>
> There's a span of minutes between the feature request/issue and the PRs submitted for review.

So my intent was to be respectful of your wish to use an Issue to discuss the feature and the PR is opened in Draft to indicate that you shouldn't review it yet unless you wish. It's meant to be a both/and offering rather than to indicate that you should feel the need to read both.

If you'd prefer, I can keep the draft changeset entirely in my local dev integration branches until you tell me explicitly that I may open a PR. But from my perspective, the 'this code might answer this issue if we both agree that the issue is real and makes sense' offering is a useful thing, especially when I'm mostly opening the issue because I've fixed it for myself.

> According to commit logs, I see this feature was all built today. Being a larger contribution, please use it for some time before sending it my way. Let it settle for some time. Use it.

The commit logs show that only because I rebase and push every time I work, to be sure that I'm not in conflict with anything happening on trunk. I assure you that I'm not opening Issues or PRs without using the changes I'm making for at least a few days.

> It takes me a lot of concentration, energy, and time to parse all of this. I am maintaining this package, so I need to understand the contributions sent my way and that takes a significant effort. Please be mindful.

I wonder if there's some way to set a 'max in flight Issues/PRs' setting on the repo? To me I'm opening issues as I see and solve them for myself and I have zero expectation that you'll spend any more time than is reasonable for you on it. If that means my Issues/PRs sit unwatched for months then so be it. That's the nature of open source. Glob knows my own open source projects often will go that long with zero attention paid by me.

That said I'm unsure how better to indicate to you that something's available for you to look at. Like if we had a discord somewhere I'd still be sending you a message 'Hey this thing maybe could use your attention' and then you'd have to decide whether you have any to spare at that time or not. Again it's just kind of the nature of an open source community.

I guess finally if I'm pushing too much into your attention we could go where I just work on my fork and integrate your work and you can check my fork every now and then to see if anything interesting has landed. That's a reasonable stance for you to take as well.

LMK how you'd like to proceed! Really appreciate all the effort you've put into this so far. :)

timvisher-dd force-pushed the streaming-dedup branch 3 times, most recently from 16a7ad0 to ae7e165, on March 3, 2026 20:21
timvisher-dd force-pushed the streaming-dedup branch 4 times, most recently from f46d7c1 to 8fd241e, on March 15, 2026 15:46
timvisher-dd and others added 13 commits March 15, 2026 14:49

CI workflow runs byte-compilation and ERT tests on push/PR using
GitHub Actions with deps checked out from timvisher-dd/acp.el-plus
and xenodium/shell-maker.

bin/test parses ci.yml with yq so local runs stay in sync with CI
automatically. It symlinks local dependency checkouts into deps/ to
match the CI layout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

New library for sending desktop notifications from Emacs.

In GUI mode on macOS, uses native UNUserNotificationCenter via a
dynamic module (agent-shell-alert-mac.dylib) compiled JIT on first
use (inspired by vterm). When compilation fails (e.g. missing Xcode
CLI tools), a message recommends `xcode-select --install`.

In terminal mode, auto-detects the host terminal emulator and sends
the appropriate OSC escape sequence:
- OSC 9: iTerm2, Ghostty, WezTerm, foot, mintty, ConEmu
- OSC 99: kitty
- OSC 777: urxvt, VTE-based terminals

Inside tmux, wraps in DCS passthrough (checking allow-passthrough
first). Falls back to osascript on macOS when the terminal is
unknown or tmux passthrough is not enabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
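
To make the escape sequences in the commit message above concrete, here is a hedged sketch of how the three OSC notification forms and the tmux DCS passthrough wrapper could be built as strings. The `my/` functions are hypothetical and are not the agent-shell-alert API. Terminal detection and the `allow-passthrough` check are left out, and the "title: body" layout for OSC 9 and OSC 99 is an assumption, since those forms carry a single message in their simplest usage.

```emacs-lisp
;; Hypothetical sketch; not the code from agent-shell-alert.
(defun my/osc-notification (protocol title body)
  "Return the escape sequence announcing TITLE and BODY via PROTOCOL.
PROTOCOL is one of the symbols `osc9', `osc99', or `osc777'."
  (pcase protocol
    ;; OSC 9: ESC ] 9 ; message BEL
    ('osc9   (format "\e]9;%s: %s\a" title body))
    ;; OSC 99 (kitty): ESC ] 99 ; metadata ; payload ESC \
    ('osc99  (format "\e]99;;%s: %s\e\\" title body))
    ;; OSC 777: ESC ] 777 ; notify ; title ; body BEL
    ('osc777 (format "\e]777;notify;%s;%s\a" title body))
    (_ (error "Unknown notification protocol: %S" protocol))))

(defun my/tmux-wrap (sequence)
  "Wrap SEQUENCE in tmux DCS passthrough, doubling each ESC inside it."
  (concat "\ePtmux;"
          (replace-regexp-in-string "\e" "\e\e" sequence t t)
          "\e\\"))
```

In a terminal frame the resulting string could be written with `send-string-to-terminal`. Note that a semicolon inside TITLE would split the OSC 777 fields, so a real implementation would need to sanitize its inputs.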
Follows the same pattern as acp.el: a boolean toggle
(agent-shell-logging-enabled, off by default), a per-shell log
buffer stored in state, and label+format-string logging. Adds log
calls to idle notification start/cancel/fire for observability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

After each agent turn completes, a 30s timer starts. Any user input
in the buffer cancels it; otherwise it fires a desktop notification
via agent-shell-alert. The echo area message is only shown when the
shell buffer is not the active buffer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
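
The start/cancel/fire cycle described in the commit message above could look roughly like the sketch below. It is an assumption-laden illustration: the `my/` names are hypothetical, and using a buffer-local `post-command-hook` to detect user input is a guess at the mechanism, not something the commit message states.

```emacs-lisp
;; Hypothetical sketch; not the code from agent-shell.
(defvar-local my/idle-notify-timer nil
  "Pending notification timer for this buffer, or nil.")

(defun my/idle-notify-cancel ()
  "Cancel the pending notification timer in the current buffer, if any."
  (when (timerp my/idle-notify-timer)
    (cancel-timer my/idle-notify-timer))
  (setq my/idle-notify-timer nil)
  (remove-hook 'post-command-hook #'my/idle-notify-cancel t))

(defun my/idle-notify-fire (buffer notify-fn)
  "Clear the timer state in BUFFER, then call NOTIFY-FN there."
  (when (buffer-live-p buffer)
    (with-current-buffer buffer
      (my/idle-notify-cancel)
      (funcall notify-fn))))

(defun my/idle-notify-start (notify-fn &optional seconds)
  "Call NOTIFY-FN after SECONDS (default 30) unless input arrives first."
  (my/idle-notify-cancel)
  (setq my/idle-notify-timer
        (run-at-time (or seconds 30) nil
                     #'my/idle-notify-fire (current-buffer) notify-fn))
  ;; Any command run in this buffer counts as user input (an assumption).
  (add-hook 'post-command-hook #'my/idle-notify-cancel nil t))
```

Calling `my/idle-notify-start` at the end of a turn arms the timer. The first command executed in the buffer disarms it, and restarting always cancels the previous timer first, so at most one notification is pending per buffer.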
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Validate that agent-shell-buffers and agent-shell-project-buffers
reflect (buffer-list) ordering correctly: switch-to-buffer and
select-window promote, with-current-buffer does not, bury-buffer
demotes, and project filtering preserves order.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fails PRs that modify .el files or tests/ without also updating
README.org, ensuring the soft-fork features list stays current.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add comprehensive ERT tests for agent-shell-usage.el covering
notification updates, context indicator scaling/colors, compaction
replay, token saving, and number formatting.

The ACP server has a bug where model switches cause used to exceed
size in session/update notifications. Rather than clamping, signal
unreliable data: indicator shows ? with warning face, format shows
(?) instead of a bogus percentage. A regression test replays real
observed traffic from the Opus 1M -> Sonnet 200k switch scenario.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
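
The "signal unreliable data rather than clamp" choice in the commit message above can be sketched as follows. The function name, output layout, and the token counts in the usage note are illustrative assumptions, not values taken from agent-shell-usage.el or from the replayed traffic.

```emacs-lisp
;; Hypothetical sketch; not the code from agent-shell-usage.el.
(defun my/format-context-usage (used size)
  "Return a usage string for USED of SIZE tokens.
When USED exceeds SIZE, or either value is unusable, the data is
treated as unreliable and \"(?)\" replaces the percentage instead of
a clamped value."
  (if (and (numberp used) (numberp size) (> size 0) (<= 0 used size))
      (format "%d/%d (%d%%)" used size
              (round (* 100.0 (/ (float used) size))))
    (format "%s/%s (?)" used size)))
```

For example, `(my/format-context-usage 50000 200000)` yields a percentage, while `(my/format-context-usage 450000 200000)` yields the `(?)` form. Clamping the second case to 100% would hide the fact that the server reported inconsistent numbers.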
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Mirror the CI readme-updated job locally so bin/test catches missing
README.org updates before pushing. Also fix the copy-paste error where
both dep-missing messages said shell_maker_root.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Document the two key requirements for contributing: run bin/test and
keep the README features list current.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>