Support streaming tool output and deduplication #343
timvisher-dd wants to merge 14 commits into xenodium:main
I appreciate the contribution, but please be mindful there's a lot being sent my way here, from long issue descriptions to chunky PRs. When I ask folks to file a feature request before sending a PR, I don't mean it to be just a checkbox item so folks send the PR soon after that. From CONTRIBUTING.org
There's a span of minutes between the feature request/issue and the PRs submitted for review. According to commit logs, I see this feature was all built today. Being a larger contribution, please use it for some time before sending it my way. Let it settle for some time. Use it. It takes a lot of concentration, energy, and time for me to parse all of this. I am maintaining this package, so I need to understand the contributions sent my way, and that takes significant effort. Please be mindful.
Force-pushed from 36ba031 to 5496a0d.
My intent was to respect your wish to use an Issue to discuss the feature, and the PR is opened as a Draft to indicate that you shouldn't review it yet unless you wish. It's meant as a both/and offering, not a signal that you need to read both. If you'd prefer, I can keep the draft changeset entirely in my local dev integration branches until you tell me explicitly that I may open a PR. But from my perspective, 'this code might answer this issue if we both agree that the issue is real and makes sense' is a useful thing, especially when I'm mostly opening the issue because I've already fixed it for myself.
The commit logs look that way only because I rebase and push every time I work, to be sure that I'm not in conflict with anything happening on trunk. I assure you that I'm not opening Issues or PRs without using the changes I'm making for at least a few days.
I wonder if there's some way to set a 'max in flight Issues/PRs' setting on the repo? I'm opening issues as I see and solve them for myself, and I have zero expectation that you'll spend any more time on them than is reasonable for you. If that means my Issues/PRs sit unwatched for months, then so be it. That's the nature of open source. Glob knows my own open source projects often go that long with zero attention paid by me.
That said, I'm unsure how better to indicate to you that something's available for you to look at. If we had a Discord somewhere, I'd still be sending you a message saying 'Hey, this thing maybe could use your attention', and then you'd have to decide whether you have any to spare at that time or not. Again, it's just kind of the nature of an open source community.
Finally, if I'm pushing too much into your attention, I could just work on my fork and integrate your work, and you could check my fork every now and then to see if anything interesting has landed. That's a reasonable stance for you to take as well. LMK how you'd like to proceed! Really appreciate all the effort you've put into this so far. :)
Force-pushed from 16a7ad0 to ae7e165.
Force-pushed from f46d7c1 to 8fd241e.
CI workflow runs byte-compilation and ERT tests on push/PR using GitHub Actions with deps checked out from timvisher-dd/acp.el-plus and xenodium/shell-maker. bin/test parses ci.yml with yq so local runs stay in sync with CI automatically. It symlinks local dependency checkouts into deps/ to match the CI layout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New library for sending desktop notifications from Emacs. In GUI mode on macOS, uses native UNUserNotificationCenter via a dynamic module (agent-shell-alert-mac.dylib) compiled JIT on first use (inspired by vterm). When compilation fails (e.g. missing Xcode CLI tools), a message recommends `xcode-select --install`.

In terminal mode, auto-detects the host terminal emulator and sends the appropriate OSC escape sequence:
- OSC 9: iTerm2, Ghostty, WezTerm, foot, mintty, ConEmu
- OSC 99: kitty
- OSC 777: urxvt, VTE-based terminals

Inside tmux, wraps in DCS passthrough (checking allow-passthrough first). Falls back to osascript on macOS when the terminal is unknown or tmux passthrough is not enabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
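As a rough illustration of the terminal-mode path in this commit message, the mapping from terminal to OSC sequence and the tmux DCS passthrough wrapping might look like the sketch below. The `sketch-` names, the lowercase terminal identifiers, and the "title: body" payload layout are assumptions made for illustration, not the actual agent-shell-alert code. Terminal detection and the allow-passthrough check are left out.

```emacs-lisp
(require 'cl-lib)

;; Hypothetical table: OSC code followed by the terminals that understand it,
;; mirroring the grouping in the commit message.
(defconst sketch-osc-families
  '((9 "iterm2" "ghostty" "wezterm" "foot" "mintty" "conemu")
    (99 "kitty")
    (777 "urxvt" "vte")))

(defun sketch-osc-code-for (terminal)
  "Return the OSC code for TERMINAL (a string), or nil when unknown."
  (car (cl-find-if (lambda (family)
                     (member (downcase terminal) (cdr family)))
                   sketch-osc-families)))

(defun sketch-osc-sequence (terminal title body)
  "Return a notification escape sequence for TERMINAL, or nil when unknown.
The payload layout for each code is an assumption of this sketch."
  (pcase (sketch-osc-code-for terminal)
    (9 (format "\e]9;%s: %s\a" title body))
    (99 (format "\e]99;;%s: %s\e\\" title body))
    (777 (format "\e]777;notify;%s;%s\a" title body))))

(defun sketch-tmux-wrap (sequence)
  "Wrap SEQUENCE in tmux DCS passthrough, doubling every ESC inside it."
  (concat "\ePtmux;"
          (replace-regexp-in-string "\e" "\e\e" sequence t t)
          "\e\\"))
```

Under these assumptions, something like `(send-string-to-terminal (sketch-tmux-wrap (sketch-osc-sequence "kitty" "agent-shell" "Turn finished")))` would emit the wrapped sequence. A nil result from `sketch-osc-sequence` is where a fallback such as osascript would take over.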
Follows the same pattern as acp.el: a boolean toggle (agent-shell-logging-enabled, off by default), a per-shell log buffer stored in state, and label+format-string logging. Adds log calls to idle notification start/cancel/fire for observability. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After each agent turn completes, a 30s timer starts. Any user input in the buffer cancels it; otherwise it fires a desktop notification via agent-shell-alert. The echo area message is only shown when the shell buffer is not the active buffer. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
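A minimal sketch of the idle-notification flow described in this commit message, under stated assumptions: the `sketch-` names are hypothetical, "user input" is approximated by any command run in the buffer (via a buffer-local `pre-command-hook`), and NOTIFY-FN stands in for the real desktop notification call.

```emacs-lisp
(defvar-local sketch-idle-timer nil
  "Pending idle-notification timer for this shell buffer, or nil.")

(defun sketch-idle-cancel ()
  "Cancel the pending idle notification in the current buffer, if any."
  (when (timerp sketch-idle-timer)
    (cancel-timer sketch-idle-timer))
  (setq sketch-idle-timer nil))

(defun sketch-idle-fire (buffer notify-fn)
  "Call NOTIFY-FN with BUFFER once the timer expires, if BUFFER is still live."
  (when (buffer-live-p buffer)
    (with-current-buffer buffer
      (setq sketch-idle-timer nil))
    (funcall notify-fn buffer)
    ;; Echo-area message only when the shell buffer is not the active one.
    (unless (eq buffer (window-buffer (selected-window)))
      (message "%s is waiting for input" (buffer-name buffer)))))

(defun sketch-idle-start (notify-fn &optional seconds)
  "Start the idle timer in the current buffer after an agent turn.
NOTIFY-FN receives the buffer if SECONDS (default 30) pass uncancelled."
  (sketch-idle-cancel)
  ;; Assumption: any command executed in this buffer counts as user input.
  (add-hook 'pre-command-hook #'sketch-idle-cancel nil t)
  (setq sketch-idle-timer
        (run-at-time (or seconds 30) nil
                     #'sketch-idle-fire (current-buffer) notify-fn)))
```

Passing the buffer and callback as explicit timer arguments, rather than capturing them in a closure, keeps the sketch independent of whether the file uses lexical binding.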
Validate that agent-shell-buffers and agent-shell-project-buffers reflect (buffer-list) ordering correctly: switch-to-buffer and select-window promote, with-current-buffer does not, bury-buffer demotes, and project filtering preserves order. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fails PRs that modify .el files or tests/ without also updating README.org, ensuring the soft-fork features list stays current. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add comprehensive ERT tests for agent-shell-usage.el covering notification updates, context indicator scaling/colors, compaction replay, token saving, and number formatting. The ACP server has a bug where model switches cause `used` to exceed `size` in session/update notifications. Rather than clamping, signal unreliable data: the indicator shows `?` with the warning face, and the formatted usage shows `(?)` instead of a bogus percentage. A regression test replays real observed traffic from the Opus 1M -> Sonnet 200k switch scenario. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
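The "signal unreliable data instead of clamping" choice could be sketched as below. The function name and the exact output format are hypothetical, and treating a non-positive `size` as unreliable is an extra assumption of this sketch.

```emacs-lisp
(defun sketch-usage-percentage (used size)
  "Return \"(N%)\" for USED out of SIZE tokens, or \"(?)\" when unreliable.
The data is treated as unreliable when USED exceeds SIZE."
  (if (and (numberp used) (numberp size)
           (> size 0)
           (<= 0 used size))
      (format "(%d%%)" (round (/ (* 100.0 used) size)))
    ;; No clamping to 100%: a bogus percentage would hide the server bug.
    "(?)"))
```

With these definitions, `(sketch-usage-percentage 50000 200000)` yields `"(25%)"`, while a `used` value left over from a larger context window, such as `(sketch-usage-percentage 300000 200000)`, yields `"(?)"`.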
Mirror the CI readme-updated job locally so bin/test catches missing README.org updates before pushing. Also fix the copy-paste error where both dep-missing messages said shell_maker_root. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document the two key requirements for contributing: run bin/test and keep the README features list current. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 668cbfe to 5dea5d0.
Force-pushed from f16181f to 083af54.
Force-pushed from 083af54 to 95686f7.
Fixes #342
See that issue for the full problem description, root cause analysis, and perf measurements.
Depends on xenodium/acp.el#15 (adds `:terminal-capability` and `:meta-capabilities` to `acp-make-initialize-request`).

Checklist

- `M-x checkdoc` and `M-x byte-compile-file`.

Implementation

New files
- `agent-shell-meta.el`: extractors for ACP `_meta` payloads:
  - `agent-shell--meta-lookup`: key lookup handling both symbol and string keys in alists.
  - `agent-shell--meta-find-tool-response`: walks any `_meta` namespace to find a `toolResponse` value (used by claude-agent-acp).
  - `agent-shell--tool-call-meta-response-text`: extracts stdout text from `_meta.*.toolResponse` in its various shapes (string, alist with `stdout` key, vector of content blocks).
  - `agent-shell--tool-call-terminal-output-data`: extracts `_meta.terminal_output.data` (used by codex-acp for incremental streaming chunks).
- `agent-shell-streaming.el`: streaming tool call update handler:
  - `agent-shell--tool-call-normalize-output`: strips markdown fences, strips `<persisted-output>` XML tags (rendering the preview with `font-lock-comment-face`), and ensures trailing newlines.
  - `agent-shell--append-tool-call-output`: accumulates streamed output in the state's `:tool-calls` hash under an `:accumulated` key per tool call ID.
  - `agent-shell--handle-tool-call-update-streaming`: the main handler, replacing the inline `tool_call_update` block in `agent-shell.el`. Three branches:
    - When `_meta.terminal_output.data` is present: normalize the chunk, accumulate it, and immediately append it to the fragment body for live streaming.
    - When `_meta.*.toolResponse` is present: normalize and accumulate silently (rendered only on final update to avoid duplication).
    - On a final status (`"completed"` or `"failed"`): render accumulated output (or fall back to `content` text), log to transcript, clean up permission dialogs, and apply title/label updates.
  - `agent-shell--mark-tool-calls-cancelled`: marks all in-progress tool calls as cancelled (called from `agent-shell-interrupt`).

Changes to `agent-shell.el`

- `(require 'agent-shell-streaming)` added.
- `tool_call_update` rendering block (lines 1290-1346 on main) is replaced by a single call to `agent-shell--handle-tool-call-update-streaming`. The metadata save (title/description/command/raw-input/diff) remains inline before the handler call.
- `initialize` request now passes `:terminal-capability t` and `:meta-capabilities '((terminal_output . t))` to `acp-make-initialize-request`.
- `agent-shell-interrupt` calls `agent-shell--mark-tool-calls-cancelled` after sending the cancel notification.
- `shell-maker-define-major-mode` call passes `'agent-shell-mode-map` (quoted symbol) instead of the bare variable.

Tests
7 new tests in `tests/agent-shell-streaming-tests.el`:

- `agent-shell--tool-call-meta-response-text-test`: extracts text from `_meta.claudeCode.toolResponse.stdout`.
- `agent-shell--tool-call-normalize-output-test`: strips fences and ensures trailing newline.
- `agent-shell--tool-call-normalize-output-persisted-output-test`: strips `<persisted-output>` tags.
- `agent-shell--tool-call-update-writes-output-test`: verifies accumulated output is written to the fragment body.
- `agent-shell--tool-call-meta-response-no-duplication-test`: meta response text is rendered once, not duplicated with content.
- `agent-shell-initialize-request-meta-capabilities-test`: the initialize request includes `_meta.terminal_output`.
- `agent-shell--tool-call-terminal-output-data-streaming-test`: codex-style `_meta.terminal_output.data` chunks are accumulated and rendered incrementally.
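
For readers who want a feel for the `_meta` extraction and accumulation described under Implementation, here is a rough sketch. It is not the code in this PR: the `sketch-` names are hypothetical, the `text` key on content blocks and the newline joiner are assumptions, and each tool call entry is assumed to be a plist stored in a hash table keyed by tool call ID.

```emacs-lisp
(require 'seq)

(defun sketch-meta-lookup (alist key)
  "Return the value of KEY (a symbol) in ALIST, matching symbol or string keys."
  (when (listp alist)
    (or (alist-get key alist)
        (alist-get (symbol-name key) alist nil nil #'equal))))

(defun sketch-meta-find-tool-response (meta)
  "Return the first `toolResponse' found under any namespace of META."
  (when (listp meta)
    (seq-some (lambda (entry)
                (and (consp entry)
                     (sketch-meta-lookup (cdr entry) 'toolResponse)))
              meta)))

(defun sketch-meta-response-text (meta)
  "Extract text from META's `toolResponse' in any supported shape, or nil."
  (let ((response (sketch-meta-find-tool-response meta)))
    (cond
     ;; Shape 1: the response is already a string.
     ((stringp response) response)
     ;; Shape 2: a vector of content blocks; join their (assumed) `text' fields.
     ((vectorp response)
      (mapconcat (lambda (block) (or (sketch-meta-lookup block 'text) ""))
                 response "\n"))
     ;; Shape 3: an alist with a `stdout' key.
     ((listp response)
      (let ((stdout (sketch-meta-lookup response 'stdout)))
        (and (stringp stdout) stdout))))))

(defun sketch-append-output (tool-calls id chunk)
  "Append CHUNK to the text accumulated for tool call ID in hash TOOL-CALLS.
Return the accumulated text so far."
  (let ((entry (gethash id tool-calls)))
    (puthash id
             (plist-put entry :accumulated
                        (concat (plist-get entry :accumulated) chunk))
             tool-calls)
    (plist-get (gethash id tool-calls) :accumulated)))
```

In this sketch, a streaming handler would call `sketch-append-output` for every chunk and render the accumulated text once on the final update, which is how the description above avoids showing meta response text twice.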