68 changes: 68 additions & 0 deletions .agents/commands/live-validate.md
@@ -0,0 +1,68 @@
# Live validation of agent-shell rendering

Run a live agent-shell session in batch mode and verify the buffer output.
This exercises the full rendering pipeline with real ACP traffic — the only
way to catch ordering, marker, and streaming bugs that unit tests miss.

## Prerequisites

- `ANTHROPIC_API_KEY` must be available (via `op run` / 1Password)
- `timvisher_emacs_agent_shell` must be on PATH
- Dependencies (acp.el-plus, shell-maker) in sibling worktrees or
overridden via env vars

## How to run

```bash
cd "$(git rev-parse --show-toplevel)"
timvisher_agent_shell_checkout=. \
timvisher_emacs_agent_shell claude --batch \
1>/tmp/agent-shell-live-stdout.log \
2>/tmp/agent-shell-live-stderr.log
```

Stderr shows heartbeat lines every 30 seconds. Stdout contains the
full buffer dump once the agent turn completes.

## What to check in the output

1. **Fragment ordering**: tool call drawers should appear in
chronological order (the order the agent invoked them), not
reversed. Look for `▶` lines — their sequence should match the
logical execution order.

2. **No duplicate content**: each tool call output should appear
exactly once. Watch for repeated blocks of identical text.

3. **Prompt position**: the prompt line (`agent-shell>`) should
appear at the very end of the buffer, after all fragments.

4. **Notices placement**: `[hook-trace]` and other notice lines
should appear in a `Notices` section, not interleaved with tool
call fragments.
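
Checks 2 and 3 above can be partly automated against the stdout log. The following is a rough sketch, not part of agent-shell: the function names are hypothetical, the prompt text is assumed to be the literal `agent-shell>` shown above, and the 40-character threshold for "suspiciously repeated" lines is an arbitrary choice that may need tuning.

```bash
#!/usr/bin/env bash
# Heuristic helpers for inspecting a buffer dump (hypothetical, for illustration).

# Succeeds when the last non-blank line of the dump starts with the prompt.
# $1: path to the stdout log; $2: prompt prefix (assumed default: agent-shell>).
prompt_is_last() {
  local log="$1" prompt="${2:-agent-shell>}"
  local last
  last=$(grep -v '^[[:space:]]*$' "$log" | tail -n 1)
  [[ "$last" == "$prompt"* ]]
}

# Prints every line of 40 or more characters that occurs more than once,
# a rough signal that a tool call output was rendered twice.
repeated_long_lines() {
  local log="$1"
  awk 'length($0) >= 40 { seen[$0]++ }
       END { for (l in seen) if (seen[l] > 1) print l }' "$log"
}
```

For example, `prompt_is_last /tmp/agent-shell-live-stdout.log && echo "prompt ok"` reports on check 3, and any output from `repeated_long_lines /tmp/agent-shell-live-stdout.log` is worth reading by hand, since legitimately repeated lines (such as identical tool invocations) will show up as well.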

## Enabling invariant checking

To run with runtime invariant assertions (catches corruption as it
happens rather than after the fact):

```elisp
;; Add to your init or eval before the session starts:
(setq agent-shell-invariants-enabled t)
```

When an invariant fires, a `*agent-shell invariant*` buffer pops up
with a debug bundle and recommended analysis prompt.

## Quick validation one-liner

```bash
cd "$(git rev-parse --show-toplevel)" && \
timvisher_agent_shell_checkout=. \
timvisher_emacs_agent_shell claude --batch \
1>/tmp/agent-shell-live.log 2>&1 && \
grep -n '▶' /tmp/agent-shell-live.log | head -20
```

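If the `▶` lines are in logical order and the exit code is 0, the
rendering pipeline passes this quick check. Duplicate content, prompt
position, and notices placement still need a look at the full log.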
1 change: 1 addition & 0 deletions .claude
1 change: 1 addition & 0 deletions .codex
1 change: 1 addition & 0 deletions .gemini
176 changes: 176 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,176 @@
name: CI

on:
  push:
    branches: [main, dev]
  pull_request:
    branches: [main]

jobs:
  readme-updated:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Check README.org updated when code changes
        run: |
          base="${{ github.event.pull_request.base.sha }}"
          head="${{ github.event.pull_request.head.sha }}"
          changed_files=$(git diff --name-only "$base" "$head")

          has_code_changes=false
          for f in $changed_files; do
            case "$f" in
              *.el|tests/*) has_code_changes=true; break ;;
            esac
          done

          if "$has_code_changes"; then
            if ! echo "$changed_files" | grep -q '^README\.org$'; then
              echo "::error::Code or test files changed but README.org was not updated."
              echo "Please update the soft-fork features list in README.org."
              exit 1
            fi
          fi

  agent-symlinks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Verify agent config symlinks
        run: |
          ok=true
          for dir in .claude .codex .gemini; do
            # `|| true` keeps the default `bash -e` step shell from exiting
            # on a failed readlink before the error below is reported.
            target=$(readlink "${dir}" 2>/dev/null || true)
            if [[ "${target}" != ".agents" ]]; then
              echo "::error::${dir} should symlink to .agents but points to '${target:-<missing>}'"
              ok=false
            fi
          done
          for md in CLAUDE.md CODEX.md GEMINI.md; do
            target=$(readlink "${md}" 2>/dev/null || true)
            if [[ "${target}" != "AGENTS.md" ]]; then
              echo "::error::${md} should symlink to AGENTS.md but points to '${target:-<missing>}'"
              ok=false
            fi
          done
          if ! [[ -d .agents/commands ]]; then
            echo "::error::.agents/commands/ directory missing"
            ok=false
          fi
          if [[ "${ok}" != "true" ]]; then
            exit 1
          fi
          echo "All agent config symlinks verified."

  dependency-dag:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Verify require graph is a DAG (no cycles)
        run: |
          # Build the set of project-internal modules from *.el filenames.
          declare -A project_modules
          for f in *.el; do
            mod="${f%.el}"
            project_modules["${mod}"]=1
          done

          # Parse (require 'foo) from each file and build an adjacency list.
          # Only track edges where both ends are project-internal.
          declare -A edges  # edges["a"]="b c" means a requires b and c
          for f in *.el; do
            mod="${f%.el}"
            deps=""
            while IFS= read -r dep; do
              if [[ -n "${project_modules[$dep]+x}" ]]; then
                deps="${deps} ${dep}"
              fi
            done < <(sed -n "s/^.*(require '\\([a-zA-Z0-9_-]*\\)).*/\\1/p" "$f")
            edges["${mod}"]="${deps}"
          done

          # DFS cycle detection.
          declare -A color  # white=unvisited, gray=in-stack, black=done
          found_cycle=""
          cycle_path=""

          dfs() {
            local node="$1"
            local path="$2"
            local neighbor  # keep the loop variable private to each recursion level
            color["${node}"]="gray"
            for neighbor in ${edges["${node}"]}; do
              if [[ "${color[$neighbor]:-white}" == "gray" ]]; then
                found_cycle=1
                cycle_path="${path} -> ${neighbor}"
                return
              fi
              if [[ "${color[$neighbor]:-white}" == "white" ]]; then
                dfs "${neighbor}" "${path} -> ${neighbor}"
                if [[ -n "${found_cycle}" ]]; then
                  return
                fi
              fi
            done
            color["${node}"]="black"
          }

          for mod in "${!project_modules[@]}"; do
            if [[ "${color[$mod]:-white}" == "white" ]]; then
              dfs "${mod}" "${mod}"
              if [[ -n "${found_cycle}" ]]; then
                echo "::error::Dependency cycle detected: ${cycle_path}"
                exit 1
              fi
            fi
          done
          echo "Dependency graph is a DAG — no cycles found."

  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/checkout@v4
        with:
          repository: timvisher-dd/acp.el-plus
          path: deps/acp.el

      - uses: actions/checkout@v4
        with:
          repository: xenodium/shell-maker
          path: deps/shell-maker

      - uses: purcell/setup-emacs@master
        with:
          version: 29.4

      - name: Remove stale .elc files
        run: find . deps -follow -name '*.elc' -print0 | xargs -0 rm -f

      - name: Byte-compile
        run: |
          compile_files=()
          for f in *.el; do
            case "$f" in x.*|y.*|z.*) ;; *) compile_files+=("$f") ;; esac
          done
          emacs -Q --batch \
            -L . -L deps/acp.el -L deps/shell-maker \
            -f batch-byte-compile \
            "${compile_files[@]}"

      - name: Run ERT tests
        run: |
          test_args=()
          for f in tests/*-tests.el; do
            test_args+=(-l "$f")
          done
          emacs -Q --batch \
            -L . -L deps/acp.el -L deps/shell-maker -L tests \
            "${test_args[@]}" \
            -f ert-run-tests-batch-and-exit
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
/.agent-shell/
/deps/

*.elc
22 changes: 22 additions & 0 deletions AGENTS.md
@@ -17,3 +17,25 @@ When contributing:
## Contributing

This is an Emacs Lisp project. See [CONTRIBUTING.org](CONTRIBUTING.org) for style guidelines, code checks, and testing. Please adhere to these guidelines.

## Development workflow

When adding or changing features:

1. **Run `bin/test`.** Set `acp_root` and `shell_maker_root` if the
deps aren't in sibling worktrees. This runs byte-compilation, the ERT
tests, and the dependency DAG check, and verifies that `README.org`
was updated when code changed.
2. **Keep the README features list current.** The "Features on top of
agent-shell" section in `README.org` must be updated whenever code
changes land. Both `bin/test` and CI enforce this — changes to `.el`
or `tests/` files without a corresponding `README.org` update will
fail.
3. **Live-validate rendering changes.** For changes to the rendering
pipeline (fragment insertion, streaming, markers, UI), run a live
batch session to verify fragment ordering and buffer integrity.
See `.agents/commands/live-validate.md` for details. The key command:
```bash
timvisher_agent_shell_checkout=. timvisher_emacs_agent_shell claude --batch \
1>/tmp/agent-shell-live.log 2>&1
```
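
The README rule in step 2 can be sketched as a small shell function. This is an illustration of the rule only, not the contents of `bin/test`; the function name `readme_check` is hypothetical, and it assumes the changed paths arrive on stdin, one per line.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the README rule; bin/test may implement it differently.

# Reads changed file paths on stdin, one per line. Returns 1 when an .el or
# tests/ file changed without README.org appearing in the same change set.
readme_check() {
  local code_changed=false readme_changed=false f
  while IFS= read -r f; do
    case "$f" in
      README.org) readme_changed=true ;;
      *.el|tests/*) code_changed=true ;;
    esac
  done
  if "$code_changed" && ! "$readme_changed"; then
    echo "Code or test files changed but README.org was not updated." >&2
    return 1
  fi
}
```

A typical use would be `git diff --name-only main...HEAD | readme_check`, where `main` as the comparison base is an assumption about the local branch layout.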
1 change: 1 addition & 0 deletions CODEX.md
17 changes: 17 additions & 0 deletions CONTRIBUTING.org
@@ -231,3 +231,20 @@ Tests live under the tests directory:
Opening any file under the =tests= directory will load the =agent-shell-run-all-tests= command.

Run tests with =M-x agent-shell-run-all-tests=.

*** From the command line

=bin/test= runs the full ERT suite in batch mode. By default it
expects =acp.el= and =shell-maker= to be checked out as sibling
worktrees (e.g. =…/acp.el/main= and =…/shell-maker/main= next to
=…/agent-shell/main=). Override the paths with environment variables
if your layout differs:

#+begin_src bash
acp_root=~/path/to/acp.el \
shell_maker_root=~/path/to/shell-maker \
bin/test
#+end_src

The script validates that both dependencies are readable and exits
with a descriptive error if either is missing.
22 changes: 21 additions & 1 deletion README.org
@@ -1,5 +1,25 @@
#+TITLE: Emacs Agent Shell
#+AUTHOR: Álvaro Ramírez
#+AUTHOR: Tim Visher

A soft fork of [[https://github.com/xenodium/agent-shell][agent-shell]] with extra features on top.

* Features on top of agent-shell

- CI workflow and local test runner ([[https://github.com/timvisher-dd/agent-shell-plus/pull/1][#1]], [[https://github.com/timvisher-dd/agent-shell-plus/pull/6][#6]])
- Byte-compilation of all =.el= files ([[https://github.com/timvisher-dd/agent-shell-plus/pull/1][#1]])
- ERT test suite ([[https://github.com/timvisher-dd/agent-shell-plus/pull/1][#1]])
- README update check when code changes ([[https://github.com/timvisher-dd/agent-shell-plus/pull/4][#4]])
- Dependency DAG check (=require= graph must be acyclic) ([[https://github.com/timvisher-dd/agent-shell-plus/pull/7][#7]])
- Desktop notifications when the prompt is idle and waiting for input ([[https://github.com/timvisher-dd/agent-shell-plus/pull/2][#2]], [[https://github.com/timvisher-dd/agent-shell-plus/pull/8][#8]])
- Per-shell debug logging infrastructure ([[https://github.com/timvisher-dd/agent-shell-plus/pull/2][#2]])
- Regression tests for shell buffer selection ordering ([[https://github.com/timvisher-dd/agent-shell-plus/pull/3][#3]])
- CI check that README.org is updated when code changes ([[https://github.com/timvisher-dd/agent-shell-plus/pull/4][#4]])
- Usage tests and defense against ACP =used > size= bug ([[https://github.com/timvisher-dd/agent-shell-plus/pull/5][#5]])
- Streaming tool output with dedup: advertise =_meta.terminal_output= capability, handle incremental chunks from codex-acp and batch results from claude-agent-acp, strip =<persisted-output>= tags, and fix O(n²) rendering ([[https://github.com/timvisher-dd/agent-shell-plus/pull/7][#7]])
- DWIM context insertion: inserted context lands at the prompt and fragment updates no longer drag process-mark past it ([[https://github.com/timvisher-dd/agent-shell-plus/pull/7][#7]])
- Runtime buffer invariant checking with event tracing and violation debug bundles ([[https://github.com/timvisher-dd/agent-shell-plus/pull/7][#7]])

-----

[[https://melpa.org/#/agent-shell][file:https://melpa.org/packages/agent-shell-badge.svg]]
