Token / cost visibility in /agents by TroyHernandez · Pull Request #79 · cornball-ai/corteza

TroyHernandez · 2026-05-14T02:21:24Z

Summary

`/agents` (and the underlying `subagent_list()` / `format_subagent_list()`) now show model, age, live context size, cumulative input/output tokens, and cumulative cost per subagent. Dense single-line output, one line per agent.

Before:
```
[1] demo task (15.0 min remaining) idle stub-12345678
```

After:
```
[1] demo task (moonshot-v1-8k · 2s · ctx 617/128.0K · 741 in / 5 out · ?) (30.0 min remaining) idle 6777ea48
```

What changed

`subagent_turn_prompt()` now returns `list(reply, usage)` instead of just the reply string. The two internal callers are updated (sync `subagent_query()`, async `subagent_collect()`); saber::blast_radius confirmed there are no external callers.
Each registry entry gains cumulative usage counters and a `query_count`. Both wait paths accumulate via the new `subagent_accumulate_usage()` helper. Cost is `NA` when the provider doesn't surface it (moonshot/ollama), which keeps a missing cost distinguishable from $0.
`subagent_list()` returns the new fields plus a best-effort live token count via `info$session$run()`. Busy agents (with a pending async call) get `NA` — callr can't stack queries on a session with an outstanding call.
`format_subagent_list()` renders the dense one-line layout. Cost shows `?` when NA, ctx shows `ctx ?` when live tokens are unavailable.
New helpers in `R/context-budget.R`: `format_age()` and `format_live_ctx()`.
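
The accumulation rule described above (NULL-safe, tolerant of partial usage, cost kept as `NA` until a provider reports one) can be sketched roughly as follows. This is not the package's `subagent_accumulate_usage()`: the function names `accumulate_usage()` and `new_entry()`, the usage field names `input_tokens`, `output_tokens` and `cost`, and the choice to skip `query_count` when usage is `NULL` are all assumptions made for illustration.

```r
# Hypothetical stand-in for subagent_accumulate_usage(). Registry field names
# follow the PR description; the usage field names are assumed.
new_entry <- function() {
  list(
    cumulative_input_tokens = 0,
    cumulative_output_tokens = 0,
    cumulative_total_tokens = 0,
    cumulative_cost = NA_real_,  # NA means "no provider has reported a cost yet"
    query_count = 0L
  )
}

accumulate_usage <- function(entry, usage) {
  # NULL-safe: a turn without a usage record leaves the entry untouched
  # (an assumption; the real helper may still count the query).
  if (is.null(usage)) return(entry)

  # Partial usage: a missing or NA token field counts as 0.
  tok <- function(x) if (length(x) != 1L || is.na(x)) 0 else as.numeric(x)
  input <- tok(usage$input_tokens)
  output <- tok(usage$output_tokens)

  entry$cumulative_input_tokens <- entry$cumulative_input_tokens + input
  entry$cumulative_output_tokens <- entry$cumulative_output_tokens + output
  entry$cumulative_total_tokens <- entry$cumulative_total_tokens + input + output

  # Running cost: only becomes numeric once a cost is actually reported,
  # so "never reported" stays distinct from a reported $0.
  cost <- usage$cost
  if (length(cost) == 1L && !is.na(cost)) {
    prev <- entry$cumulative_cost
    entry$cumulative_cost <- if (is.na(prev)) cost else prev + cost
  }

  entry$query_count <- entry$query_count + 1L
  entry
}

# A provider that reports tokens but no cost: tokens grow, cost stays NA.
e <- accumulate_usage(new_entry(), list(input_tokens = 741, output_tokens = 5))
e <- accumulate_usage(e, NULL)
```

Starting the cost at `NA_real_` rather than `0` is what lets the formatter print `?` for providers that never report a cost.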

Test plan

39 new offline tests in `test_agents_visibility.R` cover the format helpers, `subagent_accumulate_usage()` (NULL-safe, partial usage, running cost), and the full formatter across idle / busy / with-cost shapes.
`tinytest::test_package("corteza")` — 1555/1555 OK.
End-to-end against moonshot: spawn → one sync + one async query → cumulative tokens grow correctly, live ctx updates, age advances.

Open follow-ups (not in this PR)

Cost computation for providers that don't return it natively (would need a per-model pricing table; out of scope here).
Per-tool-call latency breakdown if that proves useful later.

Each subagent's registry entry now tracks cumulative usage: cumulative_input_tokens, cumulative_output_tokens, cumulative_total_tokens, cumulative_cost, query_count. Both sync subagent_query() and async subagent_collect() accumulate after each successful turn. Cost is captured only when the provider returns it (Anthropic typically does; moonshot/ollama don't), and the cumulative starts as NA so a missing cost stays distinguishable from $0.

subagent_turn_prompt() now returns list(reply, usage) instead of just the reply string so the parent can read both. Only two internal callers; saber::blast_radius confirmed.

subagent_list() gains: model, age_seconds, live_tokens, context_limit, cumulative_*, query_count. live_tokens is computed per /agents call via info$session$run() against the child: best-effort, NA for busy agents (callr can't query a session with a pending call).

format_subagent_list() renders a dense single line per agent:

    [1] task (model · age · ctx N/limit · X in / Y out · $Z) (T min remaining) idle <short-id>

Cost shows '?' when not provided. Live ctx shows 'ctx ?' when the child is busy.

Two new helpers in R/context-budget.R: format_age() and format_live_ctx().

39 offline tests in test_agents_visibility.R cover the format helpers, subagent_accumulate_usage() (NULL safe, partial usage, running cost), and the full format_subagent_list() output across idle/busy/with-cost shapes.

Verified end-to-end against moonshot:

    [1] demo (moonshot-v1-8k · 2s · ctx 617/128.0K · 741 in / 5 out · ?) (30.0 min remaining) idle <id>

Cumulative tokens grow across two queries (one sync, one async + collect); both accumulate correctly.
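
For readers who want to see how the parenthesised detail segment might be assembled, here is a minimal sketch. The helpers `fmt_age()`, `fmt_count()`, `fmt_live_ctx()`, `fmt_cost()` and `fmt_agent_detail()` are hypothetical stand-ins, not the package's `format_age()` / `format_live_ctx()`; the minute and hour units, the `K` rounding rule, and the four-decimal cost format are assumptions chosen only to reproduce the sample line in this commit message.

```r
# Hypothetical re-implementations for illustration only.
fmt_age <- function(seconds) {
  if (is.na(seconds)) return("?")
  if (seconds < 60) return(sprintf("%ds", as.integer(seconds)))
  if (seconds < 3600) return(sprintf("%dm", as.integer(seconds %/% 60)))
  sprintf("%dh", as.integer(seconds %/% 3600))
}

# Counts below 1000 print as-is; larger ones as thousands with one decimal.
fmt_count <- function(n) {
  if (n >= 1000) sprintf("%.1fK", n / 1000) else sprintf("%d", as.integer(n))
}

# Falls back to "ctx ?" when either the live count or the limit is unknown.
fmt_live_ctx <- function(live_tokens, context_limit) {
  if (is.na(live_tokens) || is.na(context_limit)) return("ctx ?")
  sprintf("ctx %s/%s", fmt_count(live_tokens), fmt_count(context_limit))
}

# NA cost (provider never reported one) renders as "?", not "$0".
fmt_cost <- function(cost) if (is.na(cost)) "?" else sprintf("$%.4f", cost)

# Joins the pieces with a middle dot, matching the layout shown above.
fmt_agent_detail <- function(a) {
  paste(
    a$model,
    fmt_age(a$age_seconds),
    fmt_live_ctx(a$live_tokens, a$context_limit),
    sprintf("%d in / %d out",
            as.integer(a$cumulative_input_tokens),
            as.integer(a$cumulative_output_tokens)),
    fmt_cost(a$cumulative_cost),
    sep = " \u00b7 "
  )
}

agent <- list(
  model = "moonshot-v1-8k", age_seconds = 2,
  live_tokens = 617, context_limit = 128000,
  cumulative_input_tokens = 741, cumulative_output_tokens = 5,
  cumulative_cost = NA_real_
)
fmt_agent_detail(agent)
# "moonshot-v1-8k · 2s · ctx 617/128.0K · 741 in / 5 out · ?"
```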

Subagents spawned without an explicit model used to show the provider name as the model field and 'ctx ?' as the live context limit, because the limit lookup had no key. The child still ran with the provider's real default model, so the display was misleading.

Add default_provider_model() in R/context-budget.R as a single source of truth, mirroring the CLI's default_provider_model() and matching .resolve_model() in R/turn.R for moonshot. Use it in:

- subagent_live_token_count(): when sess$model_map$cloud is NULL, fall back to the provider default before looking up the context limit. Returns the resolved model so the parent can display the same identity.
- subagent_list(): show the resolved model name in the agents listing (info$model > live$model > default_provider_model() > provider name > '?').
- maybe_compact_turn_session(): replace the inline switch with the shared helper. Same defaults; this also corrects an old drift where compaction used 'moonshot-v1-8k' while the rest of the package uses 'kimi-k2.6'.

Verified end-to-end: spawning a moonshot subagent with no explicit model now shows '(kimi-k2.6 · 0s · ctx 565/128.0K · ...)' instead of '(moonshot · 0s · ctx ? · ...)'.

8 new tests cover default_provider_model() lookups and confirm each resolved default has a context-limit entry. 1563/1563 OK.
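
The display precedence for the model name (info$model > live$model > provider default > provider name > '?') could be sketched along these lines. The names `default_model_for()`, `first_non_empty()` and `resolve_display_model()` are hypothetical, as is the `info$provider` field, and the defaults table contains only the moonshot entry named in this commit message; the real default_provider_model() presumably covers more providers.

```r
# Hypothetical defaults table; only the moonshot entry comes from the
# commit message, everything else about this helper is assumed.
default_model_for <- function(provider) {
  defaults <- list(moonshot = "kimi-k2.6")
  if (is.null(provider)) return(NULL)
  defaults[[provider]]  # NULL when the provider has no known default
}

# Returns the first argument that is a single non-empty string, else "?".
first_non_empty <- function(...) {
  for (x in list(...)) {
    if (is.character(x) && length(x) == 1L && !is.na(x) && nzchar(x)) {
      return(x)
    }
  }
  "?"
}

# Precedence: explicit model, model reported by the live child,
# provider default, bare provider name, then "?".
resolve_display_model <- function(info, live = NULL) {
  first_non_empty(
    info$model,
    live$model,
    default_model_for(info$provider),
    info$provider
  )
}

resolve_display_model(list(provider = "moonshot"))
# "kimi-k2.6"
```

Keeping the lookup in one helper means the listing and the context-limit lookup resolve the same model string, which is the drift this commit removes.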

TroyHernandez added 2 commits May 13, 2026 21:21

TroyHernandez merged commit ca46c91 into main May 14, 2026
4 checks passed

TroyHernandez deleted the agents-token-cost branch May 14, 2026 02:57

TroyHernandez mentioned this pull request May 14, 2026

Surface usage$cost in agent() and chat() return shapes cornball-ai/llm.api#12

Open

TroyHernandez merged 2 commits into main from agents-token-cost
