feat(ai): record total billed tokens per ai run by ocervell · Pull Request #1222 · freelabz/secator

ocervell · 2026-06-25T18:00:43Z

Goal

Add per-run AI-token accounting to the ai task so the cloud platform can bill/quota it. The ai task already gets real billed token usage per LLM call (call_llm → response.usage.total_tokens + litellm completion_cost) but never aggregated it. This persists the per-run total on the runner so the billing chore can read it — the AI analog of how context.scan_hours is used for run-hours billing.

What changed

secator/tasks/ai.py
- _init_options seeds context["ai_tokens"] (int) and context["ai_cost"] (float) so the field lands on the task doc even with zero LLM calls.
- _account_usage(usage) sums a single call_llm usage dict onto self.context. Missing/None/malformed usage counts as 0 and never raises.
- _account_usage is called after the main-loop call_llm (before any empty-response continue, so each billed call is counted once) and after the intent-detection call_llm.
- _drain_history_usage() rolls billed usage accrued by history summarization into context.ai_tokens once per iteration.
secator/ai/history.py
- ChatHistory.compact() now records the billed usage of its summarization call on billed_tokens/billed_cost (the task drains it). Missing usage = 0.
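The accumulation described in the list above can be sketched as follows. This is a minimal standalone illustration, not the code from the PR: `account_usage` is a hypothetical free function standing in for the `_account_usage` method, and the `{"tokens": ..., "cost": ...}` usage shape is taken from the `_drain_history_usage` call quoted later in this review.

```python
from typing import Any, Optional


def account_usage(context: dict, usage: Optional[Any]) -> None:
    """Add one call's billed usage onto the running totals in `context`.

    Hypothetical stand-in for the `_account_usage` method described above.
    """
    # Seed the keys so they exist even when no LLM call is ever made.
    context.setdefault("ai_tokens", 0)
    context.setdefault("ai_cost", 0.0)
    if not isinstance(usage, dict):
        return  # missing or malformed usage counts as 0
    try:
        tokens = int(usage.get("tokens") or 0)
        cost = float(usage.get("cost") or 0.0)
    except (TypeError, ValueError):
        return  # non-numeric values are ignored instead of raising
    context["ai_tokens"] += tokens
    context["ai_cost"] += cost


context = {"ai_tokens": 0, "ai_cost": 0.0}
calls = [{"tokens": 120, "cost": 0.002}, None, {"tokens": "oops"}, {"tokens": 30}]
for usage in calls:
    account_usage(context, usage)
```

Only the first and last entries contribute here; the `None` entry and the entry with a non-numeric token count add nothing and do not interrupt the loop.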

self.context is copied onto every item's _context and persisted to MongoDB, so context.ai_tokens lands on the task doc. Subagent/batch ai tasks are separate runners with their own task doc + own context.ai_tokens, so the chore sums across docs without double-counting. Consistent in chat and attack modes.
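As a rough illustration of why separate task docs avoid double-counting, the sketch below sums `context.ai_tokens` over a list of plain dicts. The document shape (a top-level `context` key, a `name` field) and the `sum_ai_tokens` helper are assumptions made for this example; the actual billing chore and its MongoDB query are not shown in this PR.

```python
def sum_ai_tokens(task_docs):
    """Sum context.ai_tokens across task docs, treating missing values as 0.

    Hypothetical helper: the real chore and document schema may differ.
    """
    total = 0
    for doc in task_docs:
        context = doc.get("context") or {}
        value = context.get("ai_tokens") or 0
        if isinstance(value, int):
            total += value
    return total


# Each runner persists its own doc, so parent and subagent totals are disjoint.
docs = [
    {"name": "ai", "context": {"ai_tokens": 1500, "ai_cost": 0.03}},  # parent run
    {"name": "ai", "context": {"ai_tokens": 400, "ai_cost": 0.01}},   # subagent run
    {"name": "other", "context": {}},                                 # non-AI task
]
total_tokens = sum_ai_tokens(docs)
```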

The field the billing chore reads

context.ai_tokens (cumulative billed tokens, int). context.ai_cost (float) is also persisted.

Tests

New tests/unit/test_ai_tokens.py (9 tests, all green): N-call sum, missing/None usage = 0, malformed usage doesn't crash, exact persisted key, summarization drain-once, compact() records usage / handles missing usage, and two end-to-end _run_loop runs asserting context.ai_tokens == sum (and 0 with no usage). flake8 clean.

🤖 Generated with Claude Code

https://claude.ai/code/session_01P5vSjfkBuGAAHdKxHS3ySm

Summary by CodeRabbit

New Features
- Added tracking of billed AI token and cost usage during runs.
- Run summaries now include accumulated usage from both direct model calls and history compaction.
Bug Fixes
- Improved handling of missing or invalid usage data so accounting stays accurate.
- Ensured history-related usage is counted only once and then cleared from temporary storage.
Tests
- Added coverage for usage tracking, invalid usage values, and end-to-end run accounting.

Aggregate the real billed token usage (from call_llm's response.usage + litellm completion_cost) across every LLM call an ai task makes into a per-run total persisted on the runner context as context.ai_tokens (int, cumulative) and context.ai_cost (float). This is the AI analog of context.scan_hours: the cloud billing chore reads context.ai_tokens off the task doc to bill/quota AI usage.

- _account_usage() sums call_llm usage onto self.context; missing/None usage counts as 0 so accounting never crashes the run.
- Main loop call, intent-detection call, and history summarization call are all counted exactly once. ChatHistory.compact() accrues its own summarization usage which the task drains into context per iteration.
- Subagent/batch ai tasks are separate runners with their own task doc and their own context.ai_tokens, so the chore sums across docs without double-counting.
- Works identically in chat and attack modes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01P5vSjfkBuGAAHdKxHS3ySm

coderabbitai · 2026-06-25T18:00:57Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 79307a56-f1f3-4a5b-aa40-56e178227c60


Walkthrough

This change adds billed token and cost tracking for AI history compaction and task execution. ChatHistory stores summarization usage, and AI task runs accumulate per-call and drained history usage into context. New tests cover helper and loop accounting.

Changes

AI billing accounting

| Layer / File(s) | Summary |
| --- | --- |
| History billing fields and compaction `secator/ai/history.py` | `ChatHistory` gains billed token/cost fields, and `compact()` stores summarization usage totals on the instance. |
| Run-loop billing accumulation `secator/tasks/ai.py` | `_init_options` seeds context totals, `_run_loop` drains history usage and accounts each LLM response, `_detect_mode` records intent-detection usage, and the new helpers accumulate or reset billing counters. |
| Billing accounting tests `tests/unit/test_ai_tokens.py` | Unit and loop tests cover helper accumulation, history drainage, `compact()` usage recording, and end-to-end totals. |

Sequence Diagram(s)

sequenceDiagram
  participant RunLoop as secator.tasks.ai._run_loop
  participant History as secator.ai.history.ChatHistory.compact
  participant LLM as call_llm
  participant Account as secator.tasks.ai._account_usage
  participant Drain as secator.tasks.ai._drain_history_usage
  participant Detect as secator.tasks.ai._detect_mode

  RunLoop->>History: compact() when history is summarized
  History->>LLM: summarize messages
  LLM-->>History: result with usage
  History-->>RunLoop: billed_tokens and billed_cost updated
  RunLoop->>Drain: move history billing into context
  RunLoop->>Account: add response usage after each LLM call
  Detect->>Account: add intent-detection usage

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A bunny counted tokens by moonlight glow,
Then tucked cost crumbs where the tallies grow.
Hop, hop — the history keeps its score,
And the loop adds up a little more.
With whiskers twitching, I cheer and say:
“Carrot-ledgers make a bright AI day!”

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: per-run AI billing accounting for billed tokens. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@secator/tasks/ai.py`:
- Around line 800-814: The usage drain in _drain_history_usage only accounts and
clears history.billed_tokens and history.billed_cost when tokens is truthy,
which can drop cost-only usage. Update the condition so the drain runs when
either billed_tokens or billed_cost is present, and make sure both
history.billed_tokens and history.billed_cost are reset after calling
self._account_usage, even when tokens is zero.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b6b9a15e-08c4-474c-a52d-b9baf8237797

📥 Commits

Reviewing files that changed from the base of the PR and between 8d8ec83 and 258fe14.

📒 Files selected for processing (3)

secator/ai/history.py
secator/tasks/ai.py
tests/unit/test_ai_tokens.py

coderabbitai · 2026-06-25T18:05:19Z

+	def _drain_history_usage(self):
+		"""Roll billed usage accrued by history summarization into context.ai_tokens.
+
+		`ChatHistory.compact` makes its own LLM calls and stashes their billed
+		usage on the history object; drain it here so it is counted exactly once.
+		"""
+		history = getattr(self, "history", None)
+		if history is None:
+			return
+		tokens = getattr(history, "billed_tokens", 0) or 0
+		cost = getattr(history, "billed_cost", 0.0) or 0.0
+		if tokens:
+			self._account_usage({"tokens": tokens, "cost": cost})
+			history.billed_tokens = 0
+			history.billed_cost = 0.0


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Cost can be silently dropped when billed_tokens is 0 but billed_cost is non-zero.

The drain only fires (and resets the counters) when tokens is truthy. If a summarization call reports a cost with zero/missing tokens, that cost is neither accounted nor reset on this iteration. It would only be picked up on a later drain that happens to have non-zero tokens, and is lost entirely if that never occurs. Gate on either value.

🛠️ Proposed fix

 		tokens = getattr(history, "billed_tokens", 0) or 0
 		cost = getattr(history, "billed_cost", 0.0) or 0.0
-		if tokens:
+		if tokens or cost:
 			self._account_usage({"tokens": tokens, "cost": cost})
 			history.billed_tokens = 0
 			history.billed_cost = 0.0
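To make the cost-only case concrete, here is a small standalone reproduction. It is a sketch under assumptions: `drain` is a hypothetical free-function version of `_drain_history_usage` that writes straight into a dict, and `SimpleNamespace` stands in for `ChatHistory`.

```python
from types import SimpleNamespace


def drain(history, context, gate_on_cost):
    """Move billed usage from `history` into `context`, then reset it.

    gate_on_cost=False mirrors the original `if tokens:` condition;
    gate_on_cost=True mirrors the proposed `if tokens or cost:`.
    """
    tokens = getattr(history, "billed_tokens", 0) or 0
    cost = getattr(history, "billed_cost", 0.0) or 0.0
    should_drain = (tokens or cost) if gate_on_cost else tokens
    if should_drain:
        context["ai_tokens"] += tokens
        context["ai_cost"] += cost
        history.billed_tokens = 0
        history.billed_cost = 0.0


def run(gate_on_cost):
    # A summarization call that reported a cost but no token count.
    history = SimpleNamespace(billed_tokens=0, billed_cost=0.004)
    context = {"ai_tokens": 0, "ai_cost": 0.0}
    drain(history, context, gate_on_cost)
    return context["ai_cost"], history.billed_cost


original = run(gate_on_cost=False)  # cost stays stranded on the history object
fixed = run(gate_on_cost=True)      # cost is accounted and the counter is reset
```

With the original gate the cost remains on `history` and never reaches `context`; with the proposed gate it is moved across and cleared in the same pass.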

📝 Committable suggestion


Suggested change

	def _drain_history_usage(self):
		"""Roll billed usage accrued by history summarization into context.ai_tokens.

		`ChatHistory.compact` makes its own LLM calls and stashes their billed
		usage on the history object; drain it here so it is counted exactly once.
		"""
		history = getattr(self, "history", None)
		if history is None:
			return
		tokens = getattr(history, "billed_tokens", 0) or 0
		cost = getattr(history, "billed_cost", 0.0) or 0.0
		if tokens or cost:
			self._account_usage({"tokens": tokens, "cost": cost})
			history.billed_tokens = 0
			history.billed_cost = 0.0


Extend the single per-run billed-token accumulator so every call_llm contributes its prompt/completion breakdown, not just the total. call_llm now reads response.usage.prompt_tokens / completion_tokens (alongside total_tokens) and _account_usage rolls each into a dedicated cumulative context key: context.ai_prompt_tokens / context.ai_completion_tokens (context.ai_tokens remains the billed total). The history-compaction path (ChatHistory.compact -> billed_* -> _drain_history_usage) threads the split through too, so summarization is split-billed alongside the main loop and intent-detection calls.

Accounting stays at the source: every successful call_llm (tool-only main turn, _detect_mode intent call, and history compaction) is counted exactly once regardless of whether the turn produced display content. Missing/None usage adds 0 and never raises. There is one accounting path; the old "sum ai_type==response findings" approach is not used.

Tests (tests/unit/test_ai_tokens.py): add a tool-only-turn test (proves a content-less turn is billed), a combined main+intent+compaction sum test, and prompt/completion-split accumulation tests. 13/13 pass; flake8 clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01P5vSjfkBuGAAHdKxHS3ySm
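The split accumulation described in this commit might look roughly like the sketch below. The context keys (`ai_tokens`, `ai_prompt_tokens`, `ai_completion_tokens`) come from the commit message; the usage-dict keys `prompt_tokens` and `completion_tokens` and the function name `account_split_usage` are assumptions made for illustration.

```python
# Assumed mapping from usage-dict keys to the cumulative context keys.
USAGE_TO_CONTEXT = (
    ("tokens", "ai_tokens"),
    ("prompt_tokens", "ai_prompt_tokens"),
    ("completion_tokens", "ai_completion_tokens"),
)


def account_split_usage(context, usage):
    """Add total, prompt and completion tokens from one call onto `context`."""
    usage = usage if isinstance(usage, dict) else {}  # None/malformed adds 0
    for usage_key, context_key in USAGE_TO_CONTEXT:
        value = usage.get(usage_key)
        increment = value if isinstance(value, int) else 0
        context[context_key] = context.get(context_key, 0) + increment


context = {}
account_split_usage(context, {"tokens": 100, "prompt_tokens": 80, "completion_tokens": 20})
# A tool-only turn with no display content is still billed.
account_split_usage(context, {"tokens": 50, "prompt_tokens": 45, "completion_tokens": 5})
# Missing usage adds 0 to every key and does not raise.
account_split_usage(context, None)
```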

…tering

The platform metering chore prices a run's consumed tokens against a model registry (free vs paid, per-million in/out/cached rates), so it needs to know WHICH model produced the tokens. Record the resolved run model id on `context.ai_model` in `_init_options`, alongside the existing `context.ai_tokens` accounting seeds. This is the configured model for the run; a mid-session model switch is out of scope (the configured model is recorded).

Tests: TestAiModelRecording drives _init_options with collaborators stubbed and asserts context.ai_model == the resolved model (paid + free ids). 15/15 pass; flake8 clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01P5vSjfkBuGAAHdKxHS3ySm
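To show why the model id has to be recorded, the sketch below prices a run from its persisted context keys against a per-million-token rate table. Everything about the registry is hypothetical: the model ids, the rates, the field names and the `price_run` helper are invented for this example, and cached-token rates are left out for brevity.

```python
# Hypothetical registry: model ids and USD-per-million-token rates are invented.
MODEL_REGISTRY = {
    "example/paid-model": {"free": False, "in_per_million": 3.0, "out_per_million": 15.0},
    "example/free-model": {"free": True, "in_per_million": 0.0, "out_per_million": 0.0},
}


def price_run(context, registry=MODEL_REGISTRY):
    """Return the USD price of one run from its persisted context keys."""
    model = context.get("ai_model")
    entry = registry.get(model)
    if entry is None:
        # Without a recorded model id the chore cannot pick a rate.
        raise KeyError(f"unknown model: {model!r}")
    if entry["free"]:
        return 0.0
    prompt = context.get("ai_prompt_tokens", 0)
    completion = context.get("ai_completion_tokens", 0)
    weighted = prompt * entry["in_per_million"] + completion * entry["out_per_million"]
    return weighted / 1_000_000


usage = {"ai_prompt_tokens": 125_000, "ai_completion_tokens": 25_000}
paid = price_run({"ai_model": "example/paid-model", **usage})
free = price_run({"ai_model": "example/free-model", **usage})
```

The same token counts price differently depending on `ai_model`, which is the reason the commit records it next to the token totals.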


ocervell and others added 2 commits June 25, 2026 20:12

ocervell wants to merge 3 commits into main from feat/ai-token-quota
