fix: use backend thread token usage for header total by Layau-code · Pull Request #2800 · bytedance/deer-flow

Layau-code · 2026-05-08T15:20:56Z

Summary

use backend persisted thread/run token usage for the header total
keep the header live during in-flight runs by adding current streamed-message usage as a pending delta
keep per-turn and debug token usage based on currently visible messages
fall back to visible-message aggregation when backend thread usage is unavailable or empty
add backend and frontend tests for the thread-level accounting path

Details

The header token total now requests GET /api/threads/{thread_id}/token-usage and uses the persisted thread-level run aggregation when available.

To avoid freezing the header during long responses, the UI records the current live message baseline before sending a new run. While the run is streaming, the header displays:

persisted backend total + current in-flight message usage
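
As a rough illustration of the selection rule described above, here is a minimal sketch. It is not the PR's `selectHeaderTokenUsage`: the `UsageSketch` shape, the function names, and the reading of "empty" as a zero total are all assumptions made for this example.

```typescript
// Hypothetical usage shape for illustration; the PR's TokenUsage type may differ.
interface UsageSketch {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// Field-wise sum of two usage records.
function addUsage(a: UsageSketch, b: UsageSketch): UsageSketch {
  return {
    inputTokens: a.inputTokens + b.inputTokens,
    outputTokens: a.outputTokens + b.outputTokens,
    totalTokens: a.totalTokens + b.totalTokens,
  };
}

// Pick the header total: the persisted backend total plus the in-flight delta,
// or the visible-message aggregate when the backend has nothing usable.
function selectHeaderUsageSketch(
  backendUsage: UsageSketch | null,
  pendingUsage: UsageSketch | null,
  visibleUsage: UsageSketch,
): UsageSketch {
  // Assumption: "empty" backend usage means a zero total.
  if (backendUsage === null || backendUsage.totalTokens === 0) {
    return visibleUsage;
  }
  return pendingUsage === null ? backendUsage : addUsage(backendUsage, pendingUsage);
}

// Made-up numbers, purely for demonstration.
const persisted: UsageSketch = { inputTokens: 1200, outputTokens: 300, totalTokens: 1500 };
const inFlight: UsageSketch = { inputTokens: 80, outputTokens: 20, totalTokens: 100 };
const visible: UsageSketch = { inputTokens: 400, outputTokens: 100, totalTokens: 500 };

const streamingTotal = selectHeaderUsageSketch(persisted, inFlight, visible); // backend + delta
const idleTotal = selectHeaderUsageSketch(persisted, null, visible); // backend only
const fallbackTotal = selectHeaderUsageSketch(null, inFlight, visible); // visible messages
```

In this sketch the pending delta is only added when backend usage is present; in the fallback case the visible-message aggregate is assumed to already include the streaming message.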

Copilot

Pull request overview

This PR updates the token-usage header total to prefer backend-persisted, thread-level token accounting (via a new GET /api/threads/{thread_id}/token-usage endpoint), while keeping the header responsive during in-flight runs by adding a streamed “pending” delta derived from currently visible messages. It also adds backend/frontend tests for the new thread-level accounting path.

Changes:

Backend: add a typed /api/threads/{thread_id}/token-usage response model + endpoint backed by RunStore.aggregate_tokens_by_thread.
Frontend: add a thread token-usage query + mapping helper, and update the header indicator to prefer backend totals with an in-flight delta.
Tests: add unit coverage for the backend response shape, repository aggregation behavior, and frontend selection logic.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
frontend/tests/unit/core/threads/token-usage.test.ts	Adds unit tests for mapping backend thread token usage into UI `TokenUsage`.
frontend/tests/unit/core/messages/usage.test.ts	Adds unit tests for selecting header totals using backend usage + pending delta, with fallback behavior.
frontend/src/core/threads/types.ts	Introduces `ThreadTokenUsageResponse` type for the new backend API payload.
frontend/src/core/threads/token-usage.ts	Adds query key helper + response-to-UI mapping function.
frontend/src/core/threads/hooks.ts	Adds pending-usage baseline tracking and a new `useThreadTokenUsage` query; exposes `pendingUsageMessages` from `useThreadStream`.
frontend/src/core/messages/usage.ts	Adds `selectHeaderTokenUsage` to prefer backend totals and optionally add pending in-flight usage.
frontend/src/components/workspace/token-usage-indicator.tsx	Switches header usage calculation to `selectHeaderTokenUsage` with backend + pending inputs.
frontend/src/app/workspace/chats/[thread_id]/page.tsx	Wires `useThreadTokenUsage` + pending usage messages into the header indicator.
frontend/src/app/workspace/agents/[agent_name]/chats/[thread_id]/page.tsx	Same wiring for agent chat route (including mock handling).
frontend/src/core/i18n/locales/en-US.ts	Updates token usage note text to reflect new accounting sources.
frontend/src/core/i18n/locales/zh-CN.ts	Updates token usage note text to reflect new accounting sources.
backend/app/gateway/routers/thread_runs.py	Adds response models and the `/token-usage` endpoint with `response_model=ThreadTokenUsageResponse`.
backend/tests/test_thread_token_usage.py	Adds API test asserting stable response shape for `/token-usage`.
backend/tests/test_run_repository.py	Adds repository test ensuring aggregation counts only completed runs and produces expected breakdowns.

+  const pendingUsageMessages = thread.isLoading
+    ? getMessagesAfterBaseline(
+        thread.messages,
+        pendingUsageBaselineMessageIdsRef.current,
+      )
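
The helper called in the fragment above is not shown in this excerpt. One plausible reading, sketched here under stated assumptions, is that the baseline is a snapshot of message IDs taken before the run is sent, and that the pending messages are those whose IDs are not in that snapshot. `MessageSketch`, `captureBaselineIds`, and `messagesAfterBaselineSketch` are hypothetical names, and skipping messages without an ID is a choice made for this example only.

```typescript
// Hypothetical minimal message shape; real thread messages carry more fields.
interface MessageSketch {
  id?: string;
}

// Snapshot the IDs of the messages that exist before a new run is sent.
function captureBaselineIds(messages: readonly MessageSketch[]): string[] {
  const ids: string[] = [];
  for (const message of messages) {
    if (message.id !== undefined) ids.push(message.id);
  }
  return ids;
}

// Return only the messages that were not part of the baseline snapshot.
// Assumption: messages without an ID are ignored, and a missing baseline
// means there is nothing pending.
function messagesAfterBaselineSketch<T extends MessageSketch>(
  messages: readonly T[],
  baselineIds: readonly string[] | null,
): T[] {
  if (baselineIds === null) return [];
  return messages.filter(
    (message) => message.id !== undefined && baselineIds.indexOf(message.id) === -1,
  );
}

// Before sending: two messages are visible, so both go into the baseline.
const beforeRun = [{ id: "m1" }, { id: "m2" }];
const baseline = captureBaselineIds(beforeRun);

// While streaming: a third message appears and is the only pending one.
const duringRun = [{ id: "m1" }, { id: "m2" }, { id: "m3" }];
const pendingMessages = messagesAfterBaselineSketch(duringRun, baseline);
```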


+      if (!response.ok) {
+        throw new Error("Failed to load thread token usage.");
+      }
+      return (await response.json()) as ThreadTokenUsageResponse;
+    },
+    enabled: enabled && Boolean(threadId),


WillemJiang · 2026-05-09T01:38:24Z

@Layau-code, thanks for your contribution. Here are some other review comments on your PR.

isMock destructured from useThreadChat but not shown in diff

The diff adds isMock to the destructuring:
const { threadId, setThreadId, isNewThread, setIsNewThread, isMock } = useThreadChat();
But the diff doesn't show useThreadChat being updated to return isMock. If the property doesn't exist yet, isMock will be undefined at runtime (not a crash, but the mock-mode disabling won't work). It is worth verifying that useThreadChat already exposes it.

useThreadTokenUsage uses raw fetch instead of shared API client

  const response = await fetch(`${getBackendBaseURL()}/api/threads/...`);

Other hooks in the same file use getAPIClient() (e.g., useRunDetail). If the shared client handles auth headers, request interceptors, or error normalization, this hook bypasses all of that. It is worth aligning with the existing pattern unless there's a reason to use a raw fetch.

Content-Type: application/json on a GET request is unnecessary

headers: { "Content-Type": "application/json" },

GET requests have no body, so Content-Type has no effect. Harmless but slightly misleading.
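
To make the last two points concrete, here is a small sketch of how the GET request could be described without a `Content-Type` header. The function name, the `RequestSketch` shape, and the example base URL are assumptions for illustration; only the `/api/threads/{thread_id}/token-usage` path comes from the PR description. The result is a plain object, so it could be handed to `fetch` or to a shared client.

```typescript
// Hypothetical description of a request; not a type from the PR.
interface RequestSketch {
  url: string;
  init: { method: "GET"; headers: Record<string, string> };
}

// Build the token-usage request for a thread.
function buildTokenUsageRequest(baseUrl: string, threadId: string): RequestSketch {
  // Drop trailing slashes so the joined URL has exactly one separator.
  const trimmedBase = baseUrl.replace(/\/+$/, "");
  return {
    // Encode the thread ID so unusual characters cannot alter the path.
    url: `${trimmedBase}/api/threads/${encodeURIComponent(threadId)}/token-usage`,
    // Accept states what the caller wants back; no Content-Type is sent
    // because a GET request carries no body to describe.
    init: { method: "GET", headers: { Accept: "application/json" } },
  };
}

// The base URL and thread ID here are made up for the example.
const usageRequest = buildTokenUsageRequest("http://localhost:8000/", "thread 42");
```

One further reason to drop the header: on a cross-origin request, `Content-Type: application/json` is not a CORS-safelisted value, so sending it can cause the browser to issue a preflight request first.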

Layau-code · 2026-05-09T07:27:31Z

useThreadChat already returns isMock in frontend/src/components/workspace/chats/use-thread-chat.ts, so it won't be undefined at runtime. I think the PR diff just didn't show that context clearly. I also updated the token usage request to use the shared auth fetch path and removed the unnecessary `Content-Type` header from the GET request.

WillemJiang requested a review from Copilot May 8, 2026 15:33

Copilot started reviewing on behalf of WillemJiang May 8, 2026 15:34

Copilot AI reviewed May 8, 2026

Layau-code force-pushed the fix/thread-token-usage-header-total branch 2 times, most recently from 5755d68 to 3850c8d May 8, 2026 15:58

fix: use backend thread token usage for header total

219804f

Layau-code force-pushed the fix/thread-token-usage-header-total branch from 3850c8d to 219804f May 8, 2026 16:03

Merge branch 'main' into fix/thread-token-usage-header-total

c878926

WillemJiang added this to the 2.0-m1 milestone May 9, 2026

WillemJiang added the "reviewing" label (The PR is in reviewing status) May 9, 2026

Refactor thread token usage fetch

d689789

Layau-code wants to merge 3 commits into bytedance:main from Layau-code:fix/thread-token-usage-header-total

3 participants
