fix(token-estimation): refine logic #921
base: main
Conversation
- Removed redundant estimation of tokens when promptTokens and completionTokens are provided by the API.
- Now only estimate prompt tokens if promptTokens is missing and messages are available.
- Only estimate completion tokens if completionTokens is missing and content is available.
- Improved code clarity by separating conditions for prompt and completion token estimation.
- Maintained fallback logic with error logging for encoding failures.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
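For orientation, here is a minimal, self-contained sketch of the separated guards this change describes, assuming gpt-tokenizer's encode/encodeChat API. The function name, option shape, and the "gpt-4" model constant (standing in for the project's DEFAULT_TOKENIZER_MODEL) are illustrative; the exact guard conditions of the real file are shown, and debated, in the review below.

import { encode, encodeChat } from "gpt-tokenizer";

interface SketchMessage {
  role: "user" | "assistant" | "system";
  content: string;
  name?: string;
}

export function estimateTokensSketch(opts: {
  promptTokens?: number | null;
  completionTokens?: number | null;
  messages?: SketchMessage[];
  content?: string;
}) {
  let promptTokens = opts.promptTokens ?? undefined;
  let completionTokens = opts.completionTokens ?? undefined;

  // Estimate prompt tokens only when the API did not provide them.
  if (promptTokens == null && opts.messages?.length) {
    try {
      promptTokens = encodeChat(opts.messages, "gpt-4").length;
    } catch {
      // Fallback heuristic: roughly 4 characters per token.
      promptTokens = Math.round(
        opts.messages.reduce((acc, m) => acc + m.content.length, 0) / 4,
      );
    }
  }

  // Estimate completion tokens only when the API did not provide them.
  if (completionTokens == null && opts.content != null) {
    try {
      completionTokens = encode(opts.content).length;
    } catch {
      completionTokens = Math.round(opts.content.length / 4);
    }
  }

  return { promptTokens, completionTokens };
}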
❌ Preview Environment deleted from Bunnyshell
Walkthrough
Separated prompt and completion token estimation into independent guarded paths and switched chat flows to prefer provided token fields; also changed base64 size calculation to round to an integer. Logging and per-path error fallbacks were adjusted accordingly.

Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Client
participant Chat as ChatService
participant Estimator as estimateTokens()
participant Encoder as TokenEncoder
participant Logger
Client->>Chat: send request (may include promptTokens, completionTokens)
alt both tokens provided
Chat-->>Client: proceed using provided promptTokens & completionTokens
else any token missing
Chat->>Estimator: estimateTokens(input)
alt promptTokens missing
Estimator->>Encoder: encodeChat(messages)
alt encode succeeds
Encoder-->>Estimator: promptCount
else encode fails
Estimator->>Logger: log prompt encode error
Estimator-->>Estimator: promptCount = Math.round(fallback)
end
end
alt completionTokens missing
Estimator->>Encoder: encode(JSON.stringify(content))
alt encode succeeds
Encoder-->>Estimator: completionCount
else encode fails
Estimator->>Logger: log completion encode error
Estimator-->>Estimator: completionCount = Math.round(fallback)
end
end
Estimator-->>Chat: return promptTokens, completionTokens (only those calculated)
Chat-->>Client: continue using provided + calculated token fields
end
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/gateway/src/chat/tools/estimate-tokens.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
{apps/api,apps/gateway,apps/ui,apps/docs,packages}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Always use top-level import; never use require() or dynamic imports (e.g., import(), next/dynamic)
Files:
apps/gateway/src/chat/tools/estimate-tokens.ts
{apps/api,apps/gateway,packages/db}/**/*.ts
📄 CodeRabbit inference engine (AGENTS.md)
{apps/api,apps/gateway,packages/db}/**/*.ts: Use Drizzle ORM with the latest object syntax for database access
For reads, use db().query.<table>.findMany() or db().query.<table>.findFirst()
Files:
apps/gateway/src/chat/tools/estimate-tokens.ts

**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Never use any or as any in this TypeScript project unless absolutely necessary
Always use top-level import; never use require or dynamic imports
Files:
apps/gateway/src/chat/tools/estimate-tokens.ts

{apps/api,apps/gateway}/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
For reads, use db().query.<table>.findMany() or db().query.<table>.findFirst()
Files:
apps/gateway/src/chat/tools/estimate-tokens.ts

🧬 Code graph analysis (1)
apps/gateway/src/chat/tools/estimate-tokens.ts (2)
apps/gateway/src/chat/tools/types.ts (2)
ChatMessage (4-8)
DEFAULT_TOKENIZER_MODEL (1-1)
packages/logger/src/index.ts (2)
error (147-154)
logger (175-175)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: e2e-shards (5)
- GitHub Check: e2e-shards (2)
- GitHub Check: e2e-shards (4)
- GitHub Check: e2e-shards (1)
- GitHub Check: e2e-shards (3)
- GitHub Check: autofix
- GitHub Check: test / run
- GitHub Check: build / run
- GitHub Check: generate / run
- GitHub Check: lint / run
// Estimate prompt tokens only if not provided by the API
if (!promptTokens && messages && messages.length > 0) {
  try {
    // Convert messages to the format expected by gpt-tokenizer
    const chatMessages: ChatMessage[] = messages.map((m) => ({
      role: m.role,
      content:
        typeof m.content === "string" ? m.content : JSON.stringify(m.content),
      name: m.name,
    }));
    calculatedPromptTokens = encodeChat(
      chatMessages,
      DEFAULT_TOKENIZER_MODEL,
    ).length;
  } catch (error) {
    // Fallback to simple estimation if encoding fails
    logger.error(
      "Failed to encode chat messages in estimate tokens",
      error instanceof Error ? error : new Error(String(error)),
    );
    calculatedPromptTokens =
      messages.reduce((acc, m) => acc + (m.content?.length || 0), 0) / 4;
  }
}
Guard for “missing” must not treat 0 as missing; fix fallback undercount and return integers

- Line 23: Using !promptTokens re-estimates when the API legitimately returns 0. Use a null/undefined check.
- Lines 42-44: Fallback undercounts when content is non-string (objects/arrays). Mirror the encoding path and stringify per-message before taking the length. Also round to an integer token count.
Apply this diff:
- // Estimate prompt tokens only if not provided by the API
- if (!promptTokens && messages && messages.length > 0) {
+ // Estimate prompt tokens only if not provided by the API
+ if (promptTokens == null && messages?.length > 0) {
try {
@@
- calculatedPromptTokens =
- messages.reduce((acc, m) => acc + (m.content?.length || 0), 0) / 4;
+ calculatedPromptTokens = Math.round(
+ messages.reduce((acc, m) => {
+ const text =
+ typeof m.content === "string" ? m.content : JSON.stringify(m.content);
+ return acc + text.length;
+ }, 0) / 4
+ );
}
}

Additionally: the parameter messages: any[] violates the project guideline to avoid any. Consider typing it (e.g., a ChatMessageLike with content: unknown) and narrowing in place.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Estimate prompt tokens only if not provided by the API
if (promptTokens == null && messages?.length > 0) {
  try {
    // Convert messages to the format expected by gpt-tokenizer
    const chatMessages: ChatMessage[] = messages.map((m) => ({
      role: m.role,
      content:
        typeof m.content === "string" ? m.content : JSON.stringify(m.content),
      name: m.name,
    }));
    calculatedPromptTokens = encodeChat(
      chatMessages,
      DEFAULT_TOKENIZER_MODEL,
    ).length;
  } catch (error) {
    // Fallback to simple estimation if encoding fails
    logger.error(
      "Failed to encode chat messages in estimate tokens",
      error instanceof Error ? error : new Error(String(error)),
    );
    calculatedPromptTokens = Math.round(
      messages.reduce((acc, m) => {
        const text =
          typeof m.content === "string" ? m.content : JSON.stringify(m.content);
        return acc + text.length;
      }, 0) / 4
    );
  }
}
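On the reviewer's note above about typing messages instead of any[], a hedged sketch of one possible shape; ChatMessageLike and the helper names are illustrative, not part of the codebase:

// Illustrative only: one way to avoid any[] for the messages parameter.
interface ChatMessageLike {
  role?: "user" | "assistant" | "system";
  content: unknown; // narrowed at the point of use
  name?: string;
}

function messageText(m: ChatMessageLike): string {
  // Narrow the unknown content in place: strings pass through, everything else is stringified.
  return typeof m.content === "string" ? m.content : JSON.stringify(m.content ?? "");
}

// Example: the rough fallback estimate expressed against the typed parameter.
function roughPromptTokens(messages: ChatMessageLike[]): number {
  return Math.round(messages.reduce((acc, m) => acc + messageText(m).length, 0) / 4);
}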
// Estimate completion tokens only if not provided by the API
if (!completionTokens && content) {
  try {
    calculatedCompletionTokens = encode(JSON.stringify(content)).length;
  } catch (error) {
    // Fallback to simple estimation if encoding fails
    logger.error(
      "Failed to encode completion text",
      error instanceof Error ? error : new Error(String(error)),
    );
    calculatedCompletionTokens = content.length / 4;
  }
}
Fix completion-path guard and encoding; avoid JSON.stringify and round fallback

- Line 48: !completionTokens incorrectly treats 0 as missing and skips empty-string content. Use a null/undefined check and allow empty strings.
- Line 50: encode(JSON.stringify(content)) overcounts by adding quotes/escapes; pass the string directly.
- Line 57: Round the heuristic to an integer.
Apply this diff:
- // Estimate completion tokens only if not provided by the API
- if (!completionTokens && content) {
+ // Estimate completion tokens only if not provided by the API
+ if (completionTokens == null && content != null) {
try {
- calculatedCompletionTokens = encode(JSON.stringify(content)).length;
+ calculatedCompletionTokens = encode(content).length;
} catch (error) {
@@
- calculatedCompletionTokens = content.length / 4;
+ calculatedCompletionTokens = Math.round(content.length / 4);
}
}

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Estimate completion tokens only if not provided by the API
if (completionTokens == null && content != null) {
  try {
    calculatedCompletionTokens = encode(content).length;
  } catch (error) {
    // Fallback to simple estimation if encoding fails
    logger.error(
      "Failed to encode completion text",
      error instanceof Error ? error : new Error(String(error)),
    );
    calculatedCompletionTokens = Math.round(content.length / 4);
  }
}
🤖 Prompt for AI Agents
In apps/gateway/src/chat/tools/estimate-tokens.ts around lines 47 to 59, the
guard `!completionTokens` treats 0 as missing and the encoder call uses
JSON.stringify which inflates token count and the fallback heuristic isn't
rounded; change the guard to check for null/undefined (e.g., completionTokens ==
null) so zero is allowed, only run the block when content is not null/undefined
(allow empty strings), call encode on the string content directly (e.g.,
encode(String(content))) instead of JSON.stringify(content), and round the
fallback heuristic to an integer (e.g., Math.round(content.length / 4)).
Rounded token estimation values in `estimateTokens` to ensure accurate integer predictions for both prompt and completion tokens. Enhanced size estimation for base64 data in `process-image-url` by rounding results.
Removed redundant token estimation when `promptTokens` and `completionTokens` are provided. Simplified token-related calculations and ensured consistent handling across all relevant functions.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
packages/models/src/process-image-url.ts (1)
51-69: Harden against SSRF (block private/link-local IPs even over HTTPS).
Fetching user-supplied URLs can still hit internal services (e.g., 169.254.169.254) via HTTPS. Add DNS/IP resolution guards and allow-list public IP ranges or domains.
I can provide a helper to resolve hostnames and reject private/reserved CIDRs if you want.
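For reference, a minimal sketch of such a guard; the helper name, the blocked ranges shown, and where it would be called from are assumptions, not something this PR includes:

// Sketch: resolve the hostname and reject loopback/private/link-local targets before fetching.
// Not exhaustive — the listed ranges are examples, and IPv6 handling here is simplified.
import { lookup } from "node:dns/promises";
import net from "node:net";

function isPrivateIPv4(ip: string): boolean {
  const [a, b] = ip.split(".").map(Number);
  return (
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local, e.g. 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}

export async function assertPublicHost(url: string): Promise<void> {
  const { hostname } = new URL(url);
  // Literal IPs and DNS-resolved addresses are both checked.
  const addresses = net.isIP(hostname)
    ? [{ address: hostname, family: net.isIP(hostname) }]
    : await lookup(hostname, { all: true });
  for (const { address, family } of addresses) {
    const isPrivate =
      family === 4
        ? isPrivateIPv4(address)
        : address === "::1" || /^(fe80|fc|fd)/i.test(address);
    if (isPrivate) {
      throw new Error(`Refusing to fetch non-public address: ${address}`);
    }
  }
}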
apps/gateway/src/chat/chat.ts (1)
2525-2533: Avoid any[]; use ChatMessage[] per project TS guidelines.
You already import ChatMessage; use it here.
Apply this diff:
-      const chatMessages: any[] = messages.map((m) => ({
+      const chatMessages: ChatMessage[] = messages.map((m) => ({
         role: m.role as "user" | "assistant" | "system" | undefined,
         content: m.content || "",
         name: m.name,
       }));
🧹 Nitpick comments (1)
apps/gateway/src/chat/chat.ts (1)
2548-2551: Don’t JSON.stringify the completion before encoding.
JSON.stringify(fullContent) inflates tokens. Encode the raw text.
Apply this diff:
-          calculatedCompletionTokens = encode(
-            JSON.stringify(fullContent),
-          ).length;
+          calculatedCompletionTokens = encode(fullContent).length;
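A quick illustration of the inflation the reviewer mentions (exact counts depend on the tokenizer; the point is only that JSON.stringify adds surrounding quotes and escape sequences, which cost extra tokens):

import { encode } from "gpt-tokenizer";

const text = 'He said: "hi"\nsecond line';
console.log(encode(text).length);                 // tokens for the raw completion text
console.log(encode(JSON.stringify(text)).length); // typically more: wrapping quotes plus \" and \n escapes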
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
apps/gateway/src/chat/chat.ts (3 hunks)
apps/gateway/src/chat/tools/estimate-tokens.ts (1 hunks)
packages/models/src/process-image-url.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/gateway/src/chat/tools/estimate-tokens.ts
🧰 Additional context used
📓 Path-based instructions (4)
{apps/api,apps/gateway,apps/ui,apps/docs,packages}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Always use top-level import; never use require() or dynamic imports (e.g., import(), next/dynamic)
Files:
apps/gateway/src/chat/chat.ts
packages/models/src/process-image-url.ts
{apps/api,apps/gateway,packages/db}/**/*.ts
📄 CodeRabbit inference engine (AGENTS.md)
{apps/api,apps/gateway,packages/db}/**/*.ts: Use Drizzle ORM with the latest object syntax for database access
For reads, use db().query.<table>.findMany() or db().query.<table>.findFirst()
Files:
apps/gateway/src/chat/chat.ts

**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Never use any or as any in this TypeScript project unless absolutely necessary
Always use top-level import; never use require or dynamic imports
Files:
apps/gateway/src/chat/chat.ts
packages/models/src/process-image-url.ts

{apps/api,apps/gateway}/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
For reads, use db().query.<table>.findMany() or db().query.<table>.findFirst()
Files:
apps/gateway/src/chat/chat.ts

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: e2e-shards (2)
- GitHub Check: e2e-shards (5)
- GitHub Check: e2e-shards (1)
- GitHub Check: e2e-shards (3)
- GitHub Check: e2e-shards (4)
- GitHub Check: build / run
- GitHub Check: test / run
- GitHub Check: generate / run
- GitHub Check: lint / run
- GitHub Check: autofix
🔇 Additional comments (1)
apps/gateway/src/chat/chat.ts (1)
65-67: LGTM: round length-based token estimate.
Keeps integer semantics and avoids zeros with Math.max(1, …).
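The shape of that heuristic, as described; a sketch only, since the exact helper body is not shown in this thread:

// Sketch of a length-based fallback: ~4 characters per token, never reporting zero tokens.
function estimateTokensFromContent(text: string): number {
  return Math.max(1, Math.round(text.length / 4));
}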
promptTokens,
completionTokens,
cachedTokens,
Ensure costs/response use tokens even when providers omit them.
When prompt/completion tokens are null, derive them once and pass consistent numbers into calculateCosts and transformResponseToOpenai.
Apply this diff to use safe values in the two calls:
const costs = calculateCosts(
usedModel,
usedProvider,
- promptTokens,
- completionTokens,
+ safePromptTokens,
+ safeCompletionTokens,
    cachedTokens,

const transformedResponse = transformResponseToOpenai(
usedProvider,
usedModel,
json,
content,
reasoningContent,
finishReason,
- promptTokens,
- completionTokens,
- (promptTokens || 0) + (completionTokens || 0) + (reasoningTokens || 0),
+ safePromptTokens ?? 0,
+ safeCompletionTokens ?? 0,
+   (safePromptTokens ?? 0) + (safeCompletionTokens ?? 0) + (reasoningTokens || 0),

Add these helpers right before the calculateCosts call:
// Compute safe token values only if missing
let safePromptTokens = promptTokens ?? null;
let safeCompletionTokens = completionTokens ?? null;
if (safePromptTokens === null && messages?.length) {
try {
const chatMsgs: ChatMessage[] = messages.map((m) => ({
role: m.role as "user" | "assistant" | "system" | undefined,
content: typeof m.content === "string" ? m.content : JSON.stringify(m.content ?? ""),
name: m.name,
}));
safePromptTokens = encodeChat(chatMsgs, DEFAULT_TOKENIZER_MODEL).length;
} catch (e) {
logger.error("Failed to encode messages (non-streaming)", e instanceof Error ? e : new Error(String(e)));
safePromptTokens = estimateTokensFromContent(messages.map((m) => String(m.content ?? "")).join("\n"));
}
}
if (safeCompletionTokens === null && content) {
try {
safeCompletionTokens = encode(content).length;
} catch (e) {
logger.error("Failed to encode completion (non-streaming)", e instanceof Error ? e : new Error(String(e)));
safeCompletionTokens = estimateTokensFromContent(content);
}
}

Also applies to: 3035-3038
🤖 Prompt for AI Agents
In apps/gateway/src/chat/chat.ts around lines 3017-3019 (and similarly
3035-3038), promptTokens/completionTokens can be null from providers so derive
safe values once and reuse them: add helpers just before the calculateCosts call
to set let safePromptTokens = promptTokens ?? null and let safeCompletionTokens
= completionTokens ?? null, then if safePromptTokens is null and messages exist
compute it by encoding messages via encodeChat with DEFAULT_TOKENIZER_MODEL
falling back to estimateTokensFromContent on error (logging via logger.error);
if safeCompletionTokens is null and content exists compute it via encode with
fallback to estimateTokensFromContent (also logging on error). Finally replace
direct promptTokens/completionTokens uses in calculateCosts and
transformResponseToOpenai with safePromptTokens and safeCompletionTokens so both
functions receive consistent, non-null token counts.
  promptTokens: promptTokens?.toString() || null,
  completionTokens: completionTokens?.toString() || null,
  totalTokens:
-   totalTokens ||
-   (
-     (calculatedPromptTokens || 0) + (calculatedCompletionTokens || 0)
-   ).toString(),
+   totalTokens || ((promptTokens || 0) + (completionTokens || 0)).toString(),
  reasoningTokens: reasoningTokens,
Fix totalTokens type/consistency and include reasoning fallback.
Currently totalTokens may be a number (not string) and ignores reasoning in fallback; align with other fields.
Apply this diff:
- promptTokens: promptTokens?.toString() || null,
- completionTokens: completionTokens?.toString() || null,
- totalTokens:
- totalTokens || ((promptTokens || 0) + (completionTokens || 0)).toString(),
+ promptTokens: (safePromptTokens ?? promptTokens)?.toString() || null,
+ completionTokens: (safeCompletionTokens ?? completionTokens)?.toString() || null,
+ totalTokens: (
+ (totalTokens ?? ((safePromptTokens ?? promptTokens ?? 0) + (safeCompletionTokens ?? completionTokens ?? 0) + (reasoningTokens ?? 0)))
+   ).toString(),

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In apps/gateway/src/chat/chat.ts around lines 3084 to 3088, totalTokens can be a
number and currently ignores reasoningTokens when falling back; change
totalTokens to follow the same string-or-null pattern as
promptTokens/completionTokens by ensuring it is computed and stored as a string
(or null) and include reasoningTokens in the fallback sum (i.e., sum
promptTokens, completionTokens and reasoningTokens treating missing values as 0,
then call toString() or set null appropriately) so types and formatting match
the other token fields.
  // Validate size (estimate: base64 adds ~33% overhead)
- const estimatedSize = (base64Data.length * 3) / 4;
+ const estimatedSize = Math.round((base64Data.length * 3) / 4);
Compute base64 byte size precisely (handle padding/whitespace).
Math.round(len*3/4) can over/under-estimate and cause false 20MB limit rejections. Account for '=' padding and possible whitespace without decoding the payload.
Apply this diff:
- const estimatedSize = Math.round((base64Data.length * 3) / 4);
+ const sanitized = base64Data.replace(/\s/g, "");
+ const padding = sanitized.endsWith("==") ? 2 : sanitized.endsWith("=") ? 1 : 0;
+ const estimatedSize = Math.floor((sanitized.length * 3) / 4) - padding;

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
const sanitized = base64Data.replace(/\s/g, "");
const padding = sanitized.endsWith("==") ? 2 : sanitized.endsWith("=") ? 1 : 0;
const estimatedSize = Math.floor((sanitized.length * 3) / 4) - padding;
🤖 Prompt for AI Agents
In packages/models/src/process-image-url.ts around line 31, the current size
estimate uses Math.round(base64.length*3/4) which miscounts when padding or
whitespace are present; instead strip any data URL prefix and all whitespace
from the base64 string, count trailing '=' padding characters (0,1,2) and
compute exact byte length as (cleanLen * 3) / 4 - paddingCount (use integer
math, no decoding), then use that value to enforce the 20MB limit; update the
code to perform these steps so padding/whitespace are handled precisely.
Summary
- Refined token estimation logic in the estimateTokens function

Changes
Token Estimation Logic
- Prompt tokens are estimated only if promptTokens is missing and messages are present
- Completion tokens are estimated only if completionTokens is missing and content is present
- Uses encodeChat for prompt token estimation and encode for completion token estimation

Test plan
🌿 Generated by Terry
ℹ️ Tag @terragon-labs to ask questions and address PR feedback
📎 Task: https://www.terragonlabs.com/task/79437a36-00fa-4c42-b027-987164208a0a
Summary by CodeRabbit
Bug Fixes
Bug Fixes / Data Validation