docs(changelog): add context-aware auto routing #691

steebchen · 2025-08-31T00:12:12Z

Summary

Introduces smart context-aware model selection for the auto routing model
Automatically estimates context size requirements based on input messages, tools, and completion needs
Selects the most cost-effective model that meets the context and provider constraints

Changes

Core Functionality

Context Estimation: Analyzes conversation history, input messages, and tool definitions to calculate total context size
Model Filtering: Filters models by context capacity and provider limits to prevent context size errors
Cost Optimization: Chooses the cheapest suitable model automatically
Token Counting: Uses advanced tokenization for precise context size estimation
Fallback Handling: Maintains robust fallback to ensure responses even with unusual requests

User Experience

Seamless integration with existing API calls using model: "auto"
No configuration changes required for users
Improved reliability and cost efficiency without impacting existing workflows

Test plan

Verified that requests with varying context sizes are routed to appropriate models
Confirmed fallback behavior works under edge cases
Tested cost optimization by comparing selected models for different request sizes
Ensured no breaking changes for existing API integrations

This enhancement lays the groundwork for future improvements in intelligent model selection based on additional factors like task complexity and response quality.

🌿 Generated by Terry

ℹ️ Tag @terragon-labs to ask questions and address PR feedback

📎 Task: https://www.terragonlabs.com/task/e24f4239-4318-40cb-892b-9b31c0af160b

Summary by CodeRabbit

Documentation
- Added changelog entry “Introducing Smart Auto Routing: Context-Aware Model Selection” (2025-08-31).
- Highlights: automatic context estimation, intelligent model filtering by context capacity, cost-optimized selection, precise token counting with provider limits, tool/function-call awareness, completion buffer handling, and fallback protection.
- Benefits: fewer context-size failures, improved cost efficiency and performance.
- Getting started: enabled automatically when using model “auto”.
- Includes overview, how it works, what’s next, and an accompanying image.

Introduce smart auto routing that selects models based on request context size requirements. This enhancement improves performance, reduces failures due to context size limits, and optimizes cost by choosing the most economical suitable model automatically when using `model: "auto"`. - Automatic context estimation from input messages, tools, and completion needs - Intelligent filtering of models by context capacity - Cost-optimized model selection - Precise token counting and provider context limit respect - Robust fallback behavior to ensure response delivery This feature requires no changes from users and works transparently to improve routing intelligence and efficiency. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

coderabbitai · 2025-08-31T00:12:18Z

Walkthrough

Adds a new changelog entry dated 2025-08-31 documenting “Smart Auto Routing: Context-Aware Model Selection” for the auto model, covering context estimation, model filtering, cost-aware selection, precise token counting, tool considerations, completion buffer handling, and fallback behavior, with metadata (id 17, slug) and an image reference.

Changes

Cohort / File(s)	Summary
Changelog content `apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md`	Adds a changelog post describing context-aware auto routing: context estimation, model filtering by capacity, cost-optimized selection, token counting with provider limits, tool/function-call context handling, completion buffer, and fallback. Includes metadata (title, summary, date, id 17, slug) and image path `/changelog/auto-routing-context-aware.png`.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor User
    participant UI as App/UI (model: "auto")
    participant GW as Gateway (Auto Router)
    participant Prov as Model Providers
    participant Model as Selected Model

    User->>UI: Send messages + tools + requirements
    UI->>GW: Request with history, tools, max_tokens
    note over GW: 1) Estimate total context (tokenization)<br/>2) Filter providers/models by context limits<br/>3) Reserve completion buffer
    GW->>Prov: Query candidate models
    Prov-->>GW: Capabilities + limits + pricing
    GW->>GW: Choose cost-optimized model meeting constraints
    GW->>Model: Send request (messages + tool defs)
    alt Success
        Model-->>GW: Completion
        GW-->>UI: Response
        UI-->>User: Display output
    else Context/other failure
        note over GW: Fallback protection to alternate candidate
        GW->>Model: Retry with next viable model
        Model-->>GW: Completion
        GW-->>UI: Response
        UI-->>User: Display output
    end

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

feat(gateway): add context size checking for auto model selection #660 — Implements context-size estimation and provider/model filtering aligning with the documented auto-routing behavior.
docs(routing): add comprehensive Routing documentation page #690 — Adds detailed routing documentation corresponding to this changelog entry.

Suggested reviewers

smakosh

✨ Finishing Touches

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch terragon/add-auto-routing-changelog

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (6)

apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md (6)
6-6: Use backticks around the model name in summary

Improves consistency with the rest of the doc where code/params are backticked.

Apply:
-summary: "The 'auto' routing model now intelligently selects models based on your request's context size requirements, ensuring optimal performance and cost efficiency."
+summary: "The `auto` routing model now intelligently selects models based on your request's context size requirements, ensuring optimal performance and cost efficiency."
14-15: Tighten phrasing; avoid “request requirements” redundancy

Readability nit.

Apply:
-We're excited to announce a major enhancement to our **'auto' routing model** - introducing **context-aware model selection** that automatically chooses the best model for your specific request requirements.
+We're excited to announce a major enhancement to our **`auto` routing model**—introducing **context-aware model selection** that automatically chooses the best model for your request.
18-23: Minor wording polish to reduce repetition

Two bullets end with “requirements.” Consider slight tightening.

Apply:
-**Cost-Optimized Selection**: Among suitable models, the system selects the most cost-effective option that meets your requirements.
+**Cost-Optimized Selection**: Among suitable models, the system selects the most cost-effective option that meets your needs.
30-31: Parameter formatting is good; consider clarifying buffer scope

Optional: add that reserved space also accounts for tool/function-calling tokens when applicable.

Proposed tweak:
-**Completion Buffer**: Automatically accounts for response length requirements, including your specified `max_tokens` parameter.
+**Completion Buffer**: Automatically reserves space for the response (your `max_tokens`) and, when applicable, tool/function-calling tokens.
36-41: Optional cross-link

If we have a docs page describing token counting, link it here for credibility.

52-59: Add a minimal “Getting started” example

A short snippet helps readers copy/paste.

Add after Line 59:
+### Example
+```bash
+curl -X POST https://api.your-gateway.example/v1/chat/completions \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "auto",
+    "messages": [{"role":"user","content":"Summarize this thread."}]
+  }'
+```

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between beb9a2d and 5ae5cdd.

📒 Files selected for processing (1)

apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md (1 hunks)

🧰 Additional context used

🪛 LanguageTool

apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md

[grammar] ~16-~16: There might be a mistake here.
Context: ...uirements. ## 🧠 Smart Context Analysis Automatic Context Estimation: The auto...

(QB_NEW_EN)

[grammar] ~24-~24: There might be a mistake here.
Context: ...s your requirements. ## 🎯 How It Works Message Analysis: The system analyzes ...

(QB_NEW_EN)

[grammar] ~42-~42: There might be a mistake here.
Context: ...th unusual requirements. ## 💡 Benefits Reduced Failures: Eliminates context s...

(QB_NEW_EN)

[grammar] ~52-~52: There might be a mistake here.
Context: ... optimal results. ## 🚀 Getting Started Already Available: This enhancement is...

(QB_NEW_EN)

[grammar] ~60-~60: There might be a mistake here.
Context: ...ting API integration. ## 🔮 What's Next This context-aware routing represents th...

(QB_NEW_EN)

🔇 Additional comments (2)

apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md (2)

62-66: LGTM overall

Clear, user-focused changelog; no blocking issues.

24-33: Emoji headings align with prevalent changelog style
No changes needed.

coderabbitai · 2025-09-01T21:40:14Z

apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md

+id: "17"
+slug: "auto-routing-model-context-aware"
+date: "2025-08-31"
+title: "Introducing Smart Auto Routing: Context-Aware Model Selection"
+summary: "The 'auto' routing model now intelligently selects models based on your request's context size requirements, ensuring optimal performance and cost efficiency."
+image:
+  src: "/changelog/auto-routing-context-aware.png"
+  alt: "Auto routing model with context-aware selection on LLM Gateway"
+  width: 1768
+  height: 677
+---


💡 Verification agent

🧩 Analysis chain

Frontmatter consistency: quote style and identifier type

Consider using backticks for code-ish terms in summary (use auto instead of 'auto').

Verify whether id is expected to be a number (unquoted) across other changelog entries; stay consistent with the repo’s precedent.

Run to check frontmatter conventions, duplicate ids/slugs, and image existence:

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Inspect how ids are declared elsewhere (quoted vs unquoted) echo "ID style by file:" rg -nP '^id:\s*"?(\d+)"?' apps/ui/src/content/changelog | sed -E 's/.*id:\s*"?([0-9]+)"?.*/\1/' | wc -l rg -nP '^id:\s*"?(\d+)"?' apps/ui/src/content/changelog -n | head -n 20 # 2) Check for duplicate ids echo "Duplicate IDs (if any):" rg -nP '^id:\s*"?(\d+)"?' apps/ui/src/content/changelog \ | sed -E 's/.*id:\s*"?([0-9]+)".*/\1/' \ | sort | uniq -d # 3) Check for duplicate slugs echo "Duplicate slugs (if any):" rg -nP '^slug:\s*"(.*)"' apps/ui/src/content/changelog \ | sed -E 's/.*slug:\s*"(.*)".*/\1/' \ | sort | uniq -d # 4) Ensure image asset exists somewhere echo "Image search:" fd -HI 'auto-routing-context-aware.png' || true

Length of output: 2033

Add missing image asset and refine summary styling

Keep id: "17" quoted—matches all other changelog entries.

Change the summary to use backticks for the code-ish term:
summary: "The `auto` routing model now intelligently selects models based on your request’s context size requirements…"

Add the referenced image file (auto-routing-context-aware.png) to the repository so that image.src: "/changelog/auto-routing-context-aware.png" resolves correctly.

🤖 Prompt for AI Agents

In apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md around lines 2–12, keep id: "17" quoted as-is, update the summary value to wrap the term auto in backticks (e.g. summary: "The `auto` routing model now intelligently selects models based on your request’s context size requirements…"), and add the missing image file at apps/ui/public/changelog/auto-routing-context-aware.png (or the repo’s static assets location) so image.src: "/changelog/auto-routing-context-aware.png" resolves correctly; ensure the image is the correct dimensions and referenced path matches the site’s static asset structure.

bunnyshell · 2025-09-01T21:48:07Z

❌ Preview Environment deleted from Bunnyshell

Available commands (reply to this comment):

🚀 /bns:deploy to deploy the environment

bunnyshell · 2025-09-04T00:31:34Z

❌ Preview Environment deleted from Bunnyshell

Available commands (reply to this comment):

🚀 /bns:deploy to deploy the environment

steebchen · 2025-09-22T12:58:27Z

/bns:deploy

bunnyshell · 2025-09-22T12:58:30Z

💬 Bunnyshell reply to comment:

No Preview Environment was referenced in message/thread

bunnyshell · 2025-09-22T12:59:06Z

❌ Preview Environment deleted from Bunnyshell

Available commands (reply to this comment):

🚀 /bns:deploy to deploy the environment

chore(autofix): apply diff

b990d9d

steebchen changed the title ~~feat: Add context-aware model selection to auto routing~~ docs(changelog): add context-aware auto routing Aug 31, 2025

Merge branch 'main' into terragon/add-auto-routing-changelog

5ae5cdd

steebchen marked this pull request as ready for review September 1, 2025 21:36

coderabbitai bot reviewed Sep 1, 2025

View reviewed changes

steebchen closed this Sep 1, 2025

steebchen reopened this Sep 1, 2025

steebchen closed this Sep 3, 2025

steebchen reopened this Sep 3, 2025

steebchen closed this Sep 4, 2025

steebchen reopened this Sep 4, 2025

Merge branch 'main' into terragon/add-auto-routing-changelog

6c9fc80

bunnyshell bot deleted a comment from steebchen Sep 22, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

docs(changelog): add context-aware auto routing #691

docs(changelog): add context-aware auto routing #691

Uh oh!

steebchen commented Aug 31, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Aug 31, 2025 •

edited

Loading

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Status, Documentation and Community

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Sep 1, 2025

Uh oh!

bunnyshell bot commented Sep 1, 2025 •

edited

Loading

Uh oh!

bunnyshell bot commented Sep 4, 2025 •

edited

Loading

Uh oh!

steebchen commented Sep 22, 2025

Uh oh!

bunnyshell bot commented Sep 22, 2025

Uh oh!

bunnyshell bot commented Sep 22, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

docs(changelog): add context-aware auto routing #691

Are you sure you want to change the base?

docs(changelog): add context-aware auto routing #691

Uh oh!

Conversation

steebchen commented Aug 31, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Core Functionality

User Experience

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Aug 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Status, Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 1, 2025

Choose a reason for hiding this comment

Uh oh!

bunnyshell bot commented Sep 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

❌ Preview Environment deleted from Bunnyshell

Uh oh!

bunnyshell bot commented Sep 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

❌ Preview Environment deleted from Bunnyshell

Uh oh!

steebchen commented Sep 22, 2025

Uh oh!

bunnyshell bot commented Sep 22, 2025

💬 Bunnyshell reply to comment:

Uh oh!

bunnyshell bot commented Sep 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

❌ Preview Environment deleted from Bunnyshell

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

steebchen commented Aug 31, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 31, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

bunnyshell bot commented Sep 1, 2025 •

edited

Loading

bunnyshell bot commented Sep 4, 2025 •

edited

Loading

bunnyshell bot commented Sep 22, 2025 •

edited

Loading