docs(changelog): add context-aware auto routing #691
Introduce smart auto routing that selects models based on request context size requirements. This enhancement improves performance, reduces failures due to context size limits, and optimizes cost by choosing the most economical suitable model automatically when using `model: "auto"`.

- Automatic context estimation from input messages, tools, and completion needs
- Intelligent filtering of models by context capacity
- Cost-optimized model selection
- Precise token counting and provider context limit respect
- Robust fallback behavior to ensure response delivery

This feature requires no changes from users and works transparently to improve routing intelligence and efficiency.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
Walkthrough
Adds a new changelog entry dated 2025-08-31 documenting “Smart Auto Routing: Context-Aware Model Selection” for the auto model, covering context estimation, model filtering, cost-aware selection, precise token counting, tool considerations, completion buffer handling, and fallback behavior, with metadata (id 17, slug) and an image reference.
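The estimate, filter, and cost-aware selection steps described above can be illustrated with a minimal sketch. It is not the gateway's implementation: `ModelInfo`, `estimate_tokens`, `required_context`, `pick_model`, the 4-characters-per-token heuristic, and all catalog values are assumptions made up for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelInfo:
    """Hypothetical catalog entry; field names and values are illustrative."""
    name: str
    context_limit: int          # total tokens the model accepts (prompt + completion)
    price_per_1k_tokens: float  # made-up blended price


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: about 4 characters per token."""
    return max(1, (len(text) + 3) // 4)


def required_context(messages: list, tools: list, max_tokens: int) -> int:
    """Prompt tokens + serialized tool definitions + reserved completion buffer."""
    prompt = sum(estimate_tokens(m["content"]) for m in messages)
    tool_tokens = sum(estimate_tokens(json.dumps(t)) for t in tools)
    return prompt + tool_tokens + max_tokens


def pick_model(models: list, needed: int) -> Optional[ModelInfo]:
    """Keep models whose context limit fits, then take the cheapest one."""
    suitable = [m for m in models if m.context_limit >= needed]
    return min(suitable, key=lambda m: m.price_per_1k_tokens, default=None)


# Usage with an invented three-model catalog.
catalog = [
    ModelInfo("small", 8_000, 0.1),
    ModelInfo("medium", 32_000, 0.5),
    ModelInfo("large", 128_000, 2.0),
]
messages = [{"role": "user", "content": "x" * 40_000}]
needed = required_context(messages, tools=[], max_tokens=1_000)
print(needed, pick_model(catalog, needed))
```

Reserving `max_tokens` before filtering is what keeps a model from being chosen when the prompt alone fits but the response would overflow its limit.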
Sequence Diagram(s)
sequenceDiagram
autonumber
actor User
participant UI as App/UI (model: "auto")
participant GW as Gateway (Auto Router)
participant Prov as Model Providers
participant Model as Selected Model
User->>UI: Send messages + tools + requirements
UI->>GW: Request with history, tools, max_tokens
note over GW: 1) Estimate total context (tokenization)<br/>2) Filter providers/models by context limits<br/>3) Reserve completion buffer
GW->>Prov: Query candidate models
Prov-->>GW: Capabilities + limits + pricing
GW->>GW: Choose cost-optimized model meeting constraints
GW->>Model: Send request (messages + tool defs)
alt Success
Model-->>GW: Completion
GW-->>UI: Response
UI-->>User: Display output
else Context/other failure
note over GW: Fallback protection to alternate candidate
GW->>Model: Retry with next viable model
Model-->>GW: Completion
GW-->>UI: Response
UI-->>User: Display output
end
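The fallback branch of the diagram can be sketched as a loop over viable candidates. This is only an illustration under assumptions: `ModelCallError`, `complete_with_fallback`, and the `send` callback are hypothetical names, not part of the gateway's API.

```python
from typing import Callable, Iterable


class ModelCallError(Exception):
    """Hypothetical error for a failed model call (context overflow or other)."""


def complete_with_fallback(candidates: Iterable[str],
                           send: Callable[[str], str]) -> str:
    """Try each candidate in order; return the first successful completion."""
    last_error = None
    for name in candidates:
        try:
            return send(name)
        except ModelCallError as exc:
            last_error = exc  # remember the failure and move to the next model
    raise ModelCallError("all candidate models failed") from last_error


# Usage: the first (cheapest) candidate fails, the next one answers.
def fake_send(name: str) -> str:
    if name == "cheap":
        raise ModelCallError("context limit exceeded")
    return f"ok from {name}"


print(complete_with_fallback(["cheap", "backup"], fake_send))
```

Passing the candidates already sorted by cost keeps the retry on the next most economical model rather than jumping straight to the largest one.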
Estimated code review effort
🎯 1 (Trivial) | ⏱️ ~3 minutes
Actionable comments posted: 1
🧹 Nitpick comments (6)
apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md (6)
6-6: Use backticks around the model name in summary
Improves consistency with the rest of the doc where code/params are backticked.
Apply:
-summary: "The 'auto' routing model now intelligently selects models based on your request's context size requirements, ensuring optimal performance and cost efficiency."
+summary: "The `auto` routing model now intelligently selects models based on your request's context size requirements, ensuring optimal performance and cost efficiency."
14-15: Tighten phrasing; avoid “request requirements” redundancy
Readability nit.
Apply:
-We're excited to announce a major enhancement to our **'auto' routing model** - introducing **context-aware model selection** that automatically chooses the best model for your specific request requirements.
+We're excited to announce a major enhancement to our **`auto` routing model**—introducing **context-aware model selection** that automatically chooses the best model for your request.
18-23: Minor wording polish to reduce repetition
Two bullets end with “requirements.” Consider slight tightening.
Apply:
-**Cost-Optimized Selection**: Among suitable models, the system selects the most cost-effective option that meets your requirements.
+**Cost-Optimized Selection**: Among suitable models, the system selects the most cost-effective option that meets your needs.
30-31: Parameter formatting is good; consider clarifying buffer scope
Optional: add that reserved space also accounts for tool/function-calling tokens when applicable.
Proposed tweak:
-**Completion Buffer**: Automatically accounts for response length requirements, including your specified `max_tokens` parameter.
+**Completion Buffer**: Automatically reserves space for the response (your `max_tokens`) and, when applicable, tool/function-calling tokens.
36-41: Optional cross-link
If we have a docs page describing token counting, link it here for credibility.
52-59: Add a minimal “Getting started” example
A short snippet helps readers copy/paste.
Add after Line 59:
+### Example
+```bash
+curl -X POST https://api.your-gateway.example/v1/chat/completions \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "auto",
+    "messages": [{"role":"user","content":"Summarize this thread."}]
+  }'
+```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md
[grammar] ~16-~16: There might be a mistake here.
Context: ...uirements. ## 🧠 Smart Context Analysis Automatic Context Estimation: The auto...
(QB_NEW_EN)
[grammar] ~24-~24: There might be a mistake here.
Context: ...s your requirements. ## 🎯 How It Works Message Analysis: The system analyzes ...
(QB_NEW_EN)
[grammar] ~42-~42: There might be a mistake here.
Context: ...th unusual requirements. ## 💡 Benefits Reduced Failures: Eliminates context s...
(QB_NEW_EN)
[grammar] ~52-~52: There might be a mistake here.
Context: ... optimal results. ## 🚀 Getting Started Already Available: This enhancement is...
(QB_NEW_EN)
[grammar] ~60-~60: There might be a mistake here.
Context: ...ting API integration. ## 🔮 What's Next This context-aware routing represents th...
(QB_NEW_EN)
🔇 Additional comments (2)
apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md (2)
62-66: LGTM overall
Clear, user-focused changelog; no blocking issues.
24-33: Emoji headings align with prevalent changelog style
No changes needed.
id: "17"
slug: "auto-routing-model-context-aware"
date: "2025-08-31"
title: "Introducing Smart Auto Routing: Context-Aware Model Selection"
summary: "The 'auto' routing model now intelligently selects models based on your request's context size requirements, ensuring optimal performance and cost efficiency."
image:
  src: "/changelog/auto-routing-context-aware.png"
  alt: "Auto routing model with context-aware selection on LLM Gateway"
  width: 1768
  height: 677
---
💡 Verification agent
🧩 Analysis chain
Frontmatter consistency: quote style and identifier type
- Consider using backticks for code-ish terms in summary (use `auto` instead of 'auto').
- Verify whether `id` is expected to be a number (unquoted) across other changelog entries; stay consistent with the repo’s precedent.

Run to check frontmatter conventions, duplicate ids/slugs, and image existence:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect how ids are declared elsewhere (quoted vs unquoted)
echo "ID style by file:"
rg -nP '^id:\s*"?(\d+)"?' apps/ui/src/content/changelog | sed -E 's/.*id:\s*"?([0-9]+)"?.*/\1/' | wc -l
rg -nP '^id:\s*"?(\d+)"?' apps/ui/src/content/changelog -n | head -n 20
# 2) Check for duplicate ids
echo "Duplicate IDs (if any):"
rg -nP '^id:\s*"?(\d+)"?' apps/ui/src/content/changelog \
| sed -E 's/.*id:\s*"?([0-9]+)".*/\1/' \
| sort | uniq -d
# 3) Check for duplicate slugs
echo "Duplicate slugs (if any):"
rg -nP '^slug:\s*"(.*)"' apps/ui/src/content/changelog \
| sed -E 's/.*slug:\s*"(.*)".*/\1/' \
| sort | uniq -d
# 4) Ensure image asset exists somewhere
echo "Image search:"
fd -HI 'auto-routing-context-aware.png' || true
Length of output: 2033
Add missing image asset and refine summary styling
- Keep `id: "17"` quoted; this matches all other changelog entries.
- Change the summary to use backticks for the code-ish term:
  summary: "The `auto` routing model now intelligently selects models based on your request’s context size requirements…"
- Add the referenced image file (auto-routing-context-aware.png) to the repository so that `image.src: "/changelog/auto-routing-context-aware.png"` resolves correctly.
🤖 Prompt for AI Agents
In apps/ui/src/content/changelog/2025-08-31-auto-routing-model-context-aware.md
around lines 2–12, keep id: "17" quoted as-is, update the summary value to wrap
the term auto in backticks (e.g. summary: "The `auto` routing model now
intelligently selects models based on your request’s context size
requirements…"), and add the missing image file at
apps/ui/public/changelog/auto-routing-context-aware.png (or the repo’s static
assets location) so image.src: "/changelog/auto-routing-context-aware.png"
resolves correctly; ensure the image is the correct dimensions and referenced
path matches the site’s static asset structure.
❌ Preview Environment deleted from Bunnyshell
Available commands (reply to this comment):
/bns:deploy
💬 Bunnyshell reply to comment: No Preview Environment was referenced in message/thread
Summary
`auto` routing model
Changes
Core Functionality
User Experience
`model: "auto"`
Test plan
This enhancement lays the groundwork for future improvements in intelligent model selection based on additional factors like task complexity and response quality.
🌿 Generated by Terry
ℹ️ Tag @terragon-labs to ask questions and address PR feedback
📎 Task: https://www.terragonlabs.com/task/e24f4239-4318-40cb-892b-9b31c0af160b