feat: add maxResultsPerQuery option to limit search results per query #1056
Open
CrossAndHorsesRanch wants to merge 7 commits into ItzCrazyKns:master from
Adds an optional maxResultsPerQuery field to SearchAgentConfig that limits the number of search results passed to the LLM per query. This is useful for self-hosted deployments using local LLMs with smaller context windows, where large result sets cause excessive prompt token counts and slow inference. When not set, behavior is unchanged.

Changes:
- types.ts: Add maxResultsPerQuery to SearchAgentConfig and AdditionalConfig
- researcher/index.ts: Thread maxResultsPerQuery through to executeAll
- actions/webSearch.ts: Apply slice before mapping results
- actions/academicSearch.ts: Apply slice before mapping results
- actions/socialSearch.ts: Apply slice before mapping results
- app/api/search/route.ts: Accept from request body and pass to config
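For readers following along, here is a minimal sketch of the "slice before mapping" idea from the description. The `SearchResult` shape, the `limitResults` helper, and the chunk layout are assumptions made for illustration; they are not taken from the PR's diff.

```typescript
// Hypothetical result shape; the real type in the repository may differ.
interface SearchResult {
  title: string;
  url: string;
  content: string;
}

// Returns the first `max` results, or all of them when `max` is not set.
function limitResults(results: SearchResult[], max?: number): SearchResult[] {
  return max === undefined ? results : results.slice(0, max);
}

const sample: SearchResult[] = [
  { title: "A", url: "https://example.com/a", content: "first" },
  { title: "B", url: "https://example.com/b", content: "second" },
  { title: "C", url: "https://example.com/c", content: "third" },
];

// Slice first, then map, so discarded results are never converted to chunks.
const chunks = limitResults(sample, 2).map((r) => ({
  content: r.content,
  metadata: { title: r.title, url: r.url },
}));

// With no limit configured, every result passes through unchanged.
const unlimited = limitResults(sample);
```

Slicing before the map keeps the unset case identical to the previous behavior, which matches the "when not set, behavior is unchanged" claim above.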
Contributor
1 issue found across 6 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/lib/agents/search/researcher/index.ts">
<violation number="1" location="src/lib/agents/search/researcher/index.ts:170">
P2: `maxResultsPerQuery` is forwarded from request input without validation, and is used directly in `.slice(...)`, allowing invalid values (e.g., negatives) to break the expected per-query result limit.</violation>
</file>
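To make the validation concern concrete, the sketch below shows how `Array.prototype.slice` treats a few unvalidated end values, followed by one possible guard. `sanitizeLimit` and `capped` are hypothetical names used only for illustration, not functions from this PR.

```typescript
const results = ["r1", "r2", "r3", "r4", "r5"];

// How slice treats unvalidated end values:
const negative = results.slice(0, -2); // counts from the end: keeps 3 items
const fractional = results.slice(0, 2.9); // end is truncated to 2: keeps 2 items
const notANumber = results.slice(0, NaN); // NaN is treated as 0: keeps nothing

// Hypothetical guard: accept only positive integers, otherwise apply no limit.
function sanitizeLimit(value: unknown): number | undefined {
  return typeof value === "number" && Number.isInteger(value) && value > 0
    ? value
    : undefined;
}

// Applies the limit only when it passed validation.
function capped<T>(items: T[], rawLimit: unknown): T[] {
  const limit = sanitizeLimit(rawLimit);
  return limit === undefined ? items : items.slice(0, limit);
}
```

Falling back to "no limit" for invalid input is one design choice; rejecting the request with an error would be another reasonable option.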
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Add one-off context when rerunning by tagging @cubic-dev-ai with guidance or docs links (including llms.txt)
- Ask questions if you need clarification on any suggestion
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
…rQuery Per-query slicing alone is insufficient when multiple queries run in parallel and accumulate into a shared results array. This adds a final slice at the return statement in all three search actions to enforce the maxResultsPerQuery limit on the total result set.
Contributor
3 issues found across 3 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/lib/agents/search/researcher/actions/academicSearch.ts">
<violation number="1" location="src/lib/agents/search/researcher/actions/academicSearch.ts:126">
P2: `maxResultsPerQuery` is applied twice, turning a per-query limit into an unintended global cap on final returned results.</violation>
</file>
<file name="src/lib/agents/search/researcher/actions/socialSearch.ts">
<violation number="1" location="src/lib/agents/search/researcher/actions/socialSearch.ts:126">
P2: `maxResultsPerQuery` is incorrectly applied as a final global cap, causing loss of multi-query results and unstable subset selection.</violation>
</file>
<file name="src/lib/agents/search/researcher/actions/webSearch.ts">
<violation number="1" location="src/lib/agents/search/researcher/actions/webSearch.ts:179">
P2: The new final slice applies maxResultsPerQuery to the merged results, effectively capping total results across all queries. This contradicts the documented per‑query limit and makes returned results dependent on async completion order.</violation>
</file>
The final .slice() on the merged results array incorrectly turned a per-query limit into a global cap across all queries. With Promise.all, async completion order is non-deterministic, making the subset selection unstable. The per-query slice in the search() function is sufficient to enforce maxResultsPerQuery semantics.
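The difference between a per-query limit and a global cap can be illustrated with a small synchronous simulation. Instead of real async calls, the completion order is passed in explicitly; `mergeInCompletionOrder` and the sample data are hypothetical and only model results being pushed to a shared array as each query finishes.

```typescript
type ResultsByQuery = Record<string, string[]>;

// Appends each query's (per-query capped) results in the order queries finished.
function mergeInCompletionOrder(
  resultsByQuery: ResultsByQuery,
  completionOrder: string[],
  perQueryLimit: number,
): string[] {
  const shared: string[] = [];
  for (const query of completionOrder) {
    const hits = resultsByQuery[query] ?? [];
    shared.push(...hits.slice(0, perQueryLimit));
  }
  return shared;
}

const data: ResultsByQuery = {
  q1: ["q1-a", "q1-b", "q1-c"],
  q2: ["q2-a", "q2-b", "q2-c"],
};

// Same queries and same per-query limit, two different completion orders.
const q1First = mergeInCompletionOrder(data, ["q1", "q2"], 2);
const q2First = mergeInCompletionOrder(data, ["q2", "q1"], 2);

// A further global slice keeps only whichever query happened to finish first.
const globalCapA = q1First.slice(0, 2);
const globalCapB = q2First.slice(0, 2);
```

With only the per-query slice, both orders contain the same four results. With the extra global slice, the surviving results depend entirely on which query completed first.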
…ultsPerQuery Per-query slicing alone is insufficient because results accumulate in the LLM message history across iterations. Each tool result is serialized with JSON.stringify and fed back into the next LLM call, causing prompt token counts to grow with each iteration. This truncates search_results actions before serialization when maxResultsPerQuery is set.
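A rough sketch of truncating before serialization follows. The `ActionOutput` shape, its `results` field, and `serializeForHistory` are assumptions for illustration; only the `search_results` type name and the use of `JSON.stringify` come from the commit message above.

```typescript
// Hypothetical action-output shape; the field names are assumptions.
interface ActionOutput {
  type: string;
  results?: unknown[];
}

// Serializes an action output for the message history, trimming
// search_results payloads when a limit is configured.
function serializeForHistory(output: ActionOutput, max?: number): string {
  if (output.type === "search_results" && output.results && max !== undefined) {
    return JSON.stringify({ ...output, results: output.results.slice(0, max) });
  }
  return JSON.stringify(output);
}

const toolOutput: ActionOutput = {
  type: "search_results",
  results: ["r1", "r2", "r3", "r4"],
};

const trimmed = serializeForHistory(toolOutput, 2);
const untouched = serializeForHistory(toolOutput);
```

Because the trimmed copy is built with a spread, the original action output is left intact for any other consumer; only the string fed back to the LLM shrinks.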
Contributor
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/lib/agents/search/researcher/index.ts">
<violation number="1" location="src/lib/agents/search/researcher/index.ts:181">
P2: Tool message truncation uses a truthy check instead of the sanitized positive-integer guard used for executeAll, so invalid-but-truthy values can alter what the LLM sees while actions execute untruncated.</violation>
</file>
…cation The truthy check on maxResultsPerQuery was inconsistent with the guard used in executeAll, allowing invalid-but-truthy values (e.g. strings, floats) to alter what the LLM sees while actions executed untruncated. Aligns the condition with the Number.isInteger + > 0 guard used elsewhere.
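The mismatch described here can be reproduced in a few lines. This is a simplified illustration with made-up variable names, not the code from researcher/index.ts: a float such as 2.5 fails the `Number.isInteger` guard but still passes a truthy check.

```typescript
const results = ["r1", "r2", "r3", "r4"];
const raw: number = 2.5; // invalid as a limit, but truthy

// Guarded path (the check described for executeAll): invalid means no limit.
const limit = Number.isInteger(raw) && raw > 0 ? raw : undefined;
const executed = limit === undefined ? results : results.slice(0, limit);

// Truthy check: the slice still runs, and slice truncates 2.5 to 2.
const seenByLlmTruthy = raw ? results.slice(0, raw) : results;

// Aligned: reuse the same sanitized limit for the tool message.
const seenByLlmAligned = limit === undefined ? results : results.slice(0, limit);
```

Under the truthy check the actions run with four results while the LLM sees two; with the aligned guard both paths agree.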
maxResultsPerQuery slices results per Vane action but does not bound the combined result count when multiple search engines are active, since SearXNG merges and deduplicates results across engines before returning them. With two engines (e.g. brave + bing), the merged result set exceeded the per-query cap in agentMessageHistory, causing prompt token counts to regress to pre-fix levels (2,388 tokens, 81s vs 38s). maxTotalResults applies a post-merge cap at the agentMessageHistory serialization point, providing a hard bound on what the LLM sees regardless of how many engines are configured. Falls back to maxResultsPerQuery if maxTotalResults is not set, preserving backward compatibility.
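The fallback rule could look roughly like the sketch below. The `Limits` interface and helper names are hypothetical, and validating `maxTotalResults` with the same positive-integer guard is an assumption; the commit message only states the fallback order.

```typescript
// Hypothetical config subset carrying the two limits named above.
interface Limits {
  maxResultsPerQuery?: number;
  maxTotalResults?: number;
}

// Assumed guard: the same positive-integer check used for maxResultsPerQuery.
const isPositiveInt = (v: unknown): v is number =>
  typeof v === "number" && Number.isInteger(v) && v > 0;

// Prefer maxTotalResults, fall back to maxResultsPerQuery, else no cap.
function effectiveTotalCap(limits: Limits): number | undefined {
  const total = limits.maxTotalResults;
  if (isPositiveInt(total)) return total;
  const perQuery = limits.maxResultsPerQuery;
  if (isPositiveInt(perQuery)) return perQuery;
  return undefined;
}

// Applies the cap to the already merged result list.
function capMerged<T>(merged: T[], limits: Limits): T[] {
  const cap = effectiveTotalCap(limits);
  return cap === undefined ? merged : merged.slice(0, cap);
}

const merged = ["r1", "r2", "r3", "r4", "r5", "r6"];
const withFallback = capMerged(merged, { maxResultsPerQuery: 3 });
const withTotal = capMerged(merged, { maxResultsPerQuery: 3, maxTotalResults: 4 });
const uncapped = capMerged(merged, {});
```

Applying the cap after the merge is what bounds the prompt regardless of how many engines contribute to a single query.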
Author
Contributor
@CrossAndHorsesRanch I have started the AI code review. It will take a few minutes to complete.
Adds an optional maxResultsPerQuery to limit how many search results are sent to the LLM per query, reducing prompt size and inference time for small-context models. The value is accepted by the API, validated in the researcher, and applied before chunking and before serializing tool messages.

Changes
- Add maxResultsPerQuery?: number to SearchAgentConfig and AdditionalConfig (types.ts).
- Accept maxResultsPerQuery in the search API request and forward it (src/app/api/search/route.ts).
- Validate maxResultsPerQuery as a positive integer in the researcher (researcher/index.ts).
- Slice results in webSearch, academicSearch, and socialSearch before mapping to chunks.
- Truncate search_results in tool message history using the same positive-integer guard to keep prompts bounded.