feat: data copilot improvements #726
Signed-off-by: anilb <[email protected]>
- self healing text-to-sql agent
- text-to-sql agent sql api restriction improvements
- improved ui
- auditor and pipe agent improvements
Signed-off-by: anilb <[email protected]>
Pull Request Overview
This PR implements comprehensive data copilot improvements including auditor-based validation, self-healing SQL generation, agent execution tracking, and improved routing intelligence to enhance query reliability and answer accuracy.
- Introduces auditor agent for data validation with auto-retry loop (max 1 retry)
- Adds ASK_CLARIFICATION router action with loop prevention for ambiguous questions
- Implements comprehensive agent execution tracking with new database table
- Enhances text-to-SQL with self-healing retry logic and TinyBird constraint documentation
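The auditor loop from the first bullet can be pictured roughly as follows. This is a minimal sketch rather than the PR's code: `runWithAudit`, `AuditResult`, `AuditedRun`, and the `generate`/`audit` callbacks are hypothetical names, and only the cap of one retry comes from the overview.

```typescript
// Hypothetical shapes; the PR's real auditor schema (frontend/lib/chat/types.ts) may differ.
interface AuditResult {
  isValid: boolean
  feedback: string
}

interface AuditedRun<T> {
  data: T | undefined
  validated: boolean
  attempts: number
  lastFeedback: string | undefined
}

const MAX_AUDITOR_RETRIES = 1 // the overview states "max 1 retry"

// Generates data, asks the auditor to validate it, and retries with the
// auditor's feedback until the retry budget is spent.
async function runWithAudit<T>(
  generate: (previousFeedback?: string) => Promise<T>,
  audit: (data: T) => Promise<AuditResult>,
): Promise<AuditedRun<T>> {
  let feedback: string | undefined
  let data: T | undefined
  for (let attempt = 1; attempt <= MAX_AUDITOR_RETRIES + 1; attempt++) {
    data = await generate(feedback)
    const verdict = await audit(data)
    if (verdict.isValid) {
      return { data, validated: true, attempts: attempt, lastFeedback: feedback }
    }
    feedback = verdict.feedback // handed to the next generation attempt
  }
  return { data, validated: false, attempts: MAX_AUDITOR_RETRIES + 1, lastFeedback: feedback }
}
```

Capping the loop with a constant keeps the worst case at two generation calls per question, which bounds both latency and token cost.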
Reviewed Changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 6 comments.
File | Description |
---|---|
frontend/server/repo/chat.repo.ts | Adds chat response tracking methods and agent steps logging |
frontend/server/middleware/database.ts | Updates database middleware with conditional CM DB pool initialization |
frontend/server/api/chat/stream.ts | Refactors stream handler to use single question instead of messages array |
frontend/nuxt.config.ts | Adds cmDbEnabled configuration flag |
frontend/lib/chat/utils/data-summary.ts | New utility for generating statistical data summaries for auditor validation |
frontend/lib/chat/types.ts | Extends types with auditor schemas, ASK_CLARIFICATION action, and SQL error context |
frontend/lib/chat/tests/router.test.ts | Refactors router tests to use parameterized test cases and adds ASK_CLARIFICATION tests |
frontend/lib/chat/tests/auditor.test.ts | New comprehensive test suite for auditor agent validation logic |
frontend/lib/chat/prompts/tinybird-patterns.md | New reference documentation for TinyBird SQL patterns and anti-patterns |
frontend/lib/chat/prompts/tinybird-functions.md | New exhaustive TinyBird function reference for SQL generation |
frontend/lib/chat/prompts/text-to-sql.ts | Enhanced with error handling context and TinyBird constraint documentation |
frontend/lib/chat/prompts/router.ts | Adds clarification loop prevention and repository filtering logic |
frontend/lib/chat/prompts/pipe.ts | Enforces YYYY-MM-DD HH:MM:SS date format requirements |
frontend/lib/chat/prompts/auditor.ts | New auditor prompt for data validation with statistical analysis |
frontend/lib/chat/enums.ts | Adds new stream data types and statuses for auditor and clarification flows |
frontend/lib/chat/data-copilot.ts | Major refactor implementing auditor validation loop and execution tracking |
frontend/lib/chat/agents/text-to-sql.ts | Updates to support SQL error context for retry scenarios |
frontend/lib/chat/agents/router.ts | Increases max steps and adds execute_query tool access |
frontend/lib/chat/agents/index.ts | Exports new AuditorAgent |
frontend/lib/chat/agents/base-agent.ts | Minor prompt ordering fix |
frontend/lib/chat/agents/auditor.ts | New auditor agent for validating data completeness and quality |
frontend/app/components/shared/modules/copilot/types/copilot.types.ts | Adds ask_clarification message status and question field |
frontend/app/components/shared/modules/copilot/store/copilot.api.service.ts | Removes test data and adds auditor status handling |
frontend/app/components/shared/modules/copilot/components/results/results-section.vue | Changes default tab to data view and removes forced chart selection |
frontend/app/components/shared/modules/copilot/components/chat-history/chat-result.vue | Simplifies chat result display by removing result selection functionality |
database/migrations/V1759927412__makeChatResponsesNullable.sql | Makes router fields nullable for early chat response creation |
database/migrations/V1759927411__createChatResponseAgentStepsTable.sql | Creates table for tracking individual agent execution steps |
database/migrations/V1759392166__addAskClarificationRouterResponseType.sql | Adds ASK_CLARIFICATION enum value and clarification_question column |
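The `YYYY-MM-DD HH:MM:SS` format listed for `frontend/lib/chat/prompts/pipe.ts` can be produced and checked with a few lines of TypeScript. The sketch below is illustrative only: `toSqlDateTime` and `isSqlDateTime` are hypothetical helpers, and formatting in UTC is an assumption, since the PR does not state a timezone.

```typescript
// Formats a Date as "YYYY-MM-DD HH:MM:SS". UTC is an assumption, not taken from the PR.
function toSqlDateTime(date: Date): string {
  const pad = (n: number, width = 2): string => String(n).padStart(width, '0')
  const day = `${pad(date.getUTCFullYear(), 4)}-${pad(date.getUTCMonth() + 1)}-${pad(date.getUTCDate())}`
  const time = `${pad(date.getUTCHours())}:${pad(date.getUTCMinutes())}:${pad(date.getUTCSeconds())}`
  return `${day} ${time}`
}

// Shape check only: confirms the layout, not that the date exists on the calendar.
const SQL_DATETIME = /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$/

function isSqlDateTime(value: string): boolean {
  return SQL_DATETIME.test(value)
}
```

A check like `isSqlDateTime` could guard parameters before they reach a pipe, so that an ISO string with a `T` separator is caught early instead of failing at query time.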
const messages: ChatMessage[] = [{ role: 'user', content: userQuery }]

- console.warn("📝 Creating test input for query:", userQuery)
+ console.warn('📝 Creating test input for query:', userQuery)
Copilot AI, Oct 9, 2025
[nitpick] Replace console.warn with console.log or console.debug for test logging. console.warn should be reserved for actual warnings.
- console.warn('📝 Creating test input for query:', userQuery)
+ console.log('📝 Creating test input for query:', userQuery)
  }
}
// Date columns (detect date strings)
else if (typeof firstValue === 'string' && !isNaN(Date.parse(firstValue))) {
Copilot AI, Oct 9, 2025
Date.parse() can be unreliable for date detection as it accepts many non-date strings. Consider using a more robust date validation like regex pattern matching for common date formats.
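One way to follow this suggestion is to check the string's shape first and only then confirm that the calendar date exists. The sketch below is an illustration under assumptions: `isDateString` and `DATE_LIKE` are hypothetical names, and the accepted formats (ISO-style dates with an optional time) are a guess at what the data contains, not something the PR specifies.

```typescript
// Accepts YYYY-MM-DD, optionally followed by a time (space or "T" separator),
// optional seconds and fractional seconds, and an optional "Z" or ±HH:MM offset.
// The format list is an assumption; extend it to match the real data.
const DATE_LIKE =
  /^(\d{4})-(\d{2})-(\d{2})(?:[T ](\d{2}):(\d{2})(?::(\d{2})(?:\.\d+)?)?(?:Z|[+-]\d{2}:\d{2})?)?$/

function isDateString(value: unknown): boolean {
  if (typeof value !== 'string') return false
  const match = DATE_LIKE.exec(value.trim())
  if (!match) return false

  const year = Number(match[1])
  const month = Number(match[2])
  const day = Number(match[3])
  // Round-trip through Date.UTC to reject impossible dates such as 2025-02-30.
  const probe = new Date(Date.UTC(year, month - 1, day))
  if (
    probe.getUTCFullYear() !== year ||
    probe.getUTCMonth() !== month - 1 ||
    probe.getUTCDate() !== day
  ) {
    return false
  }

  // Time parts default to 0 when the string is date-only.
  const hour = Number(match[4] ?? 0)
  const minute = Number(match[5] ?? 0)
  const second = Number(match[6] ?? 0)
  return hour <= 23 && minute <= 59 && second <= 59
}
```

Because the pattern is anchored at both ends, free-form strings that merely contain digits are rejected before any date parsing happens.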
function loadReferenceDoc(filename: string): string {
  try {
    const path = join(__dirname, filename)
    return readFileSync(path, 'utf-8')
  } catch (error) {
    console.warn(`Failed to load ${filename}:`, error)
    return ''
  }
}
Copilot AI, Oct 9, 2025
The __dirname usage may not work correctly in all Node.js environments (especially with bundlers). Consider using import.meta.url or a more reliable path resolution method.
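A possible shape for the `import.meta.url` approach is sketched below. It is an assumption-laden variant, not the PR's code: the extra `moduleUrl` parameter and the `siblingUrl` helper are hypothetical, and the caller is expected to pass its own `import.meta.url`, which is only available in ES modules.

```typescript
import { readFileSync } from 'node:fs'
import { fileURLToPath } from 'node:url'

// Resolves `filename` relative to the module identified by `moduleUrl`.
function siblingUrl(filename: string, moduleUrl: string): URL {
  return new URL(filename, moduleUrl)
}

// Same contract as the reviewed function: returns '' when the file cannot be read.
// `moduleUrl` is expected to be the caller's import.meta.url.
function loadReferenceDoc(filename: string, moduleUrl: string): string {
  try {
    const path = fileURLToPath(siblingUrl(filename, moduleUrl))
    return readFileSync(path, 'utf-8')
  } catch (error) {
    console.warn(`Failed to load ${filename}:`, error)
    return ''
  }
}
```

A caller would write something like `loadReferenceDoc('tinybird-patterns.md', import.meta.url)`. This only fixes path resolution; whether the `.md` files are present next to the bundled output still depends on the build configuration.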
content: `Previous attempt did not produce valid results. Auditor feedback: ${previousFeedback}. \n
Please adjust your approach based on this feedback.`,
Copilot AI, Oct 9, 2025
[nitpick] The multiline string has inconsistent indentation. Either use proper template literal formatting or normalize the whitespace.
- content: `Previous attempt did not produce valid results. Auditor feedback: ${previousFeedback}. \n
- Please adjust your approach based on this feedback.`,
+ content: `Previous attempt did not produce valid results. Auditor feedback: ${previousFeedback}.
+ Please adjust your approach based on this feedback.`,
enhancedErrorMessage = `${errorMessage}\n\nCRITICAL: Your query contains UNION or UNION ALL,
which is NOT supported by TinyBird's SQL API. This is likely causing the error.
You MUST rewrite the query WITHOUT using UNION.
Instead:\n
- Return a single result set with all data\n
- Use CASE statements to categorize different data types\n
- Add a 'type' or 'category' column to distinguish different aggregations\n
- Do NOT attempt to combine multiple SELECTs with UNION`
Copilot AI, Oct 9, 2025
[nitpick] The multiline error message has inconsistent formatting and mixing of template literals with manual line breaks. Consider using a dedicated template or normalize the formatting.
- enhancedErrorMessage = `${errorMessage}\n\nCRITICAL: Your query contains UNION or UNION ALL,
- which is NOT supported by TinyBird's SQL API. This is likely causing the error.
- You MUST rewrite the query WITHOUT using UNION.
- Instead:\n
- - Return a single result set with all data\n
- - Use CASE statements to categorize different data types\n
- - Add a 'type' or 'category' column to distinguish different aggregations\n
- - Do NOT attempt to combine multiple SELECTs with UNION`
+ enhancedErrorMessage = `${errorMessage}
+ CRITICAL: Your query contains UNION or UNION ALL, which is NOT supported by TinyBird's SQL API. This is likely causing the error.
+ You MUST rewrite the query WITHOUT using UNION.
+ Instead:
+ - Return a single result set with all data
+ - Use CASE statements to categorize different data types
+ - Add a 'type' or 'category' column to distinguish different aggregations
+ - Do NOT attempt to combine multiple SELECTs with UNION`
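A "dedicated template" could also take the form of an array of lines joined with `\n`, which keeps source indentation out of the message entirely. The sketch below reuses the wording from the snippet above; `UNION_GUIDANCE` and `withUnionGuidance` are hypothetical names, and keeping a blank line between the original error and the guidance is an assumption.

```typescript
// Guidance text copied from the PR's message; one array entry per output line.
const UNION_GUIDANCE: readonly string[] = [
  "CRITICAL: Your query contains UNION or UNION ALL, which is NOT supported by TinyBird's SQL API. This is likely causing the error.",
  'You MUST rewrite the query WITHOUT using UNION.',
  'Instead:',
  '- Return a single result set with all data',
  '- Use CASE statements to categorize different data types',
  "- Add a 'type' or 'category' column to distinguish different aggregations",
  '- Do NOT attempt to combine multiple SELECTs with UNION',
]

// Appends the guidance to the original error, separated by one blank line.
function withUnionGuidance(errorMessage: string): string {
  return [errorMessage, '', ...UNION_GUIDANCE].join('\n')
}
```

With this layout the line breaks in the prompt are explicit, so reindenting the surrounding code cannot change what the model receives.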
(data.type === 'router-status' || data.type === 'auditor-status') &&
(
  data.status === 'complete' ||
  data.status === 'error' ||
  data.status === 'ask_clarification' ||
  data.status === 'validated'
)
Copilot AI, Oct 9, 2025
[nitpick] The condition is complex and hard to read. Consider extracting the status checks into a constant array or helper function for better maintainability.
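The extraction suggested here might look like the sketch below. The type and status strings are taken from the condition above; `StreamEvent`, the set names, and `isTerminalStatusEvent` are hypothetical, and calling these statuses "terminal" is an assumption about what the surrounding branch does.

```typescript
// Minimal event shape assumed for illustration; the real stream data type may carry more fields.
interface StreamEvent {
  type: string
  status?: string
}

const STATUS_EVENT_TYPES: ReadonlySet<string> = new Set(['router-status', 'auditor-status'])
const TERMINAL_STATUSES: ReadonlySet<string> = new Set([
  'complete',
  'error',
  'ask_clarification',
  'validated',
])

// Equivalent to the inline condition: a status event whose status is in the listed set.
function isTerminalStatusEvent(data: StreamEvent): boolean {
  return (
    STATUS_EVENT_TYPES.has(data.type) &&
    data.status !== undefined &&
    TERMINAL_STATUSES.has(data.status)
  )
}
```

Adding a new status then becomes a one-line change to a set instead of another `||` clause.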
Data Copilot: Agent Validation, Self-Healing, and Execution Tracking
Overview
Introduces auditor-based validation, self-healing SQL generation, comprehensive agent execution tracking, and improved routing intelligence to reduce failed queries and improve answer accuracy.
Key Features
1. Auditor Agent & Validation Loop
2. Ask Clarification Mode
- ASK_CLARIFICATION action with loop prevention
3. Agent Execution Tracking
- chat_response_agent_steps table tracks every agent execution
- EXECUTE_INSTRUCTIONS type for query execution vs agent thinking time
4. Self-Healing Text-to-SQL
5. Router Intelligence
6. Date Format Enforcement
- YYYY-MM-DD HH:MM:SS format
Database Changes
5 new migrations:
Impact
Accuracy: Auditor prevents streaming wrong data
Reliability: Self-healing reduces SQL failures
Observability: Full agent execution telemetry
Cost tracking: Per-agent token usage
UX: Clarifications instead of failures