feat: data copilot improvements #726

epipav · 2025-10-08T13:42:19Z

Data Copilot: Agent Validation, Self-Healing, and Execution Tracking

Overview

Introduces auditor-based validation, self-healing SQL generation, comprehensive agent execution tracking, and improved routing intelligence to reduce failed queries and improve answer accuracy.

Key Features

1. Auditor Agent & Validation Loop

New agent validates data completeness before streaming to users
Checks: column coverage, data quality, time dimensions, granularity matching
Auto-retry loop (max 1 retry) with detailed feedback
Prevents streaming incomplete/invalid results
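
A minimal sketch of the validation loop described above, assuming the auditor returns a pass/fail flag plus feedback text. `AuditResult`, `Rows`, and `runWithAudit` are hypothetical names for illustration; only the "max 1 retry" cap comes from this description.

```typescript
// Hypothetical shapes; the PR's actual auditor schema may differ.
interface AuditResult {
  isValid: boolean
  feedback: string
}

type Rows = Record<string, unknown>[]

// "max 1 retry" as stated in the PR description.
const MAX_AUDITOR_RETRIES = 1

async function runWithAudit(
  fetchData: (previousFeedback?: string) => Promise<Rows>,
  audit: (rows: Rows) => Promise<AuditResult>,
): Promise<{ rows: Rows; validated: boolean; attempts: number }> {
  let feedback: string | undefined
  let rows: Rows = []

  for (let attempt = 0; attempt <= MAX_AUDITOR_RETRIES; attempt++) {
    // On a retry, the previous auditor feedback is passed back to the data step.
    rows = await fetchData(feedback)
    const result = await audit(rows)
    if (result.isValid) {
      return { rows, validated: true, attempts: attempt + 1 }
    }
    feedback = result.feedback
  }

  // Retries exhausted: the caller decides what to show instead of streaming unvalidated rows.
  return { rows, validated: false, attempts: MAX_AUDITOR_RETRIES + 1 }
}
```

The caller would stream `rows` only when `validated` is true.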

2. Ask Clarification Mode

Router can request clarification for ambiguous questions
New ASK_CLARIFICATION action with loop prevention
Handles unclear timeframes, metrics, or dimensions
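
One way loop prevention could work, sketched under the assumption that it means "never ask for clarification twice in a row". The `PROCEED` action, the `Turn` shape, and `applyClarificationGuard` are hypothetical; only `ASK_CLARIFICATION` and `STOP` are named in this PR.

```typescript
// 'PROCEED' is a placeholder for whatever non-clarification action the router would take.
type RouterAction = 'ASK_CLARIFICATION' | 'STOP' | 'PROCEED'

interface RouterDecision {
  action: RouterAction
  clarificationQuestion?: string
}

interface Turn {
  role: 'user' | 'assistant'
  action?: RouterAction
}

function applyClarificationGuard(decision: RouterDecision, history: Turn[]): RouterDecision {
  // Find the most recent assistant turn, if any.
  const lastAssistant = [...history].reverse().find((turn) => turn.role === 'assistant')
  const alreadyAsked = lastAssistant?.action === 'ASK_CLARIFICATION'

  // If the previous assistant turn was already a clarification, do not ask again.
  if (decision.action === 'ASK_CLARIFICATION' && alreadyAsked) {
    return { action: 'PROCEED' }
  }
  return decision
}
```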

3. Agent Execution Tracking

New chat_response_agent_steps table tracks every agent execution
Captures: agent type, tokens, response time (seconds), responses, errors
Separate EXECUTE_INSTRUCTIONS step type distinguishes query execution time from agent thinking time
Enables performance analysis and cost optimization
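
A rough sketch of how one step could be timed and recorded, assuming a single row per agent run. `AgentStep`, `AgentRun`, `trackAgentStep`, and the `saveStep` callback are hypothetical names; the actual columns of chat_response_agent_steps are defined in the migration, not here.

```typescript
// Hypothetical record shape mirroring the fields listed above.
interface AgentStep {
  agentType: string
  tokens: number
  responseTimeSeconds: number
  response?: unknown
  error?: string
}

interface AgentRun<T> {
  output: T
  tokens: number
}

async function trackAgentStep<T>(
  agentType: string,
  run: () => Promise<AgentRun<T>>,
  saveStep: (step: AgentStep) => Promise<void>,
): Promise<T> {
  const startedAt = Date.now()
  // Response time is stored in seconds, as in the PR description.
  const elapsedSeconds = () => (Date.now() - startedAt) / 1000

  try {
    const result = await run()
    await saveStep({
      agentType,
      tokens: result.tokens,
      responseTimeSeconds: elapsedSeconds(),
      response: result.output,
    })
    return result.output
  } catch (err) {
    // Failed runs are recorded too, then the error is rethrown to the caller.
    const message = err instanceof Error ? err.message : String(err)
    await saveStep({ agentType, tokens: 0, responseTimeSeconds: elapsedSeconds(), error: message })
    throw err
  }
}
```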

4. Self-Healing Text-to-SQL

Auto-retry on errors with context and reference docs
Enhanced TinyBird constraints:
- No range-based JOINs (causes errors)
- No UNION/UNION ALL (unsupported)
- Max 3-4 CTEs, use window functions
- Always LIMIT 100
UNION detection in error handler
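
The retry-with-context idea can be sketched as follows. This is an illustration under stated assumptions, not the code in data-copilot.ts: `SqlErrorContext`, `containsUnion`, `generateAndRun`, and the default of one retry are hypothetical.

```typescript
// Hypothetical error context handed back to the SQL generator on retry.
interface SqlErrorContext {
  previousQuery: string
  errorMessage: string
}

// Naive keyword check: it also matches UNION inside string literals or comments.
const UNION_PATTERN = /\bUNION(\s+ALL)?\b/i

function containsUnion(sql: string): boolean {
  return UNION_PATTERN.test(sql)
}

async function generateAndRun(
  generateSql: (context?: SqlErrorContext) => Promise<string>,
  execute: (sql: string) => Promise<unknown[]>,
  maxRetries = 1, // assumed default, not taken from the PR
): Promise<unknown[]> {
  let context: SqlErrorContext | undefined

  for (let attempt = 0; ; attempt++) {
    const sql = await generateSql(context)
    try {
      return await execute(sql)
    } catch (err) {
      if (attempt >= maxRetries) throw err
      const message = err instanceof Error ? err.message : String(err)
      // Add a targeted hint when the failing query used UNION.
      const hint = containsUnion(sql) ? '\nHint: rewrite the query without UNION.' : ''
      context = { previousQuery: sql, errorMessage: message + hint }
    }
  }
}
```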

5. Router Intelligence

Auto-adds "git AND github" filters for repository questions
STOP decision validation checklist prevents premature failures
Tool validation checks pipe capabilities before routing
Anti-pattern examples prevent common mistakes

6. Date Format Enforcement

Pipes require YYYY-MM-DD HH:MM:SS format
Prevents execution errors from malformed dates
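
For illustration, a small helper that produces and checks the required format. `formatPipeDate` and `isPipeDate` are hypothetical names, and formatting in UTC is an assumption; the PR enforces the format through the pipe prompt rather than through code like this.

```typescript
// Formats a Date as YYYY-MM-DD HH:MM:SS (UTC is assumed here).
function formatPipeDate(date: Date): string {
  const pad = (n: number): string => String(n).padStart(2, '0')
  const day = `${date.getUTCFullYear()}-${pad(date.getUTCMonth() + 1)}-${pad(date.getUTCDate())}`
  const time = `${pad(date.getUTCHours())}:${pad(date.getUTCMinutes())}:${pad(date.getUTCSeconds())}`
  return `${day} ${time}`
}

// Shape check only: it does not verify that the date itself exists.
const PIPE_DATE_PATTERN = /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$/

function isPipeDate(value: string): boolean {
  return PIPE_DATE_PATTERN.test(value)
}
```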

Database Changes

5 new migrations, covering:

Conversation ID support
ASK_CLARIFICATION router type
Agent steps tracking table
Nullable router fields for early record creation

Impact

Accuracy: Auditor prevents streaming wrong data
Reliability: Self-healing reduces SQL failures
Observability: Full agent execution telemetry
Cost tracking: Per-agent token usage
UX: Clarifications instead of failures

main
- feat: data copilot text to sql #645
  - feat: data copilot improvements #726 👈


Copilot

Pull Request Overview

This PR implements comprehensive data copilot improvements including auditor-based validation, self-healing SQL generation, agent execution tracking, and improved routing intelligence to enhance query reliability and answer accuracy.

Introduces auditor agent for data validation with auto-retry loop (max 1 retry)
Adds ASK_CLARIFICATION router action with loop prevention for ambiguous questions
Implements comprehensive agent execution tracking with new database table
Enhances text-to-SQL with self-healing retry logic and TinyBird constraint documentation

Reviewed Changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated 6 comments.


File	Description
frontend/server/repo/chat.repo.ts	Adds chat response tracking methods and agent steps logging
frontend/server/middleware/database.ts	Updates database middleware with conditional CM DB pool initialization
frontend/server/api/chat/stream.ts	Refactors stream handler to use single question instead of messages array
frontend/nuxt.config.ts	Adds cmDbEnabled configuration flag
frontend/lib/chat/utils/data-summary.ts	New utility for generating statistical data summaries for auditor validation
frontend/lib/chat/types.ts	Extends types with auditor schemas, ASK_CLARIFICATION action, and SQL error context
frontend/lib/chat/tests/router.test.ts	Refactors router tests to use parameterized test cases and adds ASK_CLARIFICATION tests
frontend/lib/chat/tests/auditor.test.ts	New comprehensive test suite for auditor agent validation logic
frontend/lib/chat/prompts/tinybird-patterns.md	New reference documentation for TinyBird SQL patterns and anti-patterns
frontend/lib/chat/prompts/tinybird-functions.md	New exhaustive TinyBird function reference for SQL generation
frontend/lib/chat/prompts/text-to-sql.ts	Enhanced with error handling context and TinyBird constraint documentation
frontend/lib/chat/prompts/router.ts	Adds clarification loop prevention and repository filtering logic
frontend/lib/chat/prompts/pipe.ts	Enforces YYYY-MM-DD HH:MM:SS date format requirements
frontend/lib/chat/prompts/auditor.ts	New auditor prompt for data validation with statistical analysis
frontend/lib/chat/enums.ts	Adds new stream data types and statuses for auditor and clarification flows
frontend/lib/chat/data-copilot.ts	Major refactor implementing auditor validation loop and execution tracking
frontend/lib/chat/agents/text-to-sql.ts	Updates to support SQL error context for retry scenarios
frontend/lib/chat/agents/router.ts	Increases max steps and adds execute_query tool access
frontend/lib/chat/agents/index.ts	Exports new AuditorAgent
frontend/lib/chat/agents/base-agent.ts	Minor prompt ordering fix
frontend/lib/chat/agents/auditor.ts	New auditor agent for validating data completeness and quality
frontend/app/components/shared/modules/copilot/types/copilot.types.ts	Adds ask_clarification message status and question field
frontend/app/components/shared/modules/copilot/store/copilot.api.service.ts	Removes test data and adds auditor status handling
frontend/app/components/shared/modules/copilot/components/results/results-section.vue	Changes default tab to data view and removes forced chart selection
frontend/app/components/shared/modules/copilot/components/chat-history/chat-result.vue	Simplifies chat result display by removing result selection functionality
database/migrations/V1759927412__makeChatResponsesNullable.sql	Makes router fields nullable for early chat response creation
database/migrations/V1759927411__createChatResponseAgentStepsTable.sql	Creates table for tracking individual agent execution steps
database/migrations/V1759392166__addAskClarificationRouterResponseType.sql	Adds ASK_CLARIFICATION enum value and clarification_question column


Copilot · 2025-10-09T08:04:45Z

frontend/lib/chat/tests/router.test.ts

    const messages: ChatMessage[] = [{ role: 'user', content: userQuery }]

-    console.warn("📝 Creating test input for query:", userQuery)
+    console.warn('📝 Creating test input for query:', userQuery)


[nitpick] Replace console.warn with console.log or console.debug for test logging. console.warn should be reserved for actual warnings.

Suggested change

-    console.warn('📝 Creating test input for query:', userQuery)
+    console.log('📝 Creating test input for query:', userQuery)

Copilot · 2025-10-09T08:04:46Z

frontend/lib/chat/utils/data-summary.ts

+      }
+    }
+    // Date columns (detect date strings)
+    else if (typeof firstValue === 'string' && !isNaN(Date.parse(firstValue))) {


Date.parse() can be unreliable for date detection as it accepts many non-date strings. Consider using a more robust date validation like regex pattern matching for common date formats.
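
A possible shape for the stricter check, assuming the summary only needs to recognize ISO-like dates. `looksLikeDate` and `ISO_LIKE_DATE` are hypothetical names, not part of data-summary.ts. The regex gates the format first, then `Date.parse` rejects values such as month 13.

```typescript
// Accepts YYYY-MM-DD, optionally followed by a time (T or space separator) and a UTC offset.
const ISO_LIKE_DATE =
  /^\d{4}-\d{2}-\d{2}(?:[T ]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?(?:Z|[+-]\d{2}:\d{2})?)?$/

function looksLikeDate(value: unknown): boolean {
  if (typeof value !== 'string') return false
  const trimmed = value.trim()
  // Format gate first, so loose strings never reach Date.parse.
  if (!ISO_LIKE_DATE.test(trimmed)) return false
  // Normalize the space separator to 'T' before parsing.
  return !Number.isNaN(Date.parse(trimmed.replace(' ', 'T')))
}
```

Values like `'Project 1'` or `'12'` fail the format gate regardless of how the runtime's `Date.parse` treats them.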

Copilot · 2025-10-09T08:04:46Z

frontend/lib/chat/prompts/text-to-sql.ts

+function loadReferenceDoc(filename: string): string {
+  try {
+    const path = join(__dirname, filename)
+    return readFileSync(path, 'utf-8')
+  } catch (error) {
+    console.warn(`Failed to load ${filename}:`, error)
+    return ''
+  }
+}


The __dirname usage may not work correctly in all Node.js environments (especially with bundlers). Consider using import.meta.url or a more reliable path resolution method.
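
A sketch of the `import.meta.url` variant, assuming the code runs as an ES module under Node.js. Passing the base URL as a parameter is a choice made for this example so the function stays testable; the original signature takes only `filename`. Bundlers may still relocate files, so this is not guaranteed to work in every build setup.

```typescript
import { readFileSync } from 'node:fs'
import { fileURLToPath } from 'node:url'

// baseUrl is expected to be the caller's import.meta.url (a file: URL).
function loadReferenceDoc(filename: string, baseUrl: string): string {
  try {
    // Resolve the file relative to the calling module, then convert to a filesystem path.
    const path = fileURLToPath(new URL(filename, baseUrl))
    return readFileSync(path, 'utf-8')
  } catch (error) {
    console.warn(`Failed to load ${filename}:`, error)
    return ''
  }
}
```

In an ES module the call would look like `loadReferenceDoc('tinybird-patterns.md', import.meta.url)`.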

Copilot · 2025-10-09T08:04:47Z

frontend/lib/chat/data-copilot.ts

+          content: `Previous attempt did not produce valid results. Auditor feedback: ${previousFeedback}. \n
+                    Please adjust your approach based on this feedback.`,


[nitpick] The multiline string has inconsistent indentation. Either use proper template literal formatting or normalize the whitespace.

Suggested change

-    content: `Previous attempt did not produce valid results. Auditor feedback: ${previousFeedback}. \n
-    Please adjust your approach based on this feedback.`,
+    content: `Previous attempt did not produce valid results. Auditor feedback: ${previousFeedback}.
+    Please adjust your approach based on this feedback.`,

Copilot · 2025-10-09T08:04:47Z

frontend/lib/chat/data-copilot.ts

+            enhancedErrorMessage = `${errorMessage}\n\nCRITICAL: Your query contains UNION or UNION ALL, 
+            which is NOT supported by TinyBird's SQL API. This is likely causing the error. 
+            You MUST rewrite the query WITHOUT using UNION. 
+            Instead:\n
+            - Return a single result set with all data\n
+            - Use CASE statements to categorize different data types\n
+            - Add a 'type' or 'category' column to distinguish different aggregations\n
+            - Do NOT attempt to combine multiple SELECTs with UNION`


[nitpick] The multiline error message has inconsistent formatting and mixing of template literals with manual line breaks. Consider using a dedicated template or normalize the formatting.

Suggested change

enhancedErrorMessage = `${errorMessage}\n\nCRITICAL: Your query contains UNION or UNION ALL,

which is NOT supported by TinyBird's SQL API. This is likely causing the error.

You MUST rewrite the query WITHOUT using UNION.

Instead:\n

- Return a single result set with all data\n

- Use CASE statements to categorize different data types\n

- Add a 'type' or 'category' column to distinguish different aggregations\n

- Do NOT attempt to combine multiple SELECTs with UNION`

enhancedErrorMessage = `${errorMessage}

CRITICAL: Your query contains UNION or UNION ALL, which is NOT supported by TinyBird's SQL API. This is likely causing the error.

You MUST rewrite the query WITHOUT using UNION.

Instead:

- Return a single result set with all data

- Use CASE statements to categorize different data types

- Add a 'type' or 'category' column to distinguish different aggregations

- Do NOT attempt to combine multiple SELECTs with UNION`
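
An alternative that avoids whitespace issues inside a template literal altogether is to build the message from an array of lines. This is a sketch only; `buildUnionErrorMessage` is a hypothetical helper, and the wording is copied from the message above.

```typescript
function buildUnionErrorMessage(errorMessage: string): string {
  // Each array element becomes exactly one line, so source indentation never leaks into the prompt.
  return [
    errorMessage,
    '',
    "CRITICAL: Your query contains UNION or UNION ALL, which is NOT supported by TinyBird's SQL API. This is likely causing the error.",
    'You MUST rewrite the query WITHOUT using UNION.',
    'Instead:',
    '- Return a single result set with all data',
    '- Use CASE statements to categorize different data types',
    "- Add a 'type' or 'category' column to distinguish different aggregations",
    '- Do NOT attempt to combine multiple SELECTs with UNION',
  ].join('\n')
}
```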

Copilot · 2025-10-09T08:04:47Z

frontend/app/components/shared/modules/copilot/store/copilot.api.service.ts

+            (data.type === 'router-status' || data.type === 'auditor-status') &&
+            (
+             data.status === 'complete' || 
+             data.status === 'error' || 
+             data.status === 'ask_clarification' || 
+             data.status === 'validated' 
+            )


[nitpick] The condition is complex and hard to read. Consider extracting the status checks into a constant array or helper function for better maintainability.
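
The extraction could look roughly like this. `STATUS_EVENT_TYPES`, `HANDLED_STATUSES`, `StreamEvent`, and `isHandledStatusEvent` are hypothetical names; the string values are the ones in the condition above.

```typescript
const STATUS_EVENT_TYPES: readonly string[] = ['router-status', 'auditor-status']
const HANDLED_STATUSES: readonly string[] = ['complete', 'error', 'ask_clarification', 'validated']

// Minimal event shape assumed for this sketch.
interface StreamEvent {
  type: string
  status?: string
}

function isHandledStatusEvent(data: StreamEvent): boolean {
  // Same logic as the inline condition: type is one of the two status types AND status is in the list.
  return (
    STATUS_EVENT_TYPES.includes(data.type) &&
    data.status !== undefined &&
    HANDLED_STATUSES.includes(data.status)
  )
}
```

Adding a new status then means editing one array instead of extending a chain of `||` comparisons.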

epipav added 6 commits October 2, 2025 09:53

fix: compact question->routing tests, prompt improvements

346c4c9

Signed-off-by: anilb <[email protected]>

feat: clarification questions by router agent

6992159

Signed-off-by: anilb <[email protected]>

feat: auditor agent and router agent ask_clarification mode

c98ea36

Signed-off-by: anilb <[email protected]>

feat(chat): quality improvements

2a830a7

- self healing text-to-sql agent
- text-to-sql agent sql api restriction improvements
- improved ui
- auditor and pipe agent improvements

Signed-off-by: anilb <[email protected]>

feat: step logging agent details

5cfddc9

Signed-off-by: anilb <[email protected]>

fix: types and linting

2c027ab

Signed-off-by: anilb <[email protected]>

epipav requested review from Copilot and joanagmaia and removed request for Copilot October 8, 2025 13:42

github-actions bot mentioned this pull request Oct 8, 2025

feat: data copilot text to sql #645

Closed

joanagmaia approved these changes Oct 8, 2025


Resolved review threads:
- frontend/lib/chat/prompts/router.ts (outdated)
- frontend/lib/chat/prompts/tinybird-functions.md

fix: add gitlab and gerrit to code platforms

16ec34f

Signed-off-by: anilb <[email protected]>

Copilot AI review requested due to automatic review settings October 9, 2025 08:03

epipav merged commit 5d113ec into feature/data-copilot-text-to-sql Oct 9, 2025
5 checks passed

epipav deleted the feature/data-copilot-ask-clarification-route branch October 9, 2025 08:04

Copilot AI reviewed Oct 9, 2025
