Writing Convex Evals

This document outlines the workflow and best practices for creating effective Convex evaluation tests.

Eval Structure

Each eval consists of:

- A TASK.txt that describes the task
- An answer/ directory containing the reference solution
- Within answer/convex/:
  - schema.ts - The database schema
  - Other necessary files (e.g., public.ts, private.ts)
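
The layout above can be checked mechanically before an eval is submitted. The following is a minimal sketch, not part of the repository's tooling: the helper name `missingEvalFiles` and the constant `REQUIRED_EVAL_FILES` are hypothetical, and the required paths are taken only from the list above.

```typescript
// Hypothetical helper: the required paths mirror the layout described above.
const REQUIRED_EVAL_FILES: readonly string[] = [
  "TASK.txt",
  "answer/convex/schema.ts",
];

/**
 * Returns the required files that are absent from `existingPaths`.
 * Paths are relative to the eval's root directory; backslashes are
 * normalized so Windows-style paths compare equal.
 */
function missingEvalFiles(existingPaths: readonly string[]): string[] {
  const present = new Set(existingPaths.map((p) => p.replace(/\\/g, "/")));
  return REQUIRED_EVAL_FILES.filter((required) => !present.has(required));
}

// Usage: an eval that has a solution file but no schema yet.
// Reports answer/convex/schema.ts as missing.
console.log(missingEvalFiles(["TASK.txt", "answer/convex/public.ts"]));
```

Keeping the check a pure function over a list of paths leaves the directory walk to whatever script calls it.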

Implementation Workflow

Create the Eval Structure

```
# If creating a new category
mkdir -p evals/<category-number>-<category-name>

# Create the eval
python3 create_eval.py <eval_name> <category>
```

Write the Prompt
- Create detailed requirements in TASK.txt
- Include the complete schema
- Specify all function names and their requirements

Implement the Solution
1. Create schema.ts first
2. Run codegen to generate types:
```
cd evals/<category>/<eval>/answer && bunx convex codegen
```
3. Implement the solution files (e.g., public.ts)
4. Run codegen again after any schema change

TypeScript Setup Notes
- Always run codegen after creating or modifying the schema
- Linter errors about indexes (e.g., "not assignable to parameter of type 'never'") usually mean the generated types are missing or stale; rerun codegen
- Some common type errors and their fixes:
  - Index fields: run codegen to resolve
  - Optional parameters: use v.optional(v.string()) in args
  - Return types: may need explicit type annotations

Creating New Evals

Use create_eval.py to scaffold a new eval:
```
python3 create_eval.py <eval_name> <category>
```
This creates the basic directory structure and initializes a Convex project.

Categories should be organized by concept, for example:
- 000-fundamentals/ - Basic Convex concepts
- 001-data_modeling/ - Schema design and relationships
- 002-mutations/ - Data modification patterns
- 003-queries/ - Query patterns and optimization
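
The category names above follow a visible pattern: a three-digit, zero-padded number, a hyphen, and a lowercase name with underscores. The sketch below builds such a name. The function `categoryDirName` is hypothetical, and the padding width and slug rules are assumptions inferred from the four examples rather than a documented convention.

```typescript
/**
 * Builds a category directory name such as "002-mutations".
 * Assumption: the numeric prefix is zero-padded to three digits and
 * spaces in the name become underscores, as in the examples above.
 */
function categoryDirName(index: number, name: string): string {
  if (!Number.isInteger(index) || index < 0 || index > 999) {
    throw new RangeError(`category index out of range: ${index}`);
  }
  // Lowercase the name and collapse whitespace runs into single underscores.
  const slug = name.trim().toLowerCase().replace(/\s+/g, "_");
  return `${String(index).padStart(3, "0")}-${slug}`;
}

console.log(categoryDirName(1, "Data Modeling")); // 001-data_modeling
```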

Writing Good Prompts

Be Explicit About Schema
- Always provide the complete schema in the prompt
- Use TypeScript code blocks for clarity
- Include comments explaining field purposes

Clear Requirements
- List specific functions to implement
- For each function, specify:
  - Exact function name
  - Required arguments and their types
  - Expected return type/structure
  - Any specific behaviors or edge cases to handle

Test Data Requirements
- Specify minimum number of test records
- Define required data variations
- Include specific scenarios to test

Implementation Guidelines
- Highlight required patterns or approaches
- Specify what NOT to do
- Note performance considerations

Example Prompt Structure

Given this schema:

```typescript
export default defineSchema({
  // Schema definition with comments
});
```

Write two functions:

1. A mutation named `insertX` that:

   - Specific requirements
   - Test data to insert
   - Edge cases to handle

2. A query named `getY` that:
   - Input parameters
   - Return structure
   - Required behavior
   - Performance considerations

Your solution should:

- Technical requirements
- Patterns to use/avoid
- Error handling expectations
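
One way to keep prompts consistent with the structure above is to render each function's requirements from a small structured spec, so that no name, argument, or return type is left out. This is only a sketch: the `FunctionSpec` shape and `renderFunctionSpec` are hypothetical names invented for this example, not part of the eval tooling.

```typescript
// Hypothetical spec shape; the field names are illustrative only.
interface FunctionSpec {
  kind: "query" | "mutation";
  name: string; // exact exported function name
  args: Record<string, string>; // argument name -> type description
  returns: string; // expected return type or structure
  behaviors: string[]; // required behavior and edge cases
}

/** Renders one numbered entry in the style of the prompt structure above. */
function renderFunctionSpec(spec: FunctionSpec, position: number): string {
  const argLines = Object.entries(spec.args).map(
    ([argName, argType]) => `   - Argument \`${argName}\`: ${argType}`,
  );
  return [
    `${position}. A ${spec.kind} named \`${spec.name}\` that:`,
    ...argLines,
    `   - Returns: ${spec.returns}`,
    ...spec.behaviors.map((behavior) => `   - ${behavior}`),
  ].join("\n");
}

// Usage with made-up requirements for the `getY` placeholder.
const getYSpec: FunctionSpec = {
  kind: "query",
  name: "getY",
  args: { limit: "optional number, defaults to 10" },
  returns: "an array of documents, newest first",
  behaviors: ["Returns an empty array when the table is empty"],
};
console.log(renderFunctionSpec(getYSpec, 2));
```

Because every field of the spec is required by the type, an incomplete requirement fails at compile time instead of reaching TASK.txt.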

Common Eval Types

Data Modeling
- Table relationships (1:1, 1:N, N:M)
- Index design
- Schema validation

Query Patterns
- Basic CRUD operations
- Index usage
- Filtering and sorting
- Joins and relationships
- Aggregation and grouping

Performance
- Efficient index usage
- Parallel fetching
- Batch operations
- Query optimization
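
To make "parallel fetching" concrete, the sketch below contrasts sequential and parallel lookups in plain TypeScript. It deliberately avoids the Convex API: `fakeGet` and the in-memory `USERS` map are stand-ins invented for this example. In a Convex handler the lookup would typically be a call such as `ctx.db.get(id)`, but the `Promise.all` pattern is the same.

```typescript
type User = { id: string; name: string };

// In-memory stand-in for a database table; not a Convex API.
const USERS = new Map<string, User>([
  ["u1", { id: "u1", name: "Ada" }],
  ["u2", { id: "u2", name: "Grace" }],
]);

// Simulates an async document lookup that yields null for unknown ids.
async function fakeGet(id: string): Promise<User | null> {
  return USERS.get(id) ?? null;
}

// Sequential: each lookup waits for the previous one to finish.
async function fetchSequential(ids: string[]): Promise<(User | null)[]> {
  const results: (User | null)[] = [];
  for (const id of ids) {
    results.push(await fakeGet(id));
  }
  return results;
}

// Parallel: all lookups start immediately; results keep the order of `ids`.
async function fetchParallel(ids: string[]): Promise<(User | null)[]> {
  return Promise.all(ids.map((id) => fakeGet(id)));
}
```

An eval targeting this concept could state in its prompt that the sequential form is not acceptable, which makes the requirement testable by inspection.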

Best Practices

Focused Testing
- Each eval should test ONE main concept
- Include related patterns only if they support the main concept
- Keep requirements focused and clear

Realistic Scenarios
- Use real-world examples when possible
- Make data requirements meaningful
- Include common edge cases

Clear Success Criteria
- Make requirements explicit and testable
- Include both functional and technical requirements
- Specify performance expectations when relevant

Progressive Complexity
- Order evals from simple to complex within categories
- Build on previous concepts
- Include stretch goals or optional optimizations

Common Pitfalls to Avoid

Ambiguous Requirements
- Don't leave function names unspecified
- Don't use vague terms like "appropriate" without context
- Always specify exact field names and types

Over-complication
- Don't test multiple concepts in one eval
- Don't require complex setup for simple concepts
- Keep schemas focused on the tested concept

Missing Context
- Don't assume knowledge of specific patterns
- Include relevant documentation references
- Explain performance implications

Untestable Requirements
- Make success criteria measurable
- Specify exact return types
- Include specific test cases
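
Some of these pitfalls can be caught with a simple text check on TASK.txt. The sketch below flags vague wording. The function name `findVagueTerms` and the word list are assumptions made for illustration; only "appropriate" comes from the list above.

```typescript
// Illustrative word list; only "appropriate" is taken from the text above.
const VAGUE_TERMS: readonly string[] = [
  "appropriate",
  "suitable",
  "as needed",
  "properly",
];

/** Returns the vague terms that occur in the prompt text (case-insensitive). */
function findVagueTerms(taskText: string): string[] {
  const lower = taskText.toLowerCase();
  return VAGUE_TERMS.filter((term) => lower.includes(term));
}

// Usage: the first prompt is flagged, the second states a measurable behavior.
console.log(findVagueTerms("Return an appropriate error"));
console.log(findVagueTerms("Throw an error if `limit` is negative"));
```

A check like this only surfaces candidates for review; whether a flagged term is acceptable still depends on the surrounding context.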

Solution File Organization

Schema First

```typescript
// schema.ts
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  // Tables and indexes
});
```
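
As an illustration of what the placeholder might hold, the sketch below defines two hypothetical tables with commented fields and one index each. The table names, field names, and index names are assumptions made for this example and do not come from any existing eval. It needs the convex package installed to type-check.

```typescript
// schema.ts (hypothetical tables, for illustration only)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  users: defineTable({
    name: v.string(), // display name
    email: v.string(), // looked up through the by_email index
  }).index("by_email", ["email"]),

  messages: defineTable({
    authorId: v.id("users"), // reference to the authoring user
    body: v.string(), // message text
    editedAt: v.optional(v.number()), // ms since epoch; absent if never edited
  }).index("by_author", ["authorId"]),
});
```

After adding or changing an index like `by_author`, rerun codegen so that queries using it type-check.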

Function Files

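```typescript
// public.ts
import { v } from "convex/values";
import { mutation, query } from "./_generated/server";
import { Id } from "./_generated/dataModel"; // If using IDs

export const myFunction = query({
  args: {}, // Argument validators built with `v` go here
  handler: async (ctx) => {
    // Function body
  },
});
```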

Common Patterns
- Keep mutations and related queries in the same file
- Group related functionality together
- Include type imports from generated files
- Add helpful comments for complex logic
