This document outlines the workflow and best practices for creating effective Convex evaluation tests.
Each eval consists of:
- A `TASK.txt` that describes the task
- An `answer/` directory containing the reference solution
- Within `answer/convex/`:
  - `schema.ts` - The database schema
  - Other necessary files (e.g., `public.ts`, `private.ts`)
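The layout above can be checked mechanically before an eval is committed. The following is a minimal sketch, assuming the eval's files are available as a list of paths relative to the eval directory; the helper name `missingEvalFiles` and the constant `REQUIRED_FILES` are hypothetical and not part of the repository's tooling.

```typescript
// Hypothetical helper, not part of the repository's tooling.
// Paths are relative to a single eval directory and use "/" separators.
const REQUIRED_FILES: string[] = ["TASK.txt", "answer/convex/schema.ts"];

// Returns the required files that are absent from the given listing.
function missingEvalFiles(paths: string[]): string[] {
  return REQUIRED_FILES.filter((required) => paths.indexOf(required) === -1);
}

// Example listing for an eval that has a schema but no task description yet.
const exampleListing: string[] = [
  "answer/convex/schema.ts",
  "answer/convex/public.ts",
];
const missingFiles = missingEvalFiles(exampleListing); // ["TASK.txt"]
```

Only `TASK.txt` and `schema.ts` are treated as required here, since the other files (such as `public.ts`) vary from eval to eval.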
- Create the Eval Structure

      # If creating a new category
      mkdir -p evals/<category-number>-<category-name>
      # Create the eval
      python3 create_eval.py <eval_name> <category>

- Write the Prompt
  - Create detailed requirements in `TASK.txt`
  - Include complete schema
  - Specify all function names and requirements
- Implement the Solution
  - Create `schema.ts` first
  - Run codegen to generate types: `cd evals/<category>/<eval>/answer && bunx convex codegen`
  - Implement solution files (e.g., `public.ts`)
  - Run codegen again after any schema changes
- TypeScript Setup Notes
  - Always run codegen after creating or modifying the schema
  - Linter errors about indexes (e.g., "not assignable to parameter of type 'never'") indicate missing codegen
  - Some common type errors and their fixes:
    - Index fields: Requires codegen to resolve
    - Optional parameters: Use `v.optional(v.string())` in args
    - Return types: May need explicit type annotations
- Use `create_eval.py` to scaffold: `python3 create_eval.py <eval_name> <category>`
  This creates the basic directory structure and initializes a Convex project.
- Categories should be organized by concept:
  - `000-fundamentals/` - Basic Convex concepts
  - `001-data_modeling/` - Schema design and relationships
  - `002-mutations/` - Data modification patterns
  - `003-queries/` - Query patterns and optimization
  - etc.
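The category names above appear to follow a `<three-digit number>-<name>` pattern. The sketch below builds and parses such names; the zero-padded three-digit prefix and the lowercase-with-underscores name are inferred from the examples rather than stated rules, and both helper names are hypothetical.

```typescript
// Hypothetical helpers. The three-digit, zero-padded prefix is an assumption
// inferred from the examples (000-fundamentals, 001-data_modeling, ...).
function categoryDirName(index: number, label: string): string {
  if (Math.floor(index) !== index || index < 0 || index > 999) {
    throw new RangeError("category number must be an integer from 0 to 999");
  }
  const padded = ("000" + String(index)).slice(-3);
  return padded + "-" + label;
}

// Parses a directory name such as "002-mutations" into its number and name.
// Returns null when the name does not follow the assumed pattern.
function parseCategoryDir(dir: string): { index: number; label: string } | null {
  const match = /^(\d{3})-([a-z_]+)$/.exec(dir);
  if (match === null) return null;
  return { index: parseInt(match[1], 10), label: match[2] };
}

const builtName = categoryDirName(2, "mutations"); // "002-mutations"
const parsedDir = parseCategoryDir("001-data_modeling"); // { index: 1, label: "data_modeling" }
```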
- Be Explicit About Schema
  - Always provide the complete schema in the prompt
  - Use TypeScript code blocks for clarity
  - Include comments explaining field purposes
- Clear Requirements
  - List specific functions to implement
  - For each function, specify:
    - Exact function name
    - Required arguments and their types
    - Expected return type/structure
    - Any specific behaviors or edge cases to handle
- Test Data Requirements
  - Specify minimum number of test records
  - Define required data variations
  - Include specific scenarios to test
- Implementation Guidelines
  - Highlight required patterns or approaches
  - Specify what NOT to do
  - Note performance considerations
Given this schema:
\`\`\`typescript
export default defineSchema({
// Schema definition with comments
});
\`\`\`
Write two functions:
1. A mutation named \`insertX\` that:
   - Specific requirements
   - Test data to insert
   - Edge cases to handle
2. A query named \`getY\` that:
   - Input parameters
   - Return structure
   - Required behavior
   - Performance considerations
Your solution should:
- Technical requirements
- Patterns to use/avoid
- Error handling expectations
- Data Modeling
  - Table relationships (1:1, 1:N, N:M)
  - Index design
  - Schema validation
- Query Patterns
  - Basic CRUD operations
  - Index usage
  - Filtering and sorting
  - Joins and relationships
  - Aggregation and grouping
- Performance
  - Efficient index usage
  - Parallel fetching
  - Batch operations
  - Query optimization
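As an illustration of the parallel fetching item, the sketch below contrasts sequential lookups with `Promise.all`. It is a sketch under stated assumptions: `getAuthor` and the `AUTHORS` array are hypothetical in-memory stand-ins for database reads, not Convex APIs, so the example runs without a Convex project.

```typescript
// Hypothetical in-memory stand-in for a database lookup. A real Convex
// function would read through its database context instead.
type Author = { id: string; displayName: string };

const AUTHORS: Author[] = [
  { id: "a1", displayName: "Ada" },
  { id: "a2", displayName: "Grace" },
];

async function getAuthor(id: string): Promise<Author | null> {
  for (const author of AUTHORS) {
    if (author.id === id) return author;
  }
  return null;
}

// Sequential: each lookup waits for the previous one to finish.
async function fetchSequentially(ids: string[]): Promise<(Author | null)[]> {
  const results: (Author | null)[] = [];
  for (const id of ids) {
    results.push(await getAuthor(id));
  }
  return results;
}

// Parallel: all lookups start immediately; Promise.all keeps input order.
async function fetchInParallel(ids: string[]): Promise<(Author | null)[]> {
  return Promise.all(ids.map((id) => getAuthor(id)));
}
```

Both functions return the same results in the same order, so an eval that targets this pattern may need to state the expected approach explicitly in its requirements rather than rely on output alone.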
- Focused Testing
  - Each eval should test ONE main concept
  - Include related patterns only if they support the main concept
  - Keep requirements focused and clear
- Realistic Scenarios
  - Use real-world examples when possible
  - Make data requirements meaningful
  - Include common edge cases
- Clear Success Criteria
  - Make requirements explicit and testable
  - Include both functional and technical requirements
  - Specify performance expectations when relevant
- Progressive Complexity
  - Order evals from simple to complex within categories
  - Build on previous concepts
  - Include stretch goals or optional optimizations
- Ambiguous Requirements
  - Don't leave function names unspecified
  - Don't use vague terms like "appropriate" without context
  - Always specify exact field names and types
- Over-complication
  - Don't test multiple concepts in one eval
  - Don't require complex setup for simple concepts
  - Keep schemas focused on the tested concept
- Missing Context
  - Don't assume knowledge of specific patterns
  - Include relevant documentation references
  - Explain performance implications
- Untestable Requirements
  - Make success criteria measurable
  - Specify exact return types
  - Include specific test cases
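To make the last pitfall concrete: a requirement such as "return the most relevant tasks" cannot be checked, whereas "return at most `limit` tasks sorted by `priority` in descending order" can. The sketch below checks the second form; the `TaskSummary` shape and the `meetsRequirement` helper are hypothetical and serve only as illustration.

```typescript
// Hypothetical result shape for an imagined query; not taken from any eval.
type TaskSummary = { title: string; priority: number };

// Checks a measurable requirement: at most `limit` items, sorted by
// priority in descending order (ties are allowed).
function meetsRequirement(results: TaskSummary[], limit: number): boolean {
  if (results.length > limit) return false;
  for (let i = 1; i < results.length; i++) {
    if (results[i - 1].priority < results[i].priority) return false;
  }
  return true;
}

const sortedTasks: TaskSummary[] = [
  { title: "Ship release", priority: 3 },
  { title: "Update docs", priority: 1 },
];
const passes = meetsRequirement(sortedTasks, 2); // true
```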
- Schema First

      // schema.ts
      import { defineSchema, defineTable } from "convex/server";
      import { v } from "convex/values";

      export default defineSchema({
        // Tables and indexes
      });

- Function Files

      // public.ts
      import { v } from "convex/values";
      import { mutation, query } from "./_generated/server";
      import { Id } from "./_generated/dataModel"; // If using IDs

      export const myFunction = query({...});

- Common Patterns
  - Keep mutations and related queries in the same file
  - Group related functionality together
  - Include type imports from generated files
  - Add helpful comments for complex logic