Open
Description
Expand GenAI evals to allow for more robust evaluation types
Including:
- LLM as a judge (LLMJudgeTask)
- Structured output validation (FieldValidators)
- Matching validators (validating specific output types, e.g. string)
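To make the three evaluation types concrete, here is a minimal sketch in plain Python. It is not the project's API: the class names `FieldCheck`, `TypeMatchCheck`, and `JudgeCheck` are hypothetical stand-ins for the proposed `FieldValidators`, matching validators, and `LLMJudgeTask`, and the `judge` callable is an assumed placeholder for a real model call that returns a score between 0 and 1.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FieldCheck:
    """Hypothetical structured-output check: a field must exist and satisfy a predicate."""
    field_name: str
    predicate: Callable[[Any], bool]

    def validate(self, output: dict) -> bool:
        return self.field_name in output and self.predicate(output[self.field_name])


@dataclass
class TypeMatchCheck:
    """Hypothetical matching check: the output must be an instance of the expected type."""
    expected_type: type

    def validate(self, output: Any) -> bool:
        return isinstance(output, self.expected_type)


@dataclass
class JudgeCheck:
    """Hypothetical LLM-as-a-judge check.

    `judge` is an assumed stand-in for a model call that scores the output in [0, 1].
    """
    judge: Callable[[str], float]
    threshold: float = 0.5

    def validate(self, output: str) -> bool:
        return self.judge(output) >= self.threshold


# Stub judge used only for illustration; a real judge would call an LLM.
def stub_judge(text: str) -> float:
    return 1.0 if "Paris" in text else 0.0


field_ok = FieldCheck("score", lambda v: isinstance(v, int) and v >= 0).validate({"score": 3})
type_ok = TypeMatchCheck(str).validate("hello")
type_bad = TypeMatchCheck(str).validate(42)
judge_ok = JudgeCheck(judge=stub_judge).validate("Paris is the capital of France.")
```

All three expose the same `validate` method, so a test suite could hold them in one list and run them uniformly; whether the real types share an interface is an open design question for this issue.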
Functionality:
- Allow users to specify multiple evaluations for a single event
- Global test suite validations along with individualized assertions
- GenAI service profile type that allows users to validate multiple GenAI profiles at once
- Some GenAI applications have more than one prompt. A user may want to build a test suite for each individual task as well as a global, service-level evaluation.
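One possible shape for the functionality above, sketched under stated assumptions: `TaskProfile` and `ServiceProfile` are hypothetical names, each check is an ordinary callable over an event dict, and the `"__global__"` key is an illustrative convention rather than anything the project defines. Each task carries its own assertions, and the service profile adds checks that must hold for every event.

```python
from dataclasses import dataclass, field
from typing import Callable

# A check takes one event (a dict of recorded outputs) and returns pass/fail.
Check = Callable[[dict], bool]


@dataclass
class TaskProfile:
    """Hypothetical per-prompt profile holding several named evaluations."""
    name: str
    checks: dict[str, Check] = field(default_factory=dict)

    def evaluate(self, event: dict) -> dict[str, bool]:
        # Run every evaluation against the same event.
        return {label: check(event) for label, check in self.checks.items()}


@dataclass
class ServiceProfile:
    """Hypothetical service-level profile grouping several task profiles."""
    tasks: list[TaskProfile]
    global_checks: dict[str, Check] = field(default_factory=dict)

    def evaluate(self, events: dict[str, dict]) -> dict[str, dict[str, bool]]:
        # Individual assertions: each task is evaluated on its own event.
        results = {t.name: t.evaluate(events[t.name]) for t in self.tasks if t.name in events}
        # Global validations: each one must pass for every event in the service.
        results["__global__"] = {
            label: all(check(e) for e in events.values())
            for label, check in self.global_checks.items()
        }
        return results


summarize = TaskProfile(
    "summarize",
    {
        "has_summary": lambda e: bool(e.get("summary")),
        "is_short": lambda e: len(e.get("summary", "")) <= 200,
    },
)
classify = TaskProfile("classify", {"valid_label": lambda e: e.get("label") in {"spam", "ham"}})
service = ServiceProfile([summarize, classify], {"has_latency": lambda e: "latency_ms" in e})

report = service.evaluate(
    {
        "summarize": {"summary": "Short text.", "latency_ms": 120},
        "classify": {"label": "other", "latency_ms": 80},
    }
)
```

In this example the `classify` task fails its own `valid_label` assertion while the global `has_latency` validation still passes, which shows why per-task and service-level results are reported separately.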