Add 2 new scorers: Exact Match and Contains #275
base: v1
- Remove unused DB_LOCATION import from test-utils.ts - Replace FILES_LOCATION import with local constant in files.test.ts Co-authored-by: Matt Pocock <[email protected]>
- Add dotenv as a dependency - Create env-setup-file module that imports dotenv/config - Export env-setup-file as 'evalite/env-setup-file' - Automatically prepend env-setup-file to setupFiles array - Update documentation to reflect automatic .env loading - Update example config to remove manual dotenv setup Fixes mattpocock#234 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Matt Pocock <[email protected]>
… precedence - Add loadVitestSetupFiles() to load setupFiles from vitest.config.ts - Merge setupFiles from both configs with evalite.config.ts taking precedence - Add tests for vitest.config.ts setupFiles support and precedence - setupFiles execution order: env-setup-file -> vitest -> evalite Co-authored-by: Matt Pocock <[email protected]>
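The execution order stated in the commit above (env-setup-file -> vitest -> evalite) can be sketched as a plain array merge. This is only an illustration: `mergeSetupFiles` and the sample file paths are hypothetical names, not evalite's actual code, and the sketch assumes that "precedence" simply means the evalite.config.ts files run last.

```typescript
// Hypothetical sketch, not the actual evalite implementation.
// The module specifier below is the export name given in the commit message.
const ENV_SETUP_FILE = "evalite/env-setup-file";

// Assumption: "taking precedence" means evalite.config.ts setup files run
// after the vitest.config.ts ones, so they can override earlier setup.
function mergeSetupFiles(
  vitestSetupFiles: string[],
  evaliteSetupFiles: string[],
): string[] {
  return [ENV_SETUP_FILE, ...vitestSetupFiles, ...evaliteSetupFiles];
}

// Example with made-up paths.
const merged = mergeSetupFiles(["./vitest.setup.ts"], ["./evalite.setup.ts"]);
console.log(merged);
```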
- Export new `evalite/scorers` module with factory functions - Add `createLLMBasedScorer` for model-dependent scorers - Add `createEmbeddingBasedScorer` for embedding-dependent scorers - Introduce `EvaluationSample` type with query, contexts, and reference fields Part of mattpocock#250
- Added a new `faithfulness` scorer to evaluate model responses against retrieved contexts. - Introduced utility functions for scoring and context handling. - Updated `package.json` to include `zod` version 4.1.12 as a dependency. - Updated `pnpm-lock.yaml` to reflect changes in dependencies and versions. Part of mattpocock#250
Add Faithfulness
- Introduced a new `answerSimilarity` scorer to assess the semantic similarity between a ground truth answer and a generated answer. - The scorer utilizes embedding models to compute cosine similarity and includes an optional threshold for binary output. - Updated the `scorers` module to export the new `answerSimilarity` scorer. Part of mattpocock#250
Add Answer Similarity Scorer
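The scoring idea described for `answerSimilarity` (cosine similarity between two embeddings, with an optional threshold for binary output) can be sketched as follows. This is a minimal illustration under stated assumptions: `cosineSimilarity` and `similarityScore` are hypothetical helper names, the embeddings are passed in as plain number arrays rather than computed by a model, and the `>=` comparison against the threshold is an assumption rather than something the commit message specifies.

```typescript
// Hypothetical helpers, not the actual evalite scorer.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Embeddings must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat similarity as 0 in that case.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Without a threshold, return the raw similarity.
// With a threshold, return 1 or 0 (assumed comparison: similarity >= threshold).
function similarityScore(
  reference: number[],
  answer: number[],
  threshold?: number,
): number {
  const similarity = cosineSimilarity(reference, answer);
  if (threshold === undefined) return similarity;
  return similarity >= threshold ? 1 : 0;
}
```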
- Introduced a new `contextRecall` scorer to evaluate how much of a generated answer can be attributed to retrieved contexts. - Updated the `scorers` module to export the new `contextRecall` scorer. Part of mattpocock#250
…c and updating metadata format
…g based scorers, and context recall and faithfulness classifications
… namespace for better organization
…d Multi Turn Sample Type
…intained and grows as necessary
Add Scorers module
- Remove `createBaseScorer`, consolidate to `createLLMScorer`/`createEmbeddingScorer` - Add generic `TExpected` type for type-safe expected data - Replace `singleTurn`/`multiTurn` with single `scorer` function - Rename utils to `isSingleTurnInput`/`isMultiTurnInput` - Update all scorers (faithfulness, answerSimilarity, contextRecall) to new API - Fix example.eval.ts: textStream -> text 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Move inline expected data types from answer-similarity, context-recall, and faithfulness scorers to the Evalite.Scorers namespace in types.ts for better type organization and discoverability. Co-authored-by: Matt Pocock <[email protected]>
- Renamed input types in Evalite.Scorers namespace to reflect output handling: SingleTurnInput to SingleTurnOutput, MultiTurnInput to MultiTurnOutput, and updated related types accordingly. - Modified scorer implementations in context-recall and faithfulness to use new output types. - Updated utility functions to check for output types instead of input types, enhancing clarity and consistency in the scoring logic.
- Introduced two new scorers: `exactMatch` checks for exact string matches, while `contains` verifies if the output includes the expected substring. - Updated the Evalite types to include `ExactMatchExpected` and `ContainsExpected` for better type safety. - Exported new scorers from the scorers index for accessibility. Related mattpocock#250
- Created a new file `string-scorers.eval.ts` to demonstrate the usage of `exactMatch` and `contains` scorers.
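The behaviour described for the two scorers (exact string equality for `exactMatch`, substring inclusion for `contains`) can be sketched as below. This is not the code from the PR: `exactMatchScore` and `containsScore` are hypothetical standalone functions, and the 1/0 scoring and case-sensitive comparison are assumptions, since the commit message does not state how the result is reported or whether case is normalized.

```typescript
// Hypothetical standalone functions, not the actual evalite scorers.
// Assumption: a match is reported as 1 and a miss as 0, case-sensitively.

// Scores 1 only when the output is exactly the expected string.
function exactMatchScore(output: string, expected: string): number {
  return output === expected ? 1 : 0;
}

// Scores 1 when the expected string appears anywhere inside the output.
function containsScore(output: string, expected: string): number {
  return output.includes(expected) ? 1 : 0;
}

// Example with made-up data.
const output = "The capital of France is Paris.";
console.log(exactMatchScore(output, "Paris"));
console.log(containsScore(output, "Paris"));
```

Under these assumptions, an exact-match check suits short, canonical answers, while a substring check tolerates surrounding text in longer outputs.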
          
@cantemizyurek is attempting to deploy a commit to the Skill Recordings Team on Vercel. A member of the Team first needs to authorize it.
No description provided.