Conversation

@cantemizyurek

No description provided.

mattpocock and others added 30 commits October 19, 2025 12:44
- Remove unused DB_LOCATION import from test-utils.ts
- Replace FILES_LOCATION import with local constant in files.test.ts

Co-authored-by: Matt Pocock <[email protected]>
- Add dotenv as a dependency
- Create env-setup-file module that imports dotenv/config
- Export env-setup-file as 'evalite/env-setup-file'
- Automatically prepend env-setup-file to setupFiles array
- Update documentation to reflect automatic .env loading
- Update example config to remove manual dotenv setup

Fixes mattpocock#234

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Matt Pocock <[email protected]>
… precedence

- Add loadVitestSetupFiles() to load setupFiles from vitest.config.ts
- Merge setupFiles from both configs with evalite.config.ts taking precedence
- Add tests for vitest.config.ts setupFiles support and precedence
- setupFiles execution order: env-setup-file -> vitest -> evalite

Co-authored-by: Matt Pocock <[email protected]>
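
The setupFiles ordering described in the commits above (env-setup-file, then vitest, then evalite) can be sketched as follows. This is a minimal illustration only: `mergeSetupFiles` and `ENV_SETUP_FILE` are hypothetical names, the file paths in the usage lines are made up, and the real evalite implementation may differ (for example in how it resolves paths or handles duplicates).

```typescript
// Hypothetical constant: the module specifier exported by the commit above.
const ENV_SETUP_FILE = "evalite/env-setup-file";

// Hypothetical helper, not the actual evalite implementation.
// Builds the final setupFiles list in the documented execution order:
// env-setup-file -> vitest.config.ts entries -> evalite.config.ts entries.
function mergeSetupFiles(
  vitestSetupFiles: readonly string[],
  evaliteSetupFiles: readonly string[],
): string[] {
  // The env setup file is prepended so .env values are loaded
  // before any user-provided setup file runs.
  return [ENV_SETUP_FILE, ...vitestSetupFiles, ...evaliteSetupFiles];
}

// Usage with made-up paths: evalite entries come last in the list.
const mergedSetupFiles = mergeSetupFiles(
  ["./vitest.setup.ts"],
  ["./evalite.setup.ts"],
);
console.log(mergedSetupFiles);
```

Placing the evalite entries last is one plausible reading of "taking precedence": whatever they configure is applied after the vitest setup files have run.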
- Export new `evalite/scorers` module with factory functions
- Add `createLLMBasedScorer` for model-dependent scorers
- Add `createEmbeddingBasedScorer` for embedding-dependent scorers
- Introduce `EvaluationSample` type with query, contexts, and reference fields

Part of mattpocock#250
- Added a new `faithfulness` scorer to evaluate model responses against retrieved contexts.
- Introduced utility functions for scoring and context handling.
- Updated `package.json` to include `zod` version 4.1.12 as a dependency.
- Updated `pnpm-lock.yaml` to reflect changes in dependencies and versions.

Part of mattpocock#250
- Introduced a new `answerSimilarity` scorer to assess the semantic similarity between a ground truth answer and a generated answer.
- The scorer utilizes embedding models to compute cosine similarity and includes an optional threshold for binary output.
- Updated the `scorers` module to export the new `answerSimilarity` scorer.

Part of mattpocock#250
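
The cosine similarity with an optional binary threshold mentioned for `answerSimilarity` can be sketched like this. It is an illustration under assumptions, not the scorer's actual code: `cosineSimilarity` and `similarityScore` are hypothetical names, the embeddings are plain number arrays rather than the output of an embedding model, and the `>=` comparison against the threshold is an assumed convention.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction; treat it as having no similarity.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical scoring helper: returns the raw similarity, or a binary
// 1/0 score when a threshold is supplied (assumed to be inclusive).
function similarityScore(
  expectedEmbedding: readonly number[],
  outputEmbedding: readonly number[],
  threshold?: number,
): number {
  const similarity = cosineSimilarity(expectedEmbedding, outputEmbedding);
  if (threshold === undefined) return similarity;
  return similarity >= threshold ? 1 : 0;
}

// Usage with toy two-dimensional vectors.
console.log(similarityScore([1, 0], [1, 1]));
console.log(similarityScore([1, 0], [1, 1], 0.9));
```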
- Introduced a new `contextRecall` scorer to evaluate how much of a generated answer can be attributed to retrieved contexts.
- Updated the `scorers` module to export the new `contextRecall` scorer.

Part of mattpocock#250
…g based scorers, and context recall and faithfulness classifications
cantemizyurek and others added 26 commits October 22, 2025 23:32
- Remove `createBaseScorer`, consolidate to `createLLMScorer`/`createEmbeddingScorer`
- Add generic `TExpected` type for type-safe expected data
- Replace `singleTurn`/`multiTurn` with single `scorer` function
- Rename utils to `isSingleTurnInput`/`isMultiTurnInput`
- Update all scorers (faithfulness, answerSimilarity, contextRecall) to new API
- Fix example.eval.ts: textStream -> text

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Move inline expected data types from answer-similarity, context-recall,
and faithfulness scorers to the Evalite.Scorers namespace in types.ts
for better type organization and discoverability.

Co-authored-by: Matt Pocock <[email protected]>
- Renamed input types in the `Evalite.Scorers` namespace to reflect that they describe outputs: `SingleTurnInput` is now `SingleTurnOutput` and `MultiTurnInput` is now `MultiTurnOutput`; related types were updated accordingly.
- Modified the scorer implementations in context-recall and faithfulness to use the new output types.
- Updated utility functions to check for output types instead of input types, making the scoring logic clearer and more consistent.
- Introduced two new scorers: `exactMatch` checks for exact string matches, while `contains` verifies if the output includes the expected substring.
- Updated the Evalite types to include `ExactMatchExpected` and `ContainsExpected` for better type safety.
- Exported new scorers from the scorers index for accessibility.

Related mattpocock#250
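
The behavior described for `exactMatch` and `contains` can be sketched with two plain functions. This is a simplified illustration, not the evalite API: `exactMatchScore` and `containsScore` are hypothetical names, the 1/0 return values are an assumed scoring convention, and the real scorers may normalize strings or accept a different input shape.

```typescript
// Hypothetical stand-in for an exact-match check:
// scores 1 only when the output equals the expected string exactly.
function exactMatchScore(output: string, expected: string): number {
  return output === expected ? 1 : 0;
}

// Hypothetical stand-in for a substring check:
// scores 1 when the output includes the expected substring.
function containsScore(output: string, expected: string): number {
  return output.includes(expected) ? 1 : 0;
}

// Usage: comparison is case-sensitive in this sketch.
console.log(exactMatchScore("Paris", "Paris"));
console.log(exactMatchScore("paris", "Paris"));
console.log(containsScore("The capital is Paris.", "Paris"));
```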
- Created a new file `string-scorers.eval.ts` to demonstrate the usage of `exactMatch` and `contains` scorers.
@changeset-bot

changeset-bot bot commented Oct 26, 2025

⚠️ No Changeset found

Latest commit: 0a30d93

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@vercel

vercel bot commented Oct 26, 2025

@cantemizyurek is attempting to deploy a commit to the Skill Recordings Team on Vercel.

A member of the Team first needs to authorize it.

2 participants