Refactor: Extract shared code from V1/V2 Scorers into base class #16

@iliall

Problem

The V1 and V2 scorers share approximately 80% of their code, which violates the DRY principle: any change to the shared logic has to be made in both files.

Files affected:

  • src/judge/scoring/v1/scorer.py
  • src/judge/scoring/v2/scorer.py

Duplicated Methods

The following methods are identical or near-identical between V1 and V2:

| Method | Status |
| --- | --- |
| _call_llm() | Identical |
| _parse_single_response() | Identical |
| _parse_batch_response() | Identical |
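
One way to check an "Identical" status before moving a method is to compare the compiled bytecode of the two implementations. The sketch below is only an illustration: `same_implementation`, `ToyV1`, and `ToyV2` are hypothetical names, not part of the repository, and the toy classes stand in for the real scorers. The check ignores local variable names, and it will not flag "near-identical" methods as matching.

```python
def same_implementation(cls_a, cls_b, method_name):
    """Return True if both classes define method_name with matching bytecode."""
    code_a = getattr(cls_a, method_name).__code__
    code_b = getattr(cls_b, method_name).__code__
    # Compare instructions, constants, and referenced names.
    # Line numbers and local variable names are deliberately not compared.
    return (
        code_a.co_code == code_b.co_code
        and code_a.co_consts == code_b.co_consts
        and code_a.co_names == code_b.co_names
    )


# Toy stand-ins for the real scorer classes (hypothetical).
class ToyV1:
    def _call_llm(self, prompt):
        return prompt.strip()

    def score_all(self, text):
        return text.upper()


class ToyV2:
    def _call_llm(self, prompt):
        return prompt.strip()

    def score_all(self, text):
        return text.lower()


for name in ("_call_llm", "score_all"):
    print(name, same_implementation(ToyV1, ToyV2, name))
```

Methods that contain nested functions or lambdas may be reported as different even when the source text matches, so a plain diff of the two files is still a useful cross-check.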

Proposed Solution

Extract shared code into a BaseScorer class:

class BaseScorer:
    """Logic shared by the V1 and V2 scorers."""

    def __init__(self, model: str = "openai/gpt-4o"):
        self.model = model
        self.char_loader = CharacteristicLoader()
        self.prompt_loader = PromptLoader(characteristic_loader=self.char_loader)

    # The three methods below are currently identical in V1 and V2.

    def _call_llm(self, prompt: str) -> str:
        ...

    def _parse_single_response(self, response: str, characteristic_id: str) -> Score:
        ...

    def _parse_batch_response(self, response: str) -> list[Score]:
        ...

Then have V1Scorer and V2Scorer extend it and implement only the version-specific methods:

  • score_all()
  • score_single()
  • _build_context()
  • _build_batch_prompt()
  • _build_single_prompt()
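
A minimal sketch of how the split could look, using `abc` to mark the five version-specific methods as abstract so a subclass that forgets one fails at instantiation. Several things here are assumptions rather than the project's actual code: the `Score` dataclass is a stand-in, the method signatures for the abstract methods are guessed, `_call_llm` returns canned JSON instead of calling a model, the JSON response format is invented for the example, and the `char_loader` / `prompt_loader` attributes are omitted to keep the block self-contained.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Score:
    """Hypothetical stand-in for the project's Score type."""
    characteristic_id: str
    value: float


class BaseScorer(ABC):
    """Shared LLM-call and parsing logic; subclasses supply prompts and flow."""

    def __init__(self, model: str = "openai/gpt-4o"):
        self.model = model

    def _call_llm(self, prompt: str) -> str:
        # Placeholder: returns canned JSON instead of calling a real model.
        return '{"clarity": 0.5}'

    def _parse_single_response(self, response: str, characteristic_id: str) -> Score:
        # Assumed format: a JSON object mapping characteristic id to a score.
        return Score(characteristic_id, float(json.loads(response)[characteristic_id]))

    def _parse_batch_response(self, response: str) -> list[Score]:
        return [Score(cid, float(v)) for cid, v in json.loads(response).items()]

    # Version-specific hooks; signatures are assumptions for this sketch.
    @abstractmethod
    def score_all(self, text: str) -> list[Score]: ...

    @abstractmethod
    def score_single(self, text: str, characteristic_id: str) -> Score: ...

    @abstractmethod
    def _build_context(self, text: str) -> str: ...

    @abstractmethod
    def _build_batch_prompt(self, context: str) -> str: ...

    @abstractmethod
    def _build_single_prompt(self, context: str, characteristic_id: str) -> str: ...


class V1Scorer(BaseScorer):
    """Toy V1 implementation; a V2Scorer would override the same five methods."""

    def _build_context(self, text: str) -> str:
        return text.strip()

    def _build_batch_prompt(self, context: str) -> str:
        return f"Score every characteristic of: {context}"

    def _build_single_prompt(self, context: str, characteristic_id: str) -> str:
        return f"Score {characteristic_id} of: {context}"

    def score_all(self, text: str) -> list[Score]:
        prompt = self._build_batch_prompt(self._build_context(text))
        return self._parse_batch_response(self._call_llm(prompt))

    def score_single(self, text: str, characteristic_id: str) -> Score:
        prompt = self._build_single_prompt(self._build_context(text), characteristic_id)
        return self._parse_single_response(self._call_llm(prompt), characteristic_id)


print(V1Scorer().score_all("  some text  "))
```

Making `BaseScorer` abstract is a design choice, not something the issue requires. A plain base class that raises `NotImplementedError` in the five hooks would also work, but the error would surface only when a missing method is first called.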

Labels: enhancement
