
Conversation

@pranath-reddy
Member

Summary 📝

This PR introduces a Dynamic Relevancy Scoring Agent that evaluates repository content against dynamically generated relevance criteria using a three-level scoring rubric. The agent assesses combined repository content (README, description, title, topics) against both required and nice-to-have criteria, producing quantitative relevance scores with detailed qualitative reasoning. This enables precise ranking of search results while providing transparency into scoring decisions through specific content evidence.

Details

Dynamic Relevancy Scoring

Core Functionality

  • Evaluates repository content against query-specific criteria from DynamicRelevancyCriteriaAgent
  • Uses a three-level scoring rubric for each criterion (a scoring sketch follows this list):
    • Fully Satisfies (3.0): Content clearly and comprehensively addresses the criterion
    • Partially Satisfies (1.5): Content somewhat addresses the criterion but incompletely
    • Does Not Satisfy (0.0): Content does not address the criterion
  • Assesses all repository content fields: README text, description, title, topics, metadata
  • Provides detailed reasoning with specific evidence from content for each score
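
A minimal sketch of how the rubric levels could map to numeric scores and roll up into a single relevance metric. The level names and point values come from the rubric above; the aggregation (a weighted mean that favors required criteria) is an illustrative assumption, not necessarily the agent's actual formula.

from enum import Enum

# Rubric levels and point values as listed above
class RubricLevel(float, Enum):
    FULLY_SATISFIES = 3.0
    PARTIALLY_SATISFIES = 1.5
    DOES_NOT_SATISFY = 0.0

# Hypothetical aggregation: normalize per-criterion scores to [0, 1] and
# weight required criteria more heavily than nice-to-have ones.
def aggregate_relevance(
    required_scores: list[RubricLevel],
    nice_to_have_scores: list[RubricLevel],
    required_weight: float = 0.8,
) -> float:
    def normalized_mean(scores: list[RubricLevel]) -> float:
        if not scores:
            return 0.0
        return sum(s.value for s in scores) / (3.0 * len(scores))

    return (
        required_weight * normalized_mean(required_scores)
        + (1 - required_weight) * normalized_mean(nice_to_have_scores)
    )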

Evaluation Outputs

  • Quantitative relevance metric for ranking repositories
  • Qualitative reasoning traces explaining scoring decisions
  • Overall assessment with actionable feedback for query refinement
  • Identifies well-matched aspects, missing elements, and suggestions for improving retrieval

Code Changes

Added system prompt at akd/configs/code_prompts.py:

  • RELEVANCY_SCORING_PROMPT for evidence-based relevancy evaluation logic

New Agent Implementation: RelevancyScoringAgent

  • Extends LiteLLMInstructorBaseAgent with specialized input/output schemas (an output-schema sketch follows this list)
  • Single evaluation call efficiently assesses all criteria
  • Input: Query, content, required criteria, nice-to-have criteria
  • Output: Criterion evaluations, overall assessment with refinement feedback
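
The PR text does not show the output schema, so the sketch below is an assumption about its shape inferred from the bullets above (per-criterion evaluations plus an overall assessment with refinement feedback); the actual class and field names in akd.agents.relevance may differ.

from typing import List
from pydantic import BaseModel, Field

# Hypothetical output shape; field names are illustrative only.
class CriterionEvaluation(BaseModel):
    criterion_name: str
    score: float = Field(..., description="3.0, 1.5, or 0.0 per the rubric")
    reasoning: str = Field(..., description="Evidence drawn from the repository content")

class RelevancyScoringAgentOutputSchema(BaseModel):
    criterion_evaluations: List[CriterionEvaluation]
    overall_score: float
    overall_assessment: str = Field(..., description="Feedback usable for query refinement")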

Extended RelevancyCriterion schema:

  • Added is_required: bool field to distinguish required vs. nice-to-have criteria
  • This extends the base RelevanceCriterion from DynamicRelevancyCriteriaAgent (a model sketch follows this list)
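
A sketch of the extended criterion model, consistent with how RelevancyCriterion is constructed in the usage example below; only the is_required addition is stated in this PR, so treat the rest as assumed.

from pydantic import BaseModel, Field

# Fields mirror the constructor calls in the usage example below;
# is_required is the new field added in this PR.
class RelevancyCriterion(BaseModel):
    name: str
    description: str
    is_required: bool = Field(
        ...,
        description="True for required criteria, False for nice-to-have",
    )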

Usage

from akd.agents.relevance import RelevancyScoringAgent, RelevancyScoringAgentInputSchema, RelevancyCriterion
from akd.configs.code_prompts import RELEVANCY_SCORING_PROMPT

# Initialize scoring agent
scoring_agent = RelevancyScoringAgent(
    system_prompt=RELEVANCY_SCORING_PROMPT,
    model="gpt-4"
)

# Prepare criteria from DynamicRelevancyCriteriaAgent output
required_criteria = [
    RelevancyCriterion(
        name="ml_climate_integration",
        description="Implements ML algorithms for climate science",
        is_required=True
    ),
    RelevancyCriterion(
        name="climate_data_handling",
        description="Processes climate datasets",
        is_required=True
    )
]

nice_to_have_criteria = [
    RelevancyCriterion(
        name="visualization_tools",
        description="Provides visualization for climate outputs",
        is_required=False
    )
]

# Evaluate repository content
result = await scoring_agent.arun(
    RelevancyScoringAgentInputSchema(
        query="machine learning for climate modeling",
        content=repository_content,  # Concatenated README, description, etc.
        required_criteria=required_criteria,
        nice_to_have_criteria=nice_to_have_criteria
    )
)
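
Downstream code could then read the per-criterion evaluations and the overall score from the result. The attribute names below follow the hypothetical output schema sketched earlier and are assumptions, since the actual schema is not shown in this PR.

# Attribute names are hypothetical; adjust to the actual output schema.
for evaluation in result.criterion_evaluations:
    print(evaluation.criterion_name, evaluation.score, evaluation.reasoning)

print("Overall relevance:", result.overall_score)
print("Refinement feedback:", result.overall_assessment)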

Checks

  • Tested Changes
  • Stakeholder Approval

- Add an agent that uses the criteria generated by the dynamic criteria generation agent and outputs a relevancy score
- Add initial system prompt for the relevancy scoring agent
@github-actions

❌ Tests failed (exit code: 2)

📊 Test Results

  • Passed: 0
  • Failed: 0
  • Skipped: 0
  • Warnings: 7
  • Coverage: 0%

⚠️ Note: Test counts are 0, which may indicate parsing issues or early test failure. Check the workflow logs for details.
Parsing strategy used: summary_line

Branch: feature/code-relevancy-score
PR: #283
Commit: f1cad70

📋 Full coverage report and logs are available in the workflow run.

@github-actions

❌ Tests failed (exit code: 2)

📊 Test Results

  • Passed: 0
  • Failed: 0
  • Skipped: 0
  • Warnings: 7
  • Coverage: 0%

⚠️ Note: Test counts are 0, which may indicate parsing issues or early test failure. Check the workflow logs for details.
Parsing strategy used: summary_line

Branch: feature/code-relevancy-score
PR: #283
Commit: 51ad6bf

📋 Full coverage report and logs are available in the workflow run.

@NISH1001
Collaborator

NISH1001 commented Nov 21, 2025

cc: @iamsims check this. I think it'd be better to collaborate and come to a common ground on the score criteria here. We don't want various disconnected components doing the same things.
cc: @pranath-reddy as well. I will let you two discuss this.

@NISH1001
Collaborator

also @pranath-reddy i'd recommend putting the new relevance agents (if we happen to finalize/merge them later) into the already existing module akd.agents.relevancy

- Update base relevancy criterion to match the criteria gen agent
@github-actions

❌ Tests failed (exit code: 2)

📊 Test Results

  • Passed: 0
  • Failed: 0
  • Skipped: 0
  • Warnings: 8
  • Coverage: 0%

⚠️ Note: Test counts are 0, which may indicate parsing issues or early test failure. Check the workflow logs for details.
Parsing strategy used: summary_line

Branch: feature/code-relevancy-score
PR: #283
Commit: 8f51bbd

📋 Full coverage report and logs are available in the workflow run.
