Add Query Division Classifier to Code Search #273

pranath-reddy · 2025-11-12T05:55:31Z

Summary 📝

This PR adds query classification to the code search agent and tools by implementing an LLM-based division classifier that categorizes user queries into one of NASA's five science divisions. This enhancement allows the search agent to narrow down searches to repositories belonging to specific divisions, improving search precision and relevance.

Details

- Created `ScienceDivision` enum and `DivisionAgent` class in `intents.py` for classifying queries into Earth Science, Planetary Science, Astrophysics, Heliophysics, Biological and Physical Sciences, or Unknown
- Added `DIVISION_PROMPT` in `code_prompts.py` with overviews, study areas, and mission examples for each division
- Enhanced `LocalRepoCodeSearchTool` and `SDECodeSearchTool` to support division-based filtering, controlled by the `use_division_filter_local` and `use_division_filter_sde` configuration flags in the agent
- Made `find_repo()` and `sde_search()` async so they can perform division classification and filter results by division
- Updated `CodeSearchAgentConfig` to inherit the division filter configuration and automatically initialize the division agent when enabled
- Added the query division to result metadata via the `extra` field for transparency
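
For readers who have not opened `intents.py`, here is a minimal sketch of what an enum like `ScienceDivision` could look like, together with a tolerant parser for a free-text label. The member values and the `parse_division` helper are assumptions for illustration; they are not copied from this PR.

```python
from enum import Enum
from typing import Optional


class ScienceDivision(str, Enum):
    """Hypothetical stand-in for the enum described above; the values are assumed."""

    EARTH_SCIENCE = "Earth Science"
    PLANETARY_SCIENCE = "Planetary Science"
    ASTROPHYSICS = "Astrophysics"
    HELIOPHYSICS = "Heliophysics"
    BIOLOGICAL_AND_PHYSICAL_SCIENCES = "Biological and Physical Sciences"
    UNKNOWN = "Unknown"


def parse_division(label: Optional[str]) -> ScienceDivision:
    """Map a free-text label (e.g. raw model output) to a division.

    Anything that does not match a known division falls back to UNKNOWN.
    """
    if not label:
        return ScienceDivision.UNKNOWN
    # Collapse whitespace and ignore case before comparing.
    normalized = " ".join(label.split()).casefold()
    for division in ScienceDivision:
        if division.value.casefold() == normalized:
            return division
    return ScienceDivision.UNKNOWN
```

Falling back to `UNKNOWN` rather than raising keeps a malformed label from failing the whole search; whether that is the right trade-off depends on how the filter treats `UNKNOWN`.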

Usage

import asyncio

from akd.agents._base import BaseAgentConfig
from akd.agents.intents import DivisionAgent, DivisionInputSchema
from akd.configs.code_prompts import DIVISION_PROMPT


async def main():
    # Configure and initialize the division classifier agent
    cfg = BaseAgentConfig(model_name="gpt-4o-mini", system_prompt=DIVISION_PROMPT)
    agent = DivisionAgent(config=cfg)

    # Classify the query into a science division
    result = await agent.arun(DivisionInputSchema(query="landslide nepal"))
    print(result)


# `await` needs a running event loop, so drive the coroutine explicitly
asyncio.run(main())

Checks

Tested Changes
Stakeholder Approval

github-actions · 2025-11-12T06:06:16Z

❌ Tests failed (exit code: 1)

📊 Test Results

Passed: 571
Failed: 2
Skipped: 7
Warnings: 318
Coverage: 82%

Branch: feature/code-agent-intent
PR: #273
Commit: 4385a47

📋 Full coverage report and logs are available in the workflow run.

NISH1001 · 2025-11-12T17:21:11Z

akd/tools/search/code_search.py

+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self.division_agent = DivisionAgent(
+            config=BaseAgentConfig(model_name="gpt-4o-mini", system_prompt=DIVISION_PROMPT)
+        )


I thought this would be part of the agent, not the tool.

Something like:

agent = CodeSearchAgent(division_classifier=....,...)

The filter would then be applied at the agent level. At the tool level, what we can do is accept the enum as a way to filter without doing any classification.

tool = CodeSearchTool(Config(division_filter=True, ...))
# `division` is passed in from upstream and may be None.
# If `division` is None, skip filtering; otherwise filter on that value.
await tool.arun(CodeInputSchema(query=..., division="Something"))
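
To make the tool-level suggestion concrete, here is a minimal sketch of a filter that takes an optional division and does no classification of its own. `RepoResult` and `filter_by_division` are hypothetical names for illustration; the real result schema and tool API are not shown in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RepoResult:
    """Hypothetical search hit; the actual result schema is an assumption."""

    name: str
    division: Optional[str]


def filter_by_division(
    results: List[RepoResult],
    division: Optional[str] = None,
) -> List[RepoResult]:
    """Keep only hits tagged with `division`; pass everything through if it is None."""
    if division is None:
        return list(results)
    wanted = division.casefold()
    return [
        r for r in results
        if r.division is not None and r.division.casefold() == wanted
    ]
```

Because the function makes no model call, the same inputs always produce the same output, which is the property the review asks the tool to keep.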

I'd also try to avoid complicating the tools themselves. If we have this as part of the agent, a few things become possible:

Division classification reasoning trace from original query

We can use that trace and division to drive subsequent agentic iteration

Let's have it as part of CodeSearchAgent, not the tool. I'd avoid modifying the tool to include LLM logic unless necessary. This way, the division classifier will be an intent-understanding layer at the agent level.

Reiterating:

division = await self.division_classifier.arun(query, ...) if self.division_classifier else None
results_full = []
while iteration ...:
    # Pass the division if there is one; otherwise a plain None filter might work.
    results = await self.tool.arun(..., division=division, ...)
    # We need to test whether classifying at each iteration yields the same division.
    # If it does not, the original intent might drive the iteration.
    # We could also reuse the same intent understanding to understand the results:
    # same division classifier -> run on each result -> get division -> then filter.
    # Doing this, we know the tool is already deterministic and the tuning happens only at the agent level.
    # Relevancy checks tuned to code search itself, something like the archetype we have for lit and data search.
    relevant_results = await self.checker.arun(results)
    results_full.extend(relevant_results)
    # This plays nicely with reformulation too, because the intent/division gives more context to drive the next iteration.
    reformulation = self.reformulate(relevant_results, division_traces, division=...)
    # Perform the next iteration or stop, etc.
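
The flow above can be sketched as a small runnable loop. Everything here is a stand-in: `classify_division` replaces the LLM classifier with a keyword rule, `search_tool` searches a hypothetical in-memory list, and the relevancy check and reformulation steps are left out. It only illustrates the ordering "classify once, then call a deterministic tool with the division".

```python
import asyncio
from typing import Dict, List, Optional

# Hypothetical in-memory corpus standing in for the repository index.
REPOS: List[Dict[str, Optional[str]]] = [
    {"name": "landslide-mapper", "division": "Earth Science"},
    {"name": "mars-terrain", "division": "Planetary Science"},
    {"name": "landslide-utils", "division": None},
]


async def classify_division(query: str) -> Optional[str]:
    """Stub for the LLM classifier; a real agent would call the model here."""
    return "Earth Science" if "landslide" in query.lower() else None


async def search_tool(query: str, division: Optional[str] = None) -> List[dict]:
    """Deterministic tool: keyword match on the name, then optional division filter."""
    terms = query.lower().split()
    hits = [r for r in REPOS if any(t in r["name"] for t in terms)]
    if division is not None:
        hits = [r for r in hits if r["division"] == division]
    return hits


async def run_agent(query: str, max_iterations: int = 2) -> List[dict]:
    # Classify once, up front, so all non-determinism stays in the agent layer.
    division = await classify_division(query)
    results_full: List[dict] = []
    seen = set()
    for _ in range(max_iterations):
        results = await search_tool(query, division=division)
        new = [r for r in results if r["name"] not in seen]
        if not new:
            break  # nothing new came back, so stop iterating
        seen.update(r["name"] for r in new)
        results_full.extend(new)
        # Query reformulation would go here; omitted in this sketch.
    return results_full
```

Calling `asyncio.run(run_agent("landslide nepal"))` exercises the filtered path, while a query the stub cannot classify, such as `"mars"`, goes through the tool unfiltered.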

This way, tool behaviour is deterministic and all the non-determinism comes from the LLM/agent workflow. It also lets us make the case that the agent improved because we did classification, then the tool call, then filtering, then checks, and so on. The tool tentatively stays the same and we focus only on agentic improvement. This avoids having to pinpoint the source of non-determinism. If the tool is non-deterministic, we'd have a hard time replicating tool call results because there's an LLM layer in it. This is also the problem I have with LinkAssessor: replication is hard with it. (There is some leeway there: because we give it so many URLs, tens of them, some non-determinism is fine. Just a side note.)

This is also the reason we're not using Searchpipeline in akd lit v2: Searchpipeline was somewhat non-deterministic. But again, since it works with hundreds of URLs, it is statistically less non-deterministic.
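
One way to put a number on the replication concern is to run the classifier several times on the same query and measure how often the most common label comes back. The sketch below assumes a synchronous `classify` callable and uses a seeded random stub in place of a real model; `agreement_rate` and `make_flaky_classifier` are hypothetical helpers, not part of akd.

```python
import random
from collections import Counter
from typing import Callable, Tuple


def agreement_rate(
    classify: Callable[[str], str],
    query: str,
    runs: int = 20,
) -> Tuple[str, float]:
    """Return the most common label and the fraction of runs that produced it."""
    labels = [classify(query) for _ in range(runs)]
    top_label, top_count = Counter(labels).most_common(1)[0]
    return top_label, top_count / runs


def make_flaky_classifier(seed: int = 0) -> Callable[[str], str]:
    """Stub that imitates a non-deterministic model by sampling between two labels."""
    rng = random.Random(seed)

    def classify(query: str) -> str:
        return rng.choice(["Earth Science", "Planetary Science"])

    return classify
```

A deterministic component scores 1.0 by construction; a classifier that scores noticeably lower on the same query is a sign that classifying once and reusing the result across iterations is the safer choice.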

Yeah, makes sense

But thank you for the work. You're on the right track. Let's move the non-determinism into the agentic workflow rather than the tool. That way, the tool call will always be our baseline and always replicable. These comments are feedback, not meant to discourage you from thinking about the process.

Yup, got it. I'll continue working on this PR after tomorrow's SME session.

github-actions · 2025-11-20T21:33:57Z

❌ Tests failed (exit code: 1)

📊 Test Results

Passed: 557
Failed: 7
Skipped: 7
Warnings: 165
Coverage: 80%

Branch: feature/code-agent-intent
PR: #273
Commit: 34bfd2b

📋 Full coverage report and logs are available in the workflow run.

github-actions · 2025-11-23T23:07:44Z

❌ Tests failed (exit code: 1)

📊 Test Results

Passed: 559
Failed: 5
Skipped: 7
Warnings: 166
Coverage: 80%

Branch: feature/code-agent-intent
PR: #273
Commit: ecbdd0a

📋 Full coverage report and logs are available in the workflow run.

pranath-reddy added 4 commits November 5, 2025 11:41

add division classification agent

1a16686

- Add a division classifier under `agents/intents`
- Add system prompt based on SDE document used in github repo search
- Add division filtering support to both local and SDE code search tools

*To-Do:*
- Add division classifier to the agent flow

Merge remote-tracking branch 'origin/develop' into feature/code-agent…

5fb63a6

…-intent

Update code_prompts.py

3ea1305

- Update division classifier prompt based on Rachel's feedback

update code search

14edda6

- integrate division classifier into the code search sub tools
- enable division classifier flag at the agent level

pranath-reddy requested a review from NISH1001 November 12, 2025 05:55

pranath-reddy self-assigned this Nov 12, 2025

pranath-reddy temporarily deployed to integration November 12, 2025 05:55 — with GitHub Actions Inactive

NISH1001 requested changes Nov 12, 2025

Merge remote-tracking branch 'origin/develop' into feature/code-agent…

2658ef2

…-intent

pranath-reddy temporarily deployed to integration November 20, 2025 21:20 — with GitHub Actions Inactive

Update code search agent

1f434cb

- Remove division classifier from the code agent

pranath-reddy temporarily deployed to integration November 23, 2025 22:55 — with GitHub Actions Inactive


3 participants
