-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor navigation commands into its own tool #35
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this! I've left some comments and questions - hopefully they are helpful! I want to dig a bit further into some other things such as output tree rendering, but thought I would share what I have so far.
|
||
TOOL_DESCRIPTION = """Custom navigation tool for navigating to symbols in a codebase | ||
* If there are multiple symbols with the same name, the tool will print all of them | ||
* The `jump_to_definition` command will print the FULL definition of the symbol, along with the absolute path to the file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have you tested cases where the definition is for a dependency which has not yet been resolved / downloaded? I don't mean anything exotic like dynamically downloaded dependencies, but a simple case of finding the definition from an external dependency which hasn't yet been installed by a package manager or what have you? just curious if this can be surfaced in a meaningful way that OH is able to reason about and install the dependencies itself or prompt the user to do so?
with pytest.raises(Exception) as exc_info: | ||
navigator(command='invalid_command', symbol_name='MyClass') | ||
|
||
assert 'Unrecognized command' in str(exc_info.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there further plans to test E2E that OH is using the Navigator CLI when it would by expected to do so or does that introduce too much bias / reduce agency? It would seem that there might at least be some basic tests along these lines where the obvious choice should be to use this tool. I'm not really concerned with influencing the agency of the OH "agent", but more saying that it would be good to have a way to test the efficacy of this tool and it's relevant prompts such that we have confidence that 1.) OH is well aware of it as a valid option when it is contextually appropriate and that 2.) it is able to appropriately call the right CLI commands and effectively navigate the results (understanding this may not be perfect perhaps this could be based on some threshold percentage?). My thinking here is that this would help build confidence that the OH eval results are actually testing the desired functionality. Also are eval results the sole arbiter of whether or not a new feature will make the cut and if so where can I learn more about how that process works and how it is measured? A concern is what might happen if, for instance, this led to significantly lower token usage, but performed slightly worse perhaps because more brute force methods pull more code into the context. It could be that ancillary work to endow OH with a better ability to logically reason about code could unlock improved eval scores. In other words how do you avoid throwing away perfectly good tools?
|
||
Command = Literal[ | ||
'view', | ||
'create', | ||
'str_replace', | ||
'insert', | ||
'undo_edit', | ||
# 'jump_to_definition', TODO: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if there's a formal board and backlog item to link these TODO items to, but in my experience these kinds of things often never get done otherwise
def get_definitions_tree( | ||
self, symbol: str, rel_file_path: str | None = None, use_end_line=True | ||
): | ||
if not self.git_utils: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious why this is required for Navigation commands? perhpaps that will become more obvious as I read on..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see this as being a useful way for eagerly anticipating files that may need to be refreshed from the cache although perhaps that would only make sense if you're running some background indexing process
) # (relative file, symbol identifier) -> set of its REF tags | ||
|
||
all_abs_files_iter = ( | ||
tqdm(all_abs_files, desc='Parsing tags', unit='file') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tqdm(all_abs_files, desc='Parsing tags', unit='file') | |
# tdqm is a library which helps us track progress as we're | |
# iterating through the list of files we got from git utils | |
tqdm(all_abs_files, desc='Parsing tags', unit='file') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps this library is more well-known than I'm aware, but I thought it might be helpful to add some context here given the naming does not convey a clear utility for this import - alternatively alias the import as something like progress_tracker
?
parsed_tags = self.ts_parser.get_tags_from_file(abs_file, rel_file) | ||
|
||
for parsed_tag in parsed_tags: | ||
if parsed_tag.tag_kind == TagKind.DEF: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit of a nitpick, but you could separate the file finding and iteration using a generator and then doing the actual processing on the relative files as you iterate through the rel_file
yielded by the generator. This would allow you to have separate methods for getting definition and reference tags without having to repeat code. I understand you may still need to go through parsed_tags = self.ts_parser.get_tags_from_file(abs_file, rel_file)
for now, but it would allow these methods to evolve independently and make the methods and usages more readable. Again - bit of a nitpick. If there were separate methods to get the definition tags and reference tags perhaps I'd feel a bit more strongly about this 🙃.
if not def_tags: | ||
# Perform a fuzzy search for the symbol | ||
choices = list(ident2defrels.keys()) | ||
suggested_matches = process.extract(symbol, choices, limit=5) | ||
return f"No definitions found for `{symbol}`. Maybe you meant one of these: {', '.join(match[0] for match in suggested_matches)}?" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this and think it's a thoughtful follow-up as opposed to just bailing with "not found", but could it actually be better to just bail or just hint to the agent that a fuzzy search (or some less biased language)might be a good next step? Sort of playing devils advocate here. In a more traditional program I'd say this is doing too much, but this is a different paradigm I'm still getting used to.
) | ||
|
||
# Truncate long lines in case we get minified js or something else crazy | ||
output = '\n'.join(line[:150] for line in output.splitlines()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you get start and end positions from the parsed tags to help with this sort of thing (minified js or what have you)?
Description
This PR is to:
TODO:
This is an alternative to #5.
Related Issue
Close #28