fix: canonicalize legacy cli crawler command surface#23
Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
🔍 AI Code Review🔵 Info
🟡 Warning
🔴 Critical
Summary: 🤖 gpt-4.1-mini · 1955 tokens · GitHub Models free tier · 0 premium requests |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refactors the CLI crawler's command-line interface to establish a canonical entry point and argument handling. It centralizes the core logic within the Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request is a solid refactoring that converts the legacy cli_crawler.py into a compatibility shim, delegating to the canonical cli-crawler behavior. The normalization of --include-raw to --raw with deprecation warnings is a good improvement for user experience.
I've found a bug in the argument normalization logic that occurs when --include-raw is used before --raw, causing duplicate arguments. I've provided a fix for this. Additionally, I've suggested adding a new unit test to cover this specific scenario and prevent future regressions.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
🔍 AI Code Review🔵 CI Workflow (.github/workflows/ci.yml)
🟡 Suggestion: Consider caching pip packages and build dependencies to speed up CI. 🟡 Suggestion: Add explicit error handling or output on build failures for easier debugging. 🔴 cli_crawler.py
🟡 Warning: Removing the CLI argument parsing and logic from this script means any users relying on this script directly will lose functionality unless they switch to the canonical CLI command. 🟡 Best practice: If this script is kept for backward compatibility, consider adding a warning message to users about the deprecation or redirecting them explicitly. Overall:
Summary:
No critical security or functional bugs found. 🤖 gpt-4.1-mini · 2099 tokens · GitHub Models free tier · 0 premium requests |
Summary\n- convert legacy cli_crawler.py into a thin compatibility shim delegating to canonical cli-crawler behavior\n- normalize --include-raw to --raw with explicit deprecation warnings and deterministic precedence\n- add unit tests for legacy raw flag compatibility and canonical command nudge\n\nValidation\n- ruff check cli_crawler.py src/crawler/cli_crawler.py tests/unit/test_cli_crawler_compat.py\n- PYTHONPATH=src pytest tests/unit/test_cli_crawler_compat.py -v