-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
fix: add retry logic for planning-to-coding transition (#495) #1276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: add retry logic for planning-to-coding transition (#495) #1276
Conversation
The coder agent could get stuck after planning completes because get_next_subtask() may return None briefly due to file I/O timing. - Add just_transitioned_from_planning flag to detect transition - Retry with exponential backoff (2s, 4s, 6s) after planning - Update subtask_id and phase_name after successful retry Fixes AndyMik90#495
📝 WalkthroughWalkthroughIntroduces a Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @kaigler, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the stability of the system by resolving a critical race condition that could halt task progression. By implementing a strategic retry mechanism during the transition from planning to coding, it ensures that the system reliably identifies and processes subsequent subtasks, preventing tasks from becoming unresponsive and improving overall workflow continuity. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 Thanks for your first PR!
A maintainer will review it soon. Please make sure:
- Your branch is synced with
develop - CI checks pass
- You've followed our contribution guide
Welcome to the Auto Claude community!
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
| if next_subtask: | ||
| # Update subtask_id and phase_name after successful retry | ||
| subtask_id = next_subtask.get("id") | ||
| phase_name = next_subtask.get("phase_name") |
Check warning
Code scanning / CodeQL
Variable defined multiple times Warning
redefined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a retry mechanism to address a race condition that occurs when transitioning from the planning to the coding phase. The changes look solid and directly address the issue described. The use of a flag to detect the transition is a good approach.
My review includes a suggestion to align the backoff implementation with the 'exponential backoff' mentioned in the comments and PR description, as the current implementation is linear. I've also recommended extracting hardcoded values for retries and delays into constants to improve code maintainability.
Overall, this is a valuable fix for a tricky timing issue.
| for retry_attempt in range(3): | ||
| delay = (retry_attempt + 1) * 2 # 2s, 4s, 6s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This retry logic is a good improvement. To make it even better, I have two suggestions:
-
Exponential vs. Linear Backoff: The comment on line 346 indicates 'exponential backoff', but the delay calculation on line 352 is linear. It would be better to align the implementation with the comment by using an actual exponential backoff.
-
Magic Numbers: The retry count
3and the delay base2are hardcoded. Extracting these into named constants (e.g.,PLANNING_TRANSITION_RETRIES,RETRY_DELAY_BASE_SECONDS) at a higher scope would improve readability and maintainability.
Here's a suggestion that implements exponential backoff. The constants for retry count and delay base can be defined elsewhere.
| for retry_attempt in range(3): | |
| delay = (retry_attempt + 1) * 2 # 2s, 4s, 6s | |
| for retry_attempt in range(3): | |
| delay = 2 ** (retry_attempt + 1) # Exponential backoff: 2s, 4s, 8s |
AndyMik90
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ Auto Claude Review - APPROVED
Status: Ready to Merge
Summary: ### Merge Verdict: ✅ READY TO MERGE
✅ Ready to merge - All checks passing, no blocking issues found.
No blocking issues. 4 non-blocking suggestion(s) to consider
Risk Assessment
| Factor | Level | Notes |
|---|---|---|
| Complexity | Low | Based on lines changed |
| Security Impact | None | Based on security findings |
| Scope Coherence | Good | Based on structural review |
Findings Summary
- Low: 4 issue(s)
Generated by Auto Claude PR Review
💡 Suggestions (4)
These are non-blocking suggestions for consideration:
🔵 [9a3b10490c71] [LOW] Comment says 'exponential backoff' but implementation is linear
📁 apps/backend/agents/coder.py:346
The comment on line 346 states 'Retry with exponential backoff' but the implementation delay = (retry_attempt + 1) * 2 produces 2s, 4s, 6s (linear progression). True exponential backoff would be 2 ** (retry_attempt + 1) producing 2s, 4s, 8s. This is a documentation accuracy issue - the actual delays work fine for the use case.
Suggested fix:
Either update comment to 'Retry with linear backoff before giving up.' or change formula to `delay = 2 ** (retry_attempt + 1)` for true exponential (2s, 4s, 8s).
🔵 [75ceb8034f37] [LOW] Success message reports single-iteration delay instead of cumulative wait time
📁 apps/backend/agents/coder.py:359
When a subtask is found after retry, the message reports f'Found subtask {subtask_id} after {delay}s delay'. However, delay is only the current iteration's delay, not cumulative time. For example, if found on retry 2 (after sleeping 2s then 4s), message says '4s delay' when actual wait was 6s total. Minor but could cause debugging confusion.
Suggested fix:
Track cumulative delay: `total_delay = 0` before loop, `total_delay += delay` after each sleep, then report `f'Found subtask {subtask_id} after {total_delay}s total delay'`
🔵 [c41e1c93e801] [LOW] Consider extracting retry configuration as named constants
📁 apps/backend/agents/coder.py:351
The retry count (3) and base delay (2) are hardcoded inline. The codebase has patterns for such constants (e.g., MAX_RETRIES in spec/phases/models.py, AUTO_CONTINUE_DELAY_SECONDS in agents/base.py). Extracting these would improve discoverability and make tuning easier. The inline comment # 2s, 4s, 6s documents the behavior adequately for now.
Suggested fix:
Add constants to agents/base.py: `PLAN_READY_MAX_RETRIES = 3` and `PLAN_READY_BASE_DELAY_SECONDS = 2`, then use them in the loop.
🔵 [381534ff5b73] [LOW] AI Triage: GitHub Advanced Security 'variable defined multiple times' is FALSE POSITIVE
📁 apps/backend/agents/coder.py:358
GitHub Advanced Security flagged line 358 phase_name = next_subtask.get("phase_name") as unnecessary. This is incorrect - the initial assignment on line 248 IS used on line 252 (if phase_name:) and line 269 (print_session_header). The reassignment on line 358 only occurs in the specific retry-success branch. Both assignments serve distinct purposes in different code paths.
Suggested fix:
No fix needed - this is a false positive from the static analysis tool.
This automated review found no blocking issues. The PR can be safely merged.
Generated by Auto Claude
Summary
get_next_subtask()may returnNonebriefly due to file I/O timingChanges
just_transitioned_from_planningflag inapps/backend/agents/coder.pysubtask_idandphase_nameafter successful retryTest Plan
python run.py --spec 002 --force --auto-continueRelated Issues
🤖 Generated with Claude Code
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.