@kaigler kaigler commented Jan 18, 2026

Summary

  • Fixes race condition where tasks get stuck after planning phase completes
  • Root cause: get_next_subtask() may return None briefly due to file I/O timing
  • Solution: Add retry logic with exponential backoff when transitioning from planning to coding

Changes

  • Added just_transitioned_from_planning flag in apps/backend/agents/coder.py
  • Added retry loop (3 attempts, 2s/4s/6s delays) when no subtask found after planning
  • Updates subtask_id and phase_name after successful retry
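
The retry described in the list above can be sketched roughly as follows. This is a minimal illustration rather than the code in apps/backend/agents/coder.py: the helper name `wait_for_subtask`, its parameters, and the injectable `sleep` are hypothetical, and the sketch assumes the loop sleeps before each re-check.

```python
import time
from typing import Callable, Optional


def wait_for_subtask(
    get_next_subtask: Callable[[], Optional[dict]],
    retries: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Re-check for a pending subtask a few times before giving up.

    Hypothetical helper: `get_next_subtask` stands in for the real lookup,
    and `sleep` is injectable so the loop can be exercised without waiting.
    """
    for attempt in range(retries):
        delay = (attempt + 1) * base_delay  # 2s, 4s, 6s with the defaults
        sleep(delay)
        subtask = get_next_subtask()
        if subtask:
            return subtask
    return None  # caller falls back to its original "no subtask" handling
```

A caller would only take this path when the just_transitioned_from_planning flag is set, and would refresh subtask_id and phase_name from the returned dict.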

Test Plan

  • Tested via CLI: python run.py --spec 002 --force --auto-continue
  • Tested via Electron frontend: Started task from UI
  • Both successfully transitioned from planning to coding
  • Subtasks completed without getting stuck
  • All 1575 backend tests pass locally

Related Issues

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Improved agent stability during the planning-to-coding phase transition by adding automatic retries that absorb brief timing delays before the next subtask becomes available.


The coder agent could get stuck after planning completes because
get_next_subtask() may return None briefly due to file I/O timing.

- Add just_transitioned_from_planning flag to detect transition
- Retry with exponential backoff (2s, 4s, 6s) after planning
- Update subtask_id and phase_name after successful retry

Fixes AndyMik90#495

coderabbitai bot commented Jan 18, 2026

Walkthrough

Introduces a just_transitioned_from_planning flag to track planning-to-coding phase transitions in the coder agent. Adds a retry mechanism with exponential backoff (2s, 4s, 6s) that re-checks for pending subtasks up to three times when no next_subtask is available immediately after transition, handling potential race conditions during phase change.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Retry logic for planning-to-coding transition (apps/backend/agents/coder.py) | Added just_transitioned_from_planning flag to gate race-condition handling. Implements exponential backoff retry mechanism (3 attempts with 2s, 4s, 6s delays) to wait for next_subtask availability after planning phase completes. Falls back to original flow if no subtasks are found after retries. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested labels

bug, size/S

Suggested reviewers

  • AlexMadera

Poem

🐰 A race condition sparked a fix so neat,
With flags and retries, the logic's complete!
Planning to coding, a transition so grand,
Now subtasks await as we carefully planned!

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Linked Issues check | ⚠️ Warning | The PR implements retry logic for a race condition during planning-to-coding transition, but the linked issue #495 requires explicit status transition from 'human_review' to 'in_progress' with approval validation, which is not present in the implementation. | Implement the explicit status transition logic checking plan.status == 'human_review' and plan.planStatus == 'review', then validate approval and transition to 'in_progress' before retrying for subtasks. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly describes the main change: adding retry logic for the planning-to-coding transition, which matches the core implementation of the retry mechanism with exponential backoff. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing retry logic for the planning-to-coding transition, which addresses the race condition mentioned in issue #495. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@gemini-code-assist

Summary of Changes

Hello @kaigler, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the stability of the system by resolving a critical race condition that could halt task progression. By implementing a strategic retry mechanism during the transition from planning to coding, it ensures that the system reliably identifies and processes subsequent subtasks, preventing tasks from becoming unresponsive and improving overall workflow continuity.

Highlights

  • Race Condition Fix: Addresses a race condition where tasks could get stuck after the planning phase due to get_next_subtask() briefly returning None.
  • Retry Logic Implementation: Introduces robust retry logic with exponential backoff (3 attempts, 2s/4s/6s delays) when transitioning from planning to coding, ensuring subtasks are properly picked up.
  • State Management: A new just_transitioned_from_planning flag is used to specifically trigger the retry mechanism only when immediately following the planning phase.


@github-actions github-actions bot left a comment

🎉 Thanks for your first PR!

A maintainer will review it soon. Please make sure:

  • Your branch is synced with develop
  • CI checks pass
  • You've followed our contribution guide

Welcome to the Auto Claude community!


sentry bot commented Jan 18, 2026

Codecov Report

❌ Patch coverage is 11.76471% with 15 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| apps/backend/agents/coder.py | 11.76% | 15 Missing ⚠️ |


if next_subtask:
    # Update subtask_id and phase_name after successful retry
    subtask_id = next_subtask.get("id")
    phase_name = next_subtask.get("phase_name")

Check warning

Code scanning / CodeQL

Variable defined multiple times Warning

This assignment to 'phase_name' is unnecessary as it is redefined before this value is used.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a retry mechanism to address a race condition that occurs when transitioning from the planning to the coding phase. The changes look solid and directly address the issue described. The use of a flag to detect the transition is a good approach.

My review includes a suggestion to align the backoff implementation with the 'exponential backoff' mentioned in the comments and PR description, as the current implementation is linear. I've also recommended extracting hardcoded values for retries and delays into constants to improve code maintainability.

Overall, this is a valuable fix for a tricky timing issue.

Comment on lines +351 to +352
for retry_attempt in range(3):
    delay = (retry_attempt + 1) * 2  # 2s, 4s, 6s

medium

This retry logic is a good improvement. To make it even better, I have two suggestions:

  1. Exponential vs. Linear Backoff: The comment on line 346 indicates 'exponential backoff', but the delay calculation on line 352 is linear. It would be better to align the implementation with the comment by using an actual exponential backoff.

  2. Magic Numbers: The retry count 3 and the delay base 2 are hardcoded. Extracting these into named constants (e.g., PLANNING_TRANSITION_RETRIES, RETRY_DELAY_BASE_SECONDS) at a higher scope would improve readability and maintainability.

Here's a suggestion that implements exponential backoff. The constants for retry count and delay base can be defined elsewhere.

Suggested change
- for retry_attempt in range(3):
-     delay = (retry_attempt + 1) * 2  # 2s, 4s, 6s
+ for retry_attempt in range(3):
+     delay = 2 ** (retry_attempt + 1)  # Exponential backoff: 2s, 4s, 8s
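
To make the difference between the two formulas concrete, here is a small standalone comparison. The function names `linear_delays` and `exponential_delays` are illustrative only and do not appear in the PR.

```python
def linear_delays(retries: int, base: int = 2) -> list[int]:
    """Delays produced by the merged formula: (attempt + 1) * base."""
    return [(attempt + 1) * base for attempt in range(retries)]


def exponential_delays(retries: int, base: int = 2) -> list[int]:
    """Delays produced by the suggested formula: base ** (attempt + 1)."""
    return [base ** (attempt + 1) for attempt in range(retries)]


print(linear_delays(3))       # [2, 4, 6]
print(exponential_delays(3))  # [2, 4, 8]
```

With three attempts and a base of 2, the schedules only diverge on the last attempt (6s versus 8s), which is consistent with the reviewers treating this as a naming issue rather than a functional one.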

@kaigler kaigler mentioned this pull request Jan 18, 2026
1 task
@AndyMik90 AndyMik90 self-assigned this Jan 18, 2026

@AndyMik90 AndyMik90 left a comment

✅ Auto Claude Review - APPROVED

Status: Ready to Merge

Summary: Merge Verdict: ✅ READY TO MERGE

✅ Ready to merge - All checks passing, no blocking issues found.

No blocking issues. 4 non-blocking suggestion(s) to consider

Risk Assessment

| Factor | Level | Notes |
|---|---|---|
| Complexity | Low | Based on lines changed |
| Security Impact | None | Based on security findings |
| Scope Coherence | Good | Based on structural review |

Findings Summary

  • Low: 4 issue(s)

Generated by Auto Claude PR Review


💡 Suggestions (4)

These are non-blocking suggestions for consideration:

🔵 [9a3b10490c71] [LOW] Comment says 'exponential backoff' but implementation is linear

📁 apps/backend/agents/coder.py:346

The comment on line 346 states 'Retry with exponential backoff' but the implementation delay = (retry_attempt + 1) * 2 produces 2s, 4s, 6s (linear progression). True exponential backoff would be 2 ** (retry_attempt + 1) producing 2s, 4s, 8s. This is a documentation accuracy issue - the actual delays work fine for the use case.

Suggested fix:

Either update comment to 'Retry with linear backoff before giving up.' or change formula to `delay = 2 ** (retry_attempt + 1)` for true exponential (2s, 4s, 8s).

🔵 [75ceb8034f37] [LOW] Success message reports single-iteration delay instead of cumulative wait time

📁 apps/backend/agents/coder.py:359

When a subtask is found after retry, the message reports f'Found subtask {subtask_id} after {delay}s delay'. However, delay is only the current iteration's delay, not cumulative time. For example, if found on retry 2 (after sleeping 2s then 4s), message says '4s delay' when actual wait was 6s total. Minor but could cause debugging confusion.

Suggested fix:

Track cumulative delay: `total_delay = 0` before loop, `total_delay += delay` after each sleep, then report `f'Found subtask {subtask_id} after {total_delay}s total delay'`
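
A sketch of what that cumulative tracking might look like, assuming the loop sleeps before each re-check. The function `retry_with_total_delay` and its parameters are hypothetical stand-ins, not the actual code in coder.py.

```python
import time
from typing import Callable, Optional, Tuple


def retry_with_total_delay(
    get_next_subtask: Callable[[], Optional[dict]],
    retries: int = 3,
    base_delay: int = 2,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[dict], int]:
    """Return the subtask (or None) together with the total seconds waited."""
    total_delay = 0
    for attempt in range(retries):
        delay = (attempt + 1) * base_delay
        sleep(delay)
        total_delay += delay  # accumulate instead of reporting only `delay`
        subtask = get_next_subtask()
        if subtask:
            print(f"Found subtask {subtask.get('id')} after {total_delay}s total delay")
            return subtask, total_delay
    return None, total_delay
```

If the subtask appears on the second attempt, this reports 6s (2s + 4s) rather than the 4s of the final iteration alone.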

🔵 [c41e1c93e801] [LOW] Consider extracting retry configuration as named constants

📁 apps/backend/agents/coder.py:351

The retry count (3) and base delay (2) are hardcoded inline. The codebase has patterns for such constants (e.g., MAX_RETRIES in spec/phases/models.py, AUTO_CONTINUE_DELAY_SECONDS in agents/base.py). Extracting these would improve discoverability and make tuning easier. The inline comment # 2s, 4s, 6s documents the behavior adequately for now.

Suggested fix:

Add constants to agents/base.py: `PLAN_READY_MAX_RETRIES = 3` and `PLAN_READY_BASE_DELAY_SECONDS = 2`, then use them in the loop.
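
For illustration, the constants named in this suggestion could be used along these lines. The constant names come from the suggestion itself; the helper `plan_ready_delays` is hypothetical and exists only to show the values being consumed.

```python
# Names taken from the review suggestion; placing them in agents/base.py is the
# reviewer's proposal, not something this sketch verifies.
PLAN_READY_MAX_RETRIES = 3
PLAN_READY_BASE_DELAY_SECONDS = 2


def plan_ready_delays(
    max_retries: int = PLAN_READY_MAX_RETRIES,
    base_delay: int = PLAN_READY_BASE_DELAY_SECONDS,
) -> list[int]:
    """Hypothetical helper: the per-attempt delays implied by the constants."""
    return [(attempt + 1) * base_delay for attempt in range(max_retries)]


for delay in plan_ready_delays():
    print(f"would sleep {delay}s before re-checking for a subtask")
```

Keeping both values in one place means the retry schedule can be tuned without touching the loop body.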

🔵 [381534ff5b73] [LOW] AI Triage: GitHub Advanced Security 'variable defined multiple times' is FALSE POSITIVE

📁 apps/backend/agents/coder.py:358

GitHub Advanced Security flagged line 358 phase_name = next_subtask.get("phase_name") as unnecessary. This is incorrect - the initial assignment on line 248 IS used on line 252 (if phase_name:) and line 269 (print_session_header). The reassignment on line 358 only occurs in the specific retry-success branch. Both assignments serve distinct purposes in different code paths.

Suggested fix:

No fix needed - this is a false positive from the static analysis tool.

This automated review found no blocking issues. The PR can be safely merged.

Generated by Auto Claude

@AndyMik90 AndyMik90 merged commit b865590 into AndyMik90:develop Jan 18, 2026
26 checks passed
@kaigler kaigler deleted the fix/495-planning-to-coding-race-condition branch January 18, 2026 15:35

Development

Successfully merging this pull request may close these issues.

Task execution stops after planning phase despite approval

2 participants