6 changes: 3 additions & 3 deletions docs/cli/planner.md
@@ -16,11 +16,11 @@ lol plan --refine <issue-no> [refinement-instructions]

1. **Understander** (sonnet) - Gathers codebase context with `Read,Grep,Glob` tools
2. **Bold-proposer** (opus) - Researches SOTA solutions and proposes innovative approaches with `Read,Grep,Glob,WebSearch,WebFetch` tools and `--permission-mode plan`
-3. **Critique** (opus) - Validates assumptions and analyzes feasibility (runs in parallel with Reducer)
-4. **Reducer** (opus) - Simplifies proposal following "less is more" philosophy (runs in parallel with Critique)
+3. **Critique** (opus) - Validates assumptions and analyzes feasibility (always runs in parallel with Reducer)
+4. **Reducer** (opus) - Simplifies proposal following "less is more" philosophy (always runs in parallel with Critique)
5. **Consensus** (opus) - Synthesizes final plan from the three reports using the external-consensus prompt

-Both critique and reducer append plan-guideline content and run in parallel via the Python executor.
+Critique and reducer append plan-guideline content and always run in parallel via the Python executor; there is no sequential mode.

### `--dry-run` (optional flag)

5 changes: 2 additions & 3 deletions docs/feat/core/ultra-planner.md
@@ -65,7 +65,7 @@ graph TD
U --> R{Lite conditions met?}
R -->|yes: repo-only, <5 files, <150 LOC| L[Planner-lite: Single-agent plan]
R -->|no: needs research or complex| C[Bold-proposer: Research SOTA]
-C --> D[Critique + Reducer in parallel]
+C --> D[Critique + Reducer (parallel-only)]
D --> F[Combined 3-perspective report]
L --> H[Update issue with plan]
F --> G[External consensus: Synthesize plan]
@@ -258,7 +258,7 @@ After reviewing a plan issue:
**Breakdown:**
- Understander agent: 1-2 minutes (codebase exploration + complexity estimation)
- Bold-proposer agent: 2-3 minutes (research + proposal, with context)
-- Critique + Reducer agents (parallel): 2-3 minutes
+- Critique + Reducer agents (parallel-only): 2-3 minutes
- External consensus review: 1-2 minutes
- Draft issue creation: <10 seconds

@@ -401,4 +401,3 @@ This does not change the `/ultra-planner` command interface documented above. Se
| **Workflow** | Approval → Issue → Impl | Issue → Refine* → Impl |

*Refinement is optional and can be done multiple times using `--refine`

5 changes: 3 additions & 2 deletions python/agentize/workflow/README.md
@@ -37,7 +37,7 @@ and output suffix.
```
understander → bold → critique → reducer → consensus (optional)
↓ ↓
-(parallel when enabled)
+(parallel-only)
```

1. **Understander**: Gathers codebase context and constraints
@@ -46,6 +46,8 @@ understander → bold → critique → reducer → consensus (optional)
4. **Reducer**: Simplifies proposals following "less is more" philosophy
5. **Consensus**: Synthesizes a unified implementation plan (optional for library use; CLI delegates to the external consensus script)

+Critique and reducer are always executed in parallel.
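
As a rough illustration of what "always in parallel" means for two independent stages, the sketch below runs two stand-in stage functions concurrently with `concurrent.futures.ThreadPoolExecutor` and collects their outputs keyed by stage name. It is not the `Session.run_parallel` implementation; `run_stage`, `run_stages_in_parallel`, and the prompt strings are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stage(stage: str, prompt: str) -> str:
    """Hypothetical stand-in for one stage; a real stage would call a model backend."""
    return f"[{stage}] {prompt}"


def run_stages_in_parallel(calls: dict[str, str], max_workers: int = 2) -> dict[str, str]:
    """Run every (stage, prompt) pair concurrently and return outputs keyed by stage."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all stages first so they can overlap, then wait for each one.
        futures = {
            stage: pool.submit(run_stage, stage, prompt)
            for stage, prompt in calls.items()
        }
        # result() blocks until that stage finishes and re-raises any exception it hit.
        return {stage: future.result() for stage, future in futures.items()}


outputs = run_stages_in_parallel(
    {
        "critique": "Validate the bold proposal",
        "reducer": "Simplify the bold proposal",
    }
)
```

Because both stages consume only the bold proposal and neither reads the other's output, nothing forces one to wait for the other, which may be why a separate sequential path adds little.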

## Usage

```python
@@ -54,7 +56,6 @@ from agentize.workflow import run_planner_pipeline
 results = run_planner_pipeline(
     "Add user authentication with JWT tokens",
     output_dir=".tmp",
-    parallel=True,
     output_suffix="-output.md",
     skip_consensus=True,
 )
4 changes: 1 addition & 3 deletions python/agentize/workflow/__init__.md
@@ -59,7 +59,6 @@ def run_planner_pipeline(
     *,
     output_dir: str | Path = ".tmp",
     backends: dict[str, tuple[str, str]] | None = None,
-    parallel: bool = True,
     runner: Callable[..., subprocess.CompletedProcess] = run_acw,
     prefix: str | None = None,
     output_suffix: str = "-output.md",
@@ -68,12 +67,12 @@ def run_planner_pipeline(
```

Execute the 5-stage planner pipeline: understander → bold → critique → reducer → consensus.
+Critique and reducer are always executed in parallel.
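
One likely consequence for library callers: if the signature has no catch-all `**kwargs` (the hunk does not show the full signature, so this is an assumption), a leftover `parallel=` argument at a call site would raise `TypeError`. The sketch below uses a hypothetical stand-in, `run_pipeline_stub`, with a reduced keyword-only signature to show that behaviour; it is not the real `run_planner_pipeline`.

```python
from __future__ import annotations


def run_pipeline_stub(
    feature_desc: str,
    *,
    output_dir: str = ".tmp",
    prefix: str | None = None,
) -> dict[str, str]:
    """Hypothetical stand-in with keyword-only options and no `parallel` parameter."""
    return {"feature_desc": feature_desc, "output_dir": output_dir}


# A stale call site that still passes the removed keyword fails at call time.
try:
    run_pipeline_stub("Add feature X", parallel=False)
except TypeError as exc:
    stale_call_error = str(exc)
else:
    stale_call_error = None

# The updated call site simply drops the argument.
updated = run_pipeline_stub("Add feature X", output_dir=".tmp")
```

Under that assumption, migrating is a matter of deleting `parallel=...` from each call, which matches the documentation and test edits elsewhere in this change.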

**Parameters:**
- `feature_desc`: Feature request description to plan
- `output_dir`: Directory for artifacts (default: `.tmp`)
- `backends`: Provider/model mapping per stage (default: understander uses claude/sonnet, others claude/opus)
-- `parallel`: Run critique and reducer in parallel (default: True)
- `runner`: Callable for stage execution (default: `run_acw`, injectable for testing)
- `prefix`: Artifact filename prefix (default: timestamp-based)
- `output_suffix`: Suffix appended to stage output filenames (default: `-output.md`)
@@ -162,7 +161,6 @@ from agentize.workflow import run_planner_pipeline
 results = run_planner_pipeline(
     "Implement dark mode toggle",
     backends={"consensus": ("claude", "opus")},
-    parallel=False,
 )

 for stage, result in results.items():
1 change: 0 additions & 1 deletion python/agentize/workflow/planner.md
@@ -12,7 +12,6 @@ def run_planner_pipeline(
     *,
     output_dir: str | Path = ".tmp",
     backends: dict[str, tuple[str, str]] | None = None,
-    parallel: bool = True,
     runner: Callable[..., subprocess.CompletedProcess] = run_acw,
     prefix: str | None = None,
     output_suffix: str = "-output.md",
3 changes: 1 addition & 2 deletions python/agentize/workflow/planner/README.md
@@ -4,7 +4,7 @@ Runnable package for the multi-stage planner pipeline. Provides both library int

## Purpose

-This package contains the 5-stage planner pipeline (understander → bold → critique → reducer → consensus) that powers `lol plan`. It is structured as a runnable package to support `python -m agentize.workflow.planner` invocation, with the pipeline implemented as a Session DSL example.
+This package contains the 5-stage planner pipeline (understander → bold → critique → reducer → consensus) that powers `lol plan`. Critique and reducer are always executed in parallel. It is structured as a runnable package to support `python -m agentize.workflow.planner` invocation, with the pipeline implemented as a Session DSL example.

## Invocation

@@ -28,7 +28,6 @@ from agentize.workflow.planner import run_planner_pipeline, StageResult
 results = run_planner_pipeline(
     "Implement JWT authentication",
     output_dir=".tmp",
-    parallel=True,
 )

 for stage, result in results.items():
1 change: 0 additions & 1 deletion python/agentize/workflow/planner/__init__.md
@@ -12,7 +12,6 @@ def run_planner_pipeline(
     *,
     output_dir: str | Path = ".tmp",
     backends: dict[str, tuple[str, str]] | None = None,
-    parallel: bool = True,
     runner: Callable[..., subprocess.CompletedProcess] = run_acw,
     prefix: str | None = None,
     output_suffix: str = "-output.md",
3 changes: 1 addition & 2 deletions python/agentize/workflow/planner/pipeline.md
@@ -12,7 +12,6 @@ def run_planner_pipeline(
     *,
     output_dir: str | Path = ".tmp",
     backends: dict[str, tuple[str, str]] | None = None,
-    parallel: bool = True,
     runner: Callable[..., subprocess.CompletedProcess] = run_acw,
     prefix: str | None = None,
     output_suffix: str = "-output.md",
@@ -64,7 +63,7 @@ artifacts (`*-consensus-input.md`, `*-consensus.md`).
## Design Rationale

- **Session DSL as baseline**: The planner pipeline demonstrates the imperative Session API
-  with both sequential and parallel stages.
+  with a parallel-only critique/reducer stage.
- **Explicit artifacts**: Stage-specific input/output files remain predictable and
match CLI documentation.
- **Reusable consensus stage**: Running consensus separately preserves the `.txt`
58 changes: 20 additions & 38 deletions python/agentize/workflow/planner/pipeline.py
@@ -143,7 +143,6 @@ def run_planner_pipeline(
     *,
     output_dir: str | Path = ".tmp",
     backends: dict[str, tuple[str, str]] | None = None,
-    parallel: bool = True,
     runner: Callable[..., subprocess.CompletedProcess] = run_acw,
     prefix: str | None = None,
     output_suffix: str = "-output.md",
@@ -210,47 +209,30 @@ def _backend_label(stage: str) -> str:
"reducer", feature_desc, agentize_home, bold_output
)

-    mode_label = "parallel" if parallel else "sequentially"
     _log_stage(
-        f"Stage 3-4/5: Running critique and reducer {mode_label} "
+        "Stage 3-4/5: Running critique and reducer in parallel "
         f"({_backend_label('critique')}, {_backend_label('reducer')})"
     )

-    if parallel:
-        parallel_results = session.run_parallel(
-            [
-                session.stage(
-                    "critique",
-                    critique_prompt,
-                    stage_backends["critique"],
-                    tools=STAGE_TOOLS.get("critique"),
-                    permission_mode=STAGE_PERMISSION_MODE.get("critique"),
-                ),
-                session.stage(
-                    "reducer",
-                    reducer_prompt,
-                    stage_backends["reducer"],
-                    tools=STAGE_TOOLS.get("reducer"),
-                    permission_mode=STAGE_PERMISSION_MODE.get("reducer"),
-                ),
-            ]
-        )
-        results.update(parallel_results)
-    else:
-        results["critique"] = session.run_prompt(
-            "critique",
-            critique_prompt,
-            stage_backends["critique"],
-            tools=STAGE_TOOLS.get("critique"),
-            permission_mode=STAGE_PERMISSION_MODE.get("critique"),
-        )
-        results["reducer"] = session.run_prompt(
-            "reducer",
-            reducer_prompt,
-            stage_backends["reducer"],
-            tools=STAGE_TOOLS.get("reducer"),
-            permission_mode=STAGE_PERMISSION_MODE.get("reducer"),
-        )
+    parallel_results = session.run_parallel(
+        [
+            session.stage(
+                "critique",
+                critique_prompt,
+                stage_backends["critique"],
+                tools=STAGE_TOOLS.get("critique"),
+                permission_mode=STAGE_PERMISSION_MODE.get("critique"),
+            ),
+            session.stage(
+                "reducer",
+                reducer_prompt,
+                stage_backends["reducer"],
+                tools=STAGE_TOOLS.get("reducer"),
+                permission_mode=STAGE_PERMISSION_MODE.get("reducer"),
+            ),
+        ]
+    )
+    results.update(parallel_results)

     critique_output = results["critique"].text()
     reducer_output = results["reducer"].text()
40 changes: 23 additions & 17 deletions python/tests/test_planner_workflow.py
@@ -168,37 +168,43 @@ class TestPlannerPipelineExecutionOrder:
"""Tests for correct stage execution order."""

     @pytest.mark.skipif(run_planner_pipeline is None, reason="Implementation not yet available")
-    def test_sequential_order_when_parallel_disabled(self, tmp_output_dir: Path, stub_runner: Callable):
-        """With parallel=False, stages run in deterministic order."""
+    def test_critique_reducer_run_parallel(self, tmp_output_dir: Path, stub_runner: Callable, monkeypatch):
+        """Critique and reducer are dispatched through the parallel runner."""
+        from agentize.workflow.planner import pipeline as planner_pipeline
+
+        recorded = {}
+
+        def _run_parallel(self, calls, *, max_workers: int = 2, retry: int = 0, retry_delay: float = 0):
+            call_list = list(calls)
+            recorded["stages"] = [call.stage for call in call_list]
+            results = {}
+            for call in call_list:
+                results[call.stage] = self.run_prompt(
+                    call.stage,
+                    call.prompt,
+                    call.backend,
+                    **call.options,
+                )
+            return results
+
+        monkeypatch.setattr(planner_pipeline.Session, "run_parallel", _run_parallel)
+
         run_planner_pipeline(
             "Add feature X",
             output_dir=tmp_output_dir,
             runner=stub_runner,
-            parallel=False,
             prefix="test",
         )

-        invocations = stub_runner.invocations
-        # Extract stage names from output file paths
-        stages = []
-        for inv in invocations:
-            output_path = inv["output_file"]
-            for stage in ["understander", "bold", "critique", "reducer", "consensus"]:
-                if stage in output_path:
-                    stages.append(stage)
-                    break
-
-        expected_order = ["understander", "bold", "critique", "reducer", "consensus"]
-        assert stages == expected_order
+        assert recorded.get("stages") == ["critique", "reducer"]

     @pytest.mark.skipif(run_planner_pipeline is None, reason="Implementation not yet available")
     def test_understander_runs_before_bold(self, tmp_output_dir: Path, stub_runner: Callable):
-        """Understander always runs before bold (even with parallel=True)."""
+        """Understander always runs before bold."""
         run_planner_pipeline(
             "Add feature Y",
             output_dir=tmp_output_dir,
             runner=stub_runner,
-            parallel=True,
             prefix="test",
         )

4 changes: 2 additions & 2 deletions tests/CLAUDE.md
@@ -63,7 +63,7 @@ added to `test-all.sh` or executed directly. They provide shared functionality f

## Avoiding Embedded Python Blocks

-When writing shell tests, avoid embedding `python3 -c` blocks for testing Python logic. Instead:
+When writing shell tests, avoid embedding `python -c` blocks for testing Python logic. Instead:

1. **Test Python logic in pytest**: Add test cases to `python/tests/test_*.py` files
2. **Shell tests for CLI/integration**: Keep shell tests focused on CLI invocation, environment handling, and end-to-end workflows
@@ -72,7 +72,7 @@ When writing shell tests, avoid embedding `python3 -c` blocks for testing Python
**Example - Avoid this in shell tests:**
```bash
# BAD: Embedded Python block
-result=$(python3 -c "
+result=$(python -c "
def some_function():
return 'result'
print(some_function())
4 changes: 2 additions & 2 deletions tests/cli/test-cursor-hook-before-prompt-submit.sh
@@ -26,10 +26,10 @@ EOF
)

if [ -n "$agentize_home" ]; then
-HANDSOFF_MODE="$handsoff_mode" AGENTIZE_HOME="$agentize_home" python3 "$HOOK_SCRIPT" <<< "$input"
+HANDSOFF_MODE="$handsoff_mode" AGENTIZE_HOME="$agentize_home" python "$HOOK_SCRIPT" <<< "$input"
else
# Run without AGENTIZE_HOME (in local directory context)
-(cd "$LOCAL_HOME" && unset AGENTIZE_HOME && HANDSOFF_MODE="$handsoff_mode" python3 "$HOOK_SCRIPT" <<< "$input")
+(cd "$LOCAL_HOME" && unset AGENTIZE_HOME && HANDSOFF_MODE="$handsoff_mode" python "$HOOK_SCRIPT" <<< "$input")
fi
}

4 changes: 2 additions & 2 deletions tests/cli/test-handsoff-session-path.sh
@@ -25,10 +25,10 @@ EOF
)

if [ -n "$agentize_home" ]; then
-HANDSOFF_MODE=1 AGENTIZE_HOME="$agentize_home" python3 "$HOOK_SCRIPT" <<< "$input"
+HANDSOFF_MODE=1 AGENTIZE_HOME="$agentize_home" python "$HOOK_SCRIPT" <<< "$input"
else
# Run without AGENTIZE_HOME (in local directory context)
-(cd "$LOCAL_HOME" && unset AGENTIZE_HOME && HANDSOFF_MODE=1 python3 "$HOOK_SCRIPT" <<< "$input")
+(cd "$LOCAL_HOME" && unset AGENTIZE_HOME && HANDSOFF_MODE=1 python "$HOOK_SCRIPT" <<< "$input")
fi
}
