diff --git a/docs/cli/planner.md b/docs/cli/planner.md
index ebdb1072..52263f05 100644
--- a/docs/cli/planner.md
+++ b/docs/cli/planner.md
@@ -16,11 +16,11 @@ lol plan --refine [refinement-instructions]
 
 1. **Understander** (sonnet) - Gathers codebase context with `Read,Grep,Glob` tools
 2. **Bold-proposer** (opus) - Researches SOTA solutions and proposes innovative approaches with `Read,Grep,Glob,WebSearch,WebFetch` tools and `--permission-mode plan`
-3. **Critique** (opus) - Validates assumptions and analyzes feasibility (runs in parallel with Reducer)
-4. **Reducer** (opus) - Simplifies proposal following "less is more" philosophy (runs in parallel with Critique)
+3. **Critique** (opus) - Validates assumptions and analyzes feasibility (always runs in parallel with Reducer)
+4. **Reducer** (opus) - Simplifies proposal following "less is more" philosophy (always runs in parallel with Critique)
 5. **Consensus** (opus) - Synthesizes final plan from the three reports using the external-consensus prompt
 
-Both critique and reducer append plan-guideline content and run in parallel via the Python executor.
+Critique and reducer append plan-guideline content and always run in parallel via the Python executor; there is no sequential mode.
 
 ### `--dry-run` (optional flag)
 
diff --git a/docs/feat/core/ultra-planner.md b/docs/feat/core/ultra-planner.md
index 0a158d15..8fb74f7f 100644
--- a/docs/feat/core/ultra-planner.md
+++ b/docs/feat/core/ultra-planner.md
@@ -65,7 +65,7 @@ graph TD
     U --> R{Lite conditions met?}
     R -->|yes: repo-only, <5 files, <150 LOC| L[Planner-lite: Single-agent plan]
     R -->|no: needs research or complex| C[Bold-proposer: Research SOTA]
-    C --> D[Critique + Reducer in parallel]
+    C --> D[Critique + Reducer (parallel-only)]
     D --> F[Combined 3-perspective report]
     L --> H[Update issue with plan]
     F --> G[External consensus: Synthesize plan]
@@ -258,7 +258,7 @@ After reviewing a plan issue:
 **Breakdown:**
 - Understander agent: 1-2 minutes (codebase exploration + complexity estimation)
 - Bold-proposer agent: 2-3 minutes (research + proposal, with context)
-- Critique + Reducer agents (parallel): 2-3 minutes
+- Critique + Reducer agents (parallel-only): 2-3 minutes
 - External consensus review: 1-2 minutes
 - Draft issue creation: <10 seconds
 
@@ -401,4 +401,3 @@ This does not change the `/ultra-planner` command interface documented above. Se
 | **Workflow** | Approval → Issue → Impl | Issue → Refine* → Impl |
 
 *Refinement is optional and can be done multiple times using `--refine`
-
diff --git a/python/agentize/workflow/README.md b/python/agentize/workflow/README.md
index 3845c095..3153129a 100644
--- a/python/agentize/workflow/README.md
+++ b/python/agentize/workflow/README.md
@@ -37,7 +37,7 @@ and output suffix.
 ```
 understander → bold → critique → reducer → consensus (optional)
                           ↓          ↓
-                     (parallel when enabled)
+                        (parallel-only)
 ```
 
 1. **Understander**: Gathers codebase context and constraints
@@ -46,6 +46,8 @@ understander → bold → critique → reducer → consensus (optional)
 4. **Reducer**: Simplifies proposals following "less is more" philosophy
 5. **Consensus**: Synthesizes a unified implementation plan (optional for library use; CLI delegates to the external consensus script)
 
+Critique and reducer are always executed in parallel.
+
 ## Usage
 
 ```python
@@ -54,7 +56,6 @@ from agentize.workflow import run_planner_pipeline
 results = run_planner_pipeline(
     "Add user authentication with JWT tokens",
     output_dir=".tmp",
-    parallel=True,
     output_suffix="-output.md",
     skip_consensus=True,
 )
diff --git a/python/agentize/workflow/__init__.md b/python/agentize/workflow/__init__.md
index 4b7ed51d..7abd8db0 100644
--- a/python/agentize/workflow/__init__.md
+++ b/python/agentize/workflow/__init__.md
@@ -59,7 +59,6 @@ def run_planner_pipeline(
     *,
     output_dir: str | Path = ".tmp",
     backends: dict[str, tuple[str, str]] | None = None,
-    parallel: bool = True,
     runner: Callable[..., subprocess.CompletedProcess] = run_acw,
     prefix: str | None = None,
     output_suffix: str = "-output.md",
@@ -68,12 +67,12 @@ def run_planner_pipeline(
 ```
 
 Execute the 5-stage planner pipeline: understander → bold → critique → reducer → consensus.
+Critique and reducer are always executed in parallel.
 
 **Parameters:**
 - `feature_desc`: Feature request description to plan
 - `output_dir`: Directory for artifacts (default: `.tmp`)
 - `backends`: Provider/model mapping per stage (default: understander uses claude/sonnet, others claude/opus)
-- `parallel`: Run critique and reducer in parallel (default: True)
 - `runner`: Callable for stage execution (default: `run_acw`, injectable for testing)
 - `prefix`: Artifact filename prefix (default: timestamp-based)
 - `output_suffix`: Suffix appended to stage output filenames (default: `-output.md`)
@@ -162,7 +161,6 @@ from agentize.workflow import run_planner_pipeline
 results = run_planner_pipeline(
     "Implement dark mode toggle",
     backends={"consensus": ("claude", "opus")},
-    parallel=False,
 )
 
 for stage, result in results.items():
diff --git a/python/agentize/workflow/planner.md b/python/agentize/workflow/planner.md
index d8e6de99..c8628f5f 100644
--- a/python/agentize/workflow/planner.md
+++ b/python/agentize/workflow/planner.md
@@ -12,7 +12,6 @@ def run_planner_pipeline(
     *,
     output_dir: str | Path = ".tmp",
     backends: dict[str, tuple[str, str]] | None = None,
-    parallel: bool = True,
     runner: Callable[..., subprocess.CompletedProcess] = run_acw,
     prefix: str | None = None,
     output_suffix: str = "-output.md",
diff --git a/python/agentize/workflow/planner/README.md b/python/agentize/workflow/planner/README.md
index 4b028ecb..ae9bf2cd 100644
--- a/python/agentize/workflow/planner/README.md
+++ b/python/agentize/workflow/planner/README.md
@@ -4,7 +4,7 @@ Runnable package for the multi-stage planner pipeline. Provides both library int
 
 ## Purpose
 
-This package contains the 5-stage planner pipeline (understander → bold → critique → reducer → consensus) that powers `lol plan`. It is structured as a runnable package to support `python -m agentize.workflow.planner` invocation, with the pipeline implemented as a Session DSL example.
+This package contains the 5-stage planner pipeline (understander → bold → critique → reducer → consensus) that powers `lol plan`. Critique and reducer are always executed in parallel. It is structured as a runnable package to support `python -m agentize.workflow.planner` invocation, with the pipeline implemented as a Session DSL example.
 
 ## Invocation
 
@@ -28,7 +28,6 @@ from agentize.workflow.planner import run_planner_pipeline, StageResult
 results = run_planner_pipeline(
     "Implement JWT authentication",
     output_dir=".tmp",
-    parallel=True,
 )
 
 for stage, result in results.items():
diff --git a/python/agentize/workflow/planner/__init__.md b/python/agentize/workflow/planner/__init__.md
index f9211eed..ca60bcd3 100644
--- a/python/agentize/workflow/planner/__init__.md
+++ b/python/agentize/workflow/planner/__init__.md
@@ -12,7 +12,6 @@ def run_planner_pipeline(
     *,
     output_dir: str | Path = ".tmp",
     backends: dict[str, tuple[str, str]] | None = None,
-    parallel: bool = True,
     runner: Callable[..., subprocess.CompletedProcess] = run_acw,
     prefix: str | None = None,
     output_suffix: str = "-output.md",
diff --git a/python/agentize/workflow/planner/pipeline.md b/python/agentize/workflow/planner/pipeline.md
index a4e2ef0f..84cb6e41 100644
--- a/python/agentize/workflow/planner/pipeline.md
+++ b/python/agentize/workflow/planner/pipeline.md
@@ -12,7 +12,6 @@ def run_planner_pipeline(
     *,
     output_dir: str | Path = ".tmp",
     backends: dict[str, tuple[str, str]] | None = None,
-    parallel: bool = True,
     runner: Callable[..., subprocess.CompletedProcess] = run_acw,
     prefix: str | None = None,
     output_suffix: str = "-output.md",
@@ -64,7 +63,7 @@ artifacts (`*-consensus-input.md`, `*-consensus.md`).
 ## Design Rationale
 
 - **Session DSL as baseline**: The planner pipeline demonstrates the imperative Session API
-  with both sequential and parallel stages.
+  with a parallel-only critique/reducer stage.
 - **Explicit artifacts**: Stage-specific input/output files remain predictable and match
   CLI documentation.
 - **Reusable consensus stage**: Running consensus separately preserves the `.txt`
diff --git a/python/agentize/workflow/planner/pipeline.py b/python/agentize/workflow/planner/pipeline.py
index cb5b3a00..7f73b994 100644
--- a/python/agentize/workflow/planner/pipeline.py
+++ b/python/agentize/workflow/planner/pipeline.py
@@ -143,7 +143,6 @@ def run_planner_pipeline(
     *,
     output_dir: str | Path = ".tmp",
     backends: dict[str, tuple[str, str]] | None = None,
-    parallel: bool = True,
     runner: Callable[..., subprocess.CompletedProcess] = run_acw,
     prefix: str | None = None,
     output_suffix: str = "-output.md",
@@ -210,47 +209,30 @@ def _backend_label(stage: str) -> str:
         "reducer", feature_desc, agentize_home, bold_output
     )
 
-    mode_label = "parallel" if parallel else "sequentially"
     _log_stage(
-        f"Stage 3-4/5: Running critique and reducer {mode_label} "
+        "Stage 3-4/5: Running critique and reducer in parallel "
         f"({_backend_label('critique')}, {_backend_label('reducer')})"
     )
 
-    if parallel:
-        parallel_results = session.run_parallel(
-            [
-                session.stage(
-                    "critique",
-                    critique_prompt,
-                    stage_backends["critique"],
-                    tools=STAGE_TOOLS.get("critique"),
-                    permission_mode=STAGE_PERMISSION_MODE.get("critique"),
-                ),
-                session.stage(
-                    "reducer",
-                    reducer_prompt,
-                    stage_backends["reducer"],
-                    tools=STAGE_TOOLS.get("reducer"),
-                    permission_mode=STAGE_PERMISSION_MODE.get("reducer"),
-                ),
-            ]
-        )
-        results.update(parallel_results)
-    else:
-        results["critique"] = session.run_prompt(
-            "critique",
-            critique_prompt,
-            stage_backends["critique"],
-            tools=STAGE_TOOLS.get("critique"),
-            permission_mode=STAGE_PERMISSION_MODE.get("critique"),
-        )
-        results["reducer"] = session.run_prompt(
-            "reducer",
-            reducer_prompt,
-            stage_backends["reducer"],
-            tools=STAGE_TOOLS.get("reducer"),
-            permission_mode=STAGE_PERMISSION_MODE.get("reducer"),
-        )
+    parallel_results = session.run_parallel(
+        [
+            session.stage(
+                "critique",
+                critique_prompt,
+                stage_backends["critique"],
+                tools=STAGE_TOOLS.get("critique"),
+                permission_mode=STAGE_PERMISSION_MODE.get("critique"),
+            ),
+            session.stage(
+                "reducer",
+                reducer_prompt,
+                stage_backends["reducer"],
+                tools=STAGE_TOOLS.get("reducer"),
+                permission_mode=STAGE_PERMISSION_MODE.get("reducer"),
+            ),
+        ]
+    )
+    results.update(parallel_results)
 
     critique_output = results["critique"].text()
     reducer_output = results["reducer"].text()
diff --git a/python/tests/test_planner_workflow.py b/python/tests/test_planner_workflow.py
index 3ea053dc..2d017978 100644
--- a/python/tests/test_planner_workflow.py
+++ b/python/tests/test_planner_workflow.py
@@ -168,37 +168,43 @@ class TestPlannerPipelineExecutionOrder:
     """Tests for correct stage execution order."""
 
     @pytest.mark.skipif(run_planner_pipeline is None, reason="Implementation not yet available")
-    def test_sequential_order_when_parallel_disabled(self, tmp_output_dir: Path, stub_runner: Callable):
-        """With parallel=False, stages run in deterministic order."""
+    def test_critique_reducer_run_parallel(self, tmp_output_dir: Path, stub_runner: Callable, monkeypatch):
+        """Critique and reducer are dispatched through the parallel runner."""
+        from agentize.workflow.planner import pipeline as planner_pipeline
+
+        recorded = {}
+
+        def _run_parallel(self, calls, *, max_workers: int = 2, retry: int = 0, retry_delay: float = 0):
+            call_list = list(calls)
+            recorded["stages"] = [call.stage for call in call_list]
+            results = {}
+            for call in call_list:
+                results[call.stage] = self.run_prompt(
+                    call.stage,
+                    call.prompt,
+                    call.backend,
+                    **call.options,
+                )
+            return results
+
+        monkeypatch.setattr(planner_pipeline.Session, "run_parallel", _run_parallel)
+
         run_planner_pipeline(
             "Add feature X",
             output_dir=tmp_output_dir,
             runner=stub_runner,
-            parallel=False,
             prefix="test",
         )
 
-        invocations = stub_runner.invocations
-        # Extract stage names from output file paths
-        stages = []
-        for inv in invocations:
-            output_path = inv["output_file"]
-            for stage in ["understander", "bold", "critique", "reducer", "consensus"]:
-                if stage in output_path:
-                    stages.append(stage)
-                    break
-
-        expected_order = ["understander", "bold", "critique", "reducer", "consensus"]
-        assert stages == expected_order
+        assert recorded.get("stages") == ["critique", "reducer"]
 
     @pytest.mark.skipif(run_planner_pipeline is None, reason="Implementation not yet available")
     def test_understander_runs_before_bold(self, tmp_output_dir: Path, stub_runner: Callable):
-        """Understander always runs before bold (even with parallel=True)."""
+        """Understander always runs before bold."""
         run_planner_pipeline(
             "Add feature Y",
             output_dir=tmp_output_dir,
             runner=stub_runner,
-            parallel=True,
             prefix="test",
         )
 
diff --git a/tests/CLAUDE.md b/tests/CLAUDE.md
index 1276fe3f..a0b0444a 100644
--- a/tests/CLAUDE.md
+++ b/tests/CLAUDE.md
@@ -63,7 +63,7 @@ added to `test-all.sh` or executed directly. They provide shared functionality f
 
 ## Avoiding Embedded Python Blocks
 
-When writing shell tests, avoid embedding `python3 -c` blocks for testing Python logic. Instead:
+When writing shell tests, avoid embedding `python -c` blocks for testing Python logic. Instead:
 
 1. **Test Python logic in pytest**: Add test cases to `python/tests/test_*.py` files
 2. **Shell tests for CLI/integration**: Keep shell tests focused on CLI invocation, environment handling, and end-to-end workflows
@@ -72,7 +72,7 @@ When writing shell tests, avoid embedding `python3 -c` blocks for testing Python
 **Example - Avoid this in shell tests:**
 ```bash
 # BAD: Embedded Python block
-result=$(python3 -c "
+result=$(python -c "
 def some_function():
     return 'result'
 print(some_function())
diff --git a/tests/cli/test-cursor-hook-before-prompt-submit.sh b/tests/cli/test-cursor-hook-before-prompt-submit.sh
index 8b51e1af..c64ef0fd 100755
--- a/tests/cli/test-cursor-hook-before-prompt-submit.sh
+++ b/tests/cli/test-cursor-hook-before-prompt-submit.sh
@@ -26,10 +26,10 @@ EOF
 )
 
     if [ -n "$agentize_home" ]; then
-        HANDSOFF_MODE="$handsoff_mode" AGENTIZE_HOME="$agentize_home" python3 "$HOOK_SCRIPT" <<< "$input"
+        HANDSOFF_MODE="$handsoff_mode" AGENTIZE_HOME="$agentize_home" python "$HOOK_SCRIPT" <<< "$input"
     else
         # Run without AGENTIZE_HOME (in local directory context)
-        (cd "$LOCAL_HOME" && unset AGENTIZE_HOME && HANDSOFF_MODE="$handsoff_mode" python3 "$HOOK_SCRIPT" <<< "$input")
+        (cd "$LOCAL_HOME" && unset AGENTIZE_HOME && HANDSOFF_MODE="$handsoff_mode" python "$HOOK_SCRIPT" <<< "$input")
     fi
 }
 
diff --git a/tests/cli/test-handsoff-session-path.sh b/tests/cli/test-handsoff-session-path.sh
index 590a0cbf..28db75d1 100755
--- a/tests/cli/test-handsoff-session-path.sh
+++ b/tests/cli/test-handsoff-session-path.sh
@@ -25,10 +25,10 @@ EOF
 )
 
     if [ -n "$agentize_home" ]; then
-        HANDSOFF_MODE=1 AGENTIZE_HOME="$agentize_home" python3 "$HOOK_SCRIPT" <<< "$input"
+        HANDSOFF_MODE=1 AGENTIZE_HOME="$agentize_home" python "$HOOK_SCRIPT" <<< "$input"
     else
         # Run without AGENTIZE_HOME (in local directory context)
-        (cd "$LOCAL_HOME" && unset AGENTIZE_HOME && HANDSOFF_MODE=1 python3 "$HOOK_SCRIPT" <<< "$input")
+        (cd "$LOCAL_HOME" && unset AGENTIZE_HOME && HANDSOFF_MODE=1 python "$HOOK_SCRIPT" <<< "$input")
     fi
 }
 
diff --git a/tests/cli/test-hook-permission-matching.sh b/tests/cli/test-hook-permission-matching.sh
index 32e23f2e..78c957e5 100755
--- a/tests/cli/test-hook-permission-matching.sh
+++ b/tests/cli/test-hook-permission-matching.sh
@@ -33,7 +33,7 @@ run_hook_with_fixture() {
     local input=$(jq -c ".$fixture_key" "$FIXTURE_FILE")
 
     # Run hook and extract permissionDecision (isolated from external services)
-    decision=$(unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION; echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision')
+    decision=$(unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION; echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision')
     echo "$decision"
 }
 
@@ -72,7 +72,7 @@ decision=$(run_hook_with_fixture "bash_git_push")
 test_info "Test 7: Unknown tool → ask (default)"
 input='{"tool_name":"UnknownTool","tool_input":{},"session_id":"test"}'
 # Unset Telegram to ensure test isolation
-decision=$(unset AGENTIZE_USE_TG; echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision')
+decision=$(unset AGENTIZE_USE_TG; echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision')
 [ "$decision" = "ask" ] || test_fail "Expected 'ask' for unknown tool, got '$decision'"
 
 # Test 8: Malformed pattern falls back to ask (fail-safe)
@@ -80,7 +80,7 @@ test_info "Test 8: Hook errors fall back to ask"
 # This test verifies fail-safe behavior when hook encounters errors
 # If the hook has malformed patterns, it should return 'ask' instead of crashing
 input='{"tool_name":"Bash","tool_input":{"command":"test-command"},"session_id":"test"}'
-decision=$(echo "$input" | python3 "$HOOK_SCRIPT" 2>/dev/null | jq -r '.hookSpecificOutput.permissionDecision')
+decision=$(echo "$input" | python "$HOOK_SCRIPT" 2>/dev/null | jq -r '.hookSpecificOutput.permissionDecision')
 # Should get a valid decision (not empty/null)
 [ -n "$decision" ] || test_fail "Hook should return a decision even on errors"
 [ "$decision" = "allow" ] || [ "$decision" = "deny" ] || [ "$decision" = "ask" ] || \
@@ -183,7 +183,7 @@ input=$(jq -c '.setup_viewboard_gh_auth_status' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "allow" ] || test_fail "Expected 'allow' for gh auth status in setup-viewboard workflow, got '$decision'"
 
@@ -193,7 +193,7 @@ input=$(jq -c '.setup_viewboard_gh_repo_view' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "allow" ] || test_fail "Expected 'allow' for gh repo view in setup-viewboard workflow, got '$decision'"
 
@@ -203,7 +203,7 @@ input=$(jq -c '.setup_viewboard_gh_api_graphql' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "allow" ] || test_fail "Expected 'allow' for gh api graphql in setup-viewboard workflow, got '$decision'"
 
@@ -213,7 +213,7 @@ input=$(jq -c '.setup_viewboard_gh_label_create' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "allow" ] || test_fail "Expected 'allow' for gh label create in setup-viewboard workflow, got '$decision'"
 
@@ -225,7 +225,7 @@ input=$(jq -c '.setup_viewboard_gh_api_graphql' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "ask" ] || test_fail "Expected 'ask' for gh api graphql outside workflow, got '$decision'"
 
@@ -251,7 +251,7 @@ input=$(jq -c '.order_test_global_deny_vs_workflow_allow' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$ORDER_TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "deny" ] || test_fail "Expected 'deny' for rm -rf even with workflow session, got '$decision' (global deny must override workflow)"
 
@@ -270,7 +270,7 @@ decision=$(
     export AGENTIZE_HOME="$ORDER_TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
     # Use the setup-viewboard session
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "allow" ] || test_fail "Expected 'allow' for gh api graphql after ask falls through to workflow, got '$decision'"
 
@@ -297,7 +297,7 @@ input=$(jq -c '.session_state_update_done' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "allow" ] || test_fail "Expected 'allow' for session state update to 'done', got '$decision'"
 
@@ -310,7 +310,7 @@ input=$(jq -c '.session_state_update_completed' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "allow" ] || test_fail "Expected 'allow' for session state update to 'completed', got '$decision'"
 
@@ -323,7 +323,7 @@ input=$(jq -c '.session_state_update_error' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "allow" ] || test_fail "Expected 'allow' for session state update to 'error', got '$decision'"
 
@@ -336,7 +336,7 @@ input=$(jq -c '.session_state_update_failed' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "allow" ] || test_fail "Expected 'allow' for session state update to 'failed', got '$decision'"
 
@@ -349,7 +349,7 @@ input=$(jq -c '.session_state_with_quotes' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "allow" ] || test_fail "Expected 'allow' for session state update with relative path, got '$decision'"
 
@@ -359,7 +359,7 @@ input=$(jq -c '.session_state_invalid_path' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "ask" ] || test_fail "Expected 'ask' for session state update with invalid path (missing .tmp), got '$decision'"
 
@@ -369,7 +369,7 @@ input=$(jq -c '.session_state_path_traversal' "$FIXTURE_FILE")
 decision=$(
     export AGENTIZE_HOME="$TMP_DIR"
     unset AGENTIZE_USE_TG HANDSOFF_AUTO_PERMISSION
-    echo "$input" | python3 "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
+    echo "$input" | python "$HOOK_SCRIPT" | jq -r '.hookSpecificOutput.permissionDecision'
 )
 [ "$decision" = "ask" ] || test_fail "Expected 'ask' for path traversal attempt, got '$decision'"
 
diff --git a/tests/cli/test-lol-python-cli.sh b/tests/cli/test-lol-python-cli.sh
index 7c92dd16..9aeed081 100755
--- a/tests/cli/test-lol-python-cli.sh
+++ b/tests/cli/test-lol-python-cli.sh
@@ -10,7 +10,7 @@ export AGENTIZE_HOME="$PROJECT_ROOT"
 export PYTHONPATH="$PROJECT_ROOT/python"
 
 # Test 1: --complete commands returns expected list (apply command removed)
-output=$(python3 -m agentize.cli --complete commands 2>&1)
+output=$(python -m agentize.cli --complete commands 2>&1)
 echo "$output" | grep -q "^upgrade$" || test_fail "--complete commands missing: upgrade"
 echo "$output" | grep -q "^project$" || test_fail "--complete commands missing: project"
 echo "$output" | grep -q "^claude-clean$" || test_fail "--complete commands missing: claude-clean"
@@ -21,7 +21,7 @@ if echo "$output" | grep -q "^apply$"; then
 fi
 
 # Test 2: --version exits 0 and prints expected labels
-output=$(python3 -m agentize.cli --version 2>&1)
+output=$(python -m agentize.cli --version 2>&1)
 exit_code=$?
 if [ $exit_code -ne 0 ]; then
     test_fail "--version exited with code $exit_code"
diff --git a/tests/common.md b/tests/common.md
index d8f75beb..cd62da41 100644
--- a/tests/common.md
+++ b/tests/common.md
@@ -16,7 +16,7 @@ Prefer `python` when it satisfies the minimum runtime requirement (Python 3.10+)
 
 If neither interpreter provides Python 3.10+, the tests exit early with a clear error so failures do not appear as unrelated runtime errors.
 
-The `python3()` wrapper function delegates to `PYTHON_BIN` using the `command` builtin to bypass function lookup and call the binary directly. The wrapper remains local to the test shell to avoid shell-specific export behavior (`export -f` is bash-only).
+The `python()` wrapper function delegates to `PYTHON_BIN` using the `command` builtin to bypass function lookup and call the binary directly. The wrapper remains local to the test shell to avoid shell-specific export behavior (`export -f` is bash-only).
 
 ## Test Result Helpers
 
diff --git a/tests/common.sh b/tests/common.sh
index d0492ab3..dcda0f8d 100644
--- a/tests/common.sh
+++ b/tests/common.sh
@@ -64,7 +64,7 @@ export PYTHON_BIN
 
 # Wrapper function stays local to the test shell to avoid shell-specific export behavior.
 # Uses command to bypass function lookup and call the binary directly.
-python3() {
+python() {
     command "$PYTHON_BIN" "$@"
 }
 