Skip to content

fix(tools): enforce exec allowlist when approval_mode is off#662

Merged
penso merged 3 commits intomainfrom
stirring-acoustic
Apr 11, 2026
Merged

fix(tools): enforce exec allowlist when approval_mode is off#662
penso merged 3 commits intomainfrom
stirring-acoustic

Conversation

@penso
Copy link
Copy Markdown
Collaborator

@penso penso commented Apr 11, 2026

Summary

Fixes #654.

ApprovalManager::check_command short-circuited to Proceed in ApprovalMode::Off without consulting the configured allowlist, silently disabling the primary exec security control in every headless deployment. Since on-miss hangs forever waiting for a human approver that will never arrive, approval_mode = "never" is the only viable mode for autonomous agents — which is exactly where the user-configured allowlist was being ignored.

  • In Off mode with a non-empty allowlist, enforce it: safe bins + allowlist matches proceed, everything else is denied with a clear error.
  • Empty allowlist preserves historical unrestricted semantics so existing deployments are unaffected.
  • SecurityLevel::Full still bypasses the list (early return before the mode match).
  • SecurityLevel::Deny still denies everything.
  • Dangerous-pattern safety floor (e.g. rm -rf /, git reset --hard, DROP TABLE) now denies in Off mode instead of returning NeedsApproval, so headless agents fail fast instead of hanging forever (addresses Greptile P1 on this PR).

Config template comment updated so users know the allowlist is now enforced under approval_mode = "never" when non-empty.

Validation

Completed

  • cargo test -p moltis-tools approval — 34 tests pass (7 new)
  • cargo +nightly-2025-11-30 fmt --all -- --check
  • cargo +nightly-2025-11-30 clippy -p moltis-tools --all-targets -- -D warnings
  • cargo +nightly-2025-11-30 clippy -p moltis-config --all-targets -- -D warnings

Remaining

  • just lint (blocked locally by pre-existing CUDA/CMake issue in llama-cpp-sys-2 — unrelated to this change; CI runs the OS-aware path)
  • just test (CI)
  • just release-preflight (CI)

Manual QA

  1. moltis.toml:
    [tools.exec]
    approval_mode = "never"
    security_level = "allowlist"
    allowlist = ["git *", "cat /data/*"]
  2. Trigger exec with git status → proceeds (allowlist match).
  3. Trigger exec with echo hi → proceeds (safe bin).
  4. Trigger exec with curl https://example.comdenied with exec denied: command not in allowlist (approval_mode=off): curl ... (previously: silently ran).
  5. Trigger exec with rm -rf / (dangerous, not in allowlist) → denied with exec denied: dangerous command pattern 'rm -r on filesystem root' (approval_mode=off): rm -rf / (previously: hung forever).
  6. Set allowlist = [] with approval_mode = "never" → unrestricted (existing behavior preserved).

New test cases

  • test_approval_off_with_allowlist_match
  • test_approval_off_with_allowlist_miss_denies
  • test_approval_off_with_allowlist_safe_bin
  • test_approval_off_empty_allowlist_unrestricted
  • test_approval_off_full_security_bypasses_allowlist
  • test_dangerous_denied_when_mode_off (renamed from test_dangerous_forces_approval_when_mode_off)
  • test_dangerous_denied_when_mode_off_full_security

Follow-ups (out of scope)

  • Config validation warning when approval_mode = "never" + empty allowlist + security_level = "allowlist" (belt-and-suspenders UX from issue [Bug]: tools.exec.allowlist is silently ignored when approval_mode = "off" #654's Option 3).
  • Optional strict-allowlist mode where SAFE_BINS does not bypass user-configured allowlists (Greptile P2 — would break existing on-miss deployments, needs to be opt-in).

ApprovalManager::check_command short-circuited to Proceed in
ApprovalMode::Off without consulting the configured allowlist, silently
disabling the primary exec security control in every headless deployment
(approval_mode = "never" is the only viable mode for autonomous agents,
since on-miss hangs forever waiting for a human approver).

In Off mode with a non-empty allowlist, now enforce: safe bins + allowlist
matches Proceed, everything else is denied with a clear error. An empty
allowlist preserves historical unrestricted semantics so existing
deployments are unaffected. SecurityLevel::Full still bypasses the list.

Fixes #654.

Entire-Checkpoint: d4cff416df40
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 11, 2026

Greptile Summary

This PR fixes a security regression (#654) where ApprovalMode::Off unconditionally short-circuited to Proceed, silently ignoring any configured allowlist in headless deployments. The fix enforces the allowlist when non-empty in Off mode, and converts dangerous-pattern hits from NeedsApproval (which hangs forever without a human approver) to an immediate Err denial. Empty allowlist preserves historical unrestricted semantics.

Confidence Score: 5/5

Safe to merge — the fix is correct, all new branches are covered by regression tests, and backward compatibility is preserved via the empty-allowlist fast path.

No P0 or P1 issues found. The logic for Off mode (dangerous-deny, allowlist enforcement, safe-bin bypass with warn, empty-allowlist no-op) is consistent and all edge cases have dedicated tests. The interaction between SecurityLevel::Full and the dangerous-pattern safety floor is correct and explicitly covered. Template docs are in sync with the new behavior.

No files require special attention.

Important Files Changed

Filename Overview
crates/tools/src/approval.rs Core logic change: Off mode now enforces non-empty allowlists and denies dangerous commands instead of returning NeedsApproval. 7 new tests cover all new branches. Implementation is correct and well-commented.
crates/config/src/template.rs Documentation update clarifying the new allowlist enforcement semantics and the safe-bins bypass for approval_mode = "never".

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[check_command] --> B{dangerous pattern\nmatches?}
    B -- yes, not allowlisted --> C{ApprovalMode::Off?}
    C -- yes --> D[Err: dangerous command\ndenied in off mode]
    C -- no --> E[NeedsApproval]
    B -- yes, allowlisted --> F[debug log, continue]
    B -- no --> F

    F --> G{SecurityLevel?}
    G -- Deny --> H[Err: security level deny]
    G -- Full --> I[Proceed]
    G -- Allowlist --> J{ApprovalMode?}

    J -- Always --> K[NeedsApproval]
    J -- OnMiss --> L{safe bin or\nallowlist match or\nprev approved?}
    L -- yes --> I
    L -- no --> K

    J -- Off --> M{allowlist empty?}
    M -- yes --> I
    M -- no --> N{matches\nallowlist?}
    N -- yes --> I
    N -- no --> O{is safe bin?}
    O -- yes --> P[warn log\nProceed]
    O -- no --> Q[Err: not in allowlist]
Loading

Reviews (3): Last reviewed commit: "fix(tools): warn when safe-bin bypasses ..." | Re-trigger Greptile

@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq bot commented Apr 11, 2026

Merging this PR will degrade performance by 23.1%

❌ 1 regressed benchmark
✅ 38 untouched benchmarks
⏩ 5 skipped benchmarks1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark BASE HEAD Efficiency
env_substitution 10 µs 13 µs -23.1%

Comparing stirring-acoustic (07374e9) with main (6b44e1c)

Open in CodSpeed

Footnotes

  1. 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Addresses Greptile P1 on #662: the dangerous-pattern safety floor
returned NeedsApproval even in ApprovalMode::Off, so a headless agent
hitting `rm -rf /`, `git reset --hard`, etc. would block forever waiting
for a human approver that never arrives — the exact scenario this PR
set out to make safe.

Deny with a clear error in Off mode; OnMiss/Always still escalate to
NeedsApproval. Explicit allowlist override still wins (preserved by
test_dangerous_overridden_by_allowlist).

Entire-Checkpoint: fa683203b181
@penso
Copy link
Copy Markdown
Collaborator Author

penso commented Apr 11, 2026

Addressed Greptile P1 in f30a0d0: the dangerous-pattern safety floor now returns Err in ApprovalMode::Off instead of NeedsApproval, so headless agents hitting rm -rf /, git reset --hard, etc. are denied immediately rather than hanging forever on an approver that will never arrive. OnMiss/Always still escalate to NeedsApproval unchanged. Explicit allowlist override still wins (covered by existing test_dangerous_overridden_by_allowlist).

Renamed test_dangerous_forces_approval_when_mode_offtest_dangerous_denied_when_mode_off to match the new contract, and added test_dangerous_denied_when_mode_off_full_security to lock the Full+Off case. All 34 moltis-tools approval tests pass, fmt + clippy clean.

Re P2 (SAFE_BINS bypasses explicit allowlists): acknowledged, documented in the template. Out of scope for this fix — a strict-allowlist mode that gates even safe bins is a separate feature and would break a lot of existing approval_mode = "on-miss" deployments. Happy to file as a follow-up issue if useful.

Addresses Greptile P2 on #662: when an operator sets a non-empty
allowlist under approval_mode=off, safe bins (cat, grep, sed, awk, ...)
still bypass the list so ops don't have to enumerate common read-only
utilities. Emit a warn! on that path so strict-posture deployments can
detect the gap at runtime by grepping logs. Also tightened the stale
doc comment on the dangerous-pattern guard — it no longer "forces
approval regardless of mode" now that Off mode denies.

Entire-Checkpoint: cf7a6246e6ca
@penso
Copy link
Copy Markdown
Collaborator Author

penso commented Apr 11, 2026

Addressed both findings from the latest Greptile pass in 07374e9:

P2 — SAFE_BINS bypassing strict allowlists: split the Off-mode allowlist check so matches_allowlist is tried first; if the command falls through to the safe-bin bypass, emit warn!(command, "exec safe-bin bypassed non-empty allowlist in approval_mode=off"). Strict-posture operators can now grep safe-bin their logs to audit the gap. Not blocking safe bins entirely because that would break the many approval_mode = "on-miss" deployments that rely on implicit safe-bin permission — filing that as an opt-in strict mode is a separate feature.

Stale doc comment: rewrote the safety-floor comment to distinguish Off (deny) from OnMiss/Always (escalate to NeedsApproval). The old "force approval regardless of mode" wording was inherited from before the P1 fix and contradicted the actual behavior.

34 tests still pass, fmt + clippy clean. Existing test_approval_off_with_allowlist_safe_bin already locks the proceed path; the warn is a side effect operators observe in logs, not a behavior change.

@penso
Copy link
Copy Markdown
Collaborator Author

penso commented Apr 11, 2026

Re CodSpeed: the regressed benchmark is env_substitution (crates/benchmarks/benches/boot.rs:181), which calls moltis_config::env_subst::substitute_env on a hardcoded inline TOML string. Neither this PR's approval.rs changes nor the template.rs doc-comment tweak are on its code path, so the -23.1% (10 µs → 13 µs) is almost certainly codspeed microbenchmark noise — at that scale a single allocation/cache jitter is ~3 µs.

Greptile's latest pass (commit 07374e9) is 5/5 "Safe to merge" with no P0/P1/P2 findings, so I'll acknowledge the CodSpeed report rather than chase it. If we see env_substitution consistently regress on main after this merges I'll file a separate bug.

@penso penso merged commit c2f4955 into main Apr 11, 2026
17 of 27 checks passed
@penso penso deleted the stirring-acoustic branch April 11, 2026 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: tools.exec.allowlist is silently ignored when approval_mode = "off"

1 participant