iii-hq · rohitg00 · Apr 7, 2026
diff --git a/guardrails/Cargo.toml b/guardrails/Cargo.toml
@@ -0,0 +1,24 @@
+[workspace]
+
+[package]
+name = "iii-guardrails"
+version = "0.1.0"
+edition = "2021"
+publish = false
+
+[[bin]]
+name = "iii-guardrails"
+path = "src/main.rs"
+
+[dependencies]
+iii-sdk = { version = "0.10.0", features = ["otel"] }
+tokio = { version = "1", features = ["rt-multi-thread", "macros", "sync", "signal"] }
+serde = { version = "1", features = ["derive"] }
+serde_json = "1"
+serde_yaml = "0.9"
+anyhow = "1"
+tracing = "0.1"
+tracing-subscriber = { version = "0.3", features = ["fmt", "env-filter"] }
+clap = { version = "4", features = ["derive"] }
+chrono = { version = "0.4", features = ["serde"] }
+regex = "1"
diff --git a/guardrails/README.md b/guardrails/README.md
@@ -0,0 +1,78 @@
+# iii-guardrails
+
+Every LLM call should pass through a safety check before and after. iii-guardrails does this with zero LLM overhead: it uses only regex and keyword matching, with all patterns pre-compiled at startup. It detects PII (email, phone, SSN, credit cards, IP addresses), prompt injection attempts (9 keyword patterns), and leaked secrets (API keys, tokens, private keys). Wire it as middleware in front of any function, or call it on demand from the agent.
+
+**Plug and play:** Build with `cargo build --release`, then run `./target/release/iii-guardrails --url ws://your-engine:49134`. It registers 3 functions and compiles 5 PII patterns and 7 secret patterns from built-in defaults, so no config file is needed. Call `guardrails::check_input` before processing user input, `guardrails::check_output` before returning responses, or `guardrails::classify` for a lightweight risk score.
+
+## Functions
+
+| Function ID | Description |
+|---|---|
+| `guardrails::check_input` | Validate input text for PII, injections, and length limits |
+| `guardrails::check_output` | Validate output text for PII leakage and secret exposure |
+| `guardrails::classify` | Lightweight risk classification without blocking or audit trail |
+
+## iii Primitives Used
+
+- **State** -- audit trail of checks, custom rules (future), aggregate stats (future)
+- **PubSub** -- subscribes to the `guardrails.check` topic for async input checks
+- **HTTP** -- all functions exposed as POST endpoints
+
+## Prerequisites
+
+- Rust 1.75+
+- A running iii engine, reachable at `ws://127.0.0.1:49134` by default
+
+## Build
+
+```bash
+cargo build --release
+```
+
+## Usage
+
+```bash
+./target/release/iii-guardrails --url ws://127.0.0.1:49134 --config ./config.yaml
+```
+
+```
+Options:
+  --config <PATH>    Path to config.yaml [default: ./config.yaml]
+  --url <URL>        WebSocket URL of the iii engine [default: ws://127.0.0.1:49134]
+  --manifest         Output module manifest as JSON and exit
+  -h, --help         Print help
+```
+
+## Configuration
+
+```yaml
+pii_patterns:
+  - name: "email"
+    pattern: "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}"
+  - name: "phone"
+    pattern: "\\b\\d{3}[-.]?\\d{3}[-.]?\\d{4}\\b"
+  - name: "ssn"
+    pattern: "\\b\\d{3}-\\d{2}-\\d{4}\\b"
+  - name: "credit_card"
+    pattern: "\\b\\d{4}[- ]?\\d{4}[- ]?\\d{4}[- ]?\\d{4}\\b"
+  - name: "ip_address"
+    pattern: "\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\b"
+injection_keywords:
+  - "ignore previous instructions"
+  - "ignore all instructions"
+  - "disregard the above"
+  - "you are now"
+  - "pretend you are"
+  - "act as if"
+  - "system prompt"
+  - "reveal your instructions"
+  - "what are your rules"
+max_input_length: 50000   # max input text length before flagging
+max_output_length: 100000  # max output text length before flagging
+```
+
+## Tests
+
+```bash
+cargo test
+```
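
A note on the injection check described in the README: the nine `injection_keywords` are plain phrases, so detection can be sketched as a case-insensitive substring scan that records where each hit starts, matching the `keyword`/`position` shape shown in SPEC.md. The sketch below uses only the standard library. `InjectionHit` and `find_injections` are hypothetical names, and the actual matching in `src/main.rs` (not part of this diff) may differ, for example in how it handles case or repeated matches.

```rust
/// One detected keyword and the byte offset where it starts (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct InjectionHit {
    pub keyword: String,
    pub position: usize,
}

/// Case-insensitive scan that reports the first occurrence of each keyword.
/// Assumption: ASCII lowercasing is sufficient. It preserves byte lengths,
/// so the reported offsets are also valid for the original text.
pub fn find_injections(text: &str, keywords: &[&str]) -> Vec<InjectionHit> {
    let haystack = text.to_ascii_lowercase();
    keywords
        .iter()
        .filter_map(|kw| {
            haystack
                .find(&kw.to_ascii_lowercase())
                .map(|position| InjectionHit {
                    keyword: (*kw).to_string(),
                    position,
                })
        })
        .collect()
}
```

Because the scan is a plain substring search, a phrase such as "system prompt" is flagged even in benign text that merely mentions it, which is worth keeping in mind when treating any hit as high risk.
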
diff --git a/guardrails/SPEC.md b/guardrails/SPEC.md
@@ -0,0 +1,145 @@
+# iii-guardrails
+
+Safety layer worker for the iii engine that checks function I/O for PII, prompt-injection keywords, leaked secrets, and length-limit violations.
+
+## Architecture
+
+Pure regex + keyword matching. No LLM calls. Designed to be called on every function invocation as middleware.
+
+## Functions
+
+### `guardrails::check_input`
+Validates input text before it reaches a function.
+
+**Input:**
+```json
+{
+  "text": "string (required)",
+  "context": {
+    "function_id": "string (optional)",
+    "user_id": "string (optional)"
+  }
+}
+```
+
+**Output:**
+```json
+{
+  "passed": true,
+  "risk": "none|low|medium|high",
+  "pii": [{ "pattern_name": "email", "count": 1 }],
+  "injections": [{ "keyword": "ignore previous instructions", "position": 0 }],
+  "over_length": false,
+  "check_id": "chk-in-1712345678-42"
+}
+```
+
+### `guardrails::check_output`
+Validates output text for PII leakage and secret exposure.
+
+**Input:**
+```json
+{
+  "text": "string (required)",
+  "context": {
+    "function_id": "string (optional)",
+    "user_id": "string (optional)"
+  }
+}
+```
+
+**Output:**
+```json
+{
+  "passed": true,
+  "risk": "none|low|medium|high",
+  "pii": [{ "pattern_name": "ssn", "count": 1 }],
+  "secrets": [{ "pattern_name": "openai_key", "count": 1 }],
+  "over_length": false,
+  "check_id": "chk-out-1712345678-42"
+}
+```
+
+### `guardrails::classify`
+Lightweight classification without blocking or audit trail.
+
+**Input:**
+```json
+{
+  "text": "string (required)"
+}
+```
+
+**Output:**
+```json
+{
+  "risk": "none|low|medium|high",
+  "categories": ["pii", "injection", "secrets", "over_length"],
+  "pii_types": ["email", "phone"],
+  "details": {
+    "pii_count": 2,
+    "injection_count": 0,
+    "secret_count": 0,
+    "text_length": 150,
+    "within_input_limit": true
+  }
+}
+```
+
+## Triggers
+
+| Type | Path/Topic | Function |
+|------|-----------|----------|
+| HTTP POST | `guardrails/check_input` | `guardrails::check_input` |
+| HTTP POST | `guardrails/check_output` | `guardrails::check_output` |
+| HTTP POST | `guardrails/classify` | `guardrails::classify` |
+| Subscribe | `guardrails.check` | `guardrails::check_input` |
+
+## State Scopes
+
+| Scope | Purpose |
+|-------|---------|
+| `guardrails:checks` | Audit trail of all checks performed |
+| `guardrails:rules` | Custom rules (future: user-defined patterns) |
+| `guardrails:stats` | Aggregate stats (future: checks/day, block rate) |
+
+## Risk Classification
+
+| Level | Condition |
+|-------|-----------|
+| `high` | Any injection keyword detected |
+| `medium` | More than 2 PII matches OR over length limit |
+| `low` | 1-2 PII matches |
+| `none` | Clean |
+
+## PII Patterns (default config)
+
+- Email addresses
+- US phone numbers
+- Social Security Numbers
+- Credit card numbers
+- IP addresses
+
+## Secret Patterns (hardcoded in check_output)
+
+- Bearer tokens
+- OpenAI API keys (`sk-`)
+- GitHub tokens (`ghp_`, `ghs_`, `ghr_`)
+- AWS access keys (`AKIA`)
+- Private key blocks (`-----BEGIN`)
+
+## Configuration
+
+See `config.yaml` for default patterns, keywords, and length limits. All PII regex patterns are compiled once at startup and stored in `Arc` for zero-copy sharing across async handlers.
+
+## Running
+
+```bash
+cargo run --release -- --url ws://127.0.0.1:49134 --config ./config.yaml
+```
+
+## Manifest
+
+```bash
+cargo run --release -- --manifest
+```
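
The Risk Classification table in SPEC.md reads as a first-match-wins rule list, which can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in `src/main.rs`: `Risk` and `classify_risk` are hypothetical names, and because the table does not say how detected secrets affect the level, the sketch leaves secrets out.

```rust
/// Risk levels from the SPEC table, ordered from least to most severe.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Risk {
    None,
    Low,
    Medium,
    High,
}

impl Risk {
    /// String form used in the JSON output examples ("none|low|medium|high").
    pub fn as_str(self) -> &'static str {
        match self {
            Risk::None => "none",
            Risk::Low => "low",
            Risk::Medium => "medium",
            Risk::High => "high",
        }
    }
}

/// Applies the table rows from most to least severe; the first match wins.
/// Assumption: `pii_matches` is the total count across all PII patterns.
pub fn classify_risk(pii_matches: usize, injection_hits: usize, over_length: bool) -> Risk {
    if injection_hits > 0 {
        Risk::High // any injection keyword detected
    } else if pii_matches > 2 || over_length {
        Risk::Medium // more than 2 PII matches, or over the length limit
    } else if pii_matches > 0 {
        Risk::Low // 1-2 PII matches
    } else {
        Risk::None // clean
    }
}
```

Keeping the rule as a pure function of three counts makes it easy to unit-test each table row without an engine connection.
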
diff --git a/guardrails/build.rs b/guardrails/build.rs
@@ -0,0 +1,6 @@
+fn main() {
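+    // Expose the build target triple to the crate as env!("TARGET").
+    // Cargo always sets TARGET when it runs a build script.
+    let target = std::env::var("TARGET").expect("Cargo sets TARGET for build scripts");
+    println!("cargo:rustc-env=TARGET={target}");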
+}
diff --git a/guardrails/config.yaml b/guardrails/config.yaml
@@ -0,0 +1,23 @@
+pii_patterns:
+  - name: "email"
+    pattern: "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}"
+  - name: "phone"
+    pattern: "\\b\\d{3}[-.]?\\d{3}[-.]?\\d{4}\\b"
+  - name: "ssn"
+    pattern: "\\b\\d{3}-\\d{2}-\\d{4}\\b"
+  - name: "credit_card"
+    pattern: "\\b\\d{4}[- ]?\\d{4}[- ]?\\d{4}[- ]?\\d{4}\\b"
+  - name: "ip_address"
+    pattern: "\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\b"
+injection_keywords:
+  - "ignore previous instructions"
+  - "ignore all instructions"
+  - "disregard the above"
+  - "you are now"
+  - "pretend you are"
+  - "act as if"
+  - "system prompt"
+  - "reveal your instructions"
+  - "what are your rules"
+max_input_length: 50000
+max_output_length: 100000
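
The `credit_card` pattern above matches any 16 digits in groups of four, so order numbers and other long digit runs would be flagged too. One common way to reduce such false positives is to run a Luhn checksum on each regex match. This is a suggestion, not something this PR does; `luhn_valid` is a hypothetical helper that uses only the standard library.

```rust
/// Returns true if `candidate` holds exactly 16 digits (optionally separated
/// by spaces or hyphens, mirroring the config pattern) that pass the Luhn check.
pub fn luhn_valid(candidate: &str) -> bool {
    // Drop the separators the regex allows, then require every remaining
    // character to be a decimal digit; any other character yields an empty Vec.
    let digits: Vec<u32> = candidate
        .chars()
        .filter(|c| *c != ' ' && *c != '-')
        .map(|c| c.to_digit(10))
        .collect::<Option<Vec<u32>>>()
        .unwrap_or_default();

    if digits.len() != 16 {
        return false;
    }

    // Luhn: walking from the rightmost digit, double every second digit and
    // subtract 9 when the doubled value exceeds 9; the total must end in 0.
    let sum: u32 = digits
        .iter()
        .rev()
        .enumerate()
        .map(|(i, &d)| {
            if i % 2 == 1 {
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();

    sum % 10 == 0
}
```

A match would then count as a credit card only when the regex hits and `luhn_valid` returns true, at the cost of one extra pass over 16 digits per match.
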