diff --git a/CLAUDE.md b/CLAUDE.md
index 055afb9..85089cc 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -2,6 +2,8 @@
 OpenAI-compatible API proxy that runs on Azure Confidential VMs (AMD SEV-SNP). Provides a web-accessible endpoint for end-to-end encrypted AI via [Privatemode](https://privatemode.ai) — no local software required. All memory is hardware-encrypted; Azure cannot access decrypted traffic, TLS keys, or API secrets.
+5 models live: `gpt-oss-120b`, `gemma-3-27b`, `qwen3-coder-30b`, `qwen3-embedding-4b`, `whisper-large-v3`. 116 tests. This is the Tier 3 AI processing endpoint for sensitive advocacy data — investigation documentation, witness testimony, and legal defense materials must route here, not through cloud APIs.
+
 ## Quick Start
 ```bash
@@ -130,3 +132,70 @@ See `.env.example` for the full list. Required: `ADMIN_PASSWORD`, `PRIVATEMODE_A
 ### Zero-Retention
 Prompts and responses are never logged or stored. Only token usage metadata is tracked for billing. The proxy reads the `usage` field from responses but ignores `choices` content entirely.
+
+## Organizational Context
+
+**Layer:** 1–3 | **Lever:** Strengthen | **Integration:** Tier 3 AI processing (standalone — platform integration pending)
+
+This is the zero-retention AI processing endpoint for sensitive advocacy data. Per the three-adversary threat model (`foundations/security.md`), Tier 3 data (investigation documentation, witness identities, legal defense) must never route through cloud AI providers with data retention. This proxy is the countermeasure.
+
+**Relevant strategy documents:**
+- `foundations/security.md` — three-tier security model and Tier 3 encryption requirements
+- `closed-decisions.md` 2026-03-25 — Tier 3 encryption enforcement needs hardening
+- `ecosystem/repos.md` — platform integration is pending (Tier 3 queries should route here)
+
+**Current status:** 5 models live in production.
Platform integration with `open-paws-platform` is planned — Tier 3 queries should route through this proxy. Integration is not yet wired.
+
+## Development Standards
+
+### 10-Point Review Checklist (ranked by AI violation frequency)
+
+1. **DRY** — AI clones code at 4x the human rate. Search before writing anything new
+2. **Deep modules** — Reject shallow wrappers and pass-through methods. Interface must be simpler than implementation
+3. **Single responsibility** — Each function does one thing at one level of abstraction
+4. **Error handling** — Never catch-all. AI suppresses errors and removes safety checks. Every catch block must handle specifically
+5. **Information hiding** — Don't expose internal state. Mask API keys (last 4 chars only)
+6. **Ubiquitous language** — Use movement terminology consistently. Never let AI invent synonyms for domain terms
+7. **Design for change** — Abstraction layers and loose coupling
+8. **Legacy velocity** — AI code churns 2x faster. Use characterization tests before modifying existing code
+9. **Over-patterning** — Simplest structure that works. Three similar lines of code is better than a premature abstraction
+10. **Test quality** — Every test must fail when the covered behavior breaks. Mutation score over coverage percentage
+
+### Quality Gates
+
+- **Desloppify:** `desloppify scan --path .` — minimum score ≥85
+- **Speciesist language:** `semgrep --config semgrep-no-animal-violence.yaml` on all code/docs edits
+- **Two-failure rule:** After two failed fixes on the same problem, stop and restart with a better approach
+
+### Testing Methodology
+
+- Spec-first test generation preferred
+- Reject: snapshot trap, mock everything, happy path only, test-after-commit, coverage theater
+- Three questions per test: (1) Does it fail if code is wrong? (2) Does it encode a domain rule? (3) Would mutation testing kill it?
+
+### Plan-First Development
+
+Read existing code → identify change → write spec → subtasks → plan-test-implement-verify each → comprehension check → commit per subtask
+
+### Seven Concerns — Critical for This Repo
+
+All 7 concerns apply. Highlighted critical ones:
+
+- **Security** (critical) — This is the Tier 3 security boundary. Any change to the proxy that weakens encryption, adds logging of request content, or exposes keys is a Tier 3 security incident. Every PR must pass the security audit checklist.
+- **Privacy** (critical) — Zero-retention is the core invariant. Never add logging that captures request content (`choices` field). Token usage metadata (the `usage` field) is the only permissible log.
+- **Testing** (critical) — 116 tests. Every new endpoint or behavior must have corresponding tests. Security-critical code requires higher test assertion quality.
+- **Cost optimization** — Usage tracker enables per-key cost attribution. Use this data to optimize model routing.
+- **Advocacy domain** — API key names and admin UI labels should use movement terminology.
+- **Accessibility** — Admin UI must work on low-bandwidth connections for field operatives.
+- **Emotional safety** — Not directly applicable to the proxy layer.
+
+### Advocacy Domain Language
+
+Never introduce synonyms for:
+- **Investigation** — covert documentation (all data processed here may be investigation data)
+- **Witness** — person providing testimony (identity requires maximum protection)
+- **Activist** — person engaged in advocacy work (not "user" in security contexts)
+
+### Structured Coding Reference
+
+For tool-specific AI coding instructions (Claude Code rules, Cursor MDC, Copilot, Windsurf, etc.), copy the corresponding directory from `structured-coding-with-ai` into this project root.
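
The zero-retention and key-masking rules in this diff (log only the `usage` field, never `choices`; show only the last 4 characters of an API key) can be illustrated with a minimal Python sketch. This is not the proxy's actual code, and the repository's implementation language is not stated in the diff. The function names `mask_api_key` and `usage_log_record`, the record layout, and the OpenAI-style `prompt_tokens` / `completion_tokens` / `total_tokens` keys are assumptions made for illustration.

```python
from typing import Any, Dict, Mapping


def mask_api_key(key: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a key (hypothetical helper)."""
    if len(key) <= visible:
        # Too short to reveal any part safely, so mask it entirely.
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


def usage_log_record(response: Mapping[str, Any], api_key: str) -> Dict[str, Any]:
    """Build a log record from token counts only (hypothetical helper).

    The record is assembled from an explicit allow-list of `usage` keys, so
    `choices` and any other response content can never leak into it.
    """
    usage = response.get("usage") or {}
    return {
        "key": mask_api_key(api_key),
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }
```

Building the record from an allow-list, instead of copying the response and deleting `choices`, means a new content-bearing field added upstream stays out of the log by default. A test in the spirit of the "three questions" would assert that no text from `choices` appears anywhere in the returned record.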