Integration: Adversarial test generation from structured limitation declarations

## Context

On [A2A#1694](https://github.com/a2aproject/A2A/issues/1694), @msaleme shared findings from the [Red Team / Blue Team Agent Fabric](https://github.com/msaleme/red-team-blue-team-agent-fabric): 342 protocol-level tests that exercise capability boundaries across MCP/A2A, with test generation logic that maps from structured limitation declarations to adversarial verification probes.

The key insight: if an agent declares structured limitations (for example, `cannot_access: ["filesystem", "network"]`), automated harnesses can generate protocol-level probes that attempt exactly those actions and verify that each attempt is denied.
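To make the idea concrete, here is a minimal sketch of probe generation from a `cannot_access` list. The `LimitationDeclaration` and `AdversarialProbe` shapes, the `generateProbes` function, and the `access:<resource>` action naming are all hypothetical and chosen for illustration; they are not taken from msaleme's harness or from the A2A spec.

```typescript
// Hypothetical shapes for illustration only; not from the A2A or APS specs.
interface LimitationDeclaration {
  agentId: string;
  cannot_access: string[];
}

interface AdversarialProbe {
  id: string;
  targetResource: string;
  action: string; // the forbidden action the probe will attempt
  expectedOutcome: "denied";
}

// Emit one probe per declared resource. Each probe attempts exactly the
// action the agent says it cannot perform and expects a denial.
function generateProbes(decl: LimitationDeclaration): AdversarialProbe[] {
  return decl.cannot_access.map((resource, i) => ({
    id: `${decl.agentId}-probe-${i}`,
    targetResource: resource,
    action: `access:${resource}`,
    expectedOutcome: "denied" as const,
  }));
}

const declaration: LimitationDeclaration = {
  agentId: "agent-1",
  cannot_access: ["filesystem", "network"],
};

const probes = generateProbes(declaration);
console.log(probes.map((p) => p.action));
```

A real harness would translate each probe into an actual MCP or A2A request; this sketch stops at the probe description.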

## Proposed Integration

Map msaleme's test generation pipeline to APS's existing enforcement types:

| Limitation Type | APS Mapping | Enforcement |
|----------------|-------------|-------------|
| Stable ("cannot access filesystem") | `FloorPrinciple` with `enforcement.mode: 'inline'` | Blocked at gateway, pre-execution |
| Runtime ("may hallucinate under pressure") | `AttestationFreshness` type `rotating` | Re-evaluated per session window |
| Behavioral drift (>50 turns) | Compliance report over `ActionReceipt` chain | Post-hoc forensic analysis |

The test generation mapping: structured `limitations` → `FloorPrinciple[]` → `PolicyValidator.evaluate()` → adversarial probes → signed pass/fail verdicts.
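The pipeline above could be sketched roughly as follows. Every type here (`DeclaredLimitation`, `FloorPrincipleSketch`, `SketchValidator`, `Verdict`) is a hypothetical stand-in, not the actual APS `FloorPrinciple` or `PolicyValidator` definition, and verdict signing is left out. Only stable limitations are mapped to inline floor principles, following the table; runtime and drift limitations would take the other enforcement paths.

```typescript
// All types are hypothetical stand-ins, not the real APS definitions.
type LimitationKind = "stable" | "runtime" | "behavioral_drift";

interface DeclaredLimitation {
  kind: LimitationKind;
  resource: string;
}

interface FloorPrincipleSketch {
  id: string;
  deniedResource: string;
  enforcement: { mode: "inline" };
}

interface ProbeAction {
  agentId: string;
  resource: string;
}

interface Verdict {
  probe: ProbeAction;
  passed: boolean; // true when the forbidden action was blocked
}

// Stable limitations become inline floor principles; other kinds are skipped.
function toFloorPrinciples(limits: DeclaredLimitation[]): FloorPrincipleSketch[] {
  return limits
    .filter((l) => l.kind === "stable")
    .map((l) => ({
      id: `deny-${l.resource}`,
      deniedResource: l.resource,
      enforcement: { mode: "inline" as const },
    }));
}

// Stand-in for PolicyValidator.evaluate(): denies any action on a floored resource.
class SketchValidator {
  principles: FloorPrincipleSketch[];
  constructor(principles: FloorPrincipleSketch[]) {
    this.principles = principles;
  }
  evaluate(action: ProbeAction): { allowed: boolean } {
    const blocked = this.principles.some((p) => p.deniedResource === action.resource);
    return { allowed: !blocked };
  }
}

// Probe every declared floor against a validator; a probe passes only if denied.
function runProbes(
  agentId: string,
  declared: FloorPrincipleSketch[],
  validator: SketchValidator,
): Verdict[] {
  return declared.map((p) => {
    const probe: ProbeAction = { agentId, resource: p.deniedResource };
    return { probe, passed: !validator.evaluate(probe).allowed };
  });
}

const limits: DeclaredLimitation[] = [
  { kind: "stable", resource: "filesystem" },
  { kind: "stable", resource: "network" },
  { kind: "runtime", resource: "hallucination" },
];
const principles = toFloorPrinciples(limits);

// Correctly configured gateway: every declared floor is enforced.
const verdicts = runProbes("agent-1", principles, new SketchValidator(principles));
// Misconfigured gateway that only enforces the first floor.
const leaky = runProbes("agent-1", principles, new SketchValidator(principles.slice(0, 1)));
console.log(verdicts, leaky);
```

Running the same probes against a deliberately misconfigured validator shows the point of the exercise: the harness reports a failed verdict for any declared limitation that the gateway does not actually enforce.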

## Deliverables

1. **Limitation-to-FloorPrinciple mapping spec** — how structured limitation declarations translate to APS enforcement rules
2. **Test generation example** — given a set of declared limitations, generate adversarial probes that the PolicyValidator evaluates
3. **Reference test vectors** — subset of msaleme's 342 tests mapped to APS types

## Open Questions

1. Should the limitation schema be embedded in the passport, the delegation, or a separate declaration?
2. How should substrate-dependent limitations be handled (e.g., "cannot access filesystem" may hold on one runtime but not another)?

@msaleme — would contributing the test generation mapping as a PR be interesting? The `PolicyValidator` interface and `FloorPrinciple` type are the integration surfaces.
