fix(ci): use audit-only pattern to prevent install overwriting tamper (closes #1202)#1291
fix(ci): use audit-only pattern to prevent install overwriting tamper (closes #1202)#1291sergio-sisternes-epam wants to merge 4 commits into
Conversation
…closes #1202) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Updates APM documentation and this repo’s CI configuration to support an “audit-only” CI approach (via microsoft/apm-action@v1 setup-only: true + apm audit --ci --no-drift) so audits can evaluate the as-committed working tree without apm install overwriting managed files first.
Changes:
- Document an audit-only CI pattern for tamper detection (and explain the install-before-audit blind spot).
- Switch this repo’s
APM Self-Checkworkflow job tosetup-only: true+apm audit --ci --no-drift. - Add a changelog entry describing the CI/doc updates.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| docs/src/content/docs/integrations/ci-cd.md | Adds guidance for an audit-only (setup-only) GitHub Actions pattern for tamper detection. |
| docs/src/content/docs/enterprise/enforce-in-ci.md | Adds an “Audit-only CI pattern” section with a comparison table vs full install+audit. |
| docs/src/content/docs/enterprise/drift-detection.md | Documents the install-before-audit blind spot and points to the audit-only pattern. |
| CHANGELOG.md | Adds an Unreleased “Fixed” entry describing the CI/doc changes. |
| .github/workflows/ci.yml | Updates the APM Self-Check job to run audit-only (setup-only: true + --no-drift). |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 1 | 1 | CI pipeline design is sound; audit-only pattern is correctly documented. Gate B is now a dead no-op step -- recommend removing or replacing it. |
| CLI Logging Expert | 0 | 0 | 1 | No CLI output code changed; existing --no-drift warning message is already clear and CI-log-readable. No concerns. |
| DevX UX Expert | 0 | 2 | 2 | Two UX gaps worth addressing: ci-cd.md snippet silently drops the PAT env var (copy-paste breakage), and Gate B trivially passing with a green label is a misleading CI signal. |
| Supply Chain Security Expert | 0 | 2 | 1 | Core tamper-detection improvement is correct; unpinned apm-action (v1) tag spreads into docs as recommended pattern and is the primary supply-chain concern. |
| OSS Growth Hacker | 0 | 2 | 1 | Strong trust signal and clear comparison table; one dogfood framing gap and two nits on hook strength and table scannability. |
| Auth Expert | 0 | 1 | 1 | GITHUB_APM_PAT in doc example is correct token name but example omits a note that apm audit --ci --no-drift may not need a PAT for public repos. |
| Doc Writer | 0 | 2 | 2 | Two-pattern story is clear; three composition issues reduce signal for readers who scan rather than read linearly. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 5 follow-ups
- [Supply Chain Security Expert] Pin microsoft/apm-action (v1) to a full 40-char commit SHA in ci.yml and in all three new doc snippets. -- Mutable floating tag is an unverified code-execution surface. Closing the install-overwrite gap while leaving the action unpinned is an incomplete tamper-detection story. Highest-priority item: fix in this PR before merge.
- [Python Architect] Remove or explicitly disable Gate B (git status --porcelain) -- it is a permanently-passing no-op under setup-only mode. -- Three panelists flagged this independently. A named CI step that always passes misleads reviewers into believing drift was verified. One-line fix: remove the step or add
if: falsewith an explanatory echo. - [Auth Expert] Add a conditional callout in enforce-in-ci.md: GITHUB_APM_PAT is only required for repos with private-package lockfile entries. Align ci-cd.md snippet to match. -- Resolves auth-expert's over-prescription concern and devx-ux-expert's copy-paste auth-failure trap simultaneously. Teaches the right mental model rather than the always-wire habit.
- [Doc Writer] Add one sentence before the org-wide sweep YAML in drift-detection.md: 'The sweep below uses install-first because its goal is detecting missed apm install runs, not post-install tampering.' -- Without it, readers who just read the blind-spot section will interpret the sweep as the pattern they were told to avoid. High confusion risk for new adopters.
- [DevX UX Expert] Reorder the dogfood tip in ci-cd.md to lead with outcome: 'APM's own CI uses the audit-only pattern to catch tampered AI-governance files before they can merge.' -- The current tip buries the enterprise trust claim. This sentence is the highest-conversion line in the file and maps directly to the growth-hacker's launch beat recommendation.
Architecture
classDiagram
direction TB
class ApmAction {
<<ExternalAction>>
+setup_only bool
+install_cli() void
+run_apm_install() void
}
class ApmCLI {
<<Tool>>
+audit(ci bool, no_drift bool) ExitCode
}
class Lockfile {
<<ValueObject>>
+deployed_files list
+deployed_file_hashes dict
+ref str
}
class DeployedFiles {
<<IOBoundary>>
+paths list
}
class ContentIntegrityCheck {
<<Strategy>>
+verify(files, hashes) Result
}
class InstallReplayCheck {
<<Strategy>>
+replay(cache) Result
}
class GateBDriftCheck {
<<DeadGate>>
+git_status_porcelain() always_zero
}
ApmAction ..> ApmCLI : installs CLI
ApmCLI *-- ContentIntegrityCheck : runs when --ci
ApmCLI *-- InstallReplayCheck : skipped when --no-drift
ContentIntegrityCheck ..> Lockfile : reads deployed_file_hashes
ContentIntegrityCheck ..> DeployedFiles : reads bytes via SHA-256
InstallReplayCheck ..> Lockfile : reads deployed_files
ApmAction ..> GateBDriftCheck : setup-only leaves working tree unchanged
note for GateBDriftCheck "Dead gate: always exits 0 under setup-only; no drift signal produced"
note for ApmAction "setup-only: true = IOBoundary isolation -- installer does not mutate subject under audit"
class ApmAction:::touched
class ApmCLI:::touched
class GateBDriftCheck:::touched
classDef touched fill:#fff3b0,stroke:#d47600
flowchart TD
subgraph AUDIT_ONLY["Audit-only pattern -- this PR"]
A1["actions/checkout (v4)"] --> A2["apm-action setup-only:true -- installs CLI only; does not run apm install"]
A2 --> A3["apm audit --ci --no-drift"]
A3 --> A4{"content-integrity: SHA-256 vs lockfile hashes"}
A4 -->|"hash mismatch"| A5["EXIT 1 -- tampered file detected"]
A4 -->|"all hashes match"| A6["EXIT 0"]
A3 --> A7["Gate B: git status --porcelain -- DEAD GATE -- always exits 0"]
end
subgraph FULL_INSTALL["Full-install pattern -- pre-PR"]
B1["actions/checkout (v4)"] --> B2["apm-action default -- runs apm install; overwrites managed files"]
B2 --> B3["apm audit --ci"]
B3 --> B4{"content-integrity: SHA-256 vs lockfile hashes"}
B4 -->|"hash mismatch"| B5["EXIT 1"]
B4 -->|"all hashes match"| B6["EXIT 0 -- BLIND SPOT: tampered bytes already overwritten by install"]
B2 --> B7["Gate B: git status --porcelain -- ACTIVE -- detects regeneration drift"]
end
AUDIT_ONLY --- FULL_INSTALL
sequenceDiagram
participant Runner as GitHub Actions Runner
participant Checkout as actions/checkout (v4)
participant Action as apm-action (v1) setup-only:true
participant CLI as apm CLI
participant LF as apm.lock.yaml
participant DF as Deployed Files
Runner->>Checkout: run
Checkout-->>Runner: working tree restored to committed state
Runner->>Action: setup-only: true
Action-->>Runner: CLI installed, PATH updated -- NO apm install
Note over Runner,DF: Deployed files unchanged -- committed bytes preserved
Runner->>CLI: apm audit --ci --no-drift
CLI->>LF: read deployed_file_hashes
CLI->>DF: read file bytes (SHA-256)
DF-->>CLI: raw bytes per deployed path
CLI->>CLI: compare SHA-256(bytes) vs lockfile hash
alt hash mismatch detected
CLI-->>Runner: exit 1 -- tampered file reported
else all hashes match
CLI-->>Runner: exit 0
end
Runner->>Runner: Gate B git status --porcelain (dead gate -- always 0)
Recommendation
Merge after resolving the two pre-merge items in-PR: (1) pin microsoft/apm-action (v1) to a commit SHA in ci.yml and all doc snippets, and (2) remove or explicitly disable Gate B. The remaining follow-ups (PAT callout, drift-detection sweep sentence, dogfood tip reorder) are fast and suitable for a same-day follow-up PR. The core tamper-detection improvement is sound, the documentation story is clear, and the growth signal is real -- this is a shippable PR one small commit away.
Full per-persona findings
Python Architect
-
[recommended] Gate B (git status --porcelain) is now a permanently-passing no-op and should be removed or replaced with a comment block at
.github/workflows/ci.yml:165
With setup-only: true, apm install never runs and the working tree is never mutated. Gate B therefore always exits 0 -- it is dead CI code. Leaving a named CI step that always passes creates two risks: (1) future readers assume drift is being detected when it is not; (2) if the step is re-used as a template, the dead-gate pattern propagates silently.
Suggested: Remove the step entirely, or replace the run: block with:run: echo 'Drift check skipped: setup-only mode does not run apm install, so the working tree is unchanged by design.' -
[nit] Design patterns: the audit-only pattern is a well-applied IOBoundary + Strategy split; no structural concern
setup-only: true creates a clear separation between the environment-setup boundary (apm-action) and the verification boundary (apm audit), preventing the installer from mutating the artifact under test before the verifier sees it. No additional abstraction warranted.
CLI Logging Expert
- [nit] --no-drift warning emits to stderr only; CI log aggregators that split stdout/stderr may silently drop it at
src/apm_cli/commands/audit.py:562
The warning about drift detection being skipped is written via click.echo(..., err=True). GitHub Actions shows stderr inline so this is usually fine. However, if a downstream team pipes stdout only to a log collector, the coverage-reduction advisory is invisible. Pre-existing condition; this PR does not worsen it.
Suggested: As a follow-up, consider also echoing a single-line [!] notice to stdout for structured log scrapers.
DevX UX Expert
-
[recommended] ci-cd.md audit-only snippet omits GITHUB_APM_PAT env var present in enforce-in-ci.md, creating a silent auth-failure trap for copy-pasters at
docs/src/content/docs/integrations/ci-cd.md
The enforce-in-ci.md full recipe correctly shows GITHUB_APM_PAT env var. The parallel snippet in ci-cd.md omits it entirely. A user who copies the shorter snippet will get an opaque auth failure at runtime with no guidance.
Suggested: Addenv: GITHUB_APM_PAT: ${{ secrets.APM_PAT }}to the audit step in the ci-cd.md snippet, or add a callout note directing readers to the full recipe for secrets wiring. -
[recommended] Gate B step label reads as an active check but trivially passes -- green checkmark misleads CI readers into believing drift was verified at
.github/workflows/ci.yml
The step name is still 'Check APM integration drift (legacy bash fallback, see Fold integration drift gate into 'apm audit --drift' (or 'apm install --check') #1071)' and emits a green checkmark. A developer reading CI results sees a passing 'drift check' and infers the integration directories were verified. They were not.
Suggested: Either rename the step to 'Gate B: drift check skipped (setup-only mode; no install ran)' with a self-documenting echo, or gate the step withif: false. -
[nit] Comparison table in enforce-in-ci.md lacks a YAML snippet for the full-install-then-audit path at
docs/src/content/docs/enterprise/enforce-in-ci.md
The table names two patterns but only one has an accompanying snippet nearby. A reader who wants to compare must scroll to find the full-install snippet. -
[nit] setup-only: true has no discoverable reference to what other apm-action (v1) inputs exist at
docs/src/content/docs/enterprise/enforce-in-ci.md
Three doc pages introduce setup-only: true inline but none link to the action README for its full input surface.
Supply Chain Security Expert
-
[recommended] microsoft/apm-action (v1) is a mutable floating tag -- pin to a commit SHA in ci.yml and in every doc snippet at
.github/workflows/ci.yml:153
A mutable tag (v1) can be silently repointed to a different commit by a compromised action repo maintainer. The PR adds setup-only: true to close one tamper vector while leaving the action itself unpinned -- the action is now an unverified code-execution surface. Three new documentation pages recommend the pattern with the unpinned tag, spreading the anti-pattern to all users who copy these snippets.
Suggested: Pin the action to a specific commit SHA:uses: microsoft/apm-action@<FULL-40-CHAR-SHA> # v1.x.y. Apply the same SHA pin to every code snippet in the three new doc sections. -
[recommended] Gate B (legacy bash drift check) trivially always passes with setup-only: true and should be removed or explicitly marked dead at
.github/workflows/ci.yml
A permanently-dead gate creates false confidence: reviewers believe two checks are active when only Gate A runs. Dead safety gates waste audit time and can mask the absence of a real check if Gate A is misconfigured.
Suggested: Remove the Gate B step entirely, or replace with an explicit comment explaining why it is disabled in audit-only mode. -
[nit] The two-pattern comparison table does not mention the lockfile-tampering blind spot common to both patterns at
docs/src/content/docs/enterprise/enforce-in-ci.md
If an attacker modifies a deployed file AND updates the lockfile hash in the same commit, content-integrity passes under either pattern. The table could mislead security reviewers into believing the audit-only pattern closes all tamper vectors.
Suggested: Add a note: 'Neither pattern detects simultaneous modification of a deployed file and its hash in apm.lock.yaml. Protect apm.lock.yaml with branch protection rules and CODEOWNERS to close this vector.'
OSS Growth Hacker
-
[recommended] Dogfood tip buries the lead -- tamper detection benefit is not the first clause at
docs/src/content/docs/integrations/ci-cd.md
The updated tip opens with mechanism instead of outcome. The one-line claim readers can repost -- 'APM's own CI catches tampered AI-governance files before they can merge' -- is never stated. The tip is the highest-conversion dogfood sentence in the file.
Suggested: Reorder to lead with the outcome: 'APM's own CI uses the audit-only pattern to catch tampered AI-governance files before they can merge.' -
[recommended] Comparison table header 'Use when' gives no quick winner signal for a new adopter at
docs/src/content/docs/enterprise/enforce-in-ci.md
Both rows are the same length and neither starts with a scannable signal word. A reader who lands via search needs to see the winning row in under 3 seconds.
Suggested: Bold the first word of each row: 'Tamper detection -- catching deployed files...' and 'Install fidelity -- catching developers who skipped...' -
[nit] CHANGELOG entry is maintainer-voice, not user-voice at
CHANGELOG.md
The entry reads like a commit message. Users scanning the changelog want 'Tamper detection in CI now works correctly' as the headline.
Auth Expert
-
[recommended] enforce-in-ci.md audit-only recipe includes GITHUB_APM_PAT without explaining when it is required at
docs/src/content/docs/enterprise/enforce-in-ci.md
The audit-only path runs content-integrity checks against already-committed files and the local lockfile; it does not download packages or hit the GitHub API for private content. For repos whose deployed packages come from public sources, the PAT is not needed. Its inclusion teaches users to always wire a secret even when unnecessary, widening the credential blast radius.
Suggested: Add a callout: 'GITHUB_APM_PAT is only required if your lockfile references private packages. Omit it for repos whose deployed packages are all public.' -
[nit] ci-cd.md audit-only snippet omits env block while enforce-in-ci.md includes it -- inconsistent examples at
docs/src/content/docs/integrations/ci-cd.md
Readers following the shorter ci-cd.md snippet will not know to add a PAT even when one is needed.
Doc Writer
-
[recommended] ci-cd.md re-explains the install-overwrite blind spot instead of deferring to enforce-in-ci at
docs/src/content/docs/integrations/ci-cd.md
PROSE orchestrated composition: state once, reference elsewhere. The new block in ci-cd.md spends four sentences explaining WHY install overwrites tampered bytes before handing off to enforce-in-ci. That explanation already lives in full in enforce-in-ci.md and drift-detection.md. Duplication means future edits must keep three files in sync.
Suggested: Replace the four-sentence prose with one sentence of context + snippet + link. Cut the SHA-256 explanation -- that detail belongs in enforce-in-ci only. -
[recommended] drift-detection.md org-wide sweep YAML uses install-first immediately after the blind-spot section with no prose acknowledging the intentional choice at
docs/src/content/docs/enterprise/drift-detection.md
The new blind-spot section explains that install-first erases tampered bytes. Twelve lines later the org-wide sweep example runsapm installthenapm audit --ci. A reader will read the sweep as the pattern they were told to avoid.
Suggested: Add one sentence before the sweep YAML: 'The sweep below uses install-first because its goal is detecting missedapm installruns, not post-install tampering.' -
[nit] --no-cache is present in the minimal-gate recipe but absent from the audit-only pattern example in enforce-in-ci.md at
docs/src/content/docs/enterprise/enforce-in-ci.md
If the concern about stale policy cache applies to the minimal-gate recipe, it applies to the audit-only pattern too.
Suggested: Changeapm audit --ci --no-drifttoapm audit --ci --no-drift --no-cachein the audit-only snippet. -
[nit] Comparison table row verbs are not parallel: 'Catching' vs 'Detecting' at
docs/src/content/docs/enterprise/enforce-in-ci.md
Minor parallel structure break. Both rows should open with the same grammatical form.
Test Coverage Expert -- inactive
Documentation-only PR -- changed files are .github/workflows/ci.yml, CHANGELOG.md, docs/src/content/docs/enterprise/drift-detection.md, docs/src/content/docs/enterprise/enforce-in-ci.md, and docs/src/content/docs/integrations/ci-cd.md; zero src/**/*.py runtime code paths to defend.
This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.
Generated by PR Review Panel for issue #1291 · ● 2.8M · ◷
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
Update the CI self-check workflow to use
setup-only: truewithapm audit --ci --no-drift, preventingapm installfrom silently overwriting tampered managed files before the audit can detect them. Add documentation explaining the audit-only CI pattern.Fixes #1202
Changes
.github/workflows/ci.yml: APM Self-Check job now usessetup-only: true+--no-driftflagsenforce-in-ci.md: New "Audit-only CI pattern" section with threat model, recipe, and comparison tabledrift-detection.md: New "Install before audit and tamper detection" sectionci-cd.md: Updated dogfood tip and "Verify Deployed Primitives" section with audit-only alternativeCHANGELOG.md: Fixed entry under[Unreleased]Type of change
Testing