diff --git a/.apm/agents/test-coverage-expert.agent.md b/.apm/agents/test-coverage-expert.agent.md index fd30e55d5..d0fa720cb 100644 --- a/.apm/agents/test-coverage-expert.agent.md +++ b/.apm/agents/test-coverage-expert.agent.md @@ -148,7 +148,12 @@ it via tool calls before emitting it as a finding. The procedure: 5. **Probe the test tree** with `view` / `grep` / `glob`: - Look in `tests/unit//` for unit tests on the touched module. - Look in `tests/integration/` for integration tests on the touched - command or flow. + command or flow. New integration tests must follow the marker + placement contract in + [`.apm/instructions/tests.instructions.md`](../instructions/tests.instructions.md); + flag ungated live-network or runtime-binary calls in + `tests/integration/` as `recommended` regardless of whether the + test self-skips at runtime. - Search for the specific symbol, error string, or flag name being changed. Absence of ANY hit on the changed symbol is a strong signal of a coverage gap. diff --git a/.apm/instructions/tests.instructions.md b/.apm/instructions/tests.instructions.md index 1378efb63..3a2b40949 100644 --- a/.apm/instructions/tests.instructions.md +++ b/.apm/instructions/tests.instructions.md @@ -107,3 +107,68 @@ production code must follow (see - **Targeted runs during iteration.** Run the specific test file first (`uv run pytest tests/unit/install/test_X.py -x`) before running the full suite (`uv run pytest tests/unit tests/test_console.py`). + +## Integration tests: placement and markers + +The integration suite uses **declarative gating** via pytest markers, +not per-file orchestrator enumeration. Adding a new integration test +is two steps. + +### Procedure + +1. Drop the file under `tests/integration/test_.py`. +2. At the top of the module, declare the runtime / network / E2E + prerequisites as a single `pytestmark`: + + ```python + import pytest + + pytestmark = pytest.mark.requires_network_integration + # OR for multiple prerequisites: + pytestmark = [ + pytest.mark.requires_e2e_mode, + pytest.mark.requires_runtime_codex, + ] + ``` + +That is it. The orchestrator (`scripts/test-integration.sh`) and the +CI integration job collect everything under `tests/integration/` in +a single `pytest` invocation; markers are honored automatically. + +### Marker selection + +Pick the marker that matches the **strongest** prerequisite the test +has. The full registry lives in `pyproject.toml` under +`[tool.pytest.ini_options].markers` and is documented (with the +opt-in commands) in +[`docs/src/content/docs/contributing/integration-testing.md`](../../docs/src/content/docs/contributing/integration-testing.md). +Quick map for the common cases: + +| Test prerequisite | Marker | +|----------------------------------------------|---------------------------------| +| Real HTTP to APM-owned services | `requires_network_integration` | +| Real codex / copilot / llm runtime binary | `requires_runtime_` | +| Downloads runtimes; full E2E flow | `requires_e2e_mode` | +| GitHub / ADO token required | `requires_github_token` / `requires_ado_pat` | +| Paid or third-party external service | `live` (deselected by default) | +| Performance measurement | `benchmark` (deselected by default) | +| Hermetic (mocks all I/O) | *no marker required* | + +Need a marker that does not exist yet? Register it in +`pyproject.toml` AND add a row to the docs registry table in the +same PR. Both must stay in sync. + +### Anti-patterns (will land as `recommended` findings on review) + +- **Editing `scripts/test-integration.sh` per file.** The orchestrator + enumerates the directory, not the files. Per-file blocks are drift + by construction. +- **Runtime self-skips inside the test body.** A bare + `if not os.getenv("APM_E2E_TESTS"): pytest.skip(...)` runs before + collection-time gating and weakens the contract. Use + module-level `pytestmark` instead -- declarative gating is the + single source of truth. +- **Reading the gate env var inside test logic.** If your test + reads `APM_RUN_INTEGRATION_TESTS` to branch behaviour, the marker + is wrong (or missing). The marker is the gate; the test body + should assume the gate already passed. diff --git a/.github/agents/test-coverage-expert.agent.md b/.github/agents/test-coverage-expert.agent.md index fd30e55d5..c6e9cfb40 100644 --- a/.github/agents/test-coverage-expert.agent.md +++ b/.github/agents/test-coverage-expert.agent.md @@ -148,7 +148,12 @@ it via tool calls before emitting it as a finding. The procedure: 5. **Probe the test tree** with `view` / `grep` / `glob`: - Look in `tests/unit//` for unit tests on the touched module. - Look in `tests/integration/` for integration tests on the touched - command or flow. + command or flow. New integration tests must follow the marker + placement contract in + [`.apm/instructions/tests.instructions.md`](../../.apm/instructions/tests.instructions.md); + flag ungated live-network or runtime-binary calls in + `tests/integration/` as `recommended` regardless of whether the + test self-skips at runtime. - Search for the specific symbol, error string, or flag name being changed. Absence of ANY hit on the changed symbol is a strong signal of a coverage gap. diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 8b37a0200..f72f6c294 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -1,6 +1,6 @@ - - + + # Linting (canonical contract) @@ -49,4 +49,4 @@ skills; honor this instruction before invoking them. --- *This file was generated by APM CLI. Do not edit manually.* -*To regenerate: `specify apm compile`* +*To regenerate: `apm compile`* diff --git a/.github/instructions/tests.instructions.md b/.github/instructions/tests.instructions.md index 1378efb63..3a2b40949 100644 --- a/.github/instructions/tests.instructions.md +++ b/.github/instructions/tests.instructions.md @@ -107,3 +107,68 @@ production code must follow (see - **Targeted runs during iteration.** Run the specific test file first (`uv run pytest tests/unit/install/test_X.py -x`) before running the full suite (`uv run pytest tests/unit tests/test_console.py`). + +## Integration tests: placement and markers + +The integration suite uses **declarative gating** via pytest markers, +not per-file orchestrator enumeration. Adding a new integration test +is two steps. + +### Procedure + +1. Drop the file under `tests/integration/test_.py`. +2. At the top of the module, declare the runtime / network / E2E + prerequisites as a single `pytestmark`: + + ```python + import pytest + + pytestmark = pytest.mark.requires_network_integration + # OR for multiple prerequisites: + pytestmark = [ + pytest.mark.requires_e2e_mode, + pytest.mark.requires_runtime_codex, + ] + ``` + +That is it. The orchestrator (`scripts/test-integration.sh`) and the +CI integration job collect everything under `tests/integration/` in +a single `pytest` invocation; markers are honored automatically. + +### Marker selection + +Pick the marker that matches the **strongest** prerequisite the test +has. The full registry lives in `pyproject.toml` under +`[tool.pytest.ini_options].markers` and is documented (with the +opt-in commands) in +[`docs/src/content/docs/contributing/integration-testing.md`](../../docs/src/content/docs/contributing/integration-testing.md). +Quick map for the common cases: + +| Test prerequisite | Marker | +|----------------------------------------------|---------------------------------| +| Real HTTP to APM-owned services | `requires_network_integration` | +| Real codex / copilot / llm runtime binary | `requires_runtime_` | +| Downloads runtimes; full E2E flow | `requires_e2e_mode` | +| GitHub / ADO token required | `requires_github_token` / `requires_ado_pat` | +| Paid or third-party external service | `live` (deselected by default) | +| Performance measurement | `benchmark` (deselected by default) | +| Hermetic (mocks all I/O) | *no marker required* | + +Need a marker that does not exist yet? Register it in +`pyproject.toml` AND add a row to the docs registry table in the +same PR. Both must stay in sync. + +### Anti-patterns (will land as `recommended` findings on review) + +- **Editing `scripts/test-integration.sh` per file.** The orchestrator + enumerates the directory, not the files. Per-file blocks are drift + by construction. +- **Runtime self-skips inside the test body.** A bare + `if not os.getenv("APM_E2E_TESTS"): pytest.skip(...)` runs before + collection-time gating and weakens the contract. Use + module-level `pytestmark` instead -- declarative gating is the + single source of truth. +- **Reading the gate env var inside test logic.** If your test + reads `APM_RUN_INTEGRATION_TESTS` to branch behaviour, the marker + is wrong (or missing). The marker is the gate; the test body + should assume the gate already passed. diff --git a/CHANGELOG.md b/CHANGELOG.md index 57ab88b9b..3dd564e1d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -22,6 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Integration tests now use marker-driven discovery: 21 `pytestmark = pytest.mark.skipif(...)` chains across `tests/integration/` are replaced with declarative `requires_*` markers, with precondition logic centralized in `tests/integration/conftest.py` and auto-skipping at collection time. PR1 of #1166. (#1167) - Integration test apm-binary resolution now prefers the local build (`./dist/apm--/apm`) over a system-wide `apm` on `PATH`, so contributors validating the binary under test are not silently shadowed by a global install; the bearer-token marker (`requires_ado_bearer`) discards the captured JWT immediately and persists only the boolean outcome. (#1167) - `scripts/test-integration.sh` is now a thin orchestrator: it builds/locates the apm binary, sets up runtimes and tokens, then invokes `pytest tests/integration/` exactly once. The 28 per-file pytest enumerations were removed; the marker registry handles per-test gating, and new test files dropped into `tests/integration/` are picked up automatically. PR2 of #1166. (#1247) +- Integration-test marker procedure codified as `.apm/instructions/tests.instructions.md` (wired into `test-coverage-expert` persona) and guarded by a regression-trap test that asserts `pyproject.toml`, `tests/integration/conftest.py::_MARKER_CHECKS`, the docs registry table, and the instructions rule stay in sync. (#1166) ### Fixed diff --git a/apm.lock.yaml b/apm.lock.yaml index a690b45c4..1c7666e82 100644 --- a/apm.lock.yaml +++ b/apm.lock.yaml @@ -45,7 +45,7 @@ local_deployed_file_hashes: .github/agents/oss-growth-hacker.agent.md: sha256:1cd56bb78ab37d52c50e45ab69d759f775cd49cdf35981b3dc6c4004315c6b83 .github/agents/python-architect.agent.md: sha256:7587ee7c684c61046a83dfa1b7e39d1345f2f119c3395478e3ca2dbbaaaff0e9 .github/agents/supply-chain-security-expert.agent.md: sha256:8fb8cc426d6af17ba084a28b3f026c2b475b62e3ca63ed2f88b83bd823f877af - .github/agents/test-coverage-expert.agent.md: sha256:4df107de6179b5237fa9300582921e7fcb439928649601f0fef2d3b9275cea40 + .github/agents/test-coverage-expert.agent.md: sha256:bc588d89530362469502bfbea788df892a9a0b00e630cd0f3926d3dfd2c2a9e2 .github/instructions/changelog.instructions.md: sha256:1e51ec4c74e847967962bd279dc4c6e582c5d3578490b3c28d5f3acd3e05f73e .github/instructions/cicd.instructions.md: sha256:9c0fafc74f743aa97e5adba2168d66c9e3a327b135065e3b804bdbb5f04cda5d .github/instructions/cli.instructions.md: sha256:8e39e8d5047ce88575cb02f87c2bcede584dfef258bd86f7466c7badf136541a @@ -54,4 +54,4 @@ local_deployed_file_hashes: .github/instructions/integrators.instructions.md: sha256:b151e0438088d2c0b636dfc28532ecf43c3b51e5f1070a354b8d5b57c345e335 .github/instructions/linting.instructions.md: sha256:312acd32353567834ec9f4f246710a47a991729a11c0380aa6a010b63de607eb .github/instructions/python.instructions.md: sha256:45173f778eddc126c37c7ace96acd0e17adb1895031eec134ec0754638d3ba37 - .github/instructions/tests.instructions.md: sha256:4c6335e3373f9735778a05913f2d8ef250d118f8c5305e70ba407e578a525ef7 + .github/instructions/tests.instructions.md: sha256:b527ccaecf0e92f74d300fc9027f1bc49bb43d8ddcdd36338c1556fcde0d8b2d diff --git a/tests/integration/test_marker_registry_sync.py b/tests/integration/test_marker_registry_sync.py new file mode 100644 index 000000000..ab3fca581 --- /dev/null +++ b/tests/integration/test_marker_registry_sync.py @@ -0,0 +1,232 @@ +"""Marker-registry sync invariants for the integration suite. + +This test asserts the integrity contract surfaced in +``.apm/instructions/tests.instructions.md`` (section "Integration tests: +placement and markers"): the marker names referenced by the rule, declared +in ``pyproject.toml``, documented in +``docs/src/content/docs/contributing/integration-testing.md``, and wired +into ``tests/integration/conftest.py::_MARKER_CHECKS`` MUST stay in sync. + +Why this matters: the rule tells future contributors and agents that + +* ``pyproject.toml`` is the marker source of truth, +* the docs table is the canonical registry, +* the conftest predicate map is the gate logic. + +If any of those drifts, the rule's pointer becomes misleading and a future +PR that adds an ungated network/runtime test will not be caught by the +collection-time gate it claims to honour. + +The tests are hermetic (read-only on repo files); no marker required. +""" + +from __future__ import annotations + +import re +import sys +from pathlib import Path + +import tomllib + +# --------------------------------------------------------------------------- +# Resolve repo root from this file location -- the test relies on relative +# layout, not on a cwd assumption. +# --------------------------------------------------------------------------- + +REPO_ROOT = Path(__file__).resolve().parents[2] +PYPROJECT = REPO_ROOT / "pyproject.toml" +DOCS_REGISTRY = ( + REPO_ROOT / "docs" / "src" / "content" / "docs" / "contributing" / "integration-testing.md" +) +APM_RULE = REPO_ROOT / ".apm" / "instructions" / "tests.instructions.md" +CONFTEST = REPO_ROOT / "tests" / "integration" / "conftest.py" +INTEGRATION_DIR = REPO_ROOT / "tests" / "integration" + + +# --------------------------------------------------------------------------- +# Helpers +# --------------------------------------------------------------------------- + + +def _pyproject_marker_names() -> set[str]: + """Return the set of marker names declared in pyproject.toml.""" + with PYPROJECT.open("rb") as fh: + data = tomllib.load(fh) + raw = data["tool"]["pytest"]["ini_options"]["markers"] + out: set[str] = set() + for line in raw: + name = line.split(":", 1)[0].strip() + if name: + out.add(name) + return out + + +def _docs_registry_marker_names() -> set[str]: + """Parse the marker table in the docs registry and return marker names. + + Table rows have shape: ``| `marker_name` | precondition | how |``. + """ + text = DOCS_REGISTRY.read_text(encoding="utf-8") + out: set[str] = set() + # Match the first cell content if it is a backtick-quoted identifier. + row_re = re.compile(r"^\|\s*`([a-z_][a-z0-9_]*)`\s*\|", re.MULTILINE) + for m in row_re.finditer(text): + out.add(m.group(1)) + return out + + +def _conftest_marker_names() -> set[str]: + """Return marker names that have a predicate registered in conftest.""" + # Static parse to avoid importing the conftest module under test. + text = CONFTEST.read_text(encoding="utf-8") + # Look for entries inside the _MARKER_CHECKS dict: ' "name": (' + key_re = re.compile(r'^\s*"(requires_[a-z0-9_]+)"\s*:\s*\(', re.MULTILINE) + return set(key_re.findall(text)) + + +def _gating_markers_in_pyproject() -> set[str]: + """Subset of pyproject markers that gate execution (requires_* + live). + + ``integration``, ``slow``, ``benchmark`` are taxonomy markers, not + gates, and are intentionally excluded from the docs registry. + """ + names = _pyproject_marker_names() + return {n for n in names if n.startswith("requires_") or n == "live"} + + +# --------------------------------------------------------------------------- +# Tests +# --------------------------------------------------------------------------- + + +def test_every_pyproject_gating_marker_has_conftest_predicate() -> None: + """A declared ``requires_*`` marker without a predicate is dead config. + + If pyproject declares ``requires_X`` but ``_MARKER_CHECKS`` has no + entry for it, the marker never auto-skips and the gate is a lie. + """ + declared = {n for n in _pyproject_marker_names() if n.startswith("requires_")} + wired = _conftest_marker_names() + missing = sorted(declared - wired) + assert not missing, ( + f"requires_* markers in pyproject.toml missing a predicate in " + f"tests/integration/conftest.py::_MARKER_CHECKS: {missing}" + ) + + +def test_every_conftest_predicate_is_declared_in_pyproject() -> None: + """A wired predicate without a marker declaration warns with PytestUnknownMarkWarning.""" + declared = _pyproject_marker_names() + wired = _conftest_marker_names() + orphans = sorted(wired - declared) + assert not orphans, ( + f"_MARKER_CHECKS entries with no matching pyproject.toml marker " + f"declaration (would emit PytestUnknownMarkWarning): {orphans}" + ) + + +def test_every_gating_marker_is_documented_in_registry() -> None: + """Every ``requires_*`` / ``live`` marker MUST appear in the docs table. + + The instructions rule tells agents and humans to consult the docs + table as the canonical registry. A marker missing from the table is + invisible to anyone following the rule. + """ + gating = _gating_markers_in_pyproject() + documented = _docs_registry_marker_names() + missing = sorted(gating - documented) + assert not missing, ( + f"Gating markers declared in pyproject.toml but missing from " + f"docs/src/content/docs/contributing/integration-testing.md " + f"registry table: {missing}" + ) + + +def test_docs_registry_only_names_declared_markers() -> None: + """The docs table must not advertise a marker that does not exist.""" + declared = _pyproject_marker_names() + documented = _docs_registry_marker_names() + phantom = sorted(documented - declared) + assert not phantom, ( + f"Markers documented in the docs registry but not declared in pyproject.toml: {phantom}" + ) + + +def test_apm_rule_only_names_declared_markers() -> None: + """Marker names cited in the APM instructions rule must really exist. + + Scans for ``requires_*`` tokens in the rule body and asserts each one + matches a declared marker (or the documented ``requires_runtime_`` + placeholder). + """ + declared = _pyproject_marker_names() + # Allow the literal placeholder used in the docs/quick-map. + declared_with_placeholder = declared | {"requires_runtime_"} + body = APM_RULE.read_text(encoding="utf-8") + # Match identifiers OR the placeholder shape inside backticks / plain. + token_re = re.compile(r"\brequires_(?:runtime_|[a-z][a-z0-9_]*)") + cited = set(token_re.findall(body)) + # Re-prefix because the regex captured only the suffix; rebuild full names. + # (The regex above intentionally returns the full token thanks to \b + # anchoring and the group inside; re-derive with findall on full pattern.) + full_re = re.compile(r"\brequires_(?:runtime_|[a-z][a-z0-9_]+)") + cited = set(full_re.findall(body)) + unknown = sorted(cited - declared_with_placeholder) + assert not unknown, ( + f"Marker names cited in .apm/instructions/tests.instructions.md " + f"that are not declared in pyproject.toml: {unknown}" + ) + + +def test_integration_tests_use_pytestmark_not_runtime_self_skip() -> None: + """Tests must declare gates via ``pytestmark``, not runtime ``os.getenv``. + + The rule says: never write ``if not os.getenv("APM_E2E_TESTS"): + pytest.skip(...)`` inside a test body. Use module-level ``pytestmark = + pytest.mark.requires_e2e_mode`` instead. This guard catches future + regressions of that pattern in ``tests/integration/test_*.py``. + + ``conftest.py`` is intentionally exempt: the marker registry itself + must read the env vars to implement the gate. + """ + gate_env_vars = ( + "APM_E2E_TESTS", + "APM_RUN_INTEGRATION_TESTS", + "APM_RUN_INFERENCE_TESTS", + "APM_TEST_ADO_BEARER", + ) + # Two-line window pattern: an os.getenv on a gate var followed by + # pytest.skip on the next non-blank line (or same line). + offenders: list[str] = [] + this_file = Path(__file__).resolve() + for path in sorted(INTEGRATION_DIR.glob("test_*.py")): + if path.resolve() == this_file: + # The lint itself names the env var strings literally; skip. + continue + text = path.read_text(encoding="utf-8") + for var in gate_env_vars: + # Heuristic: an os.getenv(...gate var...) call AND a pytest.skip + # in the SAME function body. We check for the var name appearing + # within 200 chars of pytest.skip. + for m in re.finditer(rf'os\.getenv\(\s*[\'"]{var}[\'"]', text): + window = text[m.start() : m.start() + 400] + if "pytest.skip" in window or "pytest.exit" in window: + offenders.append(f"{path.relative_to(REPO_ROOT)}: gates on {var}") + break + assert not offenders, ( + "Integration test files use runtime os.getenv self-skip on a gate " + "env var. Replace with a module-level pytestmark = pytest.mark." + "requires_* marker (see .apm/instructions/tests.instructions.md):\n " + + "\n ".join(offenders) + ) + + +# --------------------------------------------------------------------------- +# Sanity: confirm tomllib is available (Python 3.11+). All CI matrix +# entries currently use 3.12+; this test is the canary if that changes. +# --------------------------------------------------------------------------- + + +def test_tomllib_available() -> None: + """tomllib was added in 3.11; CI runs on 3.12. Guard the assumption.""" + assert sys.version_info >= (3, 11), "This suite needs Python 3.11+ for tomllib"