test: dissolve test-suite quarantine - all three suites green (closes #2581) #2583
Code Review
This pull request unquarantines the Character_Chat_NEW test suite by removing the pytest_collection_modifyitems quarantine hook in conftest.py and updating the status in the audits/2026-07-02-quarantined-suites.md documentation. There are no review comments, so I have no additional feedback to provide.
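For readers unfamiliar with the mechanism being removed, a quarantine hook of this kind might look roughly like the sketch below. This is an illustration only, not the code deleted by this PR: the directory constant and the deselect-by-node-id rule are assumptions, and the fake config and item objects exist solely so the hook can be exercised without running pytest. `pytest_collection_modifyitems` and `config.hook.pytest_deselected` are standard pytest hooks, and `RUN_QUARANTINED` is the variable named in this PR.

```python
import os
from types import SimpleNamespace

QUARANTINED_DIR = "tests/Character_Chat_NEW"  # hypothetical path, for illustration


def pytest_collection_modifyitems(config, items):
    """Deselect quarantined tests unless RUN_QUARANTINED=1 is set."""
    if os.environ.get("RUN_QUARANTINED") == "1":
        return
    kept = [item for item in items if QUARANTINED_DIR not in item.nodeid]
    dropped = [item for item in items if QUARANTINED_DIR in item.nodeid]
    if dropped:
        config.hook.pytest_deselected(items=dropped)  # report them as deselected
        items[:] = kept  # pytest expects the list to be mutated in place


# Tiny stand-ins so the hook can be exercised without running pytest.
reported = []
fake_config = SimpleNamespace(
    hook=SimpleNamespace(pytest_deselected=lambda items: reported.extend(items))
)
fake_items = [
    SimpleNamespace(nodeid="tests/Character_Chat_NEW/test_a.py::test_x"),
    SimpleNamespace(nodeid="tests/Other/test_b.py::test_y"),
]

os.environ.pop("RUN_QUARANTINED", None)
pytest_collection_modifyitems(fake_config, fake_items)
remaining = [item.nodeid for item in fake_items]
```

With a hook like this in place, a quarantined suite silently drops out of collection, which is why removing it (rather than leaving it inert) makes the green state visible in ordinary runs.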
PR Summary by Qodo: Unquarantine Character_Chat_NEW test suite and update burn-down audit
…cleanup

Code-review response for PR #2583:
- Verified empirically the Embeddings lifecycle fixture is NOT redundant with the root-conftest reset: removing it reproduces the 46 failures exactly. Probe plugin revealed the mechanism: reload_app_main() (used by test_backpressure_and_quotas.py) permanently swaps sys.modules' main module, so the root fixture resets the NEW app while pinned tests still route through the drained ORIGINAL. Docstring now documents the proven chain; the underlying reload leak is tracked in issue #2585.
- Removed six stale 'pre-quarantine' comment blocks from ci.yml (the hooks they referenced no longer exist).
- Tightened the burn-down doc's exit-criteria phrasing to past tense.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Code review (internal) outcome: the reviewer challenged the fixture's necessity (a root-conftest reset already exists) and was right to do so; verifying the claim deepened the root-cause analysis.

Fixture proven load-bearing, not redundant: removing it reproduces the 46 failures exactly (…).

Why the root-conftest reset doesn't cover this (found with an app-identity probe; three distinct …): reload_app_main() (used by test_backpressure_and_quotas.py) permanently swaps sys.modules' main module, so the root fixture resets the new app while pinned tests still route through the drained original.

The underlying reload leak is tracked in issue #2585.

Also addressed from the review: six stale 'pre-quarantine' comment blocks removed from ci.yml; burn-down doc exit-criteria phrasing tightened. The fixture docstring now documents the verified four-step mechanism.

🤖 Generated with Claude Code
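The identity problem described in this comment can be reproduced in miniature. The sketch below is a simplified model, not the project's code: `FakeApp`, `install_main`, `root_reset`, and the module name `fake_main` are all hypothetical stand-ins. It shows only the general Python behaviour that a reference captured before a module swap keeps pointing at the old object, so a reset that looks the app up through `sys.modules` never reaches it.

```python
import sys
import types


class FakeApp:
    """Hypothetical stand-in for the shared application object."""

    def __init__(self) -> None:
        self.draining = False


def install_main() -> types.ModuleType:
    """Build a fresh module with its own app and register it in sys.modules."""
    module = types.ModuleType("fake_main")
    module.app = FakeApp()
    sys.modules["fake_main"] = module
    return module


def root_reset() -> None:
    """Mimic a root-level reset that finds the app through sys.modules."""
    sys.modules["fake_main"].app.draining = False


original = install_main()
pinned_app = original.app   # a test module captured `app` at import time
pinned_app.draining = True  # a lifespan exit marks the original as draining

install_main()              # a reload helper swaps in a new module object
root_reset()                # resets only the app reachable via sys.modules

still_draining = pinned_app.draining
same_object = sys.modules["fake_main"].app is pinned_app
```

Here `still_draining` stays `True` and `same_object` is `False`: the reset ran, but against a different app than the one the pinned tests use, which matches the comment's argument for a suite-level fixture.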
…2581)

Full unrestricted re-run on current dev: 476 passed, 4 skipped, 0 failed (1h14m, 60s per-test timeout). The 68 failures recorded 2026-07-02 were measured against a main-based checkout and did not reproduce; intervening Character_Chat work on dev fixed them. Quarantine hook removed; burn-down tracker updated (TTS_NEW and Embeddings remain quarantined pending their re-triage).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ism (closes #2581)

Full unrestricted re-runs on current dev:
- TTS_NEW: 626 passed / 8 skipped / 0 failed (7m06s) - the 325 recorded failures no longer reproduce
- Embeddings: 421 passed / 18 skipped / 0 failed (2m37s) after fixing the one real bug: 46 directory-run failures were cross-test interference from stale app-lifecycle drain state (a lifespan exit marks the shared app draining; later tests got 503 shutdown_in_progress from DrainGateMiddleware). New autouse _reset_app_lifecycle_state fixture clears it per test; every affected file passes solo and in-directory.

With all three suites green the quarantine mechanism is fully removed: hooks, the shared tests/_plugins/quarantine.py helper, and ci.yml's six now-inert RUN_QUARANTINED env lines. Burn-down doc retained as history.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
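The cross-test interference described in this commit can be modelled in a few lines. The following is a minimal sketch under stated assumptions, not the project's implementation: `LifecycleState`, `lifespan`, `handle_request`, and `fake_reset_lifecycle_state` are hypothetical stand-ins for the real app lifespan, `DrainGateMiddleware`, and `reset_lifecycle_state` helper. The reset is written as a plain generator so the example runs without pytest; in a real `conftest.py` it would presumably carry `@pytest.fixture(autouse=True)`.

```python
from contextlib import contextmanager


class LifecycleState:
    """Hypothetical shared, module-level lifecycle state."""

    def __init__(self) -> None:
        self.draining = False


STATE = LifecycleState()


@contextmanager
def lifespan():
    """Stand-in for the app lifespan: marks the shared state draining on exit."""
    try:
        yield
    finally:
        STATE.draining = True


def handle_request():
    """Stand-in for the drain gate: reject requests while draining."""
    if STATE.draining:
        return 503, {"status": "not_ready", "reason": "shutdown_in_progress"}
    return 200, {"status": "ok"}


def fake_reset_lifecycle_state() -> None:
    """Stand-in for the existing reset helper (assumed behaviour)."""
    STATE.draining = False


def _reset_app_lifecycle_state():
    """Generator-style fixture body: clear drain state before each test."""
    fake_reset_lifecycle_state()
    yield


# "Test 1" runs the lifespan; requests inside it succeed.
with lifespan():
    inside_status, _ = handle_request()

# "Test 2" without the fixture inherits the leaked drain state.
leaked_status, leaked_body = handle_request()

# "Test 2" with the fixture applied first sees a clean state.
fixture = _reset_app_lifecycle_state()
next(fixture)
clean_status, _ = handle_request()
```

In this model the second request is rejected only because an earlier lifespan exit left the shared flag set, which is consistent with the report that every affected file passes when run on its own.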
4f6754d to 6f23c56
Completes the #2581 burn-down, faster than expected: full unrestricted re-runs on current dev show all three formerly-quarantined suites green, so the entire quarantine mechanism is dissolved.

Re-run results (60s per-test timeout, RUN_QUARANTINED=1 before removal)
- Character_Chat_NEW: 476 passed / 4 skipped / 0 failed (1h14m)
- TTS_NEW: 626 passed / 8 skipped / 0 failed (7m06s)
- Embeddings: 421 passed / 18 skipped / 0 failed (2m37s) †

† After fixing the one genuine bug addressed in this PR: Embeddings' 46 directory-run failures were cross-test interference. Any test that runs the app lifespan (with TestClient(app)) marks the shared module-level app as draining on exit, and every later request gets 503 {"status":"not_ready","reason":"shutdown_in_progress"} from DrainGateMiddleware. Every affected file passes solo. Fixed with an autouse _reset_app_lifecycle_state fixture in tests/Embeddings/conftest.py (uses the existing reset_lifecycle_state helper).

The Character_Chat_NEW and TTS_NEW failures recorded on 2026-07-02 were measured against a main-based checkout under back-to-back-suite load and no longer reproduce on dev.

Changes
- Removed the quarantine hooks
- Removed the shared tests/_plugins/quarantine.py helper
- Removed ci.yml's six RUN_QUARANTINED: "1" env lines (the shards run the same files either way)

Verification
- Character_Chat_NEW/property/test_ambiguous_sender_heuristics.py showed one transient failure in a killed partial run but passed the full run (possible Hypothesis flake)

Closes #2581.
🤖 Generated with Claude Code