Replace fixed-duration sleeps in tests with deterministic waits#6798
Closed
Replace fixed-duration sleeps in tests with deterministic waits#6798
Conversation
9e4b580 to
2ac13ea
Compare
Member
Author
|
I've run this on Python 3.14.2 (just since it seems to have different behavior when it comes to asyncio/scheduling as we see in Core) and a couple of times in CI here, the tests all seem stable with this change. |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR improves test determinism by replacing fixed-duration asyncio.sleep(...) calls with helpers that yield control to the event loop or await specific background tasks, reducing wall-clock delays and flakiness on busy CI runners.
Changes:
- Added
yield_to_event_loop()andwait_for_task_by_name()helpers intests/common.pyto replace fixed sleeps with deterministic waits. - Updated multiple tests to use
yield_to_event_loop()after in-process events, and replaced small non-deterministic delays withasyncio.sleep(0)where a single yield is sufficient. - Updated backups API tests to await the specific
BackupManager.reloadbackground task rather than sleeping.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/common.py | Adds new async waiting helpers intended to replace fixed-duration sleeps in tests. |
| tests/resolution/fixup/test_store_execute_reload.py | Replaces sleep(0.1) with yield_to_event_loop() after connectivity/event triggers. |
| tests/plugins/test_dns.py | Uses yield_to_event_loop() to wait for notify_locals_changed() async effects. |
| tests/host/test_firewall.py | Uses yield_to_event_loop() after starting apply_gateway_firewall_rules() task. |
| tests/homeassistant/test_home_assistant_watchdog.py | Uses yield_to_event_loop() after firing container state change events. |
| tests/docker/test_interface.py | Replaces sleep(1) with sleep(0) before asserting captured progress events. |
| tests/docker/test_addon.py | Replaces sleep(0.01) with sleep(0) after firing hardware events. |
| tests/backups/test_manager.py | Replaces sleep(0.01) with sleep(0) in simulated container events coroutine. |
| tests/api/test_store.py | Replaces sleep(0.01) with sleep(0) in simulated container events coroutine. |
| tests/api/test_backups.py | Uses wait_for_task_by_name() to await reload tasks spawned by API calls. |
| tests/api/test_addons.py | Replaces sleep(0.01) with sleep(0) in simulated container events coroutine. |
| tests/addons/test_manager.py | Replaces sleep(0.01) with sleep(0) in startup/uninstall wait test. |
| tests/addons/test_addon.py | Replaces small sleeps with yield_to_event_loop() around container state events. |
6718661 to
1817b77
Compare
Several tests use ``await asyncio.sleep(...)`` to "wait for the handler to run" after firing an event or unblocking a task. The fixed duration is real wall-clock time and is also flaky — if the handler chain happens to need slightly more time on a busy CI runner, the assertion races the handler. Replace each site with a wait targeted at what is actually being scheduled, instead of a generic time-based heuristic: - ``Bus.fire_event`` returns the listener tasks since #6252; capture and ``await asyncio.gather(*tasks)`` instead of sleeping. Touches test_bus.py (the bus tests were poking scheduling instead of verifying their assertions), test_home_assistant_watchdog.py, test_plugin_base.py, test_docker/test_addon.py, and test_store_execute_reload.py. - ``_fire_test_event`` in test_addons/test_addon.py becomes ``async def`` and gathers the listener tasks itself, so its 17 call sites collapse to a single ``await _fire_test_event(...)``. - The two test_store_execute_reload.py sites that used the private ``_update_connectivity()`` helper are reworked to set the cached connectivity flag directly and fire the event themselves so they can gather the listener tasks the same way. - test_dns.py (notify_locals_changed): the timer-then-task chain is awaited explicitly via ``_restart_after_locals_change_handle``. - Fire-and-forget ``sys_create_task`` jobs whose only handle is the coroutine name are awaited via the new ``tests.common.wait_for_task_by_name``: BackupManager.reload spawned by the API in test_api/test_backups.py (3 sites), the scheduled connectivity check in test_supervisor.py, and the watchdog handler triggered by a mock_stop fire_event in test_addons/test_manager.py. - The two ``sleep(1)`` post-pull drains in test_docker/test_interface.py collapse to ``sleep(0)`` (handler tasks are already gathered inside pull_image), saving ~2s. - The ``sleep(0.01)`` waits inside ``container_events()`` task bodies (test_addons.py, test_store.py, test_manager.py) are just one-yield-to-the-parent and become ``sleep(0)``. Switching to ``gather`` exposes a few latent test mocks that were silently swallowing TypeErrors as background-task failures before: - ``CGroup.add_devices_allowed`` is ``async def`` but was patched as a plain MagicMock in test_docker/test_addon.py — now patched via ``new_callable=AsyncMock``. - The watchdog does ``await (await self.start())`` / ``await (await self.restart())`` because ``App.start`` / ``App.restart`` return ``asyncio.Task``. The mocks in test_addons/test_addon.py (test_app_watchdog, test_watchdog_on_stop, test_watchdog_during_attach) needed ``AsyncMock(return_value=<settled future>)`` to mirror that shape rather than a plain MagicMock. Tests that wait on real D-Bus signal round-trips through the dbus-daemon subprocess (test_data_disk.py, dbus/udisks2/test_manager.py, mounts/test_mount.py, test_core.py, test_firewall.py) need real wall-clock time for the in-flight stop/start unit calls to settle and the property-change subscription to be in place before the test emits the signal — they keep their existing ``sleep(0.1)``. Same for tests where the sleep itself is the test point (job throttle, scheduler timing, freeze/thaw timeout).
1817b77 to
41c1d5d
Compare
Member
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Proposed change
Several tests use
await asyncio.sleep(...)to "wait for the handler to run" after firing an event or unblocking a task. The fixed duration is real wall-clock time and is also flaky if the handler chain happens to need slightly more time on a busy CI runner.Replace each site with a wait targeted at what is actually being scheduled:
Bus.fire_eventreturns the listener tasks since Migrate images from dockerpy to aiodocker #6252; capture andawait asyncio.gather(*tasks)instead of sleeping. Touchestest_bus.py,test_home_assistant_watchdog.py,test_plugin_base.py,docker/test_addon.py, andresolution/fixup/test_store_execute_reload.py._fire_test_eventinaddons/test_addon.pybecomesasync defand gathers internally, so its 17 call sites collapse to a singleawait _fire_test_event(...).test_store_execute_reload.pysites that used the private_update_connectivity()helper are reworked to set the cached connectivity flag directly and fire the event themselves so they can gather the listener tasks the same way.sys_create_taskjobs whose only handle is the coroutine name, thecoresystest fixture now wrapscoresys.create_taskto record every spawned task into a per-test list. The newtests.common.wait_for_task_by_namehelper looks tasks up there (rather than fromasyncio.all_tasks(), which only includes pending tasks) so it works even if the task has already finished by the time the test asserts. Used forBackupManager.reload(3 sites inapi/test_backups.py), the scheduled connectivity check intest_supervisor.py, and the four DNS-debounce timer callbacks inplugins/test_dns.py. RaisesLookupErrorif the named task was never scheduled — that is a test bug, the call site expects the work to have happened.sleep(1)post-pull drains indocker/test_interface.pycollapse tosleep(0)(handler tasks are already gathered insidepull_image), saving ~2s.sleep(0.01)waits insidecontainer_events()task bodies (test_addons.py,test_store.py,test_manager.py) are just one-yield-to-the-parent and becomesleep(0).Switching to
gatherexposes a few latent test mocks that were silently swallowing TypeErrors as background-task failures:CGroup.add_devices_allowedisasync defbut was patched as a plainMagicMockindocker/test_addon.py(nownew_callable=AsyncMock), and the watchdog doesawait (await self.start())/await (await self.restart())becauseApp.start/App.restartreturnasyncio.Task— the mocks inaddons/test_addon.py(test_app_watchdog,test_watchdog_on_stop,test_watchdog_during_attach) neededAsyncMock(return_value=<settled future>)to mirror that shape rather than a plainMagicMock.Tests that wait on real D-Bus signal round-trips through the dbus-daemon subprocess (
test_data_disk.py,dbus/udisks2/test_manager.py,mounts/test_mount.py,test_core.py,host/test_firewall.py) need real wall-clock time for the in-flight stop/start unit calls to settle and the property-change subscription to be in place before the test emits the signal — they keep their existingsleep(0.1). Same for tests where the sleep itself is the test point (job throttle, scheduler timing, freeze/thaw timeout).Type of change
Additional information
Checklist
ruff format supervisor tests)If API endpoints or add-on configuration are added/changed: