Replace fixed-duration sleeps in tests with deterministic waits #6798

Closed
agners wants to merge 1 commit into main from
tests-tighten-async-waits

Conversation

@agners
Member

@agners agners commented May 4, 2026

Proposed change

Several tests use await asyncio.sleep(...) to "wait for the handler to run" after firing an event or unblocking a task. The fixed duration costs real wall-clock time and is also flaky: if the handler chain needs slightly more time on a busy CI runner, the assertion races the handler.

Replace each site with a wait targeted at what is actually being scheduled:

  • Bus.fire_event returns the listener tasks since #6252 (Migrate images from dockerpy to aiodocker); capture them and await asyncio.gather(*tasks) instead of sleeping. Touches test_bus.py, test_home_assistant_watchdog.py, test_plugin_base.py, docker/test_addon.py, and resolution/fixup/test_store_execute_reload.py.
  • _fire_test_event in addons/test_addon.py becomes async def and gathers internally, so its 17 call sites collapse to a single await _fire_test_event(...).
  • The two test_store_execute_reload.py sites that used the private _update_connectivity() helper are reworked to set the cached connectivity flag directly and fire the event themselves so they can gather the listener tasks the same way.
  • For fire-and-forget sys_create_task jobs whose only handle is the coroutine name, the coresys test fixture now wraps coresys.create_task to record every spawned task in a per-test list. The new tests.common.wait_for_task_by_name helper looks tasks up there rather than in asyncio.all_tasks(), which only includes pending tasks, so it works even if the task has already finished by the time the test asserts. It is used for BackupManager.reload (3 sites in api/test_backups.py), the scheduled connectivity check in test_supervisor.py, and the four DNS-debounce timer callbacks in plugins/test_dns.py. It raises LookupError if the named task was never scheduled; that indicates a test bug, because the call site expects the work to have happened.
  • The two sleep(1) post-pull drains in docker/test_interface.py collapse to sleep(0) (handler tasks are already gathered inside pull_image), saving ~2s.
  • The sleep(0.01) waits inside container_events() task bodies (test_addons.py, test_store.py, test_manager.py) are just one-yield-to-the-parent and become sleep(0).
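
As a rough illustration of the first bullet, the sketch below shows the capture-and-gather pattern against a toy bus. `ToyBus` and its `register` method are hypothetical stand-ins invented for this example, not the Supervisor `Bus` implementation; the only property assumed from the description above is that `fire_event` returns the listener tasks.

```python
import asyncio


class ToyBus:
    """Hypothetical stand-in for the Supervisor bus (illustration only)."""

    def __init__(self) -> None:
        self._listeners = []

    def register(self, listener) -> None:
        self._listeners.append(listener)

    def fire_event(self, payload) -> list[asyncio.Task]:
        # Schedule one task per listener and hand the tasks back to the caller.
        return [asyncio.create_task(listener(payload)) for listener in self._listeners]


async def run_example() -> list[str]:
    seen: list[str] = []

    async def listener(payload: str) -> None:
        # The handler may yield to the loop any number of times.
        await asyncio.sleep(0)
        await asyncio.sleep(0)
        seen.append(payload)

    bus = ToyBus()
    bus.register(listener)

    tasks = bus.fire_event("state-changed")
    # Deterministic wait: resumes once every listener task has finished and
    # re-raises the first exception a listener raised, if any.
    await asyncio.gather(*tasks)
    return seen


result = asyncio.run(run_example())
```

Compared with a fixed sleep, the wait lasts exactly as long as the listeners need, however many times they yield.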

Switching to gather exposes a few latent mock bugs whose TypeErrors were previously swallowed silently as background-task failures. CGroup.add_devices_allowed is async def but was patched as a plain MagicMock in docker/test_addon.py (now new_callable=AsyncMock). The watchdog does await (await self.start()) / await (await self.restart()) because App.start / App.restart return asyncio.Task, so the mocks in addons/test_addon.py (test_app_watchdog, test_watchdog_on_stop, test_watchdog_during_attach) needed AsyncMock(return_value=<settled future>) to mirror that shape rather than a plain MagicMock.
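
A minimal sketch of why gathering surfaces this class of bug, using only `asyncio` and `unittest.mock`. The `handler` coroutine and the `"example-rule"` argument are hypothetical; they stand in for any production code that awaits an `async def` method which a test has patched.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def handler(add_devices_allowed) -> str:
    # Stand-in for production code that awaits an async method.
    await add_devices_allowed("example-rule")
    return "ok"


async def run_example() -> list[str]:
    # A MagicMock call returns a non-awaitable, so the await raises TypeError
    # inside the background task.
    broken = asyncio.create_task(handler(MagicMock()))
    # An AsyncMock call returns an awaitable, matching an async def method.
    fixed = asyncio.create_task(handler(AsyncMock()))
    # return_exceptions=True is used only to inspect both outcomes side by side.
    results = await asyncio.gather(broken, fixed, return_exceptions=True)
    return [r if isinstance(r, str) else type(r).__name__ for r in results]


outcomes = asyncio.run(run_example())
```

With a fixed sleep the failing task is never awaited, so its exception stays inside the task object; awaiting the task hands the exception to the test.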

Tests that wait on real D-Bus signal round-trips through the dbus-daemon subprocess (test_data_disk.py, dbus/udisks2/test_manager.py, mounts/test_mount.py, test_core.py, host/test_firewall.py) need real wall-clock time for the in-flight stop/start unit calls to settle and the property-change subscription to be in place before the test emits the signal — they keep their existing sleep(0.1). Same for tests where the sleep itself is the test point (job throttle, scheduler timing, freeze/thaw timeout).

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature (which adds functionality to the supervisor)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:
  • Link to cli pull request:
  • Link to client library pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Ruff (ruff format supervisor tests)
  • Tests have been added to verify that the new code works.

If API endpoints or add-on configuration are added/changed:

@agners agners added the test Adding missing tests or correcting existing tests label May 4, 2026
@agners agners marked this pull request as draft May 4, 2026 13:52
@agners agners force-pushed the tests-tighten-async-waits branch 2 times, most recently from 9e4b580 to 2ac13ea May 4, 2026 14:43
@agners agners requested a review from Copilot May 4, 2026 15:21
@agners
Member Author

agners commented May 4, 2026

I've run this on Python 3.14.2 (since it seems to behave differently when it comes to asyncio scheduling, as we see in Core) and a couple of times in CI here. The tests all seem stable with this change.

Contributor

Copilot AI left a comment

Pull request overview

This PR improves test determinism by replacing fixed-duration asyncio.sleep(...) calls with helpers that yield control to the event loop or await specific background tasks, reducing wall-clock delays and flakiness on busy CI runners.

Changes:

  • Added yield_to_event_loop() and wait_for_task_by_name() helpers in tests/common.py to replace fixed sleeps with deterministic waits.
  • Updated multiple tests to use yield_to_event_loop() after in-process events, and replaced small non-deterministic delays with asyncio.sleep(0) where a single yield is sufficient.
  • Updated backups API tests to await the specific BackupManager.reload background task rather than sleeping.
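
The task-lookup idea can be sketched roughly as follows. `TaskRecorder` is a hypothetical stand-in for the fixture that wraps `coresys.create_task`, and the helper's signature here is an assumption; the actual `tests.common.wait_for_task_by_name` may differ.

```python
import asyncio


class TaskRecorder:
    """Hypothetical recorder standing in for the wrapped create_task fixture."""

    def __init__(self) -> None:
        self.spawned: list[tuple[str, asyncio.Task]] = []

    def create_task(self, coro) -> asyncio.Task:
        task = asyncio.create_task(coro)
        # Record the coroutine name up front so finished tasks stay findable.
        self.spawned.append((coro.__name__, task))
        return task


async def wait_for_task_by_name(recorder: TaskRecorder, name: str) -> None:
    matches = [task for coro_name, task in recorder.spawned if coro_name == name]
    if not matches:
        # The call site expected the work to be scheduled, so this is a test bug.
        raise LookupError(f"no task named {name!r} was scheduled")
    await asyncio.gather(*matches)


async def run_example() -> tuple[list[str], bool, bool]:
    recorder = TaskRecorder()
    log: list[str] = []

    async def reload() -> None:
        log.append("reloaded")

    task = recorder.create_task(reload())
    await asyncio.sleep(0)  # the task runs to completion during this yield
    visible = task in asyncio.all_tasks()  # finished tasks are not listed
    await wait_for_task_by_name(recorder, "reload")  # still works afterwards

    try:
        await wait_for_task_by_name(recorder, "never_scheduled")
        missing_raised = False
    except LookupError:
        missing_raised = True
    return log, visible, missing_raised


log, visible, missing_raised = asyncio.run(run_example())
```

Recording tasks at creation time is what lets the lookup succeed after the task has finished, which a scan of `asyncio.all_tasks()` would miss.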

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/common.py | Adds new async waiting helpers intended to replace fixed-duration sleeps in tests. |
| tests/resolution/fixup/test_store_execute_reload.py | Replaces sleep(0.1) with yield_to_event_loop() after connectivity/event triggers. |
| tests/plugins/test_dns.py | Uses yield_to_event_loop() to wait for notify_locals_changed() async effects. |
| tests/host/test_firewall.py | Uses yield_to_event_loop() after starting apply_gateway_firewall_rules() task. |
| tests/homeassistant/test_home_assistant_watchdog.py | Uses yield_to_event_loop() after firing container state change events. |
| tests/docker/test_interface.py | Replaces sleep(1) with sleep(0) before asserting captured progress events. |
| tests/docker/test_addon.py | Replaces sleep(0.01) with sleep(0) after firing hardware events. |
| tests/backups/test_manager.py | Replaces sleep(0.01) with sleep(0) in simulated container events coroutine. |
| tests/api/test_store.py | Replaces sleep(0.01) with sleep(0) in simulated container events coroutine. |
| tests/api/test_backups.py | Uses wait_for_task_by_name() to await reload tasks spawned by API calls. |
| tests/api/test_addons.py | Replaces sleep(0.01) with sleep(0) in simulated container events coroutine. |
| tests/addons/test_manager.py | Replaces sleep(0.01) with sleep(0) in startup/uninstall wait test. |
| tests/addons/test_addon.py | Replaces small sleeps with yield_to_event_loop() around container state events. |

Comment thread tests/common.py Outdated
Comment thread tests/common.py Outdated
Comment thread tests/host/test_firewall.py
@agners agners force-pushed the tests-tighten-async-waits branch 3 times, most recently from 6718661 to 1817b77 May 4, 2026 22:09
Several tests use ``await asyncio.sleep(...)`` to "wait for the
handler to run" after firing an event or unblocking a task. The
fixed duration is real wall-clock time and is also flaky — if the
handler chain happens to need slightly more time on a busy CI
runner, the assertion races the handler.

Replace each site with a wait targeted at what is actually being
scheduled, instead of a generic time-based heuristic:

- ``Bus.fire_event`` returns the listener tasks since #6252;
  capture and ``await asyncio.gather(*tasks)`` instead of sleeping.
  Touches test_bus.py (the bus tests were poking scheduling instead
  of verifying their assertions), test_home_assistant_watchdog.py,
  test_plugin_base.py, test_docker/test_addon.py, and
  test_store_execute_reload.py.
- ``_fire_test_event`` in test_addons/test_addon.py becomes
  ``async def`` and gathers the listener tasks itself, so its 17
  call sites collapse to a single ``await _fire_test_event(...)``.
- The two test_store_execute_reload.py sites that used the private
  ``_update_connectivity()`` helper are reworked to set the cached
  connectivity flag directly and fire the event themselves so they
  can gather the listener tasks the same way.
- test_dns.py (notify_locals_changed): the timer-then-task chain is
  awaited explicitly via ``_restart_after_locals_change_handle``.
- Fire-and-forget ``sys_create_task`` jobs whose only handle is
  the coroutine name are awaited via the new
  ``tests.common.wait_for_task_by_name``: BackupManager.reload
  spawned by the API in test_api/test_backups.py (3 sites), the
  scheduled connectivity check in test_supervisor.py, and the
  watchdog handler triggered by a mock_stop fire_event in
  test_addons/test_manager.py.
- The two ``sleep(1)`` post-pull drains in test_docker/test_interface.py
  collapse to ``sleep(0)`` (handler tasks are already gathered inside
  pull_image), saving ~2s.
- The ``sleep(0.01)`` waits inside ``container_events()`` task
  bodies (test_addons.py, test_store.py, test_manager.py) are just
  one-yield-to-the-parent and become ``sleep(0)``.
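
A small sketch of the single-yield behaviour the last bullet relies on. The `container_events` coroutine below is a hypothetical stand-in for the simulated event bodies in those tests; the ordering shown assumes the default asyncio event loop.

```python
import asyncio


async def run_example() -> list[str]:
    order: list[str] = []

    async def container_events() -> None:
        order.append("event-1")
        # One yield back to the parent; no wall-clock delay is involved.
        await asyncio.sleep(0)
        order.append("event-2")

    task = asyncio.create_task(container_events())
    await asyncio.sleep(0)  # the child runs up to its first yield
    order.append("parent")
    await task  # the child finishes its second half
    return order


order = asyncio.run(run_example())
```

Because `sleep(0)` only hands control to the loop for one pass, the interleaving is the same as with `sleep(0.01)` but without the timer.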

Switching to ``gather`` exposes a few latent test mocks that were
silently swallowing TypeErrors as background-task failures before:

- ``CGroup.add_devices_allowed`` is ``async def`` but was patched
  as a plain MagicMock in test_docker/test_addon.py — now patched
  via ``new_callable=AsyncMock``.
- The watchdog does ``await (await self.start())`` /
  ``await (await self.restart())`` because ``App.start`` /
  ``App.restart`` return ``asyncio.Task``. The mocks in
  test_addons/test_addon.py (test_app_watchdog, test_watchdog_on_stop,
  test_watchdog_during_attach) needed
  ``AsyncMock(return_value=<settled future>)`` to mirror that
  shape rather than a plain MagicMock.
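
The double-await shape can be mirrored in a mock roughly like this. `watchdog_restart` and the `app` object are hypothetical stand-ins for the watchdog code path and the patched `App`; only the `await (await ...)` pattern is taken from the description above.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def watchdog_restart(app) -> str:
    # The first await yields a task-like awaitable; the second waits for it.
    return await (await app.start())


async def run_example() -> tuple[str, int]:
    # A future that is already resolved, so awaiting it returns immediately.
    settled = asyncio.get_running_loop().create_future()
    settled.set_result("started")

    app = MagicMock()
    app.start = AsyncMock(return_value=settled)

    result = await watchdog_restart(app)
    return result, app.start.await_count


result, await_count = asyncio.run(run_example())
```

A plain MagicMock for `start` would fail at the first await, which is the TypeError that previously stayed hidden inside the background task.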

Tests that wait on real D-Bus signal round-trips through the
dbus-daemon subprocess (test_data_disk.py, dbus/udisks2/test_manager.py,
mounts/test_mount.py, test_core.py, test_firewall.py) need real
wall-clock time for the in-flight stop/start unit calls to settle
and the property-change subscription to be in place before the test
emits the signal — they keep their existing ``sleep(0.1)``. Same
for tests where the sleep itself is the test point (job throttle,
scheduler timing, freeze/thaw timeout).
@agners agners force-pushed the tests-tighten-async-waits branch from 1817b77 to 41c1d5d May 4, 2026 22:13
@agners agners marked this pull request as ready for review May 4, 2026 22:13
@agners
Member Author

agners commented May 5, 2026

Split into two PRs for easier reviewing: #6803 and #6804. Whether the second PR really adds value is debatable.

@agners agners closed this May 5, 2026
@github-actions github-actions Bot locked and limited conversation to collaborators May 7, 2026
Labels

cla-signed test Adding missing tests or correcting existing tests

2 participants