
Improve and extend frontend probe after update with WebSocket check #6811

Merged
agners merged 2 commits into main from improve-core-http-health-check on May 6, 2026


@agners
Member

@agners agners commented May 5, 2026

Proposed change

The post-update health check introduced in #6311 added HomeAssistantAPI.check_frontend_available, which fetched the frontend through the existing Supervisor-internal API connection to Core. Since #6742 that connection optionally runs over a Unix socket with no authentication, so the request no longer exercises the same transport, auth and routing path that an external HTTP client uses.

Move the frontend probe out of HomeAssistantAPI into a small frontend_check module that talks to Core's TCP endpoints via the plain websession with no authentication, mirroring what an external client would see.
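As a rough illustration of what checking the response "as an external client would see it" could involve, the sketch below classifies the result of an unauthenticated GET /. The function name and the acceptance criteria (status 200, a text/html content type, a non-empty body) are assumptions made for this example, not the actual code in frontend_check.

```python
def looks_like_frontend(status: int, content_type: str, body: str) -> bool:
    """Return True if an unauthenticated GET / response looks like the frontend.

    Hypothetical helper: the criteria below are assumptions for illustration,
    not the checks performed by Supervisor.
    """
    if status != 200:
        return False
    # Content-Type may carry parameters, e.g. "text/html; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/html":
        return False
    # An empty body is not a usable frontend page.
    return bool(body.strip())
```

Keeping the classification separate from the HTTP call makes it easy to unit test without a running Core instance.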

While doing this, extend the post-update verification to also probe the WebSocket endpoint: open /api/websocket and confirm the first frame is the auth_required text message. This catches the kind of WebSocket breakage seen in #6802, where api/config still listed websocket_api as loaded and GET / still returned HTML, but the WebSocket handshake completed with an immediate close frame and the frontend was unusable.
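The first-frame check described above can be sketched as a small pure function. The name is_auth_required_frame and its parameters are hypothetical; the only facts taken from this PR are that the first frame must be a text message and that its type must be auth_required.

```python
import json


def is_auth_required_frame(is_text: bool, data: str) -> bool:
    """Return True if the first WebSocket frame is an auth_required text message.

    Hypothetical helper, not the Supervisor implementation.
    """
    if not is_text:
        # For example a close frame right after the handshake, as seen in #6802.
        return False
    try:
        payload = json.loads(data)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return False
    if not isinstance(payload, dict):
        # A list or string payload has no .get(); treat it as a failed probe.
        return False
    return payload.get("type") == "auth_required"
```

A close frame, a non-JSON text frame, or a JSON value that is not an object all count as a failed probe in this sketch.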

The component check now also requires http to be loaded, in addition to frontend and websocket_api, and iterates so every missing component is logged.
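A minimal sketch of such a component check follows. The three required component names come from the description above; the function name, the log message wording, and the shape of the input (a plain list of loaded component names) are assumptions for illustration.

```python
import logging

_LOGGER = logging.getLogger(__name__)

# Component names taken from the PR description.
REQUIRED_COMPONENTS = ("frontend", "http", "websocket_api")


def missing_components(loaded: list[str]) -> list[str]:
    """Return every required component absent from `loaded`, logging each one.

    Hypothetical helper, not the Supervisor implementation.
    """
    loaded_set = set(loaded)
    missing = [name for name in REQUIRED_COMPONENTS if name not in loaded_set]
    # Log all missing components instead of stopping at the first one.
    for name in missing:
        _LOGGER.error("Required component %s is not loaded", name)
    return missing
```

Collecting the full list before deciding on a rollback means a single log pass shows everything that failed to load.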

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature (which adds functionality to the supervisor)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Ruff (ruff format supervisor tests)
  • Tests have been added to verify that the new code works.

If API endpoints or add-on configuration are added/changed:

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@agners agners added the new-feature A new feature label May 5, 2026
@agners agners requested a review from Copilot May 5, 2026 15:13
@agners
Member Author

agners commented May 5, 2026

With this change, if an old http custom component is in use, as reported in #6802, the Supervisor correctly rolls back to the previous release when the Core update fails:

2026-05-05 15:43:30.085 INFO (MainThread) [supervisor.homeassistant.core] Successfully started Home Assistant 2026.4.1
2026-05-05 15:43:30.094 WARNING (MainThread) [supervisor.homeassistant.websocket] Can't send WebSocket command: Unexpected error during WebSocket handshake: Received message 8:1000 is not WSMsgType.TEXT
2026-05-05 15:43:30.112 DEBUG (MainThread) [supervisor.homeassistant.frontend_check] Frontend is accessible and serving HTML
2026-05-05 15:43:30.114 ERROR (MainThread) [supervisor.homeassistant.frontend_check] WebSocket handshake returned non-text message: 8
2026-05-05 15:43:30.114 CRITICAL (MainThread) [supervisor.homeassistant.core] HomeAssistant update failed -> rollback!
2026-05-05 15:43:30.114 INFO (MainThread) [supervisor.resolution.module] Create new issue update_rollback - core / None
2026-05-05 15:43:30.115 INFO (MainThread) [supervisor.homeassistant.core] Updating Home Assistant to version 2026.3.4

Contributor

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Contributor

@mdegat01 mdegat01 left a comment


This looks good. My only concern is whether the WebSocket API check, as implemented, will cause Core to pop that "invalid auth" persistent notification. I'm not sure exactly what triggers it. I think it's from the HTTP API rather than the WS API, so we should be fine, but I just wanted to confirm that opening the connection and dropping it without providing auth will not create an influx of those notifications.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment thread supervisor/homeassistant/frontend_check.py
Comment thread supervisor/homeassistant/frontend_check.py
- Wrap ws_connect in asyncio.wait_for so the handshake has an explicit
  bounded timeout (the global websession's default timeout would
  otherwise apply).
- Validate that the auth_required payload is a JSON object before
  calling .get("type"); a list/string would otherwise raise
  AttributeError at runtime.
- Add a regression test covering a non-dict JSON payload.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@agners
Member Author

agners commented May 6, 2026

> This looks good. My only concern is whether the WebSocket API check, as implemented, will cause Core to pop that "invalid auth" persistent notification. I'm not sure exactly what triggers it. I think it's from the HTTP API rather than the WS API, so we should be fine, but I just wanted to confirm that opening the connection and dropping it without providing auth will not create an influx of those notifications.

I've not seen such a persistent notification in my tests.

From looking at the code: in Core's websocket_api/http.py, when the client closes the WS connection during the auth phase (before sending an auth message), the server raises Disconnect("Received close message during auth phase") and does not call process_wrong_login. process_wrong_login only runs when an actual auth message with an invalid token is sent (auth.py:106). Our probe never sends auth, so no notification is created.

@agners agners merged commit ad1a911 into main May 6, 2026
21 checks passed
@agners agners deleted the improve-core-http-health-check branch May 6, 2026 08:54

Development

Successfully merging this pull request may close these issues.

Supervisor rollback not working when websocket_api integration isn't loading

3 participants