fix(backend/tty): EDID race recovery with DrmScanEvent::Changed + rescan timer#3
fix(backend/tty): EDID race recovery with DrmScanEvent::Changed + rescan timer#3coleleavitt wants to merge 2 commits intomainfrom
Conversation
…otplug USB-C docks with DP MST/alt-mode may report connectors as Connected before EDID data is available, causing pick_mode() to return None and connector_connected() to skip activation. Smithay's ConnectorScanner does not re-emit events for already-connected connectors, leaving the output in a permanent dead state. Add a bounded retry mechanism: after device_changed() processes connectors, schedule_rescan_if_needed() checks for connected connectors that have no matching surface (not yet activated). If found, it schedules a calloop timer (2 s delay) that re-runs device_changed(), giving the kernel time to complete EDID reads. The retry is capped at 3 attempts and self-clears when all connectors are activated or on device removal.
📝 WalkthroughWalkthroughPinned workspace Git dependencies for Changes
Sequence Diagram(s)sequenceDiagram
participant Device as Device Manager
participant Connector as Connector State
participant Timer as Timer System
participant Handler as device_changed()
Device->>Connector: create/update output device
Connector->>Connector: detect connected but without active surfaces
alt Rescan Needed
Connector->>Timer: schedule timer (RESCAN_DELAY) / increment retry_count
end
Timer-->>Handler: timer fired
Handler->>Connector: re-evaluate connector state (device_changed)
Connector->>Connector: attempt activation / register changed connector
alt Still unactivated && retries remain
Connector->>Timer: reschedule timer (increment retry_count)
else Max retries reached or activated
Connector->>Timer: cancel timer / clear rescan state
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
679a92b to
1691d9f
Compare
There was a problem hiding this comment.
1 issue found across 3 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/backend/tty.rs">
<violation number="1" location="src/backend/tty.rs:1078">
P1: Missing duplicate make/model/serial check for `Changed` connectors. The `added` handler has a dedup guard because "our Layout does not support such duplicates and will panic," but this new `changed` registration path skips it entirely. If a `Changed` connector has the same make/model/serial as an existing one, this could trigger a panic in Layout. Apply the same dedup logic used in the `added` loop.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Update smithay to rev 9219bf8a9 which includes the new DrmScanEvent::Changed variant (smithay PR niri-wm#1923). When a connector's mode list changes while it stays connected (e.g. EDID arrives after the initial probe returned empty modes), register the crtc in known_crtcs so on_output_config_changed() can connect it. This complements the existing rescan timer by providing immediate detection when scan_connectors() runs and the connector's modes have actually changed, rather than relying solely on retries.
1691d9f to
4a3e4ee
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/backend/tty.rs`:
- Around line 1126-1156: The MAX_RESCAN_RETRIES guard in
schedule_rescan_if_needed currently runs before checking has_unactivated, which
prevents resetting device.rescan_retry_count when connectors become deactivated;
change the control flow in schedule_rescan_if_needed (the function in tty.rs) so
you evaluate has_unactivated first, and if false clear device.rescan_retry_count
and remove any device.rescan_timer_token via niri.event_loop.remove(token), then
only apply the MAX_RESCAN_RETRIES check when has_unactivated is true (or move
the max-retry block after the has_unactivated branch) to avoid permanently
disabling future rescans.
| self.schedule_rescan_if_needed(node, niri); | ||
| } | ||
|
|
||
| fn schedule_rescan_if_needed(&mut self, node: DrmNode, niri: &mut Niri) { | ||
| let Some(device) = self.devices.get_mut(&node) else { | ||
| return; | ||
| }; | ||
|
|
||
| if device.rescan_retry_count >= MAX_RESCAN_RETRIES { | ||
| debug!( | ||
| "reached max rescan retries ({MAX_RESCAN_RETRIES}) for {node}, \ | ||
| not scheduling another" | ||
| ); | ||
| return; | ||
| } | ||
|
|
||
| let has_unactivated = device.drm_scanner.crtcs().any(|(conn, crtc)| { | ||
| conn.state() == connector::State::Connected | ||
| && !device.surfaces.contains_key(&crtc) | ||
| && !device | ||
| .non_desktop_connectors | ||
| .contains(&(conn.handle(), crtc)) | ||
| }); | ||
|
|
||
| if !has_unactivated { | ||
| device.rescan_retry_count = 0; | ||
| if let Some(token) = device.rescan_timer_token.take() { | ||
| niri.event_loop.remove(token); | ||
| } | ||
| return; | ||
| } |
There was a problem hiding this comment.
Reset retry state even after hitting MAX to avoid permanently disabling rescans.
Because the max-retry guard runs before the has_unactivated check, the retry counter never resets when all unactivated connectors disappear. That can block future hotplugs on the same device from ever scheduling rescans.
🛠️ Suggested fix (reorder checks to allow reset)
- if device.rescan_retry_count >= MAX_RESCAN_RETRIES {
- debug!(
- "reached max rescan retries ({MAX_RESCAN_RETRIES}) for {node}, \
- not scheduling another"
- );
- return;
- }
-
- let has_unactivated = device.drm_scanner.crtcs().any(|(conn, crtc)| {
+ let has_unactivated = device.drm_scanner.crtcs().any(|(conn, crtc)| {
conn.state() == connector::State::Connected
&& !device.surfaces.contains_key(&crtc)
&& !device
.non_desktop_connectors
.contains(&(conn.handle(), crtc))
});
if !has_unactivated {
device.rescan_retry_count = 0;
if let Some(token) = device.rescan_timer_token.take() {
niri.event_loop.remove(token);
}
return;
}
+
+ if device.rescan_retry_count >= MAX_RESCAN_RETRIES {
+ debug!(
+ "reached max rescan retries ({MAX_RESCAN_RETRIES}) for {node}, \
+ not scheduling another"
+ );
+ return;
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/backend/tty.rs` around lines 1126 - 1156, The MAX_RESCAN_RETRIES guard in
schedule_rescan_if_needed currently runs before checking has_unactivated, which
prevents resetting device.rescan_retry_count when connectors become deactivated;
change the control flow in schedule_rescan_if_needed (the function in tty.rs) so
you evaluate has_unactivated first, and if false clear device.rescan_retry_count
and remove any device.rescan_timer_token via niri.event_loop.remove(token), then
only apply the MAX_RESCAN_RETRIES check when has_unactivated is true (or move
the max-retry block after the has_unactivated branch) to avoid permanently
disabling future rescans.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
Cargo.toml (1)
34-34: Consider using the full 40-character commit hash for reproducibility.The partial hash
9219bf8a9resolves to9219bf8a9b79f528a834c1083630e590a9b1a04eand currently works without ambiguity. However, using the full hash guards against potential collisions as the repository grows and makes the lockfile more explicit.Also applies to: 41-41
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Cargo.toml` at line 34, Replace the shortened git revision string used in the dependency pinning (the rev = "9219bf8a9" entries) with the full 40-character commit hash to ensure reproducibility; update both occurrences of the rev value (look for rev = "9219bf8a9" in Cargo.toml) to rev = "9219bf8a9b79f528a834c1083630e590a9b1a04e".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/backend/tty.rs`:
- Around line 1156-1178: Move the unactivated-connector check before the
MAX_RESCAN_RETRIES guard: first compute has_unactivated via
device.drm_scanner.crtcs() (the existing closure) and if !has_unactivated reset
device.rescan_retry_count = 0 and clear/remove any device.rescan_timer_token
with niri.event_loop.remove(token) (as currently done), then return; only after
that check enforce the if device.rescan_retry_count >= MAX_RESCAN_RETRIES early
return so reaching MAX no longer blocks the reset when connectors become
activated by other code paths.
---
Nitpick comments:
In `@Cargo.toml`:
- Line 34: Replace the shortened git revision string used in the dependency
pinning (the rev = "9219bf8a9" entries) with the full 40-character commit hash
to ensure reproducibility; update both occurrences of the rev value (look for
rev = "9219bf8a9" in Cargo.toml) to rev =
"9219bf8a9b79f528a834c1083630e590a9b1a04e".
Summary
DrmScanEvent::Changedfrom smithay PR Implement release keybinds and mod-only binds niri-wm/niri#1923 for immediate EDID arrival detection9219bf8a9(includes Changed variant)Upstream PR: niri-wm#3409
Summary by cubic
Fixes outputs that stay inactive after hotplug when EDID arrives late. Handles DrmScanEvent::Changed and adds a bounded rescan timer so connected connectors activate as soon as modes appear.
Bug Fixes
Dependencies
Written for commit 4a3e4ee. Summary will update on new commits.
Summary by CodeRabbit
Bug Fixes
Chores