feat(thread): enable rx_on_when_idle for Matter CASE session support #30
Call set_link_mode(rx_on_when_idle=true) after Thread stack initialization to ensure MTD devices can receive unsolicited UDP messages. This is required for Matter-over-Thread devices to successfully complete CASE session establishment after BLE commissioning, as the Matter controller sends CASE requests via Thread UDP which would otherwise be missed by sleeping MTD devices. Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
Point to lexfrei/openthread#feat/add-set-link-mode which adds the set_link_mode() method required for rx_on_when_idle support. Upstream PR: esp-rs/openthread#50 Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
Replace local vendor paths with GitHub fork branches:
- rs-matter-embassy: lexfrei/rs-matter-embassy#feat/enable-rx-on-when-idle
- openthread: lexfrei/openthread#feat/add-set-link-mode
PRs pending upstream:
- esp-rs/openthread#50
- sysgrok/rs-matter-embassy#30
Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
Looks reasonable! Still, could you wait a few days for my return? (AFK right now, should be back Dec 31 / Jan 01.) I wonder how that would work for the nrf driver, where rx_on_when_idle seems not to be supported?
Regarding the nrf driver question: this is my first experience with both ESP and Rust, so my understanding may be incomplete. Here's what I found:
Possible approaches:
Also, it probably makes sense to get esp-rs/openthread#50 merged first, so this PR doesn't depend on my fork. Happy to adjust based on your guidance when you're back.
During Matter commissioning, the SRP removal loop could block indefinitely waiting for records to be removed from the SRP server. This consumed the Fail-Safe timer (typically 120-180 seconds), causing CommissioningComplete to fail with ConstraintError. Add a 10-second timeout to the removal loop. If records aren't removed within this time, proceed anyway with a warning. This prevents the Fail-Safe timer from expiring during the mDNS registration phase. Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
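The bounded wait described in this commit can be sketched as below. This is a minimal, synchronous illustration using only the standard library; the actual code is async, and `wait_with_timeout`, its parameters, and the polling closure are hypothetical names, not part of rs-matter-embassy or the openthread crate.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `is_done` until it returns true or `timeout` elapses.
/// Returns true if the condition was met in time, false on timeout.
/// (Hypothetical helper; the real removal loop is async.)
fn wait_with_timeout<F: FnMut() -> bool>(
    mut is_done: F,
    timeout: Duration,
    poll_interval: Duration,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if is_done() {
            return true;
        }
        if Instant::now() >= deadline {
            // Give up so the caller can proceed (and log a warning)
            // instead of blocking while the Fail-Safe timer runs down.
            return false;
        }
        sleep(poll_interval);
    }
}
```

A caller would pass a closure that checks whether the SRP records are gone, with a 10-second timeout, and log a warning when the function returns false.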
The previous fix with timeout made things worse - after timeout the SRP client was still in "removing" state and rejected new configuration with OtError(13) INVALID_STATE. This change:
- Skip SRP removal entirely if already empty (fresh commissioning)
- Use immediate removal (true) instead of graceful (false)
- Remove the blocking wait loop entirely
Immediate removal doesn't wait for SRP server acknowledgment, which avoids consuming the Matter Fail-Safe timer during commissioning. Stale records on the server will be cleared on TTL expiry. Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
The set_link_mode call was happening before the device attached to the Thread network, so the setting was being ignored/reset during attach. Move the set_link_mode call to OtNetCtl::connect() right after role.is_connected() becomes true. This ensures rx_on_when_idle is properly enabled after the device has joined the network. Without this fix, the device acts as a Sleepy End Device and cannot receive unsolicited messages like CASE session establishment from the commissioner, causing Apple Home commissioning to fail. Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
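The ordering issue described in this commit can be illustrated with a small mock. Everything here (`MockThread`, `Role`, `LinkMode`, `connect`) is a hypothetical stand-in, not the openthread crate's API. The mock assumes, as the commit message states, that attaching resets the link mode; the root cause of that behavior is not established in this PR.

```rust
// Hypothetical mock types for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Role {
    Detached,
    Child,
}

impl Role {
    fn is_connected(self) -> bool {
        self != Role::Detached
    }
}

#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct LinkMode {
    rx_on_when_idle: bool,
    device_type: bool,
    network_data: bool,
}

struct MockThread {
    role: Role,
    mode: LinkMode,
}

impl MockThread {
    fn new() -> Self {
        MockThread { role: Role::Detached, mode: LinkMode::default() }
    }

    fn set_link_mode(
        &mut self,
        rx_on_when_idle: bool,
        device_type: bool,
        network_data: bool,
    ) -> Result<(), &'static str> {
        self.mode = LinkMode { rx_on_when_idle, device_type, network_data };
        Ok(())
    }

    /// Assumption taken from the commit message: attaching discards any
    /// link mode that was set earlier.
    fn attach(&mut self) {
        self.mode = LinkMode::default();
        self.role = Role::Child;
    }
}

/// Sets the link mode only once the device reports a connected role.
fn connect(ot: &mut MockThread) -> Result<(), &'static str> {
    ot.attach();
    if !ot.role.is_connected() {
        return Err("not attached");
    }
    ot.set_link_mode(true, false, false)
}
```

In this model, calling `set_link_mode` before `attach` leaves `rx_on_when_idle` false afterwards, while `connect` leaves it true.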
Add periodic SRP status logging to help debug commissioning issues. Logs SRP running state and server address every 5 seconds. Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
Log number of registered vs total services to debug commissioning issues. Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
Reduce interval from 5s to 1s for better visibility before Thread detach. Add tick counter and detailed state counts. Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
The immediate removal was being called on every loop iteration, which removed services right after adding them. This caused services to stay stuck in 'Adding' state forever. Move the cleanup to run BEFORE the loop starts, so it only cleans up stale records from previous device runs. Co-Authored-By: Claude <[email protected]> Signed-off-by: Aleksei Sviridkin <[email protected]>
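The difference between cleaning up on every iteration and cleaning up once before the loop can be sketched with a toy model. `MockSrpClient`, `ServiceState`, `cleanup_stale`, and `run` are hypothetical names, and the model assumes the server acknowledges a service one loop iteration after it is added; the real SRP client behaves asynchronously.

```rust
// Hypothetical mock, not the openthread crate's SRP API.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ServiceState {
    Adding,
    Registered,
}

struct MockSrpClient {
    services: Vec<(String, ServiceState)>,
}

impl MockSrpClient {
    /// Starts with one record left over from a previous device run.
    fn with_stale_record() -> Self {
        MockSrpClient {
            services: vec![("stale".to_string(), ServiceState::Registered)],
        }
    }

    fn add(&mut self, name: &str) {
        self.services.push((name.to_string(), ServiceState::Adding));
    }

    /// Immediate removal: drop local records without waiting for the server.
    fn remove_all_immediate(&mut self) {
        self.services.clear();
    }

    /// Simulated server acknowledgment of everything currently pending.
    fn poll_server(&mut self) {
        for service in self.services.iter_mut() {
            service.1 = ServiceState::Registered;
        }
    }

    fn registered(&self) -> usize {
        self.services
            .iter()
            .filter(|s| s.1 == ServiceState::Registered)
            .count()
    }
}

/// Skips removal when there is nothing to remove (fresh commissioning).
fn cleanup_stale(srp: &mut MockSrpClient) {
    if srp.services.is_empty() {
        return;
    }
    srp.remove_all_immediate();
}

/// Runs the registration loop and returns the number of registered services.
fn run(srp: &mut MockSrpClient, iterations: usize, cleanup_each_iteration: bool) -> usize {
    if !cleanup_each_iteration {
        cleanup_stale(srp); // fixed variant: clean up once, before the loop
    }
    for _ in 0..iterations {
        if cleanup_each_iteration {
            cleanup_stale(srp); // buggy variant: wipes the service added last time
        }
        srp.poll_server();
        if srp.services.is_empty() {
            srp.add("matter-service");
        }
    }
    srp.registered()
}
```

With per-iteration cleanup the service is removed before the simulated acknowledgment arrives, so it never leaves the `Adding` state; with a single cleanup before the loop it becomes `Registered` on the second iteration.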
@lexfrei This also fails with
select(self.0.wait_changed(), Timer::after(Duration::from_secs(1))).await;
}

// Enable rx_on_when_idle AFTER Thread attach so device can receive
"Must be called after the device has joined the network, not before".
Why is that? Do we know the root cause reason?
ot.set_link_mode(true, false, false)
    .map_err(to_matter_err)?;

// SRP diagnostic task - logs status every 1 second for debugging
Having these diagnostics doesn't hurt, but could you move them to a separate async function, log_srp_state or similar?
/// Run the `OtMdns` instance by listening to the mDNS services and registering them with the SRP server
pub async fn run(&self, matter: &Matter<'_>) -> Result<(), OtError> {
    // On first iteration only: clean up any stale SRP records from previous runs.
I have concerns with this.
To my understanding, you can't just forget your own SRP record because then you can't re-register yourself with the same SRP server anymore, under the same fqdn.
You really need to first remove your registration from the SRP server, and then re-register. Also, and if I remember correctly, when you claim a fqdn, you receive some sort of key from the SRP server, and the key and the fqdn are all saved (or should be saved) in the persistent storage of OpenThread => the persistent storage of rs-matter, so that they survive device reboot.
But perhaps I'm missing something, if you could elaborate?
Summary
- Call set_link_mode(rx_on_when_idle=true, device_type=false, network_data=false) after Thread stack initialization
- Affected types: ThreadDriverTaskImpl and ThreadCoexDriverTaskImpl
Motivation
MTD (Minimal Thread Device) devices have rx_on_when_idle=false by default, causing them to miss incoming UDP packets when idle. This breaks Matter-over-Thread: after BLE commissioning, the Matter controller sends CASE requests via Thread UDP, which a sleeping MTD device would otherwise miss.
Setting rx_on_when_idle=true keeps the radio receiver active, allowing the device to respond to CASE requests.
Dependencies
Requires the set_link_mode() method in the openthread crate: esp-rs/openthread#50
Test plan