fix: shutdown block #3762
base: feat-multipath
Changes from all commits
```diff
@@ -979,6 +979,11 @@ impl Handle {
     #[instrument(skip_all)]
     pub(crate) async fn close(&self) {
         trace!(me = ?self.public_key, "magicsock closing...");

+        // Set closing flag BEFORE wait_idle() to prevent new net_report runs
+        // from creating QAD connections while we're draining existing ones.
+        self.msock.closing.store(true, Ordering::Relaxed);
+
         // Initiate closing all connections, and refuse future connections.
         self.endpoint.close(0u16.into(), b"");
```

**Collaborator (Author):** If we set this flag later, we still have in-flight connections, which prevents closing at all and makes `wait_idle()` choke.
```diff
@@ -1002,7 +1007,6 @@ impl Handle {
         if self.msock.is_closed() {
             return;
         }
-        self.msock.closing.store(true, Ordering::Relaxed);
         self.shutdown_token.cancel();

         // MutexGuard is not held across await points
```
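A minimal sketch of why the ordering matters, using toy types rather than iroh's actual `MagicSock`: a concurrent task consults the shared `closing` flag before starting new work, so the flag must be set before the drain begins, or the drain can race with freshly created connections.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

struct Sock {
    closing: AtomicBool,
}

impl Sock {
    fn close(&self) {
        // 1. Refuse new work first.
        self.closing.store(true, Ordering::Relaxed);
        // 2. Then drain: anything checking the flag from here on bails out
        //    instead of opening fresh connections mid-shutdown.
        self.drain();
    }

    fn drain(&self) {
        // Stand-in for `endpoint.close()` + `wait_idle()`.
    }

    fn maybe_start_net_report(&self) {
        if self.closing.load(Ordering::Relaxed) {
            return; // Shutdown in progress: don't create new connections.
        }
        // ... kick off a net_report run here ...
    }
}

fn main() {
    let sock = Sock {
        closing: AtomicBool::new(false),
    };
    sock.close();
    sock.maybe_start_net_report(); // no-op once close() has run
}
```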
```diff
@@ -494,6 +494,7 @@ impl TransportsSender {
         src: Option<IpAddr>,
         transmit: &Transmit<'_>,
     ) -> Poll<io::Result<()>> {
+        let mut found_transport = false;
         match dst {
             #[cfg(wasm_browser)]
             Addr::Ip(..) => {
```

**Contributor:** If I were to nitpick I'd make this …
```diff
@@ -503,6 +504,7 @@ impl TransportsSender {
             Addr::Ip(addr) => {
                 for sender in &mut self.ip {
                     if sender.is_valid_send_addr(addr) {
+                        found_transport = true;
                         match Pin::new(sender).poll_send(cx, *addr, src, transmit) {
                             Poll::Pending => {}
                             Poll::Ready(res) => {
```
```diff
@@ -519,6 +521,7 @@ impl TransportsSender {
             Addr::Relay(url, endpoint_id) => {
                 for sender in &mut self.relay {
                     if sender.is_valid_send_addr(url, endpoint_id) {
+                        found_transport = true;
                         match sender.poll_send(cx, url.clone(), *endpoint_id, transmit) {
                             Poll::Pending => {}
                             Poll::Ready(res) => {
```
```diff
@@ -533,7 +536,13 @@ impl TransportsSender {
                 }
             }
         }
-        Poll::Pending
+        if found_transport {
+            Poll::Pending
+        } else {
+            Poll::Ready(Err(io::Error::other(format!(
+                "no transport available for {dst:?}"
+            ))))
+        }
     }
 }
```

**Collaborator (Author):** Netsim continued to choke on this for the `--relay-only` tests because apparently when the other transport is disabled it gets stuck here pending forever.
**Contributor** (on lines +542 to +544): Is this a permanent error? IIRC in the past we've always decided to return `Poll::Pending` in this situation.

The quinn-udp philosophy is that only permanent errors should be errors. That is: this socket will never ever be usable again, won't ever be able to send a datagram, and needs to be re-bound to make any progress. I think that is not the case here: if you have one transport you can still send on that transport. So this should probably not be an error but rather a `Poll::Pending`.

It would still be helpful to log this, though. But if we log it naively we'd end up flooding the log. IIRC quinn-udp itself has a similar situation where they add a bit of state to the socket so that they only emit that log once every 60 seconds or so. Could we do the same? I'd vote debug-level logging for this.
**Contributor:** Yes, transports are not configurable at runtime (for now), so if we have no transport available that works we can error out. But I don't know why we would ever hit this in our current tests.
**Contributor:** I think I disagree. You could be sending a datagram for a disabled transport, which would result in an error. But later you could send a datagram to an enabled transport, and that should still succeed. So it's a transient error, I think.
**Contributor:** Transports can't be enabled or disabled currently. That could change in the future, but right now it is not possible.
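Boiled down, the control flow the thread is debating looks like this sketch (toy `Sender` type, not the real transports): `found_transport` distinguishes "a matching transport exists but is backpressured" (`Poll::Pending`) from "no transport can handle this destination at all" (the new error).

```rust
use std::io;
use std::task::Poll;

// Toy stand-in for a transport sender.
struct Sender {
    valid: bool, // can this sender handle the destination?
    ready: bool, // is it currently able to accept a datagram?
}

impl Sender {
    fn poll_send(&mut self) -> Poll<io::Result<()>> {
        if self.ready {
            Poll::Ready(Ok(()))
        } else {
            Poll::Pending
        }
    }
}

fn poll_any(senders: &mut [Sender]) -> Poll<io::Result<()>> {
    let mut found_transport = false;
    for sender in senders {
        if sender.valid {
            found_transport = true;
            if let Poll::Ready(res) = sender.poll_send() {
                return Poll::Ready(res);
            }
        }
    }
    if found_transport {
        // A matching transport exists but isn't ready: genuine backpressure.
        Poll::Pending
    } else {
        // No configured transport can handle this destination.
        Poll::Ready(Err(io::Error::other("no transport available")))
    }
}

fn main() {
    let mut senders = [Sender { valid: true, ready: false }];
    assert!(poll_any(&mut senders).is_pending()); // backpressure
    assert!(matches!(poll_any(&mut []), Poll::Ready(Err(_)))); // no transport
}
```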
These are needed because our default connection close time is a hair more than 3s (the default PTO is 333ms, the max PTO works out to 1.0x seconds, i.e. a bit over 1 second, and the close time is 3x that).
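A quick check of that arithmetic, assuming (as in QUIC's closing period) that the close time is 3x the PTO; the "a bit over 1 second" max PTO is taken from the comment above, not derived here:

```rust
use std::time::Duration;

fn main() {
    let default_pto = Duration::from_millis(333);
    // Hypothetical max PTO, "a bit over 1 second" per the comment above.
    let max_pto = Duration::from_millis(1_010);
    // QUIC-style closing period: three times the PTO.
    let close_time = 3 * max_pto;
    println!("default PTO: {default_pto:?}, close time: {close_time:?}");
    assert!(close_time > Duration::from_secs(3)); // "a hair more than 3s"
}
```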