-
Notifications
You must be signed in to change notification settings - Fork 327
fix: shutdown block #3762
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feat-multipath
Are you sure you want to change the base?
fix: shutdown block #3762
Conversation
|
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3762/docs/iroh/ Last updated: 2025-12-10T21:59:01Z |
| // We sent the last message, so wait for the client to close the connection once | ||
| // it received this message. | ||
| let res = tokio::time::timeout(Duration::from_secs(3), async move { | ||
| let res = tokio::time::timeout(Duration::from_secs(4), async move { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are needed because our default connection close time is a hair more than 3s (default PTO 333ms, with max calculating to 1.0x seconds and close time is 3x that)
| // Set closing flag BEFORE wait_idle() to prevent new net_report runs | ||
| // from creating QAD connections while we're draining existing ones. | ||
| self.msock.closing.store(true, Ordering::Relaxed); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have in flight conns if we close later which prevents closing at all and makes wait idle choke.
| } | ||
| } | ||
| Poll::Pending | ||
| if found_transport { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Netsim continued to choke on this for the --relay-only tests because apparently when the other transport is disabled it gets stuck here pending forever.
|
Welp, less aggressive path filtering for PTO calcs seems to be not great. |
| Poll::Ready(Err(io::Error::other(format!( | ||
| "no transport available for {dst:?}" | ||
| )))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a permanent error? IIRC in the past we've always decided to return Poll::Ready(Ok(())) and let the the QUIC stack use loss detection to figure out this is lost.
The quinn-udp philosophy is that only permanent errors should be errors. That is: this socket will never ever be usable again and won't ever be able to send a datagram and needs to be re-bound to get any progress.
And I think that this is not the case here: if you have one transport you can still send on that transport. So I think this should probably not be an error but rather a Poll::Ready(Ok(())).
It would still be helpful to log this though. But if we naively log this we'd end up with flooding the log. IIRC quinn-udp itself has a similar situation where they add a bit of state to the socket so that they only emit that log once every 60 seconds or something like that. Could we do the same? I'd vote debug-level logging for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a permanent error?
yes, transports are not configurable at runtime (for now) so if we have no transport available that works can error out
buut, I don't know why we would ever hit this in our current tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I disagree. You could be sending a datagram for a disabled transport which would result in an error. But later you could send a datagram to an enabled transport, and that should still succeed. So it's a transient error I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transports can't be enabled or disabled currently, this could be the case in the future, but right now that is not possible
| src: Option<IpAddr>, | ||
| transmit: &Transmit<'_>, | ||
| ) -> Poll<io::Result<()>> { | ||
| let mut found_transport = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I were to nitpick I'd make this let mut fallback_ret = Poll::Ready(Ok(())); and let each transport set this to pending when they hit a pending. But there are many ways to do this and it's subjective and maybe even a nicer way exists.
Description
Details bellow as PR comments.
Depends on n0-computer/quinn#239
The quinn PR is not in an ideal state and needs more input.
Breaking Changes
Notes & open questions
Change checklist
quic-rpciroh-gossipiroh-blobsdumbpipesendme