Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for #2248; busy event loop can cause events to fire while they shouldn't #2399

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

paulrutter
Copy link

This is a fix for #2248, as supplied by @thegedge earlier. See the original issue for a description of the issue and the proposed solution.

It's hard to add a unit test for this behavior, as it involves blocking the event loop for a certain amount of time.
Let me know if this is something that would be considered or if there are any concerns.

Checklist

  • I have read the documentation.
  • I have included a pull request description of my changes.
  • I have included some tests.

@paulrutter paulrutter changed the title Fix for #2248; Fix for #2248; busy event loop can cause events to fire while they shouldn't Jan 30, 2025
@paulrutter
Copy link
Author

I do see some failed tests, so probably this change is not entirely backwards compatible.
https://github.com/sindresorhus/got/actions/runs/13049814941/job/36407379410

source/core/timed-out.ts Outdated Show resolved Hide resolved
source/core/timed-out.ts Outdated Show resolved Hide resolved
request.destroy(new TimeoutError(delay, event));
// Use queueMicroTask to allow for any cancelled events to be handled first,
// to prevent firing any TimeoutError unneeded when the event loop is busy or blocked
queueMicrotask(() => {
Copy link

@thegedge thegedge Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think queueMicrotask will work.

CleanShot 2025-01-30 at 10 03 36@2x

source

Ignoring the fact that the above image isn't a perfect representation of what's happening (in Node, other runtimes may do something even more different), what we need to happen is to go around the outer loop. Since we're in the "timer callbacks" phase when we enter this handler, by queueing up another setTimeout we're going to do one full outer loop. This means we process I/O events (like connections!) and guarantee whether or not the connection truly was established.

queueMicrotask would run immediately after the timer callbacks are processed, but likely before I/O events are processed.

Copy link
Author

@paulrutter paulrutter Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'm happy to change it back though, but we still have a few failing unit tests along with either approach.

Copy link

@thegedge thegedge Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given we've had this patch running in production for a long time without any issues, I'm inclined to say it's just the tests that need some tweaking 🙂

I took a super quick look, and two things stand out to me:

  1. socket timeout appears to set a custom setTimeout on the stream, which may not play nicely with this.
  2. send timeout ticks a clock. Not sure how that works, but maybe we need two calls to tick to ensure the additional callback is called.

Copy link
Author

@paulrutter paulrutter Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also looked into the tests, and the point is that (even though it works) it does break backwards compatibility as before the TimeoutError would be thrown synchronously, and now it no longer is returned directly.

We can tune the tests so that it passes, but not sure if that would be acceptable or that we would need a different approach.

Copy link

@thegedge thegedge Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, fair point :)

Of course @sindresorhus can make the final call, I'd be happy to offer my two cents:

  1. The original handler was already part of a setTimeout, and hence "async". There were no guarantees that it would trigger the timeout exactly as specified.
  2. Doing another setTimeout does mean we extend the time for real connection timeouts a bit, since we have to go through the event loop one more time, but we were okay with that tradeoff (because it was sufficiently rare).
  3. If backwards compatibility is a must, adding configuration is always an option!
  4. We've been running this patch in production for almost 2 years now without any issues.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your input, appreciated.
I think it's up to @sindresorhus to make a fair judgement on this.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think anyone should have relied on it being sync. So I'm fine with you just adjusting the tests.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, i will look into this 👍 thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants