Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

context.WithTimeout() yields on Done() too early with no Err() set. #1658

Open
dikderoy opened this issue Sep 4, 2024 · 0 comments
Open

Comments

@dikderoy
Copy link

dikderoy commented Sep 4, 2024

The following program sample.go triggers an unexpected result

//this is except from actual code from the plugin

package traefikplugin

import (
  "net/rpc"
  ...
)

// call is called once per request in the middleware
func call() err {
  ...
  reqStart := time.Now()
  reqCtx, cancel := context.WithTimeout(context.Background(), 30*time.Milliseconds)
  defer cancel()

  done := client.Go(method, args, reply, make(chan *rpc.Call, 1)).Done
  // wait for network or timeout
  select {
    case response := <-done:
	return response.Error
    case <-reqCtx.Done():
	return fmt.Errorf(
          "rpc: call timed out: %w, net_call/duration=%s",
          reqCtx.Err(),
          time.Since(reqStart),
        )
  }
}

Expected result

rpc: call timed out: context deadline exceeded, net_call/duration=30.209943ms

Got

rpc: call timed out: %!w(<nil>), net_call/duration=1.429943ms

Yaegi Version

v0.16.1

Additional Notes

It should handle timeouts correctly, and not yield prematurely

In Traefik context, on the logs I can observe messages like this from my custom written plugin:

rpc: call timed out: %!w(<nil>), net_call/duration=1.429943ms

which effectively means that context has yielded on reqCtx.Done() chan before the timeout was reached and the reqCtx.Err() is nil.

This occurs under relatively high load instance, with ~70hit/s on particular plugin utilizing the code and global load on same Traefik instance is ~400 hit/s.

The issue is occurring for ~0.08% (out of 70hits/s on the plugin) of requests handled.

so probably it is some kind of low level race condition in yaegi.

Sorry I haven't provided the fully functional code for reproducing the bug.

If you ultimately need it, I can try to create a minimal reproducible example a bit later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant