
[Flaky test] profiler: TestExecutionTraceRandom flake #2529

Closed
katiehockman opened this issue Jan 23, 2024 · 2 comments · Fixed by #2651

Comments

@katiehockman
Contributor

https://github.com/DataDog/dd-trace-go/actions/runs/7617175486/job/20745637927

Seeing occasional flakes in CI for TestExecutionTraceRandom/rate-0.066667 in the profiler. An example failure:

=== Failed
=== FAIL: profiler TestExecutionTraceRandom/rate-0.066667 (2.16s)
    profiler_test.go:573: observed 15 traces, outside the desired bound of [1, 11]
    profiler_test.go:573: observed 12 traces, outside the desired bound of [1, 11]
    profiler_test.go:596: failed after retry
    --- FAIL: TestExecutionTraceRandom/rate-0.066667 (2.16s)

=== FAIL: profiler TestExecutionTraceRandom (4.42s)

cc @nsrip-dd @felixge

@nsrip-dd
Contributor

nsrip-dd commented Jan 23, 2024

Thanks! I'll dig into this. This is an inherently random test, and I had hoped to make it quite unlikely to fail when the implementation is correct. But if it's failing too frequently, I'll see if I can make it better.

For reference, looking at our CI app, I see that this has failed a single-digit number of times out of ~520 runs since the test was introduced:

[screenshot: CI failure counts for TestExecutionTraceRandom]

@katiehockman
Contributor Author

This seems to be happening more often lately. I saw another of these flakes today: https://github.com/DataDog/dd-trace-go/actions/runs/7672345551/job/20912652647?pr=2521

nsrip-dd added a commit that referenced this issue Jan 29, 2024
This test is failing more often than we'd like. Skip it for now until
we can pick more reasonable bounds (or find a better way to test this
in general).

Updates #2529
nsrip-dd added a commit that referenced this issue Apr 1, 2024
To de-flake TestExecutionTraceRandom, provide a fixed-seed random number
generator so that the results are deterministic. This is done through a
non-exported profiler option so it's easy to provide in specific test
cases (only one so far). Developers should remove this option while
working on anything that might rely on real randomness, verify that it
works as intended, and then add the option back to get reliable tests in
CI.

Fixes #2529
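
For context, here is a minimal sketch of the fixed-seed pattern that commit describes, assuming a hypothetical non-exported option; the names below (withRandomSource, the config field) are illustrative, not the actual dd-trace-go internals:

    // Illustrative only: a non-exported option that injects a deterministic
    // random source, so tests that depend on random sampling are repeatable.
    package profiler

    import "math/rand"

    type config struct {
        // traceRand decides whether to record an execution trace for a
        // given profiling cycle. Production code seeds it from entropy;
        // tests can inject a fixed seed to make the results deterministic.
        traceRand *rand.Rand
    }

    // withRandomSource is not exported, so only tests in this package
    // can supply a fixed seed.
    func withRandomSource(seed int64) func(*config) {
        return func(c *config) {
            c.traceRand = rand.New(rand.NewSource(seed))
        }
    }

Because the option never leaves the package, production behavior keeps real randomness while a specific test case can opt into determinism.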
nsrip-dd added a commit that referenced this issue Apr 8, 2024
Due to some sloppy math, we were getting a ~1/500 failure rate for
TestExecutionTraceRandom, which was often enough to be irritating to
dd-trace-go developers. Each trial has a 95% success rate given a
correct implementation, and we were doing 2 trials. The comment in the
test incorrectly states that 2 trials should have a 99.999% success
rate. In fact, we should expect a ~99.75% success rate for 2 trials,
or a ~1/400 failure rate, roughly matching what we saw.

Increase the number of trials to 4. This actually gives the desired
99.999% success rate: we should expect roughly 1 failure for every
160000 runs. This is a tolerable failure rate, and it lets the test
remain somewhat robust rather than using a fixed seed as considered in
#2642. I have manually tested this by breaking the implementation
(multiplying by an extra rand.Float64() draw) and confirmed that the
test still fails reliably.

Fixes #2529
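
As a sanity check on the arithmetic above (assuming independent trials, where the test only fails if every trial fails and each trial has the quoted 5% failure chance with a correct implementation), a short Go snippet reproduces the stated rates:

    // The overall failure rate is 0.05^trials, since failure requires
    // every trial to fail.
    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        const perTrialFailure = 0.05
        for _, trials := range []int{2, 4} {
            p := math.Pow(perTrialFailure, float64(trials))
            fmt.Printf("%d trials: failure rate %.6f (~1 in %.0f runs)\n",
                trials, p, 1/p)
        }
        // Output:
        // 2 trials: failure rate 0.002500 (~1 in 400 runs)
        // 4 trials: failure rate 0.000006 (~1 in 160000 runs)
    }
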
nsrip-dd added a commit that referenced this issue Apr 11, 2024
…als (#2651)
