
[Flaky test] profiler: TestExecutionTraceRandom flake #2529

Closed
katiehockman opened this issue Jan 23, 2024 · 2 comments · Fixed by #2651

Comments

@katiehockman
Contributor

https://github.com/DataDog/dd-trace-go/actions/runs/7617175486/job/20745637927

Seeing occasional flakes in CI for TestExecutionTraceRandom/rate-0.066667 in the profiler. An example failure:

=== Failed
=== FAIL: profiler TestExecutionTraceRandom/rate-0.066667 (2.16s)
    profiler_test.go:573: observed 15 traces, outside the desired bound of [1, 11]
    profiler_test.go:573: observed 12 traces, outside the desired bound of [1, 11]
    profiler_test.go:596: failed after retry
    --- FAIL: TestExecutionTraceRandom/rate-0.066667 (2.16s)

=== FAIL: profiler TestExecutionTraceRandom (4.42s)

cc @nsrip-dd @felixge

@nsrip-dd
Contributor

nsrip-dd commented Jan 23, 2024

Thanks! I'll dig into this. This is an inherently random test, and I had hoped to make it quite unlikely to fail when the implementation is correct. But if it's failing too frequently, I'll see if I can make it better.

For reference, looking at our CI app, I see that this has failed a single-digit number of times out of ~520 runs since the test was introduced:

[screenshot: CI failure counts for TestExecutionTraceRandom]

@katiehockman
Contributor Author

This seems to be happening more often lately. I saw another of these flakes today: https://github.com/DataDog/dd-trace-go/actions/runs/7672345551/job/20912652647?pr=2521

nsrip-dd added a commit that referenced this issue Jan 29, 2024
This test is failing more often than we'd like. Skip it for now until
we can pick more reasonable bounds (or find a better way to test this
in general).

Updates #2529
nsrip-dd added a commit that referenced this issue Apr 1, 2024
To de-flake TestExecutionTraceRandom, provide a fixed-seed random number
generator so that the results are deterministic. This is done through a
non-exported profiler option so it's easy to provide in specific test
cases (only one so far). Developers should remove this option while
working on anything that might rely on real randomness, verify that it
works as intended, and then add the option back to get reliable tests in
CI.

Fixes #2529
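
For context, here is a minimal sketch of the fixed-seed pattern that commit describes, assuming a hypothetical non-exported option; the names below (withRandomSource, the config field) are illustrative, not the actual dd-trace-go internals:

    // Illustrative only: a non-exported option that injects a deterministic
    // random source, so tests that depend on random sampling are repeatable.
    package profiler

    import "math/rand"

    type config struct {
        // traceRand decides whether to record an execution trace for a
        // given profiling cycle. Production code seeds it from entropy;
        // tests can inject a fixed seed to make the results deterministic.
        traceRand *rand.Rand
    }

    // withRandomSource is not exported, so only tests in this package
    // can supply a fixed seed.
    func withRandomSource(seed int64) func(*config) {
        return func(c *config) {
            c.traceRand = rand.New(rand.NewSource(seed))
        }
    }

Because the option never leaves the package, production behavior keeps real randomness while a specific test case can opt into determinism.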
nsrip-dd added a commit that referenced this issue Apr 8, 2024
Due to some sloppy math, we were getting a ~1/500 failure rate for
TestExecutionTraceRandom, which was often enough to be irritating to
dd-trace-go developers. Each trial has a 95% success rate given a
correct implementation, and we were doing 2 trials. The comment in the
test incorrectly states that 2 trials should have a 99.999% success
rate. In fact, we should expect a ~99.75% success rate for 2 trials,
or a ~1/400 failure rate, roughly matching what we saw.

Increase the number of trials to 4. This actually gives the desired
99.999% success rate: we should expect roughly 1 failure for every
160000 runs. This is a tolerable failure rate, and it lets the test
remain somewhat robust rather than using a fixed seed as considered in
#2642. I have manually tested this by breaking the implementation
(multiplying by an extra rand.Float64() draw) and confirmed that the
test still fails reliably.

Fixes #2529
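
As a sanity check on the arithmetic above (assuming independent trials, where the test only fails if every trial fails and each trial has the quoted 5% failure chance with a correct implementation), a short Go snippet reproduces the stated rates:

    // The overall failure rate is 0.05^trials, since failure requires
    // every trial to fail.
    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        const perTrialFailure = 0.05
        for _, trials := range []int{2, 4} {
            p := math.Pow(perTrialFailure, float64(trials))
            fmt.Printf("%d trials: failure rate %.6f (~1 in %.0f runs)\n",
                trials, p, 1/p)
        }
        // Output:
        // 2 trials: failure rate 0.002500 (~1 in 400 runs)
        // 4 trials: failure rate 0.000006 (~1 in 160000 runs)
    }
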
nsrip-dd added a commit that referenced this issue Apr 11, 2024
…als (#2651)
