Conversation

@trask (Member) commented Sep 23, 2025

Investigating implementation options for open-telemetry/opentelemetry-specification#4645

TL;DR: I couldn't conjure up an implementation that satisfies "eventually visible" and is faster than just using volatile (immediately visible).
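For context, here is a minimal sketch of what the compared variants might look like; the class names follow the benchmark configurations, but the actual PR code may differ:

```java
// Hypothetical sketches of the compared variants (not the actual PR code).

// Plain field: fastest reads, but an update may never become visible to a
// thread that has cached the value.
class NonVolatileBooleanState {
  private boolean enabled = true;

  boolean get() { return enabled; }

  void set(boolean value) { enabled = value; }
}

// Volatile field: every read goes to memory, updates are immediately visible.
class ImmediateBooleanState {
  private volatile boolean enabled = true;

  boolean get() { return enabled; }

  void set(boolean value) { enabled = value; }
}

// "Eventual" variant: serves reads from a plain cached copy and only re-reads
// the volatile source every REFRESH_INTERVAL accesses, tracked with a plain
// (non-volatile) counter.
class EventualBooleanState {
  private static final int REFRESH_INTERVAL = 100;

  private volatile boolean source = true;
  private boolean cached = true;
  private int accessCount;

  boolean get() {
    if (++accessCount >= REFRESH_INTERVAL) {
      accessCount = 0;
      cached = source; // volatile read refreshes the cached copy
    }
    return cached;
  }

  void set(boolean value) { source = value; }
}
```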

**Java 17 Results** (`BooleanStateBenchmark`)

| Benchmark | Implementation | Score | Units |
|---|---|---|---|
| read_singleThread | NonVolatileBooleanState | 1.205 ± 0.002 | ns/op |
| read_singleThread | ImmediateBooleanState | 74.078 ± 0.196 | ns/op |
| read_singleThread | EventualBooleanState | 272.405 ± 3.691 | ns/op |
| read_singleThread | VarHandleImmediateBooleanState | 80.017 ± 1.464 | ns/op |
| read_singleThread | VarHandleEventualBooleanState | 269.313 ± 0.210 | ns/op |
| read_twoThreads | NonVolatileBooleanState | 1.218 ± 0.014 | ns/op |
| read_twoThreads | ImmediateBooleanState | 73.995 ± 0.061 | ns/op |
| read_twoThreads | EventualBooleanState | 799.889 ± 2.235 | ns/op |
| read_twoThreads | VarHandleImmediateBooleanState | 79.455 ± 0.688 | ns/op |
| read_twoThreads | VarHandleEventualBooleanState | 864.939 ± 173.696 | ns/op |

**Java 24 Results** (`BooleanStateBenchmark`)

| Benchmark | Implementation | Score | Units |
|---|---|---|---|
| read_singleThread | NonVolatileBooleanState | 1.217 ± 0.024 | ns/op |
| read_singleThread | ImmediateBooleanState | 59.877 ± 0.072 | ns/op |
| read_singleThread | EventualBooleanState | 270.300 ± 1.612 | ns/op |
| read_singleThread | VarHandleImmediateBooleanState | 79.709 ± 0.160 | ns/op |
| read_singleThread | VarHandleEventualBooleanState | 269.310 ± 0.271 | ns/op |
| read_twoThreads | NonVolatileBooleanState | 1.213 ± 0.014 | ns/op |
| read_twoThreads | ImmediateBooleanState | 60.002 ± 0.057 | ns/op |
| read_twoThreads | EventualBooleanState | 826.375 ± 24.897 | ns/op |
| read_twoThreads | VarHandleImmediateBooleanState | 81.311 ± 2.690 | ns/op |
| read_twoThreads | VarHandleEventualBooleanState | 792.749 ± 19.175 | ns/op |
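The `VarHandle` rows presumably differ from the plain-field variants only in access mode. A hypothetical sketch of the volatile-mode variant (the actual benchmark code may differ; an "eventual" counterpart would presumably use a weaker mode such as `getOpaque`/`getAcquire` plus the counter trick above):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical sketch, not the actual PR code.
class VarHandleImmediateBooleanState {
  private static final VarHandle ENABLED;

  static {
    try {
      ENABLED = MethodHandles.lookup()
          .findVarHandle(VarHandleImmediateBooleanState.class, "enabled", boolean.class);
    } catch (ReflectiveOperationException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  @SuppressWarnings("unused") // accessed only through the VarHandle
  private boolean enabled = true;

  boolean get() {
    return (boolean) ENABLED.getVolatile(this); // volatile-mode read
  }

  void set(boolean value) {
    ENABLED.setVolatile(this, value); // volatile-mode write
  }
}
```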

codecov bot commented Sep 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.13%. Comparing base (1e763b2) to head (dd28501).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
```diff
@@            Coverage Diff            @@
##               main    #7700   +/-   ##
=========================================
  Coverage     90.12%   90.13%           
- Complexity     7187     7192    +5     
=========================================
  Files           814      814           
  Lines         21700    21713   +13     
  Branches       2123     2127    +4     
=========================================
+ Hits          19557    19570   +13     
  Misses         1477     1477           
  Partials        666      666           
```

☔ View full report in Codecov by Sentry.
@trask force-pushed the eventually-visible-benchmark branch 4 times, most recently from b2204c8 to 2b1b386 on September 24, 2025 19:23
@trask marked this pull request as ready for review on September 24, 2025 19:40
@trask requested a review from a team as a code owner on September 24, 2025 19:40
@trask force-pushed the eventually-visible-benchmark branch from 2b1b386 to 78c668d on September 24, 2025 19:43
@trask force-pushed the eventually-visible-benchmark branch from 78c668d to e5706e0 on September 24, 2025 19:44
@jkwatson (Contributor)

So, non-volatile is the fastest, but we know it may not satisfy "eventually visible", correct? Did your benchmark measure "time to visibility" across multiple threads, or just throughput with different implementations?

@trask (Member, Author) commented Sep 24, 2025

> Did your benchmark measure "time to visibility" across multiple threads

The "eventual visibility" implementations rely on a non-volatile access counter to be "eventual", though that's also what destroys the performance.

@jkwatson (Contributor)

> > Did your benchmark measure "time to visibility" across multiple threads
>
> The "eventual visibility" implementations rely on a non-volatile access counter to be "eventual", though that's also what destroys the performance.

I guess I'm asking what the "score" represents in the benchmarks... time to visibility?

@trask (Member, Author) commented Sep 24, 2025

> I guess I'm asking what the "score" represents in the benchmarks... time to visibility?

Ah, it's the number of nanoseconds to perform 100 boolean reads on the same thread.

This is where non-volatile shines, because the JIT compiler can optimize it down to a single memory read.
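In other words, the benchmark presumably has roughly this shape (a hypothetical sketch reusing the `NonVolatileBooleanState` class from above; the real benchmark code may differ):

```java
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

// Hypothetical shape of the read benchmark, not the actual PR code.
@State(Scope.Benchmark)
public class BooleanStateReadBenchmark {
  private final NonVolatileBooleanState state = new NonVolatileBooleanState();

  @Benchmark
  public void read_singleThread(Blackhole bh) {
    // With a plain field, the JIT is free to read the flag once and feed the
    // same register value to all 100 iterations; a volatile field forces a
    // fresh read on every iteration.
    for (int i = 0; i < 100; i++) {
      bh.consume(state.get());
    }
  }
}
```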

@jack-berg (Member)

If I'm reading the results correctly, non-volatile is much, much faster than everything else. This is what I suspected based on research about the perf penalty of the volatile keyword, but didn't test myself.

I'm reluctant to add this penalty when only a small percentage of users will ever take advantage of the dynamism that requires it.

The penalty has to be paid on the hot path of metrics, logs, and traces. See my old blog post on metric systems for some ballpark figures on time to record measurements; I think the volatile keyword moves the needle in a meaningful way: https://opentelemetry.io/blog/2024/java-metric-systems-compared/#metrics-primer

@jack-berg (Member)

We could make it a setting, i.e. when you initialize the SDK, indicate whether you intend to use dynamic config. If so, we substitute an implementation that guarantees eventual consistency. If not, we use an implementation that is fast and doesn't waste perf checking for config changes that will never come.
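A minimal sketch of that idea; the interface, method, and setting names here are hypothetical, not actual SDK API, and it assumes the state classes sketched in the PR description implement a common interface:

```java
// Hypothetical factory selecting the state implementation at SDK build time.
interface BooleanState {
  boolean get();

  static BooleanState create(boolean dynamicConfigEnabled) {
    return dynamicConfigEnabled
        ? new ImmediateBooleanState()    // volatile: updates guaranteed visible
        : new NonVolatileBooleanState(); // plain field: fast, config fixed at startup
  }
}
```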

@pellared (Member)

> I'm reluctant to add this penalty when only a small percentage of users will ever take advantage of the dynamism that requires it.

Do you really believe the nanoseconds of overhead would be noticeable? And what is the sense of adding a feature when there is no guarantee that it will work?

I suggest checking the speedup factor of non-volatile vs. volatile in an end-to-end scenario (e.g. emitting a log record on a logger that is disabled).
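A minimal sketch of such an end-to-end check using the OpenTelemetry logs API; a real run would build the SDK twice (volatile vs. non-volatile flag) with the logger disabled, and the no-op instance below only stands in to keep the sketch self-contained:

```java
import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.api.logs.Logger;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// Hypothetical end-to-end benchmark, not actual PR code.
@State(Scope.Benchmark)
public class DisabledLoggerBenchmark {
  // Placeholder: substitute an SDK logger configured as disabled.
  private final Logger logger = OpenTelemetry.noop().getLogsBridge().get("benchmark");

  @Benchmark
  public void emitOnDisabledLogger() {
    logger.logRecordBuilder().setBody("hello").emit();
  }
}
```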

@laurit (Contributor) commented Sep 29, 2025

Keep in mind what JMH prints out:

> REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial experiments, perform baseline and negative tests that provide experimental control, make sure the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts. Do not assume the numbers tell you what you want them to tell.

First, I didn't try to run these or attempt to verify why the numbers are like they are. As far as I know, on x86, where these tests are run, volatile reads don't use a barrier. Based on that, I'd guess that the difference in read-only perf is not caused directly by volatile reads being slower, but rather by some sort of compiler optimization: probably in the non-volatile case the compiler reads the state once and keeps it in a register, while in the volatile case it keeps rereading the state. Or perhaps the compiler does something even more clever.

If that turns out to be so, then IMO this test would really be invalid: it doesn't realistically reflect the logging instrumentation reading a configuration flag, as reading that flag won't happen in a loop that could be unrolled and optimized like that. Inspecting the generated asm might give a better understanding of what is different between these two tests. I believe that adding Blackhole.consumeCPU to simulate real work could help with building a more realistic test. Note that on other architectures volatile reads do use a barrier.

https://shipilev.net/blog/2014/nanotrusting-nanotime/ states that:

> It chimes back to our observation that volatile write costs are dramatically amortized if we are not choking the system with them.
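A minimal sketch of the `Blackhole.consumeCPU` suggestion above: interleave simulated work so the flag read no longer sits in a tight loop the JIT can collapse. The token count (50) is an arbitrary placeholder, and `state` is as in the earlier sketches:

```java
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

// Hypothetical "more realistic" variant of the read benchmark.
@State(Scope.Benchmark)
public class RealisticReadBenchmark {
  private final NonVolatileBooleanState state = new NonVolatileBooleanState();

  @Benchmark
  public boolean readWithSimulatedWork() {
    Blackhole.consumeCPU(50); // burn a fixed amount of CPU between reads
    return state.get();
  }
}
```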

@jack-berg (Member) commented Sep 29, 2025

> Do you really believe the nanoseconds of overhead would be noticeable? And what is the sense of adding a feature when there is no guarantee that it will work?

Yes, especially for metrics. Perf arguments are brought up as reasons not to use otel.

As for the guaranteed-to-work argument:

  • This is an experimental feature, and I've maintained the position that if anyone actually observes that the updates are not occurring in the real world, we should adjust.
  • I offered a potential solution above which is guaranteed to work and doesn't sacrifice perf for the vast majority of users who won't use this feature.
