EHT-1025 - Replace otel deps with prometheus client #108
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #108 +/- ##
==========================================
+ Coverage 92.76% 93.29% +0.52%
==========================================
Files 43 43
Lines 5391 5220 -171
Branches 5391 5220 -171
==========================================
- Hits 5001 4870 -131
+ Misses 326 283 -43
- Partials 64 67 +3
| Branch | johnsonw/prometheus-client-switch |
| Testbed | ci-runner |
| Benchmark | Latency result, ms (Result Δ%) | Baseline, ms | Lower Boundary, ms (Limit %) | Upper Boundary, ms (Limit %) |
|---|---|---|---|---|
| jobstats 100 | 1.50 (-0.09%) | 1.50 | 1.43 (95.63%) | 1.57 (95.65%) |
| jobstats 1000 | 14.34 (-1.12%) | 14.50 | 13.94 (97.23%) | 15.06 (95.21%) |
| Branch | johnsonw/prometheus-client-switch |
| Testbed | ci-runner |
⚠️ WARNING: No Threshold found! Without a Threshold, no Alerts will ever be generated for the following measures:
- RAM Hit Rate (hits (%))
- L1 Hit Rate (hits (%))
- LL Hit Rate (hits (%))
- L1 Hits (hits)
- ILmr (misses (reads))
- LLd Miss Rate (misses (%))
- Dw (writes)
- DLmr (misses (reads))
- LLi Miss Rate (misses (%))
- DLmw (misses (writes))
- LL Miss Rate (misses (%))
- RAM Hits (hits)
- D1mr (misses (reads))
- I1mr (misses (reads))
- I1 Miss Rate (misses (%))
- Dr (reads)
- LL Hits (hits)
- Total read+write (reads/writes)
- Estimated Cycles (cycles)
- D1mw (misses (writes))
- D1 Miss Rate (misses (%))
To only post results if a Threshold exists, set the `--ci-only-thresholds` flag.
Benchmark: `lustre_metrics::memory_benches::bench_encode_lustre_metrics with_setup:generate_records()`
| Measure | Result |
|---|---|
| D1 Miss Rate | 0.91 % |
| D1mr, misses (reads) | 24.45 x 1e3 |
| D1mw, misses (writes) | 8.66 x 1e3 |
| DLmr, misses (reads) | 107.00 |
| DLmw, misses (writes) | 6.37 x 1e3 |
| Dr, reads | 2.44 x 1e6 |
| Dw, writes | 1.22 x 1e6 |
| Estimated Cycles | 14.61 x 1e6 |
| I1 Miss Rate | 0.01 % |
| I1mr, misses (reads) | 996.00 |
| ILmr, misses (reads) | 843.00 |
| Instructions | 10.60 x 1e6 (-22.99%); Baseline: 13.77 x 1e6; Lower Boundary: 3.66 x 1e6 (34.49%); Upper Boundary: 23.87 x 1e6 (44.40%) |
| L1 Hit Rate | 99.76 % |
| L1 Hits | 14.22 x 1e6 |
| LL Hit Rate | 0.19 % |
| LL Hits | 26.79 x 1e3 |
| LL Miss Rate | 0.05 % |
| LLd Miss Rate | 0.18 % |
| LLi Miss Rate | 0.01 % |
| RAM Hit Rate | 0.05 % |
| RAM Hits | 7.32 x 1e3 |
| Total read+write | 14.26 x 1e6 |
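As a sanity check on the figures above, the Estimated Cycles value can be related to the hit counts. The sketch below assumes the weighting commonly applied to Cachegrind-style output (1 cycle per L1 hit, 5 per last-level hit, 35 per RAM hit); whether the benchmark tool used in this PR applies exactly these weights is an assumption, and `estimated_cycles` is a hypothetical helper name.

```rust
/// Estimated cycles from cache-hit counts.
/// Assumption: weights of 1 (L1), 5 (last-level cache) and 35 (RAM),
/// as commonly used for Cachegrind-based estimates.
fn estimated_cycles(l1_hits: u64, ll_hits: u64, ram_hits: u64) -> u64 {
    l1_hits + 5 * ll_hits + 35 * ram_hits
}
```

With the rounded values reported above (L1 Hits 14.22 x 1e6, LL Hits 26.79 x 1e3, RAM Hits 7.32 x 1e3), `estimated_cycles(14_220_000, 26_790, 7_320)` gives 14,610,150, which agrees with the reported 14.61 x 1e6.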
] }
pretty_assertions = "1.4.1"
prometheus = "0.14"
prometheus-client = { git = "https://github.com/whamcloud/client_rust", branch = "whamcloud-08-12-2025" }
Is there an upstream PR for these changes?
There are a couple patches that are in review:
prometheus/client_rust#278
prometheus/client_rust#279
This patch does not have a PR upstream: whamcloud/client_rust@d81f9a1
...-exporter/src/historical_snapshots/lustrefs_exporter__tests__host_stats_non_healthy.histsnap
...fs-exporter/src/snapshots/lustrefs_exporter__routes__tests__jobstats_with_stderr_output.snap
- Some snapshot outputs have a different number of lines (and the output order appears to have changed). Please go through all the snapshots and ensure that the content has not notably changed (we cannot break backwards compatibility).
- Output coming from the server should be `Vec<u8>`. I noticed that this patch changes some output back to `String`.
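To illustrate the `Vec<u8>` point, here is a minimal sketch, assuming the text encoder writes into a `std::fmt::Write` sink such as a `String`. `encode_metrics` and `render_body` are hypothetical stand-ins, not functions from this repository; the idea is only that the encoded `String` can be handed over as bytes without copying.

```rust
use std::fmt::{self, Write};

/// Hypothetical stand-in for a text encoder that writes into any `fmt::Write` sink.
fn encode_metrics<W: Write>(out: &mut W, samples: &[(&str, f64)]) -> fmt::Result {
    for (name, value) in samples {
        writeln!(out, "{name} {value}")?;
    }
    Ok(())
}

/// Encode into a `String`, then expose the result as `Vec<u8>`.
/// `String::into_bytes` reuses the existing buffer, so no copy is made.
fn render_body(samples: &[(&str, f64)]) -> Result<Vec<u8>, fmt::Error> {
    let mut text = String::new();
    encode_metrics(&mut text, samples)?;
    Ok(text.into_bytes())
}
```

Keeping the conversion at the outermost layer would let the encoder stay string-based while the server-facing output type remains `Vec<u8>`.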
changes
- Update `normalize_docs` to strip the trailing `.` from help text so that the period is ignored during comparison.
- Update a few other properties to match existing help text.
Signed-off-by: William Johnson <[email protected]>
Signed-off-by: William Johnson <[email protected]>
Signed-off-by: William Johnson <[email protected]>
Signed-off-by: William Johnson <[email protected]>
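The help-text normalization mentioned in the commit message above could look roughly like the following. This is a sketch only: the actual `normalize_docs` in the PR may have a different signature and behavior, and `strip_help_period` is a hypothetical name used here to avoid implying otherwise.

```rust
/// Hypothetical helper: drop one trailing `.` from a `# HELP` line so that
/// help text with and without a final period compares equal.
/// All other lines are returned unchanged.
fn strip_help_period(line: &str) -> String {
    match line.strip_prefix("# HELP ") {
        Some(rest) => {
            let rest = rest.trim_end();
            let rest = rest.strip_suffix('.').unwrap_or(rest);
            format!("# HELP {rest}")
        }
        None => line.to_string(),
    }
}
```

Only `# HELP` lines are touched, so sample values such as `1.` or label values ending in a period would not be altered by the comparison.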
I've updated https://github.com/whamcloud/lustrefs-exporter/blob/main/lustrefs-exporter/benches/lustre_metrics.rs#L38 to use
Comparing snapshots is a bit difficult due to the following:
The above line appears twice.
The approach I have taken to compare snapshots is the following:
This procedure compares the two files as text to see what differs between them. That said, I think we also have a test that compares the historical and new snapshots: https://github.com/whamcloud/lustrefs-exporter/blob/johnsonw/prometheus-client-switch/lustrefs-exporter/src/lib.rs#L330
Would it be worth writing a unit test that performs the above procedure on the snapshot text?
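One possible shape for such a text-level check is sketched below. It is not the procedure described above (whose steps are not shown here); it simply compares two snapshots while ignoring line order and keeping track of duplicate lines. `line_counts`, `surplus`, and `diff_unordered` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Count how often each non-empty line occurs, ignoring line order.
fn line_counts(text: &str) -> BTreeMap<&str, usize> {
    let mut counts = BTreeMap::new();
    for line in text.lines().map(str::trim).filter(|l| !l.is_empty()) {
        *counts.entry(line).or_insert(0) += 1;
    }
    counts
}

/// Lines that occur more often in `a` than in `b`.
fn surplus<'a>(a: &BTreeMap<&'a str, usize>, b: &BTreeMap<&'a str, usize>) -> Vec<&'a str> {
    a.iter()
        .filter(|(line, n)| b.get(*line).copied().unwrap_or(0) < **n)
        .map(|(line, _)| *line)
        .collect()
}

/// Returns (lines missing from `new`, lines added in `new`),
/// treating both snapshots as unordered multisets of lines.
fn diff_unordered<'a>(old: &'a str, new: &'a str) -> (Vec<&'a str>, Vec<&'a str>) {
    let (old_counts, new_counts) = (line_counts(old), line_counts(new));
    (surplus(&old_counts, &new_counts), surplus(&new_counts, &old_counts))
}
```

A unit test could assert that both returned vectors are empty for each historical/new snapshot pair, so that pure reordering passes while dropped, added, or newly duplicated lines fail.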
- Update some comments
Signed-off-by: William Johnson <[email protected]>
Signed-off-by: William Johnson <[email protected]>
versions.
Signed-off-by: William Johnson <[email protected]>
`otelsnap` extension. Otherwise, we will receive an error indicating that there are stale snapshots.
Signed-off-by: William Johnson <[email protected]>
Benchmark on server running original binary: original-lustrefs-exporter2.mp4
RES: 1.6G
Benchmark on server running el8 binary: el8-lustrefs-exporter2.mp4
RES: 1.4G
f72e507 to 667cff4
Does your benchmark take compression into account? I wonder whether that is making a difference here.
Benchmark on server running el8 binary compiled with the mimalloc global allocator: el8-mimalloc-lustrefs-exporter2.mp4
RES: 992MB
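For context on what "global allocator" means here: Rust lets a binary replace its allocator with a single `#[global_allocator]` static. With the `mimalloc` crate the declaration is `static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;`. The sketch below uses the same mechanism but, to stay dependency-free, wraps the system allocator with a byte counter instead; `CountingAlloc` and `live_bytes` are hypothetical names, and this is not how the RES figures above were measured.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Bytes currently allocated through the global allocator.
static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

/// Wrapper around the system allocator that tracks live bytes.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// Swapping in a different allocator (for example mimalloc) only changes this static.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Current number of live heap bytes as seen by the wrapper.
fn live_bytes() -> usize {
    ALLOCATED.load(Ordering::Relaxed)
}
```

A counter like this reports bytes requested by the program, which can differ from RES because the allocator itself decides how much memory to keep mapped; that gap is one reason an allocator swap can change resident memory without any change to the exporter's own allocations.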
Signed-off-by: William Johnson <[email protected]>
Signed-off-by: William Johnson <[email protected]>
4d9717b to 40ddc41
Signed-off-by: William Johnson <[email protected]>
- remove stale snapshot
Signed-off-by: William Johnson <[email protected]>
Signed-off-by: William Johnson <[email protected]>
Signed-off-by: William Johnson <[email protected]>
Signed-off-by: William Johnson <[email protected]>
ce555f3 to ddd771f
https requests to hit the /metrics endpoint. This will make it closer to what we are doing in real life.
Signed-off-by: William Johnson <[email protected]>
I've updated the benchmark to run an HTTP server and make HTTP requests. In fact, it uses the same
Chart: Scrape Memory Metrics
Signed-off-by: William Johnson <[email protected]>


