fix: make prometheus metric calls fault-tolerant in hb_http_client #717

Open
Lucifer0x17 wants to merge 1 commit into neo/edge from fix/dev-lua-test
@Lucifer0x17

When Prometheus is running but `init_prometheus()` has not been called, as can happen while running tests, a race condition can occur during metric recording: calls such as `prometheus_counter:inc` throw `{error, unknown_metric}` because the metrics were never declared.
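
One way to make such calls fault-tolerant is to wrap each metric update so that any exception is swallowed. The following is a minimal sketch, not the code in this PR: the module `safe_metrics` and the functions `safe_call/1` and `safe_inc/2` are hypothetical names, and catching every exception class is an assumption about the desired behavior.

```erlang
-module(safe_metrics).
-export([safe_call/1, safe_inc/2]).

%% Hypothetical helper: run a zero-arity fun that records a metric.
%% Any exception (for example, one raised because the metric was never
%% declared) is caught and discarded, so the caller always gets `ok`.
safe_call(Fun) when is_function(Fun, 0) ->
    try Fun() of
        _ -> ok
    catch
        _Class:_Reason -> ok
    end.

%% Hypothetical wrapper around prometheus_counter:inc/2.
%% Name is the counter name; Labels is the list of label values.
safe_inc(Name, Labels) ->
    safe_call(fun() -> prometheus_counter:inc(Name, Labels) end).
```

With a wrapper like this, a call such as `safe_metrics:safe_inc(some_counter, [])` returns `ok` even when the counter was never declared. The trade-off is that the metric update is silently dropped.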

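This crash propagated up through `hb_store_gateway:read`, which catches exceptions and returns the atom `failure`. Because the store chain falls through to the next store only on `not_found`, the `failure` result stopped the chain, so subsequent stores were never tried and the read ultimately crashed. This PR makes the metric calls in `hb_http_client` fault-tolerant so that an undeclared metric no longer triggers this failure.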
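
The fall-through behavior described above can be illustrated with a small model. This is a sketch under stated assumptions, not the actual store code: `store_chain_sketch` is a hypothetical module, each store is modelled as a fun returning `{ok, Value}`, `not_found`, or `failure`, and returning `not_found` for an exhausted chain is an assumption.

```erlang
-module(store_chain_sketch).
-export([read/2]).

%% Hypothetical model of a store chain.
%% Each store is a fun(Key) -> {ok, Value} | not_found | failure.
read([], _Key) ->
    %% Assumption: an exhausted chain reports not_found.
    not_found;
read([Store | Rest], Key) ->
    case Store(Key) of
        {ok, Value} -> {ok, Value};
        %% Only not_found falls through to the next store.
        not_found -> read(Rest, Key);
        %% failure stops the chain; later stores are never tried.
        failure -> failure
    end.
```

In this model, `read([Failing, Good], Key)` returns `failure` even though `Good` could have served the key, which mirrors the symptom described: a metrics exception converted to `failure` hides every later store.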

Lucifer0x17 self-assigned this on Mar 1, 2026
Lucifer0x17 added the `bug` (Something isn't working) label on Mar 1, 2026