43 commits
74f4283
Circuit Breakers on Subgraph Execution
ardatan Apr 8, 2026
505dd34
docs: update documentation
theguild-bot Apr 8, 2026
17d2853
fixes
ardatan Apr 8, 2026
42ec510
docs: update documentation
theguild-bot Apr 8, 2026
541f998
Update lib/executor/src/executors/map.rs
ardatan Apr 8, 2026
e147a33
..
ardatan Apr 8, 2026
8a54b57
.
ardatan Apr 8, 2026
c18b1c3
docs: update documentation
theguild-bot Apr 8, 2026
a38a733
Changeset
ardatan Apr 9, 2026
8d4d2c1
..
ardatan Apr 9, 2026
647c8c6
Update lib/router-config/src/traffic_shaping.rs
ardatan Apr 9, 2026
ef7300e
Update lib/executor/Cargo.toml
ardatan Apr 9, 2026
0f9b01e
docs: update documentation
theguild-bot Apr 9, 2026
0f2c88a
Update docs/README.md
ardatan Apr 9, 2026
6602155
Address
ardatan Apr 9, 2026
fbec585
addressed
ardatan Apr 9, 2026
18de8ce
docs: update documentation
theguild-bot Apr 9, 2026
dda0978
Go
ardatan Apr 9, 2026
1da6569
..
ardatan Apr 9, 2026
5797f34
docs: update documentation
theguild-bot Apr 9, 2026
967fbb2
Fix typo
ardatan Apr 9, 2026
965ddde
docs: update documentation
theguild-bot Apr 9, 2026
0fb7484
Improvements
ardatan Apr 9, 2026
b249e93
Merge origin/main into circuit-breaker-subgraph: resolve conflicts in…
Copilot Apr 15, 2026
7a15ead
docs: regenerate README from merged schema
Copilot Apr 15, 2026
fcacdb7
Merge main into circuit-breaker-subgraph and resolve conflicts
Copilot Apr 15, 2026
e547ed8
Merge branch 'main' into circuit-breaker-subgraph
ardatan Apr 15, 2026
fca1aa9
Merge origin/main into circuit-breaker-subgraph: resolve conflicts
Copilot Apr 29, 2026
7489919
Address Devin comment
ardatan Apr 29, 2026
f63509c
docs: update documentation
theguild-bot Apr 29, 2026
40c27f0
fix: clone circuit breaker before await to avoid DashMap lock held ac…
Copilot Apr 29, 2026
8f852f7
..
ardatan Apr 30, 2026
237a3e4
5XX as errors tests
ardatan Apr 30, 2026
e3daccf
More
ardatan Apr 30, 2026
ce48ba7
docs: update documentation
theguild-bot Apr 30, 2026
d05da9c
Update JSON schema
ardatan Apr 30, 2026
24908e6
docs: update documentation
theguild-bot Apr 30, 2026
9ae8038
Lets go
ardatan May 4, 2026
22f32d6
Merge branch 'main' into circuit-breaker-subgraph
ardatan May 8, 2026
1cb1e24
docs: update documentation
theguild-bot May 8, 2026
7a5f42a
Reorder percentage module in mod.rs
ardatan May 8, 2026
ac63b56
Merge branch 'main' into circuit-breaker-subgraph
ardatan May 8, 2026
7d15c8f
Merge branch 'main' into circuit-breaker-subgraph
ardatan May 11, 2026
11 changes: 11 additions & 0 deletions .changeset/circuit-breaker.md
@@ -0,0 +1,11 @@
---
hive-router: patch
hive-router-plan-executor: patch
hive-router-config: patch
---

# Implement Circuit Breaker for Subgraph Requests

This change introduces a circuit breaker for subgraph requests in the Hive Router. The circuit breaker tracks the success and failure rates of requests to each subgraph and blocks further requests to a subgraph once its failure rate exceeds the configured threshold. While the circuit breaker is open, requests to that subgraph fail immediately without being sent.

This implementation helps improve the resilience and stability of the Hive Router when dealing with unreliable subgraphs.
5 changes: 4 additions & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -87,6 +87,7 @@ strum = { version = "0.28.0", features = ["derive"] }
mockito = "1.7.0"
futures-util = "0.3.31"
axum = "0.8.4"
recloser = "1.3.1"
notify = "8.2.0"

# Telemetry
88 changes: 86 additions & 2 deletions docs/README.md
@@ -22,7 +22,7 @@
|[**subscriptions**](#subscriptions)|`object`|Configuration for subscriptions.<br/>Default: `{"broadcast_capacity":0,"enabled":false}`<br/>||
|[**supergraph**](#supergraph)|`object`|Configuration for the Federation supergraph source. By default, the router will use a local file-based supergraph source (`./supergraph.graphql`).<br/>||
|[**telemetry**](#telemetry)|`object`|Default: `{"client_identification":{"name_header":"graphql-client-name","version_header":"graphql-client-version"},"hive":null,"metrics":{"exporters":[],"instrumentation":{"common":{"histogram":{"aggregation":"explicit","bytes":{"buckets":[128,512,1024,2048,4096,8192,16384,32768,65536,131072,262144,524288,1048576,2097152,3145728,4194304,5242880],"record_min_max":false},"seconds":{"buckets":[0.005,0.01,0.025,0.05,0.075,0.1,0.25,0.5,0.75,1,2.5,5,7.5,10],"record_min_max":false}}},"instruments":{}}},"resource":{"attributes":{}},"tracing":{"collect":{"max_attributes_per_event":16,"max_attributes_per_link":32,"max_attributes_per_span":128,"max_events_per_span":128,"parent_based_sampler":false,"sampling":1},"exporters":[],"instrumentation":{"spans":{"mode":"spec_compliant"}},"propagation":{"b3":false,"baggage":false,"jaeger":false,"trace_context":true}}}`<br/>||
|[**traffic\_shaping**](#traffic_shaping)|`object`|Configuration for the traffic-shaping of the executor. Use these configurations to control how requests are being executed to subgraphs.<br/>Default: `{"all":{"allow_only_http2":false,"dedupe_enabled":true,"pool_idle_timeout":"50s","request_timeout":"30s"},"max_connections_per_host":100,"router":{"dedupe":{"enabled":false,"headers":"all"},"max_long_lived_clients":128,"request_timeout":"1m"}}`<br/>||
|[**traffic\_shaping**](#traffic_shaping)|`object`|Configuration for the traffic-shaping of the executor. Use these configurations to control how requests are being executed to subgraphs.<br/>Default: `{"all":{"allow_only_http2":false,"circuit_breaker":null,"dedupe_enabled":true,"pool_idle_timeout":"50s","request_timeout":"30s"},"max_connections_per_host":100,"router":{"dedupe":{"enabled":false,"headers":"all"},"max_long_lived_clients":128,"request_timeout":"1m"}}`<br/>||
|[**websocket**](#websocket)|`object`|Configuration of router's WebSocket server.<br/>Default: `{"enabled":false,"headers":{"persist":false,"source":"connection"},"path":null}`<br/>||

**Additional Properties:** not allowed
@@ -203,6 +203,7 @@ telemetry:
traffic_shaping:
  all:
    allow_only_http2: false
    circuit_breaker: null
    dedupe_enabled: true
    pool_idle_timeout: 50s
    request_timeout: 30s
@@ -3117,7 +3118,7 @@ Configuration for the traffic-shaping of the executor. Use these configurations

|Name|Type|Description|Required|
|----|----|-----------|--------|
|[**all**](#traffic_shapingall)|`object`|The default configuration that will be applied to all subgraphs, unless overridden by a specific subgraph configuration.<br/>Default: `{"allow_only_http2":false,"dedupe_enabled":true,"pool_idle_timeout":"50s","request_timeout":"30s"}`<br/>||
|[**all**](#traffic_shapingall)|`object`|The default configuration that will be applied to all subgraphs, unless overridden by a specific subgraph configuration.<br/>Default: `{"allow_only_http2":false,"circuit_breaker":null,"dedupe_enabled":true,"pool_idle_timeout":"50s","request_timeout":"30s"}`<br/>||
|**max\_connections\_per\_host**|`integer`|Limits the number of concurrent requests/connections per host/subgraph.<br/>Default: `100`<br/>Format: `"uint"`<br/>Minimum: `0`<br/>||
|[**router**](#traffic_shapingrouter)|`object`|Configuration for the router itself, e.g., for handling incoming requests, or other router-level traffic shaping configurations.<br/>Default: `{"dedupe":{"enabled":false,"headers":"all"},"max_long_lived_clients":128,"request_timeout":"1m"}`<br/>||
|[**subgraphs**](#traffic_shapingsubgraphs)|`object`|Optional per-subgraph configurations that will override the default configuration for specific subgraphs.<br/>||
@@ -3128,6 +3129,7 @@ Configuration for the traffic-shaping of the executor. Use these configurations
```yaml
all:
  allow_only_http2: false
  circuit_breaker: null
  dedupe_enabled: true
  pool_idle_timeout: 50s
  request_timeout: 30s
@@ -3152,6 +3154,7 @@ The default configuration that will be applied to all subgraphs, unless overridd
|Name|Type|Description|Required|
|----|----|-----------|--------|
|**allow\_only\_http2**|`boolean`|Forces HTTP/2 for requests to subgraphs.<br/><br/>For plain HTTP, it will use HTTP/2 cleartext (h2c).<br/>For HTTPS, it also requires HTTP/2.<br/>This will make the subgraph requests never fall back to HTTP/1.1,<br/>and will fail if the subgraph doesn't support HTTP/2.<br/>Default: `false`<br/>||
|[**circuit\_breaker**](#traffic_shapingallcircuit_breaker)|`object`, `null`|Circuit Breaker configuration for all subgraphs.<br/>||
|**dedupe\_enabled**|`boolean`|Enables/disables request deduplication to subgraphs.<br/><br/>When requests exactly match the hashing mechanism (e.g., subgraph name, URL, headers, query, variables), and are executed at the same time, they will<br/>be deduplicated by sharing the response of other in-flight requests.<br/>Default: `true`<br/>||
|**pool\_idle\_timeout**|`string`|Timeout for idle sockets being kept-alive.<br/>Default: `"50s"`<br/>||
|**request\_timeout**||Optional timeout configuration for requests to subgraphs.<br/><br/>Example with a fixed duration:<br/>```yaml<br/> timeout:<br/> duration: 5s<br/>```<br/><br/>Or with a VRL expression that can return a duration based on the operation kind:<br/>```yaml<br/> timeout:<br/> expression: \|<br/> if (.request.operation.type == "mutation") {<br/> "10s"<br/> } else {<br/> "15s"<br/> }<br/>```<br/>Default: `"30s"`<br/>||
@@ -3162,12 +3165,53 @@ The default configuration that will be applied to all subgraphs, unless overridd

```yaml
allow_only_http2: false
circuit_breaker: null
dedupe_enabled: true
pool_idle_timeout: 50s
request_timeout: 30s

```

<a name="traffic_shapingallcircuit_breaker"></a>
#### traffic\_shaping\.all\.circuit\_breaker: object,null

Circuit Breaker configuration for all subgraphs.
When the circuit breaker is open, requests to the subgraph will be
short-circuited and an error will be returned to the client.
The circuit breaker will be triggered based on the error rate of requests to the subgraph, and will attempt to reset after a certain timeout.


**Properties**

|Name|Type|Description|Required|
|----|----|-----------|--------|
|**enabled**|`boolean`, `null`|Enable or disable the circuit breaker for the subgraph.<br/>Default: false (circuit breaker is disabled)<br/><br/>When unset on a subgraph-level configuration, the value falls back<br/>to the value defined in the global (`all`) circuit breaker<br/>configuration.<br/>||
|[**error\_status\_codes**](#traffic_shapingallcircuit_breakererror_status_codes)|`integer[]`|HTTP status codes returned by the subgraph that should be counted as failures by the circuit breaker.<br/>Default: `[503]`<br/>||
|**error\_threshold**|`string`|Error percentage after which the circuit breaker should kick in.<br/>Default: 50%<br/>||
|**reset\_timeout**|`string`|The duration after which the circuit breaker will attempt to retry sending requests to the subgraph.<br/>Default: 30s<br/>||
|**volume\_threshold**|`integer`, `null`|Number of requests required before the circuit breaker starts evaluating the error rate.<br/>Default: 5<br/>Format: `"uint"`<br/>Minimum: `0`<br/>||

**Additional Properties:** not allowed
<a name="traffic_shapingallcircuit_breakererror_status_codes"></a>
##### traffic\_shaping\.all\.circuit\_breaker\.error\_status\_codes\[\]: array,null

HTTP status codes returned by the subgraph that should be counted as
failures by the circuit breaker.

Only responses whose status code is contained in this list will be
recorded as failures. Responses with any other status code (including
other 5xx codes) are treated as successes from the circuit breaker's
point of view.

Default: `[503]`


**Items**

**Item Type:** `integer`
**Item Minimum:** `100`
**Item Maximum:** `599`
**Unique Items:** yes
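
As a rough illustration of how the options above fit together, a global circuit breaker configuration might look like the following. This is a sketch assembled from the property table, not a tested configuration. The threshold and timeout values are the documented defaults, while the widened `error_status_codes` list is an assumption made for the example.

```yaml
traffic_shaping:
  all:
    circuit_breaker:
      enabled: true          # documented default is false (disabled)
      error_threshold: 50%   # documented default
      volume_threshold: 5    # documented default
      reset_timeout: 30s     # documented default
      # Assumption for illustration: also count 502 and 504 as failures.
      # The documented default is [503] only.
      error_status_codes: [502, 503, 504]
```

With a list like this, a response with any other status code (a 500, for instance) would still be recorded as a success from the circuit breaker's point of view, as described above.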
<a name="traffic_shapingalltls"></a>
#### traffic\_shaping\.all\.tls: object,null

@@ -3280,12 +3324,52 @@ Optional per-subgraph configurations that will override the default configuratio
|Name|Type|Description|Required|
|----|----|-----------|--------|
|**allow\_only\_http2**|`boolean`, `null`|Forces HTTP/2 for requests to subgraphs.<br/><br/>For plain HTTP, it will use HTTP/2 cleartext (h2c).<br/>For HTTPS, it also requires HTTP/2.<br/>This will make the subgraph requests never fall back to HTTP/1.1,<br/>and will fail if the subgraph doesn't support HTTP/2.<br/>||
|[**circuit\_breaker**](#traffic_shapingsubgraphsadditionalpropertiescircuit_breaker)|`object`, `null`|Circuit Breaker configuration for the subgraph.<br/>||
|**dedupe\_enabled**|`boolean`, `null`|Enables/disables request deduplication to subgraphs.<br/><br/>When requests exactly match the hashing mechanism (e.g., subgraph name, URL, headers, query, variables), and are executed at the same time, they will<br/>be deduplicated by sharing the response of other in-flight requests.<br/>||
|**pool\_idle\_timeout**|`string`, `null`|Timeout for idle sockets being kept-alive.<br/>||
|**request\_timeout**||Optional timeout configuration for requests to subgraphs.<br/><br/>Example with a fixed duration:<br/>```yaml<br/> timeout:<br/> duration: 5s<br/>```<br/><br/>Or with a VRL expression that can return a duration based on the operation kind:<br/>```yaml<br/> timeout:<br/> expression: \|<br/> if (.request.operation.type == "mutation") {<br/> "10s"<br/> } else {<br/> "15s"<br/> }<br/>```<br/>||
|[**tls**](#traffic_shapingsubgraphsadditionalpropertiestls)|`object`, `null`|||

**Additional Properties:** not allowed
<a name="traffic_shapingsubgraphsadditionalpropertiescircuit_breaker"></a>
##### traffic\_shaping\.subgraphs\.additionalProperties\.circuit\_breaker: object,null

Circuit Breaker configuration for the subgraph.
When the circuit breaker is open, requests to the subgraph will be short-circuited and an error will be returned to the client.
The circuit breaker will be triggered based on the error rate of requests to the subgraph, and will attempt to reset after a certain timeout.


**Properties**

|Name|Type|Description|Required|
|----|----|-----------|--------|
|**enabled**|`boolean`, `null`|Enable or disable the circuit breaker for the subgraph.<br/>Default: false (circuit breaker is disabled)<br/><br/>When unset on a subgraph-level configuration, the value falls back<br/>to the value defined in the global (`all`) circuit breaker<br/>configuration.<br/>||
|[**error\_status\_codes**](#traffic_shapingsubgraphsadditionalpropertiescircuit_breakererror_status_codes)|`integer[]`|HTTP status codes returned by the subgraph that should be counted as failures by the circuit breaker.<br/>Default: `[503]`<br/>||
|**error\_threshold**|`string`|Error percentage after which the circuit breaker should kick in.<br/>Default: 50%<br/>||
|**reset\_timeout**|`string`|The duration after which the circuit breaker will attempt to retry sending requests to the subgraph.<br/>Default: 30s<br/>||
|**volume\_threshold**|`integer`, `null`|Number of requests required before the circuit breaker starts evaluating the error rate.<br/>Default: 5<br/>Format: `"uint"`<br/>Minimum: `0`<br/>||

**Additional Properties:** not allowed
<a name="traffic_shapingsubgraphsadditionalpropertiescircuit_breakererror_status_codes"></a>
###### traffic\_shaping\.subgraphs\.additionalProperties\.circuit\_breaker\.error\_status\_codes\[\]: array,null

HTTP status codes returned by the subgraph that should be counted as
failures by the circuit breaker.

Only responses whose status code is contained in this list will be
recorded as failures. Responses with any other status code (including
other 5xx codes) are treated as successes from the circuit breaker's
point of view.

Default: `[503]`


**Items**

**Item Type:** `integer`
**Item Minimum:** `100`
**Item Maximum:** `599`
**Unique Items:** yes
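
A per-subgraph override could be sketched as follows. The subgraph name `products` is hypothetical, and the values are chosen only for illustration. The example relies on the documented behaviour that an unset `enabled` on a subgraph falls back to the global (`all`) value; whether the other fields fall back to `all` or to their built-in defaults is not stated in this section, so the sketch sets them explicitly.

```yaml
traffic_shaping:
  all:
    circuit_breaker:
      enabled: true
  subgraphs:
    products:                # hypothetical subgraph name
      circuit_breaker:
        # `enabled` is left unset here, so it falls back to `all` (true).
        error_threshold: 25%
        volume_threshold: 10
        reset_timeout: 10s
        error_status_codes: [503]
```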
<a name="traffic_shapingsubgraphsadditionalpropertiestls"></a>
##### traffic\_shaping\.subgraphs\.additionalProperties\.tls: object,null
