docs(router): Persisted Documents by kamilkisiela · Pull Request #76 · graphql-hive/docs

kamilkisiela · 2026-04-02T18:25:57Z

github-actions · 2026-04-02T18:26:56Z

| Deployment | URL |
| --- | --- |
| 🌐 Website | https://1a91717b-hive-platform-docs.theguild.workers.dev |
| 📚 Storybook | https://pr-76.the-guild-docs-storybook.pages.dev |

This PR introduces Persisted Documents support with configurable extraction and storage, plus a lot of e2e tests.

Closes #311

---

Documentation PR: graphql-hive/docs#76

- Preview of [security/persisted-documents](https://dc3c070a-hive-platform-docs.theguild.workers.dev/graphql/hive/docs/router/security/persisted-documents)
- Preview of [configuration/persisted_documents](https://dc3c070a-hive-platform-docs.theguild.workers.dev/graphql/hive/docs/router/configuration/persisted_documents)

---

Supports document ID extraction from:

- `documentId` body field or URL query param (by default)
- Apollo-style `extensions.persistedQuery.sha256Hash` (by default)
- custom `json_path` (like `doc_id` or `extensions.whatever.id`)
- `url_query_param` (like `?doc_id=123`)
- `url_path_param` (like `/graphql/:id`)

In the example below, we first look for the path pattern and then the query param.

```yaml
persisted_documents:
  extractors:
    - type: url_path_param
      template: /:id # relative to configured endpoint
    - type: url_query_param
      name: id # `?id=123`
```

Supports different document storages:

- file manifest (in Apollo and Key-Value Relay style formats)
- Hive CDN (via `hive-console-sdk`)

File storage has **watch mode** by default (works well with `relay-compiler --watch`), so when a file changes (we debounce the events for 150ms) the document manifest is reloaded and served fresh.

Hive storage includes syntax validation of the provided document id. We make sure we don't send what `str.replace('~', '/')` produces to the Hive CDN without verification. If we did, people would see a 404 with no indication that the doc id is incorrect.

Includes `require_id: boolean` to control whether requests must carry a document id.

Includes `log_missing_id_requests: bool` (false by default) that logs information about requests with no document id. Helpful if you migrate from regular to queryless requests.

Regarding Hive CDN.
We don't rely only on the `appName~appVersion~documentId` format of the document id; the app's name and version can also be inferred from client identification headers (`graphql-client-name` etc., configurable via telemetry settings). We support it for reasons mentioned in the Slack Canvas doc (better DX and the reusable `clientAwareness` feature of Apollo Client).

I also added two metrics to measure:

- requests with no document id, so devs know that some requests still send no id
- document resolution failures, so devs know that some requests carry a doc id that has no document text

## Noteworthy implementation details

Persisted documents are implemented under `pipeline/persisted_documents/*` with a clear split:

- extraction (`extract/*`)
- resolution (`resolve/*`)
- runtime (`mod.rs`, `types.rs`)

Closes #867, as I introduced single-flight resolution of documents in the SDK. The **Err had to be cloneable** (otherwise I would have to change the API to return `Arc<Err>`), so some error enum variants in the SDK were converted to `String` instead of raw errors from 3rd-party libraries.

I also added a **negative cache** that remembers requests ending in a non-2XX response for 5s (configurable, but in the SDK it's disabled by default), so we don't keep repeating the same requests that eventually give errors or 404s.

I cleaned up the code responsible for preparing GraphQL params and decoding GET and POST payloads, and moved it into the `GraphQLGetInput`, `GraphQLPostInput` and `OperationPreparation` structs. This way the flow is clear: what happens when we receive a GET request, what happens when we receive a POST, and how it's all translated to what the rest of the pipeline expects. It's in `bin/router/src/pipeline/execution_request.rs`.
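To make the negative cache idea concrete, here is a minimal sketch of a TTL-based failure cache. It is not the SDK's implementation: the `NegativeCache` type, its method names and the choice to pass `now` explicitly (so the behaviour is easy to test) are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical type: remembers keys whose lookup recently failed,
/// so callers can skip repeating a request that is likely to fail again.
struct NegativeCache {
    ttl: Duration,
    failed_at: HashMap<String, Instant>,
}

impl NegativeCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, failed_at: HashMap::new() }
    }

    /// Records that fetching `key` failed at time `now`.
    fn record_failure(&mut self, key: &str, now: Instant) {
        self.failed_at.insert(key.to_owned(), now);
    }

    /// Returns true while a recorded failure for `key` is younger than the TTL.
    /// Expired entries are dropped on access.
    fn is_blocked(&mut self, key: &str, now: Instant) -> bool {
        let expired = match self.failed_at.get(key) {
            Some(&at) => now.duration_since(at) >= self.ttl,
            None => return false,
        };
        if expired {
            self.failed_at.remove(key);
        }
        !expired
    }
}
```

With a 5s TTL, a key recorded as failed is reported as blocked for lookups made less than five seconds later, and is evicted on the first lookup at or after the five-second mark.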
I did a bunch of tricks to make sure we're performant:

- custom query param reader (based on `memchr`)
- conditional extraction of non-standard JSON fields (fields that are not `query`, `extensions` etc.)
- built-in extraction of `documentId` during deserialization
- supafast validation of document ids (based on `memchr`)

---

There are many new lines of code, but the majority is just e2e tests.

For reviewers, I recommend checking:

- `docs/persisted-documents` to understand what I built and why
- `bin/router/src/pipeline/persisted_documents`: pretty much everything related to persisted documents, how things are extracted, how documents are resolved
- `bin/router/src/pipeline/execution_request.rs` to understand how we convert POST and GET requests into data consumed by the rest of the pipeline; this is where extraction and resolution of persisted documents happen

Performance is identical to before (check the `persisted-documents` bench in CI).

---------

Co-authored-by: theguild-bot <bot@the-guild.dev>
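The query param reader and document id validation listed above could look roughly like the sketch below. It uses only the standard library rather than `memchr`, and both function names are hypothetical. The validation rule (exactly three non-empty `~`-separated segments, none containing `/`) is an assumption, not the router's actual rule set, and values are returned raw, without percent-decoding.

```rust
/// Hypothetical helper: returns the raw (not percent-decoded) value of the
/// first `name=value` pair in `query`, borrowing from the input.
fn find_query_param<'a>(query: &'a str, name: &str) -> Option<&'a str> {
    // Tolerate a leading `?` so both "?id=1" and "id=1" work.
    let query = query.strip_prefix('?').unwrap_or(query);
    query.split('&').find_map(|pair| {
        // Pairs without `=` are skipped.
        let (key, value) = pair.split_once('=')?;
        (key == name).then_some(value)
    })
}

/// Hypothetical helper: splits `appName~appVersion~documentId` into its parts.
/// Assumed rule: exactly three non-empty segments, none containing `/`.
fn split_document_id(id: &str) -> Option<(&str, &str, &str)> {
    let mut parts = id.split('~');
    let (name, version, hash) = (parts.next()?, parts.next()?, parts.next()?);
    if parts.next().is_some() {
        return None; // more than three segments
    }
    let ok = |s: &str| !s.is_empty() && !s.contains('/');
    (ok(name) && ok(version) && ok(hash)).then_some((name, version, hash))
}
```

Rejecting malformed ids before building the CDN path is what lets a caller report a clear "invalid document id" error instead of surfacing an unexplained 404.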


kamilkisiela added the `waits for release` label (represents changes in a library that have not yet been released) Apr 2, 2026

kamilkisiela had a problem deploying to preview April 2, 2026 18:26 — with GitHub Actions Error

kamilkisiela temporarily deployed to storybook-preview April 2, 2026 18:26 — with GitHub Actions Inactive

kamilkisiela mentioned this pull request Apr 2, 2026

Persisted Documents graphql-hive/router#868

Merged

kamilkisiela temporarily deployed to storybook-preview April 2, 2026 18:29 — with GitHub Actions Inactive

kamilkisiela deployed to preview April 2, 2026 18:29 — with GitHub Actions View deployment

kamilkisiela added 3 commits April 17, 2026 11:55

Persisted Documents in Router

ec29125

Update persisted_documents.mdx

cf4e81c

asd

0554152

asd

624630c

kamilkisiela force-pushed the kamil-persisted-documents branch from da3ca54 to 624630c Compare April 17, 2026 10:09

kamilkisiela temporarily deployed to storybook-preview April 17, 2026 10:09 — with GitHub Actions Inactive

kamilkisiela deployed to preview April 17, 2026 10:09 — with GitHub Actions View deployment

Update operation-complexity.mdx

0193df8

kamilkisiela deployed to preview April 17, 2026 11:26 — with GitHub Actions View deployment

kamilkisiela temporarily deployed to storybook-preview April 17, 2026 11:26 — with GitHub Actions Inactive

product update

1d46a19

kamilkisiela deployed to preview April 17, 2026 16:11 — with GitHub Actions View deployment

kamilkisiela deployed to storybook-preview April 17, 2026 16:11 — with GitHub Actions View deployment

dotansimha approved these changes Apr 20, 2026

View reviewed changes

dotansimha changed the title ~~Persisted Documents in Router~~ docs(router): Persisted Documents Apr 20, 2026

dotansimha merged commit 90a37fa into main Apr 20, 2026
8 checks passed

dotansimha deleted the kamil-persisted-documents branch April 20, 2026 12:41

dotansimha merged 6 commits into `main` from `kamil-persisted-documents`
