From f6e013fce2b9aadac813e225fd88a4f12a31831b Mon Sep 17 00:00:00 2001
From: Seth Wiesman
Date: Wed, 1 Oct 2025 14:06:47 -0500
Subject: [PATCH 1/6] [docs] SQL Server HA failover

---
 .../ingest-data/sql-server/self-hosted.md | 97 +++++++++++++++++++
 1 file changed, 97 insertions(+)

diff --git a/doc/user/content/ingest-data/sql-server/self-hosted.md b/doc/user/content/ingest-data/sql-server/self-hosted.md
index 8d0691c9daa2d..5d711d839748f 100644
--- a/doc/user/content/ingest-data/sql-server/self-hosted.md
+++ b/doc/user/content/ingest-data/sql-server/self-hosted.md
@@ -276,6 +276,103 @@ available(also for PostgreSQL)."
 
 {{% sql-server-direct/next-steps %}}
 
+## High Availability
+
+### Using SQL Server Always On Availability Groups
+
+To make your SQL Server source resilient to database failovers, configure
+Materialize to connect through a SQL Server [Always On Availability Group (AG)
+listener](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/listeners-client-connectivity-application-failover).
+When a failover occurs, SQL Server drops the existing connection and routes new
+connections to the new primary replica transparently.
+
+#### Prerequisites
+
+Before connecting Materialize to an AG, ensure:
+
+1. **Your AG listener is configured and accessible.** Materialize must connect
+   via the listener DNS name, not individual node hostnames.
+
+1. **CDC is enabled on all potential primary replicas.** SQL Server's Change
+   Data Capture metadata is **not** replicated across AG nodes.
+
+1. **CDC capture and cleanup jobs exist on all potential primary replicas.**
+   After a role change, the new primary must have these jobs to continue
+   replicating changes.
+
+   SQL Server CDC metadata, including capture and cleanup jobs, **does not
+   replicate** to AG secondary replicas. After a failover, you must ensure the new
+   primary has CDC enabled and the required jobs are running.
+
+   **Recommended approach:** Create an automated script or SQL Agent job that runs
+   on each potential primary after a role change:
+
+   ```sql
+   USE YourDatabase;
+
+   -- Enable CDC if not already enabled
+   IF NOT EXISTS (SELECT 1 FROM sys.databases WHERE name = 'YourDatabase' AND is_cdc_enabled = 1)
+   BEGIN
+       EXEC sys.sp_cdc_enable_db;
+   END
+
+   -- Enable CDC on tables (if not already enabled)
+   IF NOT EXISTS (SELECT 1 FROM cdc.change_tables WHERE source_object_id = OBJECT_ID('schema.table_name'))
+   BEGIN
+       EXEC sys.sp_cdc_enable_table
+           @source_schema = 'schema',
+           @source_name = 'table_name',
+           @role_name = NULL;
+   END
+
+   -- Create capture job if it doesn't exist
+   IF NOT EXISTS (SELECT 1 FROM msdb.dbo.cdc_jobs WHERE job_type = 'capture')
+   BEGIN
+       EXEC sys.sp_cdc_add_job @job_type = 'capture';
+   END
+
+   -- Create cleanup job if it doesn't exist
+   IF NOT EXISTS (SELECT 1 FROM msdb.dbo.cdc_jobs WHERE job_type = 'cleanup')
+   BEGIN
+       EXEC sys.sp_cdc_add_job @job_type = 'cleanup';
+       -- Extend retention to cover expected failover + recovery time
+       EXEC sys.sp_cdc_change_job @job_type = 'cleanup', @retention = 43200;
+   END
+   ```
+
+   {{< note >}}
+   Adjust the `@retention` value based on your expected recovery time. The default
+   retention is ~3 days (4320 minutes). If CDC change data is pruned before
+   Materialize can ingest it after a failover, you must [drop and recreate the
+   source](/sql/drop-source/) to trigger a new snapshot.
+   {{< /note >}}
+
+#### Connecting to an AG listener
+
+Create your SQL Server connection using the **AG listener** as the host:
+
+```mzsql
+CREATE SECRET sqlserver_pass AS '';
+
+CREATE CONNECTION sqlserver_ag TO SQL SERVER (
+    HOST 'my-ag-listener.example.com',  -- AG listener DNS name
+    PORT 1433,
+    USER 'materialize',
+    PASSWORD SECRET sqlserver_pass,
+    DATABASE ''
+);
+
+CREATE SOURCE mz_source
+  FROM SQL SERVER CONNECTION sqlserver_ag
+  FOR ALL TABLES;
+```
+
+When the AG fails over to a new primary, Materialize will:
+
+1. Detect the dropped connection
+1. Reconnect to the AG listener (now pointing to the new primary)
+1. Resume ingestion from the last persisted LSN
+
 ## Considerations
 
 {{% include-md file="shared-content/sql-server-considerations.md" %}}

From 116048450d0d7cc9216fd48cd39cedffae5447dc Mon Sep 17 00:00:00 2001
From: Seth Wiesman
Date: Wed, 1 Oct 2025 15:38:48 -0500
Subject: [PATCH 2/6] address feedback

---
 doc/user/content/ingest-data/sql-server/self-hosted.md | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/doc/user/content/ingest-data/sql-server/self-hosted.md b/doc/user/content/ingest-data/sql-server/self-hosted.md
index 5d711d839748f..cfc8b0843c6f9 100644
--- a/doc/user/content/ingest-data/sql-server/self-hosted.md
+++ b/doc/user/content/ingest-data/sql-server/self-hosted.md
@@ -241,7 +241,10 @@ scenarios, we recommend separating your workloads into multiple clusters for
 {{< note >}}
 For a new SQL Server source, if none of the replicating tables
 are receiving write queries, snapshotting may take up to an additional 5 minutes
-to complete. For details, see [snapshot latency for inactive databases](#snapshot-latency-for-inactive-databases)
+to complete. For details, see [snapshot latency for inactive databases](#snapshot-latency-for-inactive-databases).
+
+For production deployments with SQL Server Always On Availability Groups, see
+[High Availability](#high-availability) for configuration guidance.
 {{</ note >}}
 
 Now that you've configured your database network, you can connect Materialize to
@@ -322,13 +325,14 @@ Before connecting Materialize to an AG, ensure:
        EXEC sys.sp_cdc_enable_table
            @source_schema = 'schema',
            @source_name = 'table_name',
-           @role_name = NULL;
+           @role_name = NULL,
+           @supports_net_changes = 0;
    END
 
    -- Create capture job if it doesn't exist
    IF NOT EXISTS (SELECT 1 FROM msdb.dbo.cdc_jobs WHERE job_type = 'capture')
    BEGIN
-       EXEC sys.sp_cdc_add_job @job_type = 'capture';
+       EXEC sys.sp_cdc_add_job @job_type = 'capture', @continuous = 1;
    END
 
    -- Create cleanup job if it doesn't exist

From 85ead5227cbe5751e043583f801cacb33d35cad2 Mon Sep 17 00:00:00 2001
From: Seth Wiesman
Date: Wed, 1 Oct 2025 15:45:59 -0500
Subject: [PATCH 3/6] add warning about async mode

---
 .../content/ingest-data/sql-server/self-hosted.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/doc/user/content/ingest-data/sql-server/self-hosted.md b/doc/user/content/ingest-data/sql-server/self-hosted.md
index cfc8b0843c6f9..62a142b37e7c3 100644
--- a/doc/user/content/ingest-data/sql-server/self-hosted.md
+++ b/doc/user/content/ingest-data/sql-server/self-hosted.md
@@ -289,6 +289,17 @@ listener](https://learn.microsoft.com/en-us/sql/database-engine/availability-gro
 When a failover occurs, SQL Server drops the existing connection and routes new
 connections to the new primary replica transparently.
 
+{{< warning >}}
+**Availability modes and data consistency:** SQL Server AGs support two
+[availability modes](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/overview-of-always-on-availability-groups-sql-server?view=sql-server-ver17#availability-modes):
+synchronous-commit and asynchronous-commit.
+
+With **asynchronous-commit mode**, transactions commit on the primary before
+being sent to secondaries. If the primary fails before replicating recent
+transactions, those changes will be lost and **Materialize will not ingest
+them**. For guaranteed data consistency, use **synchronous-commit mode**.
+{{< /warning >}}
+
 #### Prerequisites
 
 Before connecting Materialize to an AG, ensure:

From 53a82fec039e558cb6198ab0b7879196a1075fe7 Mon Sep 17 00:00:00 2001
From: Seth Wiesman
Date: Wed, 1 Oct 2025 15:49:16 -0500
Subject: [PATCH 4/6] fix formatting

---
 doc/user/content/ingest-data/sql-server/self-hosted.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/doc/user/content/ingest-data/sql-server/self-hosted.md b/doc/user/content/ingest-data/sql-server/self-hosted.md
index 62a142b37e7c3..bb00aa9a73939 100644
--- a/doc/user/content/ingest-data/sql-server/self-hosted.md
+++ b/doc/user/content/ingest-data/sql-server/self-hosted.md
@@ -281,8 +281,6 @@ available(also for PostgreSQL)."
 
 ## High Availability
 
-### Using SQL Server Always On Availability Groups
-
 To make your SQL Server source resilient to database failovers, configure
 Materialize to connect through a SQL Server [Always On Availability Group (AG)
 listener](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/listeners-client-connectivity-application-failover).

From 6ec8bcc9dc5c22439f3b630aef32421637c30040 Mon Sep 17 00:00:00 2001
From: Seth Wiesman
Date: Fri, 3 Oct 2025 14:50:07 -0500
Subject: [PATCH 5/6] [docs] link to best practices

---
 doc/user/content/ingest-data/sql-server/self-hosted.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/user/content/ingest-data/sql-server/self-hosted.md b/doc/user/content/ingest-data/sql-server/self-hosted.md
index bb00aa9a73939..65744b04cb409 100644
--- a/doc/user/content/ingest-data/sql-server/self-hosted.md
+++ b/doc/user/content/ingest-data/sql-server/self-hosted.md
@@ -296,6 +296,9 @@ With **asynchronous-commit mode**, transactions commit on the primary before
 being sent to secondaries. If the primary fails before replicating recent
 transactions, those changes will be lost and **Materialize will not ingest
 them**. For guaranteed data consistency, use **synchronous-commit mode**.
+
+For additional best practices on configuring CDC with availability groups, see
+[Microsoft's documentation on replication agents with availability groups](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/replicate-track-change-data-capture-always-on-availability?view=sql-server-ver17#general-changes-to-replication-agents-to-support-availability-groups).
 {{< /warning >}}
 
 #### Prerequisites

From 52d666f1ff61971bb4ff9c29c92bc3f74af3ae35 Mon Sep 17 00:00:00 2001
From: Seth Wiesman
Date: Mon, 6 Oct 2025 15:55:33 -0500
Subject: [PATCH 6/6] Update doc/user/content/ingest-data/sql-server/self-hosted.md

Co-authored-by: Kay Kim
---
 doc/user/content/ingest-data/sql-server/self-hosted.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/doc/user/content/ingest-data/sql-server/self-hosted.md b/doc/user/content/ingest-data/sql-server/self-hosted.md
index 65744b04cb409..8f8e447f0e2a4 100644
--- a/doc/user/content/ingest-data/sql-server/self-hosted.md
+++ b/doc/user/content/ingest-data/sql-server/self-hosted.md
@@ -288,15 +288,17 @@ When a failover occurs, SQL Server drops the existing connection and routes new
 connections to the new primary replica transparently.
 
 {{< warning >}}
-**Availability modes and data consistency:** SQL Server AGs support two
+SQL Server AGs support two
 [availability modes](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/overview-of-always-on-availability-groups-sql-server?view=sql-server-ver17#availability-modes):
-synchronous-commit and asynchronous-commit.
 
-With **asynchronous-commit mode**, transactions commit on the primary before
+- **Asynchronous-commit mode**: Does not guarantee data consistency. Transactions commit on the primary before
 being sent to secondaries. If the primary fails before replicating recent
 transactions, those changes will be lost and **Materialize will not ingest
-them**. For guaranteed data consistency, use **synchronous-commit mode**.
+them**.
+- **Synchronous-commit mode**: Guarantees data consistency.
+
+For guaranteed data consistency, use **synchronous-commit mode**.
 
 For additional best practices on configuring CDC with availability groups, see
 [Microsoft's documentation on replication agents with availability groups](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/replicate-track-change-data-capture-always-on-availability?view=sql-server-ver17#general-changes-to-replication-agents-to-support-availability-groups).
 {{< /warning >}}
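
Reviewer note (not part of the patch series): the prerequisites in this series ask that CDC and its jobs be present on whichever replica becomes primary. A minimal T-SQL sketch of a post-failover check might look like the following. It reuses the `YourDatabase` placeholder from the patch. Filtering `msdb.dbo.cdc_jobs` by `database_id` is an assumption made here for instances that host more than one CDC-enabled database; the patch itself does not filter that way.

```sql
USE YourDatabase;

-- Reports 1 when this replica currently hosts the primary copy of the database.
-- sys.fn_hadr_is_primary_replica is available in SQL Server 2014 and later.
SELECT sys.fn_hadr_is_primary_replica(DB_NAME()) AS is_primary;

-- CDC flag for the current database.
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE database_id = DB_ID();

-- CDC jobs registered for the current database only (assumption: scoped by
-- database_id so jobs belonging to other databases are not counted).
SELECT job_type, retention, continuous
FROM msdb.dbo.cdc_jobs
WHERE database_id = DB_ID();
```

The `retention` column is expressed in minutes, so the `@retention = 43200` used in the patch corresponds to 30 days (43200 / 1440), compared with the default of 4320 minutes (3 days).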