diff --git a/packages/microsoft_sqlserver/_dev/build/docs/README.md b/packages/microsoft_sqlserver/_dev/build/docs/README.md index c98fbc5782f..ec25286fdd7 100644 --- a/packages/microsoft_sqlserver/_dev/build/docs/README.md +++ b/packages/microsoft_sqlserver/_dev/build/docs/README.md @@ -19,6 +19,7 @@ Find more details in [Logs](#logs). * `performance`: Comprehensive performance counters and objects available on the server. * `transaction_log`: Usage statistics and space utilization metrics for transaction logs. +* `availability_groups`: Health and synchronization metrics for Always On Availability Groups. Find more details in [Metrics](#metrics). @@ -39,6 +40,9 @@ If you browse Microsoft Developer Network (MSDN) for the following tables, you w - [sys.dm_db_log_stats (DB_ID)](https://learn.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-db-log-stats-transact-sql?view=sql-server-ver16) (Available on SQL Server (MSSQL) 2016 (13.x) SP 2 and later) 2. `performance`: - [sys.dm_os_performance_counters](https://learn.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-os-performance-counters-transact-sql?view=sql-server-ver16) +3. `availability_groups`: + - [sys.availability_groups](https://learn.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-availability-groups-transact-sql?view=sql-server-ver16) + - [sys.dm_hadr_availability_group_states](https://learn.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-hadr-availability-group-states-transact-sql) Please make sure the user has the permissions to system as well as user-defined databases. For the particular user used in the integration, the following requirements are met: @@ -85,8 +89,6 @@ As part of the input configuration, you need to provide the user name, password * `host/instance_name` (e.g. `localhost/namedinstance_01`) * `host:named_instance_port` (e.g. 
`localhost:60873`) - - ### Configuration #### Audit @@ -137,6 +139,22 @@ Keep in mind that this feature is disabled by default and needs to be manually e When the password contains special characters, pass these special characters using URL encoding. +### Availability Groups Metrics + +Collects metrics related to Always On Availability Groups, including the current primary replica, synchronization health, and recovery health. This dataset queries the following SQL Server system views: + +- `sys.availability_groups` +- `sys.dm_hadr_availability_group_states` + +**Note:** Always On Availability Groups must be enabled on your SQL Server instance for this dataset to collect metrics. This feature is available in SQL Server Enterprise and Standard editions (with limitations in Standard). + +**Prerequisites**: To collect Availability Groups metrics, ensure the following: + +1. The Always On Availability Groups feature is enabled on the SQL Server instance. +2. The user account configured for the integration has `VIEW SERVER STATE` and `VIEW ANY DEFINITION` permissions. *See also the [Microsoft SQL Server permissions](#microsoft-sql-server-permissions) section*. + +Read more in [Monitor Availability Groups](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/monitoring-of-availability-groups-sql-server?view=sql-server-ver16) and [Always On Availability Groups](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/overview-of-always-on-availability-groups-sql-server?view=sql-server-ver16) overview. + ## Logs ### audit @@ -187,5 +205,17 @@ Please refer to the following [document](https://www.elastic.co/guide/en/ecs/cur {{fields "transaction_log"}} +### availability_groups + +The Microsoft SQL Server `availability_groups` dataset provides metrics from the Always On Availability Groups DMVs (Dynamic Management Views). All `availability_groups` metrics are available in the `mssql.metrics` field group. 
+ +{{event "availability_groups"}} + +**ECS Field Reference** + +Please refer to the following [document](https://www.elastic.co/guide/en/ecs/current/ecs-field-reference.html) for detailed information on ECS fields. + +{{fields "availability_groups"}} + ## Alerting Rule Template {{alertRuleTemplates}} \ No newline at end of file diff --git a/packages/microsoft_sqlserver/changelog.yml b/packages/microsoft_sqlserver/changelog.yml index 9bd966d9a22..e11a075f23d 100644 --- a/packages/microsoft_sqlserver/changelog.yml +++ b/packages/microsoft_sqlserver/changelog.yml @@ -1,4 +1,9 @@ # newer versions go on top +- version: "2.15.1" + changes: + - description: Add health metrics for Always On Availability Groups. + type: enhancement + link: https://github.com/elastic/integrations/pull/16759 - version: "2.15.0" changes: - description: Update README with Alerting Rule Template. diff --git a/packages/microsoft_sqlserver/data_stream/availability_groups/agent/stream/stream.yml.hbs b/packages/microsoft_sqlserver/data_stream/availability_groups/agent/stream/stream.yml.hbs new file mode 100644 index 00000000000..763572bdf63 --- /dev/null +++ b/packages/microsoft_sqlserver/data_stream/availability_groups/agent/stream/stream.yml.hbs @@ -0,0 +1,31 @@ +metricsets: ["query"] +# Specify hosts in the below format. TODO: hosts need to be updated to support multiple entries. 
+hosts: + - sqlserver://{{username}}:{{password}}@{{hosts}} +period: {{period}} +raw_data.enabled: true +merge_results: false +driver: "mssql" +sql_queries: + # Availability groups metrics + - query: "SELECT + @@servername AS server_name, + ag.name, + CONVERT(NVARCHAR(36), ag.group_id) AS group_id, + ags.primary_replica, + ags.synchronization_health, + ags.synchronization_health_desc, + ags.primary_recovery_health, + ags.secondary_recovery_health +FROM sys.dm_hadr_availability_group_states ags +JOIN sys.availability_groups ag + ON ags.group_id = ag.group_id;" + response_format: table +{{#if processors}} +processors: +{{processors}} +{{/if}} +tags: +{{#if preserve_sql_queries}} +- preserve_sql_queries +{{/if}} \ No newline at end of file diff --git a/packages/microsoft_sqlserver/data_stream/availability_groups/elasticsearch/ingest_pipeline/default.yml b/packages/microsoft_sqlserver/data_stream/availability_groups/elasticsearch/ingest_pipeline/default.yml new file mode 100644 index 00000000000..ac019cf147e --- /dev/null +++ b/packages/microsoft_sqlserver/data_stream/availability_groups/elasticsearch/ingest_pipeline/default.yml @@ -0,0 +1,21 @@ +--- +description: Pipeline for processing mssql always on availability group metrics +processors: +- remove: + field: sql.driver + ignore_missing: true + ignore_failure: true +- rename: + field: sql + target_field: mssql + ignore_missing: true + ignore_failure: true +- remove: + field: mssql.query + if: "ctx?.tags == null || !(ctx.tags.contains('preserve_sql_queries'))" + ignore_missing: true + ignore_failure: true +on_failure: +- set: + field: error.message + value: "{{ _ingest.on_failure_message }}" diff --git a/packages/microsoft_sqlserver/data_stream/availability_groups/fields/agent.yml b/packages/microsoft_sqlserver/data_stream/availability_groups/fields/agent.yml new file mode 100644 index 00000000000..2bc58530bac --- /dev/null +++ b/packages/microsoft_sqlserver/data_stream/availability_groups/fields/agent.yml @@ -0,0 
+1,33 @@ +- name: cloud + title: Cloud + group: 2 + description: Fields related to the cloud or infrastructure the events are coming from. + footnote: 'Examples: If Metricbeat is running on an EC2 host and fetches data from its host, the cloud info contains the data about this machine. If Metricbeat runs on a remote machine outside the cloud and fetches data from a service running in the cloud, the field contains cloud data from the machine the service is running on.' + type: group + fields: + - name: image.id + type: keyword + description: Image ID for the cloud instance. +- name: host + title: Host + group: 2 + description: 'A host is defined as a general computing instance. ECS host.* fields should be populated with details about the host on which the event happened, or from which the measurement was taken. Host types include hardware, virtual machines, Docker containers, and Kubernetes nodes.' + type: group + fields: + - name: containerized + type: boolean + description: > + If the host is a container. + + - name: os.build + type: keyword + example: "18D109" + description: > + OS build information. + + - name: os.codename + type: keyword + example: "stretch" + description: > + OS codename, if any. + diff --git a/packages/microsoft_sqlserver/data_stream/availability_groups/fields/base-fields.yml b/packages/microsoft_sqlserver/data_stream/availability_groups/fields/base-fields.yml new file mode 100644 index 00000000000..7c798f4534c --- /dev/null +++ b/packages/microsoft_sqlserver/data_stream/availability_groups/fields/base-fields.yml @@ -0,0 +1,12 @@ +- name: data_stream.type + type: constant_keyword + description: Data stream type. +- name: data_stream.dataset + type: constant_keyword + description: Data stream dataset. +- name: data_stream.namespace + type: constant_keyword + description: Data stream namespace. +- name: '@timestamp' + type: date + description: Event timestamp. 
diff --git a/packages/microsoft_sqlserver/data_stream/availability_groups/fields/ecs.yml b/packages/microsoft_sqlserver/data_stream/availability_groups/fields/ecs.yml new file mode 100644 index 00000000000..d0a842204dc --- /dev/null +++ b/packages/microsoft_sqlserver/data_stream/availability_groups/fields/ecs.yml @@ -0,0 +1,27 @@ +- external: ecs + name: service.address + dimension: true +- external: ecs + name: host.name + dimension: true +- external: ecs + name: agent.id + dimension: true +- external: ecs + name: cloud.instance.id + dimension: true +- external: ecs + name: cloud.provider + dimension: true +- external: ecs + name: container.id + dimension: true +- external: ecs + name: cloud.account.id + dimension: true +- external: ecs + name: cloud.region + dimension: true +- external: ecs + name: cloud.availability_zone + dimension: true diff --git a/packages/microsoft_sqlserver/data_stream/availability_groups/fields/fields.yml b/packages/microsoft_sqlserver/data_stream/availability_groups/fields/fields.yml new file mode 100644 index 00000000000..f55639c2824 --- /dev/null +++ b/packages/microsoft_sqlserver/data_stream/availability_groups/fields/fields.yml @@ -0,0 +1,33 @@ +- name: mssql + type: group + fields: + - name: metrics + type: group + fields: + - name: server_name + type: keyword + description: SQL Server instance name where metrics were collected. + - name: name + type: keyword + description: Availability group name. + - name: group_id + type: keyword + description: Unique identifier (GUID) of the availability group. + - name: primary_replica + type: keyword + description: Server name of the current primary replica. + - name: synchronization_health + type: keyword + description: AG synchronization health status (0 = NOT_HEALTHY, 1 = PARTIALLY_HEALTHY, 2 = HEALTHY). + - name: synchronization_health_desc + type: keyword + description: Text description of AG synchronization health. 
+ - name: primary_recovery_health + type: keyword + description: Primary replica recovery health (0 = ONLINE_IN_PROGRESS, 1 = ONLINE. NULL on secondary replicas). + - name: secondary_recovery_health + type: keyword + description: Secondary replica recovery health (0 = ONLINE_IN_PROGRESS, 1 = ONLINE. NULL on primary replicas). + - name: query + type: keyword + description: The SQL queries executed. \ No newline at end of file diff --git a/packages/microsoft_sqlserver/data_stream/availability_groups/manifest.yml b/packages/microsoft_sqlserver/data_stream/availability_groups/manifest.yml new file mode 100644 index 00000000000..7dfa89ab035 --- /dev/null +++ b/packages/microsoft_sqlserver/data_stream/availability_groups/manifest.yml @@ -0,0 +1,33 @@ +title: "Microsoft SQL Server Always On Availability Groups metrics" +type: metrics +streams: + - input: sql/metrics + vars: + - name: period + type: text + title: Period + multi: false + required: true + show_user: true + default: 5m + - name: preserve_sql_queries + required: true + show_user: false + title: Preserve SQL Queries + description: Preserves SQL queries for debugging purposes. This feature is available in Elastic stack version 8.18 and later. + type: bool + multi: false + default: false + - name: processors + type: yaml + title: Processors + multi: false + required: false + show_user: false + description: > + Processors are used to reduce the number of fields in the exported event or to enhance the event with metadata. This executes in the agent before the events are shipped. See [Processors](https://www.elastic.co/guide/en/fleet/current/elastic-agent-processor-configuration.html) for details. + + title: Microsoft SQL Server Always On Availability Groups metrics + description: Collect Microsoft SQL Server Always On Availability Groups metrics. Monitors overall AG health, synchronization status, and primary/secondary recovery state. 
+elasticsearch: + index_mode: "time_series" diff --git a/packages/microsoft_sqlserver/data_stream/availability_groups/sample_event.json b/packages/microsoft_sqlserver/data_stream/availability_groups/sample_event.json new file mode 100644 index 00000000000..71e43a4df23 --- /dev/null +++ b/packages/microsoft_sqlserver/data_stream/availability_groups/sample_event.json @@ -0,0 +1,72 @@ +{ + "@timestamp": "2026-01-09T13:26:41.427Z", + "agent": { + "ephemeral_id": "9879229c-2cb1-4082-a6e7-289de62de193", + "id": "819a1c28-7fc6-4490-8456-c04d37abce3d", + "name": "elastic-agent-12312", + "type": "metricbeat", + "version": "8.19.8" + }, + "data_stream": { + "dataset": "microsoft_sqlserver.availability_groups", + "namespace": "default", + "type": "metrics" + }, + "ecs": { + "version": "8.0.0" + }, + "elastic_agent": { + "id": "819a1c28-7fc6-4490-8456-c04d37abce3d", + "snapshot": false, + "version": "8.19.8" + }, + "event": { + "agent_id_status": "verified", + "dataset": "microsoft_sqlserver.availability_groups", + "duration": 797064208, + "ingested": "2026-01-09T13:26:43Z", + "module": "sql" + }, + "host": { + "architecture": "arm64", + "hostname": "localhost", + "id": "11111-8ABC-59B6-95BC-11111111", + "ip": [ + "192.168.242.2", + "192.168.255.6" + ], + "mac": [ + "02-42-C0-A8-F2-02", + "02-42-C0-A8-FF-06" + ], + "name": "elastic-agent-12312", + "os": { + "build": "24G84", + "family": "darwin", + "kernel": "24.6.0", + "name": "macOS", + "platform": "darwin", + "type": "macos", + "version": "15.6" + } + }, + "metricset": { + "name": "query", + "period": 300000 + }, + "mssql": { + "metrics": { + "group_id": "13495A5F-460C-4D93-BFA2-477E9F555A5A", + "name": "ag_test", + "primary_recovery_health": 1, + "primary_replica": "myVm-1", + "server_name": "myVm-1", + "synchronization_health": 2, + "synchronization_health_desc": "HEALTHY" + } + }, + "service": { + "address": "microsoft_sqlserver", + "type": "sql" + } +} \ No newline at end of file diff --git 
a/packages/microsoft_sqlserver/docs/README.md b/packages/microsoft_sqlserver/docs/README.md index 8186e870a5c..89cfee4a81a 100644 --- a/packages/microsoft_sqlserver/docs/README.md +++ b/packages/microsoft_sqlserver/docs/README.md @@ -19,6 +19,7 @@ Find more details in [Logs](#logs). * `performance`: Comprehensive performance counters and objects available on the server. * `transaction_log`: Usage statistics and space utilization metrics for transaction logs. +* `availability_groups`: Health and synchronization metrics for Always On Availability Groups. Find more details in [Metrics](#metrics). @@ -39,6 +40,9 @@ If you browse Microsoft Developer Network (MSDN) for the following tables, you w - [sys.dm_db_log_stats (DB_ID)](https://learn.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-db-log-stats-transact-sql?view=sql-server-ver16) (Available on SQL Server (MSSQL) 2016 (13.x) SP 2 and later) 2. `performance`: - [sys.dm_os_performance_counters](https://learn.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-os-performance-counters-transact-sql?view=sql-server-ver16) +3. `availability_groups`: + - [sys.availability_groups](https://learn.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-availability-groups-transact-sql?view=sql-server-ver16) + - [sys.dm_hadr_availability_group_states](https://learn.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-hadr-availability-group-states-transact-sql) Please make sure the user has the permissions to system as well as user-defined databases. For the particular user used in the integration, the following requirements are met: @@ -85,8 +89,6 @@ As part of the input configuration, you need to provide the user name, password * `host/instance_name` (e.g. `localhost/namedinstance_01`) * `host:named_instance_port` (e.g. 
`localhost:60873`) - - ### Configuration #### Audit @@ -137,6 +139,22 @@ Keep in mind that this feature is disabled by default and needs to be manually e When the password contains special characters, pass these special characters using URL encoding. +### Availability Groups Metrics + +Collects metrics related to Always On Availability Groups, including the current primary replica, synchronization health, and recovery health. This dataset queries the following SQL Server system views: + +- `sys.availability_groups` +- `sys.dm_hadr_availability_group_states` + +**Note:** Always On Availability Groups must be enabled on your SQL Server instance for this dataset to collect metrics. This feature is available in SQL Server Enterprise and Standard editions (with limitations in Standard). + +**Prerequisites**: To collect Availability Groups metrics, ensure the following: + +1. The Always On Availability Groups feature is enabled on the SQL Server instance. +2. The user account configured for the integration has `VIEW SERVER STATE` and `VIEW ANY DEFINITION` permissions. *See also the [Microsoft SQL Server permissions](#microsoft-sql-server-permissions) section*. + +Read more in [Monitor Availability Groups](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/monitoring-of-availability-groups-sql-server?view=sql-server-ver16) and [Always On Availability Groups](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/overview-of-always-on-availability-groups-sql-server?view=sql-server-ver16) overview. + ## Logs ### audit @@ -625,6 +643,123 @@ Please refer to the following [document](https://www.elastic.co/guide/en/ecs/cur | service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). 
| keyword | | | +### availability_groups + +The Microsoft SQL Server `availability_groups` dataset provides metrics from the Always On Availability Groups DMVs (Dynamic Management Views). All `availability_groups` metrics are available in the `mssql.metrics` field group. + +An example event for `availability_groups` looks as following: + +```json +{ + "@timestamp": "2026-01-09T13:26:41.427Z", + "agent": { + "ephemeral_id": "9879229c-2cb1-4082-a6e7-289de62de193", + "id": "819a1c28-7fc6-4490-8456-c04d37abce3d", + "name": "elastic-agent-12312", + "type": "metricbeat", + "version": "8.19.8" + }, + "data_stream": { + "dataset": "microsoft_sqlserver.availability_groups", + "namespace": "default", + "type": "metrics" + }, + "ecs": { + "version": "8.0.0" + }, + "elastic_agent": { + "id": "819a1c28-7fc6-4490-8456-c04d37abce3d", + "snapshot": false, + "version": "8.19.8" + }, + "event": { + "agent_id_status": "verified", + "dataset": "microsoft_sqlserver.availability_groups", + "duration": 797064208, + "ingested": "2026-01-09T13:26:43Z", + "module": "sql" + }, + "host": { + "architecture": "arm64", + "hostname": "localhost", + "id": "11111-8ABC-59B6-95BC-11111111", + "ip": [ + "192.168.242.2", + "192.168.255.6" + ], + "mac": [ + "02-42-C0-A8-F2-02", + "02-42-C0-A8-FF-06" + ], + "name": "elastic-agent-12312", + "os": { + "build": "24G84", + "family": "darwin", + "kernel": "24.6.0", + "name": "macOS", + "platform": "darwin", + "type": "macos", + "version": "15.6" + } + }, + "metricset": { + "name": "query", + "period": 300000 + }, + "mssql": { + "metrics": { + "group_id": "13495A5F-460C-4D93-BFA2-477E9F555A5A", + "name": "ag_test", + "primary_recovery_health": 1, + "primary_replica": "myVm-1", + "server_name": "myVm-1", + "synchronization_health": 2, + "synchronization_health_desc": "HEALTHY" + } + }, + "service": { + "address": "microsoft_sqlserver", + "type": "sql" + } +} +``` + +**ECS Field Reference** + +Please refer to the following 
[document](https://www.elastic.co/guide/en/ecs/current/ecs-field-reference.html) for detailed information on ECS fields. + +**Exported fields** + +| Field | Description | Type | +|---|---|---| +| @timestamp | Event timestamp. | date | +| agent.id | Unique identifier of this agent (if one exists). Example: For Beats this would be beat.id. | keyword | +| cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier. | keyword | +| cloud.availability_zone | Availability zone in which this host, resource, or service is located. | keyword | +| cloud.image.id | Image ID for the cloud instance. | keyword | +| cloud.instance.id | Instance ID of the host machine. | keyword | +| cloud.provider | Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean. | keyword | +| cloud.region | Region in which this host, resource, or service is located. | keyword | +| container.id | Unique container id. | keyword | +| data_stream.dataset | Data stream dataset. | constant_keyword | +| data_stream.namespace | Data stream namespace. | constant_keyword | +| data_stream.type | Data stream type. | constant_keyword | +| host.containerized | If the host is a container. | boolean | +| host.name | Name of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name (FQDN), or a name specified by the user. The recommended value is the lowercase FQDN of the host. | keyword | +| host.os.build | OS build information. | keyword | +| host.os.codename | OS codename, if any. | keyword | +| mssql.metrics.group_id | Unique identifier (GUID) of the availability group. | keyword | +| mssql.metrics.name | Availability group name. | keyword | +| mssql.metrics.primary_recovery_health | Primary replica recovery health (0 = ONLINE_IN_PROGRESS, 1 = ONLINE. NULL on secondary replicas). 
| keyword | +| mssql.metrics.primary_replica | Server name of the current primary replica. | keyword | +| mssql.metrics.secondary_recovery_health | Secondary replica recovery health (0 = ONLINE_IN_PROGRESS, 1 = ONLINE. NULL on primary replicas). | keyword | +| mssql.metrics.server_name | SQL Server instance name where metrics were collected. | keyword | +| mssql.metrics.synchronization_health | AG synchronization health status (0 = NOT_HEALTHY, 1 = PARTIALLY_HEALTHY, 2 = HEALTHY). | keyword | +| mssql.metrics.synchronization_health_desc | Text description of AG synchronization health. | keyword | +| mssql.query | The SQL queries executed. | keyword | +| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword | + + ## Alerting Rule Template Alert rule templates provide pre-defined configurations for creating alert rules in Kibana. diff --git a/packages/microsoft_sqlserver/manifest.yml b/packages/microsoft_sqlserver/manifest.yml index f90155473d3..9c4299c9aaf 100644 --- a/packages/microsoft_sqlserver/manifest.yml +++ b/packages/microsoft_sqlserver/manifest.yml @@ -1,7 +1,7 @@ format_version: "3.4.0" name: microsoft_sqlserver title: "Microsoft SQL Server" -version: "2.15.0" +version: "2.15.1" description: Collect events from Microsoft SQL Server with Elastic Agent type: integration categories:
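
Reviewer note: the numeric health codes documented in `fields.yml` for this data stream can be decoded on the consumer side. The sketch below is only an illustration and is not part of the integration; `describe_ag_health`, `SYNC_HEALTH`, and `RECOVERY_HEALTH` are hypothetical names, and the code tables are copied from the field descriptions in this change. It assumes an event shaped like `sample_event.json`, with codes arriving either as numbers or as numeric strings.

```python
# Hypothetical helper, not shipped with the package: decodes the health codes
# described in fields.yml for an availability_groups event.
SYNC_HEALTH = {0: "NOT_HEALTHY", 1: "PARTIALLY_HEALTHY", 2: "HEALTHY"}
RECOVERY_HEALTH = {0: "ONLINE_IN_PROGRESS", 1: "ONLINE"}


def describe_ag_health(event):
    """Return a readable summary of the mssql.metrics health fields."""
    metrics = event.get("mssql", {}).get("metrics", {})

    def decode(table, key):
        raw = metrics.get(key)
        if raw is None:  # column was NULL, so the field is absent
            return None
        # Codes may be numbers or numeric strings (the mapping is keyword).
        return table.get(int(raw), "UNKNOWN")

    return {
        "name": metrics.get("name"),
        "synchronization_health": decode(SYNC_HEALTH, "synchronization_health"),
        "primary_recovery_health": decode(RECOVERY_HEALTH, "primary_recovery_health"),
        "secondary_recovery_health": decode(RECOVERY_HEALTH, "secondary_recovery_health"),
    }


# Trimmed copy of the mssql.metrics block from sample_event.json.
sample = {
    "mssql": {
        "metrics": {
            "name": "ag_test",
            "primary_recovery_health": 1,
            "synchronization_health": 2,
            "synchronization_health_desc": "HEALTHY",
        }
    }
}
summary = describe_ag_health(sample)
```

In the sample event `secondary_recovery_health` is absent, which matches the field description (NULL on primary replicas), so the helper reports it as `None` rather than guessing a state.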