Microsoft SQL

Microsoft SQL

Available solutions




Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/db/mssql_odbc


Template DB MSSQL by ODBC

Overview

For Zabbix version: 5.0
The template is developed for monitoring DBMS Microsoft SQL Server via ODBC.

This template was tested on:

  • Zabbix, version 5.0
  • Microsoft SQL, version 2017, 2019

Setup

See Zabbix template operation for basic instructions.

  1. Create an MSSQL user for monitoring.
    View Server State and View Any Definition permissions should be granted to the user.
    For more information please see the MSSQL documentation:
    https://docs.microsoft.com/en-us/sql/relational-databases/security/authentication-access/create-a-database-user?view=sql-server-ver15
    https://docs.microsoft.com/en-us/sql/t-sql/statements/grant-server-permissions-transact-sql?view=sql-server-ver15
  2. Set the user name and password in host macros ({$MSSQL.USER} and {$MSSQL.PASSWORD}).
    Do not forget to install the Microsoft ODBC driver on the Zabbix server or the Zabbix proxy.
    See Microsoft documentation for instructions: https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server?view=sql-server-ver15.
    Note! Credentials in the odbc.ini do not work for MSSQL.
  3. For named instance set the value of {$MSSQL.INSTANCE} macro as MSSQL$instance name, please.

The "Service's TCP port state" item uses {HOST.CONN} and {$MSSQL.PORT} macros to check the availability of the MSSQL instance. If you have instance on non-default TCP port, set the port in your section of odbc.ini in line Server = IP or FQDN name, port.

Zabbix configuration

No specific Zabbix configuration is required.

Macros used

Name Description Default
{$MSSQL.AVERAGE_WAIT_TIME.MAX}

The maximum average wait time in ms for trigger expression.

500
{$MSSQL.BUFFER_CACHE_RATIO.MIN.CRIT}

The minimum % buffer cache hit ratio for high trigger expression.

30
{$MSSQL.BUFFER_CACHE_RATIO.MIN.WARN}

The minimum % buffer cache hit ratio for warning trigger expression.

50
{$MSSQL.DBNAME.MATCHES}

This macro is used in database discovery. It can be overridden on the host or linked template level.

.*
{$MSSQL.DBNAME.NOT_MATCHES}

This macro is used in database discovery. It can be overridden on the host or linked template level.

master|tempdb|model|msdb
{$MSSQL.DEADLOCKS.MAX}

The maximum deadlocks per second for trigger expression.

1
{$MSSQL.DSN}

System data source name.

<Put your DSN here>
{$MSSQL.FREE_LIST_STALLS.MAX}

The maximum free list stalls per second for trigger expression.

2
{$MSSQL.INSTANCE}

The instance name for the default instance is SQLServer. For named instance set macro value as MSSQL$instance name.

SQLServer
{$MSSQL.LAZY_WRITES.MAX}

The maximum lazy writes per second for trigger expression.

20
{$MSSQL.LOCK_REQUESTS.MAX}

The maximum lock requests per second for trigger expression.

1000
{$MSSQL.LOCK_TIMEOUTS.MAX}

The maximum lock timeouts per second for trigger expression.

1
{$MSSQL.LOG_FLUSH_WAITS.MAX}

The maximum log flush waits per second for trigger expression.

1
{$MSSQL.LOG_FLUSH_WAIT_TIME.MAX}

The maximum log flush wait time in ms for trigger expression.

1
{$MSSQL.PAGE_LIFE_EXPECTANCY.MIN}

The minimum page life expectancy for trigger expression.

300
{$MSSQL.PAGE_READS.MAX}

The maximum page reads per second for trigger expression.

90
{$MSSQL.PAGE_WRITES.MAX}

The maximum page writes per second for trigger expression.

90
{$MSSQL.PASSWORD}

MSSQL user password.

<Put your password here>
{$MSSQL.PERCENT_COMPILATIONS.MAX}

The maximum percent of Transact-SQL compilations for trigger expression.

10
{$MSSQL.PERCENT_LOG_USED.MAX}

The maximum percent log used for trigger expression.

80
{$MSSQL.PERCENT_READAHEAD.MAX}

The maximum percent of pages read/sec in anticipation of use for trigger expression.

20
{$MSSQL.PERCENT_RECOMPILATIONS.MAX}

The maximum percent of Transact-SQL recompilations for trigger expression.

10
{$MSSQL.PORT}

MSSQL TCP port.

1433
{$MSSQL.USER}

MSSQL username.

<Put your username here>
{$MSSQL.WORKTABLES_FROM_CACHE_RATIO.MIN.CRIT}

The minimum % worktables from cache ratio for high trigger expression.

90
{$MSSQL.WORK_FILES.MAX}

The maximum number of work files created per second for trigger expression.

20
{$MSSQL.WORK_TABLES.MAX}

The maximum number of work tables created per second for trigger expression.

20

Template links

There are no template links in this template.

Discovery rules

Name Description Type Key and additional info
Database discovery

Scanning databases in DBMS.

ODBC db.odbc.discovery[dbname,"{$MSSQL.DSN}"]

Preprocessing:

- DISCARD_UNCHANGED_HEARTBEAT: 1d

Filter:

AND

- A: {#DBNAME} MATCHES_REGEX {$MSSQL.DBNAME.MATCHES}

- B: {#DBNAME} NOT_MATCHES_REGEX {$MSSQL.DBNAME.NOT_MATCHES}

Availability groups discovery

Discovery the existed availability groups.

ODBC db.odbc.discovery[availability_groups,"{$MSSQL.DSN}"]

Preprocessing:

- DISCARD_UNCHANGED_HEARTBEAT: 1d

Local database discovery

Discovery of the local availability databases.

ODBC db.odbc.discovery[local_db,"{$MSSQL.DSN}"]

Preprocessing:

- DISCARD_UNCHANGED_HEARTBEAT: 1d

Non-local database discovery

Discovery of the non-local (not local to the SQL Server instance) availability databases.

ODBC db.odbc.discovery[non-local_db,"{$MSSQL.DSN}"]

Preprocessing:

- DISCARD_UNCHANGED_HEARTBEAT: 1d

Replication discovery

Discovery of the database replicas.

ODBC db.odbc.discovery[replicas,"{$MSSQL.DSN}"]

Preprocessing:

- DISCARD_UNCHANGED_HEARTBEAT: 1d

Mirroring discovery

To see the row for a database other than master or tempdb, you must

either be the database owner or have at least ALTER ANY DATABASE or VIEW ANY

DATABASE server-level permission or CREATE DATABASE permission in the master

database. To see non-NULL values on a mirror database, you must be a member

of the sysadmin fixed server role.

ODBC db.odbc.discovery[mirrors,"{$MSSQL.DSN}"]

Preprocessing:

- DISCARD_UNCHANGED_HEARTBEAT: 1d

Items collected

Group Name Description Type Key and additional info
MSSQL MSSQL: Service's TCP port state

Test the availability of MS SQL Server on TCP port.

SIMPLE net.tcp.service[tcp,{HOST.CONN},{$MSSQL.PORT}]

Preprocessing:

- DISCARD_UNCHANGED_HEARTBEAT: 10m

MSSQL MSSQL: Version

MS SQL Server version.

DEPENDENT mssql.version

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}' && @.counter_name=='Version')].instance_name.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1d

MSSQL MSSQL: Uptime

MS SQL Server uptime in 'N days, hh:mm:ss' format.

DEPENDENT mssql.uptime

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}' && @.counter_name=='Uptime')].cntr_value.first()

MSSQL MSSQL: Forwarded records per second

Number of records per second fetched through forwarded record pointers.

DEPENDENT mssql.forwarded_records_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Access Methods' && @.counter_name=='Forwarded Records/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Full scans per second

Number of unrestricted full scans per second. These can be either base-table or full-index scans. Values greater than 1 or 2 indicate that we are having table / Index page scans. If we see high CPU then we need to investigate this counter, otherwise if the full scans are on small tables we can ignore this counter.

DEPENDENT mssql.full_scans_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Access Methods' && @.counter_name=='Full Scans/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Index searches per second

Number of index searches per second. These are used to start a range scan, reposition a range scan, revalidate a scan point, fetch a single index record, and search down the index to locate where to insert a new row.

DEPENDENT mssql.index_searches_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Access Methods' && @.counter_name=='Index Searches/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Page splits per second

Number of page splits per second that occur as the result of overflowing index pages.

DEPENDENT mssql.page_splits_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Access Methods' && @.counter_name=='Page Splits/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Work files created per second

Number of work files created per second. For example, work files could be used to store temporary results for hash joins and hash aggregates.

DEPENDENT mssql.workfiles_created_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Access Methods' && @.counter_name=='Workfiles Created/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Work tables created per second

Number of work tables created per second. For example, work tables could be used to store temporary results for query spool, lob variables, XML variables, and cursors.

DEPENDENT mssql.worktables_created_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Access Methods' && @.counter_name=='Worktables Created/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Table lock escalations per second

Number of times locks on a table were escalated to the TABLE or HoBT granularity.

DEPENDENT mssql.table_lock_escalations.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Access Methods' && @.counter_name=='Table Lock Escalations/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Worktables from cache ratio

Percentage of work tables created where the initial two pages of the work table were not allocated but were immediately available from the work table cache.

DEPENDENT mssql.worktables_from_cache_ratio

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Access Methods' && @.counter_name=='WorktablesFromCacheRatio')].cntr_value.first()

MSSQL MSSQL: Buffer cache hit ratio

Indicates the percentage of pages found in the buffer cache without having to read from disk. The ratio is the total number of cache hits divided by the total number of cache lookups over the last few thousand page accesses. After a long period of time, the ratio moves very little. Because reading from the cache is much less expensive than reading from disk, you want this ratio to be high. Generally, you can increase the buffer cache hit ratio by increasing the amount of memory available to SQL Server or by using the buffer pool extension feature.

DEPENDENT mssql.buffer_cache_hit_ratio

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='BufferCacheHitRatio')].cntr_value.first()

MSSQL MSSQL: Checkpoint pages per second

Indicates the number of pages flushed to disk per second by a checkpoint or other operation that require all dirty pages to be flushed.

DEPENDENT mssql.checkpoint_pages_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='Checkpoint pages/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Database pages

Indicates the number of pages in the buffer pool with database content.

DEPENDENT mssql.database_pages

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='Database pages')].cntr_value.first()

MSSQL MSSQL: Free list stalls per second

Indicates the number of requests per second that had to wait for a free page.

DEPENDENT mssql.free_list_stalls_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='Free list stalls/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Lazy writes per second

Indicates the number of buffers written per second by the buffer manager's lazy writer. The lazy writer is a system process that flushes out batches of dirty, aged buffers (buffers that contain changes that must be written back to disk before the buffer can be reused for a different page) and makes them available to user processes. The lazy writer eliminates the need to perform frequent checkpoints in order to create available buffers.

DEPENDENT mssql.lazy_writes_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='Lazy writes/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Page life expectancy

Indicates the number of seconds a page will stay in the buffer pool without references.

DEPENDENT mssql.page_life_expectancy

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='Page life expectancy')].cntr_value.first()

MSSQL MSSQL: Page lookups per second

Indicates the number of requests per second to find a page in the buffer pool.

DEPENDENT mssql.page_lookups_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='Page lookups/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Page reads per second

Indicates the number of physical database page reads that are issued per second. This statistic displays the total number of physical page reads across all databases. Because physical I/O is expensive, you may be able to minimize the cost, either by using a larger data cache, intelligent indexes, and more efficient queries, or by changing the database design.

DEPENDENT mssql.page_reads_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='Page reads/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Page writes per second

Indicates the number of physical database page writes that are issued per second.

DEPENDENT mssql.page_writes_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='Page writes/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Read-ahead pages per second

Indicates the number of pages read per second in anticipation of use.

DEPENDENT mssql.readahead_pages_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='Readahead pages/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Target pages

The optimum number of pages in the buffer pool.

DEPENDENT mssql.target_pages

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Buffer Manager' && @.counter_name=='Target pages')].cntr_value.first()

MSSQL MSSQL: Total data file size

Total size of all the data files.

DEPENDENT mssql.data_files_size

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Data File(s) Size (KB)' && @.instance_name=='_Total')].cntr_value.first()

- MULTIPLIER: 1024

MSSQL MSSQL: Total log file size

Total size of all the transaction log files.

DEPENDENT mssql.log_files_size

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log File(s) Size (KB)' && @.instance_name=='_Total')].cntr_value.first()

- MULTIPLIER: 1024

MSSQL MSSQL: Total log file used size

The cumulative used size of all the log files in the database.

DEPENDENT mssql.log_files_used_size

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log File(s) Used Size (KB)' && @.instance_name=='_Total')].cntr_value.first()

- MULTIPLIER: 1024

MSSQL MSSQL: Total transactions per second

Total number of transactions started for all databases per second.

DEPENDENT mssql.transactions_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Transactions/sec' && @.instance_name=='_Total')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Logins per second

Total number of logins started per second. This does not include pooled connections. Any value over 2 may indicate insufcient connection pooling.

DEPENDENT mssql.logins_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:General Statistics' && @.counter_name=='Logins/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Logouts per second

Total number of logout operations started per second. Any value over 2 may indicate insufcient connection pooling.

DEPENDENT mssql.logouts_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:General Statistics' && @.counter_name=='Logouts/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Number of blocked processes

Number of currently blocked processes.

DEPENDENT mssql.processes_blocked

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:General Statistics' && @.counter_name=='Processes blocked')].cntr_value.first()

MSSQL MSSQL: Number users connected

Number of users connected to MS SQL Server.

DEPENDENT mssql.user_connections

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:General Statistics' && @.counter_name=='User Connections')].cntr_value.first()

MSSQL MSSQL: Average latch wait time

Average latch wait time (in milliseconds) for latch requests that had to wait.

CALCULATED mssql.average_latch_wait_time

Expression:

(last(mssql.average_latch_wait_time_raw) - prev(mssql.average_latch_wait_time_raw)) / (last(mssql.average_latch_wait_time_base) - prev(mssql.average_latch_wait_time_base) + (last(mssql.average_latch_wait_time_base) - prev(mssql.average_latch_wait_time_base)=0))
MSSQL MSSQL: Latch waits per second

The number of latch requests that could not be granted immediately. Latches are lightweight means of holding a very transient server resource, such as an address in memory.

DEPENDENT mssql.latch_waits_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Latches' && @.counter_name=='Latch Waits/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Total latch wait Time

Total latch wait time (in milliseconds) for latch requests in the last second. This value should stay stable compared to the number of latch waits per second.

DEPENDENT mssql.total_latch_wait_time

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Latches' && @.counter_name=='Total Latch Wait Time (ms)')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Total average wait time

The average wait time, in milliseconds, for each lock request that had to wait.

CALCULATED mssql.average_wait_time

Expression:

(last(mssql.average_wait_time_raw) - prev(mssql.average_wait_time_raw)) / (last(mssql.average_wait_time_base) - prev(mssql.average_wait_time_base) + (last(mssql.average_wait_time_base) - prev(mssql.average_wait_time_base)=0))
MSSQL MSSQL: Total lock requests per second

Number of new locks and lock conversions per second requested from the lock manager.

DEPENDENT mssql.lock_requests_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Locks' && @.counter_name=='Lock Requests/sec' && @.instance_name=='_Total')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Total lock requests per second that timed out

Number of lock requests per second that timed out, including requests for NOWAIT locks.

DEPENDENT mssql.lock_timeouts_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Locks' && @.counter_name=='Lock Timeouts/sec' && @.instance_name=='_Total')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Total lock requests per second that required waiting

Number of lock requests per second that required the caller to wait.

DEPENDENT mssql.lock_waits_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Locks' && @.counter_name=='Lock Waits/sec' && @.instance_name=='_Total')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Lock wait time

Average of total wait time (in milliseconds) for locks in the last second.

DEPENDENT mssql.lock_wait_time

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Locks' && @.counter_name=='Lock Wait Time (ms)' && @.instance_name=='_Total')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Total lock requests per second that have deadlocks

Number of lock requests per second that resulted in a deadlock.

DEPENDENT mssql.number_deadlocks_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Locks' && @.counter_name=='Number of Deadlocks/sec' && @.instance_name=='_Total')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Granted Workspace Memory

Specifies the total amount of memory currently granted to executing processes, such as hash, sort, bulk copy, and index creation operations.

DEPENDENT mssql.granted_workspace_memory

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Memory Manager' && @.counter_name=='Granted Workspace Memory (KB)')].cntr_value.first()

- MULTIPLIER: 1024

MSSQL MSSQL: Maximum workspace memory

Indicates the maximum amount of memory available for executing processes, such as hash, sort, bulk copy, and index creation operations.

DEPENDENT mssql.maximum_workspace_memory

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Memory Manager' && @.counter_name=='Maximum Workspace Memory (KB)')].cntr_value.first()

- MULTIPLIER: 1024

MSSQL MSSQL: Memory grants outstanding

Specifies the total number of processes that have successfully acquired a workspace memory grant.

DEPENDENT mssql.memory_grants_outstanding

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Memory Manager' && @.counter_name=='Memory Grants Outstanding')].cntr_value.first()

MSSQL MSSQL: Memory grants pending

Specifies the total number of processes waiting for a workspace memory grant.

DEPENDENT mssql.memory_grants_pending

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Memory Manager' && @.counter_name=='Memory Grants Pending')].cntr_value.first()

MSSQL MSSQL: Target server memory

Indicates the ideal amount of memory the server can consume.

DEPENDENT mssql.target_server_memory

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Memory Manager' && @.counter_name=='Target Server Memory (KB)')].cntr_value.first()

- MULTIPLIER: 1024

MSSQL MSSQL: Total server memory

Specifies the amount of memory the server has committed using the memory manager.

DEPENDENT mssql.total_server_memory

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Memory Manager' && @.counter_name=='Total Server Memory (KB)')].cntr_value.first()

- MULTIPLIER: 1024

MSSQL MSSQL: Cache hit ratio

Ratio between cache hits and lookups.

DEPENDENT mssql.cache_hit_ratio

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Plan Cache' && @.counter_name=='CacheHitRatio' && @.instance_name=='_Total')].cntr_value.first()

MSSQL MSSQL: Cache object counts

Number of cache objects in the cache.

DEPENDENT mssql.cache_object_counts

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Plan Cache' && @.counter_name=='Cache Object Counts' && @.instance_name=='_Total')].cntr_value.first()

MSSQL MSSQL: Cache objects in use

Number of cache objects in use.

DEPENDENT mssql.cache_objects_in_use

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Plan Cache' && @.counter_name=='Cache Objects in use' && @.instance_name=='_Total')].cntr_value.first()

MSSQL MSSQL: Cache pages

Number of 8-kilobyte (KB) pages used by cache objects.

DEPENDENT mssql.cache_pages

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Plan Cache' && @.counter_name=='Cache Pages' && @.instance_name=='_Total')].cntr_value.first()

MSSQL MSSQL: Errors per second (DB offline errors)

Number of errors per second.

DEPENDENT mssql.offline_errors_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Errors' && @.counter_name=='Errors/sec' && @.instance_name=='DB Offline Errors')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Errors per second (Info errors)

Number of errors per second.

DEPENDENT mssql.info_errors_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Errors' && @.counter_name=='Errors/sec' && @.instance_name=='Info Errors')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Errors per second (Kill connection errors)

Number of errors per second.

DEPENDENT mssql.kill_connection_errors_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Errors' && @.counter_name=='Errors/sec' && @.instance_name=='Kill Connection Errors')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Errors per second (User errors)

Number of errors per second.

DEPENDENT mssql.user_errors_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Errors' && @.counter_name=='Errors/sec' && @.instance_name=='User Errors')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Total errors per second

Number of errors per second.

DEPENDENT mssql.errors_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Errors' && @.counter_name=='Errors/sec' && @.instance_name=='_Total')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Auto-param attempts per second

Number of auto-parameterization attempts per second. Total should be the sum of the failed, safe, and unsafe auto-parameterizations. Auto-parameterization occurs when an instance of SQL Server tries to parameterize a Transact-SQL request by replacing some literals with parameters so that reuse of the resulting cached execution plan across multiple similar-looking requests is possible. Note that auto-parameterizations are also known as simple parameterizations in newer versions of SQL Server. This counter does not include forced parameterizations.

DEPENDENT mssql.autoparam_attempts_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Statistics' && @.counter_name=='Auto-Param Attempts/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Batch requests per second

Number of Transact-SQL command batches received per second. This statistic is affected by all constraints (such as I/O, number of users, cache size, complexity of requests, and so on). High batch requests mean good throughput.

DEPENDENT mssql.batch_requests_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Statistics' && @.counter_name=='Batch Requests/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Percent of Adhoc queries running

The ratio of SQL compilations per second to Batch requests per second in percent.

CALCULATED mssql.percent_of_adhoc_queries

Expression:

last(mssql.sql_compilations_sec.rate) * 100 / (last(mssql.batch_requests_sec.rate) + (last(mssql.batch_requests_sec.rate)=0))
MSSQL MSSQL: Percent of Recompiled Transact-SQL Objects

The ratio of SQL re-compilations per second to SQL compilations per second in percent.

CALCULATED mssql.percent_recompilations_to_compilations

Expression:

last(mssql.sql_recompilations_sec.rate) * 100 / (last(mssql.sql_compilations_sec.rate) + (last(mssql.sql_compilations_sec.rate)=0))
MSSQL MSSQL: Full scans to Index searches ratio

The ratio of Full scans per second to Index searches per second. The threshold recommendation is strictly for OLTP workloads.

CALCULATED mssql.scan_to_search

Expression:

last(mssql.full_scans_sec.rate) / (last(mssql.index_searches_sec.rate) + (last(mssql.index_searches_sec.rate)=0))
MSSQL MSSQL: Failed auto-params per second

Number of failed auto-parameterization attempts per second. This should be small. Note that auto-parameterizations are also known as simple parameterizations in later versions of SQL Server.

DEPENDENT mssql.failed_autoparams_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Statistics' && @.counter_name=='Failed Auto-Params/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Safe auto-params per second

Number of safe auto-parameterization attempts per second. Safe refers to a determination that a cached execution plan can be shared between different similar-looking Transact-SQL statements. SQL Server makes many auto-parameterization attempts some of which turn out to be safe and others fail. Note that auto-parameterizations are also known as simple parameterizations in later versions of SQL Server. This does not include forced parameterizations.

DEPENDENT mssql.safe_autoparams_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Statistics' && @.counter_name=='Safe Auto-Params/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: SQL compilations per second

Number of SQL compilations per second. Indicates the number of times the compile code path is entered. Includes compiles caused by statement-level recompilations in SQL Server. After SQL Server user activity is stable, this value reaches a steady state.

DEPENDENT mssql.sql_compilations_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Statistics' && @.counter_name=='SQL Compilations/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: SQL re-compilations per second

Number of statement recompiles per second. Counts the number of times statement recompiles are triggered. Generally, you want the recompiles to be low.

DEPENDENT mssql.sql_recompilations_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Statistics' && @.counter_name=='SQL Re-Compilations/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Unsafe auto-params per second

Number of unsafe auto-parameterization attempts per second. For example, the query has some characteristics that prevent the cached plan from being shared. These are designated as unsafe. This does not count the number of forced parameterizations.

DEPENDENT mssql.unsafe_autoparams_sec.rate

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:SQL Statistics' && @.counter_name=='Unsafe Auto-Params/sec')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL: Total transactions number

The number of currently active transactions of all types.

DEPENDENT mssql.transactions

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Transactions' && @.counter_name=='Transactions')].cntr_value.first()

MSSQL MSSQL DB '{#DBNAME}': State

0 = ONLINE

1 = RESTORING

2 = RECOVERING

SQL Server 2008 and later

3 = RECOVERY_PENDING

SQL Server 2008 and later

4 = SUSPECT

5 = EMERGENCY

SQL Server 2008 and later

6 = OFFLINE

SQL Server 2008 and later

7 = COPYING

Azure SQL Database Active Geo-Replication

10 = OFFLINE_SECONDARY

Azure SQL Database Active Geo-Replication

DEPENDENT mssql.db.state["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='State' && @.instance_name=='{#DBNAME}')].cntr_value.first()

- DISCARD_UNCHANGED_HEARTBEAT: 15m

MSSQL MSSQL DB '{#DBNAME}': Active transactions

Number of active transactions for the database.

DEPENDENT mssql.db.active_transactions["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Active Transactions' && @.instance_name=='{#DBNAME}')].cntr_value.first()

MSSQL MSSQL DB '{#DBNAME}': Data file size

Cumulative size of all the data files in the database including any automatic growth. Monitoring this counter is useful, for example, for determining the correct size of tempdb.

DEPENDENT mssql.db.data_files_size["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Data File(s) Size (KB)' && @.instance_name=='{#DBNAME}')].cntr_value.first()

- MULTIPLIER: 1024

MSSQL MSSQL DB '{#DBNAME}': Log bytes flushed per second

Total number of log bytes flushed per second. Useful for determining trends and utilization of the transaction log.

DEPENDENT mssql.db.log_bytes_flushed_sec.rate["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log Bytes Flushed/sec' && @.instance_name=='{#DBNAME}')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL DB '{#DBNAME}': Log file size

Cumulative size of all the transaction log files in the database.

DEPENDENT mssql.db.log_files_size["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log File(s) Size (KB)' && @.instance_name=='{#DBNAME}')].cntr_value.first()

- MULTIPLIER: 1024

MSSQL MSSQL DB '{#DBNAME}': Log file used size

The cumulative used size of all the log files in the database.

DEPENDENT mssql.db.log_files_used_size["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log File(s) Used Size (KB)' && @.instance_name=='{#DBNAME}')].cntr_value.first()

- MULTIPLIER: 1024

MSSQL MSSQL DB '{#DBNAME}': Log flushes per second

Number of log flushes per second.

DEPENDENT mssql.db.log_flushes_sec.rate["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log Flushes/sec' && @.instance_name=='{#DBNAME}')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL DB '{#DBNAME}': Log flush waits per second

Number of commits per second waiting for the log flush.

DEPENDENT mssql.db.log_flush_waits_sec.rate["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log Flush Waits/sec' && @.instance_name=='{#DBNAME}')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL DB '{#DBNAME}': Log flush wait time

Total wait time (in milliseconds) to flush the log. On an AlwaysOn secondary database, this value indicates the wait time for log records to be hardened to disk.

DEPENDENT mssql.db.log_flush_wait_time["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log Flush Wait Time' && @.instance_name=='{#DBNAME}')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL DB '{#DBNAME}': Log growths

Total number of times the transaction log for the database has been expanded.

DEPENDENT mssql.db.log_growths["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log Growths' && @.instance_name=='{#DBNAME}')].cntr_value.first()

MSSQL MSSQL DB '{#DBNAME}': Log shrinks

Total number of times the transaction log for the database has been shrunk.

DEPENDENT mssql.db.log_shrinks["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log Shrinks' && @.instance_name=='{#DBNAME}')].cntr_value.first()

MSSQL MSSQL DB '{#DBNAME}': Log truncations

The number of times the transaction log has been shrunk.

DEPENDENT mssql.db.log_truncations["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Log Truncations' && @.instance_name=='{#DBNAME}')].cntr_value.first()

MSSQL MSSQL DB '{#DBNAME}': Percent log used

Percentage of space in the log that is in use.

DEPENDENT mssql.db.percent_log_used["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Percent Log Used' && @.instance_name=='{#DBNAME}')].cntr_value.first()

MSSQL MSSQL DB '{#DBNAME}': Transactions per second

Number of transactions started for the database per second.

DEPENDENT mssql.db.transactions_sec.rate["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Databases' && @.counter_name=='Transactions/sec' && @.instance_name=='{#DBNAME}')].cntr_value.first()

- CHANGE_PER_SECOND

MSSQL MSSQL AG '{#GROUP_NAME}': Primary replica recovery health

Indicates the recovery health of the primary replica:

0 = In progress

1 = Online

2 = Unavailable

DEPENDENT mssql.primary_recovery_health["{#GROUP_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}')].primary_recovery_health.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}': Primary replica name

Name of the server instance that is hosting the current primary replica.

DEPENDENT mssql.primary_replica["{#GROUP_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}')].primary_replica.first()

- DISCARD_UNCHANGED_HEARTBEAT: 3h

MSSQL MSSQL AG '{#GROUP_NAME}': Secondary replica recovery health

Indicates the recovery health of a secondary replica replica:

0 = In progress

1 = Online

2 = Unavailable

DEPENDENT mssql.secondary_recovery_health["{#GROUP_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}')].secondary_recovery_health.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}': Synchronization health

Reflects a rollup of the synchronization_health of all availability replicas in the availability group:

0: Not healthy. None of the availability replicas have a healthy.

1: Partially healthy. The synchronization health of some, but not all, availability replicas is healthy.

2: Healthy. The synchronization health of every availability replica is healthy.

DEPENDENT mssql.synchronization_health["{#GROUP_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}')].synchronization_health.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': State

0 = Online

1 = Restoring

2 = Recovering

3 = Recovery pending

4 = Suspect

5 = Emergency

6 = Offline

DEPENDENT mssql.local_db.state["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}' && @.dbname=='{#DBNAME}')].database_state.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': Suspended

Database state:

0 = Resumed

1 = Suspended

DEPENDENT mssql.local_db.is_suspended["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}' && @.dbname=='{#DBNAME}')].is_suspended.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': Synchronization health

Reflects the intersection of the synchronization state of a database that is joined to the availability group on the availability replica and the availability mode of the availability replica (synchronous-commit or asynchronous-commit mode):

0 = Not healthy. The synchronization_state of the database is 0 (NOT SYNCHRONIZING).

1 = Partially healthy. A database on a synchronous-commit availability replica is considered

partially healthy if synchronization_state is 1 (SYNCHRONIZING).

2 = Healthy. A database on an synchronous-commit availability replica is considered healthy if synchronization_state is 2 (SYNCHRONIZED), and a database on an asynchronous-commit availability replica is considered healthy if synchronization_state is 1 (SYNCHRONIZING).

DEPENDENT mssql.local_db.synchronization_health["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}' && @.dbname=='{#DBNAME}')].synchronization_health.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Non-Local DB '{#DBNAME}': Log queue size

Amount of log records of the primary database that has not been sent to the secondary databases.

DEPENDENT mssql.non-local_db.log_send_queue_size["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}' && @.dbname=='{#DBNAME}')].log_send_queue_size.first()

- MULTIPLIER: 1024

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Non-Local DB '{#DBNAME}': Redo log queue size

Amount of log records in the log files of the secondary replica that has not yet been redone.

DEPENDENT mssql.non-local_db.redo_queue_size["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}' && @.dbname=='{#DBNAME}')].redo_queue_size.first()

- MULTIPLIER: 1024

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Connected state

Whether a secondary replica is currently connected to the primary replica:

0 : Disconnected. The response of an availability replica to the DISCONNECTED state depends on its role:

On the primary replica, if a secondary replica is disconnected, its secondary databases are marked as NOT SYNCHRONIZED on the primary replica, which waits for the secondary to reconnect;

On a secondary replica, upon detecting that it is disconnected, the secondary replica attempts to reconnect to the primary replica.

1 : Connected. Each primary replica tracks the connection state for every secondary replica in the same availability group. Secondary replicas track the connection state of only the primary replica.

DEPENDENT mssql.replica.connected_state["{#GROUPNAME}{#REPLICA_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}')].connected_state.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Is local

Whether the replica is local:

0 = Indicates a remote secondary replica in an availability group whose primary replica is hosted by the local server instance. This value occurs only on the primary replica location.

1 = Indicates a local replica. On secondary replicas, this is the only available value for the availability group to which the replica belongs.

DEPENDENT mssql.replica.is_local["{#GROUPNAME}{#REPLICA_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}')].is_local.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Join state

0 = Not joined

1 = Joined, standalone instance

2 = Joined, failover cluster instance

DEPENDENT mssql.replica.join_state["{#GROUPNAME}{#REPLICA_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}')].join_state.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Operational state

Current operational state of the replica:

0 = Pending failover

1 = Pending

2 = Online

3 = Offline

4 = Failed

5 = Failed, no quorum

6 = Not local

DEPENDENT mssql.replica.operational_state["{#GROUPNAME}{#REPLICA_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}')].operational_state.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Recovery health

Rollup of the database_state column of the sys.dm_hadr_database_replica_states dynamic management view:

0 : In progress. At least one joined database has a database state other than ONLINE

(database_state is not 0).

1 : Online. All the joined databases have a database state of ONLINE (database_state is 0).

DEPENDENT mssql.replica.recovery_health["{#GROUPNAME}{#REPLICA_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}')].recovery_health.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Role

Current Always On availability groups role of a local replica or a connected remote replica:

0 = Resolving

1 = Primary

2 = Secondary

DEPENDENT mssql.replica.role["{#GROUPNAME}{#REPLICA_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}')].role.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Sync health

Reflects a rollup of the database synchronization state (synchronization_state)of all joined availability databases (also known as replicas) and the availability mode of the replica (synchronous-commit or asynchronous-commit mode). The rollup will reflect the least healthy accumulated state the databases on the replica:

0 : Not healthy. At least one joined database is in the NOT SYNCHRONIZING state.

1 : Partially healthy. Some replicas are not in the target synchronization state: synchronous-commit replicas should be synchronized, and asynchronous-commit replicas should be synchronizing.

2 : Healthy. All replicas are in the target synchronization state: synchronous-commit replicas are synchronized, and asynchronous-commit replicas are synchronizing.

DEPENDENT mssql.replica.synchronization_health["{#GROUPNAME}{#REPLICA_NAME}"]

Preprocessing:

- JSONPATH: $[?(@.group_name=='{#GROUP_NAME}' && @.replica_name=='{#REPLICA_NAME}')].synchronization_health.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL Mirroring '{#DBNAME}': Role

Current role of the local database plays in the database mirroring session.

1 = Principal

2 = Mirror

DEPENDENT mssql.mirroring.role["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.dbname=='{#DBNAME}')].mirroring_role.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL Mirroring '{#DBNAME}': Role sequence

The number of times that mirroring partners have switched the principal and mirror roles due to a failover or forced service.

DEPENDENT mssql.mirroring.role_sequence["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.dbname=='{#DBNAME}')].mirroring_role_sequence.first()

- SIMPLE_CHANGE

MSSQL MSSQL Mirroring '{#DBNAME}': State

State of the mirror database and of the database mirroring session.

0 = Suspended

1 = Disconnected from the other partner

2 = Synchronizing

3 = Pending Failover

4 = Synchronized

5 = The partners are not synchronized. Failover is not possible now.

6 = The partners are synchronized. Failover is potentially possible. For information about the requirements for failover see, Database Mirroring Operating Modes.

DEPENDENT mssql.mirroring.state["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.dbname=='{#DBNAME}')].mirroring_state.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL Mirroring '{#DBNAME}': Witness state

State of the witness in the database mirroring session of the database:

0 = Unknown

1 = Connected

2 = Disconnected

DEPENDENT mssql.mirroring.witness_state["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.dbname=='{#DBNAME}')].mirroring_witness_state.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

MSSQL MSSQL Mirroring '{#DBNAME}': Safety level

Safety setting for updates on the mirror database:

0 = Unknown state

1 = Off [asynchronous]

2 = Full [synchronous]

DEPENDENT mssql.mirroring.safety_level["{#DBNAME}"]

Preprocessing:

- JSONPATH: $[?(@.dbname=='{#DBNAME}')].mirroring_safety_level.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Zabbix_raw_items MSSQL: Get performance counters

The item gets server global status information.

ODBC db.odbc.get[get_status_variables,"{$MSSQL.DSN}"]

Expression:

SELECT object_name, counter_name, instance_name, cntr_value FROM sys.dm_os_performance_counters UNION SELECT '{$MSSQL.INSTANCE}' as object_name, 'Version' as counter_name, @@version as instance_name, 0 as cntr_value UNION SELECT '{$MSSQL.INSTANCE}' as object_name, 'Uptime' as counter_name, '' as instance_name, DATEDIFF(second, sqlserver_start_time, GETDATE()) as cntr_value FROM sys.dm_os_sys_info UNION SELECT '{$MSSQL.INSTANCE}:Databases' as object_name, 'State' as counter_name, name as instance_name, state as cntr_value FROM sys.databases UNION SELECT a.object_name, 'BufferCacheHitRatio' as counter_name, '' as instance_name, cast(a.cntr_value*100.0 / b.cntr_value as dec(3,0)) as cntr_value FROM sys.dm_os_performance_counters a JOIN (SELECT cntr_value,OBJECT_NAME FROM sys.dm_os_performance_counters WHERE counter_name = 'Buffer cache hit ratio base' AND OBJECT_NAME = '{$MSSQL.INSTANCE}:Buffer Manager') b ON a.OBJECT_NAME = b.OBJECT_NAME WHERE a.counter_name = 'Buffer cache hit ratio' AND a.OBJECT_NAME = '{$MSSQL.INSTANCE}:Buffer Manager' UNION SELECT a.object_name, 'WorktablesFromCacheRatio' as counter_name, '' as instance_name, cast(a.cntr_value*100.0 / b.cntr_value as dec(3,0)) as cntr_value FROM sys.dm_os_performance_counters a JOIN (SELECT cntr_value,OBJECT_NAME FROM sys.dm_os_performance_counters WHERE counter_name = 'Worktables From Cache Base' AND OBJECT_NAME = '{$MSSQL.INSTANCE}:Access Methods') b ON a.OBJECT_NAME = b.OBJECT_NAME WHERE a.counter_name = 'Worktables From Cache Ratio' AND a.OBJECT_NAME = '{$MSSQL.INSTANCE}:Access Methods' UNION SELECT a.object_name, 'CacheHitRatio' as counter_name, '_Total' as instance_name, cast(a.cntr_value*100.0 / b.cntr_value as dec(3,0)) as cntr_value FROM sys.dm_os_performance_counters a JOIN (SELECT cntr_value,OBJECT_NAME FROM sys.dm_os_performance_counters WHERE counter_name = 'Cache Hit Ratio base' AND OBJECT_NAME = '{$MSSQL.INSTANCE}:Plan Cache' AND instance_name = '_Total') b ON a.OBJECT_NAME = b.OBJECT_NAME WHERE a.counter_name = 'Cache Hit Ratio' AND a.OBJECT_NAME = '{$MSSQL.INSTANCE}:Plan Cache' AND instance_name = '_Total'
Zabbix_raw_items MSSQL: Average latch wait time raw

Average latch wait time (in milliseconds) for latch requests that had to wait.

DEPENDENT mssql.average_latch_wait_time_raw

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Latches' && @.counter_name=='Average Latch Wait Time (ms)')].cntr_value.first()

Zabbix_raw_items MSSQL: Average latch wait time base

For internal use only.

DEPENDENT mssql.average_latch_wait_time_base

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Latches' && @.counter_name=='Average Latch Wait Time Base')].cntr_value.first()

Zabbix_raw_items MSSQL: Total average wait time raw

Average amount of wait time (in milliseconds) for each lock request that resulted in a wait. Information for all locks.

DEPENDENT mssql.average_wait_time_raw

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Locks' && @.counter_name=='Average Wait Time (ms)' && @.instance_name=='_Total')].cntr_value.first()

Zabbix_raw_items MSSQL: Total average wait time base

For internal use only.

DEPENDENT mssql.average_wait_time_base

Preprocessing:

- JSONPATH: $[?(@.object_name=='{$MSSQL.INSTANCE}:Locks' && @.counter_name=='Average Wait Time Base' && @.instance_name=='_Total')].cntr_value.first()

Zabbix_raw_items MSSQL AG '{#GROUP_NAME}': Get replica states

Getting replica states - name, primary and secondary health, synchronization health.

ODBC db.odbc.get[{#GROUP_NAME}_replica_states,"{$MSSQL.DSN}"]

Expression:

SELECT ag.name as group_name, ISNULL(ags.primary_recovery_health, 2) as primary_recovery_health, ISNULL(ags.primary_replica, 'Unknown') as primary_replica, ISNULL(ags.secondary_recovery_health, 2) as secondary_recovery_health, ags.synchronization_health as synchronization_health FROM sys.dm_hadr_availability_group_states ags JOIN sys.availability_groups ag ON ag.group_id = ags.group_id
Zabbix_raw_items MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': Get local DB states

Getting the states of the local availability database.

ODBC db.odbc.get["{#GROUPNAME}{#DBNAME}_local_db.states","{$MSSQL.DSN}"]

Expression:

SELECT drs.database_state as database_state, drs.is_suspended as is_suspended, drs.synchronization_health as synchronization_health, ag.name as group_name, arcs.replica_server_name as replica_name, db_name(drs.database_id) as dbname FROM sys.dm_hadr_database_replica_states drs JOIN sys.dm_hadr_availability_replica_cluster_states arcs ON arcs.replica_id = drs.replica_id JOIN sys.availability_groups ag ON ag.group_id = arcs.group_id JOIN sys.dm_hadr_availability_replica_states ars ON ars.replica_id = arcs.replica_id
Zabbix_raw_items MSSQL AG '{#GROUP_NAME}' Non-Local DB '{#DBNAME}': Get non-local DB states

Getting the states of the non-local availability database.

ODBC db.odbc.get["{#GROUPNAME}{#DBNAME}_non-local_db.states","{$MSSQL.DSN}"]

Expression:

SELECT drs.log_send_queue_size as log_send_queue_size, drs.redo_queue_size as redo_queue_size, ag.name as group_name, arcs.replica_server_name as replica_name, db_name(drs.database_id) as dbname FROM sys.dm_hadr_database_replica_states drs JOIN sys.dm_hadr_availability_replica_cluster_states arcs ON arcs.replica_id = drs.replica_id JOIN sys.availability_groups ag ON ag.group_id = arcs.group_id JOIN sys.dm_hadr_availability_replica_states ars ON ars.replica_id = arcs.replica_id
Zabbix_raw_items MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': Get the replica state

Getting the database replica states.

ODBC db.odbc.get["{#GROUPNAME}{#REPLICA_NAME}_replica.state","{$MSSQL.DSN}"]

Expression:

SELECT ars.connected_state as connected_state, ars.is_local as is_local, arcs.join_state as join_state, ISNULL(ars.operational_state,6) as operational_state, ISNULL(ars.recovery_health,2) as recovery_health, ars.role as role, ars.synchronization_health as synchronization_health, ag.name as group_name, arcs.replica_server_name as replica_name FROM sys.dm_hadr_availability_replica_cluster_states as arcs JOIN sys.availability_groups ag ON ag.group_id = arcs.group_id JOIN sys.dm_hadr_availability_replica_states ars ON ars.replica_id = arcs.replica_id WHERE ag.name = '{#GROUP_NAME}' AND arcs.replica_server_name = '{#REPLICA_NAME}'
Zabbix_raw_items MSSQL Mirroring '{#DBNAME}': Get the mirror state

Getting mirrors state

ODBC db.odbc.get["{#DBNAME}_mirroring_state","{$MSSQL.DSN}"]

Expression:

SELECT ISNULL(m.mirroring_role,0) as mirroring_role, ISNULL(m.mirroring_role_sequence,0) as mirroring_role_sequence, ISNULL(m.mirroring_state,7) as mirroring_state, ISNULL(m.mirroring_witness_state,3) as mirroring_witness_state, ISNULL(m.mirroring_safety_level,3) as mirroring_safety_level, db_name(m.database_id) as dbname FROM sys.database_mirroring as m WHERE m.mirroring_state_desc IS NOT NULL

Triggers

Name Description Expression Severity Dependencies and additional info
MSSQL: Service is unavailable

The TCP port of the MS SQL Server service is currently unavailable.

{TEMPLATE_NAME:net.tcp.service[tcp,{HOST.CONN},{$MSSQL.PORT}].last()}=0 DISASTER
MSSQL: Version has changed (new version value received: {ITEM.VALUE})

MSSQL version has changed. Ack to close.

{TEMPLATE_NAME:mssql.version.diff()}=1 and {TEMPLATE_NAME:mssql.version.strlen()}>0 INFO

Manual close: YES

MSSQL: has been restarted (uptime < 10m)

Uptime is less than 10 minutes

{TEMPLATE_NAME:mssql.uptime.last()}<10m INFO

Manual close: YES

MSSQL: Failed to fetch info data (or no data for 30m)

Zabbix has not received data for items for the last 30 minutes.

{TEMPLATE_NAME:mssql.uptime.nodata(30m)}=1 INFO

Depends on:

- MSSQL: Service is unavailable

MSSQL: Too frequently pointers using

Rows with varchar columns can experience expansion when varchar values are updated with a longer string. In the case where the row cannot fit in the existing page, the row migrates and access to the row will traverse a pointer. This only happens on heaps (tables without clustered indexes). Evaluate clustered index for heap tables. In cases where clustered indexes cannot be used, drop non-clustered indexes, build a clustered index to reorg pages and rows, drop the clustered index, then recreate non-clustered indexes.

{TEMPLATE_NAME:mssql.forwarded_records_sec.rate.last()} * 100 > 10 * {Template DB MSSQL by ODBC:mssql.batch_requests_sec.rate.last()} WARNING
MSSQL: Number work files created per second is high (over {$MSSQL.WORK_FILES.MAX} for 5m)

Too many work files created per second to store temporary results for hash joins and hash aggregates.

{TEMPLATE_NAME:mssql.workfiles_created_sec.rate.min(5m)}>{$MSSQL.WORK_FILES.MAX} AVERAGE
MSSQL: Number work tables created per second is high (over {$MSSQL.WORK_TABLES.MAX} for 5m)

Too many work tables created per second to store temporary results for query spool, lob variables, XML variables, and cursors.

{TEMPLATE_NAME:mssql.worktables_created_sec.rate.min(5m)}>{$MSSQL.WORK_TABLES.MAX} AVERAGE
MSSQL: Percentage of work tables available from the work table cache is low (below {$MSSQL.WORKTABLES_FROM_CACHE_RATIO.MIN.CRIT}% for 5m)

A value less than 90% may indicate insufcient memory, since execution plans are being dropped, or may indicate, on 32-bit systems, the need for an upgrade to a 64-bit system

{TEMPLATE_NAME:mssql.worktables_from_cache_ratio.max(5m)}<{$MSSQL.WORKTABLES_FROM_CACHE_RATIO.MIN.CRIT} HIGH
MSSQL: Percentage of the buffer cache efficiency is low (below {$MSSQL.BUFFER_CACHE_RATIO.MIN.CRIT}% for 5m)

Too low buffer cache hit ratio.

{TEMPLATE_NAME:mssql.buffer_cache_hit_ratio.max(5m)}<{$MSSQL.BUFFER_CACHE_RATIO.MIN.CRIT} HIGH
MSSQL: Percentage of the buffer cache efficiency is low (below {$MSSQL.BUFFER_CACHE_RATIO.MIN.WARN}% for 5m)

Low buffer cache hit ratio.

{TEMPLATE_NAME:mssql.buffer_cache_hit_ratio.max(5m)}<{$MSSQL.BUFFER_CACHE_RATIO.MIN.WARN} WARNING

Depends on:

- MSSQL: Percentage of the buffer cache efficiency is low (below {$MSSQL.BUFFER_CACHE_RATIO.MIN.CRIT}% for 5m)

MSSQL: Number of rps waiting for a free page is high (over {$MSSQL.FREE_LIST_STALLS.MAX} for 5m)

Some requests have to wait for a free page.

{TEMPLATE_NAME:mssql.free_list_stalls_sec.rate.min(5m)}>{$MSSQL.FREE_LIST_STALLS.MAX} WARNING
MSSQL: Number of buffers written per second by the lazy writer is high (over {$MSSQL.LAZY_WRITES.MAX} for 5m)

The number of buffers written per second by the buffer manager's lazy writer exceeds the threshold.

{TEMPLATE_NAME:mssql.lazy_writes_sec.rate.min(5m)}>{$MSSQL.LAZY_WRITES.MAX} WARNING
MSSQL: Page life expectancy is low (less {$MSSQL.PAGE_LIFE_EXPECTANCY.MIN}min for 15m)

The page stays in the buffer pool without references of less time than the threshold value.

{TEMPLATE_NAME:mssql.page_life_expectancy.max(15m)}<{$MSSQL.PAGE_LIFE_EXPECTANCY.MIN} HIGH
MSSQL: Number of physical database page reads per second is high (over {$MSSQL.PAGE_READS.MAX} for 5m)

The physical database page reads are issued too frequently.

{TEMPLATE_NAME:mssql.page_reads_sec.rate.min(5m)}>{$MSSQL.PAGE_READS.MAX} WARNING
MSSQL: Number of physical database page writes per second is high (over {$MSSQL.PAGE_WRITES.MAX} for 5m)

The physical database page writes are issued too frequently.

{TEMPLATE_NAME:mssql.page_writes_sec.rate.min(5m)}>{$MSSQL.PAGE_WRITES.MAX} WARNING
MSSQL: Too many physical reads occurring

If this value is makes up even a sizeable minority of total Page Reads/sec (say, greater than 20% of total page reads), you may have too many physical reads occurring.

{TEMPLATE_NAME:mssql.readahead_pages_sec.rate.last()} > {$MSSQL.PERCENT_READAHEAD.MAX} / 100 * {Template DB MSSQL by ODBC:mssql.page_reads_sec.rate.last()} WARNING
MSSQL: Total average wait time for locks is high (over {$MSSQL.AVERAGE_WAIT_TIME.MAX}ms for 5m)

An average wait time longer than 500ms may indicate excessive blocking. This value should generally correlate to 'Lock Waits/sec' and move up or down with it accordingly.

{TEMPLATE_NAME:mssql.average_wait_time.min(5m)}>{$MSSQL.AVERAGE_WAIT_TIME.MAX} WARNING
MSSQL: Total number of locks per second is high (over {$MSSQL.LOCK_REQUESTS.MAX} for 5m)

The number of new locks and lock conversions per second requested from the lock manager is high.

{TEMPLATE_NAME:mssql.lock_requests_sec.rate.min(5m)}>{$MSSQL.LOCK_REQUESTS.MAX} WARNING
MSSQL: Total lock requests per second that timed out is high (over {$MSSQL.LOCK_TIMEOUTS.MAX} for 5m)

The total number of lock requests per second that timed out, including requests for NOWAIT locks is high.

{TEMPLATE_NAME:mssql.lock_timeouts_sec.rate.min(5m)}>{$MSSQL.LOCK_TIMEOUTS.MAX} WARNING
MSSQL: Some blocking is occurring for 5m

Values greater than zero indicate at least some blocking is occurring, while a value of zero can quickly eliminate blocking as a potential root-cause problem.

{TEMPLATE_NAME:mssql.lock_waits_sec.rate.min(5m)}>0 AVERAGE
MSSQL: Number of deadlock is high (over {$MSSQL.DEADLOCKS.MAX} for 5m)

Too many deadlocks are occurring currently.

{TEMPLATE_NAME:mssql.number_deadlocks_sec.rate.min(5m)}>{$MSSQL.DEADLOCKS.MAX} AVERAGE
MSSQL: Percent of adhoc queries running is over {$MSSQL.PERCENT_COMPILATIONS.MAX}% for 15m

The lower this value is the better. High values often indicate excessive adhoc querying and should be as low as possible. If excessive adhoc querying is happening, try rewriting the queries as procedures or invoke the queries using sp_executeSQL. When rewriting isn’t possible, consider using a plan guide or setting the database to parameterization forced mode.

{TEMPLATE_NAME:mssql.percent_of_adhoc_queries.min(15m)} > {$MSSQL.PERCENT_COMPILATIONS.MAX} WARNING
MSSQL: Percent of times statement recompiles is over {$MSSQL.PERCENT_RECOMPILATIONS.MAX}% for 15m

This number should be at or near zero, since recompiles can cause deadlocks and exclusive compile locks. This counter’s value should follow in proportion to “Batch Requests/sec” and “SQL Compilations/sec”. This needs to be nil in your system as much as possible.

{TEMPLATE_NAME:mssql.percent_recompilations_to_compilations.min(15m)} > {$MSSQL.PERCENT_RECOMPILATIONS.MAX} WARNING
MSSQL: Index and table scans are often than index searches for 15m

Index searches are preferable to index and table scans. For OLTP applications, optimize for more index searches and less scans (preferably, 1 full scan for every 1000 index searches). Index and table scans are expensive I/O operations.

{TEMPLATE_NAME:mssql.scan_to_search.min(15m)} > 0.001 WARNING
MSSQL DB '{#DBNAME}': State is {ITEM.VALUE}

The DB has a non-working state.

{TEMPLATE_NAME:mssql.db.state["{#DBNAME}"].last()}>1 HIGH
MSSQL DB '{#DBNAME}': Number of commits waiting for the log flush is high (over {$MSSQL.LOG_FLUSH_WAITS.MAX:"{#DBNAME}"}/sec for 5m)

Too many commits are waiting for the log flush.

{TEMPLATE_NAME:mssql.db.log_flush_waits_sec.rate["{#DBNAME}"].min(5m)}>{$MSSQL.LOG_FLUSH_WAITS.MAX:"{#DBNAME}"} WARNING
MSSQL DB '{#DBNAME}': Total wait time to flush the log is high (over {$MSSQL.LOG_FLUSH_WAIT_TIME.MAX:"{#DBNAME}"}ms for 5m)

The wait time to flush the log is too long.

{TEMPLATE_NAME:mssql.db.log_flush_wait_time["{#DBNAME}"].min(5m)}>{$MSSQL.LOG_FLUSH_WAIT_TIME.MAX:"{#DBNAME}"} WARNING
MSSQL DB '{#DBNAME}': Percent of log using is high (over {$MSSQL.PERCENT_LOG_USED.MAX:"{#DBNAME}"}% for 5m)

There's not enough space left in the log.

{TEMPLATE_NAME:mssql.db.percent_log_used["{#DBNAME}"].min(5m)}>{$MSSQL.PERCENT_LOG_USED.MAX:"{#DBNAME}"} WARNING
MSSQL AG '{#GROUP_NAME}': Primary replica recovery health in progress

The primary replica is in the synchronization process.

{TEMPLATE_NAME:mssql.primary_recovery_health["{#GROUP_NAME}"].last()}=0 WARNING
MSSQL AG '{#GROUP_NAME}': Secondary replica recovery health in progress

The secondary replica is in the synchronization process.

{TEMPLATE_NAME:mssql.secondary_recovery_health["{#GROUP_NAME}"].last()}=0 WARNING
MSSQL AG '{#GROUP_NAME}': All replicas unhealthy

None of the availability replicas have a healthy.

{TEMPLATE_NAME:mssql.synchronization_health["{#GROUP_NAME}"].last()}=0 DISASTER
MSSQL AG '{#GROUP_NAME}': Some replicas unhealthy

The synchronization health of some, but not all, availability replicas is healthy.

{TEMPLATE_NAME:mssql.synchronization_health["{#GROUP_NAME}"].last()}=1 HIGH
MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': "{#DBNAME}" is {ITEM.VALUE}

The local availability database has a non-working state.

{TEMPLATE_NAME:mssql.local_db.state["{#DBNAME}"].last()}>0 WARNING
MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': "{#DBNAME}" is Not healthy

The synchronization state of the local availability database is NOT SYNCHRONIZING.

{TEMPLATE_NAME:mssql.local_db.synchronization_health["{#DBNAME}"].last()}=0 HIGH
MSSQL AG '{#GROUP_NAME}' Local DB '{#DBNAME}': "{#DBNAME}" is Partially healthy

A database on a synchronous-commit availability replica is considered partially healthy if synchronization state is SYNCHRONIZING.

{TEMPLATE_NAME:mssql.local_db.synchronization_health["{#DBNAME}"].last()}=1 AVERAGE
MSSQL AG '{#GROUP_NAME}' Non-Local DB '{#DBNAME}': Log queue size is growing

The log records of the primary database do not send to the secondary databases.

{TEMPLATE_NAME:mssql.non-local_db.log_send_queue_size["{#DBNAME}"].last(#1)}>{TEMPLATE_NAME:mssql.non-local_db.log_send_queue_size["{#DBNAME}"].last(#2)} and {TEMPLATE_NAME:mssql.non-local_db.log_send_queue_size["{#DBNAME}"].last(#2)}>{TEMPLATE_NAME:mssql.non-local_db.log_send_queue_size["{#DBNAME}"].last(#3)} HIGH
MSSQL AG '{#GROUP_NAME}' Non-Local DB '{#DBNAME}': Redo log queue size is growing

The log records in the log files of the secondary replica have not yet been redone.

{TEMPLATE_NAME:mssql.non-local_db.redo_queue_size["{#DBNAME}"].last(#1)}>{TEMPLATE_NAME:mssql.non-local_db.redo_queue_size["{#DBNAME}"].last(#2)} and {TEMPLATE_NAME:mssql.non-local_db.redo_queue_size["{#DBNAME}"].last(#2)}>{TEMPLATE_NAME:mssql.non-local_db.redo_queue_size["{#DBNAME}"].last(#3)} HIGH
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} is disconnected

The response of an availability replica to the DISCONNECTED state depends on its role:

On the primary replica, if a secondary replica is disconnected, its secondary databases are marked as NOT SYNCHRONIZED on the primary replica, which waits for the secondary to reconnect; On a secondary replica, upon detecting that it is disconnected, the secondary replica attempts to reconnect to the primary replica.

{TEMPLATE_NAME:mssql.replica.connected_state["{#GROUP_NAME}_{#REPLICA_NAME}"].last()}=0 and {Template DB MSSQL by ODBC:mssql.replica.role["{#GROUP_NAME}_{#REPLICA_NAME}"].last()}=2 WARNING
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} is {ITEM.VALUE}

The operational state of the replica in a given availability group is "Pending" or "Offline".

{TEMPLATE_NAME:mssql.replica.operational_state["{#GROUP_NAME}_{#REPLICA_NAME}"].last()}=0 or {TEMPLATE_NAME:mssql.replica.operational_state["{#GROUP_NAME}_{#REPLICA_NAME}"].last()}=1 or {TEMPLATE_NAME:mssql.replica.operational_state["{#GROUP_NAME}_{#REPLICA_NAME}"].last()}=3 WARNING
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} is {ITEM.VALUE}

The operational state of the replica in a given availability group is "Failed".

{TEMPLATE_NAME:mssql.replica.operational_state["{#GROUP_NAME}_{#REPLICA_NAME}"].last()}=4 AVERAGE
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} is {ITEM.VALUE}

The operational state of the replica in a given availability group is "Failed, no quorum".

{TEMPLATE_NAME:mssql.replica.operational_state["{#GROUP_NAME}_{#REPLICA_NAME}"].last()}=5 HIGH
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} Recovery in progress

At least one joined database has a database state other than ONLINE.

{TEMPLATE_NAME:mssql.replica.recovery_health["{#GROUP_NAME}_{#REPLICA_NAME}"].last()}=0 INFO
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} is Not healthy

At least one joined database is in the NOT SYNCHRONIZING state.

{TEMPLATE_NAME:mssql.replica.synchronization_health["{#GROUP_NAME}_{#REPLICA_NAME}"].last()}=0 AVERAGE
MSSQL AG '{#GROUP_NAME}' Replica '{#REPLICA_NAME}': {#REPLICA_NAME} is Partially healthy

Some replicas are not in the target synchronization state: synchronous-commit replicas should be synchronized, and asynchronous-commit replicas should be synchronizing.

{TEMPLATE_NAME:mssql.replica.synchronization_health["{#GROUP_NAME}_{#REPLICA_NAME}"].last()}=1 WARNING
MSSQL Mirroring '{#DBNAME}': "{#DBNAME}" is {ITEM.VALUE}

The state of the mirror database and of the database mirroring session is "Suspended", "Disconnected from the other partner", or "Synchronizing".

{TEMPLATE_NAME:mssql.mirroring.state["{#DBNAME}"].last()}>=0 and {TEMPLATE_NAME:mssql.mirroring.state["{#DBNAME}"].last()}<=2 INFO
MSSQL Mirroring '{#DBNAME}': "{#DBNAME}" is {ITEM.VALUE}

The state of the mirror database and of the database mirroring session is "Pending Failover".

{TEMPLATE_NAME:mssql.mirroring.state["{#DBNAME}"].last()}=3 WARNING
MSSQL Mirroring '{#DBNAME}': "{#DBNAME}" is {ITEM.VALUE}

The state of the mirror database and of the database mirroring session is "Not synchronized". The partners are not synchronized. A failover is not possible now.

{TEMPLATE_NAME:mssql.mirroring.state["{#DBNAME}"].last()}=5 HIGH
MSSQL Mirroring '{#DBNAME}': "{#DBNAME}" Witness is disconnected

The state of the witness in the database mirroring session of the database is "Disconnected".

{TEMPLATE_NAME:mssql.mirroring.witness_state["{#DBNAME}"].last()}=2 WARNING

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide a feedback, discuss the template or ask for help with it at ZABBIX forums.

References

http://www.grumpyolddba.co.uk/monitoring/Performance%20Counter%20Guidance%20-%20SQL%20Server.htm
https://docs.microsoft.com/en-us/sql/relational-databases/performance-monitor/sql-server-access-methods-object?view=sql-server-ver15

Articles and documentation

+ Propose new article
Add your solution