CockroachDB

CockroachDB is a cloud-native distributed SQL database designed to build, scale, and manage modern, data-intensive applications.

This template is for Zabbix version: 6.2
Also available for: 6.0

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/db/cockroachdb_http?at=release/6.2

CockroachDB by HTTP

Overview

For Zabbix version: 6.2 and higher
This template monitors CockroachDB nodes with Zabbix and works without any external scripts. Most of the metrics are collected in a single request, thanks to Zabbix bulk data collection.

The template CockroachDB node by HTTP collects metrics with the HTTP agent from the Prometheus endpoint and the health endpoints.

This template was tested on:

  • CockroachDB, version 21.2.8

Setup

See Zabbix template operation for basic instructions.

Internal node metrics are collected from the Prometheus /_status/vars endpoint. Node health metrics are collected from the /health and /health?ready=1 endpoints. The template doesn't require a session token.

Don't forget to change the {$COCKROACHDB.API.SCHEME} macro according to your situation (secure or insecure node). Also, see the Macros section for a list of macros used to set trigger values.

NOTE. Some metrics may not be collected depending on your CockroachDB version and configuration.
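
To check the endpoints named above outside Zabbix, the following Python sketch builds the three URLs from a scheme, host, and port and can return the HTTP status code of a request. The function names and the localhost host are assumptions made for illustration and are not part of the template. The http scheme and port 8080 are the macro defaults.

```python
from urllib.error import HTTPError
from urllib.parse import urlunsplit
from urllib.request import urlopen


def endpoint_urls(scheme, host, port):
    """Build the URLs of the metrics, health and readiness endpoints."""
    netloc = f"{host}:{port}"
    return {
        "metrics": urlunsplit((scheme, netloc, "/_status/vars", "", "")),
        "health": urlunsplit((scheme, netloc, "/health", "", "")),
        "readiness": urlunsplit((scheme, netloc, "/health", "ready=1", "")),
    }


def fetch_status(url, timeout=5):
    """Return the HTTP status code of a GET request (hypothetical helper).

    Error responses such as 500 or 503 raise HTTPError in urllib,
    so the code is read from the exception in that case.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code


# Assumed host; scheme and port are the template's macro defaults.
urls = endpoint_urls("http", "localhost", 8080)
for name, url in urls.items():
    print(name, url)
```

Calling fetch_status(urls["readiness"]) against a running node is a quick manual check. For a secure node, change the scheme to https; a node with a self-signed certificate would also need an SSL context passed to urlopen, which this sketch leaves out.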

Zabbix configuration

No specific Zabbix configuration is required.

Macros used

Name | Description | Default
{$COCKROACHDB.API.PORT} | The port of CockroachDB API and Prometheus endpoint. | 8080
{$COCKROACHDB.API.SCHEME} | Request scheme which may be http or https. | http
{$COCKROACHDB.CERT.CA.EXPIRY.WARN} | Number of days until the CA certificate expires. | 90
{$COCKROACHDB.CERT.NODE.EXPIRY.WARN} | Number of days until the node certificate expires. | 30
{$COCKROACHDB.CLOCK.OFFSET.MAX.WARN} | Maximum clock offset of the node against the rest of the cluster in milliseconds for trigger expression. | 300
{$COCKROACHDB.OPEN.FDS.MAX.WARN} | Maximum percentage of used file descriptors. | 80
{$COCKROACHDB.STATEMENTS.ERRORS.MAX.WARN} | Maximum number of SQL statements errors for trigger expression. | 2
{$COCKROACHDB.STORE.USED.MIN.CRIT} | The critical threshold of the available disk space in percent. | 10
{$COCKROACHDB.STORE.USED.MIN.WARN} | The warning threshold of the available disk space in percent. | 20
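
The clock offset macro is given in milliseconds, while the Clock offset item holds seconds: its preprocessing multiplies the nanosecond value of clock_offset_meannanos by 0.000000001, and the trigger multiplies the macro by 0.001 before comparing. A minimal Python sketch of that unit handling follows. The function name and the sample values are made up for illustration.

```python
NANOS_TO_SECONDS = 0.000000001  # MULTIPLIER step of the Clock offset item
MILLIS_TO_SECONDS = 0.001       # factor applied to the macro in the trigger


def clock_offset_too_high(offsets_nanos, warn_ms=300):
    """Mimic min(item, 5m) > macro * 0.001 on a window of raw samples.

    offsets_nanos: raw clock_offset_meannanos readings from the window
    warn_ms: value of {$COCKROACHDB.CLOCK.OFFSET.MAX.WARN} (default 300)
    """
    offsets_seconds = [value * NANOS_TO_SECONDS for value in offsets_nanos]
    return min(offsets_seconds) > warn_ms * MILLIS_TO_SECONDS


# Invented samples: every reading in the window is above 300 ms.
print(clock_offset_too_high([350_000_000, 320_000_000, 400_000_000]))
# Invented samples: one reading of 100 ms keeps the minimum below the limit.
print(clock_offset_too_high([350_000_000, 100_000_000]))
```

Because the trigger uses min() over the window, a single low reading is enough to keep it from firing.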

Template links

There are no template links in this template.

Discovery rules

Name Description Type Key and additional info
Storage metrics discovery

Discover per store metrics.

DEPENDENT cockroachdb.store.discovery

Preprocessing:

- PROMETHEUS_TO_JSON: capacity

- DISCARD_UNCHANGED_HEARTBEAT: 3h
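
The discovery rule above turns the capacity metric lines into a list of stores. The Python sketch below illustrates the idea on a small invented sample of Prometheus text. It assumes that {#STORE} is filled from the store label, as the item prototype patterns suggest. The function name and sample values are hypothetical, and the real PROMETHEUS_TO_JSON step produces a richer JSON structure than this.

```python
import re

# Matches lines such as: capacity{store="1"} 1.0e+11
# A name like capacity_available does not match, because "{" must follow "capacity".
CAPACITY_LINE = re.compile(r'^capacity\{([^}]*)\}\s+(\S+)$')
STORE_LABEL = re.compile(r'store="([^"]*)"')


def discover_stores(metrics_text):
    """Return one discovery row per store found in the capacity metric."""
    rows = []
    for line in metrics_text.splitlines():
        match = CAPACITY_LINE.match(line.strip())
        if not match:
            continue
        label = STORE_LABEL.search(match.group(1))
        if label:
            rows.append({"{#STORE}": label.group(1)})
    return rows


# Invented sample, shaped like the output of /_status/vars.
SAMPLE = """\
# HELP capacity Total storage capacity
# TYPE capacity gauge
capacity{store="1"} 1.0e+11
capacity_available{store="1"} 6.0e+10
capacity{store="2"} 2.0e+11
"""

print(discover_stores(SAMPLE))
```

Each discovered row then drives one set of the Storage [{#STORE}] item prototypes listed in the next section.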

Items collected

Group Name Description Type Key and additional info
CockroachDB CockroachDB: Service ping

Check if HTTP/HTTPS service accepts TCP connections.

SIMPLE net.tcp.service["{$COCKROACHDB.API.SCHEME}","{HOST.CONN}","{$COCKROACHDB.API.PORT}"]

Preprocessing:

- DISCARD_UNCHANGED_HEARTBEAT: 10m

CockroachDB CockroachDB: Clock offset

Mean clock offset of the node against the rest of the cluster.

DEPENDENT cockroachdb.clock.offset

Preprocessing:

- PROMETHEUS_PATTERN: clock_offset_meannanos: value: ``

- MULTIPLIER: 0.000000001

CockroachDB CockroachDB: Version

Build information.

DEPENDENT cockroachdb.version

Preprocessing:

- PROMETHEUS_PATTERN: build_timestamp: label: tag

- DISCARD_UNCHANGED_HEARTBEAT: 3h

CockroachDB CockroachDB: CPU: System time

System CPU time.

DEPENDENT cockroachdb.cpu.system_time

Preprocessing:

- PROMETHEUS_PATTERN: sys_cpu_sys_ns: value: ``

- CHANGE_PER_SECOND

- MULTIPLIER: 0.000000001

CockroachDB CockroachDB: CPU: User time

User CPU time.

DEPENDENT cockroachdb.cpu.user_time

Preprocessing:

- PROMETHEUS_PATTERN: sys_cpu_user_ns: value: ``

- CHANGE_PER_SECOND

- MULTIPLIER: 0.000000001

CockroachDB CockroachDB: CPU: Utilization

CPU utilization in %.

DEPENDENT cockroachdb.cpu.util

Preprocessing:

- PROMETHEUS_PATTERN: sys_cpu_combined_percent_normalized: value: ``

- MULTIPLIER: 100

CockroachDB CockroachDB: Disk: IOPS in progress, rate

Number of disk IO operations currently in progress on this host.

DEPENDENT cockroachdb.disk.iops.in_progress.rate

Preprocessing:

- PROMETHEUS_PATTERN: sys_host_disk_iopsinprogress: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Disk: Reads, rate

Bytes read from all disks per second since this process started.

DEPENDENT cockroachdb.disk.read.rate

Preprocessing:

- PROMETHEUS_PATTERN: sys_host_disk_read_bytes: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Disk: Read IOPS, rate

Number of disk read operations per second across all disks since this process started.

DEPENDENT cockroachdb.disk.iops.read.rate

Preprocessing:

- PROMETHEUS_PATTERN: sys_host_disk_read_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Disk: Writes, rate

Bytes written to all disks per second since this process started.

DEPENDENT cockroachdb.disk.write.rate

Preprocessing:

- PROMETHEUS_PATTERN: sys_host_disk_write_bytes: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Disk: Write IOPS, rate

Disk write operations per second across all disks since this process started.

DEPENDENT cockroachdb.disk.iops.write.rate

Preprocessing:

- PROMETHEUS_PATTERN: sys_host_disk_write_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: File descriptors: Limit

Open file descriptors soft limit of the process.

DEPENDENT cockroachdb.descriptors.limit

Preprocessing:

- PROMETHEUS_PATTERN: sys_fd_softlimit: value: ``

- DISCARD_UNCHANGED_HEARTBEAT: 3h

CockroachDB CockroachDB: File descriptors: Open

The number of open file descriptors.

DEPENDENT cockroachdb.descriptors.open

Preprocessing:

- PROMETHEUS_PATTERN: sys_fd_open: value: ``

CockroachDB CockroachDB: GC: Pause time

The amount of processor time used by Go's garbage collector across all nodes. During garbage collection, application code execution is paused.

DEPENDENT cockroachdb.gc.pause_time

Preprocessing:

- PROMETHEUS_PATTERN: sys_gc_pause_ns: value: ``

- CHANGE_PER_SECOND

- MULTIPLIER: 0.000000001

CockroachDB CockroachDB: GC: Runs, rate

The number of times that Go's garbage collector was invoked per second across all nodes.

DEPENDENT cockroachdb.gc.runs.rate

Preprocessing:

- PROMETHEUS_PATTERN: sys_gc_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Go: Goroutines count

Current number of Goroutines. This count should rise and fall based on load.

DEPENDENT cockroachdb.go.goroutines.count

Preprocessing:

- PROMETHEUS_PATTERN: sys_goroutines: value: ``

CockroachDB CockroachDB: KV transactions: Aborted, rate

Number of aborted KV transactions per second.

DEPENDENT cockroachdb.kv.transactions.aborted.rate

Preprocessing:

- PROMETHEUS_PATTERN: txn_aborts: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: KV transactions: Committed, rate

Number of KV transactions (including 1PC) committed per second.

DEPENDENT cockroachdb.kv.transactions.committed.rate

Preprocessing:

- PROMETHEUS_PATTERN: txn_commits: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Live nodes count

The number of live nodes in the cluster (will be 0 if this node is not itself live).

DEPENDENT cockroachdb.live_count

Preprocessing:

- PROMETHEUS_PATTERN: liveness_livenodes: value: ``

- DISCARD_UNCHANGED_HEARTBEAT: 3h

CockroachDB CockroachDB: Liveness heartbeats, rate

Number of successful node liveness heartbeats per second from this node.

DEPENDENT cockroachdb.heartbeaths.success.rate

Preprocessing:

- PROMETHEUS_PATTERN: liveness_heartbeatsuccesses: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Memory: Allocated by Cgo

Current bytes of memory allocated by the C layer.

DEPENDENT cockroachdb.memory.cgo.allocated

Preprocessing:

- PROMETHEUS_PATTERN: sys_cgo_allocbytes: value: ``

CockroachDB CockroachDB: Memory: Allocated by Go

Current bytes of memory allocated by the Go layer.

DEPENDENT cockroachdb.memory.go.allocated

Preprocessing:

- PROMETHEUS_PATTERN: sys_go_allocbytes: value: ``

CockroachDB CockroachDB: Memory: Managed by Cgo

Total bytes of memory managed by the C layer.

DEPENDENT cockroachdb.memory.cgo.managed

Preprocessing:

- PROMETHEUS_PATTERN: sys_cgo_totalbytes: value: ``

CockroachDB CockroachDB: Memory: Managed by Go

Total bytes of memory managed by the Go layer.

DEPENDENT cockroachdb.memory.go.managed

Preprocessing:

- PROMETHEUS_PATTERN: sys_go_totalbytes: value: ``

CockroachDB CockroachDB: Memory: Total usage

Resident set size (RSS) of memory in use by the node.

DEPENDENT cockroachdb.memory.total

Preprocessing:

- PROMETHEUS_PATTERN: sys_rss: value: ``

CockroachDB CockroachDB: Network: Bytes received, rate

Bytes received per second on all network interfaces since this process started.

DEPENDENT cockroachdb.network.bytes.received.rate

Preprocessing:

- PROMETHEUS_PATTERN: sys_host_net_recv_bytes: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Network: Bytes sent, rate

Bytes sent per second on all network interfaces since this process started.

DEPENDENT cockroachdb.network.bytes.sent.rate

Preprocessing:

- PROMETHEUS_PATTERN: sys_host_net_send_bytes: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Time series: Sample errors, rate

The number of errors encountered while attempting to write metrics to disk, per second.

DEPENDENT cockroachdb.ts.samples.errors.rate

Preprocessing:

- PROMETHEUS_PATTERN: timeseries_write_errors: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Time series: Samples written, rate

The number of successfully written metric samples per second.

DEPENDENT cockroachdb.ts.samples.written.rate

Preprocessing:

- PROMETHEUS_PATTERN: timeseries_write_samples: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Slow requests: DistSender RPCs

Number of RPCs stuck or retrying for a long time.

DEPENDENT cockroachdb.slow_requests.rpc

Preprocessing:

- PROMETHEUS_PATTERN: requests_slow_distsender: value: ``

CockroachDB CockroachDB: SQL: Bytes received, rate

Total amount of incoming SQL client network traffic in bytes per second.

DEPENDENT cockroachdb.sql.bytes.received.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_bytesin: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL: Bytes sent, rate

Total amount of outgoing SQL client network traffic in bytes per second.

DEPENDENT cockroachdb.sql.bytes.sent.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_bytesout: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Memory: Allocated by SQL

Current SQL statement memory usage for root.

DEPENDENT cockroachdb.memory.sql

Preprocessing:

- PROMETHEUS_PATTERN: sql_mem_root_current: value: ``

CockroachDB CockroachDB: SQL: Schema changes, rate

Total number of SQL DDL statements successfully executed per second.

DEPENDENT cockroachdb.sql.schema_changes.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_ddl_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL sessions: Open

Total number of open SQL sessions.

DEPENDENT cockroachdb.sql.sessions

Preprocessing:

- PROMETHEUS_PATTERN: sql_conns: value: ``

CockroachDB CockroachDB: SQL statements: Active

Total number of SQL statements currently active.

DEPENDENT cockroachdb.sql.statements.active

Preprocessing:

- PROMETHEUS_PATTERN: sql_distsql_queries_active: value: ``

CockroachDB CockroachDB: SQL statements: DELETE, rate

A moving average of the number of DELETE statements successfully executed per second.

DEPENDENT cockroachdb.sql.statements.delete.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_delete_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL statements: Executed, rate

Number of SQL queries executed per second.

DEPENDENT cockroachdb.sql.statements.executed.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_query_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL statements: Denials, rate

The number of statements denied per second by a feature flag.

DEPENDENT cockroachdb.sql.statements.denials.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_feature_flag_denial: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL statements: Active flows distributed, rate

The number of distributed SQL flows currently active per second.

DEPENDENT cockroachdb.sql.statements.flows.active.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_distsql_flows_active: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL statements: INSERT, rate

A moving average of the number of INSERT statements successfully executed per second.

DEPENDENT cockroachdb.sql.statements.insert.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_insert_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL statements: SELECT, rate

A moving average of the number of SELECT statements successfully executed per second.

DEPENDENT cockroachdb.sql.statements.select.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_select_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL statements: UPDATE, rate

A moving average of the number of UPDATE statements successfully executed per second.

DEPENDENT cockroachdb.sql.statements.update.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_update_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL statements: Contention, rate

Total number of SQL statements that experienced contention per second.

DEPENDENT cockroachdb.sql.statements.contention.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_distsql_contended_queries_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL statements: Errors, rate

Total number of statements which returned a planning or runtime error per second.

DEPENDENT cockroachdb.sql.statements.errors.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_failure_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL transactions: Open

Total number of currently open SQL transactions.

DEPENDENT cockroachdb.sql.transactions.open

Preprocessing:

- PROMETHEUS_PATTERN: sql_txns_open: value: ``

CockroachDB CockroachDB: SQL transactions: Aborted, rate

Total number of SQL transaction abort errors per second.

DEPENDENT cockroachdb.sql.transactions.aborted.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_txn_abort_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL transactions: Committed, rate

Total number of SQL transaction COMMIT statements successfully executed per second.

DEPENDENT cockroachdb.sql.transactions.committed.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_txn_commit_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL transactions: Initiated, rate

Total number of SQL transaction BEGIN statements successfully executed per second.

DEPENDENT cockroachdb.sql.transactions.initiated.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_txn_begin_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: SQL transactions: Rolled back, rate

Total number of SQL transaction ROLLBACK statements successfully executed per second.

DEPENDENT cockroachdb.sql.transactions.rollbacks.rate

Preprocessing:

- PROMETHEUS_PATTERN: sql_txn_rollback_count: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Uptime

Process uptime.

DEPENDENT cockroachdb.uptime

Preprocessing:

- PROMETHEUS_PATTERN: sys_uptime: value: ``

CockroachDB CockroachDB: Node certificate expiration date

Node certificate expires at that date.

DEPENDENT cockroachdb.cert.expire_date.node

Preprocessing:

- PROMETHEUS_PATTERN: security_certificate_expiration_node: value: ``

⛔️ON_FAIL: DISCARD_VALUE ->

- DISCARD_UNCHANGED_HEARTBEAT: 6h

CockroachDB CockroachDB: CA certificate expiration date

CA certificate expires at that date.

DEPENDENT cockroachdb.cert.expire_date.ca

Preprocessing:

- PROMETHEUS_PATTERN: security_certificate_expiration_ca: value: ``

⛔️ON_FAIL: DISCARD_VALUE ->

- DISCARD_UNCHANGED_HEARTBEAT: 6h

CockroachDB CockroachDB: Storage [{#STORE}]: Bytes: Live

Number of logical bytes stored in live key-value pairs on this node. Live data excludes historical and deleted data.

DEPENDENT cockroachdb.storage.bytes.[{#STORE},live]

Preprocessing:

- PROMETHEUS_PATTERN: livebytes{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Bytes: System

Number of physical bytes stored in system key-value pairs.

DEPENDENT cockroachdb.storage.bytes.[{#STORE},system]

Preprocessing:

- PROMETHEUS_PATTERN: sysbytes{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Capacity available

Available storage capacity.

DEPENDENT cockroachdb.storage.capacity.[{#STORE},available]

Preprocessing:

- PROMETHEUS_PATTERN: capacity_available{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Capacity total

Total storage capacity. This value may be explicitly set using --store. If a store size has not been set, this metric displays the actual disk capacity.

DEPENDENT cockroachdb.storage.capacity.[{#STORE},total]

Preprocessing:

- PROMETHEUS_PATTERN: capacity{store="{#STORE}"}: value: ``

- DISCARD_UNCHANGED_HEARTBEAT: 3h

CockroachDB CockroachDB: Storage [{#STORE}]: Capacity used

Disk space in use by CockroachDB data on this node. This excludes the Cockroach binary, operating system, and other system files.

DEPENDENT cockroachdb.storage.capacity.[{#STORE},used]

Preprocessing:

- PROMETHEUS_PATTERN: capacity_used{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Capacity available in %

Available storage capacity in %.

CALCULATED cockroachdb.storage.capacity.[{#STORE},available_percent]

Expression:

last(//cockroachdb.storage.capacity.[{#STORE},available]) / last(//cockroachdb.storage.capacity.[{#STORE},total]) * 100
CockroachDB CockroachDB: Storage [{#STORE}]: Replication: Lease holders

Number of lease holders.

DEPENDENT cockroachdb.replication.[{#STORE},lease_holders]

Preprocessing:

- PROMETHEUS_PATTERN: replicas_leaseholders{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Bytes: Logical

Number of logical bytes stored in key-value pairs on this node. This includes historical and deleted data.

DEPENDENT cockroachdb.storage.bytes.[{#STORE},logical]

Preprocessing:

- PROMETHEUS_PATTERN: totalbytes{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Rebalancing: Average queries, rate

Number of kv-level requests received per second by the store, averaged over a large time period as used in rebalancing decisions.

DEPENDENT cockroachdb.rebalancing.queries.average.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: rebalancing_queriespersecond{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Rebalancing: Average writes, rate

Number of keys written (i.e. applied by raft) per second to the store, averaged over a large time period as used in rebalancing decisions.

DEPENDENT cockroachdb.rebalancing.writes.average.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: rebalancing_writespersecond{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Queue processing failures: Consistency, rate

Number of replicas which failed processing in the consistency checker queue per second.

DEPENDENT cockroachdb.queue.processing_failures.consistency.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: queue_consistency_process_failure{store="{#STORE}"}: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Storage [{#STORE}]: Queue processing failures: GC, rate

Number of replicas which failed processing in the GC queue per second.

DEPENDENT cockroachdb.queue.processing_failures.gc.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: queue_gc_process_failure{store="{#STORE}"}: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Storage [{#STORE}]: Queue processing failures: Raft log, rate

Number of replicas which failed processing in the Raft log queue per second.

DEPENDENT cockroachdb.queue.processing_failures.raftlog.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: queue_raftlog_process_failure{store="{#STORE}"}: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Storage [{#STORE}]: Queue processing failures: Raft snapshot, rate

Number of replicas which failed processing in the Raft repair queue per second.

DEPENDENT cockroachdb.queue.processing_failures.raftsnapshot.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: queue_raftsnapshot_process_failure{store="{#STORE}"}: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Storage [{#STORE}]: Queue processing failures: Replica GC, rate

Number of replicas which failed processing in the replica GC queue per second.

DEPENDENT cockroachdb.queue.processing_failures.gc_replica.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: queue_replicagc_process_failure{store="{#STORE}"}: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Storage [{#STORE}]: Queue processing failures: Replicate, rate

Number of replicas which failed processing in the replicate queue per second.

DEPENDENT cockroachdb.queue.processing_failures.replicate.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: queue_replicate_process_failure{store="{#STORE}"}: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Storage [{#STORE}]: Queue processing failures: Split, rate

Number of replicas which failed processing in the split queue per second.

DEPENDENT cockroachdb.queue.processing_failures.split.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: queue_split_process_failure{store="{#STORE}"}: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Storage [{#STORE}]: Queue processing failures: Time series maintenance, rate

Number of replicas which failed processing in the time series maintenance queue per second.

DEPENDENT cockroachdb.queue.processing_failures.tsmaintenance.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: queue_tsmaintenance_process_failure{store="{#STORE}"}: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Storage [{#STORE}]: Ranges count

Number of ranges.

DEPENDENT cockroachdb.ranges.[{#STORE},count]

Preprocessing:

- PROMETHEUS_PATTERN: ranges{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Ranges unavailable

Number of ranges with fewer live replicas than needed for quorum.

DEPENDENT cockroachdb.ranges.[{#STORE},unavailable]

Preprocessing:

- PROMETHEUS_PATTERN: ranges_unavailable{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Ranges underreplicated

Number of ranges with fewer live replicas than the replication target.

DEPENDENT cockroachdb.ranges.[{#STORE},underreplicated]

Preprocessing:

- PROMETHEUS_PATTERN: ranges_underreplicated{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: RocksDB read amplification

The average number of real read operations executed per logical read operation.

DEPENDENT cockroachdb.rocksdb.[{#STORE},read_amp]

Preprocessing:

- PROMETHEUS_PATTERN: rocksdb_read_amplification{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: RocksDB cache hits, rate

Count of block cache hits per second.

DEPENDENT cockroachdb.rocksdb.cache.hits.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: rocksdb_block_cache_hits{store="{#STORE}"}: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Storage [{#STORE}]: RocksDB cache misses, rate

Count of block cache misses per second.

DEPENDENT cockroachdb.rocksdb.cache.misses.[{#STORE},rate]

Preprocessing:

- PROMETHEUS_PATTERN: rocksdb_block_cache_misses{store="{#STORE}"}: value: ``

- CHANGE_PER_SECOND

CockroachDB CockroachDB: Storage [{#STORE}]: RocksDB cache hit ratio

Block cache hit ratio in %.

CALCULATED cockroachdb.rocksdb.cache.[{#STORE},hit_ratio]

Expression:

last(//cockroachdb.rocksdb.cache.hits.[{#STORE},rate]) / (last(//cockroachdb.rocksdb.cache.hits.[{#STORE},rate]) + last(//cockroachdb.rocksdb.cache.misses.[{#STORE},rate])) * 100
CockroachDB CockroachDB: Storage [{#STORE}]: Replication: Replicas

Number of replicas.

DEPENDENT cockroachdb.replication.replicas.[{#STORE},count]

Preprocessing:

- PROMETHEUS_PATTERN: replicas{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Replication: Replicas quiesced

Number of quiesced replicas.

DEPENDENT cockroachdb.replication.replicas.[{#STORE},quiesced]

Preprocessing:

- PROMETHEUS_PATTERN: replicas_quiescent{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Slow requests: Latch acquisitions

Number of requests that have been stuck for a long time acquiring latches.

DEPENDENT cockroachdb.slow_requests.[{#STORE},latch_acquisitions]

Preprocessing:

- PROMETHEUS_PATTERN: requests_slow_latch{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Slow requests: Lease acquisitions

Number of requests that have been stuck for a long time acquiring a lease.

DEPENDENT cockroachdb.slow_requests.[{#STORE},lease_acquisitions]

Preprocessing:

- PROMETHEUS_PATTERN: requests_slow_lease{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: Slow requests: Raft proposals

Number of requests that have been stuck for a long time in raft.

DEPENDENT cockroachdb.slow_requests.[{#STORE},raft_proposals]

Preprocessing:

- PROMETHEUS_PATTERN: requests_slow_raft{store="{#STORE}"}: value: ``

CockroachDB CockroachDB: Storage [{#STORE}]: RocksDB SSTables

The number of SSTables in use.

DEPENDENT cockroachdb.rocksdb.[{#STORE},sstables]

Preprocessing:

- PROMETHEUS_PATTERN: rocksdb_num_sstables{store="{#STORE}"}: value: ``

Zabbix raw items CockroachDB: Get metrics

Get raw metrics from the Prometheus endpoint.

HTTP_AGENT cockroachdb.get_metrics

Preprocessing:

- CHECK_NOT_SUPPORTED

⛔️ON_FAIL: DISCARD_VALUE ->

Zabbix raw items CockroachDB: Get health

Get node /health endpoint

HTTP_AGENT cockroachdb.get_health

Preprocessing:

- CHECK_NOT_SUPPORTED

⛔️ON_FAIL: DISCARD_VALUE ->

- REGEX: HTTP.*\s(\d+): \1

- DISCARD_UNCHANGED_HEARTBEAT: 3h

Zabbix raw items CockroachDB: Get readiness

Get node /health?ready=1 endpoint

HTTP_AGENT cockroachdb.get_readiness

Preprocessing:

- CHECK_NOT_SUPPORTED

⛔️ON_FAIL: DISCARD_VALUE ->

- REGEX: HTTP.*\s(\d+): \1

- DISCARD_UNCHANGED_HEARTBEAT: 3h
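
Most items in the table above chain the same few preprocessing steps: pick one metric out of the Prometheus text, turn a growing counter into a per-second rate, and scale the result. The health items instead extract the status code with the regular expression HTTP.*\s(\d+). The Python sketch below imitates these steps on invented input. The function names, timestamps, and sample values are assumptions for illustration, and the parsing is far simpler than what Zabbix actually does.

```python
import re


def prometheus_value(metrics_text, metric):
    """Rough stand-in for PROMETHEUS_PATTERN: value of one metric line."""
    for line in metrics_text.splitlines():
        if line.startswith("#"):      # skip HELP and TYPE comments
            continue
        parts = line.rsplit(None, 1)  # "<name or name{labels}> <value>"
        if len(parts) == 2 and parts[0] == metric:
            return float(parts[1])
    return None


def change_per_second(prev_value, prev_time, value, time):
    """(value - prev_value) / (time - prev_time), with times in seconds."""
    if time <= prev_time or value < prev_value:
        # Nothing usable, for example a counter reset after a restart.
        return None
    return (value - prev_value) / (time - prev_time)


def http_status(raw):
    """Apply the REGEX step of the health items and return capture group 1."""
    match = re.search(r"HTTP.*\s(\d+)", raw)
    return match.group(1) if match else None


# Two invented polls of sys_cpu_sys_ns, taken 60 seconds apart.
first_poll = "# TYPE sys_cpu_sys_ns gauge\nsys_cpu_sys_ns 4.0e+09\n"
second_poll = "sys_cpu_sys_ns 4.6e+09\n"

v1 = prometheus_value(first_poll, "sys_cpu_sys_ns")
v2 = prometheus_value(second_poll, "sys_cpu_sys_ns")
rate_ns = change_per_second(v1, 100, v2, 160)   # nanoseconds per second
cpu_system_time = rate_ns * 0.000000001         # MULTIPLIER step

print(cpu_system_time)
print(http_status("HTTP/1.1 503 Service Unavailable"))  # prints 503
```

The order of the steps matters only for readability here: taking the rate first and scaling afterwards gives the same result as scaling first.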

Triggers

Name Description Expression Severity Dependencies and additional info
CockroachDB: Service is down

-

last(/CockroachDB by HTTP/net.tcp.service["{$COCKROACHDB.API.SCHEME}","{HOST.CONN}","{$COCKROACHDB.API.PORT}"]) = 0 AVERAGE
CockroachDB: Clock offset is too high

The clock offset measured by CockroachDB is nearing the limit (by default, servers shut themselves down at 400 ms from the mean).

min(/CockroachDB by HTTP/cockroachdb.clock.offset,5m) > {$COCKROACHDB.CLOCK.OFFSET.MAX.WARN} * 0.001 WARNING
CockroachDB: Version has changed

-

last(/CockroachDB by HTTP/cockroachdb.version) <> last(/CockroachDB by HTTP/cockroachdb.version,#2) and length(last(/CockroachDB by HTTP/cockroachdb.version)) > 0 INFO
CockroachDB: Current number of open files is too high

Getting close to open file descriptor limit.

min(/CockroachDB by HTTP/cockroachdb.descriptors.open,10m) / last(/CockroachDB by HTTP/cockroachdb.descriptors.limit) * 100 > {$COCKROACHDB.OPEN.FDS.MAX.WARN} WARNING
CockroachDB: Node is not executing SQL

Node is not executing SQL despite having connections.

last(/CockroachDB by HTTP/cockroachdb.sql.sessions) > 0 and last(/CockroachDB by HTTP/cockroachdb.sql.statements.executed.rate) = 0 WARNING
CockroachDB: SQL statements errors rate is too high

-

min(/CockroachDB by HTTP/cockroachdb.sql.statements.errors.rate,5m) > {$COCKROACHDB.STATEMENTS.ERRORS.MAX.WARN} WARNING
CockroachDB: Node has been restarted

Uptime is less than 10 minutes.

last(/CockroachDB by HTTP/cockroachdb.uptime) < 10m INFO
CockroachDB: Failed to fetch node data

Zabbix has not received data for items for the last 5 minutes.

nodata(/CockroachDB by HTTP/cockroachdb.uptime,5m) = 1 WARNING

Depends on:

- CockroachDB: Service is down

CockroachDB: Node certificate expires soon

Node certificate expires soon.

(last(/CockroachDB by HTTP/cockroachdb.cert.expire_date.node) - now()) / 86400 < {$COCKROACHDB.CERT.NODE.EXPIRY.WARN} WARNING
CockroachDB: CA certificate expires soon

CA certificate expires soon.

(last(/CockroachDB by HTTP/cockroachdb.cert.expire_date.ca) - now()) / 86400 < {$COCKROACHDB.CERT.CA.EXPIRY.WARN} WARNING
CockroachDB: Storage [{#STORE}]: Available storage capacity is low

Storage is running low on free space (less than {$COCKROACHDB.STORE.USED.MIN.WARN}% available).

max(/CockroachDB by HTTP/cockroachdb.storage.capacity.[{#STORE},available_percent],5m) < {$COCKROACHDB.STORE.USED.MIN.WARN}

Recovery expression:

min(/CockroachDB by HTTP/cockroachdb.storage.capacity.[{#STORE},available_percent],5m) > {$COCKROACHDB.STORE.USED.MIN.WARN}
WARNING

Depends on:

- CockroachDB: Storage [{#STORE}]: Available storage capacity is critically low

CockroachDB: Storage [{#STORE}]: Available storage capacity is critically low

Storage is running critically low on free space (less than {$COCKROACHDB.STORE.USED.MIN.CRIT}% available).

max(/CockroachDB by HTTP/cockroachdb.storage.capacity.[{#STORE},available_percent],5m) < {$COCKROACHDB.STORE.USED.MIN.CRIT}

Recovery expression:

min(/CockroachDB by HTTP/cockroachdb.storage.capacity.[{#STORE},available_percent],5m) > {$COCKROACHDB.STORE.USED.MIN.CRIT}
AVERAGE
CockroachDB: Node is unhealthy

The node's /health endpoint has returned HTTP 500 Internal Server Error, which indicates that the node is unhealthy.

last(/CockroachDB by HTTP/cockroachdb.get_health) = 500 AVERAGE

Depends on:

- CockroachDB: Service is down

CockroachDB: Node is not ready

Node's /health?ready=1 endpoint has returned HTTP 503 Service Unavailable. Possible reasons:

- node is in the wait phase of the node shutdown sequence;

- node is unable to communicate with a majority of the other nodes in the cluster, likely because the cluster is unavailable due to too many nodes being down.

last(/CockroachDB by HTTP/cockroachdb.get_readiness) = 503 and last(/CockroachDB by HTTP/cockroachdb.uptime) > 5m AVERAGE

Depends on:

- CockroachDB: Service is down
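
The two storage triggers above pair a problem expression, max(...,5m) < threshold, with a recovery expression, min(...,5m) > threshold. The Python sketch below shows how such a pair behaves, together with the percentage formula of the calculated item it reads. It assumes that a problem closes only once the recovery expression is true and the problem expression is false. The function names and sample windows are invented for illustration.

```python
def available_percent(available_bytes, total_bytes):
    """Formula of the 'Capacity available in %' calculated item."""
    return available_bytes / total_bytes * 100


def next_state(in_problem, window, threshold):
    """Return True if the trigger is in the problem state after this window.

    window: available_percent samples from the last 5 minutes
    threshold: for example {$COCKROACHDB.STORE.USED.MIN.WARN} (default 20)
    """
    problem = max(window) < threshold    # every sample is below the threshold
    recovery = min(window) > threshold   # every sample is above the threshold
    if not in_problem:
        return problem
    # Assumption: stay in the problem state until recovery holds
    # and the problem expression no longer does.
    return not (recovery and not problem)


state = False
history = []
# Invented windows: disk fills up, hovers around the threshold, then frees up.
for window in ([18, 19, 17], [19, 22, 25], [21, 22, 25]):
    state = next_state(state, window, 20)
    history.append(state)

print(history)
```

The middle window shows the point of the recovery expression: one sample is still below 20 %, so an open problem stays open instead of flapping.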

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help with it on the ZABBIX forums.
