HashiCorp Consul Node by HTTP
Overview
For Zabbix version: 6.2 and higher
This template monitors HashiCorp Consul with Zabbix and works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Remember to enable the Prometheus format for metrics export; see the documentation.
More information about the metrics can be found in the official documentation.
Template HashiCorp Consul Node by HTTP collects metrics by HTTP agent from the /v1/agent/metrics endpoint.
This template was tested on:
- HashiCorp Consul, version 1.10.0
Setup
See Zabbix template operation for basic instructions.
Internal service metrics are collected from the /v1/agent/metrics endpoint. Remember to enable the Prometheus format for metrics export; see the documentation. The template needs to authorize with an API token.
Remember to change the macros {$CONSUL.NODE.API.URL} and {$CONSUL.TOKEN}.
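As a rough illustration of the request behind the raw metrics item, the sketch below builds the metrics URL and an authentication header from the values of the two macros. It assumes Consul's `format=prometheus` query parameter and the `X-Consul-Token` header; the function name is hypothetical, and the exact headers the template sends are not shown in this README.

```python
from urllib.parse import urlencode


def build_metrics_request(api_url: str, token: str) -> tuple[str, dict[str, str]]:
    """Return the URL and headers for a Prometheus-format metrics request.

    api_url stands in for {$CONSUL.NODE.API.URL} and token for {$CONSUL.TOKEN}.
    """
    query = urlencode({"format": "prometheus"})
    url = f"{api_url.rstrip('/')}/v1/agent/metrics?{query}"
    # Assumption: the token is passed in the X-Consul-Token header.
    headers = {"X-Consul-Token": token}
    return url, headers


url, headers = build_metrics_request("http://localhost:8500", "<PUT YOUR AUTH TOKEN>")
print(url)
```

Building the request this way can be handy for checking by hand, with any HTTP client, that the agent answers in Prometheus format before linking the template.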
Also, see the Macros section for a list of macros used to set trigger values.
This template supports Consul namespaces. You can set the macros {$CONSUL.LLD.FILTER.SERVICE_NAMESPACE.MATCHES} and {$CONSUL.LLD.FILTER.SERVICE_NAMESPACE.NOT_MATCHES} if you want to filter discovered services by namespace.
In the Open Source version, the service namespace is set to 'None'.
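The way the MATCHES and NOT_MATCHES macros combine can be sketched as follows. This is only an approximation of the discovery filter: it assumes all four conditions must hold at once, uses Python's `re` module in place of the regular expression engine Zabbix uses, and the function name and sample services are hypothetical.

```python
import re


def is_discovered(
    name: str,
    namespace: str,
    name_matches: str = ".*",
    name_not_matches: str = "CHANGE IF NEEDED",
    ns_matches: str = ".*",
    ns_not_matches: str = "CHANGE IF NEEDED",
) -> bool:
    """Return True if a service passes all four filter conditions."""
    return all(
        [
            re.search(name_matches, name) is not None,
            re.search(name_not_matches, name) is None,
            re.search(ns_matches, namespace) is not None,
            re.search(ns_not_matches, namespace) is None,
        ]
    )


# With the default macro values every service is discovered.
print(is_discovered("web", "None"))
# Excluding services by name.
print(is_discovered("web-canary", "None", name_not_matches="canary$"))
# Keeping only services from namespaces that start with "team-".
print(is_discovered("api", "team-a", ns_matches="^team-"))
print(is_discovered("api", "default", ns_matches="^team-"))
```

The default NOT_MATCHES value, CHANGE IF NEEDED, acts as a placeholder pattern that ordinary service names and namespaces are unlikely to contain, so nothing is excluded until it is changed.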
NOTE. Some metrics may not be collected depending on your HashiCorp Consul instance version and configuration.
NOTE. You may also be interested in the Envoy Proxy by HTTP template.
Zabbix configuration
No specific Zabbix configuration is required.
Macros used
Name | Description | Default |
---|---|---|
{$CONSUL.LLD.FILTER.LOCAL_SERVICE_NAME.MATCHES} | Filter for discoverable services on the local node. | .* |
{$CONSUL.LLD.FILTER.LOCAL_SERVICE_NAME.NOT_MATCHES} | Filter to exclude discovered services on the local node. | CHANGE IF NEEDED |
{$CONSUL.LLD.FILTER.SERVICE_NAMESPACE.MATCHES} | Filter for discoverable services by namespace on the local node. Enterprise only; in the Open Source version the namespace is set to 'None'. | .* |
{$CONSUL.LLD.FILTER.SERVICE_NAMESPACE.NOT_MATCHES} | Filter to exclude discovered services by namespace on the local node. Enterprise only; in the Open Source version the namespace is set to 'None'. | CHANGE IF NEEDED |
{$CONSUL.NODE.API.URL} | Consul instance URL. | http://localhost:8500 |
{$CONSUL.NODE.HEALTH_SCORE.MAX.HIGH} | Maximum acceptable value of node's health score for AVERAGE trigger expression. | 4 |
{$CONSUL.NODE.HEALTH_SCORE.MAX.WARN} | Maximum acceptable value of node's health score for WARNING trigger expression. | 2 |
{$CONSUL.OPEN.FDS.MAX.WARN} | Maximum percentage of used file descriptors. | 90 |
{$CONSUL.TOKEN} | Consul auth token. | <PUT YOUR AUTH TOKEN> |
Template links
There are no template links in this template.
Discovery rules
Name | Description | Type | Key and additional info |
---|---|---|---|
HTTP API methods discovery | Discovers metrics specific to HTTP API methods. | DEPENDENT | consul.http_api_discovery Preprocessing: - PROMETHEUS_TO_JSON: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Local node services discovery | Discover metrics for services that are registered with the local agent. | DEPENDENT | consul.node_services_lld Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: Filter: - {#SERVICE_NAME} MATCHES_REGEX - {#SERVICE_NAME} NOT_MATCHES_REGEX - {#SERVICE_NAMESPACE} MATCHES_REGEX - {#SERVICE_NAMESPACE} NOT_MATCHES_REGEX Overrides: aggregated status - ITEM_PROTOTYPE LIKE State - DISCOVERchecks |
Raft leader metrics discovery | Discover raft metrics for leader nodes. | DEPENDENT | consul.raft.leader.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Raft server metrics discovery | Discover raft metrics for server nodes. | DEPENDENT | consul.raft.server.discovery Preprocessing: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Items collected
Group | Name | Description | Type | Key and additional info |
---|---|---|---|---|
Consul | Consul: Role | Role of current Consul agent. |
DEPENDENT | consul.role Preprocessing: - JSONPATH: - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Version | Version of Consul agent. |
DEPENDENT | consul.version Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Number of services | Number of services on current node. |
DEPENDENT | consul.services_number Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Number of checks | Number of checks on current node. |
DEPENDENT | consul.checks_number Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Number of check monitors | Number of check monitors on current node. |
DEPENDENT | consul.check_monitors_number Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Process CPU seconds, total | Total user and system CPU time spent in seconds. |
DEPENDENT | consul.cpu_seconds_total.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Virtual memory size | Virtual memory size in bytes. |
DEPENDENT | consul.virtual_memory_bytes Preprocessing: - PROMETHEUS_PATTERN: |
Consul | Consul: RSS memory usage | Resident memory size in bytes. |
DEPENDENT | consul.resident_memory_bytes Preprocessing: - PROMETHEUS_PATTERN: |
Consul | Consul: Goroutine count | The number of Goroutines on Consul instance. |
DEPENDENT | consul.goroutines Preprocessing: - PROMETHEUS_PATTERN: |
Consul | Consul: Open file descriptors | Number of open file descriptors. |
DEPENDENT | consul.process_open_fds Preprocessing: - PROMETHEUS_PATTERN: |
Consul | Consul: Open file descriptors, max | Maximum number of open file descriptors. |
DEPENDENT | consul.process_max_fds Preprocessing: - PROMETHEUS_PATTERN: |
Consul | Consul: Client RPC, per second | Number of RPC requests per second that a Consul agent in client mode makes to a Consul server. This gives a measure of how much a given agent is loading the Consul servers. This is only generated by agents in client mode, not Consul servers. |
DEPENDENT | consul.client_rpc Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Client RPC failed ,per second | Number of failed RPC requests per second that a Consul agent in client mode makes to a Consul server. |
DEPENDENT | consul.client_rpc_failed Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: TCP connections, accepted per second | This metric counts the number of times a Consul agent has accepted an incoming TCP stream connection per second. |
DEPENDENT | consul.memberlist.tcp_accept Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: TCP connections, per second | This metric counts the number of times a Consul agent has initiated a push/pull sync with another agent per second. |
DEPENDENT | consul.memberlist.tcp_connect Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: TCP send bytes, per second | This metric measures the total number of bytes sent by a Consul agent through the TCP protocol per second. |
DEPENDENT | consul.memberlist.tcp_sent Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: UDP received bytes, per second | This metric measures the total number of bytes received by a Consul agent through the UDP protocol per second. |
DEPENDENT | consul.memberlist.udp_received Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: UDP sent bytes, per second | This metric measures the total number of bytes sent by a Consul agent through the UDP protocol per second. |
DEPENDENT | consul.memberlist.udp_sent Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: GC pause, p90 | The 90 percentile for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started, in milliseconds. |
DEPENDENT | consul.gc_pause.p90 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: - MULTIPLIER: |
Consul | Consul: GC pause, p50 | The 50 percentile (median) for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started, in milliseconds. |
DEPENDENT | consul.gc_pause.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: - MULTIPLIER: |
Consul | Consul: Memberlist: degraded | This metric counts the number of times the Consul agent has performed failure detection on another agent at a slower probe rate. The agent uses its own health metric as an indicator to perform this action. If its health score is low, it means that the node is healthy, and vice versa. |
DEPENDENT | consul.memberlist.degraded Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Memberlist: health score | This metric describes a node's perception of its own health based on how well it is meeting the soft real-time requirements of the protocol. This metric ranges from 0 to 8, where 0 indicates "totally healthy". |
DEPENDENT | consul.memberlist.health_score Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Memberlist: gossip, p90 | The 90 percentile for the number of gossips (messages) broadcasted to a set of randomly selected nodes. |
DEPENDENT | consul.memberlist.dispatch_log.p90 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Memberlist: gossip, p50 | The 50 percentile (median) for the number of gossips (messages) broadcast to a set of randomly selected nodes. |
DEPENDENT | consul.memberlist.gossip.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Memberlist: msg alive | This metric counts the number of alive Consul agents, that the agent has mapped out so far, based on the message information given by the network layer. |
DEPENDENT | consul.memberlist.msg.alive Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Memberlist: msg dead | This metric counts the number of times a Consul agent has marked another agent to be a dead node. |
DEPENDENT | consul.memberlist.msg.dead Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Memberlist: msg suspect | The number of times a Consul agent suspects another as failed while probing during gossip protocol. |
DEPENDENT | consul.memberlist.msg.suspect Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Memberlist: probe node, p90 | The 90 percentile for the time taken to perform a single round of failure detection on a select Consul agent. |
DEPENDENT | consul.memberlist.probe_node.p90 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Memberlist: probe node, p50 | The 50 percentile (median) for the time taken to perform a single round of failure detection on a select Consul agent. |
DEPENDENT | consul.memberlist.probe_node.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Memberlist: push pull node, p90 | The 90 percentile for the number of Consul agents that have exchanged state with this agent. |
DEPENDENT | consul.memberlist.push_pull_node.p90 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Memberlist: push pull node, p50 | The 50 percentile (median) for the number of Consul agents that have exchanged state with this agent. |
DEPENDENT | consul.memberlist.push_pull_node.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: KV store: apply, p90 | The 90 percentile for the time it takes to complete an update to the KV store. |
DEPENDENT | consul.kvs.apply.p90 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: KV store: apply, p50 | The 50 percentile (median) for the time it takes to complete an update to the KV store. |
DEPENDENT | consul.kvs.apply.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: KV store: apply, rate | The number of updates to the KV store per second. |
DEPENDENT | consul.kvs.apply.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Serf member: flap, rate | Increments when an agent is marked dead and then recovers within a short time period. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports. Shown as events per second. |
DEPENDENT | consul.serf.member.flap.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Serf member: failed, rate | Increments when an agent is marked dead. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports. Shown as events per second. |
DEPENDENT | consul.serf.member.failed.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Serf member: join, rate | Increments when an agent joins the cluster. If an agent flapped or failed this counter also increments when it re-joins. Shown as events per second. |
DEPENDENT | consul.serf.member.join.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Serf member: left, rate | Increments when an agent leaves the cluster. Shown as events per second. |
DEPENDENT | consul.serf.member.left.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Serf member: update, rate | Increments when a Consul agent updates. Shown as events per second. |
DEPENDENT | consul.serf.member.update.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: ACL: resolves, rate | The number of ACL resolves per second. |
DEPENDENT | consul.acl.resolves.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Catalog: register, rate | The number of catalog register operations per second. |
DEPENDENT | consul.catalog.register.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Catalog: deregister, rate | The number of catalog deregister operations per second. |
DEPENDENT | consul.catalog.deregister.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Snapshot: append line, p90 | The 90 percentile for the time taken by the Consul agent to append an entry into the existing log. |
DEPENDENT | consul.snapshot.append_line.p90 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Snapshot: append line, p50 | The 50 percentile (median) for the time taken by the Consul agent to append an entry into the existing log. |
DEPENDENT | consul.snapshot.append_line.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Snapshot: append line, rate | The number of snapshot appendLine operations per second. |
DEPENDENT | consul.snapshot.append_line.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Snapshot: compact, p90 | The 90 percentile for the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction. |
DEPENDENT | consul.snapshot.compact.p90 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Snapshot: compact, p50 | The 50 percentile (median) for the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction. |
DEPENDENT | consul.snapshot.compact.p50 Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Snapshot: compact, rate | The number of snapshot compact operations per second. |
DEPENDENT | consul.snapshot.compact.rate Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Get local services check | Data collection check. |
DEPENDENT | consul.get_local_services.check Preprocessing: - JSONPATH: ⛔️ON_FAIL: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: ["{#SERVICE_NAME}"]: Aggregated status | Aggregated values of all health checks for the service instance. |
DEPENDENT | consul.service.aggregated_state["{#SERVICE_ID}"] Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: ["{#SERVICE_NAME}"]: Check ["{#SERVICE_CHECK_NAME}"]: Status | Current state of health check for the service. |
DEPENDENT | consul.service.check.state["{#SERVICE_ID}/{#SERVICE_CHECK_ID}"] Preprocessing: - JSONPATH: - JAVASCRIPT: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: ["{#SERVICE_NAME}"]: Check ["{#SERVICE_CHECK_NAME}"]: Output | Current output of health check for the service. |
DEPENDENT | consul.service.check.output["{#SERVICE_ID}/{#SERVICE_CHECK_ID}"] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: HTTP request: ["{#HTTP_METHOD}"], p90 | The 90 percentile of how long it takes to service the given HTTP request for the given verb. |
DEPENDENT | consul.http.api.p90["{#HTTP_METHOD}"] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: HTTP request: ["{#HTTP_METHOD}"], p50 | The 50 percentile (median) of how long it takes to service the given HTTP request for the given verb. |
DEPENDENT | consul.http.api.p50["{#HTTP_METHOD}"] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: HTTP request: ["{#HTTP_METHOD}"], rate | The number of HTTP requests for the given verb per second. |
DEPENDENT | consul.http.api.rate["{#HTTP_METHOD}"] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Raft state | Current state of Consul agent. |
DEPENDENT | consul.raft.state[{#SINGLETON}] Preprocessing: - JSONPATH: - DISCARD_UNCHANGED_HEARTBEAT: |
Consul | Consul: Raft state: leader | Increments when a server becomes a leader. |
DEPENDENT | consul.raft.state_leader[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Raft state: candidate | The number of initiated leader elections. |
DEPENDENT | consul.raft.state_candidate[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Consul | Consul: Raft: apply, rate | Incremented whenever a leader first passes a message into the Raft commit process (called an Apply operation). This metric describes the arrival rate of new logs into Raft per second. |
DEPENDENT | consul.raft.apply.rate[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Raft state: leader last contact, p90 | The 90 percentile of how long it takes a leader node to communicate with followers during a leader lease check, in milliseconds. |
DEPENDENT | consul.raft.leader_last_contact.p90[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: leader last contact, p50 | The 50 percentile (median) of how long it takes a leader node to communicate with followers during a leader lease check, in milliseconds. |
DEPENDENT | consul.raft.leader_last_contact.p50[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: commit time, p90 | The 90 percentile time it takes to commit a new entry to the raft log on the leader, in milliseconds. |
DEPENDENT | consul.raft.commit_time.p90[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: commit time, p50 | The 50 percentile (median) time it takes to commit a new entry to the raft log on the leader, in milliseconds. |
DEPENDENT | consul.raft.commit_time.p50[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: dispatch log, p90 | The 90 percentile time it takes for the leader to write log entries to disk, in milliseconds. |
DEPENDENT | consul.raft.dispatch_log.p90[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: dispatch log, p50 | The 50 percentile (median) time it takes for the leader to write log entries to disk, in milliseconds. |
DEPENDENT | consul.raft.dispatch_log.p50[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - JAVASCRIPT: |
Consul | Consul: Raft state: dispatch log, rate | The number of times a Raft leader writes a log to disk per second. |
DEPENDENT | consul.raft.dispatch_log.rate[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Raft state: commit, rate | The number of times per second a new entry is committed to the Raft log on the leader. |
DEPENDENT | consul.raft.commit_time.rate[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: - CHANGE_PER_SECOND |
Consul | Consul: Autopilot healthy | Tracks the overall health of the local server cluster. 1 if all servers are healthy, 0 if one or more are unhealthy. |
DEPENDENT | consul.autopilot.healthy[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: ⛔️ON_FAIL: |
Zabbix raw items | Consul: Get instance metrics | Get raw metrics from Consul instance /metrics endpoint. |
HTTP_AGENT | consul.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: |
Zabbix raw items | Consul: Get node info | Get configuration and member information of the local agent. |
HTTP_AGENT | consul.get_node_info Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: |
Zabbix raw items | Consul: Get local services | Get all the services that are registered with the local agent and their status. |
SCRIPT | consul.get_local_services Expression: The text is too long. Please see the template. |
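Many of the dependent items above follow the same two-step pattern: a PROMETHEUS_PATTERN step picks one sample out of the raw metrics text, and CHANGE_PER_SECOND turns a growing counter into a rate. The sketch below imitates that pattern outside Zabbix. The sample text and helper names are illustrative assumptions, and the parser only handles simple lines without trailing timestamps.

```python
def prometheus_value(text: str, metric: str) -> float:
    """Return the value of the first sample whose name (with labels) equals metric."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        if name == metric:
            return float(value)
    raise KeyError(metric)


def change_per_second(prev_value: float, prev_time: float, value: float, time: float) -> float:
    """Rate of change between two consecutive samples of a counter."""
    return (value - prev_value) / (time - prev_time)


# Illustrative payloads, taken 60 seconds apart.
first = "# TYPE process_cpu_seconds_total counter\nprocess_cpu_seconds_total 120.5\n"
second = "# TYPE process_cpu_seconds_total counter\nprocess_cpu_seconds_total 123.5\n"

rate = change_per_second(
    prometheus_value(first, "process_cpu_seconds_total"), 0.0,
    prometheus_value(second, "process_cpu_seconds_total"), 60.0,
)
print(rate)
```

Because every dependent item reads from the same raw payload, a single HTTP request per polling interval is enough to feed all of them.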
Triggers
Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Consul: Version has been changed | Consul version has changed. Ack to close. | last(/HashiCorp Consul Node by HTTP/consul.version,#1)<>last(/HashiCorp Consul Node by HTTP/consul.version,#2) and length(last(/HashiCorp Consul Node by HTTP/consul.version))>0 | INFO | Manual close: YES |
Consul: Current number of open files is too high | "Heavy file descriptor usage (i.e., near the process’s file descriptor limit) indicates a potential file descriptor exhaustion issue." | min(/HashiCorp Consul Node by HTTP/consul.process_open_fds,5m)/last(/HashiCorp Consul Node by HTTP/consul.process_max_fds)*100>{$CONSUL.OPEN.FDS.MAX.WARN} | WARNING | |
Consul: Node's health score is warning | This metric ranges from 0 to 8, where 0 indicates "totally healthy". This health score is used to scale the time between outgoing probes, and higher scores translate into longer probing intervals. For more details see section IV of the Lifeguard paper: https://arxiv.org/pdf/1707.00788.pdf | max(/HashiCorp Consul Node by HTTP/consul.memberlist.health_score,#3)>{$CONSUL.NODE.HEALTH_SCORE.MAX.WARN} | WARNING | Depends on: - Consul: Node's health score is critical |
Consul: Node's health score is critical | This metric ranges from 0 to 8, where 0 indicates "totally healthy". This health score is used to scale the time between outgoing probes, and higher scores translate into longer probing intervals. For more details see section IV of the Lifeguard paper: https://arxiv.org/pdf/1707.00788.pdf | max(/HashiCorp Consul Node by HTTP/consul.memberlist.health_score,#3)>{$CONSUL.NODE.HEALTH_SCORE.MAX.HIGH} | AVERAGE | |
Consul: Failed to get local services | Failed to get local services. Check debug log for more information. | length(last(/HashiCorp Consul Node by HTTP/consul.get_local_services.check))>0 | WARNING | |
Consul: Aggregated status is 'warning' | Aggregated state of service on the local agent is 'warning'. | last(/HashiCorp Consul Node by HTTP/consul.service.aggregated_state["{#SERVICE_ID}"]) = 1 | WARNING | |
Consul: Aggregated status is 'critical' | Aggregated state of service on the local agent is 'critical'. | last(/HashiCorp Consul Node by HTTP/consul.service.aggregated_state["{#SERVICE_ID}"]) = 2 | AVERAGE | |
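To make the threshold logic of two of these triggers concrete, here is a minimal sketch that evaluates them over plain Python lists. The function names and sample values are hypothetical; the default thresholds mirror the default macro values, and the real evaluation is done by Zabbix, not by code like this.

```python
from typing import Optional, Sequence


def open_fds_too_high(open_fds_5m: Sequence[float], max_fds: float, max_warn_pct: float = 90.0) -> bool:
    """Mimic min(open_fds, 5m) / last(max_fds) * 100 > {$CONSUL.OPEN.FDS.MAX.WARN}."""
    return min(open_fds_5m) / max_fds * 100 > max_warn_pct


def health_score_severity(scores: Sequence[float], max_warn: float = 2, max_high: float = 4) -> Optional[str]:
    """Mimic the two health score triggers over the last three values.

    The warning trigger depends on the critical one, so only the higher
    severity is reported when both expressions are true.
    """
    worst = max(scores[-3:])
    if worst > max_high:
        return "AVERAGE"
    if worst > max_warn:
        return "WARNING"
    return None


print(open_fds_too_high([950, 960, 990], max_fds=1024))
print(health_score_severity([0, 3, 1]))
```

Using the minimum over five minutes means a brief spike in open file descriptors does not fire the trigger, while taking the maximum of the last three health scores keeps a problem open until the score has stayed low for three consecutive readings.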
Feedback
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.