ZooKeeper

Apache ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.

Available solutions




Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/zookeeper_http


Zookeeper by HTTP

Overview

For Zabbix version: 5.2 and higher
This template monitors Apache Zookeeper with Zabbix and works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

This template was tested on:

  • Apache Zookeeper, version 3.6+

Setup

See Zabbix template operation for basic instructions.

This template works with standalone and cluster instances. Metrics are collected from each Zookeeper node by requests to AdminServer.
By default, AdminServer is enabled and listens on port 8080.
You can enable AdminServer or configure its parameters according to the official documentation.
Don't forget to change the macros {$ZOOKEEPER.COMMAND_URL}, {$ZOOKEEPER.PORT}, and {$ZOOKEEPER.SCHEME}.
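
As a rough illustration of how the three macros combine into a request address, the sketch below assembles an AdminServer URL from the default values listed under "Macros used". The helper name buildAdminUrl, the host zk1.example.com, and the command name monitor are assumptions made for this example; check the template itself for the exact URLs it requests.

```javascript
// Hypothetical helper, not part of the template: assembles an AdminServer
// request URL from the values of the template macros.
function buildAdminUrl(macros, host, command) {
  var scheme = macros['{$ZOOKEEPER.SCHEME}']; // http or https
  var port = macros['{$ZOOKEEPER.PORT}'];     // admin.serverPort
  // admin.commandURL, with leading and trailing slashes stripped
  var commandUrl = macros['{$ZOOKEEPER.COMMAND_URL}'].replace(/^\/+|\/+$/g, '');
  return scheme + '://' + host + ':' + port + '/' + commandUrl + '/' + command;
}

// Default macro values taken from the macros table.
var defaults = {
  '{$ZOOKEEPER.SCHEME}': 'http',
  '{$ZOOKEEPER.PORT}': '8080',
  '{$ZOOKEEPER.COMMAND_URL}': 'commands'
};

// Host and command name are assumed values for illustration.
console.log(buildAdminUrl(defaults, 'zk1.example.com', 'monitor'));
// prints "http://zk1.example.com:8080/commands/monitor"
```

If AdminServer runs behind TLS or on a non-default port, only the macro values change; the assembled address follows the same pattern.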

Zabbix configuration

No specific Zabbix configuration is required.

Macros used

Name Description Default
{$ZOOKEEPER.COMMAND_URL}

The URL for listing and issuing commands relative to the root URL (admin.commandURL).

commands
{$ZOOKEEPER.FILE_DESCRIPTORS.MAX.WARN}

Maximum percentage of file descriptors usage alert threshold (for trigger expression).

85
{$ZOOKEEPER.OUTSTANDING_REQ.MAX.WARN}

Maximum number of outstanding requests (for trigger expression).

10
{$ZOOKEEPER.PENDING_SYNCS.MAX.WARN}

Maximum number of pending syncs from the followers (for trigger expression).

10
{$ZOOKEEPER.PORT}

The port the embedded Jetty server listens on (admin.serverPort).

8080
{$ZOOKEEPER.SCHEME}

Request scheme which may be http or https

http

Template links

There are no template links in this template.

Discovery rules

Name Description Type Key and additional info
Leader metrics discovery

Additional metrics for leader node

DEPENDENT zookeeper.metrics.leader

Preprocessing:

- JSONPATH: $.server_state

- JAVASCRIPT: return JSON.stringify(value == 'leader' ? [{'{#SINGLETON}': ''}] : []);

Clients discovery

Get list of client connections.

Note, depending on the number of client connections this operation may be expensive (i.e. impact server performance).

HTTP_AGENT zookeeper.clients

Preprocessing:

- JAVASCRIPT: Text is too long. Please see the template.
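
The two preprocessing steps of "Leader metrics discovery" can be read together: the JSONPath step extracts server_state, and the JavaScript step returns a one-row discovery list only when that state is leader, so the leader-only items exist on the leader and nowhere else. A minimal sketch that mirrors those steps in plain JavaScript; the function name discoverLeader and the sample payloads are made up for illustration.

```javascript
// Mirrors the two preprocessing steps of "Leader metrics discovery".
// discoverLeader and the sample payloads below are illustrative only.
function discoverLeader(rawMetrics) {
  // Step 1: equivalent of JSONPATH $.server_state
  var value = JSON.parse(rawMetrics).server_state;
  // Step 2: the JavaScript step from the rule. One discovery row on the
  // leader, an empty list (nothing discovered) on any other node.
  return JSON.stringify(value == 'leader' ? [{'{#SINGLETON}': ''}] : []);
}

console.log(discoverLeader('{"server_state": "leader"}'));   // [{"{#SINGLETON}":""}]
console.log(discoverLeader('{"server_state": "follower"}')); // []
```

Because {#SINGLETON} expands to an empty string, the discovered item names and keys look the same as ordinary ones apart from the empty key parameter.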

Items collected

Group Name Description Type Key and additional info
Zabbix_raw_items Zookeeper: Get server metrics

-

HTTP_AGENT zookeeper.get_metrics
Zabbix_raw_items Zookeeper: Get connections stats

Get information on client connections to server. Note, depending on the number of client connections this operation may be expensive (i.e. impact server performance).

HTTP_AGENT zookeeper.get_connections_stats
Zookeeper Zookeeper: Server mode

Mode of the server. In an ensemble, this may be either leader or follower. Otherwise, it is standalone.

DEPENDENT zookeeper.server_state

Preprocessing:

- JSONPATH: $.server_state

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Zookeeper Zookeeper: Uptime

Uptime of Zookeeper server.

DEPENDENT zookeeper.uptime

Preprocessing:

- JSONPATH: $.uptime

- MULTIPLIER: 0.001

Zookeeper Zookeeper: Version

Version of Zookeeper server.

DEPENDENT zookeeper.version

Preprocessing:

- JSONPATH: $.version

- REGEX: ([^,]+)--(.+) \1

- DISCARD_UNCHANGED_HEARTBEAT: 3h

Zookeeper Zookeeper: Approximate data size

Data tree size in bytes. The size includes the znode path and its value.

DEPENDENT zookeeper.approximate_data_size

Preprocessing:

- JSONPATH: $.approximate_data_size

Zookeeper Zookeeper: File descriptors, max

Maximum number of file descriptors that a zookeeper server can open.

DEPENDENT zookeeper.max_file_descriptor_count

Preprocessing:

- JSONPATH: $.max_file_descriptor_count

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Zookeeper Zookeeper: File descriptors, open

Number of file descriptors that a zookeeper server has open.

DEPENDENT zookeeper.open_file_descriptor_count

Preprocessing:

- JSONPATH: $.open_file_descriptor_count

Zookeeper Zookeeper: Outstanding requests

The number of queued requests when the server is under load and is receiving more sustained requests than it can process.

DEPENDENT zookeeper.outstanding_requests

Preprocessing:

- JSONPATH: $.outstanding_requests

Zookeeper Zookeeper: Commit per sec

The number of commits performed per second

DEPENDENT zookeeper.commit_count.rate

Preprocessing:

- JSONPATH: $.commit_count

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Diff syncs per sec

Number of diff syncs performed per second

DEPENDENT zookeeper.diff_count.rate

Preprocessing:

- JSONPATH: $.diff_count

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Snap syncs per sec

Number of snap syncs performed per second

DEPENDENT zookeeper.snap_count.rate

Preprocessing:

- JSONPATH: $.snap_count

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Looking per sec

Rate of transitions into looking state.

DEPENDENT zookeeper.looking_count.rate

Preprocessing:

- JSONPATH: $.looking_count

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Alive connections

Number of active clients connected to a zookeeper server.

DEPENDENT zookeeper.num_alive_connections

Preprocessing:

- JSONPATH: $.num_alive_connections

Zookeeper Zookeeper: Global sessions

Number of global sessions.

DEPENDENT zookeeper.global_sessions

Preprocessing:

- JSONPATH: $.global_sessions

Zookeeper Zookeeper: Local sessions

Number of local sessions.

DEPENDENT zookeeper.local_sessions

Preprocessing:

- JSONPATH: $.local_sessions

Zookeeper Zookeeper: Drop connections per sec

Rate of connection drops.

DEPENDENT zookeeper.connection_drop_count.rate

Preprocessing:

- JSONPATH: $.connection_drop_count

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Rejected connections per sec

Rate of rejected connections.

DEPENDENT zookeeper.connection_rejected.rate

Preprocessing:

- JSONPATH: $.connection_rejected

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Revalidate connections per sec

Rate of connection revalidations.

DEPENDENT zookeeper.connection_revalidate_count.rate

Preprocessing:

- JSONPATH: $.connection_revalidate_count

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Revalidate per sec

Rate of revalidations.

DEPENDENT zookeeper.revalidate_count.rate

Preprocessing:

- JSONPATH: $.revalidate_count

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Latency, max

The maximum amount of time it takes for the server to respond to a client request.

DEPENDENT zookeeper.max_latency

Preprocessing:

- JSONPATH: $.max_latency

Zookeeper Zookeeper: Latency, min

The minimum amount of time it takes for the server to respond to a client request.

DEPENDENT zookeeper.min_latency

Preprocessing:

- JSONPATH: $.min_latency

Zookeeper Zookeeper: Latency, avg

The average amount of time it takes for the server to respond to a client request.

DEPENDENT zookeeper.avg_latency

Preprocessing:

- JSONPATH: $.avg_latency

Zookeeper Zookeeper: Znode count

The number of znodes in the ZooKeeper namespace (the data)

DEPENDENT zookeeper.znode_count

Preprocessing:

- JSONPATH: $.znode_count

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Zookeeper Zookeeper: Ephemeral nodes count

Number of ephemeral nodes that a zookeeper server has in its data tree.

DEPENDENT zookeeper.ephemerals_count

Preprocessing:

- JSONPATH: $.ephemerals_count

Zookeeper Zookeeper: Watch count

Number of watches currently set on the local ZooKeeper process.

DEPENDENT zookeeper.watch_count

Preprocessing:

- JSONPATH: $.watch_count

Zookeeper Zookeeper: Packets sent per sec

The number of zookeeper packets sent from a server per second.

DEPENDENT zookeeper.packets_sent

Preprocessing:

- JSONPATH: $.packets_sent

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Packets received per sec

The number of zookeeper packets received by a server per second.

DEPENDENT zookeeper.packets_received.rate

Preprocessing:

- JSONPATH: $.packets_received

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Bytes received per sec

Number of bytes received per second.

DEPENDENT zookeeper.bytes_received_count.rate

Preprocessing:

- JSONPATH: $.bytes_received_count

- CHANGE_PER_SECOND

Zookeeper Zookeeper: Election time, avg

Time between entering and leaving election.

DEPENDENT zookeeper.avg_election_time

Preprocessing:

- JAVASCRIPT: Text is too long. Please see the template.

Zookeeper Zookeeper: Elections

Number of elections that have taken place.

DEPENDENT zookeeper.cnt_election_time

Preprocessing:

- JAVASCRIPT: Text is too long. Please see the template.

Zookeeper Zookeeper: Fsync time, avg

Time to fsync transaction log.

DEPENDENT zookeeper.avg_fsynctime

Preprocessing:

- JAVASCRIPT: Text is too long. Please see the template.

Zookeeper Zookeeper: Fsync

Count of performed fsyncs.

DEPENDENT zookeeper.cnt_fsynctime

Preprocessing:

- JAVASCRIPT: var metrics = JSON.parse(value); return metrics.cnt_fsynctime || metrics.fsynctime_count;

Zookeeper Zookeeper: Snapshot write time, avg

Average time to write a snapshot.

DEPENDENT zookeeper.avg_snapshottime

Preprocessing:

- JAVASCRIPT: Text is too long. Please see the template.

Zookeeper Zookeeper: Snapshot writes

Count of performed snapshot writes.

DEPENDENT zookeeper.cnt_snapshottime

Preprocessing:

- JAVASCRIPT: var metrics = JSON.parse(value); return metrics.snapshottime_count || metrics.cnt_snapshottime;

Zookeeper Zookeeper: Pending syncs{#SINGLETON}

Number of pending syncs to carry out to ZooKeeper ensemble followers.

DEPENDENT zookeeper.pending_syncs[{#SINGLETON}]

Preprocessing:

- JSONPATH: $.pending_syncs

Zookeeper Zookeeper: Quorum size{#SINGLETON}

-

DEPENDENT zookeeper.quorum_size[{#SINGLETON}]

Preprocessing:

- JSONPATH: $.quorum_size

Zookeeper Zookeeper: Synced followers{#SINGLETON}

Number of synced followers reported when a node server_state is leader.

DEPENDENT zookeeper.synced_followers[{#SINGLETON}]

Preprocessing:

- JSONPATH: $.synced_followers

Zookeeper Zookeeper: Synced non-voting follower{#SINGLETON}

Number of synced non-voting followers reported when a node server_state is leader.

DEPENDENT zookeeper.synced_non_voting_followers[{#SINGLETON}]

Preprocessing:

- JSONPATH: $.synced_non_voting_followers

Zookeeper Zookeeper: Synced observers{#SINGLETON}

Number of synced observers.

DEPENDENT zookeeper.synced_observers[{#SINGLETON}]

Preprocessing:

- JSONPATH: $.synced_observers

Zookeeper Zookeeper: Learners{#SINGLETON}

Number of learners.

DEPENDENT zookeeper.learners[{#SINGLETON}]

Preprocessing:

- JSONPATH: $.learners

Zookeeper Zookeeper client {#TYPE} [{#CLIENT}]: Latency, max

The maximum amount of time it takes for the server to respond to a client request.

DEPENDENT zookeeper.max_latency[{#TYPE},{#CLIENT}]

Preprocessing:

- JSONPATH: $.{#TYPE}.[?(@.remote_socket_address == "{#ADDRESS}")].max_latency.first()

Zookeeper Zookeeper client {#TYPE} [{#CLIENT}]: Latency, min

The minimum amount of time it takes for the server to respond to a client request.

DEPENDENT zookeeper.min_latency[{#TYPE},{#CLIENT}]

Preprocessing:

- JSONPATH: $.{#TYPE}.[?(@.remote_socket_address == "{#ADDRESS}")].min_latency.first()

Zookeeper Zookeeper client {#TYPE} [{#CLIENT}]: Latency, avg

The average amount of time it takes for the server to respond to a client request.

DEPENDENT zookeeper.avg_latency[{#TYPE},{#CLIENT}]

Preprocessing:

- JSONPATH: $.{#TYPE}.[?(@.remote_socket_address == "{#ADDRESS}")].avg_latency.first()

Zookeeper Zookeeper client {#TYPE} [{#CLIENT}]: Packets sent per sec

The number of packets sent.

DEPENDENT zookeeper.packets_sent[{#TYPE},{#CLIENT}]

Preprocessing:

- JSONPATH: $.{#TYPE}.[?(@.remote_socket_address == "{#ADDRESS}")].packets_sent.first()

- CHANGE_PER_SECOND

Zookeeper Zookeeper client {#TYPE} [{#CLIENT}]: Packets received per sec

The number of packets received.

DEPENDENT zookeeper.packets_received[{#TYPE},{#CLIENT}]

Preprocessing:

- JSONPATH: $.{#TYPE}.[?(@.remote_socket_address == "{#ADDRESS}")].packets_received.first()

- CHANGE_PER_SECOND

Zookeeper Zookeeper client {#TYPE} [{#CLIENT}]: Outstanding requests

The number of queued requests when the server is under load and is receiving more sustained requests than it can process.

DEPENDENT zookeeper.outstanding_requests[{#TYPE},{#CLIENT}]

Preprocessing:

- JSONPATH: $.{#TYPE}.[?(@.remote_socket_address == "{#ADDRESS}")].outstanding_requests.first()
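
To make the preprocessing chains in the table above more concrete, here is a hedged sketch that re-implements a few of them in plain JavaScript. The function names, the sample version string, and the shape of the connections data (an object whose {#TYPE} key holds an array of connection objects, inferred only from the JSONPath expressions) are assumptions for illustration and not the template's own code.

```javascript
// Illustrative re-implementation of selected preprocessing steps.
// All function names and sample values are assumptions.

// "Uptime": JSONPATH $.uptime, then MULTIPLIER 0.001 (milliseconds to seconds).
function uptimeSeconds(metrics) {
  return metrics.uptime * 0.001;
}

// "Version": REGEX ([^,]+)--(.+) with output \1 keeps the part before "--".
function shortVersion(metrics) {
  var match = /([^,]+)--(.+)/.exec(metrics.version);
  return match ? match[1] : null;
}

// CHANGE_PER_SECOND: difference between two samples divided by the
// number of seconds between them.
function changePerSecond(prevValue, prevTime, currValue, currTime) {
  return (currValue - prevValue) / (currTime - prevTime);
}

// "Fsync": fall back to the second metric name when the first is
// missing (or zero), as the item's JavaScript step does.
function fsyncCount(metrics) {
  return metrics.cnt_fsynctime || metrics.fsynctime_count;
}

// Client items: rough equivalent of
// $.{#TYPE}.[?(@.remote_socket_address == "{#ADDRESS}")].<metric>.first()
function clientMetric(stats, type, address, metric) {
  var matches = (stats[type] || []).filter(function (conn) {
    return conn.remote_socket_address === address;
  });
  return matches.length ? matches[0][metric] : undefined;
}

// Hypothetical sample: the version string is invented for this example.
console.log(shortVersion({ version: '3.6.2--abc123, built on 01/01/2021' })); // 3.6.2
console.log(changePerSecond(100, 0, 160, 60)); // 1
```

The rate items store only the per-second change, so a counter reset after a server restart shows up as a gap or a dip rather than as a large negative value in the raw counter.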

Triggers

Name Description Expression Severity Dependencies and additional info
Zookeeper: Server mode has changed (new mode: {ITEM.VALUE})

Zookeeper node state has changed. Ack to close.

{TEMPLATE_NAME:zookeeper.server_state.diff()}=1 and {TEMPLATE_NAME:zookeeper.server_state.strlen()}>0 INFO

Manual close: YES

Zookeeper: has been restarted (uptime < 10m)

Uptime is less than 10 minutes

{TEMPLATE_NAME:zookeeper.uptime.last()}<10m INFO

Manual close: YES

Zookeeper: Failed to fetch info data (or no data for 10m)

Zabbix has not received data for items for the last 10 minutes

{TEMPLATE_NAME:zookeeper.uptime.nodata(10m)}=1 WARNING

Manual close: YES

Zookeeper: Version has changed (new version: {ITEM.VALUE})

Zookeeper version has changed. Ack to close.

{TEMPLATE_NAME:zookeeper.version.diff()}=1 and {TEMPLATE_NAME:zookeeper.version.strlen()}>0 INFO

Manual close: YES

Zookeeper: Too many file descriptors used (over {$ZOOKEEPER.FILE_DESCRIPTORS.MAX.WARN}% for 5 min)

The number of file descriptors in use is more than {$ZOOKEEPER.FILE_DESCRIPTORS.MAX.WARN}% of the available number of file descriptors.

{TEMPLATE_NAME:zookeeper.open_file_descriptor_count.min(5m)} * 100 / {Zookeeper by HTTP:zookeeper.max_file_descriptor_count.last()} > {$ZOOKEEPER.FILE_DESCRIPTORS.MAX.WARN} WARNING
Zookeeper: Too many queued requests (over {$ZOOKEEPER.OUTSTANDING_REQ.MAX.WARN}% for 5 min)

Number of queued requests in the server. This goes up when the server receives more requests than it can process.

{TEMPLATE_NAME:zookeeper.outstanding_requests.min(5m)}>{$ZOOKEEPER.OUTSTANDING_REQ.MAX.WARN} AVERAGE

Manual close: YES

Zookeeper: Too many pending syncs (over {$ZOOKEEPER.PENDING_SYNCS.MAX.WARN}% for 5 min)

-

{TEMPLATE_NAME:zookeeper.pending_syncs[{#SINGLETON}].min(5m)}>{$ZOOKEEPER.PENDING_SYNCS.MAX.WARN} AVERAGE

Manual close: YES

Zookeeper: Too few active followers

The number of followers should equal the total size of your ZooKeeper ensemble, minus 1 (the leader is not included in the follower count). If the ensemble fails to maintain quorum, all automatic failover features are suspended.

{TEMPLATE_NAME:zookeeper.synced_followers[{#SINGLETON}].last()} < {Zookeeper by HTTP:zookeeper.quorum_size[{#SINGLETON}].last()}-1 AVERAGE
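
Two of the trigger conditions above involve a little arithmetic, so a plain-JavaScript reading of them may help. This is only a sketch: the function names and sample numbers are invented, and Zabbix evaluates the real expressions itself.

```javascript
// Plain-JavaScript reading of two trigger conditions from the table.
// Function names and sample numbers are illustrative assumptions.

// "Too many file descriptors used": the minimum open-descriptor count over
// the last 5 minutes, as a percentage of the maximum, compared with
// {$ZOOKEEPER.FILE_DESCRIPTORS.MAX.WARN} (default 85).
function tooManyFileDescriptors(openSamples5m, maxDescriptors, warnPercent) {
  var minOpen = Math.min.apply(null, openSamples5m);
  return minOpen * 100 / maxDescriptors > warnPercent;
}

// "Too few active followers": synced followers below quorum size minus 1.
function tooFewFollowers(syncedFollowers, quorumSize) {
  return syncedFollowers < quorumSize - 1;
}

console.log(tooManyFileDescriptors([900, 910, 880], 1000, 85)); // true  (88% > 85%)
console.log(tooManyFileDescriptors([900, 840, 910], 1000, 85)); // false (84% <= 85%)
console.log(tooFewFollowers(1, 3)); // true  (1 < 2)
console.log(tooFewFollowers(2, 3)); // false (2 is not < 2)
```

Using the minimum over the window means the usage has to stay above the threshold for the whole 5 minutes before the trigger fires, which filters out short spikes.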

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help with it on the ZABBIX forums.
