HashiCorp Vault

HashiCorp Vault is a secrets management tool designed to control access to sensitive credentials in a low-trust environment. It can store sensitive values and, at the same time, dynamically generate leased access for specific services and applications.

This template is for Zabbix version: 6.4
Also available for: 6.2 6.0 5.4

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/vault_http?at=release/6.4

HashiCorp Vault by HTTP

Overview

For Zabbix version: 6.2 and higher
This template monitors HashiCorp Vault with Zabbix and works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

The Vault by HTTP template collects metrics with the HTTP agent from the /sys/metrics API endpoint. See https://www.vaultproject.io/api-docs/system/metrics.
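
As a rough illustration of the bulk-collection idea (not the template's actual implementation), the sketch below parses one Prometheus-format payload and derives several values from it, much like dependent items reading a single master item. The sample payload, its values, and the helper name `extract_metric` are assumptions made for this example.

```python
from typing import Optional

# Hypothetical sample of Prometheus text output (values are made up).
SAMPLE_PAYLOAD = """\
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 42
process_max_fds 1024
vault_runtime_num_goroutines 87
"""


def extract_metric(payload: str, name: str) -> Optional[float]:
    """Return the value of the first unlabelled sample called `name`, or None."""
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None  # comparable to discarding the value when the metric is absent


# One payload, several derived values.
open_fds = extract_metric(SAMPLE_PAYLOAD, "process_open_fds")
max_fds = extract_metric(SAMPLE_PAYLOAD, "process_max_fds")
missing = extract_metric(SAMPLE_PAYLOAD, "vault_wal_gc_total")
```

Here `missing` is `None` because the sample payload has no WAL metrics, which is the situation the template handles by discarding the value.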

This template was tested on:

  • Vault, version 1.6

Setup

See Zabbix template operation for basic instructions.

Configure the Vault API (see Vault Configuration). Create a Vault service token and assign it to the macro {$VAULT.TOKEN}.
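
To make the role of the token concrete, here is a minimal sketch of how a client could prepare the metrics request. It assumes the standard Vault HTTP API path /v1/sys/metrics with format=prometheus and the X-Vault-Token header; the template's exact URL is not shown on this page, and the function names, host name, and token value are placeholders.

```python
import urllib.request


def metrics_url(scheme: str, host: str, port: int) -> str:
    """Build the metrics URL from scheme, host and port (compare the
    {$VAULT.API.SCHEME}, {$VAULT.HOST} and {$VAULT.API.PORT} macros)."""
    return f"{scheme}://{host}:{port}/v1/sys/metrics?format=prometheus"


def metrics_request(scheme: str, host: str, port: int, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request carrying the Vault token."""
    return urllib.request.Request(
        metrics_url(scheme, host, port),
        headers={"X-Vault-Token": token},
        method="GET",
    )


# Template defaults for scheme and port, placeholder host and token.
req = metrics_request("http", "vault.example.com", 8200, "s.placeholder")
# urllib.request.urlopen(req) would send the request to a real server.
```

The request is only constructed here, so the sketch can be run without a Vault server.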

Zabbix configuration

No specific Zabbix configuration is required.

Macros used

Name | Description | Default
{$VAULT.API.PORT} | Vault port. | 8200
{$VAULT.API.SCHEME} | Vault API scheme. | http
{$VAULT.HOST} | Vault host name. | <PUT YOUR VAULT HOST>
{$VAULT.LEADERSHIP.LOSSES.MAX.WARN} | Maximum number of Vault leadership losses. | 5
{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} | Maximum number of Vault leadership setup failures. | 5
{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} | Maximum number of Vault leadership step downs. | 5
{$VAULT.LLD.FILTER.STORAGE.MATCHES} | Filter of discoverable storage backends. | .+
{$VAULT.OPEN.FDS.MAX.WARN} | Maximum percentage of used file descriptors for trigger expression. | 90
{$VAULT.TOKEN.ACCESSORS} | Vault accessors separated by spaces for monitoring token expiration time. | (empty)
{$VAULT.TOKEN.TTL.MIN.CRIT} | Token TTL critical threshold. | 3d
{$VAULT.TOKEN.TTL.MIN.WARN} | Token TTL warning threshold. | 7d
{$VAULT.TOKEN} | Vault auth token. | <PUT YOUR AUTH TOKEN>

Template links

There are no template links in this template.

Discovery rules

Name | Description | Type | Key and additional info
Mountpoint metrics discovery | Mountpoint metrics discovery. | DEPENDENT | vault.mountpoint.discovery
Replication metrics discovery | Discovery for replication metrics. | DEPENDENT | vault.replication.discovery
Storage metrics discovery | Storage backend metrics discovery. | DEPENDENT | vault.storage.discovery; Filter: AND - {#STORAGE} MATCHES_REGEX {$VAULT.LLD.FILTER.STORAGE.MATCHES}
Token metrics discovery | Tokens metrics discovery. | DEPENDENT | vault.tokens.discovery
WAL metrics discovery | Discovery for WAL metrics. | DEPENDENT | vault.wal.discovery
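
The storage discovery filter keeps only the backends whose {#STORAGE} value matches the regular expression in {$VAULT.LLD.FILTER.STORAGE.MATCHES}. The sketch below imitates that behaviour; the function name and backend names are hypothetical, and partial-match (search) semantics are assumed.

```python
import re


def filter_storage_backends(backends, pattern=".+"):
    """Keep the backend names that match the regex (search semantics assumed)."""
    regex = re.compile(pattern)
    return [name for name in backends if regex.search(name)]


discovered = ["raft", "consul", "file"]  # hypothetical {#STORAGE} values

all_backends = filter_storage_backends(discovered)         # default ".+" keeps everything
only_raft = filter_storage_backends(discovered, "^raft$")  # narrowed filter keeps one backend
```

With the default `.+` every discovered backend passes, so the macro only needs changing when some backends should be excluded.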

Items collected

Group Name Description Type Key and additional info
Vault Vault: Initialized

Initialization status.

DEPENDENT vault.health.initialized

Preprocessing:

- JSONPATH: $.initialized

⛔️ON_FAIL: DISCARD_VALUE ->

- BOOL_TO_DECIMAL

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Sealed

Seal status.

DEPENDENT vault.health.sealed

Preprocessing:

- JSONPATH: $.sealed

⛔️ON_FAIL: DISCARD_VALUE ->

- BOOL_TO_DECIMAL

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Standby

Standby status.

DEPENDENT vault.health.standby

Preprocessing:

- JSONPATH: $.standby

⛔️ON_FAIL: DISCARD_VALUE ->

- BOOL_TO_DECIMAL

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Performance standby

Performance standby status.

DEPENDENT vault.health.performance_standby

Preprocessing:

- JSONPATH: $.performance_standby

⛔️ON_FAIL: DISCARD_VALUE ->

- BOOL_TO_DECIMAL

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Performance replication

Performance replication mode

https://www.vaultproject.io/docs/enterprise/replication

DEPENDENT vault.health.replication_performance_mode

Preprocessing:

- JSONPATH: $.replication_performance_mode

⛔️ON_FAIL: DISCARD_VALUE ->

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Disaster Recovery replication

Disaster recovery replication mode

https://www.vaultproject.io/docs/enterprise/replication

DEPENDENT vault.health.replication_dr_mode

Preprocessing:

- JSONPATH: $.replication_dr_mode

⛔️ON_FAIL: DISCARD_VALUE ->

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Version

Server version.

DEPENDENT vault.health.version

Preprocessing:

- JSONPATH: $.version

⛔️ON_FAIL: DISCARD_VALUE ->

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Healthcheck

Vault healthcheck.

DEPENDENT vault.health.check

Preprocessing:

- JSONPATH: $.healthcheck

⛔️ON_FAIL: CUSTOM_VALUE -> 1

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: HA enabled

HA enabled status.

DEPENDENT vault.leader.ha_enabled

Preprocessing:

- JSONPATH: $.ha_enabled

- BOOL_TO_DECIMAL

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Is leader

Leader status.

DEPENDENT vault.leader.is_self

Preprocessing:

- JSONPATH: $.is_self

- BOOL_TO_DECIMAL

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Get metrics error

Get metrics error.

DEPENDENT vault.get_metrics.error

Preprocessing:

- JSONPATH: $.errors[0]

⛔️ON_FAIL: CUSTOM_VALUE ->

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Process CPU seconds, total

Total user and system CPU time spent in seconds.

DEPENDENT vault.metrics.process.cpu.seconds.total

Preprocessing:

- PROMETHEUS_PATTERN: process_cpu_seconds_total

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Open file descriptors, max

Maximum number of open file descriptors.

DEPENDENT vault.metrics.process.max.fds

Preprocessing:

- PROMETHEUS_PATTERN: process_max_fds

⛔️ON_FAIL: DISCARD_VALUE ->

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Open file descriptors, current

Number of open file descriptors.

DEPENDENT vault.metrics.process.open.fds

Preprocessing:

- PROMETHEUS_PATTERN: process_open_fds

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Process resident memory

Resident memory size in bytes.

DEPENDENT vault.metrics.process.resident_memory.bytes

Preprocessing:

- PROMETHEUS_PATTERN: process_resident_memory_bytes

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Uptime

Server uptime.

DEPENDENT vault.metrics.process.uptime

Preprocessing:

- PROMETHEUS_PATTERN: process_start_time_seconds

⛔️ON_FAIL: DISCARD_VALUE ->

- JAVASCRIPT: return Math.floor(Date.now()/1000 - Number(value));

Vault Vault: Process virtual memory, current

Virtual memory size in bytes.

DEPENDENT vault.metrics.process.virtual_memory.bytes

Preprocessing:

- PROMETHEUS_PATTERN: process_virtual_memory_bytes

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Process virtual memory, max

Maximum amount of virtual memory available in bytes.

DEPENDENT vault.metrics.process.virtual_memory.max.bytes

Preprocessing:

- PROMETHEUS_PATTERN: process_virtual_memory_max_bytes

⛔️ON_FAIL: DISCARD_VALUE ->

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Audit log requests, rate

Number of all audit log requests across all audit log devices.

DEPENDENT vault.metrics.audit.log.request.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_audit_log_request_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Audit log request failures, rate

Number of audit log request failures.

DEPENDENT vault.metrics.audit.log.request.failure.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_audit_log_request_failure

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Audit log response, rate

Number of audit log responses across all audit log devices.

DEPENDENT vault.metrics.audit.log.response.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_audit_log_response_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Audit log response failures, rate

Number of audit log response failures.

DEPENDENT vault.metrics.audit.log.response.failure.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_audit_log_response_failure

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Barrier DELETE ops, rate

Number of DELETE operations at the barrier.

DEPENDENT vault.metrics.barrier.delete.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_barrier_delete_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Barrier GET ops, rate

Number of GET operations at the barrier.

DEPENDENT vault.metrics.vault.barrier.get.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_barrier_get_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Barrier LIST ops, rate

Number of LIST operations at the barrier.

DEPENDENT vault.metrics.barrier.list.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_barrier_list_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Barrier PUT ops, rate

Number of PUT operations at the barrier.

DEPENDENT vault.metrics.barrier.put.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_barrier_put_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Cache hit, rate

Number of times a value was retrieved from the LRU cache.

DEPENDENT vault.metrics.cache.hit.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_cache_hit

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Cache miss, rate

Number of times a value was not in the LRU cache. This results in a read from the configured storage.

DEPENDENT vault.metrics.cache.miss.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_cache_miss

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Cache write, rate

Number of times a value was written to the LRU cache.

DEPENDENT vault.metrics.cache.write.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_cache_write

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Check token, rate

Number of token checks handled by Vault core.

DEPENDENT vault.metrics.core.check.token.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_core_check_token_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Fetch ACL and token, rate

Number of ACL and corresponding token entry fetches handled by Vault core.

DEPENDENT vault.metrics.core.fetch.acl_and_token

Preprocessing:

- PROMETHEUS_PATTERN: vault_core_fetch_acl_and_token_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Requests, rate

Number of requests handled by Vault core.

DEPENDENT vault.metrics.core.handle.request

Preprocessing:

- PROMETHEUS_PATTERN: vault_core_handle_request_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Leadership setup failed, counter

Cluster leadership setup failures which have occurred in a highly available Vault cluster.

DEPENDENT vault.metrics.core.leadership.setup_failed

Preprocessing:

- PROMETHEUS_TO_JSON: vault_core_leadership_setup_failed

- JSONPATH: $[?(@.name=="vault_core_leadership_setup_failed")].value.sum()

⛔️ON_FAIL: CUSTOM_VALUE -> 0

Vault Vault: Leadership setup lost, counter

Cluster leadership losses which have occurred in a highly available Vault cluster.

DEPENDENT vault.metrics.core.leadership_lost

Preprocessing:

- PROMETHEUS_TO_JSON: vault_core_leadership_lost_count

- JSONPATH: $[?(@.name=="vault_core_leadership_lost_count")].value.sum()

⛔️ON_FAIL: CUSTOM_VALUE -> 0

Vault Vault: Post-unseal ops, counter

Duration of time taken by post-unseal operations handled by Vault core.

DEPENDENT vault.metrics.core.post_unseal

Preprocessing:

- PROMETHEUS_PATTERN: vault_core_post_unseal_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Pre-seal ops, counter

Duration of time taken by pre-seal operations.

DEPENDENT vault.metrics.core.pre_seal

Preprocessing:

- PROMETHEUS_PATTERN: vault_core_pre_seal_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Requested seal ops, counter

Duration of time taken by requested seal operations.

DEPENDENT vault.metrics.core.seal_with_request

Preprocessing:

- PROMETHEUS_PATTERN: vault_core_seal_with_request_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Seal ops, counter

Duration of time taken by seal operations.

DEPENDENT vault.metrics.core.seal

Preprocessing:

- PROMETHEUS_PATTERN: vault_core_seal_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Internal seal ops, counter

Duration of time taken by internal seal operations.

DEPENDENT vault.metrics.core.seal_internal

Preprocessing:

- PROMETHEUS_PATTERN: vault_core_seal_internal_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Leadership step downs, counter

Cluster leadership step down.

DEPENDENT vault.metrics.core.step_down

Preprocessing:

- PROMETHEUS_TO_JSON: vault_core_step_down_count

- JSONPATH: $[?(@.name=="vault_core_step_down_count")].value.sum()

⛔️ON_FAIL: CUSTOM_VALUE -> 0

Vault Vault: Unseal ops, counter

Duration of time taken by unseal operations.

DEPENDENT vault.metrics.core.unseal

Preprocessing:

- PROMETHEUS_PATTERN: vault_core_unseal_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Fetch lease times, counter

Time taken to fetch lease times.

DEPENDENT vault.metrics.expire.fetch.lease.times

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_fetch_lease_times_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Fetch lease times by token, counter

Time taken to fetch lease times by token.

DEPENDENT vault.metrics.expire.fetch.lease.times.by_token

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_fetch_lease_times_by_token_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Number of expiring leases

Number of all leases which are eligible for eventual expiry.

DEPENDENT vault.metrics.expire.num_leases

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_num_leases

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Expire revoke, count

Time taken to revoke a token.

DEPENDENT vault.metrics.expire.revoke

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_revoke_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Expire revoke force, count

Time taken to forcibly revoke a token.

DEPENDENT vault.metrics.expire.revoke.force

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_revoke_force_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Expire revoke prefix, count

Time taken to revoke tokens on a prefix.

DEPENDENT vault.metrics.expire.revoke.prefix

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_revoke_prefix_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Revoke secrets by token, count

Time taken to revoke all secrets issued with a given token.

DEPENDENT vault.metrics.expire.revoke.by_token

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_revoke_by_token_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Expire renew, count

Time taken to renew a lease.

DEPENDENT vault.metrics.expire.renew

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_renew_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Renew token, count

Time taken to renew a token which does not need to invoke a logical backend.

DEPENDENT vault.metrics.expire.renew_token

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_renew_token_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Register ops, count

Time taken for register operations.

DEPENDENT vault.metrics.expire.register

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_register_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Register auth ops, count

Time taken for register authentication operations which create lease entries without lease ID.

DEPENDENT vault.metrics.expire.register.auth

Preprocessing:

- PROMETHEUS_PATTERN: vault_expire_register_auth_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Policy GET ops, rate

Number of operations to get a policy.

DEPENDENT vault.metrics.policy.get_policy.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_policy_get_policy_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Policy LIST ops, rate

Number of operations to list policies.

DEPENDENT vault.metrics.policy.list_policies.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_policy_list_policies_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Policy DELETE ops, rate

Number of operations to delete a policy.

DEPENDENT vault.metrics.policy.delete_policy.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_policy_delete_policy_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Policy SET ops, rate

Number of operations to set a policy.

DEPENDENT vault.metrics.policy.set_policy.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_policy_set_policy_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Token create, count

The time taken to create a token.

DEPENDENT vault.metrics.token.create

Preprocessing:

- PROMETHEUS_PATTERN: vault_token_create_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Token createAccessor, count

The time taken to create a token accessor.

DEPENDENT vault.metrics.token.createAccessor

Preprocessing:

- PROMETHEUS_PATTERN: vault_token_createAccessor_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Token lookup, rate

Number of token lookups.

DEPENDENT vault.metrics.token.lookup.rate

Preprocessing:

- PROMETHEUS_PATTERN: vault_token_lookup_count

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Token revoke, count

The time taken to revoke a token.

DEPENDENT vault.metrics.token.revoke

Preprocessing:

- PROMETHEUS_PATTERN: vault_token_revoke_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Token revoke tree, count

Time taken to revoke a token tree.

DEPENDENT vault.metrics.token.revoke.tree

Preprocessing:

- PROMETHEUS_PATTERN: vault_token_revoke_tree_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Token store, count

Time taken to store an updated token entry without writing to the secondary index.

DEPENDENT vault.metrics.token.store

Preprocessing:

- PROMETHEUS_PATTERN: vault_token_store_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Runtime allocated bytes

Number of bytes allocated by the Vault process. This could burst from time to time, but should return to a steady state value.

DEPENDENT vault.metrics.runtime.alloc.bytes

Preprocessing:

- PROMETHEUS_PATTERN: vault_runtime_alloc_bytes

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Runtime freed objects

Number of freed objects.

DEPENDENT vault.metrics.runtime.free.count

Preprocessing:

- PROMETHEUS_PATTERN: vault_runtime_free_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Runtime heap objects

Number of objects on the heap. This is a good general memory pressure indicator worth establishing a baseline and thresholds for alerting.

DEPENDENT vault.metrics.runtime.heap.objects

Preprocessing:

- PROMETHEUS_PATTERN: vault_runtime_heap_objects

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Runtime malloc count

Cumulative count of allocated heap objects.

DEPENDENT vault.metrics.runtime.malloc.count

Preprocessing:

- PROMETHEUS_PATTERN: vault_runtime_malloc_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Runtime num goroutines

Number of goroutines. This serves as a general system load indicator worth establishing a baseline and thresholds for alerting.

DEPENDENT vault.metrics.runtime.num_goroutines

Preprocessing:

- PROMETHEUS_PATTERN: vault_runtime_num_goroutines

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Runtime sys bytes

Number of bytes allocated to Vault. This includes what is being used by Vault's heap and what has been reclaimed but not given back to the operating system.

DEPENDENT vault.metrics.runtime.sys.bytes

Preprocessing:

- PROMETHEUS_PATTERN: vault_runtime_sys_bytes

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Runtime GC pause, total

The total garbage collector pause time since Vault was last started.

DEPENDENT vault.metrics.total.gc.pause

Preprocessing:

- PROMETHEUS_PATTERN: vault_runtime_total_gc_pause_ns

⛔️ON_FAIL: DISCARD_VALUE ->

- MULTIPLIER: 1.0E-9

Vault Vault: Runtime GC runs, total

Total number of garbage collection runs since Vault was last started.

DEPENDENT vault.metrics.runtime.total.gc.runs

Preprocessing:

- PROMETHEUS_PATTERN: vault_runtime_total_gc_runs

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Token count, total

Total number of service tokens available for use; counts all un-expired and un-revoked tokens in Vault's token store. This measurement is performed every 10 minutes.

DEPENDENT vault.metrics.token

Preprocessing:

- PROMETHEUS_TO_JSON: vault_token_count

- JSONPATH: $[?(@.name=="vault_token_count")].value.sum()

⛔️ON_FAIL: CUSTOM_VALUE -> 0

Vault Vault: Token count by auth, total

Total number of service tokens that were created by an auth method.

DEPENDENT vault.metrics.token.by_auth

Preprocessing:

- PROMETHEUS_TO_JSON: vault_token_count_by_auth

- JSONPATH: $[?(@.name=="vault_token_count_by_auth")].value.sum()

⛔️ON_FAIL: CUSTOM_VALUE -> 0

Vault Vault: Token count by policy, total

Total number of service tokens that have a policy attached.

DEPENDENT vault.metrics.token.by_policy

Preprocessing:

- PROMETHEUS_TO_JSON: vault_token_count_by_policy

- JSONPATH: $[?(@.name=="vault_token_count_by_policy")].value.sum()

⛔️ON_FAIL: CUSTOM_VALUE -> 0

Vault Vault: Token count by ttl, total

Number of service tokens, grouped by the TTL range they were assigned at creation.

DEPENDENT vault.metrics.token.by_ttl

Preprocessing:

- PROMETHEUS_TO_JSON: vault_token_count_by_ttl

- JSONPATH: $[?(@.name=="vault_token_count_by_ttl")].value.sum()

⛔️ON_FAIL: CUSTOM_VALUE -> 0

Vault Vault: Token creation, rate

Number of service or batch tokens created.

DEPENDENT vault.metrics.token.creation.rate

Preprocessing:

- PROMETHEUS_TO_JSON: vault_token_creation

- JSONPATH: $[?(@.name=="vault_token_creation")].value.sum()

⛔️ON_FAIL: CUSTOM_VALUE -> 0

- CHANGE_PER_SECOND

Vault Vault: Secret kv entries

Number of entries in each key-value secret engine.

DEPENDENT vault.metrics.secret.kv.count

Preprocessing:

- PROMETHEUS_TO_JSON: vault_secret_kv_count

- JSONPATH: $[?(@.name=="vault_secret_kv_count")].value.sum()

⛔️ON_FAIL: CUSTOM_VALUE -> 0

Vault Vault: Token secret lease creation, rate

Counts the number of leases created by secret engines.

DEPENDENT vault.metrics.secret.lease.creation.rate

Preprocessing:

- PROMETHEUS_TO_JSON: vault_secret_lease_creation

- JSONPATH: $[?(@.name=="vault_secret_lease_creation")].value.sum()

⛔️ON_FAIL: CUSTOM_VALUE -> 0

- CHANGE_PER_SECOND

Vault Vault: Storage [{#STORAGE}] {#OPERATION} ops, rate

Number of {#OPERATION} operations against the {#STORAGE} storage backend.

DEPENDENT vault.metrics.storage.rate[{#STORAGE}, {#OPERATION}]

Preprocessing:

- PROMETHEUS_PATTERN: {#PATTERN_C}

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Rollback attempt [{#MOUNTPOINT}] ops, rate

Number of operations to perform a rollback operation on the given mount point.

DEPENDENT vault.metrics.rollback.attempt.rate[{#MOUNTPOINT}]

Preprocessing:

- PROMETHEUS_PATTERN: {#PATTERN_C}

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Route rollback [{#MOUNTPOINT}] ops, rate

Number of operations to dispatch a rollback operation to a backend, and for that backend to process it. Rollback operations are automatically scheduled to clean up partial errors.

DEPENDENT vault.metrics.route.rollback.rate[{#MOUNTPOINT}]

Preprocessing:

- PROMETHEUS_PATTERN: {#PATTERN_C}

⛔️ON_FAIL: DISCARD_VALUE ->

- CHANGE_PER_SECOND

Vault Vault: Delete WALs, count{#SINGLETON}

Time taken to delete a Write Ahead Log (WAL).

DEPENDENT vault.metrics.wal.deletewals[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: vault_wal_deletewals_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: GC deleted WAL{#SINGLETON}

Number of Write Ahead Logs (WAL) deleted during each garbage collection run.

DEPENDENT vault.metrics.wal.gc.deleted[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: vault_wal_gc_deleted

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: WALs on disk, total{#SINGLETON}

Total number of Write Ahead Logs (WAL) on disk.

DEPENDENT vault.metrics.wal.gc.total[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: vault_wal_gc_total

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Load WALs, count{#SINGLETON}

Time taken to load a Write Ahead Log (WAL).

DEPENDENT vault.metrics.wal.loadWAL[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: vault_wal_loadWAL_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Persist WALs, count{#SINGLETON}

Time taken to persist a Write Ahead Log (WAL).

DEPENDENT vault.metrics.wal.persistwals[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: vault_wal_persistwals_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Flush ready WAL, count{#SINGLETON}

Time taken to flush a ready Write Ahead Log (WAL) to storage.

DEPENDENT vault.metrics.wal.flushready[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: vault_wal_flushready_count

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Stream WAL missing guard, count{#SINGLETON}

Number of occurrences where the starting Merkle Tree index used to begin streaming WAL entries is not matched/found.

DEPENDENT vault.metrics.logshipper.streamWALs.missing_guard[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: logshipper_streamWALs_missing_guard

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Stream WAL guard found, count{#SINGLETON}

Number of occurrences where the starting Merkle Tree index used to begin streaming WAL entries is matched/found.

DEPENDENT vault.metrics.logshipper.streamWALs.guard_found[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: logshipper_streamWALs_guard_found

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Merkle commit index{#SINGLETON}

The last committed index in the Merkle Tree.

DEPENDENT vault.metrics.replication.merkle.commit_index[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: replication_merkle_commit_index

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Last WAL{#SINGLETON}

The index of the last WAL.

DEPENDENT vault.metrics.replication.wal.last_wal[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: replication_wal_last_wal

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Last DR WAL{#SINGLETON}

The index of the last DR WAL.

DEPENDENT vault.metrics.replication.wal.last_dr_wal[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: replication_wal_last_dr_wal

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Last performance WAL{#SINGLETON}

The index of the last Performance WAL.

DEPENDENT vault.metrics.replication.wal.last_performance_wal[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: replication_wal_last_performance_wal

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Last remote WAL{#SINGLETON}

The index of the last remote WAL.

DEPENDENT vault.metrics.replication.fsm.last_remote_wal[{#SINGLETON}]

Preprocessing:

- PROMETHEUS_PATTERN: replication_fsm_last_remote_wal

⛔️ON_FAIL: DISCARD_VALUE ->

Vault Vault: Token [{#TOKEN_NAME}] error

Token lookup error text.

DEPENDENT vault.token_via_accessor.error["{#ACCESSOR}"]

Preprocessing:

- JSONPATH: $.[?(@.accessor == "{#ACCESSOR}")].error.first()

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Token [{#TOKEN_NAME}] has TTL

Whether the token has a TTL.

DEPENDENT vault.token_via_accessor.has_ttl["{#ACCESSOR}"]

Preprocessing:

- JSONPATH: $.[?(@.accessor == "{#ACCESSOR}")].has_ttl.first()

- BOOL_TO_DECIMAL

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Vault Vault: Token [{#TOKEN_NAME}] TTL

The TTL period of the token.

DEPENDENT vault.token_via_accessor.ttl["{#ACCESSOR}"]

Preprocessing:

- JSONPATH: $.[?(@.accessor == "{#ACCESSOR}")].ttl.first()

Zabbix raw items Vault: Get health

-

HTTP_AGENT vault.get_health

Preprocessing:

- CHECK_NOT_SUPPORTED

⛔️ON_FAIL: CUSTOM_VALUE -> {"healthcheck": 0}

Zabbix raw items Vault: Get leader

-

HTTP_AGENT vault.get_leader

Preprocessing:

- CHECK_NOT_SUPPORTED

Zabbix raw items Vault: Get metrics

-

HTTP_AGENT vault.get_metrics

Preprocessing:

- CHECK_NOT_SUPPORTED

Zabbix raw items Vault: Clear metrics

-

DEPENDENT vault.clear_metrics

Preprocessing:

- CHECK_JSON_ERROR: $.errors

⛔️ON_FAIL: DISCARD_VALUE ->

Zabbix raw items Vault: Get tokens

Get information about tokens via their accessors. Accessors are defined in the macro "{$VAULT.TOKEN.ACCESSORS}".

SCRIPT vault.get_tokens

Expression:

The text is too long. Please see the template.
Zabbix raw items Vault: Check WAL discovery

-

DEPENDENT vault.check_wal_discovery

Preprocessing:

- PROMETHEUS_TO_JSON: {__name__=~"^vault_wal_(?:.+)$"}

⛔️ON_FAIL: DISCARD_VALUE ->

- JAVASCRIPT: return JSON.stringify(value !== "[]" ? [{'{#SINGLETON}': ''}] : []);

- DISCARD_UNCHANGED_HEARTBEAT: 15m

Zabbix raw items Vault: Check replication discovery

-

DEPENDENT vault.check_replication_discovery

Preprocessing:

- PROMETHEUS_TO_JSON: {__name__=~"^replication_(?:.+)$"}

⛔️ON_FAIL: DISCARD_VALUE ->

- JAVASCRIPT: return JSON.stringify(value !== "[]" ? [{'{#SINGLETON}': ''}] : []);

- DISCARD_UNCHANGED_HEARTBEAT: 15m

Zabbix raw items Vault: Check storage discovery

-

DEPENDENT vault.check_storage_discovery

Preprocessing:

- PROMETHEUS_TO_JSON: {__name__=~"^vault_(?:.+)_(?:get|put|list|delete)_count$"}

⛔️ON_FAIL: DISCARD_VALUE ->

- JAVASCRIPT: The text is too long. Please see the template.

- DISCARD_UNCHANGED_HEARTBEAT: 15m

Zabbix raw items Vault: Check mountpoint discovery

-

DEPENDENT vault.check_mountpoint_discovery

Preprocessing:

- PROMETHEUS_TO_JSON: {__name__=~"^vault_rollback_attempt_(?:.+?)_count$"}

⛔️ON_FAIL: DISCARD_VALUE ->

- JAVASCRIPT: The text is too long. Please see the template.

- DISCARD_UNCHANGED_HEARTBEAT: 15m
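
The WAL and replication discovery checks produce a single {#SINGLETON} row when at least one matching metric is present, and an empty list otherwise. The following is a minimal Python sketch of that logic, not the template's JavaScript; the function name and sample payloads are assumptions.

```python
import json
import re


def singleton_discovery(payload: str, name_regex: str) -> str:
    """Return LLD JSON with one {#SINGLETON} row if any metric name matches."""
    pattern = re.compile(name_regex)
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first "{" (labels) or whitespace.
        name = re.split(r"[{\s]", line, maxsplit=1)[0]
        if pattern.search(name):
            return json.dumps([{"{#SINGLETON}": ""}])
    return json.dumps([])


# Hypothetical payloads: one with a WAL metric, one without.
with_wal = "vault_wal_gc_total 3\nvault_core_unseal_count 1\n"
without_wal = "vault_core_unseal_count 1\n"

found = singleton_discovery(with_wal, r"^vault_wal_(?:.+)$")
empty = singleton_discovery(without_wal, r"^vault_wal_(?:.+)$")
```

Because the macro value is an empty string, item names such as "Vault: Last WAL{#SINGLETON}" render without a suffix, and the items are simply not created when the list is empty.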

Triggers

Name Description Expression Severity Dependencies and additional info
Vault: Vault server is sealed

https://www.vaultproject.io/docs/concepts/seal

last(/HashiCorp Vault by HTTP/vault.health.sealed)=1 AVERAGE
Vault: Version has changed

Vault version has changed. Ack to close.

last(/HashiCorp Vault by HTTP/vault.health.version,#1)<>last(/HashiCorp Vault by HTTP/vault.health.version,#2) and length(last(/HashiCorp Vault by HTTP/vault.health.version))>0 INFO

Manual close: YES

Vault: Vault server is not responding

-

last(/HashiCorp Vault by HTTP/vault.health.check)=0 HIGH
Vault: Failed to get metrics

-

length(last(/HashiCorp Vault by HTTP/vault.get_metrics.error))>0 WARNING

Depends on:

- Vault: Vault server is sealed

Vault: Current number of open files is too high

-

min(/HashiCorp Vault by HTTP/vault.metrics.process.open.fds,5m)/last(/HashiCorp Vault by HTTP/vault.metrics.process.max.fds)*100>{$VAULT.OPEN.FDS.MAX.WARN} WARNING
Vault: has been restarted

Uptime is less than 10 minutes.

last(/HashiCorp Vault by HTTP/vault.metrics.process.uptime)<10m INFO

Manual close: YES

Vault: High frequency of leadership setup failures

There have been more than {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} Vault leadership setup failures in the past 1h.

(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h))>{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} AVERAGE
Vault: High frequency of leadership losses

There have been more than {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} Vault leadership losses in the past 1h.

(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h))>{$VAULT.LEADERSHIP.LOSSES.MAX.WARN} AVERAGE
Vault: High frequency of leadership step downs

There have been more than {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} Vault leadership step downs in the past 1h.

(max(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h))>{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} AVERAGE
Vault: Token [{#TOKEN_NAME}] lookup error occurred

-

length(last(/HashiCorp Vault by HTTP/vault.token_via_accessor.error["{#ACCESSOR}"]))>0 WARNING

Depends on:

- Vault: Vault server is sealed

Vault: Token [{#TOKEN_NAME}] will expire soon

-

last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.CRIT} AVERAGE
Vault: Token [{#TOKEN_NAME}] will expire soon

-

last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.WARN} WARNING

Depends on:

- Vault: Token [{#TOKEN_NAME}] will expire soon
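
The arithmetic behind several of the trigger expressions above can be sketched as follows. This is an illustrative Python rendering under stated assumptions, not Zabbix syntax: the function names and sample values are hypothetical, and the time suffixes are interpreted as s, m, h, d and w.

```python
def counter_increase(samples):
    """max - min over a window, as in the leadership triggers."""
    return max(samples) - min(samples)


def fds_used_percent(open_fds_window, max_fds):
    """Lowest open-FD reading in the window as a percentage of the limit."""
    return min(open_fds_window) / max_fds * 100


def ttl_to_seconds(ttl: str) -> int:
    """Convert suffixed values such as '3d' or '10m' to seconds."""
    units = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
    if ttl[-1] in units:
        return int(ttl[:-1]) * units[ttl[-1]]
    return int(ttl)


# Hypothetical one-hour window of a leadership-loss counter.
losses = [10, 10, 12, 17]
leadership_alert = counter_increase(losses) > 5            # increase of 7 exceeds 5

# Hypothetical five-minute window of open file descriptors.
fds_alert = fds_used_percent([950, 960, 940], 1024) > 90   # about 91.8 percent

# A token with two days left is below the 3d critical threshold.
token_critical = 2 * 86400 < ttl_to_seconds("3d")
```

Using the minimum over the window for file descriptors means a short spike alone does not raise the problem; the usage has to stay above the threshold for the whole window.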

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help with it at the Zabbix forums.
