This template is for Zabbix version: 7.4

Also available for: 7.2 7.0 6.4 6.2 6.0 5.4

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/vault_http?at=release/7.4

HashiCorp Vault by HTTP

Overview

This template monitors HashiCorp Vault with Zabbix and works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

The HashiCorp Vault by HTTP template collects metrics with the HTTP agent from the /sys/metrics API endpoint. See https://www.vaultproject.io/api-docs/system/metrics.
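As a rough illustration of the kind of request the HTTP agent item issues, the sketch below builds (but does not send) a GET request for the metrics endpoint. The `/v1` API prefix, the `format=prometheus` query parameter, and the `X-Vault-Token` header follow the Vault HTTP API; the function name and the sample host and token are hypothetical, and the exact URL used by the template should be checked in the template itself.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_metrics_request(scheme: str, host: str, port: int, token: str) -> Request:
    """Build (without sending) a GET request for Vault's /sys/metrics endpoint."""
    # Ask Vault for Prometheus text format, which the dependent items parse.
    query = urlencode({"format": "prometheus"})
    url = f"{scheme}://{host}:{port}/v1/sys/metrics?{query}"
    # Vault authenticates API calls with the X-Vault-Token header.
    return Request(url, headers={"X-Vault-Token": token}, method="GET")


# Hypothetical values standing in for the template macros.
request = build_metrics_request("http", "vault.example.com", 8200, "example-token")
print(request.full_url)
```

The scheme, port, and token arguments correspond to the macros {$VAULT.API.SCHEME}, {$VAULT.API.PORT}, and {$VAULT.TOKEN}.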

Requirements

Zabbix version: 7.4 and higher.

Tested versions

This template has been tested on:

Vault 1.6

Configuration

Setup

Configure the Vault API (see Vault Configuration). Create a Vault service token and set it in the macro {$VAULT.TOKEN}.

Macros used

Name	Description	Default
{$VAULT.API.PORT}	Vault port.	`8200`
{$VAULT.API.SCHEME}	Vault API scheme.	`http`
{$VAULT.HOST}	Vault host name.	`<PUT YOUR VAULT HOST>`
{$VAULT.OPEN.FDS.MAX.WARN}	Maximum percentage of used file descriptors for trigger expression.	`90`
{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}	Maximum number of Vault leadership setup failed.	`5`
{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}	Maximum number of Vault leadership losses.	`5`
{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}	Maximum number of Vault leadership step downs.	`5`
{$VAULT.LLD.FILTER.STORAGE.MATCHES}	Filter of discoverable storage backends.	`.+`
{$VAULT.TOKEN}	Vault auth token.	`<PUT YOUR AUTH TOKEN>`
{$VAULT.TOKEN.ACCESSORS}	Vault accessors separated by spaces for monitoring token expiration time.
{$VAULT.TOKEN.TTL.MIN.CRIT}	Token TTL critical threshold.	`3d`
{$VAULT.TOKEN.TTL.MIN.WARN}	Token TTL warning threshold.	`7d`

Items

Name	Description	Type	Key and additional info
Get health		HTTP agent	vault.get_health Preprocessing Check for not supported value: `any error` ⛔️Custom on fail: Set value to: `{"healthcheck": 0}`
Get leader		HTTP agent	vault.get_leader Preprocessing Check for not supported value: `any error` ⛔️Custom on fail: Discard value
Get metrics		HTTP agent	vault.get_metrics Preprocessing Check for not supported value: `any error` ⛔️Custom on fail: Discard value
Clear metrics		Dependent item	vault.clear_metrics Preprocessing Check for error in JSON: `$.errors` ⛔️Custom on fail: Discard value
Get tokens	Get information about tokens via their accessors. Accessors are defined in the macro "{$VAULT.TOKEN.ACCESSORS}".	Script	vault.get_tokens
Check WAL discovery		Dependent item	vault.check_wal_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_wal_(?:.+)$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Check replication discovery		Dependent item	vault.check_replication_discovery Preprocessing Prometheus to JSON: `{__name__=~"^replication_(?:.+)$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Check storage discovery		Dependent item	vault.check_storage_discovery Preprocessing Prometheus to JSON: `{name=~"^vault_(?:.+)_(?:get
Check mountpoint discovery		Dependent item	vault.check_mountpoint_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_rollback_attempt_(?:.+?)_count$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Initialized	Initialization status.	Dependent item	vault.health.initialized Preprocessing JSON Path: `$.initialized` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Sealed	Seal status.	Dependent item	vault.health.sealed Preprocessing JSON Path: `$.sealed` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Standby	Standby status.	Dependent item	vault.health.standby Preprocessing JSON Path: `$.standby` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Performance standby	Performance standby status.	Dependent item	vault.health.performance_standby Preprocessing JSON Path: `$.performance_standby` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Performance replication	Performance replication mode https://www.vaultproject.io/docs/enterprise/replication	Dependent item	vault.health.replication_performance_mode Preprocessing JSON Path: `$.replication_performance_mode` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Disaster Recovery replication	Disaster recovery replication mode https://www.vaultproject.io/docs/enterprise/replication	Dependent item	vault.health.replication_dr_mode Preprocessing JSON Path: `$.replication_dr_mode` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Version	Server version.	Dependent item	vault.health.version Preprocessing JSON Path: `$.version` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Healthcheck	Vault healthcheck.	Dependent item	vault.health.check Preprocessing JSON Path: `$.healthcheck` ⛔️Custom on fail: Set value to: `1` Discard unchanged with heartbeat: `1h`
HA enabled	HA enabled status.	Dependent item	vault.leader.ha_enabled Preprocessing JSON Path: `$.ha_enabled` Boolean to decimal Discard unchanged with heartbeat: `1h`
Is leader	Leader status.	Dependent item	vault.leader.is_self Preprocessing JSON Path: `$.is_self` Boolean to decimal Discard unchanged with heartbeat: `1h`
Get metrics error	Get metrics error.	Dependent item	vault.get_metrics.error Preprocessing JSON Path: `$.errors[0]` ⛔️Custom on fail: Set value to: `` Discard unchanged with heartbeat: `1h`
Process CPU seconds, total	Total user and system CPU time spent in seconds.	Dependent item	vault.metrics.process.cpu.seconds.total Preprocessing Prometheus pattern: `VALUE(process_cpu_seconds_total)` ⛔️Custom on fail: Discard value
Open file descriptors, max	Maximum number of open file descriptors.	Dependent item	vault.metrics.process.max.fds Preprocessing Prometheus pattern: `VALUE(process_max_fds)` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Open file descriptors, current	Number of open file descriptors.	Dependent item	vault.metrics.process.open.fds Preprocessing Prometheus pattern: `VALUE(process_open_fds)` ⛔️Custom on fail: Discard value
Process resident memory	Resident memory size in bytes.	Dependent item	vault.metrics.process.resident_memory.bytes Preprocessing Prometheus pattern: `VALUE(process_resident_memory_bytes)` ⛔️Custom on fail: Discard value
Uptime	Server uptime.	Dependent item	vault.metrics.process.uptime Preprocessing Prometheus pattern: `VALUE(process_start_time_seconds)` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.`
Process virtual memory, current	Virtual memory size in bytes.	Dependent item	vault.metrics.process.virtual_memory.bytes Preprocessing Prometheus pattern: `VALUE(process_virtual_memory_bytes)` ⛔️Custom on fail: Discard value
Process virtual memory, max	Maximum amount of virtual memory available in bytes.	Dependent item	vault.metrics.process.virtual_memory.max.bytes Preprocessing Prometheus pattern: `VALUE(process_virtual_memory_max_bytes)` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Audit log requests, rate	Number of all audit log requests across all audit log devices.	Dependent item	vault.metrics.audit.log.request.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_request_count)` ⛔️Custom on fail: Discard value Change per second
Audit log request failures, rate	Number of audit log request failures.	Dependent item	vault.metrics.audit.log.request.failure.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_request_failure)` ⛔️Custom on fail: Discard value Change per second
Audit log response, rate	Number of audit log responses across all audit log devices.	Dependent item	vault.metrics.audit.log.response.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_response_count)` ⛔️Custom on fail: Discard value Change per second
Audit log response failures, rate	Number of audit log response failures.	Dependent item	vault.metrics.audit.log.response.failure.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_response_failure)` ⛔️Custom on fail: Discard value Change per second
Barrier DELETE ops, rate	Number of DELETE operations at the barrier.	Dependent item	vault.metrics.barrier.delete.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_delete_count)` ⛔️Custom on fail: Discard value Change per second
Barrier GET ops, rate	Number of GET operations at the barrier.	Dependent item	vault.metrics.vault.barrier.get.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_get_count)` ⛔️Custom on fail: Discard value Change per second
Barrier LIST ops, rate	Number of LIST operations at the barrier.	Dependent item	vault.metrics.barrier.list.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_list_count)` ⛔️Custom on fail: Discard value Change per second
Barrier PUT ops, rate	Number of PUT operations at the barrier.	Dependent item	vault.metrics.barrier.put.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_put_count)` ⛔️Custom on fail: Discard value Change per second
Cache hit, rate	Number of times a value was retrieved from the LRU cache.	Dependent item	vault.metrics.cache.hit.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_hit)` ⛔️Custom on fail: Discard value Change per second
Cache miss, rate	Number of times a value was not in the LRU cache. This results in a read from the configured storage.	Dependent item	vault.metrics.cache.miss.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_miss)` ⛔️Custom on fail: Discard value Change per second
Cache write, rate	Number of times a value was written to the LRU cache.	Dependent item	vault.metrics.cache.write.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_write)` ⛔️Custom on fail: Discard value Change per second
Check token, rate	Number of token checks handled by Vault core.	Dependent item	vault.metrics.core.check.token.rate Preprocessing Prometheus pattern: `VALUE(vault_core_check_token_count)` ⛔️Custom on fail: Discard value Change per second
Fetch ACL and token, rate	Number of ACL and corresponding token entry fetches handled by Vault core.	Dependent item	vault.metrics.core.fetch.acl_and_token Preprocessing Prometheus pattern: `VALUE(vault_core_fetch_acl_and_token_count)` ⛔️Custom on fail: Discard value Change per second
Requests, rate	Number of requests handled by Vault core.	Dependent item	vault.metrics.core.handle.request Preprocessing Prometheus pattern: `VALUE(vault_core_handle_request_count)` ⛔️Custom on fail: Discard value Change per second
Leadership setup failed, counter	Cluster leadership setup failures which have occurred in a highly available Vault cluster.	Dependent item	vault.metrics.core.leadership.setup_failed Preprocessing Prometheus to JSON: `vault_core_leadership_setup_failed` JSON Path: `The text is too long. Please see the template.` ⛔️Custom on fail: Set value to: `0`
Leadership setup lost, counter	Cluster leadership losses which have occurred in a highly available Vault cluster.	Dependent item	vault.metrics.core.leadership_lost Preprocessing Prometheus to JSON: `vault_core_leadership_lost_count` JSON Path: `$[?(@.name=="vault_core_leadership_lost_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Post-unseal ops, counter	Duration of time taken by post-unseal operations handled by Vault core.	Dependent item	vault.metrics.core.post_unseal Preprocessing Prometheus pattern: `VALUE(vault_core_post_unseal_count)` ⛔️Custom on fail: Discard value
Pre-seal ops, counter	Duration of time taken by pre-seal operations.	Dependent item	vault.metrics.core.pre_seal Preprocessing Prometheus pattern: `VALUE(vault_core_pre_seal_count)` ⛔️Custom on fail: Discard value
Requested seal ops, counter	Duration of time taken by requested seal operations.	Dependent item	vault.metrics.core.seal_with_request Preprocessing Prometheus pattern: `VALUE(vault_core_seal_with_request_count)` ⛔️Custom on fail: Discard value
Seal ops, counter	Duration of time taken by seal operations.	Dependent item	vault.metrics.core.seal Preprocessing Prometheus pattern: `VALUE(vault_core_seal_count)` ⛔️Custom on fail: Discard value
Internal seal ops, counter	Duration of time taken by internal seal operations.	Dependent item	vault.metrics.core.seal_internal Preprocessing Prometheus pattern: `VALUE(vault_core_seal_internal_count)` ⛔️Custom on fail: Discard value
Leadership step downs, counter	Cluster leadership step down.	Dependent item	vault.metrics.core.step_down Preprocessing Prometheus to JSON: `vault_core_step_down_count` JSON Path: `$[?(@.name=="vault_core_step_down_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Unseal ops, counter	Duration of time taken by unseal operations.	Dependent item	vault.metrics.core.unseal Preprocessing Prometheus pattern: `VALUE(vault_core_unseal_count)` ⛔️Custom on fail: Discard value
Fetch lease times, counter	Time taken to fetch lease times.	Dependent item	vault.metrics.expire.fetch.lease.times Preprocessing Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_count)` ⛔️Custom on fail: Discard value
Fetch lease times by token, counter	Time taken to fetch lease times by token.	Dependent item	vault.metrics.expire.fetch.lease.times.by_token Preprocessing Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_by_token_count)` ⛔️Custom on fail: Discard value
Number of expiring leases	Number of all leases which are eligible for eventual expiry.	Dependent item	vault.metrics.expire.num_leases Preprocessing Prometheus pattern: `VALUE(vault_expire_num_leases)` ⛔️Custom on fail: Discard value
Expire revoke, count	Time taken to revoke a token.	Dependent item	vault.metrics.expire.revoke Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_count)` ⛔️Custom on fail: Discard value
Expire revoke force, count	Time taken to forcibly revoke a token.	Dependent item	vault.metrics.expire.revoke.force Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_force_count)` ⛔️Custom on fail: Discard value
Expire revoke prefix, count	Time taken to revoke tokens on a prefix.	Dependent item	vault.metrics.expire.revoke.prefix Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_prefix_count)` ⛔️Custom on fail: Discard value
Revoke secrets by token, count	Time taken to revoke all secrets issued with a given token.	Dependent item	vault.metrics.expire.revoke.by_token Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_by_token_count)` ⛔️Custom on fail: Discard value
Expire renew, count	Time taken to renew a lease.	Dependent item	vault.metrics.expire.renew Preprocessing Prometheus pattern: `VALUE(vault_expire_renew_count)` ⛔️Custom on fail: Discard value
Renew token, count	Time taken to renew a token which does not need to invoke a logical backend.	Dependent item	vault.metrics.expire.renew_token Preprocessing Prometheus pattern: `VALUE(vault_expire_renew_token_count)` ⛔️Custom on fail: Discard value
Register ops, count	Time taken for register operations.	Dependent item	vault.metrics.expire.register Preprocessing Prometheus pattern: `VALUE(vault_expire_register_count)` ⛔️Custom on fail: Discard value
Register auth ops, count	Time taken for register authentication operations which create lease entries without lease ID.	Dependent item	vault.metrics.expire.register.auth Preprocessing Prometheus pattern: `VALUE(vault_expire_register_auth_count)` ⛔️Custom on fail: Discard value
Policy GET ops, rate	Number of operations to get a policy.	Dependent item	vault.metrics.policy.get_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_get_policy_count)` ⛔️Custom on fail: Discard value Change per second
Policy LIST ops, rate	Number of operations to list policies.	Dependent item	vault.metrics.policy.list_policies.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_list_policies_count)` ⛔️Custom on fail: Discard value Change per second
Policy DELETE ops, rate	Number of operations to delete a policy.	Dependent item	vault.metrics.policy.delete_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_delete_policy_count)` ⛔️Custom on fail: Discard value Change per second
Policy SET ops, rate	Number of operations to set a policy.	Dependent item	vault.metrics.policy.set_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_set_policy_count)` ⛔️Custom on fail: Discard value Change per second
Token create, count	The time taken to create a token.	Dependent item	vault.metrics.token.create Preprocessing Prometheus pattern: `VALUE(vault_token_create_count)` ⛔️Custom on fail: Discard value
Token createAccessor, count	The time taken to create a token accessor.	Dependent item	vault.metrics.token.createAccessor Preprocessing Prometheus pattern: `VALUE(vault_token_createAccessor_count)` ⛔️Custom on fail: Discard value
Token lookup, rate	Number of token lookups.	Dependent item	vault.metrics.token.lookup.rate Preprocessing Prometheus pattern: `VALUE(vault_token_lookup_count)` ⛔️Custom on fail: Discard value Change per second
Token revoke, count	The time taken to revoke a token.	Dependent item	vault.metrics.token.revoke Preprocessing Prometheus pattern: `VALUE(vault_token_revoke_count)` ⛔️Custom on fail: Discard value
Token revoke tree, count	Time taken to revoke a token tree.	Dependent item	vault.metrics.token.revoke.tree Preprocessing Prometheus pattern: `VALUE(vault_token_revoke_tree_count)` ⛔️Custom on fail: Discard value
Token store, count	Time taken to store an updated token entry without writing to the secondary index.	Dependent item	vault.metrics.token.store Preprocessing Prometheus pattern: `VALUE(vault_token_store_count)` ⛔️Custom on fail: Discard value
Runtime allocated bytes	Number of bytes allocated by the Vault process. This could burst from time to time, but should return to a steady state value.	Dependent item	vault.metrics.runtime.alloc.bytes Preprocessing Prometheus pattern: `VALUE(vault_runtime_alloc_bytes)` ⛔️Custom on fail: Discard value
Runtime freed objects	Number of freed objects.	Dependent item	vault.metrics.runtime.free.count Preprocessing Prometheus pattern: `VALUE(vault_runtime_free_count)` ⛔️Custom on fail: Discard value
Runtime heap objects	Number of objects on the heap. This is a good general memory pressure indicator worth establishing a baseline and thresholds for alerting.	Dependent item	vault.metrics.runtime.heap.objects Preprocessing Prometheus pattern: `VALUE(vault_runtime_heap_objects)` ⛔️Custom on fail: Discard value
Runtime malloc count	Cumulative count of allocated heap objects.	Dependent item	vault.metrics.runtime.malloc.count Preprocessing Prometheus pattern: `VALUE(vault_runtime_malloc_count)` ⛔️Custom on fail: Discard value
Runtime num goroutines	Number of goroutines. This serves as a general system load indicator worth establishing a baseline and thresholds for alerting.	Dependent item	vault.metrics.runtime.num_goroutines Preprocessing Prometheus pattern: `VALUE(vault_runtime_num_goroutines)` ⛔️Custom on fail: Discard value
Runtime sys bytes	Number of bytes allocated to Vault. This includes what is being used by Vault's heap and what has been reclaimed but not given back to the operating system.	Dependent item	vault.metrics.runtime.sys.bytes Preprocessing Prometheus pattern: `VALUE(vault_runtime_sys_bytes)` ⛔️Custom on fail: Discard value
Runtime GC pause, total	The total garbage collector pause time since Vault was last started.	Dependent item	vault.metrics.total.gc.pause Preprocessing Prometheus pattern: `VALUE(vault_runtime_total_gc_pause_ns)` ⛔️Custom on fail: Discard value Custom multiplier: `1e-09`
Runtime GC runs, total	Total number of garbage collection runs since Vault was last started.	Dependent item	vault.metrics.runtime.total.gc.runs Preprocessing Prometheus pattern: `VALUE(vault_runtime_total_gc_runs)` ⛔️Custom on fail: Discard value
Token count, total	Total number of service tokens available for use; counts all un-expired and un-revoked tokens in Vault's token store. This measurement is performed every 10 minutes.	Dependent item	vault.metrics.token Preprocessing Prometheus to JSON: `vault_token_count` JSON Path: `$[?(@.name=="vault_token_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token count by auth, total	Total number of service tokens that were created by an auth method.	Dependent item	vault.metrics.token.by_auth Preprocessing Prometheus to JSON: `vault_token_count_by_auth` JSON Path: `$[?(@.name=="vault_token_count_by_auth")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token count by policy, total	Total number of service tokens that have a policy attached.	Dependent item	vault.metrics.token.by_policy Preprocessing Prometheus to JSON: `vault_token_count_by_policy` JSON Path: `$[?(@.name=="vault_token_count_by_policy")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token count by ttl, total	Number of service tokens, grouped by the TTL range they were assigned at creation.	Dependent item	vault.metrics.token.by_ttl Preprocessing Prometheus to JSON: `vault_token_count_by_ttl` JSON Path: `$[?(@.name=="vault_token_count_by_ttl")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token creation, rate	Number of service or batch tokens created.	Dependent item	vault.metrics.token.creation.rate Preprocessing Prometheus to JSON: `vault_token_creation` JSON Path: `$[?(@.name=="vault_token_creation")].value.sum()` ⛔️Custom on fail: Set value to: `0` Change per second
Secret kv entries	Number of entries in each key-value secret engine.	Dependent item	vault.metrics.secret.kv.count Preprocessing Prometheus to JSON: `vault_secret_kv_count` JSON Path: `$[?(@.name=="vault_secret_kv_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token secret lease creation, rate	Counts the number of leases created by secret engines.	Dependent item	vault.metrics.secret.lease.creation.rate Preprocessing Prometheus to JSON: `vault_secret_lease_creation` JSON Path: `$[?(@.name=="vault_secret_lease_creation")].value.sum()` ⛔️Custom on fail: Set value to: `0` Change per second
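The preprocessing steps in the table above follow a few recurring shapes: a Prometheus pattern `VALUE(metric)` picks a single sample, "Prometheus to JSON" plus a `.sum()` JSONPath adds up all labelled samples of one metric, and "Change per second" turns a growing counter into a rate. The sketch below imitates these steps in plain Python for illustration only. It is not Zabbix's implementation; the parser handles only simple exposition lines, and all function names are hypothetical.

```python
import re

# Matches "name{labels} value" or "name value" (simplified assumption).
SAMPLE_RE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)")


def all_values(text, name):
    """Return the float values of every sample of metric `name`."""
    values = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match and match.group(1) == name:
            values.append(float(match.group(3)))
    return values


def single_value(text, name):
    """Imitate VALUE(name): exactly one sample, else None ("Discard value")."""
    values = all_values(text, name)
    return values[0] if len(values) == 1 else None


def change_per_second(prev_value, prev_time, value, time):
    """Imitate "Change per second"; a decreasing counter yields no value."""
    if value < prev_value or time <= prev_time:
        return None
    return (value - prev_value) / (time - prev_time)


sample = (
    "# TYPE vault_barrier_get_count counter\n"
    "vault_barrier_get_count 120\n"
    'vault_token_count{namespace="a"} 2\n'
    'vault_token_count{namespace="b"} 3\n'
)
print(single_value(sample, "vault_barrier_get_count"))
print(sum(all_values(sample, "vault_token_count")))
```

Summing an empty list gives 0, which mirrors the "Set value to: `0`" fallback used by the items that aggregate labelled samples.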

Triggers

Name	Description	Expression	Severity	Dependencies and additional info
HashiCorp Vault: Vault server is sealed	https://www.vaultproject.io/docs/concepts/seal	`last(/HashiCorp Vault by HTTP/vault.health.sealed)=1`	Average
HashiCorp Vault: Version has changed	Vault version has changed. Acknowledge to close the problem manually.	`last(/HashiCorp Vault by HTTP/vault.health.version,#1)<>last(/HashiCorp Vault by HTTP/vault.health.version,#2) and length(last(/HashiCorp Vault by HTTP/vault.health.version))>0`	Info	Manual close: Yes
HashiCorp Vault: Vault server is not responding		`last(/HashiCorp Vault by HTTP/vault.health.check)=0`	High
HashiCorp Vault: Failed to get metrics		`length(last(/HashiCorp Vault by HTTP/vault.get_metrics.error))>0`	Warning	Depends on: HashiCorp Vault: Vault server is sealed
HashiCorp Vault: Current number of open files is too high		`min(/HashiCorp Vault by HTTP/vault.metrics.process.open.fds,5m)/last(/HashiCorp Vault by HTTP/vault.metrics.process.max.fds)*100>{$VAULT.OPEN.FDS.MAX.WARN}`	Warning
HashiCorp Vault: Service has been restarted	Uptime is less than 10 minutes.	`last(/HashiCorp Vault by HTTP/vault.metrics.process.uptime)<10m`	Info	Manual close: Yes
HashiCorp Vault: High frequency of leadership setup failures	There have been more than {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} Vault leadership setup failures in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h))>{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}`	Average
HashiCorp Vault: High frequency of leadership losses	There have been more than {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} Vault leadership losses in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h))>{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}`	Average
HashiCorp Vault: High frequency of leadership step downs	There have been more than {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} Vault leadership step downs in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h))>{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}`	Average
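To make the arithmetic behind two of the trigger expressions above concrete, here is a minimal sketch in Python. It assumes the history values for the evaluation window are already available as lists; the function names are hypothetical, and the default thresholds are the macro defaults from the table of macros.

```python
def open_fds_too_high(open_fds_5m, max_fds, warn_pct=90.0):
    """min(open.fds,5m) / last(max.fds) * 100 > {$VAULT.OPEN.FDS.MAX.WARN}"""
    # Using the minimum means the usage must stay high for the whole window.
    return min(open_fds_5m) / max_fds * 100 > warn_pct


def counter_growth_exceeds(values_1h, threshold=5):
    """(max(counter,1h) - min(counter,1h)) > threshold"""
    # The counter only grows, so max - min is the number of events in the window.
    return max(values_1h) - min(values_1h) > threshold


print(open_fds_too_high([950, 920, 990], max_fds=1000))
print(counter_growth_exceeds([3, 4, 9]))
```

The same max-minus-min shape is shared by the leadership setup failure, leadership loss, and step down triggers; only the item key and the threshold macro differ.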

LLD rule Storage metrics discovery

Name	Description	Type	Key and additional info
Storage metrics discovery	Storage backend metrics discovery.	Dependent item	vault.storage.discovery

Item prototypes for Storage metrics discovery

Name	Description	Type	Key and additional info
Storage [{#STORAGE}] {#OPERATION} ops, rate	Number of {#OPERATION} operations against the {#STORAGE} storage backend.	Dependent item	vault.metrics.storage.rate[{#STORAGE}, {#OPERATION}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second

LLD rule Mountpoint metrics discovery

Name	Description	Type	Key and additional info
Mountpoint metrics discovery	Mountpoint metrics discovery.	Dependent item	vault.mountpoint.discovery

Item prototypes for Mountpoint metrics discovery

Name	Description	Type	Key and additional info
Rollback attempt [{#MOUNTPOINT}] ops, rate	Number of operations to perform a rollback operation on the given mount point.	Dependent item	vault.metrics.rollback.attempt.rate[{#MOUNTPOINT}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second
Route rollback [{#MOUNTPOINT}] ops, rate	Number of operations to dispatch a rollback operation to a backend, and for that backend to process it. Rollback operations are automatically scheduled to clean up partial errors.	Dependent item	vault.metrics.route.rollback.rate[{#MOUNTPOINT}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second
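The discovery JavaScript itself is not reproduced on this page, so the following is only a guess at the general idea: metric names matching the "Check mountpoint discovery" filter are turned into low-level discovery rows that the item prototypes above can consume. The capturing group, the row layout, and the choice to store the full metric name in {#PATTERN_C} are assumptions made for illustration; consult the template for the real logic.

```python
import re

# Same shape as the template's filter, but with a capturing group (assumption).
ROLLBACK_RE = re.compile(r"^vault_rollback_attempt_(.+?)_count$")


def discover_mountpoints(metric_names):
    """Build hypothetical LLD rows from rollback-attempt metric names."""
    rows = []
    for name in metric_names:
        match = ROLLBACK_RE.match(name)
        if match:
            rows.append({
                "{#MOUNTPOINT}": match.group(1),  # part between prefix and _count
                "{#PATTERN_C}": name,             # assumed: metric name for VALUE()
            })
    return rows


names = ["vault_rollback_attempt_secret__count", "vault_barrier_get_count"]
print(discover_mountpoints(names))
```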

LLD rule WAL metrics discovery

Name	Description	Type	Key and additional info
WAL metrics discovery	Discovery for WAL metrics.	Dependent item	vault.wal.discovery

Item prototypes for WAL metrics discovery

Name	Description	Type	Key and additional info
Delete WALs, count{#SINGLETON}	Time taken to delete a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.deletewals[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_deletewals_count)` ⛔️Custom on fail: Discard value
GC deleted WAL{#SINGLETON}	Number of Write Ahead Logs (WAL) deleted during each garbage collection run.	Dependent item	vault.metrics.wal.gc.deleted[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_gc_deleted)` ⛔️Custom on fail: Discard value
WALs on disk, total{#SINGLETON}	Total number of Write Ahead Logs (WAL) on disk.	Dependent item	vault.metrics.wal.gc.total[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_gc_total)` ⛔️Custom on fail: Discard value
Load WALs, count{#SINGLETON}	Time taken to load a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.loadWAL[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_loadWAL_count)` ⛔️Custom on fail: Discard value
Persist WALs, count{#SINGLETON}	Time taken to persist a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.persistwals[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_persistwals_count)` ⛔️Custom on fail: Discard value
Flush ready WAL, count{#SINGLETON}	Time taken to flush a ready Write Ahead Log (WAL) to storage.	Dependent item	vault.metrics.wal.flushready[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_flushready_count)` ⛔️Custom on fail: Discard value

LLD rule Replication metrics discovery

Name	Description	Type	Key and additional info
Replication metrics discovery	Discovery for replication metrics.	Dependent item	vault.replication.discovery

Item prototypes for Replication metrics discovery

Name	Description	Type	Key and additional info
Stream WAL missing guard, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is not matched/found.	Dependent item	vault.metrics.logshipper.streamWALs.missing_guard[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(logshipper_streamWALs_missing_guard)` ⛔️Custom on fail: Discard value
Stream WAL guard found, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is matched/found.	Dependent item	vault.metrics.logshipper.streamWALs.guard_found[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(logshipper_streamWALs_guard_found)` ⛔️Custom on fail: Discard value
Merkle commit index{#SINGLETON}	The last committed index in the Merkle Tree.	Dependent item	vault.metrics.replication.merkle.commit_index[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_merkle_commit_index)` ⛔️Custom on fail: Discard value
Last WAL{#SINGLETON}	The index of the last WAL.	Dependent item	vault.metrics.replication.wal.last_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_wal)` ⛔️Custom on fail: Discard value
Last DR WAL{#SINGLETON}	The index of the last DR WAL.	Dependent item	vault.metrics.replication.wal.last_dr_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_dr_wal)` ⛔️Custom on fail: Discard value
Last performance WAL{#SINGLETON}	The index of the last Performance WAL.	Dependent item	vault.metrics.replication.wal.last_performance_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_performance_wal)` ⛔️Custom on fail: Discard value
Last remote WAL{#SINGLETON}	The index of the last remote WAL.	Dependent item	vault.metrics.replication.fsm.last_remote_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_fsm_last_remote_wal)` ⛔️Custom on fail: Discard value

LLD rule Token metrics discovery

Name	Description	Type	Key and additional info
Token metrics discovery	Tokens metrics discovery.	Dependent item	vault.tokens.discovery

Item prototypes for Token metrics discovery

Name	Description	Type	Key and additional info
Token [{#TOKEN_NAME}] error	Token lookup error text.	Dependent item	vault.token_via_accessor.error["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].error.first()` Discard unchanged with heartbeat: `1h`
Token [{#TOKEN_NAME}] has TTL	The Token has TTL.	Dependent item	vault.token_via_accessor.has_ttl["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].has_ttl.first()` Boolean to decimal Discard unchanged with heartbeat: `1h`
Token [{#TOKEN_NAME}] TTL	The TTL period of the token.	Dependent item	vault.token_via_accessor.ttl["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].ttl.first()`
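The JSONPath expressions in the prototypes above all filter the output of the "Get tokens" item by accessor and take the first match of one field. A plain-Python equivalent might look like the sketch below. The shape of the token list (objects with `accessor`, `ttl`, `has_ttl`, and `error` fields) is inferred from those JSONPaths, and the sample data and function name are made up.

```python
def first_field(tokens, accessor, field):
    """Rough equivalent of $.[?(@.accessor == "<accessor>")].<field>.first()"""
    for entry in tokens:
        if entry.get("accessor") == accessor and field in entry:
            return entry[field]
    return None  # in Zabbix a missing match would fail the preprocessing step


# Hypothetical output of the "Get tokens" script item.
tokens = [
    {"accessor": "acc-1", "has_ttl": True, "ttl": 86400, "error": ""},
    {"accessor": "acc-2", "has_ttl": False, "ttl": 0, "error": ""},
]

print(first_field(tokens, "acc-1", "ttl"))
# "Boolean to decimal" then maps true/false to 1/0:
print(int(first_field(tokens, "acc-2", "has_ttl")))
```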

Trigger prototypes for Token metrics discovery

Name	Expression	Severity	Dependencies and additional info
HashiCorp Vault: Token [{#TOKEN_NAME}] lookup error occurred	`length(last(/HashiCorp Vault by HTTP/vault.token_via_accessor.error["{#ACCESSOR}"]))>0`	Warning	Depends on: HashiCorp Vault: Vault server is sealed
HashiCorp Vault: Token [{#TOKEN_NAME}] will expire soon	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.CRIT}`	Average
HashiCorp Vault: Token [{#TOKEN_NAME}] will expire soon	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.WARN}`	Warning	Depends on: HashiCorp Vault: Token [{#TOKEN_NAME}] will expire soon
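
The two "will expire soon" prototypes differ only in threshold and severity, and the Warning one depends on the Average one, so only the more severe problem is shown when both conditions hold. A minimal Python sketch of that logic, assuming the TTL item is reported in seconds and using the default macro values `3d` and `7d`; the function names are hypothetical.

```python
from typing import Optional

# Time suffixes as used in Zabbix macro values such as `3d` or `7d`.
SUFFIX_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def to_seconds(value: str) -> int:
    """Convert a value like '3d' into seconds; plain numbers pass through."""
    if value[-1] in SUFFIX_SECONDS:
        return int(value[:-1]) * SUFFIX_SECONDS[value[-1]]
    return int(value)


def token_severity(has_ttl: int, ttl: float,
                   crit: str = "3d", warn: str = "7d") -> Optional[str]:
    """Return the severity that would be shown for a token, if any."""
    if has_ttl != 1:
        return None  # tokens without a TTL never fire these triggers
    if ttl < to_seconds(crit):
        return "Average"  # the dependency hides the Warning trigger
    if ttl < to_seconds(warn):
        return "Warning"
    return None
```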

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help at the Zabbix forums.

This template is for Zabbix version: 7.2

Also available for: 7.4 7.0 6.4 6.2 6.0 5.4

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/vault_http?at=release/7.2

HashiCorp Vault by HTTP

Overview

This template monitors HashiCorp Vault with Zabbix and works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

The template collects metrics with the HTTP agent from the /sys/metrics API endpoint. See https://www.vaultproject.io/api-docs/system/metrics.

Requirements

Zabbix version: 7.2 and higher.

Tested versions

This template has been tested on:

Vault 1.6

Configuration

Setup

Configure Vault API. See Vault Configuration. Create a Vault service token and set it to the macro {$VAULT.TOKEN}.
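
To check the token and endpoint before linking the template, the request can be reproduced by hand. The sketch below only builds the request and does not send it. It assumes the metrics are requested in Prometheus format, which the Prometheus preprocessing steps in this template suggest, and that the token is passed in the `X-Vault-Token` header; the host name and token value are placeholders.

```python
from urllib.request import Request


def build_metrics_request(scheme: str, host: str, port: int, token: str) -> Request:
    """Build a GET request for Vault's metrics endpoint from the macro values."""
    url = f"{scheme}://{host}:{port}/v1/sys/metrics?format=prometheus"
    return Request(url, headers={"X-Vault-Token": token})


# Placeholder values standing in for {$VAULT.API.SCHEME}, {$VAULT.HOST},
# {$VAULT.API.PORT} and {$VAULT.TOKEN}.
req = build_metrics_request("http", "vault.example.com", 8200, "example-token")
```

Passing `req` to `urllib.request.urlopen` would send it, which requires a reachable Vault server and a token allowed to read the metrics endpoint.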

Macros used

Name	Description	Default
{$VAULT.API.PORT}	Vault port.	`8200`
{$VAULT.API.SCHEME}	Vault API scheme.	`http`
{$VAULT.HOST}	Vault host name.	`<PUT YOUR VAULT HOST>`
{$VAULT.OPEN.FDS.MAX.WARN}	Maximum percentage of used file descriptors for trigger expression.	`90`
{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}	Maximum number of Vault leadership setup failures.	`5`
{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}	Maximum number of Vault leadership losses.	`5`
{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}	Maximum number of Vault leadership step downs.	`5`
{$VAULT.LLD.FILTER.STORAGE.MATCHES}	Filter of discoverable storage backends.	`.+`
{$VAULT.TOKEN}	Vault auth token.	`<PUT YOUR AUTH TOKEN>`
{$VAULT.TOKEN.ACCESSORS}	Vault accessors separated by spaces for monitoring token expiration time.
{$VAULT.TOKEN.TTL.MIN.CRIT}	Token TTL critical threshold.	`3d`
{$VAULT.TOKEN.TTL.MIN.WARN}	Token TTL warning threshold.	`7d`

Items

Name	Description	Type	Key and additional info
Get health		HTTP agent	vault.get_health Preprocessing Check for not supported value: `any error` ⛔️Custom on fail: Set value to: `{"healthcheck": 0}`
Get leader		HTTP agent	vault.get_leader Preprocessing Check for not supported value: `any error` ⛔️Custom on fail: Discard value
Get metrics		HTTP agent	vault.get_metrics Preprocessing Check for not supported value: `any error` ⛔️Custom on fail: Discard value
Clear metrics		Dependent item	vault.clear_metrics Preprocessing Check for error in JSON: `$.errors` ⛔️Custom on fail: Discard value
Get tokens	Get information about tokens via their accessors. Accessors are defined in the macro "{$VAULT.TOKEN.ACCESSORS}".	Script	vault.get_tokens
Check WAL discovery		Dependent item	vault.check_wal_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_wal_(?:.+)$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Check replication discovery		Dependent item	vault.check_replication_discovery Preprocessing Prometheus to JSON: `{__name__=~"^replication_(?:.+)$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Check storage discovery		Dependent item	vault.check_storage_discovery Preprocessing Prometheus to JSON: `{name=~"^vault_(?:.+)_(?:get
Check mountpoint discovery		Dependent item	vault.check_mountpoint_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_rollback_attempt_(?:.+?)_count$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Initialized	Initialization status.	Dependent item	vault.health.initialized Preprocessing JSON Path: `$.initialized` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Sealed	Seal status.	Dependent item	vault.health.sealed Preprocessing JSON Path: `$.sealed` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Standby	Standby status.	Dependent item	vault.health.standby Preprocessing JSON Path: `$.standby` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Performance standby	Performance standby status.	Dependent item	vault.health.performance_standby Preprocessing JSON Path: `$.performance_standby` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Performance replication	Performance replication mode https://www.vaultproject.io/docs/enterprise/replication	Dependent item	vault.health.replication_performance_mode Preprocessing JSON Path: `$.replication_performance_mode` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Disaster Recovery replication	Disaster recovery replication mode https://www.vaultproject.io/docs/enterprise/replication	Dependent item	vault.health.replication_dr_mode Preprocessing JSON Path: `$.replication_dr_mode` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Version	Server version.	Dependent item	vault.health.version Preprocessing JSON Path: `$.version` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Healthcheck	Vault healthcheck.	Dependent item	vault.health.check Preprocessing JSON Path: `$.healthcheck` ⛔️Custom on fail: Set value to: `1` Discard unchanged with heartbeat: `1h`
HA enabled	HA enabled status.	Dependent item	vault.leader.ha_enabled Preprocessing JSON Path: `$.ha_enabled` Boolean to decimal Discard unchanged with heartbeat: `1h`
Is leader	Leader status.	Dependent item	vault.leader.is_self Preprocessing JSON Path: `$.is_self` Boolean to decimal Discard unchanged with heartbeat: `1h`
Get metrics error	Get metrics error.	Dependent item	vault.get_metrics.error Preprocessing JSON Path: `$.errors[0]` ⛔️Custom on fail: Set value to: `` Discard unchanged with heartbeat: `1h`
Process CPU seconds, total	Total user and system CPU time spent in seconds.	Dependent item	vault.metrics.process.cpu.seconds.total Preprocessing Prometheus pattern: `VALUE(process_cpu_seconds_total)` ⛔️Custom on fail: Discard value
Open file descriptors, max	Maximum number of open file descriptors.	Dependent item	vault.metrics.process.max.fds Preprocessing Prometheus pattern: `VALUE(process_max_fds)` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Open file descriptors, current	Number of open file descriptors.	Dependent item	vault.metrics.process.open.fds Preprocessing Prometheus pattern: `VALUE(process_open_fds)` ⛔️Custom on fail: Discard value
Process resident memory	Resident memory size in bytes.	Dependent item	vault.metrics.process.resident_memory.bytes Preprocessing Prometheus pattern: `VALUE(process_resident_memory_bytes)` ⛔️Custom on fail: Discard value
Uptime	Server uptime.	Dependent item	vault.metrics.process.uptime Preprocessing Prometheus pattern: `VALUE(process_start_time_seconds)` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.`
Process virtual memory, current	Virtual memory size in bytes.	Dependent item	vault.metrics.process.virtual_memory.bytes Preprocessing Prometheus pattern: `VALUE(process_virtual_memory_bytes)` ⛔️Custom on fail: Discard value
Process virtual memory, max	Maximum amount of virtual memory available in bytes.	Dependent item	vault.metrics.process.virtual_memory.max.bytes Preprocessing Prometheus pattern: `VALUE(process_virtual_memory_max_bytes)` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Audit log requests, rate	Number of all audit log requests across all audit log devices.	Dependent item	vault.metrics.audit.log.request.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_request_count)` ⛔️Custom on fail: Discard value Change per second
Audit log request failures, rate	Number of audit log request failures.	Dependent item	vault.metrics.audit.log.request.failure.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_request_failure)` ⛔️Custom on fail: Discard value Change per second
Audit log response, rate	Number of audit log responses across all audit log devices.	Dependent item	vault.metrics.audit.log.response.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_response_count)` ⛔️Custom on fail: Discard value Change per second
Audit log response failures, rate	Number of audit log response failures.	Dependent item	vault.metrics.audit.log.response.failure.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_response_failure)` ⛔️Custom on fail: Discard value Change per second
Barrier DELETE ops, rate	Number of DELETE operations at the barrier.	Dependent item	vault.metrics.barrier.delete.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_delete_count)` ⛔️Custom on fail: Discard value Change per second
Barrier GET ops, rate	Number of GET operations at the barrier.	Dependent item	vault.metrics.vault.barrier.get.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_get_count)` ⛔️Custom on fail: Discard value Change per second
Barrier LIST ops, rate	Number of LIST operations at the barrier.	Dependent item	vault.metrics.barrier.list.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_list_count)` ⛔️Custom on fail: Discard value Change per second
Barrier PUT ops, rate	Number of PUT operations at the barrier.	Dependent item	vault.metrics.barrier.put.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_put_count)` ⛔️Custom on fail: Discard value Change per second
Cache hit, rate	Number of times a value was retrieved from the LRU cache.	Dependent item	vault.metrics.cache.hit.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_hit)` ⛔️Custom on fail: Discard value Change per second
Cache miss, rate	Number of times a value was not in the LRU cache. This results in a read from the configured storage.	Dependent item	vault.metrics.cache.miss.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_miss)` ⛔️Custom on fail: Discard value Change per second
Cache write, rate	Number of times a value was written to the LRU cache.	Dependent item	vault.metrics.cache.write.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_write)` ⛔️Custom on fail: Discard value Change per second
Check token, rate	Number of token checks handled by Vault core.	Dependent item	vault.metrics.core.check.token.rate Preprocessing Prometheus pattern: `VALUE(vault_core_check_token_count)` ⛔️Custom on fail: Discard value Change per second
Fetch ACL and token, rate	Number of ACL and corresponding token entry fetches handled by Vault core.	Dependent item	vault.metrics.core.fetch.acl_and_token Preprocessing Prometheus pattern: `VALUE(vault_core_fetch_acl_and_token_count)` ⛔️Custom on fail: Discard value Change per second
Requests, rate	Number of requests handled by Vault core.	Dependent item	vault.metrics.core.handle.request Preprocessing Prometheus pattern: `VALUE(vault_core_handle_request_count)` ⛔️Custom on fail: Discard value Change per second
Leadership setup failed, counter	Cluster leadership setup failures which have occurred in a highly available Vault cluster.	Dependent item	vault.metrics.core.leadership.setup_failed Preprocessing Prometheus to JSON: `vault_core_leadership_setup_failed` JSON Path: `The text is too long. Please see the template.` ⛔️Custom on fail: Set value to: `0`
Leadership setup lost, counter	Cluster leadership losses which have occurred in a highly available Vault cluster.	Dependent item	vault.metrics.core.leadership_lost Preprocessing Prometheus to JSON: `vault_core_leadership_lost_count` JSON Path: `$[?(@.name=="vault_core_leadership_lost_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Post-unseal ops, counter	Duration of time taken by post-unseal operations handled by Vault core.	Dependent item	vault.metrics.core.post_unseal Preprocessing Prometheus pattern: `VALUE(vault_core_post_unseal_count)` ⛔️Custom on fail: Discard value
Pre-seal ops, counter	Duration of time taken by pre-seal operations.	Dependent item	vault.metrics.core.pre_seal Preprocessing Prometheus pattern: `VALUE(vault_core_pre_seal_count)` ⛔️Custom on fail: Discard value
Requested seal ops, counter	Duration of time taken by requested seal operations.	Dependent item	vault.metrics.core.seal_with_request Preprocessing Prometheus pattern: `VALUE(vault_core_seal_with_request_count)` ⛔️Custom on fail: Discard value
Seal ops, counter	Duration of time taken by seal operations.	Dependent item	vault.metrics.core.seal Preprocessing Prometheus pattern: `VALUE(vault_core_seal_count)` ⛔️Custom on fail: Discard value
Internal seal ops, counter	Duration of time taken by internal seal operations.	Dependent item	vault.metrics.core.seal_internal Preprocessing Prometheus pattern: `VALUE(vault_core_seal_internal_count)` ⛔️Custom on fail: Discard value
Leadership step downs, counter	Cluster leadership step down.	Dependent item	vault.metrics.core.step_down Preprocessing Prometheus to JSON: `vault_core_step_down_count` JSON Path: `$[?(@.name=="vault_core_step_down_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Unseal ops, counter	Duration of time taken by unseal operations.	Dependent item	vault.metrics.core.unseal Preprocessing Prometheus pattern: `VALUE(vault_core_unseal_count)` ⛔️Custom on fail: Discard value
Fetch lease times, counter	Time taken to fetch lease times.	Dependent item	vault.metrics.expire.fetch.lease.times Preprocessing Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_count)` ⛔️Custom on fail: Discard value
Fetch lease times by token, counter	Time taken to fetch lease times by token.	Dependent item	vault.metrics.expire.fetch.lease.times.by_token Preprocessing Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_by_token_count)` ⛔️Custom on fail: Discard value
Number of expiring leases	Number of all leases which are eligible for eventual expiry.	Dependent item	vault.metrics.expire.num_leases Preprocessing Prometheus pattern: `VALUE(vault_expire_num_leases)` ⛔️Custom on fail: Discard value
Expire revoke, count	Time taken to revoke a token.	Dependent item	vault.metrics.expire.revoke Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_count)` ⛔️Custom on fail: Discard value
Expire revoke force, count	Time taken to forcibly revoke a token.	Dependent item	vault.metrics.expire.revoke.force Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_force_count)` ⛔️Custom on fail: Discard value
Expire revoke prefix, count	Time taken to revoke tokens on a prefix.	Dependent item	vault.metrics.expire.revoke.prefix Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_prefix_count)` ⛔️Custom on fail: Discard value
Revoke secrets by token, count	Time taken to revoke all secrets issued with a given token.	Dependent item	vault.metrics.expire.revoke.by_token Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_by_token_count)` ⛔️Custom on fail: Discard value
Expire renew, count	Time taken to renew a lease.	Dependent item	vault.metrics.expire.renew Preprocessing Prometheus pattern: `VALUE(vault_expire_renew_count)` ⛔️Custom on fail: Discard value
Renew token, count	Time taken to renew a token which does not need to invoke a logical backend.	Dependent item	vault.metrics.expire.renew_token Preprocessing Prometheus pattern: `VALUE(vault_expire_renew_token_count)` ⛔️Custom on fail: Discard value
Register ops, count	Time taken for register operations.	Dependent item	vault.metrics.expire.register Preprocessing Prometheus pattern: `VALUE(vault_expire_register_count)` ⛔️Custom on fail: Discard value
Register auth ops, count	Time taken for register authentication operations which create lease entries without lease ID.	Dependent item	vault.metrics.expire.register.auth Preprocessing Prometheus pattern: `VALUE(vault_expire_register_auth_count)` ⛔️Custom on fail: Discard value
Policy GET ops, rate	Number of operations to get a policy.	Dependent item	vault.metrics.policy.get_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_get_policy_count)` ⛔️Custom on fail: Discard value Change per second
Policy LIST ops, rate	Number of operations to list policies.	Dependent item	vault.metrics.policy.list_policies.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_list_policies_count)` ⛔️Custom on fail: Discard value Change per second
Policy DELETE ops, rate	Number of operations to delete a policy.	Dependent item	vault.metrics.policy.delete_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_delete_policy_count)` ⛔️Custom on fail: Discard value Change per second
Policy SET ops, rate	Number of operations to set a policy.	Dependent item	vault.metrics.policy.set_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_set_policy_count)` ⛔️Custom on fail: Discard value Change per second
Token create, count	The time taken to create a token.	Dependent item	vault.metrics.token.create Preprocessing Prometheus pattern: `VALUE(vault_token_create_count)` ⛔️Custom on fail: Discard value
Token createAccessor, count	The time taken to create a token accessor.	Dependent item	vault.metrics.token.createAccessor Preprocessing Prometheus pattern: `VALUE(vault_token_createAccessor_count)` ⛔️Custom on fail: Discard value
Token lookup, rate	Number of token lookups.	Dependent item	vault.metrics.token.lookup.rate Preprocessing Prometheus pattern: `VALUE(vault_token_lookup_count)` ⛔️Custom on fail: Discard value Change per second
Token revoke, count	The time taken to revoke a token.	Dependent item	vault.metrics.token.revoke Preprocessing Prometheus pattern: `VALUE(vault_token_revoke_count)` ⛔️Custom on fail: Discard value
Token revoke tree, count	Time taken to revoke a token tree.	Dependent item	vault.metrics.token.revoke.tree Preprocessing Prometheus pattern: `VALUE(vault_token_revoke_tree_count)` ⛔️Custom on fail: Discard value
Token store, count	Time taken to store an updated token entry without writing to the secondary index.	Dependent item	vault.metrics.token.store Preprocessing Prometheus pattern: `VALUE(vault_token_store_count)` ⛔️Custom on fail: Discard value
Runtime allocated bytes	Number of bytes allocated by the Vault process. This could burst from time to time, but should return to a steady state value.	Dependent item	vault.metrics.runtime.alloc.bytes Preprocessing Prometheus pattern: `VALUE(vault_runtime_alloc_bytes)` ⛔️Custom on fail: Discard value
Runtime freed objects	Number of freed objects.	Dependent item	vault.metrics.runtime.free.count Preprocessing Prometheus pattern: `VALUE(vault_runtime_free_count)` ⛔️Custom on fail: Discard value
Runtime heap objects	Number of objects on the heap. This is a good general memory pressure indicator worth establishing a baseline and thresholds for alerting.	Dependent item	vault.metrics.runtime.heap.objects Preprocessing Prometheus pattern: `VALUE(vault_runtime_heap_objects)` ⛔️Custom on fail: Discard value
Runtime malloc count	Cumulative count of allocated heap objects.	Dependent item	vault.metrics.runtime.malloc.count Preprocessing Prometheus pattern: `VALUE(vault_runtime_malloc_count)` ⛔️Custom on fail: Discard value
Runtime num goroutines	Number of goroutines. This serves as a general system load indicator worth establishing a baseline and thresholds for alerting.	Dependent item	vault.metrics.runtime.num_goroutines Preprocessing Prometheus pattern: `VALUE(vault_runtime_num_goroutines)` ⛔️Custom on fail: Discard value
Runtime sys bytes	Number of bytes allocated to Vault. This includes what is being used by Vault's heap and what has been reclaimed but not given back to the operating system.	Dependent item	vault.metrics.runtime.sys.bytes Preprocessing Prometheus pattern: `VALUE(vault_runtime_sys_bytes)` ⛔️Custom on fail: Discard value
Runtime GC pause, total	The total garbage collector pause time since Vault was last started.	Dependent item	vault.metrics.total.gc.pause Preprocessing Prometheus pattern: `VALUE(vault_runtime_total_gc_pause_ns)` ⛔️Custom on fail: Discard value Custom multiplier: `1e-09`
Runtime GC runs, total	Total number of garbage collection runs since Vault was last started.	Dependent item	vault.metrics.runtime.total.gc.runs Preprocessing Prometheus pattern: `VALUE(vault_runtime_total_gc_runs)` ⛔️Custom on fail: Discard value
Token count, total	Total number of service tokens available for use; counts all un-expired and un-revoked tokens in Vault's token store. This measurement is performed every 10 minutes.	Dependent item	vault.metrics.token Preprocessing Prometheus to JSON: `vault_token_count` JSON Path: `$[?(@.name=="vault_token_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token count by auth, total	Total number of service tokens that were created by an auth method.	Dependent item	vault.metrics.token.by_auth Preprocessing Prometheus to JSON: `vault_token_count_by_auth` JSON Path: `$[?(@.name=="vault_token_count_by_auth")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token count by policy, total	Total number of service tokens that have a policy attached.	Dependent item	vault.metrics.token.by_policy Preprocessing Prometheus to JSON: `vault_token_count_by_policy` JSON Path: `$[?(@.name=="vault_token_count_by_policy")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token count by ttl, total	Number of service tokens, grouped by the TTL range they were assigned at creation.	Dependent item	vault.metrics.token.by_ttl Preprocessing Prometheus to JSON: `vault_token_count_by_ttl` JSON Path: `$[?(@.name=="vault_token_count_by_ttl")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token creation, rate	Number of service or batch tokens created.	Dependent item	vault.metrics.token.creation.rate Preprocessing Prometheus to JSON: `vault_token_creation` JSON Path: `$[?(@.name=="vault_token_creation")].value.sum()` ⛔️Custom on fail: Set value to: `0` Change per second
Secret kv entries	Number of entries in each key-value secret engine.	Dependent item	vault.metrics.secret.kv.count Preprocessing Prometheus to JSON: `vault_secret_kv_count` JSON Path: `$[?(@.name=="vault_secret_kv_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token secret lease creation, rate	Counts the number of leases created by secret engines.	Dependent item	vault.metrics.secret.lease.creation.rate Preprocessing Prometheus to JSON: `vault_secret_lease_creation` JSON Path: `$[?(@.name=="vault_secret_lease_creation")].value.sum()` ⛔️Custom on fail: Set value to: `0` Change per second
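
Most dependent items above take one sample from the Prometheus text returned by the master item, then optionally apply "Change per second" or a custom multiplier. Items that use "Prometheus to JSON" with `.value.sum()` add up the samples across all label sets. The Python sketch below imitates those steps on a hypothetical exposition snippet. It uses a simplified line parser (label values containing `}` are not handled) and is not the preprocessing code Zabbix runs.

```python
import re

# Simplified matcher for `name{labels} value` lines of the text format.
LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)')

# Hypothetical sample; metric names are taken from the table above.
EXPOSITION = """\
# TYPE vault_runtime_total_gc_pause_ns gauge
vault_runtime_total_gc_pause_ns 2500000
vault_token_count_by_auth{auth_method="token",cluster="c1"} 3
vault_token_count_by_auth{auth_method="userpass",cluster="c1"} 2
vault_barrier_get_count 120
"""


def samples(text: str, name: str) -> list:
    """Return every sample value recorded for the metric `name`."""
    values = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if match and match.group(1) == name:
            values.append(float(match.group(2)))
    return values


def change_per_second(previous: float, current: float, seconds: float) -> float:
    """Counter delta divided by the time between the two polls."""
    return (current - previous) / seconds


# Sum across label sets, like `$[?(@.name=="...")].value.sum()`.
total_by_auth = sum(samples(EXPOSITION, "vault_token_count_by_auth"))
# Custom multiplier 1e-09 converts nanoseconds to seconds.
gc_pause_seconds = samples(EXPOSITION, "vault_runtime_total_gc_pause_ns")[0] * 1e-09
```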

Triggers

Name	Description	Expression	Severity	Dependencies and additional info
HashiCorp Vault: Vault server is sealed	https://www.vaultproject.io/docs/concepts/seal	`last(/HashiCorp Vault by HTTP/vault.health.sealed)=1`	Average
HashiCorp Vault: Version has changed	Vault version has changed. Acknowledge to close the problem manually.	`last(/HashiCorp Vault by HTTP/vault.health.version,#1)<>last(/HashiCorp Vault by HTTP/vault.health.version,#2) and length(last(/HashiCorp Vault by HTTP/vault.health.version))>0`	Info	Manual close: Yes
HashiCorp Vault: Vault server is not responding		`last(/HashiCorp Vault by HTTP/vault.health.check)=0`	High
HashiCorp Vault: Failed to get metrics		`length(last(/HashiCorp Vault by HTTP/vault.get_metrics.error))>0`	Warning	Depends on: HashiCorp Vault: Vault server is sealed
HashiCorp Vault: Current number of open files is too high		`min(/HashiCorp Vault by HTTP/vault.metrics.process.open.fds,5m)/last(/HashiCorp Vault by HTTP/vault.metrics.process.max.fds)*100>{$VAULT.OPEN.FDS.MAX.WARN}`	Warning
HashiCorp Vault: Service has been restarted	Uptime is less than 10 minutes.	`last(/HashiCorp Vault by HTTP/vault.metrics.process.uptime)<10m`	Info	Manual close: Yes
HashiCorp Vault: High frequency of leadership setup failures	There have been more than {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} Vault leadership setup failures in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h))>{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}`	Average
HashiCorp Vault: High frequency of leadership losses	There have been more than {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} Vault leadership losses in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h))>{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}`	Average
HashiCorp Vault: High frequency of leadership step downs	There have been more than {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} Vault leadership step downs in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h))>{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}`	Average
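
Two of the expression patterns above are worth spelling out. The open files trigger compares the lowest descriptor count of the last 5 minutes with the limit, so a short spike does not fire it. The leadership triggers compare the growth of a counter over 1 hour (max minus min) with a threshold. A minimal Python sketch of both checks; the function names and the sample values in the usage note are hypothetical.

```python
def fds_too_high(open_fds_5m: list, max_fds: float, warn_pct: float = 90.0) -> bool:
    """min(open fds over 5m) / last(max fds) * 100 > {$VAULT.OPEN.FDS.MAX.WARN}"""
    return min(open_fds_5m) / max_fds * 100 > warn_pct


def counter_growth_exceeds(values_1h: list, threshold: float = 5) -> bool:
    """max(counter over 1h) - min(counter over 1h) > threshold"""
    return max(values_1h) - min(values_1h) > threshold
```

For example, `fds_too_high([950, 980, 930], 1024)` is true because even the lowest reading is above 90% of the limit, while a single low reading inside the window keeps the trigger quiet.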

LLD rule Storage metrics discovery

Name	Description	Type	Key and additional info
Storage metrics discovery	Storage backend metrics discovery.	Dependent item	vault.storage.discovery

Item prototypes for Storage metrics discovery

Name	Description	Type	Key and additional info
Storage [{#STORAGE}] {#OPERATION} ops, rate	Number of {#OPERATION} operations against the {#STORAGE} storage backend.	Dependent item	vault.metrics.storage.rate[{#STORAGE}, {#OPERATION}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second

LLD rule Mountpoint metrics discovery

Name	Description	Type	Key and additional info
Mountpoint metrics discovery	Mountpoint metrics discovery.	Dependent item	vault.mountpoint.discovery

Item prototypes for Mountpoint metrics discovery

Name	Description	Type	Key and additional info
Rollback attempt [{#MOUNTPOINT}] ops, rate	Number of operations to perform a rollback operation on the given mount point.	Dependent item	vault.metrics.rollback.attempt.rate[{#MOUNTPOINT}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second
Route rollback [{#MOUNTPOINT}] ops, rate	Number of operations to dispatch a rollback operation to a backend, and for that backend to process it. Rollback operations are automatically scheduled to clean up partial errors.	Dependent item	vault.metrics.route.rollback.rate[{#MOUNTPOINT}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second

LLD rule WAL metrics discovery

Name	Description	Type	Key and additional info
WAL metrics discovery	Discovery for WAL metrics.	Dependent item	vault.wal.discovery

Item prototypes for WAL metrics discovery

Name	Description	Type	Key and additional info
Delete WALs, count{#SINGLETON}	Time taken to delete a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.deletewals[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_deletewals_count)` ⛔️Custom on fail: Discard value
GC deleted WAL{#SINGLETON}	Number of Write Ahead Logs (WAL) deleted during each garbage collection run.	Dependent item	vault.metrics.wal.gc.deleted[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_gc_deleted)` ⛔️Custom on fail: Discard value
WALs on disk, total{#SINGLETON}	Total number of Write Ahead Logs (WAL) on disk.	Dependent item	vault.metrics.wal.gc.total[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_gc_total)` ⛔️Custom on fail: Discard value
Load WALs, count{#SINGLETON}	Time taken to load a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.loadWAL[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_loadWAL_count)` ⛔️Custom on fail: Discard value
Persist WALs, count{#SINGLETON}	Time taken to persist a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.persistwals[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_persistwals_count)` ⛔️Custom on fail: Discard value
Flush ready WAL, count{#SINGLETON}	Time taken to flush a ready Write Ahead Log (WAL) to storage.	Dependent item	vault.metrics.wal.flushready[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_flushready_count)` ⛔️Custom on fail: Discard value

LLD rule Replication metrics discovery

Name	Description	Type	Key and additional info
Replication metrics discovery	Discovery for replication metrics.	Dependent item	vault.replication.discovery

Item prototypes for Replication metrics discovery

Name	Description	Type	Key and additional info
Stream WAL missing guard, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is not matched/found.	Dependent item	vault.metrics.logshipper.streamWALs.missing_guard[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(logshipper_streamWALs_missing_guard)` ⛔️Custom on fail: Discard value
Stream WAL guard found, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is matched/found.	Dependent item	vault.metrics.logshipper.streamWALs.guard_found[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(logshipper_streamWALs_guard_found)` ⛔️Custom on fail: Discard value
Merkle commit index{#SINGLETON}	The last committed index in the Merkle Tree.	Dependent item	vault.metrics.replication.merkle.commit_index[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_merkle_commit_index)` ⛔️Custom on fail: Discard value
Last WAL{#SINGLETON}	The index of the last WAL.	Dependent item	vault.metrics.replication.wal.last_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_wal)` ⛔️Custom on fail: Discard value
Last DR WAL{#SINGLETON}	The index of the last DR WAL.	Dependent item	vault.metrics.replication.wal.last_dr_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_dr_wal)` ⛔️Custom on fail: Discard value
Last performance WAL{#SINGLETON}	The index of the last Performance WAL.	Dependent item	vault.metrics.replication.wal.last_performance_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_performance_wal)` ⛔️Custom on fail: Discard value
Last remote WAL{#SINGLETON}	The index of the last remote WAL.	Dependent item	vault.metrics.replication.fsm.last_remote_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_fsm_last_remote_wal)` ⛔️Custom on fail: Discard value

LLD rule Token metrics discovery

Name	Description	Type	Key and additional info
Token metrics discovery	Tokens metrics discovery.	Dependent item	vault.tokens.discovery

Item prototypes for Token metrics discovery

Name	Description	Type	Key and additional info
Token [{#TOKEN_NAME}] error	Token lookup error text.	Dependent item	vault.token_via_accessor.error["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].error.first()` Discard unchanged with heartbeat: `1h`
Token [{#TOKEN_NAME}] has TTL	The Token has TTL.	Dependent item	vault.token_via_accessor.has_ttl["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].has_ttl.first()` Boolean to decimal Discard unchanged with heartbeat: `1h`
Token [{#TOKEN_NAME}] TTL	The TTL period of the token.	Dependent item	vault.token_via_accessor.ttl["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].ttl.first()`
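
The JSON Path filters in the prototypes above pick one token's entry out of the array produced by the Get tokens item. A minimal Python sketch of that selection follows. The sample payload and the helper name `token_field` are assumptions for illustration, since the exact output of the template's script is not shown here.

```python
import json

# Hypothetical payload shaped after the fields the JSON Paths refer to.
SAMPLE_TOKENS = json.dumps([
    {"accessor": "acc-1", "has_ttl": True, "ttl": 3600, "error": ""},
    {"accessor": "acc-2", "has_ttl": False, "ttl": 0, "error": ""},
])


def token_field(tokens_json, accessor, field):
    """Return `field` from the first entry whose accessor matches.

    Roughly mirrors $.[?(@.accessor == "...")].<field>.first().
    Returns None when nothing matches; Zabbix would instead report
    a preprocessing failure for the dependent item.
    """
    for entry in json.loads(tokens_json):
        if entry.get("accessor") == accessor and field in entry:
            return entry[field]
    return None


ttl = token_field(SAMPLE_TOKENS, "acc-1", "ttl")
# "Boolean to decimal" turns true/false into 1/0.
has_ttl = int(token_field(SAMPLE_TOKENS, "acc-2", "has_ttl"))
```

Each prototype applies the same filter with a different field name, which is why all three keys are parameterized by the same {#ACCESSOR} value.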

Trigger prototypes for Token metrics discovery

Name	Expression	Severity	Dependencies and additional info
HashiCorp Vault: Token [{#TOKEN_NAME}] lookup error occurred	`length(last(/HashiCorp Vault by HTTP/vault.token_via_accessor.error["{#ACCESSOR}"]))>0`	Warning	Depends on: HashiCorp Vault: Vault server is sealed
HashiCorp Vault: Token [{#TOKEN_NAME}] will expire soon	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.CRIT}`	Average
HashiCorp Vault: Token [{#TOKEN_NAME}] will expire soon	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.WARN}`	Warning	Depends on: HashiCorp Vault: Token [{#TOKEN_NAME}] will expire soon
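
The two expiry triggers above combine the has_ttl flag with a TTL threshold, and the Warning trigger depends on the Average one. The sketch below mirrors that logic in Python, assuming the default thresholds of `3d` and `7d`. The function names are hypothetical, and the suffix parser covers only the s/m/h/d/w time suffixes.

```python
import re

# Multipliers for time suffixes; a bare number is treated as seconds.
_SUFFIX_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_time_suffix(value):
    """Convert a value such as '3d' or '7d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhdw]?)", value.strip())
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    number, suffix = match.groups()
    return int(number) * _SUFFIX_SECONDS.get(suffix, 1)


def token_expiry_severity(has_ttl, ttl_seconds, crit="3d", warn="7d"):
    """Return 'Average', 'Warning', or None for one token."""
    if has_ttl != 1:
        return None  # both expressions require has_ttl = 1
    if ttl_seconds < parse_time_suffix(crit):
        return "Average"  # the dependent Warning trigger stays suppressed
    if ttl_seconds < parse_time_suffix(warn):
        return "Warning"
    return None
```

For example, a token with five days left falls between the two thresholds, so only the Warning trigger would be expected to fire.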

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help on the ZABBIX forums.

This template is for Zabbix version: 7.0

Also available for: 7.4 7.2 6.4 6.2 6.0 5.4

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/vault_http?at=release/7.0

HashiCorp Vault by HTTP

Overview

This template monitors HashiCorp Vault with Zabbix and works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

Template Vault by HTTP — collects metrics by HTTP agent from /sys/metrics API endpoint. See https://www.vaultproject.io/api-docs/system/metrics.

Requirements

Zabbix version: 7.0 and higher.

Tested versions

This template has been tested on:

Vault 1.6

Configuration

Setup

Configure the Vault API. See Vault Configuration. Create a Vault service token and set the macro {$VAULT.TOKEN} to its value.
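
Before linking the template, it can help to confirm that the token is accepted by the metrics endpoint. The sketch below only builds the request in Python and does not send it. The host and token values are hypothetical stand-ins for the macros, and the `format=prometheus` query parameter is an assumption based on the Prometheus preprocessing used by the items.

```python
from urllib.request import Request


def build_metrics_request(scheme, host, port, token):
    """Build (but do not send) a GET request for Vault metrics."""
    url = f"{scheme}://{host}:{port}/v1/sys/metrics?format=prometheus"
    # Vault reads the client token from the X-Vault-Token header.
    return Request(url, headers={"X-Vault-Token": token}, method="GET")


# Hypothetical values for {$VAULT.API.SCHEME}, {$VAULT.HOST},
# {$VAULT.API.PORT} and {$VAULT.TOKEN}.
req = build_metrics_request("http", "vault.example.com", 8200, "s.example-token")
```

Passing `req` to `urllib.request.urlopen` would perform the call. A 403 response would suggest that the token lacks read access to the metrics path.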

Macros used

Name	Description	Default
{$VAULT.API.PORT}	Vault port.	`8200`
{$VAULT.API.SCHEME}	Vault API scheme.	`http`
{$VAULT.HOST}	Vault host name.	`<PUT YOUR VAULT HOST>`
{$VAULT.OPEN.FDS.MAX.WARN}	Maximum percentage of used file descriptors for trigger expression.	`90`
{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}	Maximum number of Vault leadership setup failures.	`5`
{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}	Maximum number of Vault leadership losses.	`5`
{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}	Maximum number of Vault leadership step downs.	`5`
{$VAULT.LLD.FILTER.STORAGE.MATCHES}	Filter of discoverable storage backends.	`.+`
{$VAULT.TOKEN}	Vault auth token.	`<PUT YOUR AUTH TOKEN>`
{$VAULT.TOKEN.ACCESSORS}	Vault accessors separated by spaces for monitoring token expiration time.
{$VAULT.TOKEN.TTL.MIN.CRIT}	Token TTL critical threshold.	`3d`
{$VAULT.TOKEN.TTL.MIN.WARN}	Token TTL warning threshold.	`7d`

Items

Name	Description	Type	Key and additional info
Get health		HTTP agent	vault.get_health Preprocessing Check for not supported value: `any error` ⛔️Custom on fail: Set value to: `{"healthcheck": 0}`
Get leader		HTTP agent	vault.get_leader Preprocessing Check for not supported value: `any error` ⛔️Custom on fail: Discard value
Get metrics		HTTP agent	vault.get_metrics Preprocessing Check for not supported value: `any error` ⛔️Custom on fail: Discard value
Clear metrics		Dependent item	vault.clear_metrics Preprocessing Check for error in JSON: `$.errors` ⛔️Custom on fail: Discard value
Get tokens	Get information about tokens via their accessors. Accessors are defined in the macro "{$VAULT.TOKEN.ACCESSORS}".	Script	vault.get_tokens
Check WAL discovery		Dependent item	vault.check_wal_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_wal_(?:.+)$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Check replication discovery		Dependent item	vault.check_replication_discovery Preprocessing Prometheus to JSON: `{__name__=~"^replication_(?:.+)$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Check storage discovery		Dependent item	vault.check_storage_discovery Preprocessing Prometheus to JSON: `{name=~"^vault_(?:.+)_(?:get
Check mountpoint discovery		Dependent item	vault.check_mountpoint_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_rollback_attempt_(?:.+?)_count$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Initialized	Initialization status.	Dependent item	vault.health.initialized Preprocessing JSON Path: `$.initialized` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Sealed	Seal status.	Dependent item	vault.health.sealed Preprocessing JSON Path: `$.sealed` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Standby	Standby status.	Dependent item	vault.health.standby Preprocessing JSON Path: `$.standby` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Performance standby	Performance standby status.	Dependent item	vault.health.performance_standby Preprocessing JSON Path: `$.performance_standby` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Performance replication	Performance replication mode https://www.vaultproject.io/docs/enterprise/replication	Dependent item	vault.health.replication_performance_mode Preprocessing JSON Path: `$.replication_performance_mode` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Disaster Recovery replication	Disaster recovery replication mode https://www.vaultproject.io/docs/enterprise/replication	Dependent item	vault.health.replication_dr_mode Preprocessing JSON Path: `$.replication_dr_mode` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Version	Server version.	Dependent item	vault.health.version Preprocessing JSON Path: `$.version` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Healthcheck	Vault healthcheck.	Dependent item	vault.health.check Preprocessing JSON Path: `$.healthcheck` ⛔️Custom on fail: Set value to: `1` Discard unchanged with heartbeat: `1h`
HA enabled	HA enabled status.	Dependent item	vault.leader.ha_enabled Preprocessing JSON Path: `$.ha_enabled` Boolean to decimal Discard unchanged with heartbeat: `1h`
Is leader	Leader status.	Dependent item	vault.leader.is_self Preprocessing JSON Path: `$.is_self` Boolean to decimal Discard unchanged with heartbeat: `1h`
Get metrics error	Get metrics error.	Dependent item	vault.get_metrics.error Preprocessing JSON Path: `$.errors[0]` ⛔️Custom on fail: Set value to: `` Discard unchanged with heartbeat: `1h`
Process CPU seconds, total	Total user and system CPU time spent in seconds.	Dependent item	vault.metrics.process.cpu.seconds.total Preprocessing Prometheus pattern: `VALUE(process_cpu_seconds_total)` ⛔️Custom on fail: Discard value
Open file descriptors, max	Maximum number of open file descriptors.	Dependent item	vault.metrics.process.max.fds Preprocessing Prometheus pattern: `VALUE(process_max_fds)` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Open file descriptors, current	Number of open file descriptors.	Dependent item	vault.metrics.process.open.fds Preprocessing Prometheus pattern: `VALUE(process_open_fds)` ⛔️Custom on fail: Discard value
Process resident memory	Resident memory size in bytes.	Dependent item	vault.metrics.process.resident_memory.bytes Preprocessing Prometheus pattern: `VALUE(process_resident_memory_bytes)` ⛔️Custom on fail: Discard value
Uptime	Server uptime.	Dependent item	vault.metrics.process.uptime Preprocessing Prometheus pattern: `VALUE(process_start_time_seconds)` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.`
Process virtual memory, current	Virtual memory size in bytes.	Dependent item	vault.metrics.process.virtual_memory.bytes Preprocessing Prometheus pattern: `VALUE(process_virtual_memory_bytes)` ⛔️Custom on fail: Discard value
Process virtual memory, max	Maximum amount of virtual memory available in bytes.	Dependent item	vault.metrics.process.virtual_memory.max.bytes Preprocessing Prometheus pattern: `VALUE(process_virtual_memory_max_bytes)` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Audit log requests, rate	Number of all audit log requests across all audit log devices.	Dependent item	vault.metrics.audit.log.request.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_request_count)` ⛔️Custom on fail: Discard value Change per second
Audit log request failures, rate	Number of audit log request failures.	Dependent item	vault.metrics.audit.log.request.failure.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_request_failure)` ⛔️Custom on fail: Discard value Change per second
Audit log response, rate	Number of audit log responses across all audit log devices.	Dependent item	vault.metrics.audit.log.response.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_response_count)` ⛔️Custom on fail: Discard value Change per second
Audit log response failures, rate	Number of audit log response failures.	Dependent item	vault.metrics.audit.log.response.failure.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_response_failure)` ⛔️Custom on fail: Discard value Change per second
Barrier DELETE ops, rate	Number of DELETE operations at the barrier.	Dependent item	vault.metrics.barrier.delete.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_delete_count)` ⛔️Custom on fail: Discard value Change per second
Barrier GET ops, rate	Number of GET operations at the barrier.	Dependent item	vault.metrics.vault.barrier.get.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_get_count)` ⛔️Custom on fail: Discard value Change per second
Barrier LIST ops, rate	Number of LIST operations at the barrier.	Dependent item	vault.metrics.barrier.list.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_list_count)` ⛔️Custom on fail: Discard value Change per second
Barrier PUT ops, rate	Number of PUT operations at the barrier.	Dependent item	vault.metrics.barrier.put.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_put_count)` ⛔️Custom on fail: Discard value Change per second
Cache hit, rate	Number of times a value was retrieved from the LRU cache.	Dependent item	vault.metrics.cache.hit.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_hit)` ⛔️Custom on fail: Discard value Change per second
Cache miss, rate	Number of times a value was not in the LRU cache. This results in a read from the configured storage.	Dependent item	vault.metrics.cache.miss.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_miss)` ⛔️Custom on fail: Discard value Change per second
Cache write, rate	Number of times a value was written to the LRU cache.	Dependent item	vault.metrics.cache.write.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_write)` ⛔️Custom on fail: Discard value Change per second
Check token, rate	Number of token checks handled by Vault core.	Dependent item	vault.metrics.core.check.token.rate Preprocessing Prometheus pattern: `VALUE(vault_core_check_token_count)` ⛔️Custom on fail: Discard value Change per second
Fetch ACL and token, rate	Number of ACL and corresponding token entry fetches handled by Vault core.	Dependent item	vault.metrics.core.fetch.acl_and_token Preprocessing Prometheus pattern: `VALUE(vault_core_fetch_acl_and_token_count)` ⛔️Custom on fail: Discard value Change per second
Requests, rate	Number of requests handled by Vault core.	Dependent item	vault.metrics.core.handle.request Preprocessing Prometheus pattern: `VALUE(vault_core_handle_request_count)` ⛔️Custom on fail: Discard value Change per second
Leadership setup failed, counter	Cluster leadership setup failures which have occurred in a highly available Vault cluster.	Dependent item	vault.metrics.core.leadership.setup_failed Preprocessing Prometheus to JSON: `vault_core_leadership_setup_failed` JSON Path: `The text is too long. Please see the template.` ⛔️Custom on fail: Set value to: `0`
Leadership setup lost, counter	Cluster leadership losses which have occurred in a highly available Vault cluster.	Dependent item	vault.metrics.core.leadership_lost Preprocessing Prometheus to JSON: `vault_core_leadership_lost_count` JSON Path: `$[?(@.name=="vault_core_leadership_lost_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Post-unseal ops, counter	Duration of time taken by post-unseal operations handled by Vault core.	Dependent item	vault.metrics.core.post_unseal Preprocessing Prometheus pattern: `VALUE(vault_core_post_unseal_count)` ⛔️Custom on fail: Discard value
Pre-seal ops, counter	Duration of time taken by pre-seal operations.	Dependent item	vault.metrics.core.pre_seal Preprocessing Prometheus pattern: `VALUE(vault_core_pre_seal_count)` ⛔️Custom on fail: Discard value
Requested seal ops, counter	Duration of time taken by requested seal operations.	Dependent item	vault.metrics.core.seal_with_request Preprocessing Prometheus pattern: `VALUE(vault_core_seal_with_request_count)` ⛔️Custom on fail: Discard value
Seal ops, counter	Duration of time taken by seal operations.	Dependent item	vault.metrics.core.seal Preprocessing Prometheus pattern: `VALUE(vault_core_seal_count)` ⛔️Custom on fail: Discard value
Internal seal ops, counter	Duration of time taken by internal seal operations.	Dependent item	vault.metrics.core.seal_internal Preprocessing Prometheus pattern: `VALUE(vault_core_seal_internal_count)` ⛔️Custom on fail: Discard value
Leadership step downs, counter	Cluster leadership step down.	Dependent item	vault.metrics.core.step_down Preprocessing Prometheus to JSON: `vault_core_step_down_count` JSON Path: `$[?(@.name=="vault_core_step_down_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Unseal ops, counter	Duration of time taken by unseal operations.	Dependent item	vault.metrics.core.unseal Preprocessing Prometheus pattern: `VALUE(vault_core_unseal_count)` ⛔️Custom on fail: Discard value
Fetch lease times, counter	Time taken to fetch lease times.	Dependent item	vault.metrics.expire.fetch.lease.times Preprocessing Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_count)` ⛔️Custom on fail: Discard value
Fetch lease times by token, counter	Time taken to fetch lease times by token.	Dependent item	vault.metrics.expire.fetch.lease.times.by_token Preprocessing Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_by_token_count)` ⛔️Custom on fail: Discard value
Number of expiring leases	Number of all leases which are eligible for eventual expiry.	Dependent item	vault.metrics.expire.num_leases Preprocessing Prometheus pattern: `VALUE(vault_expire_num_leases)` ⛔️Custom on fail: Discard value
Expire revoke, count	Time taken to revoke a token.	Dependent item	vault.metrics.expire.revoke Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_count)` ⛔️Custom on fail: Discard value
Expire revoke force, count	Time taken to forcibly revoke a token.	Dependent item	vault.metrics.expire.revoke.force Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_force_count)` ⛔️Custom on fail: Discard value
Expire revoke prefix, count	Time taken to revoke tokens on a prefix.	Dependent item	vault.metrics.expire.revoke.prefix Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_prefix_count)` ⛔️Custom on fail: Discard value
Revoke secrets by token, count	Time taken to revoke all secrets issued with a given token.	Dependent item	vault.metrics.expire.revoke.by_token Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_by_token_count)` ⛔️Custom on fail: Discard value
Expire renew, count	Time taken to renew a lease.	Dependent item	vault.metrics.expire.renew Preprocessing Prometheus pattern: `VALUE(vault_expire_renew_count)` ⛔️Custom on fail: Discard value
Renew token, count	Time taken to renew a token which does not need to invoke a logical backend.	Dependent item	vault.metrics.expire.renew_token Preprocessing Prometheus pattern: `VALUE(vault_expire_renew_token_count)` ⛔️Custom on fail: Discard value
Register ops, count	Time taken for register operations.	Dependent item	vault.metrics.expire.register Preprocessing Prometheus pattern: `VALUE(vault_expire_register_count)` ⛔️Custom on fail: Discard value
Register auth ops, count	Time taken for register authentication operations which create lease entries without lease ID.	Dependent item	vault.metrics.expire.register.auth Preprocessing Prometheus pattern: `VALUE(vault_expire_register_auth_count)` ⛔️Custom on fail: Discard value
Policy GET ops, rate	Number of operations to get a policy.	Dependent item	vault.metrics.policy.get_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_get_policy_count)` ⛔️Custom on fail: Discard value Change per second
Policy LIST ops, rate	Number of operations to list policies.	Dependent item	vault.metrics.policy.list_policies.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_list_policies_count)` ⛔️Custom on fail: Discard value Change per second
Policy DELETE ops, rate	Number of operations to delete a policy.	Dependent item	vault.metrics.policy.delete_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_delete_policy_count)` ⛔️Custom on fail: Discard value Change per second
Policy SET ops, rate	Number of operations to set a policy.	Dependent item	vault.metrics.policy.set_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_set_policy_count)` ⛔️Custom on fail: Discard value Change per second
Token create, count	The time taken to create a token.	Dependent item	vault.metrics.token.create Preprocessing Prometheus pattern: `VALUE(vault_token_create_count)` ⛔️Custom on fail: Discard value
Token createAccessor, count	The time taken to create a token accessor.	Dependent item	vault.metrics.token.createAccessor Preprocessing Prometheus pattern: `VALUE(vault_token_createAccessor_count)` ⛔️Custom on fail: Discard value
Token lookup, rate	Number of token lookups.	Dependent item	vault.metrics.token.lookup.rate Preprocessing Prometheus pattern: `VALUE(vault_token_lookup_count)` ⛔️Custom on fail: Discard value Change per second
Token revoke, count	The time taken to revoke a token.	Dependent item	vault.metrics.token.revoke Preprocessing Prometheus pattern: `VALUE(vault_token_revoke_count)` ⛔️Custom on fail: Discard value
Token revoke tree, count	Time taken to revoke a token tree.	Dependent item	vault.metrics.token.revoke.tree Preprocessing Prometheus pattern: `VALUE(vault_token_revoke_tree_count)` ⛔️Custom on fail: Discard value
Token store, count	Time taken to store an updated token entry without writing to the secondary index.	Dependent item	vault.metrics.token.store Preprocessing Prometheus pattern: `VALUE(vault_token_store_count)` ⛔️Custom on fail: Discard value
Runtime allocated bytes	Number of bytes allocated by the Vault process. This could burst from time to time, but should return to a steady state value.	Dependent item	vault.metrics.runtime.alloc.bytes Preprocessing Prometheus pattern: `VALUE(vault_runtime_alloc_bytes)` ⛔️Custom on fail: Discard value
Runtime freed objects	Number of freed objects.	Dependent item	vault.metrics.runtime.free.count Preprocessing Prometheus pattern: `VALUE(vault_runtime_free_count)` ⛔️Custom on fail: Discard value
Runtime heap objects	Number of objects on the heap. This is a good general memory pressure indicator worth establishing a baseline and thresholds for alerting.	Dependent item	vault.metrics.runtime.heap.objects Preprocessing Prometheus pattern: `VALUE(vault_runtime_heap_objects)` ⛔️Custom on fail: Discard value
Runtime malloc count	Cumulative count of allocated heap objects.	Dependent item	vault.metrics.runtime.malloc.count Preprocessing Prometheus pattern: `VALUE(vault_runtime_malloc_count)` ⛔️Custom on fail: Discard value
Runtime num goroutines	Number of goroutines. This serves as a general system load indicator worth establishing a baseline and thresholds for alerting.	Dependent item	vault.metrics.runtime.num_goroutines Preprocessing Prometheus pattern: `VALUE(vault_runtime_num_goroutines)` ⛔️Custom on fail: Discard value
Runtime sys bytes	Number of bytes allocated to Vault. This includes what is being used by Vault's heap and what has been reclaimed but not given back to the operating system.	Dependent item	vault.metrics.runtime.sys.bytes Preprocessing Prometheus pattern: `VALUE(vault_runtime_sys_bytes)` ⛔️Custom on fail: Discard value
Runtime GC pause, total	The total garbage collector pause time since Vault was last started.	Dependent item	vault.metrics.total.gc.pause Preprocessing Prometheus pattern: `VALUE(vault_runtime_total_gc_pause_ns)` ⛔️Custom on fail: Discard value Custom multiplier: `1e-09`
Runtime GC runs, total	Total number of garbage collection runs since Vault was last started.	Dependent item	vault.metrics.runtime.total.gc.runs Preprocessing Prometheus pattern: `VALUE(vault_runtime_total_gc_runs)` ⛔️Custom on fail: Discard value
Token count, total	Total number of service tokens available for use; counts all un-expired and un-revoked tokens in Vault's token store. This measurement is performed every 10 minutes.	Dependent item	vault.metrics.token Preprocessing Prometheus to JSON: `vault_token_count` JSON Path: `$[?(@.name=="vault_token_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token count by auth, total	Total number of service tokens that were created by an auth method.	Dependent item	vault.metrics.token.by_auth Preprocessing Prometheus to JSON: `vault_token_count_by_auth` JSON Path: `$[?(@.name=="vault_token_count_by_auth")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token count by policy, total	Total number of service tokens that have a policy attached.	Dependent item	vault.metrics.token.by_policy Preprocessing Prometheus to JSON: `vault_token_count_by_policy` JSON Path: `$[?(@.name=="vault_token_count_by_policy")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token count by ttl, total	Number of service tokens, grouped by the TTL range they were assigned at creation.	Dependent item	vault.metrics.token.by_ttl Preprocessing Prometheus to JSON: `vault_token_count_by_ttl` JSON Path: `$[?(@.name=="vault_token_count_by_ttl")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token creation, rate	Number of service or batch tokens created.	Dependent item	vault.metrics.token.creation.rate Preprocessing Prometheus to JSON: `vault_token_creation` JSON Path: `$[?(@.name=="vault_token_creation")].value.sum()` ⛔️Custom on fail: Set value to: `0` Change per second
Secret kv entries	Number of entries in each key-value secret engine.	Dependent item	vault.metrics.secret.kv.count Preprocessing Prometheus to JSON: `vault_secret_kv_count` JSON Path: `$[?(@.name=="vault_secret_kv_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Token secret lease creation, rate	Counts the number of leases created by secret engines.	Dependent item	vault.metrics.secret.lease.creation.rate Preprocessing Prometheus to JSON: `vault_secret_lease_creation` JSON Path: `$[?(@.name=="vault_secret_lease_creation")].value.sum()` ⛔️Custom on fail: Set value to: `0` Change per second
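
Most items in the table above follow one of two patterns: a single Prometheus value optionally turned into a rate, or a sum over all labeled series of one metric. The Python sketch below illustrates both under simplifying assumptions. The sample text, its label names, and the helper names are hypothetical, and the parser handles only plain `name{labels} value` lines.

```python
import re
from typing import List, Optional

# Matches "metric_name{optional labels} value".
_SAMPLE_LINE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)")

# Hypothetical excerpt of Prometheus-format metrics output.
SAMPLE = """\
# TYPE vault_barrier_get_count counter
vault_barrier_get_count 120
vault_token_count{namespace="root"} 4
vault_token_count{namespace="ns1"} 6
"""


def metric_values(text: str, name: str) -> List[float]:
    """Collect the values of every series with the given metric name."""
    values = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_LINE.match(line)
        if match and match.group(1) == name:
            values.append(float(match.group(3)))
    return values


def change_per_second(prev: float, prev_ts: float,
                      cur: float, cur_ts: float) -> Optional[float]:
    """Rate between two samples; None when the counter went backwards."""
    if cur < prev or cur_ts <= prev_ts:
        return None  # e.g. after a restart the sample is skipped
    return (cur - prev) / (cur_ts - prev_ts)


single = metric_values(SAMPLE, "vault_barrier_get_count")[0]  # VALUE(...) pattern
total = sum(metric_values(SAMPLE, "vault_token_count"))       # .value.sum() pattern
```

The sum pattern explains why items such as Token count, total fall back to `0` on failure: when no series exists yet, there is nothing to add up.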

Triggers

Name	Description	Expression	Severity	Dependencies and additional info
HashiCorp Vault: Vault server is sealed	https://www.vaultproject.io/docs/concepts/seal	`last(/HashiCorp Vault by HTTP/vault.health.sealed)=1`	Average
HashiCorp Vault: Version has changed	Vault version has changed. Acknowledge to close the problem manually.	`last(/HashiCorp Vault by HTTP/vault.health.version,#1)<>last(/HashiCorp Vault by HTTP/vault.health.version,#2) and length(last(/HashiCorp Vault by HTTP/vault.health.version))>0`	Info	Manual close: Yes
HashiCorp Vault: Vault server is not responding		`last(/HashiCorp Vault by HTTP/vault.health.check)=0`	High
HashiCorp Vault: Failed to get metrics		`length(last(/HashiCorp Vault by HTTP/vault.get_metrics.error))>0`	Warning	Depends on: HashiCorp Vault: Vault server is sealed
HashiCorp Vault: Current number of open files is too high		`min(/HashiCorp Vault by HTTP/vault.metrics.process.open.fds,5m)/last(/HashiCorp Vault by HTTP/vault.metrics.process.max.fds)*100>{$VAULT.OPEN.FDS.MAX.WARN}`	Warning
HashiCorp Vault: Service has been restarted	Uptime is less than 10 minutes.	`last(/HashiCorp Vault by HTTP/vault.metrics.process.uptime)<10m`	Info	Manual close: Yes
HashiCorp Vault: High frequency of leadership setup failures	There have been more than {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} Vault leadership setup failures in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h))>{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}`	Average
HashiCorp Vault: High frequency of leadership losses	There have been more than {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} Vault leadership losses in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h))>{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}`	Average
HashiCorp Vault: High frequency of leadership step downs	There have been more than {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} Vault leadership step downs in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h))>{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}`	Average
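
Two of the trigger patterns above are plain arithmetic over recent samples. The sketch below restates them in Python with hypothetical function names, assuming the default macro values of `90` and `5`. The lists stand in for the values Zabbix would hold for the 5m and 1h windows.

```python
def open_fds_too_high(open_fds_5m, max_fds, warn_percent=90.0):
    """min(open.fds,5m) / last(max.fds) * 100 > {$VAULT.OPEN.FDS.MAX.WARN}"""
    return min(open_fds_5m) / max_fds * 100 > warn_percent


def counter_growth_exceeds(samples_1h, threshold=5):
    """max(item,1h) - min(item,1h) > threshold, as in the leadership triggers."""
    return max(samples_1h) - min(samples_1h) > threshold


# Usage exceeded 90% for the whole window, so the trigger condition holds.
fds_alert = open_fds_too_high([950, 980, 990], max_fds=1024)
# The counter grew by 7 within the hour, which is more than 5.
leadership_alert = counter_growth_exceeds([2, 4, 9])
```

Using the minimum over five minutes means a short spike in open file descriptors does not fire the trigger. The comparison for counters is strict, so growth of exactly the threshold value does not fire either.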

LLD rule Storage metrics discovery

Name	Description	Type	Key and additional info
Storage metrics discovery	Storage backend metrics discovery.	Dependent item	vault.storage.discovery

Item prototypes for Storage metrics discovery

Name	Description	Type	Key and additional info
Storage [{#STORAGE}] {#OPERATION} ops, rate	Number of {#OPERATION} operations against the {#STORAGE} storage backend.	Dependent item	vault.metrics.storage.rate[{#STORAGE}, {#OPERATION}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second

LLD rule Mountpoint metrics discovery

Name	Description	Type	Key and additional info
Mountpoint metrics discovery	Mountpoint metrics discovery.	Dependent item	vault.mountpoint.discovery

Item prototypes for Mountpoint metrics discovery

Name	Description	Type	Key and additional info
Rollback attempt [{#MOUNTPOINT}] ops, rate	Number of operations to perform a rollback operation on the given mount point.	Dependent item	vault.metrics.rollback.attempt.rate[{#MOUNTPOINT}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second
Route rollback [{#MOUNTPOINT}] ops, rate	Number of operations to dispatch a rollback operation to a backend, and for that backend to process it. Rollback operations are automatically scheduled to clean up partial errors.	Dependent item	vault.metrics.route.rollback.rate[{#MOUNTPOINT}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second

LLD rule WAL metrics discovery

Name	Description	Type	Key and additional info
WAL metrics discovery	Discovery for WAL metrics.	Dependent item	vault.wal.discovery

Item prototypes for WAL metrics discovery

Name	Description	Type	Key and additional info
Delete WALs, count{#SINGLETON}	Time taken to delete a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.deletewals[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_deletewals_count)` ⛔️Custom on fail: Discard value
GC deleted WAL{#SINGLETON}	Number of Write Ahead Logs (WAL) deleted during each garbage collection run.	Dependent item	vault.metrics.wal.gc.deleted[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_gc_deleted)` ⛔️Custom on fail: Discard value
WALs on disk, total{#SINGLETON}	Total number of Write Ahead Logs (WAL) on disk.	Dependent item	vault.metrics.wal.gc.total[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_gc_total)` ⛔️Custom on fail: Discard value
Load WALs, count{#SINGLETON}	Time taken to load a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.loadWAL[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_loadWAL_count)` ⛔️Custom on fail: Discard value
Persist WALs, count{#SINGLETON}	Time taken to persist a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.persistwals[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_persistwals_count)` ⛔️Custom on fail: Discard value
Flush ready WAL, count{#SINGLETON}	Time taken to flush a ready Write Ahead Log (WAL) to storage.	Dependent item	vault.metrics.wal.flushready[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_flushready_count)` ⛔️Custom on fail: Discard value

LLD rule Replication metrics discovery

Name	Description	Type	Key and additional info
Replication metrics discovery	Discovery for replication metrics.	Dependent item	vault.replication.discovery

Item prototypes for Replication metrics discovery

Name	Description	Type	Key and additional info
Stream WAL missing guard, count{#SINGLETON}	Number of instances in which the starting Merkle Tree index used to begin streaming WAL entries is not matched/found.	Dependent item	vault.metrics.logshipper.streamWALs.missing_guard[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(logshipper_streamWALs_missing_guard)` ⛔️Custom on fail: Discard value
Stream WAL guard found, count{#SINGLETON}	Number of instances in which the starting Merkle Tree index used to begin streaming WAL entries is matched/found.	Dependent item	vault.metrics.logshipper.streamWALs.guard_found[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(logshipper_streamWALs_guard_found)` ⛔️Custom on fail: Discard value
Merkle commit index{#SINGLETON}	The last committed index in the Merkle Tree.	Dependent item	vault.metrics.replication.merkle.commit_index[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_merkle_commit_index)` ⛔️Custom on fail: Discard value
Last WAL{#SINGLETON}	The index of the last WAL.	Dependent item	vault.metrics.replication.wal.last_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_wal)` ⛔️Custom on fail: Discard value
Last DR WAL{#SINGLETON}	The index of the last DR WAL.	Dependent item	vault.metrics.replication.wal.last_dr_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_dr_wal)` ⛔️Custom on fail: Discard value
Last performance WAL{#SINGLETON}	The index of the last Performance WAL.	Dependent item	vault.metrics.replication.wal.last_performance_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_performance_wal)` ⛔️Custom on fail: Discard value
Last remote WAL{#SINGLETON}	The index of the last remote WAL.	Dependent item	vault.metrics.replication.fsm.last_remote_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_fsm_last_remote_wal)` ⛔️Custom on fail: Discard value

LLD rule Token metrics discovery

Name	Description	Type	Key and additional info
Token metrics discovery	Token metrics discovery.	Dependent item	vault.tokens.discovery

Item prototypes for Token metrics discovery

Name	Description	Type	Key and additional info
Token [{#TOKEN_NAME}] error	Token lookup error text.	Dependent item	vault.token_via_accessor.error["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].error.first()` Discard unchanged with heartbeat: `1h`
Token [{#TOKEN_NAME}] has TTL	Whether the token has a TTL.	Dependent item	vault.token_via_accessor.has_ttl["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].has_ttl.first()` Boolean to decimal Discard unchanged with heartbeat: `1h`
Token [{#TOKEN_NAME}] TTL	The TTL period of the token.	Dependent item	vault.token_via_accessor.ttl["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].ttl.first()`
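
The JSON Path steps in the table above select one field from the entry whose accessor matches `{#ACCESSOR}`. The following Python sketch mimics that selection outside of Zabbix. The payload shape (a JSON array of objects with `accessor`, `has_ttl`, `ttl`, and `error` keys) is inferred from the JSON Path expressions, and the sample values and the helper name `first_field_for_accessor` are hypothetical.

```python
import json


def first_field_for_accessor(payload: str, accessor: str, field: str):
    """Return `field` from the first entry whose accessor matches, else None.

    Rough equivalent of: $.[?(@.accessor == "<accessor>")].<field>.first()
    """
    for entry in json.loads(payload):
        if entry.get("accessor") == accessor and field in entry:
            return entry[field]
    return None


# Hypothetical master item value; the real structure may contain more keys.
sample = json.dumps([
    {"accessor": "acc-1", "has_ttl": True, "ttl": 86400, "error": ""},
    {"accessor": "acc-2", "has_ttl": False, "ttl": 0, "error": ""},
])

ttl = first_field_for_accessor(sample, "acc-1", "ttl")
# "Boolean to decimal" step: True becomes 1, False becomes 0.
has_ttl = int(first_field_for_accessor(sample, "acc-1", "has_ttl"))
```

In this sketch a missing accessor yields `None`; how Zabbix treats a JSON Path that matches nothing depends on the item's preprocessing error handling.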

Trigger prototypes for Token metrics discovery

Name	Expression	Severity	Dependencies and additional info
HashiCorp Vault: Token [{#TOKEN_NAME}] lookup error occurred	`length(last(/HashiCorp Vault by HTTP/vault.token_via_accessor.error["{#ACCESSOR}"]))>0`	Warning	Depends on: HashiCorp Vault: Vault server is sealed
HashiCorp Vault: Token [{#TOKEN_NAME}] will expire soon	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.CRIT}`	Average
HashiCorp Vault: Token [{#TOKEN_NAME}] will expire soon	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.WARN}`	Warning	Depends on: HashiCorp Vault: Token [{#TOKEN_NAME}] will expire soon
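
The two "will expire soon" triggers differ only in the threshold macro, and the Warning trigger depends on the Average one, so at most one of them is shown for a given token. A minimal Python sketch of that decision follows. It assumes the TTL item is reported in seconds and uses the default macro values `3d` and `7d`; the function names are hypothetical.

```python
# Zabbix time suffixes and their length in seconds.
SUFFIX_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def to_seconds(value: str) -> int:
    """Convert a Zabbix-style time value such as '3d' to seconds."""
    if value[-1] in SUFFIX_SECONDS:
        return int(value[:-1]) * SUFFIX_SECONDS[value[-1]]
    return int(value)


def token_expiry_severity(has_ttl: int, ttl: int, crit: str = "3d", warn: str = "7d"):
    """Return the severity that would be shown for a token, or None."""
    if has_ttl != 1:
        return None  # tokens without a TTL never fire these triggers
    if ttl < to_seconds(crit):
        return "Average"  # the dependent Warning trigger is suppressed
    if ttl < to_seconds(warn):
        return "Warning"
    return None
```

For example, a token with five days left (`ttl=432000`) falls under the Warning threshold only.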

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help at ZABBIX forums

This template is for Zabbix version: 6.4

Also available for: 7.4 7.2 7.0 6.2 6.0 5.4

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/vault_http?at=release/6.4

HashiCorp Vault by HTTP

Overview

This template monitors HashiCorp Vault with Zabbix and works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

The template Vault by HTTP collects metrics via HTTP agent from the /sys/metrics API endpoint. See https://www.vaultproject.io/api-docs/system/metrics.

Requirements

Zabbix version: 6.4 and higher.

Tested versions

This template has been tested on:

Vault 1.6

Configuration

Setup

Configure Vault API. See Vault Configuration. Create a Vault service token and set it to the macro {$VAULT.TOKEN}.
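
Before assigning the token to the macro, it can help to confirm that it is able to read the metrics endpoint. The Python sketch below only builds the request. It assumes the standard Vault HTTP API conventions (the `/v1` path prefix, the `format=prometheus` query parameter, and the `X-Vault-Token` header); the exact URL used by the template is not shown in this document, and the host and token values are placeholders.

```python
from urllib.request import Request


def build_metrics_request(scheme: str, host: str, port: str, token: str) -> Request:
    """Build a GET request for Vault metrics in Prometheus text format."""
    url = f"{scheme}://{host}:{port}/v1/sys/metrics?format=prometheus"
    return Request(url, headers={"X-Vault-Token": token})


# Placeholder values standing in for {$VAULT.API.SCHEME}, {$VAULT.HOST},
# {$VAULT.API.PORT} and {$VAULT.TOKEN}.
req = build_metrics_request("http", "vault.example.com", "8200", "s.example-token")
```

Passing `req` to `urllib.request.urlopen` would send it; a 403 response would suggest that the token's policy does not allow reading the metrics path.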

Macros used

Name	Description	Default
{$VAULT.API.PORT}	Vault port.	`8200`
{$VAULT.API.SCHEME}	Vault API scheme.	`http`
{$VAULT.HOST}	Vault host name.	`<PUT YOUR VAULT HOST>`
{$VAULT.OPEN.FDS.MAX.WARN}	Maximum percentage of used file descriptors for trigger expression.	`90`
{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}	Maximum number of Vault leadership setup failures.	`5`
{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}	Maximum number of Vault leadership losses.	`5`
{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}	Maximum number of Vault leadership step downs.	`5`
{$VAULT.LLD.FILTER.STORAGE.MATCHES}	Filter of discoverable storage backends.	`.+`
{$VAULT.TOKEN}	Vault auth token.	`<PUT YOUR AUTH TOKEN>`
{$VAULT.TOKEN.ACCESSORS}	Vault accessors separated by spaces for monitoring token expiration time.
{$VAULT.TOKEN.TTL.MIN.CRIT}	Token TTL critical threshold.	`3d`
{$VAULT.TOKEN.TTL.MIN.WARN}	Token TTL warning threshold.	`7d`

Items

Name	Description	Type	Key and additional info
Vault: Get health		HTTP agent	vault.get_health Preprocessing Check for not supported value ⛔️Custom on fail: Set value to: `{"healthcheck": 0}`
Vault: Get leader		HTTP agent	vault.get_leader Preprocessing Check for not supported value ⛔️Custom on fail: Discard value
Vault: Get metrics		HTTP agent	vault.get_metrics Preprocessing Check for not supported value ⛔️Custom on fail: Discard value
Vault: Clear metrics		Dependent item	vault.clear_metrics Preprocessing Check for error in JSON: `$.errors` ⛔️Custom on fail: Discard value
Vault: Get tokens	Get information about tokens via their accessors. Accessors are defined in the macro "{$VAULT.TOKEN.ACCESSORS}".	Script	vault.get_tokens
Vault: Check WAL discovery		Dependent item	vault.check_wal_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_wal_(?:.+)$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Vault: Check replication discovery		Dependent item	vault.check_replication_discovery Preprocessing Prometheus to JSON: `{__name__=~"^replication_(?:.+)$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Vault: Check storage discovery		Dependent item	vault.check_storage_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_(?:.+)_(?:get
Vault: Check mountpoint discovery		Dependent item	vault.check_mountpoint_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_rollback_attempt_(?:.+?)_count$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Vault: Initialized	Initialization status.	Dependent item	vault.health.initialized Preprocessing JSON Path: `$.initialized` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Sealed	Seal status.	Dependent item	vault.health.sealed Preprocessing JSON Path: `$.sealed` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Standby	Standby status.	Dependent item	vault.health.standby Preprocessing JSON Path: `$.standby` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Performance standby	Performance standby status.	Dependent item	vault.health.performance_standby Preprocessing JSON Path: `$.performance_standby` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Performance replication	Performance replication mode https://www.vaultproject.io/docs/enterprise/replication	Dependent item	vault.health.replication_performance_mode Preprocessing JSON Path: `$.replication_performance_mode` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Vault: Disaster Recovery replication	Disaster recovery replication mode https://www.vaultproject.io/docs/enterprise/replication	Dependent item	vault.health.replication_dr_mode Preprocessing JSON Path: `$.replication_dr_mode` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Vault: Version	Server version.	Dependent item	vault.health.version Preprocessing JSON Path: `$.version` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Vault: Healthcheck	Vault healthcheck.	Dependent item	vault.health.check Preprocessing JSON Path: `$.healthcheck` ⛔️Custom on fail: Set value to: `1` Discard unchanged with heartbeat: `1h`
Vault: HA enabled	HA enabled status.	Dependent item	vault.leader.ha_enabled Preprocessing JSON Path: `$.ha_enabled` Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Is leader	Leader status.	Dependent item	vault.leader.is_self Preprocessing JSON Path: `$.is_self` Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Get metrics error	Get metrics error.	Dependent item	vault.get_metrics.error Preprocessing JSON Path: `$.errors[0]` ⛔️Custom on fail: Set value to: `` Discard unchanged with heartbeat: `1h`
Vault: Process CPU seconds, total	Total user and system CPU time spent in seconds.	Dependent item	vault.metrics.process.cpu.seconds.total Preprocessing Prometheus pattern: `VALUE(process_cpu_seconds_total)` ⛔️Custom on fail: Discard value
Vault: Open file descriptors, max	Maximum number of open file descriptors.	Dependent item	vault.metrics.process.max.fds Preprocessing Prometheus pattern: `VALUE(process_max_fds)` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Vault: Open file descriptors, current	Number of open file descriptors.	Dependent item	vault.metrics.process.open.fds Preprocessing Prometheus pattern: `VALUE(process_open_fds)` ⛔️Custom on fail: Discard value
Vault: Process resident memory	Resident memory size in bytes.	Dependent item	vault.metrics.process.resident_memory.bytes Preprocessing Prometheus pattern: `VALUE(process_resident_memory_bytes)` ⛔️Custom on fail: Discard value
Vault: Uptime	Server uptime.	Dependent item	vault.metrics.process.uptime Preprocessing Prometheus pattern: `VALUE(process_start_time_seconds)` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.`
Vault: Process virtual memory, current	Virtual memory size in bytes.	Dependent item	vault.metrics.process.virtual_memory.bytes Preprocessing Prometheus pattern: `VALUE(process_virtual_memory_bytes)` ⛔️Custom on fail: Discard value
Vault: Process virtual memory, max	Maximum amount of virtual memory available in bytes.	Dependent item	vault.metrics.process.virtual_memory.max.bytes Preprocessing Prometheus pattern: `VALUE(process_virtual_memory_max_bytes)` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Vault: Audit log requests, rate	Number of all audit log requests across all audit log devices.	Dependent item	vault.metrics.audit.log.request.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_request_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Audit log request failures, rate	Number of audit log request failures.	Dependent item	vault.metrics.audit.log.request.failure.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_request_failure)` ⛔️Custom on fail: Discard value Change per second
Vault: Audit log response, rate	Number of audit log responses across all audit log devices.	Dependent item	vault.metrics.audit.log.response.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_response_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Audit log response failures, rate	Number of audit log response failures.	Dependent item	vault.metrics.audit.log.response.failure.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_response_failure)` ⛔️Custom on fail: Discard value Change per second
Vault: Barrier DELETE ops, rate	Number of DELETE operations at the barrier.	Dependent item	vault.metrics.barrier.delete.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_delete_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Barrier GET ops, rate	Number of GET operations at the barrier.	Dependent item	vault.metrics.vault.barrier.get.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_get_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Barrier LIST ops, rate	Number of LIST operations at the barrier.	Dependent item	vault.metrics.barrier.list.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_list_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Barrier PUT ops, rate	Number of PUT operations at the barrier.	Dependent item	vault.metrics.barrier.put.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_put_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Cache hit, rate	Number of times a value was retrieved from the LRU cache.	Dependent item	vault.metrics.cache.hit.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_hit)` ⛔️Custom on fail: Discard value Change per second
Vault: Cache miss, rate	Number of times a value was not in the LRU cache. This results in a read from the configured storage.	Dependent item	vault.metrics.cache.miss.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_miss)` ⛔️Custom on fail: Discard value Change per second
Vault: Cache write, rate	Number of times a value was written to the LRU cache.	Dependent item	vault.metrics.cache.write.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_write)` ⛔️Custom on fail: Discard value Change per second
Vault: Check token, rate	Number of token checks handled by Vault core.	Dependent item	vault.metrics.core.check.token.rate Preprocessing Prometheus pattern: `VALUE(vault_core_check_token_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Fetch ACL and token, rate	Number of ACL and corresponding token entry fetches handled by Vault core.	Dependent item	vault.metrics.core.fetch.acl_and_token Preprocessing Prometheus pattern: `VALUE(vault_core_fetch_acl_and_token_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Requests, rate	Number of requests handled by Vault core.	Dependent item	vault.metrics.core.handle.request Preprocessing Prometheus pattern: `VALUE(vault_core_handle_request_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Leadership setup failed, counter	Cluster leadership setup failures which have occurred in a highly available Vault cluster.	Dependent item	vault.metrics.core.leadership.setup_failed Preprocessing Prometheus to JSON: `vault_core_leadership_setup_failed` JSON Path: `The text is too long. Please see the template.` ⛔️Custom on fail: Set value to: `0`
Vault: Leadership setup lost, counter	Cluster leadership losses which have occurred in a highly available Vault cluster.	Dependent item	vault.metrics.core.leadership_lost Preprocessing Prometheus to JSON: `vault_core_leadership_lost_count` JSON Path: `$[?(@.name=="vault_core_leadership_lost_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Post-unseal ops, counter	Duration of time taken by post-unseal operations handled by Vault core.	Dependent item	vault.metrics.core.post_unseal Preprocessing Prometheus pattern: `VALUE(vault_core_post_unseal_count)` ⛔️Custom on fail: Discard value
Vault: Pre-seal ops, counter	Duration of time taken by pre-seal operations.	Dependent item	vault.metrics.core.pre_seal Preprocessing Prometheus pattern: `VALUE(vault_core_pre_seal_count)` ⛔️Custom on fail: Discard value
Vault: Requested seal ops, counter	Duration of time taken by requested seal operations.	Dependent item	vault.metrics.core.seal_with_request Preprocessing Prometheus pattern: `VALUE(vault_core_seal_with_request_count)` ⛔️Custom on fail: Discard value
Vault: Seal ops, counter	Duration of time taken by seal operations.	Dependent item	vault.metrics.core.seal Preprocessing Prometheus pattern: `VALUE(vault_core_seal_count)` ⛔️Custom on fail: Discard value
Vault: Internal seal ops, counter	Duration of time taken by internal seal operations.	Dependent item	vault.metrics.core.seal_internal Preprocessing Prometheus pattern: `VALUE(vault_core_seal_internal_count)` ⛔️Custom on fail: Discard value
Vault: Leadership step downs, counter	Cluster leadership step down.	Dependent item	vault.metrics.core.step_down Preprocessing Prometheus to JSON: `vault_core_step_down_count` JSON Path: `$[?(@.name=="vault_core_step_down_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Unseal ops, counter	Duration of time taken by unseal operations.	Dependent item	vault.metrics.core.unseal Preprocessing Prometheus pattern: `VALUE(vault_core_unseal_count)` ⛔️Custom on fail: Discard value
Vault: Fetch lease times, counter	Time taken to fetch lease times.	Dependent item	vault.metrics.expire.fetch.lease.times Preprocessing Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_count)` ⛔️Custom on fail: Discard value
Vault: Fetch lease times by token, counter	Time taken to fetch lease times by token.	Dependent item	vault.metrics.expire.fetch.lease.times.by_token Preprocessing Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_by_token_count)` ⛔️Custom on fail: Discard value
Vault: Number of expiring leases	Number of all leases which are eligible for eventual expiry.	Dependent item	vault.metrics.expire.num_leases Preprocessing Prometheus pattern: `VALUE(vault_expire_num_leases)` ⛔️Custom on fail: Discard value
Vault: Expire revoke, count	Time taken to revoke a token.	Dependent item	vault.metrics.expire.revoke Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_count)` ⛔️Custom on fail: Discard value
Vault: Expire revoke force, count	Time taken to forcibly revoke a token.	Dependent item	vault.metrics.expire.revoke.force Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_force_count)` ⛔️Custom on fail: Discard value
Vault: Expire revoke prefix, count	Time taken to revoke tokens on a prefix.	Dependent item	vault.metrics.expire.revoke.prefix Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_prefix_count)` ⛔️Custom on fail: Discard value
Vault: Revoke secrets by token, count	Time taken to revoke all secrets issued with a given token.	Dependent item	vault.metrics.expire.revoke.by_token Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_by_token_count)` ⛔️Custom on fail: Discard value
Vault: Expire renew, count	Time taken to renew a lease.	Dependent item	vault.metrics.expire.renew Preprocessing Prometheus pattern: `VALUE(vault_expire_renew_count)` ⛔️Custom on fail: Discard value
Vault: Renew token, count	Time taken to renew a token which does not need to invoke a logical backend.	Dependent item	vault.metrics.expire.renew_token Preprocessing Prometheus pattern: `VALUE(vault_expire_renew_token_count)` ⛔️Custom on fail: Discard value
Vault: Register ops, count	Time taken for register operations.	Dependent item	vault.metrics.expire.register Preprocessing Prometheus pattern: `VALUE(vault_expire_register_count)` ⛔️Custom on fail: Discard value
Vault: Register auth ops, count	Time taken for register authentication operations which create lease entries without lease ID.	Dependent item	vault.metrics.expire.register.auth Preprocessing Prometheus pattern: `VALUE(vault_expire_register_auth_count)` ⛔️Custom on fail: Discard value
Vault: Policy GET ops, rate	Number of operations to get a policy.	Dependent item	vault.metrics.policy.get_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_get_policy_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Policy LIST ops, rate	Number of operations to list policies.	Dependent item	vault.metrics.policy.list_policies.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_list_policies_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Policy DELETE ops, rate	Number of operations to delete a policy.	Dependent item	vault.metrics.policy.delete_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_delete_policy_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Policy SET ops, rate	Number of operations to set a policy.	Dependent item	vault.metrics.policy.set_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_set_policy_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Token create, count	The time taken to create a token.	Dependent item	vault.metrics.token.create Preprocessing Prometheus pattern: `VALUE(vault_token_create_count)` ⛔️Custom on fail: Discard value
Vault: Token createAccessor, count	The time taken to create a token accessor.	Dependent item	vault.metrics.token.createAccessor Preprocessing Prometheus pattern: `VALUE(vault_token_createAccessor_count)` ⛔️Custom on fail: Discard value
Vault: Token lookup, rate	Number of token lookups.	Dependent item	vault.metrics.token.lookup.rate Preprocessing Prometheus pattern: `VALUE(vault_token_lookup_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Token revoke, count	The time taken to revoke a token.	Dependent item	vault.metrics.token.revoke Preprocessing Prometheus pattern: `VALUE(vault_token_revoke_count)` ⛔️Custom on fail: Discard value
Vault: Token revoke tree, count	Time taken to revoke a token tree.	Dependent item	vault.metrics.token.revoke.tree Preprocessing Prometheus pattern: `VALUE(vault_token_revoke_tree_count)` ⛔️Custom on fail: Discard value
Vault: Token store, count	Time taken to store an updated token entry without writing to the secondary index.	Dependent item	vault.metrics.token.store Preprocessing Prometheus pattern: `VALUE(vault_token_store_count)` ⛔️Custom on fail: Discard value
Vault: Runtime allocated bytes	Number of bytes allocated by the Vault process. This could burst from time to time, but should return to a steady state value.	Dependent item	vault.metrics.runtime.alloc.bytes Preprocessing Prometheus pattern: `VALUE(vault_runtime_alloc_bytes)` ⛔️Custom on fail: Discard value
Vault: Runtime freed objects	Number of freed objects.	Dependent item	vault.metrics.runtime.free.count Preprocessing Prometheus pattern: `VALUE(vault_runtime_free_count)` ⛔️Custom on fail: Discard value
Vault: Runtime heap objects	Number of objects on the heap. This is a good general memory pressure indicator worth establishing a baseline and thresholds for alerting.	Dependent item	vault.metrics.runtime.heap.objects Preprocessing Prometheus pattern: `VALUE(vault_runtime_heap_objects)` ⛔️Custom on fail: Discard value
Vault: Runtime malloc count	Cumulative count of allocated heap objects.	Dependent item	vault.metrics.runtime.malloc.count Preprocessing Prometheus pattern: `VALUE(vault_runtime_malloc_count)` ⛔️Custom on fail: Discard value
Vault: Runtime num goroutines	Number of goroutines. This serves as a general system load indicator worth establishing a baseline and thresholds for alerting.	Dependent item	vault.metrics.runtime.num_goroutines Preprocessing Prometheus pattern: `VALUE(vault_runtime_num_goroutines)` ⛔️Custom on fail: Discard value
Vault: Runtime sys bytes	Number of bytes allocated to Vault. This includes what is being used by Vault's heap and what has been reclaimed but not given back to the operating system.	Dependent item	vault.metrics.runtime.sys.bytes Preprocessing Prometheus pattern: `VALUE(vault_runtime_sys_bytes)` ⛔️Custom on fail: Discard value
Vault: Runtime GC pause, total	The total garbage collector pause time since Vault was last started.	Dependent item	vault.metrics.total.gc.pause Preprocessing Prometheus pattern: `VALUE(vault_runtime_total_gc_pause_ns)` ⛔️Custom on fail: Discard value Custom multiplier: `1e-09`
Vault: Runtime GC runs, total	Total number of garbage collection runs since Vault was last started.	Dependent item	vault.metrics.runtime.total.gc.runs Preprocessing Prometheus pattern: `VALUE(vault_runtime_total_gc_runs)` ⛔️Custom on fail: Discard value
Vault: Token count, total	Total number of service tokens available for use; counts all un-expired and un-revoked tokens in Vault's token store. This measurement is performed every 10 minutes.	Dependent item	vault.metrics.token Preprocessing Prometheus to JSON: `vault_token_count` JSON Path: `$[?(@.name=="vault_token_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Token count by auth, total	Total number of service tokens that were created by an auth method.	Dependent item	vault.metrics.token.by_auth Preprocessing Prometheus to JSON: `vault_token_count_by_auth` JSON Path: `$[?(@.name=="vault_token_count_by_auth")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Token count by policy, total	Total number of service tokens that have a policy attached.	Dependent item	vault.metrics.token.by_policy Preprocessing Prometheus to JSON: `vault_token_count_by_policy` JSON Path: `$[?(@.name=="vault_token_count_by_policy")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Token count by ttl, total	Number of service tokens, grouped by the TTL range they were assigned at creation.	Dependent item	vault.metrics.token.by_ttl Preprocessing Prometheus to JSON: `vault_token_count_by_ttl` JSON Path: `$[?(@.name=="vault_token_count_by_ttl")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Token creation, rate	Number of service or batch tokens created.	Dependent item	vault.metrics.token.creation.rate Preprocessing Prometheus to JSON: `vault_token_creation` JSON Path: `$[?(@.name=="vault_token_creation")].value.sum()` ⛔️Custom on fail: Set value to: `0` Change per second
Vault: Secret kv entries	Number of entries in each key-value secret engine.	Dependent item	vault.metrics.secret.kv.count Preprocessing Prometheus to JSON: `vault_secret_kv_count` JSON Path: `$[?(@.name=="vault_secret_kv_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Token secret lease creation, rate	Counts the number of leases created by secret engines.	Dependent item	vault.metrics.secret.lease.creation.rate Preprocessing Prometheus to JSON: `vault_secret_lease_creation` JSON Path: `$[?(@.name=="vault_secret_lease_creation")].value.sum()` ⛔️Custom on fail: Set value to: `0` Change per second
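
Most items above follow one of two preprocessing shapes: a Prometheus pattern `VALUE(metric)` optionally followed by Change per second, or Prometheus to JSON followed by a JSON Path `sum()` over all label sets. The Python sketch below approximates both on a small sample. The sample text, its label names, and the helper names are hypothetical, and the line parser is deliberately simplified (for instance, it does not handle `}` inside label values).

```python
import re

# name, optional {labels}, value
LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def samples(text: str, metric: str) -> list:
    """Return the values of all samples whose metric name equals `metric`."""
    values = []
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comments
        match = LINE.match(line)
        if match and match.group(1) == metric:
            values.append(float(match.group(3)))
    return values


def change_per_second(prev_value, prev_time, value, time):
    """Per-second rate between two readings; None if the counter went backwards."""
    elapsed = time - prev_time
    if elapsed <= 0 or value < prev_value:
        return None
    return (value - prev_value) / elapsed


sample_text = """\
# TYPE vault_barrier_get_count counter
vault_barrier_get_count 120
vault_token_count{namespace="root"} 3
vault_token_count{namespace="ns1"} 4
"""

barrier_get = samples(sample_text, "vault_barrier_get_count")[0]  # VALUE(...)
token_total = sum(samples(sample_text, "vault_token_count"))      # ...value.sum()
```

With two readings taken 60 seconds apart, `change_per_second(120, 0, 180, 60)` gives one operation per second.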

Triggers

Name	Description	Expression	Severity	Dependencies and additional info
Vault: Vault server is sealed	https://www.vaultproject.io/docs/concepts/seal	`last(/HashiCorp Vault by HTTP/vault.health.sealed)=1`	Average
Vault: Version has changed	Vault version has changed. Acknowledge to close the problem manually.	`last(/HashiCorp Vault by HTTP/vault.health.version,#1)<>last(/HashiCorp Vault by HTTP/vault.health.version,#2) and length(last(/HashiCorp Vault by HTTP/vault.health.version))>0`	Info	Manual close: Yes
Vault: Vault server is not responding		`last(/HashiCorp Vault by HTTP/vault.health.check)=0`	High
Vault: Failed to get metrics		`length(last(/HashiCorp Vault by HTTP/vault.get_metrics.error))>0`	Warning	Depends on: Vault: Vault server is sealed
Vault: Current number of open files is too high		`min(/HashiCorp Vault by HTTP/vault.metrics.process.open.fds,5m)/last(/HashiCorp Vault by HTTP/vault.metrics.process.max.fds)*100>{$VAULT.OPEN.FDS.MAX.WARN}`	Warning
Vault: has been restarted	Uptime is less than 10 minutes.	`last(/HashiCorp Vault by HTTP/vault.metrics.process.uptime)<10m`	Info	Manual close: Yes
Vault: High frequency of leadership setup failures	There have been more than {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} Vault leadership setup failures in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h))>{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}`	Average
Vault: High frequency of leadership losses	There have been more than {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} Vault leadership losses in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h))>{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}`	Average
Vault: High frequency of leadership step downs	There have been more than {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} Vault leadership step downs in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h))>{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}`	Average
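
Two of the trigger expressions above are plain arithmetic over recent values. The sketch below restates them in Python to make the thresholds easier to reason about. The function names are hypothetical, the lists stand in for the values Zabbix would collect in the `5m` and `1h` windows, and the defaults mirror the macro defaults.

```python
def open_fds_too_high(open_fds_5m: list, max_fds: float, warn_percent: float = 90) -> bool:
    """min(open.fds, 5m) / last(max.fds) * 100 > {$VAULT.OPEN.FDS.MAX.WARN}"""
    return min(open_fds_5m) / max_fds * 100 > warn_percent


def too_many_leadership_events(counter_1h: list, threshold: float = 5) -> bool:
    """(max(counter, 1h) - min(counter, 1h)) > threshold"""
    return max(counter_1h) - min(counter_1h) > threshold
```

Using the minimum over five minutes means a single spike in open descriptors does not fire the trigger; the usage has to stay above the threshold for the whole window.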

LLD rule Storage metrics discovery

Name	Description	Type	Key and additional info
Storage metrics discovery	Storage backend metrics discovery.	Dependent item	vault.storage.discovery

Item prototypes for Storage metrics discovery

Name	Description	Type	Key and additional info
Vault: Storage [{#STORAGE}] {#OPERATION} ops, rate	Number of {#OPERATION} operations against the {#STORAGE} storage backend.	Dependent item	vault.metrics.storage.rate[{#STORAGE}, {#OPERATION}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second

LLD rule Mountpoint metrics discovery

Name	Description	Type	Key and additional info
Mountpoint metrics discovery	Mountpoint metrics discovery.	Dependent item	vault.mountpoint.discovery

Item prototypes for Mountpoint metrics discovery

Name	Description	Type	Key and additional info
Vault: Rollback attempt [{#MOUNTPOINT}] ops, rate	Number of operations to perform a rollback operation on the given mount point.	Dependent item	vault.metrics.rollback.attempt.rate[{#MOUNTPOINT}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second
Vault: Route rollback [{#MOUNTPOINT}] ops, rate	Number of operations to dispatch a rollback operation to a backend, and for that backend to process it. Rollback operations are automatically scheduled to clean up partial errors.	Dependent item	vault.metrics.route.rollback.rate[{#MOUNTPOINT}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second
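
The discovery JavaScript itself is not reproduced in this document, so the following Python sketch only illustrates how mount points could be derived from metric names. It reuses the regular expression from the "Check mountpoint discovery" item with a capturing group added, and it assumes that `{#PATTERN_C}` holds the full metric name passed to `VALUE(...)`; both are assumptions, as is the function name.

```python
import re

# Based on ^vault_rollback_attempt_(?:.+?)_count$, with the group made capturing.
ROLLBACK_ATTEMPT = re.compile(r"^vault_rollback_attempt_(.+?)_count$")


def discover_mountpoints(metric_names: list) -> list:
    """Build LLD-style rows for every rollback-attempt counter found."""
    rows = []
    for name in metric_names:
        match = ROLLBACK_ATTEMPT.match(name)
        if match:
            rows.append({"{#MOUNTPOINT}": match.group(1), "{#PATTERN_C}": name})
    return rows


rows = discover_mountpoints([
    "vault_rollback_attempt_secret__count",  # hypothetical metric for mount "secret/"
    "vault_route_rollback_secret__count",
    "process_open_fds",
])
```

Only the first name matches here, producing a single row whose `{#MOUNTPOINT}` is `secret_`.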

LLD rule WAL metrics discovery

Name	Description	Type	Key and additional info
WAL metrics discovery	Discovery for WAL metrics.	Dependent item	vault.wal.discovery

Item prototypes for WAL metrics discovery

Name	Description	Type	Key and additional info
Vault: Delete WALs, count{#SINGLETON}	Time taken to delete a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.deletewals[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_deletewals_count)` ⛔️Custom on fail: Discard value
Vault: GC deleted WAL{#SINGLETON}	Number of Write Ahead Logs (WAL) deleted during each garbage collection run.	Dependent item	vault.metrics.wal.gc.deleted[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_gc_deleted)` ⛔️Custom on fail: Discard value
Vault: WALs on disk, total{#SINGLETON}	Total number of Write Ahead Logs (WAL) on disk.	Dependent item	vault.metrics.wal.gc.total[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_gc_total)` ⛔️Custom on fail: Discard value
Vault: Load WALs, count{#SINGLETON}	Time taken to load a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.loadWAL[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_loadWAL_count)` ⛔️Custom on fail: Discard value
Vault: Persist WALs, count{#SINGLETON}	Time taken to persist a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.persistwals[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_persistwals_count)` ⛔️Custom on fail: Discard value
Vault: Flush ready WAL, count{#SINGLETON}	Time taken to flush a ready Write Ahead Log (WAL) to storage.	Dependent item	vault.metrics.wal.flushready[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_flushready_count)` ⛔️Custom on fail: Discard value

LLD rule Replication metrics discovery

Name	Description	Type	Key and additional info
Replication metrics discovery	Discovery for replication metrics.	Dependent item	vault.replication.discovery

Item prototypes for Replication metrics discovery

Name	Description	Type	Key and additional info
Vault: Stream WAL missing guard, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is not matched/found.	Dependent item	vault.metrics.logshipper.streamWALs.missing_guard[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(logshipper_streamWALs_missing_guard)` ⛔️Custom on fail: Discard value
Vault: Stream WAL guard found, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is matched/found.	Dependent item	vault.metrics.logshipper.streamWALs.guard_found[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(logshipper_streamWALs_guard_found)` ⛔️Custom on fail: Discard value
Vault: Merkle commit index{#SINGLETON}	The last committed index in the Merkle Tree.	Dependent item	vault.metrics.replication.merkle.commit_index[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_merkle_commit_index)` ⛔️Custom on fail: Discard value
Vault: Last WAL{#SINGLETON}	The index of the last WAL.	Dependent item	vault.metrics.replication.wal.last_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_wal)` ⛔️Custom on fail: Discard value
Vault: Last DR WAL{#SINGLETON}	The index of the last DR WAL.	Dependent item	vault.metrics.replication.wal.last_dr_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_dr_wal)` ⛔️Custom on fail: Discard value
Vault: Last performance WAL{#SINGLETON}	The index of the last Performance WAL.	Dependent item	vault.metrics.replication.wal.last_performance_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_performance_wal)` ⛔️Custom on fail: Discard value
Vault: Last remote WAL{#SINGLETON}	The index of the last remote WAL.	Dependent item	vault.metrics.replication.fsm.last_remote_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_fsm_last_remote_wal)` ⛔️Custom on fail: Discard value
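
The `{#SINGLETON}` suffix in the WAL and replication prototypes comes from a discovery that returns either one row with an empty macro value or no rows at all, so the items exist only when the corresponding metrics are present. The sketch below wraps the JavaScript preprocessing expression listed for the "Check WAL discovery" and "Check replication discovery" items in a function; the function name and the sample input are assumptions for illustration.

```javascript
// Sketch: turn "are there any matching metrics?" into an LLD result.
// `value` is the JSON text produced by the preceding Prometheus-to-JSON step.
function singletonDiscovery(value) {
  // One row with an empty {#SINGLETON} creates each prototype exactly once;
  // an empty array creates nothing.
  return JSON.stringify(value !== "[]" ? [{ "{#SINGLETON}": "" }] : []);
}

// Hypothetical input with one metric present, then with none.
console.log(singletonDiscovery('[{"name":"vault_wal_gc_total","value":"3"}]'));
console.log(singletonDiscovery("[]"));
```

Because the macro expands to an empty string, the discovered item names and keys read as if the suffix were not there.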

LLD rule Token metrics discovery

Name	Description	Type	Key and additional info
Token metrics discovery	Tokens metrics discovery.	Dependent item	vault.tokens.discovery

Item prototypes for Token metrics discovery

Name	Description	Type	Key and additional info
Vault: Token [{#TOKEN_NAME}] error	Token lookup error text.	Dependent item	vault.token_via_accessor.error["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].error.first()` Discard unchanged with heartbeat: `1h`
Vault: Token [{#TOKEN_NAME}] has TTL	The Token has TTL.	Dependent item	vault.token_via_accessor.has_ttl["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].has_ttl.first()` Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Token [{#TOKEN_NAME}] TTL	The TTL period of the token.	Dependent item	vault.token_via_accessor.ttl["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].ttl.first()`
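
The JSON Path expressions in the token prototypes select the entry whose `accessor` matches `{#ACCESSOR}` and take the first value of one field. A rough plain-JavaScript equivalent is sketched below. The sample array is hypothetical: its shape is inferred from the paths and is not taken from the template's script, and the helper names are illustrative.

```javascript
// Hypothetical sample of the data the token items depend on.
const sampleTokens = [
  { accessor: "acc-1", name: "ci-deploy", has_ttl: true, ttl: 86400, error: "" },
  { accessor: "acc-2", name: "root-like", has_ttl: false, ttl: 0, error: "" },
];

// Rough equivalent of $.[?(@.accessor == "<accessor>")].<field>.first()
function firstField(entries, accessor, field) {
  const matches = entries
    .filter((entry) => entry.accessor === accessor && field in entry)
    .map((entry) => entry[field]);
  if (matches.length === 0) {
    throw new Error("no data matches the path");
  }
  return matches[0];
}

// Simplified "Boolean to decimal" step: true becomes 1, false becomes 0.
function boolToDecimal(value) {
  return value ? 1 : 0;
}

console.log(firstField(sampleTokens, "acc-1", "ttl"));
console.log(boolToDecimal(firstField(sampleTokens, "acc-2", "has_ttl")));
```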

Trigger prototypes for Token metrics discovery

Name	Expression	Severity	Dependencies and additional info
Vault: Token [{#TOKEN_NAME}] lookup error occurred	`length(last(/HashiCorp Vault by HTTP/vault.token_via_accessor.error["{#ACCESSOR}"]))>0`	Warning	Depends on: Vault: Vault server is sealed
Vault: Token [{#TOKEN_NAME}] will expire soon	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.CRIT}`	Average
Vault: Token [{#TOKEN_NAME}] will expire soon	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.WARN}`	Warning	Depends on: Vault: Token [{#TOKEN_NAME}] will expire soon
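
The two "will expire soon" triggers share one condition and differ only in threshold. The Warning trigger depends on the Average one, so a token under the critical threshold raises a single problem. The sketch below mirrors that logic under two assumptions: the TTL item is in seconds, and the thresholds use the macro defaults of `3d` and `7d`. The function name is illustrative.

```javascript
const SECONDS_PER_DAY = 86400;

// Returns the severity that would be shown for a token, or null for no problem.
// hasTtl is the "has TTL" item after Boolean to decimal (1 or 0).
function tokenExpirySeverity(
  hasTtl,
  ttlSeconds,
  minCrit = 3 * SECONDS_PER_DAY, // {$VAULT.TOKEN.TTL.MIN.CRIT} default: 3d
  minWarn = 7 * SECONDS_PER_DAY  // {$VAULT.TOKEN.TTL.MIN.WARN} default: 7d
) {
  const critical = hasTtl === 1 && ttlSeconds < minCrit;
  const warning = hasTtl === 1 && ttlSeconds < minWarn;
  if (critical) {
    return "Average"; // the dependent Warning trigger is suppressed
  }
  return warning ? "Warning" : null;
}

console.log(tokenExpirySeverity(1, 2 * SECONDS_PER_DAY)); // Average
console.log(tokenExpirySeverity(1, 5 * SECONDS_PER_DAY)); // Warning
console.log(tokenExpirySeverity(0, 100));                 // null
```

Tokens without a TTL never match either expression, which is why the `has_ttl` check comes first.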

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help on the Zabbix forums.

This template is for Zabbix version: 6.2

Also available for: 7.4 7.2 7.0 6.4 6.0 5.4

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/vault_http?at=release/6.2

HashiCorp Vault by HTTP

Overview

For Zabbix version: 6.2 and higher.
This template monitors HashiCorp Vault with Zabbix and works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

The template Vault by HTTP collects metrics by HTTP agent from the /sys/metrics API endpoint. See https://www.vaultproject.io/api-docs/system/metrics.

This template was tested on:

Vault, version 1.6

Setup

Configure the Vault API (see Vault Configuration). Create a Vault service token and set it as the value of the macro {$VAULT.TOKEN}.

Zabbix configuration

No specific Zabbix configuration is required.

Macros used

Name	Description	Default
{$VAULT.API.PORT}	Vault port.	`8200`
{$VAULT.API.SCHEME}	Vault API scheme.	`http`
{$VAULT.HOST}	Vault host name.	`<PUT YOUR VAULT HOST>`
{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}	Maximum number of Vault leadership losses.	`5`
{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}	Maximum number of Vault leadership setup failed.	`5`
{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}	Maximum number of Vault leadership step downs.	`5`
{$VAULT.LLD.FILTER.STORAGE.MATCHES}	Filter of discoverable storage backends.	`.+`
{$VAULT.OPEN.FDS.MAX.WARN}	Maximum percentage of used file descriptors for trigger expression.	`90`
{$VAULT.TOKEN.ACCESSORS}	Vault accessors separated by spaces for monitoring token expiration time.	``
{$VAULT.TOKEN.TTL.MIN.CRIT}	Token TTL critical threshold.	`3d`
{$VAULT.TOKEN.TTL.MIN.WARN}	Token TTL warning threshold.	`7d`
{$VAULT.TOKEN}	Vault auth token.	`<PUT YOUR AUTH TOKEN>`
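
The scheme, host, and port macros above combine into the address the HTTP agent items query, and the token macro supplies the credential. The sketch below shows one plausible composition. The URL layout is an assumption based on Vault's `/v1/sys/metrics?format=prometheus` endpoint and `X-Vault-Token` header rather than a copy of the template's item settings, and the host and token values are placeholders.

```javascript
// Macro values: defaults from the table above plus placeholder host and token.
const macroValues = {
  "{$VAULT.API.SCHEME}": "http",
  "{$VAULT.API.PORT}": "8200",
  "{$VAULT.HOST}": "vault.example.com", // placeholder host
  "{$VAULT.TOKEN}": "example-token",    // placeholder token
};

// Replace every {$NAME} occurrence that has a known value.
function expandMacros(text, values) {
  return text.replace(/\{\$[A-Z0-9_.]+\}/g, (macro) =>
    macro in values ? values[macro] : macro
  );
}

// Assumed URL layout for the metrics request.
const metricsUrl = expandMacros(
  "{$VAULT.API.SCHEME}://{$VAULT.HOST}:{$VAULT.API.PORT}/v1/sys/metrics?format=prometheus",
  macroValues
);
const requestHeaders = { "X-Vault-Token": macroValues["{$VAULT.TOKEN}"] };

console.log(metricsUrl);
console.log(Object.keys(requestHeaders));
```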

Template links

There are no template links in this template.

Discovery rules

Name	Description	Type	Key and additional info
Mountpoint metrics discovery	Mountpoint metrics discovery.	DEPENDENT	vault.mountpoint.discovery
Replication metrics discovery	Discovery for replication metrics.	DEPENDENT	vault.replication.discovery
Storage metrics discovery	Storage backend metrics discovery.	DEPENDENT	vault.storage.discovery Filter: AND - {#STORAGE} MATCHES_REGEX `{$VAULT.LLD.FILTER.STORAGE.MATCHES}`
Token metrics discovery	Tokens metrics discovery.	DEPENDENT	vault.tokens.discovery
WAL metrics discovery	Discovery for WAL metrics.	DEPENDENT	vault.wal.discovery
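
The storage discovery rule keeps only rows whose `{#STORAGE}` value matches the regular expression in `{$VAULT.LLD.FILTER.STORAGE.MATCHES}`. The default `.+` accepts every backend. The sketch below illustrates the filter with hypothetical discovery rows, using JavaScript's `RegExp` as a stand-in for the regular expression engine Zabbix uses.

```javascript
// Keep discovery rows whose {#STORAGE} value matches the filter pattern.
function filterStorageRows(rows, pattern = ".+") {
  const regex = new RegExp(pattern);
  return rows.filter((row) => regex.test(row["{#STORAGE}"]));
}

// Hypothetical discovery rows.
const discoveredRows = [
  { "{#STORAGE}": "consul", "{#OPERATION}": "get" },
  { "{#STORAGE}": "raft", "{#OPERATION}": "put" },
];

console.log(filterStorageRows(discoveredRows).length);           // default keeps both rows
console.log(filterStorageRows(discoveredRows, "^raft$").length); // only the raft row
```

Narrowing the macro to a pattern such as `^raft$` would limit the storage items to a single backend.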

Items collected

Group	Name	Description	Type	Key and additional info
Vault	Vault: Initialized	Initialization status.	DEPENDENT	vault.health.initialized Preprocessing: - JSONPATH: `$.initialized` ⛔️ON_FAIL: `DISCARD_VALUE ->` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Sealed	Seal status.	DEPENDENT	vault.health.sealed Preprocessing: - JSONPATH: `$.sealed` ⛔️ON_FAIL: `DISCARD_VALUE ->` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Standby	Standby status.	DEPENDENT	vault.health.standby Preprocessing: - JSONPATH: `$.standby` ⛔️ON_FAIL: `DISCARD_VALUE ->` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Performance standby	Performance standby status.	DEPENDENT	vault.health.performance_standby Preprocessing: - JSONPATH: `$.performance_standby` ⛔️ON_FAIL: `DISCARD_VALUE ->` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Performance replication	Performance replication mode https://www.vaultproject.io/docs/enterprise/replication	DEPENDENT	vault.health.replication_performance_mode Preprocessing: - JSONPATH: `$.replication_performance_mode` ⛔️ON_FAIL: `DISCARD_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Disaster Recovery replication	Disaster recovery replication mode https://www.vaultproject.io/docs/enterprise/replication	DEPENDENT	vault.health.replication_dr_mode Preprocessing: - JSONPATH: `$.replication_dr_mode` ⛔️ON_FAIL: `DISCARD_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Version	Server version.	DEPENDENT	vault.health.version Preprocessing: - JSONPATH: `$.version` ⛔️ON_FAIL: `DISCARD_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Healthcheck	Vault healthcheck.	DEPENDENT	vault.health.check Preprocessing: - JSONPATH: `$.healthcheck` ⛔️ON_FAIL: `CUSTOM_VALUE -> 1` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: HA enabled	HA enabled status.	DEPENDENT	vault.leader.ha_enabled Preprocessing: - JSONPATH: `$.ha_enabled` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Is leader	Leader status.	DEPENDENT	vault.leader.is_self Preprocessing: - JSONPATH: `$.is_self` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Get metrics error	Get metrics error.	DEPENDENT	vault.get_metrics.error Preprocessing: - JSONPATH: `$.errors[0]` ⛔️ON_FAIL: `CUSTOM_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Process CPU seconds, total	Total user and system CPU time spent in seconds.	DEPENDENT	vault.metrics.process.cpu.seconds.total Preprocessing: - PROMETHEUS_PATTERN: `process_cpu_seconds_total` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Open file descriptors, max	Maximum number of open file descriptors.	DEPENDENT	vault.metrics.process.max.fds Preprocessing: - PROMETHEUS_PATTERN: `process_max_fds` ⛔️ON_FAIL: `DISCARD_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Open file descriptors, current	Number of open file descriptors.	DEPENDENT	vault.metrics.process.open.fds Preprocessing: - PROMETHEUS_PATTERN: `process_open_fds` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Process resident memory	Resident memory size in bytes.	DEPENDENT	vault.metrics.process.resident_memory.bytes Preprocessing: - PROMETHEUS_PATTERN: `process_resident_memory_bytes` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Uptime	Server uptime.	DEPENDENT	vault.metrics.process.uptime Preprocessing: - PROMETHEUS_PATTERN: `process_start_time_seconds` ⛔️ON_FAIL: `DISCARD_VALUE ->` - JAVASCRIPT: `return Math.floor(Date.now()/1000 - Number(value));`
Vault	Vault: Process virtual memory, current	Virtual memory size in bytes.	DEPENDENT	vault.metrics.process.virtual_memory.bytes Preprocessing: - PROMETHEUS_PATTERN: `process_virtual_memory_bytes` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Process virtual memory, max	Maximum amount of virtual memory available in bytes.	DEPENDENT	vault.metrics.process.virtual_memory.max.bytes Preprocessing: - PROMETHEUS_PATTERN: `process_virtual_memory_max_bytes` ⛔️ON_FAIL: `DISCARD_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Audit log requests, rate	Number of all audit log requests across all audit log devices.	DEPENDENT	vault.metrics.audit.log.request.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_audit_log_request_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Audit log request failures, rate	Number of audit log request failures.	DEPENDENT	vault.metrics.audit.log.request.failure.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_audit_log_request_failure` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Audit log response, rate	Number of audit log responses across all audit log devices.	DEPENDENT	vault.metrics.audit.log.response.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_audit_log_response_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Audit log response failures, rate	Number of audit log response failures.	DEPENDENT	vault.metrics.audit.log.response.failure.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_audit_log_response_failure` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Barrier DELETE ops, rate	Number of DELETE operations at the barrier.	DEPENDENT	vault.metrics.barrier.delete.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_barrier_delete_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Barrier GET ops, rate	Number of GET operations at the barrier.	DEPENDENT	vault.metrics.vault.barrier.get.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_barrier_get_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Barrier LIST ops, rate	Number of LIST operations at the barrier.	DEPENDENT	vault.metrics.barrier.list.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_barrier_list_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Barrier PUT ops, rate	Number of PUT operations at the barrier.	DEPENDENT	vault.metrics.barrier.put.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_barrier_put_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Cache hit, rate	Number of times a value was retrieved from the LRU cache.	DEPENDENT	vault.metrics.cache.hit.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_cache_hit` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Cache miss, rate	Number of times a value was not in the LRU cache. This results in a read from the configured storage.	DEPENDENT	vault.metrics.cache.miss.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_cache_miss` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Cache write, rate	Number of times a value was written to the LRU cache.	DEPENDENT	vault.metrics.cache.write.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_cache_write` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Check token, rate	Number of token checks handled by Vault core.	DEPENDENT	vault.metrics.core.check.token.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_core_check_token_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Fetch ACL and token, rate	Number of ACL and corresponding token entry fetches handled by Vault core.	DEPENDENT	vault.metrics.core.fetch.acl_and_token Preprocessing: - PROMETHEUS_PATTERN: `vault_core_fetch_acl_and_token_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Requests, rate	Number of requests handled by Vault core.	DEPENDENT	vault.metrics.core.handle.request Preprocessing: - PROMETHEUS_PATTERN: `vault_core_handle_request_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Leadership setup failed, counter	Cluster leadership setup failures which have occurred in a highly available Vault cluster.	DEPENDENT	vault.metrics.core.leadership.setup_failed Preprocessing: - PROMETHEUS_TO_JSON: `vault_core_leadership_setup_failed` - JSONPATH: `$[?(@.name=="vault_core_leadership_setup_failed")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Leadership setup lost, counter	Cluster leadership losses which have occurred in a highly available Vault cluster.	DEPENDENT	vault.metrics.core.leadership_lost Preprocessing: - PROMETHEUS_TO_JSON: `vault_core_leadership_lost_count` - JSONPATH: `$[?(@.name=="vault_core_leadership_lost_count")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Post-unseal ops, counter	Duration of time taken by post-unseal operations handled by Vault core.	DEPENDENT	vault.metrics.core.post_unseal Preprocessing: - PROMETHEUS_PATTERN: `vault_core_post_unseal_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Pre-seal ops, counter	Duration of time taken by pre-seal operations.	DEPENDENT	vault.metrics.core.pre_seal Preprocessing: - PROMETHEUS_PATTERN: `vault_core_pre_seal_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Requested seal ops, counter	Duration of time taken by requested seal operations.	DEPENDENT	vault.metrics.core.seal_with_request Preprocessing: - PROMETHEUS_PATTERN: `vault_core_seal_with_request_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Seal ops, counter	Duration of time taken by seal operations.	DEPENDENT	vault.metrics.core.seal Preprocessing: - PROMETHEUS_PATTERN: `vault_core_seal_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Internal seal ops, counter	Duration of time taken by internal seal operations.	DEPENDENT	vault.metrics.core.seal_internal Preprocessing: - PROMETHEUS_PATTERN: `vault_core_seal_internal_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Leadership step downs, counter	Cluster leadership step down.	DEPENDENT	vault.metrics.core.step_down Preprocessing: - PROMETHEUS_TO_JSON: `vault_core_step_down_count` - JSONPATH: `$[?(@.name=="vault_core_step_down_count")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Unseal ops, counter	Duration of time taken by unseal operations.	DEPENDENT	vault.metrics.core.unseal Preprocessing: - PROMETHEUS_PATTERN: `vault_core_unseal_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Fetch lease times, counter	Time taken to fetch lease times.	DEPENDENT	vault.metrics.expire.fetch.lease.times Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_fetch_lease_times_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Fetch lease times by token, counter	Time taken to fetch lease times by token.	DEPENDENT	vault.metrics.expire.fetch.lease.times.by_token Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_fetch_lease_times_by_token_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Number of expiring leases	Number of all leases which are eligible for eventual expiry.	DEPENDENT	vault.metrics.expire.num_leases Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_num_leases` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Expire revoke, count	Time taken to revoke a token.	DEPENDENT	vault.metrics.expire.revoke Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_revoke_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Expire revoke force, count	Time taken to forcibly revoke a token.	DEPENDENT	vault.metrics.expire.revoke.force Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_revoke_force_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Expire revoke prefix, count	Time taken to revoke tokens on a prefix.	DEPENDENT	vault.metrics.expire.revoke.prefix Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_revoke_prefix_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Revoke secrets by token, count	Time taken to revoke all secrets issued with a given token.	DEPENDENT	vault.metrics.expire.revoke.by_token Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_revoke_by_token_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Expire renew, count	Time taken to renew a lease.	DEPENDENT	vault.metrics.expire.renew Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_renew_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Renew token, count	Time taken to renew a token which does not need to invoke a logical backend.	DEPENDENT	vault.metrics.expire.renew_token Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_renew_token_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Register ops, count	Time taken for register operations.	DEPENDENT	vault.metrics.expire.register Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_register_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Register auth ops, count	Time taken for register authentication operations which create lease entries without lease ID.	DEPENDENT	vault.metrics.expire.register.auth Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_register_auth_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Policy GET ops, rate	Number of operations to get a policy.	DEPENDENT	vault.metrics.policy.get_policy.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_policy_get_policy_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Policy LIST ops, rate	Number of operations to list policies.	DEPENDENT	vault.metrics.policy.list_policies.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_policy_list_policies_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Policy DELETE ops, rate	Number of operations to delete a policy.	DEPENDENT	vault.metrics.policy.delete_policy.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_policy_delete_policy_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Policy SET ops, rate	Number of operations to set a policy.	DEPENDENT	vault.metrics.policy.set_policy.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_policy_set_policy_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Token create, count	The time taken to create a token.	DEPENDENT	vault.metrics.token.create Preprocessing: - PROMETHEUS_PATTERN: `vault_token_create_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token createAccessor, count	The time taken to create a token accessor.	DEPENDENT	vault.metrics.token.createAccessor Preprocessing: - PROMETHEUS_PATTERN: `vault_token_createAccessor_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token lookup, rate	Number of token lookups.	DEPENDENT	vault.metrics.token.lookup.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_token_lookup_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Token revoke, count	The time taken to revoke a token.	DEPENDENT	vault.metrics.token.revoke Preprocessing: - PROMETHEUS_PATTERN: `vault_token_revoke_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token revoke tree, count	Time taken to revoke a token tree.	DEPENDENT	vault.metrics.token.revoke.tree Preprocessing: - PROMETHEUS_PATTERN: `vault_token_revoke_tree_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token store, count	Time taken to store an updated token entry without writing to the secondary index.	DEPENDENT	vault.metrics.token.store Preprocessing: - PROMETHEUS_PATTERN: `vault_token_store_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime allocated bytes	Number of bytes allocated by the Vault process. This could burst from time to time, but should return to a steady state value.	DEPENDENT	vault.metrics.runtime.alloc.bytes Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_alloc_bytes` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime freed objects	Number of freed objects.	DEPENDENT	vault.metrics.runtime.free.count Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_free_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime heap objects	Number of objects on the heap. This is a good general memory pressure indicator worth establishing a baseline and thresholds for alerting.	DEPENDENT	vault.metrics.runtime.heap.objects Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_heap_objects` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime malloc count	Cumulative count of allocated heap objects.	DEPENDENT	vault.metrics.runtime.malloc.count Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_malloc_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime num goroutines	Number of goroutines. This serves as a general system load indicator worth establishing a baseline and thresholds for alerting.	DEPENDENT	vault.metrics.runtime.num_goroutines Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_num_goroutines` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime sys bytes	Number of bytes allocated to Vault. This includes what is being used by Vault's heap and what has been reclaimed but not given back to the operating system.	DEPENDENT	vault.metrics.runtime.sys.bytes Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_sys_bytes` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime GC pause, total	The total garbage collector pause time since Vault was last started.	DEPENDENT	vault.metrics.total.gc.pause Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_total_gc_pause_ns` ⛔️ON_FAIL: `DISCARD_VALUE ->` - MULTIPLIER: `1.0E-9`
Vault	Vault: Runtime GC runs, total	Total number of garbage collection runs since Vault was last started.	DEPENDENT	vault.metrics.runtime.total.gc.runs Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_total_gc_runs` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token count, total	Total number of service tokens available for use; counts all un-expired and un-revoked tokens in Vault's token store. This measurement is performed every 10 minutes.	DEPENDENT	vault.metrics.token Preprocessing: - PROMETHEUS_TO_JSON: `vault_token_count` - JSONPATH: `$[?(@.name=="vault_token_count")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Token count by auth, total	Total number of service tokens that were created by an auth method.	DEPENDENT	vault.metrics.token.by_auth Preprocessing: - PROMETHEUS_TO_JSON: `vault_token_count_by_auth` - JSONPATH: `$[?(@.name=="vault_token_count_by_auth")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Token count by policy, total	Total number of service tokens that have a policy attached.	DEPENDENT	vault.metrics.token.by_policy Preprocessing: - PROMETHEUS_TO_JSON: `vault_token_count_by_policy` - JSONPATH: `$[?(@.name=="vault_token_count_by_policy")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Token count by ttl, total	Number of service tokens, grouped by the TTL range they were assigned at creation.	DEPENDENT	vault.metrics.token.by_ttl Preprocessing: - PROMETHEUS_TO_JSON: `vault_token_count_by_ttl` - JSONPATH: `$[?(@.name=="vault_token_count_by_ttl")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Token creation, rate	Number of service or batch tokens created.	DEPENDENT	vault.metrics.token.creation.rate Preprocessing: - PROMETHEUS_TO_JSON: `vault_token_creation` - JSONPATH: `$[?(@.name=="vault_token_creation")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0` - CHANGE_PER_SECOND
Vault	Vault: Secret kv entries	Number of entries in each key-value secret engine.	DEPENDENT	vault.metrics.secret.kv.count Preprocessing: - PROMETHEUS_TO_JSON: `vault_secret_kv_count` - JSONPATH: `$[?(@.name=="vault_secret_kv_count")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Token secret lease creation, rate	Counts the number of leases created by secret engines.	DEPENDENT	vault.metrics.secret.lease.creation.rate Preprocessing: - PROMETHEUS_TO_JSON: `vault_secret_lease_creation` - JSONPATH: `$[?(@.name=="vault_secret_lease_creation")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0` - CHANGE_PER_SECOND
Vault	Vault: Storage [{#STORAGE}] {#OPERATION} ops, rate	Number of {#OPERATION} operations against the {#STORAGE} storage backend.	DEPENDENT	vault.metrics.storage.rate[{#STORAGE}, {#OPERATION}] Preprocessing: - PROMETHEUS_PATTERN: `{#PATTERN_C}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Rollback attempt [{#MOUNTPOINT}] ops, rate	Number of operations to perform a rollback operation on the given mount point.	DEPENDENT	vault.metrics.rollback.attempt.rate[{#MOUNTPOINT}] Preprocessing: - PROMETHEUS_PATTERN: `{#PATTERN_C}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Route rollback [{#MOUNTPOINT}] ops, rate	Number of operations to dispatch a rollback operation to a backend, and for that backend to process it. Rollback operations are automatically scheduled to clean up partial errors.	DEPENDENT	vault.metrics.route.rollback.rate[{#MOUNTPOINT}] Preprocessing: - PROMETHEUS_PATTERN: `{#PATTERN_C}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Delete WALs, count{#SINGLETON}	Time taken to delete a Write Ahead Log (WAL).	DEPENDENT	vault.metrics.wal.deletewals[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_deletewals_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: GC deleted WAL{#SINGLETON}	Number of Write Ahead Logs (WAL) deleted during each garbage collection run.	DEPENDENT	vault.metrics.wal.gc.deleted[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_gc_deleted` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: WALs on disk, total{#SINGLETON}	Total number of Write Ahead Logs (WAL) on disk.	DEPENDENT	vault.metrics.wal.gc.total[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_gc_total` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Load WALs, count{#SINGLETON}	Time taken to load a Write Ahead Log (WAL).	DEPENDENT	vault.metrics.wal.loadWAL[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_loadWAL_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Persist WALs, count{#SINGLETON}	Time taken to persist a Write Ahead Log (WAL).	DEPENDENT	vault.metrics.wal.persistwals[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_persistwals_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Flush ready WAL, count{#SINGLETON}	Time taken to flush a ready Write Ahead Log (WAL) to storage.	DEPENDENT	vault.metrics.wal.flushready[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_flushready_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Stream WAL missing guard, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is not matched/found.	DEPENDENT	vault.metrics.logshipper.streamWALs.missing_guard[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `logshipper_streamWALs_missing_guard` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Stream WAL guard found, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is matched/found.	DEPENDENT	vault.metrics.logshipper.streamWALs.guard_found[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `logshipper_streamWALs_guard_found` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Merkle commit index{#SINGLETON}	The last committed index in the Merkle Tree.	DEPENDENT	vault.metrics.replication.merkle.commit_index[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `replication_merkle_commit_index` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Last WAL{#SINGLETON}	The index of the last WAL.	DEPENDENT	vault.metrics.replication.wal.last_wal[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `replication_wal_last_wal` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Last DR WAL{#SINGLETON}	The index of the last DR WAL.	DEPENDENT	vault.metrics.replication.wal.last_dr_wal[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `replication_wal_last_dr_wal` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Last performance WAL{#SINGLETON}	The index of the last Performance WAL.	DEPENDENT	vault.metrics.replication.wal.last_performance_wal[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `replication_wal_last_performance_wal` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Last remote WAL{#SINGLETON}	The index of the last remote WAL.	DEPENDENT	vault.metrics.replication.fsm.last_remote_wal[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `replication_fsm_last_remote_wal` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token [{#TOKEN_NAME}] error	Token lookup error text.	DEPENDENT	vault.token_via_accessor.error["{#ACCESSOR}"] Preprocessing: - JSONPATH: `$.[?(@.accessor == "{#ACCESSOR}")].error.first()` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Token [{#TOKEN_NAME}] has TTL	The Token has TTL.	DEPENDENT	vault.token_via_accessor.has_ttl["{#ACCESSOR}"] Preprocessing: - JSONPATH: `$.[?(@.accessor == "{#ACCESSOR}")].has_ttl.first()` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Token [{#TOKEN_NAME}] TTL	The TTL period of the token.	DEPENDENT	vault.token_via_accessor.ttl["{#ACCESSOR}"] Preprocessing: - JSONPATH: `$.[?(@.accessor == "{#ACCESSOR}")].ttl.first()`
Zabbix raw items	Vault: Get health	-	HTTP_AGENT	vault.get_health Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: `CUSTOM_VALUE -> {"healthcheck": 0}`
Zabbix raw items	Vault: Get leader	-	HTTP_AGENT	vault.get_leader Preprocessing: - CHECK_NOT_SUPPORTED
Zabbix raw items	Vault: Get metrics	-	HTTP_AGENT	vault.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED
Zabbix raw items	Vault: Clear metrics	-	DEPENDENT	vault.clear_metrics Preprocessing: - CHECK_JSON_ERROR: `$.errors` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Zabbix raw items	Vault: Get tokens	Get information about tokens via their accessors. Accessors are defined in the macro "{$VAULT.TOKEN.ACCESSORS}".	SCRIPT	vault.get_tokens Expression: `The text is too long. Please see the template.`
Zabbix raw items	Vault: Check WAL discovery	-	DEPENDENT	vault.check_wal_discovery Preprocessing: - PROMETHEUS_TO_JSON: `{__name__=~"^vault_wal_(?:.+)$"}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - JAVASCRIPT: `return JSON.stringify(value !== "[]" ? [{'{#SINGLETON}': ''}] : []);` - DISCARD_UNCHANGED_HEARTBEAT: `15m`
Zabbix raw items	Vault: Check replication discovery	-	DEPENDENT	vault.check_replication_discovery Preprocessing: - PROMETHEUS_TO_JSON: `{__name__=~"^replication_(?:.+)$"}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - JAVASCRIPT: `return JSON.stringify(value !== "[]" ? [{'{#SINGLETON}': ''}] : []);` - DISCARD_UNCHANGED_HEARTBEAT: `15m`
Zabbix raw items	Vault: Check storage discovery	-	DEPENDENT	vault.check_storage_discovery Preprocessing: - PROMETHEUS_TO_JSON: `{name=~"^vault_(?:.+)_(?:get
Zabbix raw items	Vault: Check mountpoint discovery	-	DEPENDENT	vault.check_mountpoint_discovery Preprocessing: - PROMETHEUS_TO_JSON: `{__name__=~"^vault_rollback_attempt_(?:.+?)_count$"}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - JAVASCRIPT: `The text is too long. Please see the template.` - DISCARD_UNCHANGED_HEARTBEAT: `15m`
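
The "Check WAL discovery" and "Check replication discovery" items above end with a one-line JavaScript step that emits a single discovery row whenever at least one matching metric exists. The following is a minimal sketch of that step outside Zabbix. The wrapper function name `singletonDiscovery` and the sample input are assumptions made for illustration; inside Zabbix the script body simply receives `value` from the previous preprocessing step.

```javascript
// Hypothetical wrapper around the preprocessing expression quoted in the table.
// In Zabbix, `value` is the string produced by the Prometheus-to-JSON step.
function singletonDiscovery(value) {
  // "[]" means no metric matched the pattern, so nothing is discovered.
  return JSON.stringify(value !== "[]" ? [{ '{#SINGLETON}': '' }] : []);
}

// No matching metrics: an empty discovery list.
console.log(singletonDiscovery('[]'));
// At least one matching metric (sample row invented for this sketch): one row.
console.log(singletonDiscovery('[{"name":"vault_wal_gc_total","value":"3"}]'));
```

Because `{#SINGLETON}` is set to an empty string, a prototype name such as "Vault: Last WAL{#SINGLETON}" resolves to "Vault: Last WAL" on the host.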

Triggers

Name	Description	Expression	Severity	Dependencies and additional info
Vault: Vault server is sealed	https://www.vaultproject.io/docs/concepts/seal	`last(/HashiCorp Vault by HTTP/vault.health.sealed)=1`	AVERAGE
Vault: Version has changed	Vault version has changed. Acknowledge to close the problem manually.	`last(/HashiCorp Vault by HTTP/vault.health.version,#1)<>last(/HashiCorp Vault by HTTP/vault.health.version,#2) and length(last(/HashiCorp Vault by HTTP/vault.health.version))>0`	INFO	Manual close: YES
Vault: Vault server is not responding	-	`last(/HashiCorp Vault by HTTP/vault.health.check)=0`	HIGH
Vault: Failed to get metrics	-	`length(last(/HashiCorp Vault by HTTP/vault.get_metrics.error))>0`	WARNING	Depends on: - Vault: Vault server is sealed
Vault: Current number of open files is too high	-	`min(/HashiCorp Vault by HTTP/vault.metrics.process.open.fds,5m)/last(/HashiCorp Vault by HTTP/vault.metrics.process.max.fds)*100>{$VAULT.OPEN.FDS.MAX.WARN}`	WARNING
Vault: has been restarted	Uptime is less than 10 minutes.	`last(/HashiCorp Vault by HTTP/vault.metrics.process.uptime)<10m`	INFO	Manual close: YES
Vault: High frequency of leadership setup failures	There have been more than {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} Vault leadership setup failures in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h))>{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}`	AVERAGE
Vault: High frequency of leadership losses	There have been more than {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} Vault leadership losses in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h))>{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}`	AVERAGE
Vault: High frequency of leadership step downs	There have been more than {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} Vault leadership step downs in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h))>{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}`	AVERAGE
Vault: Token [{#TOKEN_NAME}] lookup error occurred	-	`length(last(/HashiCorp Vault by HTTP/vault.token_via_accessor.error["{#ACCESSOR}"]))>0`	WARNING	Depends on: - Vault: Vault server is sealed
Vault: Token [{#TOKEN_NAME}] will expire soon	-	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.CRIT}`	AVERAGE
Vault: Token [{#TOKEN_NAME}] will expire soon	-	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.WARN}`	WARNING	Depends on: - Vault: Token [{#TOKEN_NAME}] will expire soon
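
The three leadership triggers above share one pattern: they compare the growth of a counter over the past hour, `max(...,1h)-min(...,1h)`, with a threshold macro. A rough sketch of that comparison follows, assuming the hour's samples are available as a plain array. The function name `counterGrowthExceeds` is hypothetical, and returning `false` for an empty window is a simplification of how Zabbix treats missing data.

```javascript
// Illustrative only: mirrors max(item,1h) - min(item,1h) > threshold.
// `samples` stands in for the counter values collected during the last hour.
function counterGrowthExceeds(samples, threshold) {
  if (samples.length === 0) return false; // simplification: no data, no problem
  const growth = Math.max(...samples) - Math.min(...samples);
  return growth > threshold;
}

// With the default threshold of 5, a growth of 6 fires and a growth of 5 does not.
console.log(counterGrowthExceeds([10, 12, 16], 5));
console.log(counterGrowthExceeds([10, 12, 15], 5));
```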

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help with it at the Zabbix forums.

This template is for Zabbix version: 6.0

Also available for: 7.4 7.2 7.0 6.4 6.2 5.4

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/vault_http?at=release/6.0

HashiCorp Vault by HTTP

Overview

A template for monitoring HashiCorp Vault with Zabbix that works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

The Vault by HTTP template collects metrics with the HTTP agent from the /sys/metrics API endpoint. See https://www.vaultproject.io/api-docs/system/metrics.

Requirements

Zabbix version: 6.0 and higher.

Tested versions

This template has been tested on:

Vault 1.6

Configuration

Setup

Configure the Vault API. See Vault Configuration. Create a Vault service token and set it as the value of the macro {$VAULT.TOKEN}.
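
To make the role of the connection macros concrete, here is a minimal sketch that assembles a metrics request from them without making any network call. It assumes the request targets Vault's documented `/v1/sys/metrics` endpoint with `format=prometheus` and passes the token in the `X-Vault-Token` header; the exact URL and headers used by the template should be checked in the template itself. The function name `buildMetricsRequest`, the host, and the token are placeholders.

```javascript
// Sketch: build the request description from macro values (no request is sent).
// Macro names come from the "Macros used" table; the function name is hypothetical.
function buildMetricsRequest(macros) {
  const scheme = macros['{$VAULT.API.SCHEME}'];
  const host = macros['{$VAULT.HOST}'];
  const port = macros['{$VAULT.API.PORT}'];
  return {
    // Assumed endpoint: Vault's /v1/sys/metrics in Prometheus text format.
    url: scheme + '://' + host + ':' + port + '/v1/sys/metrics?format=prometheus',
    headers: { 'X-Vault-Token': macros['{$VAULT.TOKEN}'] },
  };
}

const request = buildMetricsRequest({
  '{$VAULT.API.SCHEME}': 'http',
  '{$VAULT.API.PORT}': '8200',
  '{$VAULT.HOST}': 'vault.example.com', // placeholder host
  '{$VAULT.TOKEN}': 'example-token',    // placeholder, not a real token
});
console.log(request.url);
```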

Macros used

Name	Description	Default
{$VAULT.API.PORT}	Vault port.	`8200`
{$VAULT.API.SCHEME}	Vault API scheme.	`http`
{$VAULT.HOST}	Vault host name.	`<PUT YOUR VAULT HOST>`
{$VAULT.OPEN.FDS.MAX.WARN}	Maximum percentage of used file descriptors for trigger expression.	`90`
{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}	Maximum number of Vault leadership setup failures.	`5`
{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}	Maximum number of Vault leadership losses.	`5`
{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}	Maximum number of Vault leadership step downs.	`5`
{$VAULT.LLD.FILTER.STORAGE.MATCHES}	Filter of discoverable storage backends.	`.+`
{$VAULT.TOKEN}	Vault auth token.	`<PUT YOUR AUTH TOKEN>`
{$VAULT.TOKEN.ACCESSORS}	Vault accessors separated by spaces for monitoring token expiration time.
{$VAULT.TOKEN.TTL.MIN.CRIT}	Token TTL critical threshold.	`3d`
{$VAULT.TOKEN.TTL.MIN.WARN}	Token TTL warning threshold.	`7d`
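
The two TTL thresholds above use Zabbix time suffixes (`3d`, `7d`). As an illustration of what those defaults mean in seconds, the sketch below converts a suffixed value and classifies a token TTL against both thresholds. The helper names `toSeconds` and `ttlSeverity` and the returned labels are assumptions for this example, and the check ignores the separate "has TTL" condition that the token triggers also apply.

```javascript
// Seconds per Zabbix time suffix (s, m, h, d, w).
const UNIT_SECONDS = { s: 1, m: 60, h: 3600, d: 86400, w: 604800 };

// Sketch: convert a value such as "3d" into seconds; a bare number means seconds.
function toSeconds(value) {
  const match = /^(\d+)([smhdw]?)$/.exec(value);
  if (!match) throw new Error('Unsupported time value: ' + value);
  return Number(match[1]) * UNIT_SECONDS[match[2] || 's'];
}

// Hypothetical helper: compare a TTL in seconds with the CRIT and WARN thresholds.
function ttlSeverity(ttlSeconds, crit = '3d', warn = '7d') {
  if (ttlSeconds < toSeconds(crit)) return 'critical';
  if (ttlSeconds < toSeconds(warn)) return 'warning';
  return 'ok';
}

console.log(toSeconds('3d'), toSeconds('7d')); // 259200 604800
console.log(ttlSeverity(4 * 86400));           // a 4-day TTL is below 7d only
```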

Items

Name	Description	Type	Key and additional info
Vault: Get health		HTTP agent	vault.get_health Preprocessing Check for not supported value ⛔️Custom on fail: Set value to: `{"healthcheck": 0}`
Vault: Get leader		HTTP agent	vault.get_leader Preprocessing Check for not supported value ⛔️Custom on fail: Discard value
Vault: Get metrics		HTTP agent	vault.get_metrics Preprocessing Check for not supported value ⛔️Custom on fail: Discard value
Vault: Clear metrics		Dependent item	vault.clear_metrics Preprocessing Check for error in JSON: `$.errors` ⛔️Custom on fail: Discard value
Vault: Get tokens	Get information about tokens via their accessors. Accessors are defined in the macro "{$VAULT.TOKEN.ACCESSORS}".	Script	vault.get_tokens
Vault: Check WAL discovery		Dependent item	vault.check_wal_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_wal_(?:.+)$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Vault: Check replication discovery		Dependent item	vault.check_replication_discovery Preprocessing Prometheus to JSON: `{__name__=~"^replication_(?:.+)$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Vault: Check storage discovery		Dependent item	vault.check_storage_discovery Preprocessing Prometheus to JSON: `{name=~"^vault_(?:.+)_(?:get
Vault: Check mountpoint discovery		Dependent item	vault.check_mountpoint_discovery Preprocessing Prometheus to JSON: `{__name__=~"^vault_rollback_attempt_(?:.+?)_count$"}` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.` Discard unchanged with heartbeat: `15m`
Vault: Initialized	Initialization status.	Dependent item	vault.health.initialized Preprocessing JSON Path: `$.initialized` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Sealed	Seal status.	Dependent item	vault.health.sealed Preprocessing JSON Path: `$.sealed` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Standby	Standby status.	Dependent item	vault.health.standby Preprocessing JSON Path: `$.standby` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Performance standby	Performance standby status.	Dependent item	vault.health.performance_standby Preprocessing JSON Path: `$.performance_standby` ⛔️Custom on fail: Discard value Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Performance replication	Performance replication mode https://www.vaultproject.io/docs/enterprise/replication	Dependent item	vault.health.replication_performance_mode Preprocessing JSON Path: `$.replication_performance_mode` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Vault: Disaster Recovery replication	Disaster recovery replication mode https://www.vaultproject.io/docs/enterprise/replication	Dependent item	vault.health.replication_dr_mode Preprocessing JSON Path: `$.replication_dr_mode` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Vault: Version	Server version.	Dependent item	vault.health.version Preprocessing JSON Path: `$.version` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Vault: Healthcheck	Vault healthcheck.	Dependent item	vault.health.check Preprocessing JSON Path: `$.healthcheck` ⛔️Custom on fail: Set value to: `1` Discard unchanged with heartbeat: `1h`
Vault: HA enabled	HA enabled status.	Dependent item	vault.leader.ha_enabled Preprocessing JSON Path: `$.ha_enabled` Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Is leader	Leader status.	Dependent item	vault.leader.is_self Preprocessing JSON Path: `$.is_self` Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Get metrics error	Get metrics error.	Dependent item	vault.get_metrics.error Preprocessing JSON Path: `$.errors[0]` ⛔️Custom on fail: Set value to: `` Discard unchanged with heartbeat: `1h`
Vault: Process CPU seconds, total	Total user and system CPU time spent in seconds.	Dependent item	vault.metrics.process.cpu.seconds.total Preprocessing Prometheus pattern: `VALUE(process_cpu_seconds_total)` ⛔️Custom on fail: Discard value
Vault: Open file descriptors, max	Maximum number of open file descriptors.	Dependent item	vault.metrics.process.max.fds Preprocessing Prometheus pattern: `VALUE(process_max_fds)` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Vault: Open file descriptors, current	Number of open file descriptors.	Dependent item	vault.metrics.process.open.fds Preprocessing Prometheus pattern: `VALUE(process_open_fds)` ⛔️Custom on fail: Discard value
Vault: Process resident memory	Resident memory size in bytes.	Dependent item	vault.metrics.process.resident_memory.bytes Preprocessing Prometheus pattern: `VALUE(process_resident_memory_bytes)` ⛔️Custom on fail: Discard value
Vault: Uptime	Server uptime.	Dependent item	vault.metrics.process.uptime Preprocessing Prometheus pattern: `VALUE(process_start_time_seconds)` ⛔️Custom on fail: Discard value JavaScript: `The text is too long. Please see the template.`
Vault: Process virtual memory, current	Virtual memory size in bytes.	Dependent item	vault.metrics.process.virtual_memory.bytes Preprocessing Prometheus pattern: `VALUE(process_virtual_memory_bytes)` ⛔️Custom on fail: Discard value
Vault: Process virtual memory, max	Maximum amount of virtual memory available in bytes.	Dependent item	vault.metrics.process.virtual_memory.max.bytes Preprocessing Prometheus pattern: `VALUE(process_virtual_memory_max_bytes)` ⛔️Custom on fail: Discard value Discard unchanged with heartbeat: `1h`
Vault: Audit log requests, rate	Number of all audit log requests across all audit log devices.	Dependent item	vault.metrics.audit.log.request.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_request_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Audit log request failures, rate	Number of audit log request failures.	Dependent item	vault.metrics.audit.log.request.failure.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_request_failure)` ⛔️Custom on fail: Discard value Change per second
Vault: Audit log response, rate	Number of audit log responses across all audit log devices.	Dependent item	vault.metrics.audit.log.response.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_response_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Audit log response failures, rate	Number of audit log response failures.	Dependent item	vault.metrics.audit.log.response.failure.rate Preprocessing Prometheus pattern: `VALUE(vault_audit_log_response_failure)` ⛔️Custom on fail: Discard value Change per second
Vault: Barrier DELETE ops, rate	Number of DELETE operations at the barrier.	Dependent item	vault.metrics.barrier.delete.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_delete_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Barrier GET ops, rate	Number of GET operations at the barrier.	Dependent item	vault.metrics.vault.barrier.get.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_get_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Barrier LIST ops, rate	Number of LIST operations at the barrier.	Dependent item	vault.metrics.barrier.list.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_list_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Barrier PUT ops, rate	Number of PUT operations at the barrier.	Dependent item	vault.metrics.barrier.put.rate Preprocessing Prometheus pattern: `VALUE(vault_barrier_put_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Cache hit, rate	Number of times a value was retrieved from the LRU cache.	Dependent item	vault.metrics.cache.hit.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_hit)` ⛔️Custom on fail: Discard value Change per second
Vault: Cache miss, rate	Number of times a value was not in the LRU cache. This results in a read from the configured storage.	Dependent item	vault.metrics.cache.miss.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_miss)` ⛔️Custom on fail: Discard value Change per second
Vault: Cache write, rate	Number of times a value was written to the LRU cache.	Dependent item	vault.metrics.cache.write.rate Preprocessing Prometheus pattern: `VALUE(vault_cache_write)` ⛔️Custom on fail: Discard value Change per second
Vault: Check token, rate	Number of token checks handled by Vault core.	Dependent item	vault.metrics.core.check.token.rate Preprocessing Prometheus pattern: `VALUE(vault_core_check_token_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Fetch ACL and token, rate	Number of ACL and corresponding token entry fetches handled by Vault core.	Dependent item	vault.metrics.core.fetch.acl_and_token Preprocessing Prometheus pattern: `VALUE(vault_core_fetch_acl_and_token_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Requests, rate	Number of requests handled by Vault core.	Dependent item	vault.metrics.core.handle.request Preprocessing Prometheus pattern: `VALUE(vault_core_handle_request_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Leadership setup failed, counter	Cluster leadership setup failures which have occurred in a highly available Vault cluster.	Dependent item	vault.metrics.core.leadership.setup_failed Preprocessing Prometheus to JSON: `vault_core_leadership_setup_failed` JSON Path: `The text is too long. Please see the template.` ⛔️Custom on fail: Set value to: `0`
Vault: Leadership setup lost, counter	Cluster leadership losses which have occurred in a highly available Vault cluster.	Dependent item	vault.metrics.core.leadership_lost Preprocessing Prometheus to JSON: `vault_core_leadership_lost_count` JSON Path: `$[?(@.name=="vault_core_leadership_lost_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Post-unseal ops, counter	Duration of time taken by post-unseal operations handled by Vault core.	Dependent item	vault.metrics.core.post_unseal Preprocessing Prometheus pattern: `VALUE(vault_core_post_unseal_count)` ⛔️Custom on fail: Discard value
Vault: Pre-seal ops, counter	Duration of time taken by pre-seal operations.	Dependent item	vault.metrics.core.pre_seal Preprocessing Prometheus pattern: `VALUE(vault_core_pre_seal_count)` ⛔️Custom on fail: Discard value
Vault: Requested seal ops, counter	Duration of time taken by requested seal operations.	Dependent item	vault.metrics.core.seal_with_request Preprocessing Prometheus pattern: `VALUE(vault_core_seal_with_request_count)` ⛔️Custom on fail: Discard value
Vault: Seal ops, counter	Duration of time taken by seal operations.	Dependent item	vault.metrics.core.seal Preprocessing Prometheus pattern: `VALUE(vault_core_seal_count)` ⛔️Custom on fail: Discard value
Vault: Internal seal ops, counter	Duration of time taken by internal seal operations.	Dependent item	vault.metrics.core.seal_internal Preprocessing Prometheus pattern: `VALUE(vault_core_seal_internal_count)` ⛔️Custom on fail: Discard value
Vault: Leadership step downs, counter	Cluster leadership step down.	Dependent item	vault.metrics.core.step_down Preprocessing Prometheus to JSON: `vault_core_step_down_count` JSON Path: `$[?(@.name=="vault_core_step_down_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Unseal ops, counter	Duration of time taken by unseal operations.	Dependent item	vault.metrics.core.unseal Preprocessing Prometheus pattern: `VALUE(vault_core_unseal_count)` ⛔️Custom on fail: Discard value
Vault: Fetch lease times, counter	Time taken to fetch lease times.	Dependent item	vault.metrics.expire.fetch.lease.times Preprocessing Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_count)` ⛔️Custom on fail: Discard value
Vault: Fetch lease times by token, counter	Time taken to fetch lease times by token.	Dependent item	vault.metrics.expire.fetch.lease.times.by_token Preprocessing Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_by_token_count)` ⛔️Custom on fail: Discard value
Vault: Number of expiring leases	Number of all leases which are eligible for eventual expiry.	Dependent item	vault.metrics.expire.num_leases Preprocessing Prometheus pattern: `VALUE(vault_expire_num_leases)` ⛔️Custom on fail: Discard value
Vault: Expire revoke, count	Time taken to revoke a token.	Dependent item	vault.metrics.expire.revoke Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_count)` ⛔️Custom on fail: Discard value
Vault: Expire revoke force, count	Time taken to forcibly revoke a token.	Dependent item	vault.metrics.expire.revoke.force Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_force_count)` ⛔️Custom on fail: Discard value
Vault: Expire revoke prefix, count	Time taken to revoke tokens on a prefix.	Dependent item	vault.metrics.expire.revoke.prefix Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_prefix_count)` ⛔️Custom on fail: Discard value
Vault: Revoke secrets by token, count	Time taken to revoke all secrets issued with a given token.	Dependent item	vault.metrics.expire.revoke.by_token Preprocessing Prometheus pattern: `VALUE(vault_expire_revoke_by_token_count)` ⛔️Custom on fail: Discard value
Vault: Expire renew, count	Time taken to renew a lease.	Dependent item	vault.metrics.expire.renew Preprocessing Prometheus pattern: `VALUE(vault_expire_renew_count)` ⛔️Custom on fail: Discard value
Vault: Renew token, count	Time taken to renew a token which does not need to invoke a logical backend.	Dependent item	vault.metrics.expire.renew_token Preprocessing Prometheus pattern: `VALUE(vault_expire_renew_token_count)` ⛔️Custom on fail: Discard value
Vault: Register ops, count	Time taken for register operations.	Dependent item	vault.metrics.expire.register Preprocessing Prometheus pattern: `VALUE(vault_expire_register_count)` ⛔️Custom on fail: Discard value
Vault: Register auth ops, count	Time taken for register authentication operations which create lease entries without lease ID.	Dependent item	vault.metrics.expire.register.auth Preprocessing Prometheus pattern: `VALUE(vault_expire_register_auth_count)` ⛔️Custom on fail: Discard value
Vault: Policy GET ops, rate	Number of operations to get a policy.	Dependent item	vault.metrics.policy.get_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_get_policy_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Policy LIST ops, rate	Number of operations to list policies.	Dependent item	vault.metrics.policy.list_policies.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_list_policies_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Policy DELETE ops, rate	Number of operations to delete a policy.	Dependent item	vault.metrics.policy.delete_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_delete_policy_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Policy SET ops, rate	Number of operations to set a policy.	Dependent item	vault.metrics.policy.set_policy.rate Preprocessing Prometheus pattern: `VALUE(vault_policy_set_policy_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Token create, count	The time taken to create a token.	Dependent item	vault.metrics.token.create Preprocessing Prometheus pattern: `VALUE(vault_token_create_count)` ⛔️Custom on fail: Discard value
Vault: Token createAccessor, count	The time taken to create a token accessor.	Dependent item	vault.metrics.token.createAccessor Preprocessing Prometheus pattern: `VALUE(vault_token_createAccessor_count)` ⛔️Custom on fail: Discard value
Vault: Token lookup, rate	Number of token lookups.	Dependent item	vault.metrics.token.lookup.rate Preprocessing Prometheus pattern: `VALUE(vault_token_lookup_count)` ⛔️Custom on fail: Discard value Change per second
Vault: Token revoke, count	The time taken to revoke a token.	Dependent item	vault.metrics.token.revoke Preprocessing Prometheus pattern: `VALUE(vault_token_revoke_count)` ⛔️Custom on fail: Discard value
Vault: Token revoke tree, count	Time taken to revoke a token tree.	Dependent item	vault.metrics.token.revoke.tree Preprocessing Prometheus pattern: `VALUE(vault_token_revoke_tree_count)` ⛔️Custom on fail: Discard value
Vault: Token store, count	Time taken to store an updated token entry without writing to the secondary index.	Dependent item	vault.metrics.token.store Preprocessing Prometheus pattern: `VALUE(vault_token_store_count)` ⛔️Custom on fail: Discard value
Vault: Runtime allocated bytes	Number of bytes allocated by the Vault process. This could burst from time to time, but should return to a steady state value.	Dependent item	vault.metrics.runtime.alloc.bytes Preprocessing Prometheus pattern: `VALUE(vault_runtime_alloc_bytes)` ⛔️Custom on fail: Discard value
Vault: Runtime freed objects	Number of freed objects.	Dependent item	vault.metrics.runtime.free.count Preprocessing Prometheus pattern: `VALUE(vault_runtime_free_count)` ⛔️Custom on fail: Discard value
Vault: Runtime heap objects	Number of objects on the heap. This is a good general memory pressure indicator worth establishing a baseline and thresholds for alerting.	Dependent item	vault.metrics.runtime.heap.objects Preprocessing Prometheus pattern: `VALUE(vault_runtime_heap_objects)` ⛔️Custom on fail: Discard value
Vault: Runtime malloc count	Cumulative count of allocated heap objects.	Dependent item	vault.metrics.runtime.malloc.count Preprocessing Prometheus pattern: `VALUE(vault_runtime_malloc_count)` ⛔️Custom on fail: Discard value
Vault: Runtime num goroutines	Number of goroutines. This serves as a general system load indicator worth establishing a baseline and thresholds for alerting.	Dependent item	vault.metrics.runtime.num_goroutines Preprocessing Prometheus pattern: `VALUE(vault_runtime_num_goroutines)` ⛔️Custom on fail: Discard value
Vault: Runtime sys bytes	Number of bytes allocated to Vault. This includes what is being used by Vault's heap and what has been reclaimed but not given back to the operating system.	Dependent item	vault.metrics.runtime.sys.bytes Preprocessing Prometheus pattern: `VALUE(vault_runtime_sys_bytes)` ⛔️Custom on fail: Discard value
Vault: Runtime GC pause, total	The total garbage collector pause time since Vault was last started.	Dependent item	vault.metrics.total.gc.pause Preprocessing Prometheus pattern: `VALUE(vault_runtime_total_gc_pause_ns)` ⛔️Custom on fail: Discard value Custom multiplier: `1e-09`
Vault: Runtime GC runs, total	Total number of garbage collection runs since Vault was last started.	Dependent item	vault.metrics.runtime.total.gc.runs Preprocessing Prometheus pattern: `VALUE(vault_runtime_total_gc_runs)` ⛔️Custom on fail: Discard value
Vault: Token count, total	Total number of service tokens available for use; counts all un-expired and un-revoked tokens in Vault's token store. This measurement is performed every 10 minutes.	Dependent item	vault.metrics.token Preprocessing Prometheus to JSON: `vault_token_count` JSON Path: `$[?(@.name=="vault_token_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Token count by auth, total	Total number of service tokens that were created by an auth method.	Dependent item	vault.metrics.token.by_auth Preprocessing Prometheus to JSON: `vault_token_count_by_auth` JSON Path: `$[?(@.name=="vault_token_count_by_auth")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Token count by policy, total	Total number of service tokens that have a policy attached.	Dependent item	vault.metrics.token.by_policy Preprocessing Prometheus to JSON: `vault_token_count_by_policy` JSON Path: `$[?(@.name=="vault_token_count_by_policy")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Token count by ttl, total	Number of service tokens, grouped by the TTL range they were assigned at creation.	Dependent item	vault.metrics.token.by_ttl Preprocessing Prometheus to JSON: `vault_token_count_by_ttl` JSON Path: `$[?(@.name=="vault_token_count_by_ttl")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Token creation, rate	Number of service or batch tokens created.	Dependent item	vault.metrics.token.creation.rate Preprocessing Prometheus to JSON: `vault_token_creation` JSON Path: `$[?(@.name=="vault_token_creation")].value.sum()` ⛔️Custom on fail: Set value to: `0` Change per second
Vault: Secret kv entries	Number of entries in each key-value secret engine.	Dependent item	vault.metrics.secret.kv.count Preprocessing Prometheus to JSON: `vault_secret_kv_count` JSON Path: `$[?(@.name=="vault_secret_kv_count")].value.sum()` ⛔️Custom on fail: Set value to: `0`
Vault: Token secret lease creation, rate	Counts the number of leases created by secret engines.	Dependent item	vault.metrics.secret.lease.creation.rate Preprocessing Prometheus to JSON: `vault_secret_lease_creation` JSON Path: `$[?(@.name=="vault_secret_lease_creation")].value.sum()` ⛔️Custom on fail: Set value to: `0` Change per second
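
Several items above combine a "Prometheus to JSON" step with a JSONPath such as `$[?(@.name=="vault_token_count")].value.sum()`, which adds up one metric across all of its label sets. A plain JavaScript sketch of that aggregation is shown below. The function name `sumMetric` and the sample rows (including their labels) are invented for illustration, and the `0` fallback stands in for the "Set value to: `0`" on-fail handler.

```javascript
// Sketch of `$[?(@.name=="<metric>")].value.sum()` in plain JavaScript.
// `rows` imitates Prometheus-to-JSON output; only a few fields are shown.
function sumMetric(rows, metricName) {
  const matches = rows.filter((row) => row.name === metricName);
  // Models "Custom on fail: Set value to 0" when nothing matches.
  if (matches.length === 0) return 0;
  return matches.reduce((total, row) => total + Number(row.value), 0);
}

// Sample data invented for this sketch.
const rows = [
  { name: 'vault_token_count', value: '4', labels: { namespace: 'root' } },
  { name: 'vault_token_count', value: '6', labels: { namespace: 'team-a' } },
  { name: 'vault_secret_kv_count', value: '12', labels: { mount_point: 'secret/' } },
];

console.log(sumMetric(rows, 'vault_token_count'));    // 10
console.log(sumMetric(rows, 'vault_token_creation')); // 0
```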

Triggers

Name	Description	Expression	Severity	Dependencies and additional info
Vault: Vault server is sealed	https://www.vaultproject.io/docs/concepts/seal	`last(/HashiCorp Vault by HTTP/vault.health.sealed)=1`	Average
Vault: Version has changed	Vault version has changed. Acknowledge to close the problem manually.	`last(/HashiCorp Vault by HTTP/vault.health.version,#1)<>last(/HashiCorp Vault by HTTP/vault.health.version,#2) and length(last(/HashiCorp Vault by HTTP/vault.health.version))>0`	Info	Manual close: Yes
Vault: Vault server is not responding		`last(/HashiCorp Vault by HTTP/vault.health.check)=0`	High
Vault: Failed to get metrics		`length(last(/HashiCorp Vault by HTTP/vault.get_metrics.error))>0`	Warning	Depends on: Vault: Vault server is sealed
Vault: Current number of open files is too high		`min(/HashiCorp Vault by HTTP/vault.metrics.process.open.fds,5m)/last(/HashiCorp Vault by HTTP/vault.metrics.process.max.fds)*100>{$VAULT.OPEN.FDS.MAX.WARN}`	Warning
Vault: has been restarted	Uptime is less than 10 minutes.	`last(/HashiCorp Vault by HTTP/vault.metrics.process.uptime)<10m`	Info	Manual close: Yes
Vault: High frequency of leadership setup failures	There have been more than {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} Vault leadership setup failures in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h))>{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}`	Average
Vault: High frequency of leadership losses	There have been more than {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} Vault leadership losses in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h))>{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}`	Average
Vault: High frequency of leadership step downs	There have been more than {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} Vault leadership step downs in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h))>{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}`	Average
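
The "Current number of open files is too high" trigger above divides the lowest open-descriptor count of the last 5 minutes by the latest descriptor limit and compares the percentage with `{$VAULT.OPEN.FDS.MAX.WARN}`. The sketch below reproduces that arithmetic under the assumption that the 5-minute samples are available as an array; the function name `openFilesTooHigh` is hypothetical.

```javascript
// Illustrative: min(open.fds,5m) / last(max.fds) * 100 > {$VAULT.OPEN.FDS.MAX.WARN}
function openFilesTooHigh(openFdsLast5m, maxFds, warnPercent = 90) {
  const lowestOpen = Math.min(...openFdsLast5m);
  return (lowestOpen / maxFds) * 100 > warnPercent;
}

// Lowest sample 930 of 1024 is about 90.8%, above the default 90.
console.log(openFilesTooHigh([950, 930, 990], 1024));
// One sample at 600 (about 58.6%) keeps the trigger quiet.
console.log(openFilesTooHigh([950, 600, 990], 1024));
```

Using the minimum over the window means a short spike does not fire the trigger; usage has to stay above the threshold for the whole 5 minutes.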

LLD rule Storage metrics discovery

Name	Description	Type	Key and additional info
Storage metrics discovery	Storage backend metrics discovery.	Dependent item	vault.storage.discovery

Item prototypes for Storage metrics discovery

Name	Description	Type	Key and additional info
Vault: Storage [{#STORAGE}] {#OPERATION} ops, rate	Number of {#OPERATION} operations against the {#STORAGE} storage backend.	Dependent item	vault.metrics.storage.rate[{#STORAGE}, {#OPERATION}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second
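
The rate items in this template rely on the "Change per second" preprocessing step, which turns an ever-growing counter into a per-second rate. A minimal sketch of that calculation follows. The function name `changePerSecond` and the sample field names `clock` and `value` are assumptions for this example, and returning `null` when the counter goes backwards is a simplified stand-in for Zabbix discarding such a sample.

```javascript
// Sketch of "Change per second": (value - previous value) / elapsed seconds.
// `clock` is a Unix timestamp in seconds; both field names are hypothetical.
function changePerSecond(previous, current) {
  const elapsed = current.clock - previous.clock;
  if (elapsed <= 0) throw new Error('samples must be in chronological order');
  // A counter that went backwards (for example after a restart) yields no rate.
  if (current.value < previous.value) return null;
  return (current.value - previous.value) / elapsed;
}

// 120 additional operations over 60 seconds is 2 operations per second.
console.log(changePerSecond({ clock: 1000, value: 500 }, { clock: 1060, value: 620 }));
```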

LLD rule Mountpoint metrics discovery

Name	Description	Type	Key and additional info
Mountpoint metrics discovery	Mountpoint metrics discovery.	Dependent item	vault.mountpoint.discovery

Item prototypes for Mountpoint metrics discovery

Name	Description	Type	Key and additional info
Vault: Rollback attempt [{#MOUNTPOINT}] ops, rate	Number of operations to perform a rollback operation on the given mount point.	Dependent item	vault.metrics.rollback.attempt.rate[{#MOUNTPOINT}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second
Vault: Route rollback [{#MOUNTPOINT}] ops, rate	Number of operations to dispatch a rollback operation to a backend, and for that backend to process it. Rollback operations are automatically scheduled to clean up partial errors.	Dependent item	vault.metrics.route.rollback.rate[{#MOUNTPOINT}] Preprocessing Prometheus pattern: `VALUE({#PATTERN_C})` ⛔️Custom on fail: Discard value Change per second

LLD rule WAL metrics discovery

Name	Description	Type	Key and additional info
WAL metrics discovery	Discovery for WAL metrics.	Dependent item	vault.wal.discovery

Item prototypes for WAL metrics discovery

Name	Description	Type	Key and additional info
Vault: Delete WALs, count{#SINGLETON}	Time taken to delete a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.deletewals[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_deletewals_count)` ⛔️Custom on fail: Discard value
Vault: GC deleted WAL{#SINGLETON}	Number of Write Ahead Logs (WAL) deleted during each garbage collection run.	Dependent item	vault.metrics.wal.gc.deleted[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_gc_deleted)` ⛔️Custom on fail: Discard value
Vault: WALs on disk, total{#SINGLETON}	Total number of Write Ahead Logs (WAL) on disk.	Dependent item	vault.metrics.wal.gc.total[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_gc_total)` ⛔️Custom on fail: Discard value
Vault: Load WALs, count{#SINGLETON}	Time taken to load a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.loadWAL[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_loadWAL_count)` ⛔️Custom on fail: Discard value
Vault: Persist WALs, count{#SINGLETON}	Time taken to persist a Write Ahead Log (WAL).	Dependent item	vault.metrics.wal.persistwals[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_persistwals_count)` ⛔️Custom on fail: Discard value
Vault: Flush ready WAL, count{#SINGLETON}	Time taken to flush a ready Write Ahead Log (WAL) to storage.	Dependent item	vault.metrics.wal.flushready[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(vault_wal_flushready_count)` ⛔️Custom on fail: Discard value
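
The `{#SINGLETON}` macro in the item names above is used to create these items only when the corresponding metrics exist. The 5.4 item list later in this document shows the JavaScript step behind it for "Vault: Check WAL discovery": it returns one discovery row with an empty `{#SINGLETON}` when the Prometheus-to-JSON result is not `[]`, and no rows otherwise. The Python below mirrors that one-line script as a sketch; the function name is hypothetical.

```python
import json


def singleton_discovery(prometheus_to_json_output: str) -> str:
    """Return LLD JSON: one row with an empty {#SINGLETON}, or no rows at all.

    Mirrors: JSON.stringify(value !== "[]" ? [{'{#SINGLETON}': ''}] : [])
    """
    rows = [{"{#SINGLETON}": ""}] if prometheus_to_json_output != "[]" else []
    return json.dumps(rows)


print(singleton_discovery("[]"))
print(singleton_discovery('[{"name": "vault_wal_gc_total", "value": "3"}]'))
```

Since the macro expands to an empty string, the discovered item names and keys read as if they were ordinary, non-discovered items.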

LLD rule Replication metrics discovery

Name	Description	Type	Key and additional info
Replication metrics discovery	Discovery for replication metrics.	Dependent item	vault.replication.discovery

Item prototypes for Replication metrics discovery

Name	Description	Type	Key and additional info
Vault: Stream WAL missing guard, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is not matched/found.	Dependent item	vault.metrics.logshipper.streamWALs.missing_guard[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(logshipper_streamWALs_missing_guard)` ⛔️Custom on fail: Discard value
Vault: Stream WAL guard found, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is matched/found.	Dependent item	vault.metrics.logshipper.streamWALs.guard_found[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(logshipper_streamWALs_guard_found)` ⛔️Custom on fail: Discard value
Vault: Merkle commit index{#SINGLETON}	The last committed index in the Merkle Tree.	Dependent item	vault.metrics.replication.merkle.commit_index[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_merkle_commit_index)` ⛔️Custom on fail: Discard value
Vault: Last WAL{#SINGLETON}	The index of the last WAL.	Dependent item	vault.metrics.replication.wal.last_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_wal)` ⛔️Custom on fail: Discard value
Vault: Last DR WAL{#SINGLETON}	The index of the last DR WAL.	Dependent item	vault.metrics.replication.wal.last_dr_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_dr_wal)` ⛔️Custom on fail: Discard value
Vault: Last performance WAL{#SINGLETON}	The index of the last Performance WAL.	Dependent item	vault.metrics.replication.wal.last_performance_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_wal_last_performance_wal)` ⛔️Custom on fail: Discard value
Vault: Last remote WAL{#SINGLETON}	The index of the last remote WAL.	Dependent item	vault.metrics.replication.fsm.last_remote_wal[{#SINGLETON}] Preprocessing Prometheus pattern: `VALUE(replication_fsm_last_remote_wal)` ⛔️Custom on fail: Discard value

LLD rule Token metrics discovery

Name	Description	Type	Key and additional info
Token metrics discovery	Tokens metrics discovery.	Dependent item	vault.tokens.discovery

Item prototypes for Token metrics discovery

Name	Description	Type	Key and additional info
Vault: Token [{#TOKEN_NAME}] error	Token lookup error text.	Dependent item	vault.token_via_accessor.error["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].error.first()` Discard unchanged with heartbeat: `1h`
Vault: Token [{#TOKEN_NAME}] has TTL	The Token has TTL.	Dependent item	vault.token_via_accessor.has_ttl["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].has_ttl.first()` Boolean to decimal Discard unchanged with heartbeat: `1h`
Vault: Token [{#TOKEN_NAME}] TTL	The TTL period of the token.	Dependent item	vault.token_via_accessor.ttl["{#ACCESSOR}"] Preprocessing JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].ttl.first()`
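
The token item prototypes above all use the same JSONPath shape: filter an array by `accessor`, pick one field, and take the first match. A Python equivalent might look like the sketch below. The layout of the master item's output (a list of objects with `accessor`, `error`, `has_ttl` and `ttl` keys) is inferred from the JSONPath expressions, and the sample data is hypothetical.

```python
from typing import Any, Dict, List, Optional


def first_field_for_accessor(tokens: List[Dict[str, Any]],
                             accessor: str, field: str) -> Optional[Any]:
    """Equivalent in spirit to $.[?(@.accessor == "<accessor>")].<field>.first()"""
    for entry in tokens:
        if entry.get("accessor") == accessor and field in entry:
            return entry[field]
    return None  # nothing matched; Zabbix would report a preprocessing error


# Hypothetical output of the token lookup script.
tokens = [
    {"accessor": "acc-1", "error": "", "has_ttl": True, "ttl": 3600},
    {"accessor": "acc-2", "error": "lookup failed", "has_ttl": False, "ttl": 0},
]
print(first_field_for_accessor(tokens, "acc-1", "ttl"))
print(first_field_for_accessor(tokens, "acc-2", "error"))
```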

Trigger prototypes for Token metrics discovery

Name	Expression	Severity	Dependencies and additional info
Vault: Token [{#TOKEN_NAME}] lookup error occurred	`length(last(/HashiCorp Vault by HTTP/vault.token_via_accessor.error["{#ACCESSOR}"]))>0`	Warning	Depends on: Vault: Vault server is sealed
Vault: Token [{#TOKEN_NAME}] will expire soon	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.CRIT}`	Average
Vault: Token [{#TOKEN_NAME}] will expire soon	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.WARN}`	Warning	Depends on: Vault: Token [{#TOKEN_NAME}] will expire soon
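
The two "will expire soon" trigger prototypes above differ only in threshold and severity, and the Warning trigger depends on the Average one. The combined effect can be sketched in Python as follows; the function names are hypothetical, the defaults `3d` and `7d` are the macro defaults listed in this document, and the suffix table covers only the common Zabbix time suffixes.

```python
from typing import Optional

# Seconds per time suffix (s, m, h, d, w).
_SUFFIXES = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_period(value: str) -> int:
    """Convert a value such as '3d' or '90' into seconds."""
    value = value.strip()
    if value and value[-1] in _SUFFIXES:
        return int(value[:-1]) * _SUFFIXES[value[-1]]
    return int(value)


def token_expiry_severity(has_ttl: int, ttl_seconds: int,
                          crit: str = "3d", warn: str = "7d") -> Optional[str]:
    """Return the severity that would be visible, or None when nothing fires."""
    if has_ttl != 1:
        return None  # tokens without a TTL never expire
    if ttl_seconds < parse_period(crit):
        return "Average"  # the dependency hides the Warning trigger here
    if ttl_seconds < parse_period(warn):
        return "Warning"
    return None


print(token_expiry_severity(1, 5 * 86400))
```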

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help on the Zabbix forums.

This template is for Zabbix version: 5.4

Also available for: 7.4 7.2 7.0 6.4 6.2 6.0

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/vault_http?at=release/5.4

HashiCorp Vault by HTTP

Overview

For Zabbix version: 5.4 and higher
A template to monitor HashiCorp Vault with Zabbix that works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

The template Vault by HTTP collects metrics with an HTTP agent from the /sys/metrics API endpoint. See https://www.vaultproject.io/api-docs/system/metrics.

This template was tested on:

Vault, version 1.6

Setup

Configure Vault API. See Vault Configuration. Create a Vault service token and set it to the macro {$VAULT.TOKEN}.
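
To make the role of the connection macros concrete, here is a small Python sketch of how such a request could be assembled. It is not taken from the template: the `/v1/sys/metrics?format=prometheus` path and the `X-Vault-Token` header follow the usual Vault HTTP API conventions as I understand them, and the host and token values are hypothetical placeholders.

```python
from urllib.request import Request


def build_metrics_request(scheme: str, host: str, port: int, token: str) -> Request:
    """Build (but do not send) a request for Vault metrics in Prometheus format."""
    url = f"{scheme}://{host}:{port}/v1/sys/metrics?format=prometheus"
    # Assumption: the token is passed in the X-Vault-Token header.
    return Request(url, headers={"X-Vault-Token": token})


# Values standing in for {$VAULT.API.SCHEME}, {$VAULT.HOST},
# {$VAULT.API.PORT} and {$VAULT.TOKEN}.
request = build_metrics_request("http", "vault.example.com", 8200, "hypothetical-token")
print(request.full_url)
```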

Zabbix configuration

No specific Zabbix configuration is required.

Macros used

Name	Description	Default
{$VAULT.API.PORT}	Vault port.	`8200`
{$VAULT.API.SCHEME}	Vault API scheme.	`http`
{$VAULT.HOST}	Vault host name.	`<PUT YOUR VAULT HOST>`
{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}	Maximum number of Vault leadership losses.	`5`
{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}	Maximum number of Vault leadership setup failed.	`5`
{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}	Maximum number of Vault leadership step downs.	`5`
{$VAULT.LLD.FILTER.STORAGE.MATCHES}	Filter of discoverable storage backends.	`.+`
{$VAULT.OPEN.FDS.MAX.WARN}	Maximum percentage of used file descriptors for trigger expression.	`90`
{$VAULT.TOKEN.ACCESSORS}	Vault accessors separated by spaces for monitoring token expiration time.	``
{$VAULT.TOKEN.TTL.MIN.CRIT}	Token TTL critical threshold.	`3d`
{$VAULT.TOKEN.TTL.MIN.WARN}	Token TTL warning threshold.	`7d`
{$VAULT.TOKEN}	Vault auth token.	`<PUT YOUR AUTH TOKEN>`

Template links

There are no template links in this template.

Discovery rules

Name	Description	Type	Key and additional info
Storage metrics discovery	Storage backend metrics discovery.	DEPENDENT	vault.storage.discovery Filter: AND - {#STORAGE} MATCHES_REGEX `{$VAULT.LLD.FILTER.STORAGE.MATCHES}`
Mountpoint metrics discovery	Mountpoint metrics discovery.	DEPENDENT	vault.mountpoint.discovery
WAL metrics discovery	Discovery for WAL metrics.	DEPENDENT	vault.wal.discovery
Replication metrics discovery	Discovery for replication metrics.	DEPENDENT	vault.replication.discovery
Token metrics discovery	Tokens metrics discovery.	DEPENDENT	vault.tokens.discovery
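
The storage discovery rule above keeps only rows whose `{#STORAGE}` value matches the regular expression in `{$VAULT.LLD.FILTER.STORAGE.MATCHES}` (default `.+`, which matches any non-empty name). A rough Python sketch of that filter follows. The discovery rows and backend names are hypothetical, and Python's `re` module stands in for the regular expression engine Zabbix uses.

```python
import re
from typing import Dict, List


def filter_discovered(rows: List[Dict[str, str]], macro: str,
                      pattern: str) -> List[Dict[str, str]]:
    """Keep rows whose value for `macro` matches `pattern` anywhere in the string."""
    regex = re.compile(pattern)
    return [row for row in rows if regex.search(row.get(macro, ""))]


# Hypothetical discovery rows.
rows = [
    {"{#STORAGE}": "consul", "{#OPERATION}": "get"},
    {"{#STORAGE}": "raft", "{#OPERATION}": "put"},
]
print(filter_discovered(rows, "{#STORAGE}", ".+"))
print(filter_discovered(rows, "{#STORAGE}", "^consul$"))
```

Narrowing the macro to a pattern such as `^consul$` would limit the created items to one backend.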

Items collected

Group	Name	Description	Type	Key and additional info
Vault	Vault: Initialized	Initialization status.	DEPENDENT	vault.health.initialized Preprocessing: - JSONPATH: `$.initialized` ⛔️ON_FAIL: `DISCARD_VALUE ->` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Sealed	Seal status.	DEPENDENT	vault.health.sealed Preprocessing: - JSONPATH: `$.sealed` ⛔️ON_FAIL: `DISCARD_VALUE ->` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Standby	Standby status.	DEPENDENT	vault.health.standby Preprocessing: - JSONPATH: `$.standby` ⛔️ON_FAIL: `DISCARD_VALUE ->` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Performance standby	Performance standby status.	DEPENDENT	vault.health.performance_standby Preprocessing: - JSONPATH: `$.performance_standby` ⛔️ON_FAIL: `DISCARD_VALUE ->` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Performance replication	Performance replication mode https://www.vaultproject.io/docs/enterprise/replication	DEPENDENT	vault.health.replication_performance_mode Preprocessing: - JSONPATH: `$.replication_performance_mode` ⛔️ON_FAIL: `DISCARD_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Disaster Recovery replication	Disaster recovery replication mode https://www.vaultproject.io/docs/enterprise/replication	DEPENDENT	vault.health.replication_dr_mode Preprocessing: - JSONPATH: `$.replication_dr_mode` ⛔️ON_FAIL: `DISCARD_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Version	Server version.	DEPENDENT	vault.health.version Preprocessing: - JSONPATH: `$.version` ⛔️ON_FAIL: `DISCARD_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Healthcheck	Vault healthcheck.	DEPENDENT	vault.health.check Preprocessing: - JSONPATH: `$.healthcheck` ⛔️ON_FAIL: `CUSTOM_VALUE -> 1` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: HA enabled	HA enabled status.	DEPENDENT	vault.leader.ha_enabled Preprocessing: - JSONPATH: `$.ha_enabled` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Is leader	Leader status.	DEPENDENT	vault.leader.is_self Preprocessing: - JSONPATH: `$.is_self` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Get metrics error	Get metrics error.	DEPENDENT	vault.get_metrics.error Preprocessing: - JSONPATH: `$.errors[0]` ⛔️ON_FAIL: `CUSTOM_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Process CPU seconds, total	Total user and system CPU time spent in seconds.	DEPENDENT	vault.metrics.process.cpu.seconds.total Preprocessing: - PROMETHEUS_PATTERN: `process_cpu_seconds_total` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Open file descriptors, max	Maximum number of open file descriptors.	DEPENDENT	vault.metrics.process.max.fds Preprocessing: - PROMETHEUS_PATTERN: `process_max_fds` ⛔️ON_FAIL: `DISCARD_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Open file descriptors, current	Number of open file descriptors.	DEPENDENT	vault.metrics.process.open.fds Preprocessing: - PROMETHEUS_PATTERN: `process_open_fds` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Process resident memory	Resident memory size in bytes.	DEPENDENT	vault.metrics.process.resident_memory.bytes Preprocessing: - PROMETHEUS_PATTERN: `process_resident_memory_bytes` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Uptime	Server uptime.	DEPENDENT	vault.metrics.process.uptime Preprocessing: - PROMETHEUS_PATTERN: `process_start_time_seconds` ⛔️ON_FAIL: `DISCARD_VALUE ->` - JAVASCRIPT: `return Math.floor(Date.now()/1000 - Number(value));`
Vault	Vault: Process virtual memory, current	Virtual memory size in bytes.	DEPENDENT	vault.metrics.process.virtual_memory.bytes Preprocessing: - PROMETHEUS_PATTERN: `process_virtual_memory_bytes` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Process virtual memory, max	Maximum amount of virtual memory available in bytes.	DEPENDENT	vault.metrics.process.virtual_memory.max.bytes Preprocessing: - PROMETHEUS_PATTERN: `process_virtual_memory_max_bytes` ⛔️ON_FAIL: `DISCARD_VALUE ->` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Audit log requests, rate	Number of all audit log requests across all audit log devices.	DEPENDENT	vault.metrics.audit.log.request.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_audit_log_request_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Audit log request failures, rate	Number of audit log request failures.	DEPENDENT	vault.metrics.audit.log.request.failure.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_audit_log_request_failure` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Audit log response, rate	Number of audit log responses across all audit log devices.	DEPENDENT	vault.metrics.audit.log.response.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_audit_log_response_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Audit log response failures, rate	Number of audit log response failures.	DEPENDENT	vault.metrics.audit.log.response.failure.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_audit_log_response_failure` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Barrier DELETE ops, rate	Number of DELETE operations at the barrier.	DEPENDENT	vault.metrics.barrier.delete.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_barrier_delete_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Barrier GET ops, rate	Number of GET operations at the barrier.	DEPENDENT	vault.metrics.vault.barrier.get.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_barrier_get_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Barrier LIST ops, rate	Number of LIST operations at the barrier.	DEPENDENT	vault.metrics.barrier.list.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_barrier_list_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Barrier PUT ops, rate	Number of PUT operations at the barrier.	DEPENDENT	vault.metrics.barrier.put.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_barrier_put_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Cache hit, rate	Number of times a value was retrieved from the LRU cache.	DEPENDENT	vault.metrics.cache.hit.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_cache_hit` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Cache miss, rate	Number of times a value was not in the LRU cache. This results in a read from the configured storage.	DEPENDENT	vault.metrics.cache.miss.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_cache_miss` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Cache write, rate	Number of times a value was written to the LRU cache.	DEPENDENT	vault.metrics.cache.write.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_cache_write` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Check token, rate	Number of token checks handled by Vault core.	DEPENDENT	vault.metrics.core.check.token.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_core_check_token_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Fetch ACL and token, rate	Number of ACL and corresponding token entry fetches handled by Vault core.	DEPENDENT	vault.metrics.core.fetch.acl_and_token Preprocessing: - PROMETHEUS_PATTERN: `vault_core_fetch_acl_and_token_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Requests, rate	Number of requests handled by Vault core.	DEPENDENT	vault.metrics.core.handle.request Preprocessing: - PROMETHEUS_PATTERN: `vault_core_handle_request_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Leadership setup failed, counter	Cluster leadership setup failures which have occurred in a highly available Vault cluster.	DEPENDENT	vault.metrics.core.leadership.setup_failed Preprocessing: - PROMETHEUS_TO_JSON: `vault_core_leadership_setup_failed` - JSONPATH: `$[?(@.name=="vault_core_leadership_setup_failed")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Leadership setup lost, counter	Cluster leadership losses which have occurred in a highly available Vault cluster.	DEPENDENT	vault.metrics.core.leadership_lost Preprocessing: - PROMETHEUS_TO_JSON: `vault_core_leadership_lost_count` - JSONPATH: `$[?(@.name=="vault_core_leadership_lost_count")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Post-unseal ops, counter	Duration of time taken by post-unseal operations handled by Vault core.	DEPENDENT	vault.metrics.core.post_unseal Preprocessing: - PROMETHEUS_PATTERN: `vault_core_post_unseal_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Pre-seal ops, counter	Duration of time taken by pre-seal operations.	DEPENDENT	vault.metrics.core.pre_seal Preprocessing: - PROMETHEUS_PATTERN: `vault_core_pre_seal_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Requested seal ops, counter	Duration of time taken by requested seal operations.	DEPENDENT	vault.metrics.core.seal_with_request Preprocessing: - PROMETHEUS_PATTERN: `vault_core_seal_with_request_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Seal ops, counter	Duration of time taken by seal operations.	DEPENDENT	vault.metrics.core.seal Preprocessing: - PROMETHEUS_PATTERN: `vault_core_seal_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Internal seal ops, counter	Duration of time taken by internal seal operations.	DEPENDENT	vault.metrics.core.seal_internal Preprocessing: - PROMETHEUS_PATTERN: `vault_core_seal_internal_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Leadership step downs, counter	Cluster leadership step down.	DEPENDENT	vault.metrics.core.step_down Preprocessing: - PROMETHEUS_TO_JSON: `vault_core_step_down_count` - JSONPATH: `$[?(@.name=="vault_core_step_down_count")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Unseal ops, counter	Duration of time taken by unseal operations.	DEPENDENT	vault.metrics.core.unseal Preprocessing: - PROMETHEUS_PATTERN: `vault_core_unseal_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Fetch lease times, counter	Time taken to fetch lease times.	DEPENDENT	vault.metrics.expire.fetch.lease.times Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_fetch_lease_times_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Fetch lease times by token, counter	Time taken to fetch lease times by token.	DEPENDENT	vault.metrics.expire.fetch.lease.times.by_token Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_fetch_lease_times_by_token_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Number of expiring leases	Number of all leases which are eligible for eventual expiry.	DEPENDENT	vault.metrics.expire.num_leases Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_num_leases` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Expire revoke, count	Time taken to revoke a token.	DEPENDENT	vault.metrics.expire.revoke Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_revoke_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Expire revoke force, count	Time taken to forcibly revoke a token.	DEPENDENT	vault.metrics.expire.revoke.force Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_revoke_force_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Expire revoke prefix, count	Time taken to revoke tokens on a prefix.	DEPENDENT	vault.metrics.expire.revoke.prefix Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_revoke_prefix_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Revoke secrets by token, count	Time taken to revoke all secrets issued with a given token.	DEPENDENT	vault.metrics.expire.revoke.by_token Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_revoke_by_token_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Expire renew, count	Time taken to renew a lease.	DEPENDENT	vault.metrics.expire.renew Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_renew_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Renew token, count	Time taken to renew a token which does not need to invoke a logical backend.	DEPENDENT	vault.metrics.expire.renew_token Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_renew_token_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Register ops, count	Time taken for register operations.	DEPENDENT	vault.metrics.expire.register Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_register_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Register auth ops, count	Time taken for register authentication operations which create lease entries without lease ID.	DEPENDENT	vault.metrics.expire.register.auth Preprocessing: - PROMETHEUS_PATTERN: `vault_expire_register_auth_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Policy GET ops, rate	Number of operations to get a policy.	DEPENDENT	vault.metrics.policy.get_policy.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_policy_get_policy_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Policy LIST ops, rate	Number of operations to list policies.	DEPENDENT	vault.metrics.policy.list_policies.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_policy_list_policies_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Policy DELETE ops, rate	Number of operations to delete a policy.	DEPENDENT	vault.metrics.policy.delete_policy.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_policy_delete_policy_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Policy SET ops, rate	Number of operations to set a policy.	DEPENDENT	vault.metrics.policy.set_policy.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_policy_set_policy_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Token create, count	The time taken to create a token.	DEPENDENT	vault.metrics.token.create Preprocessing: - PROMETHEUS_PATTERN: `vault_token_create_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token createAccessor, count	The time taken to create a token accessor.	DEPENDENT	vault.metrics.token.createAccessor Preprocessing: - PROMETHEUS_PATTERN: `vault_token_createAccessor_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token lookup, rate	Number of token lookups.	DEPENDENT	vault.metrics.token.lookup.rate Preprocessing: - PROMETHEUS_PATTERN: `vault_token_lookup_count` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Token revoke, count	The time taken to revoke a token.	DEPENDENT	vault.metrics.token.revoke Preprocessing: - PROMETHEUS_PATTERN: `vault_token_revoke_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token revoke tree, count	Time taken to revoke a token tree.	DEPENDENT	vault.metrics.token.revoke.tree Preprocessing: - PROMETHEUS_PATTERN: `vault_token_revoke_tree_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token store, count	Time taken to store an updated token entry without writing to the secondary index.	DEPENDENT	vault.metrics.token.store Preprocessing: - PROMETHEUS_PATTERN: `vault_token_store_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime allocated bytes	Number of bytes allocated by the Vault process. This could burst from time to time, but should return to a steady state value.	DEPENDENT	vault.metrics.runtime.alloc.bytes Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_alloc_bytes` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime freed objects	Number of freed objects.	DEPENDENT	vault.metrics.runtime.free.count Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_free_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime heap objects	Number of objects on the heap. This is a good general memory pressure indicator worth establishing a baseline and thresholds for alerting.	DEPENDENT	vault.metrics.runtime.heap.objects Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_heap_objects` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime malloc count	Cumulative count of allocated heap objects.	DEPENDENT	vault.metrics.runtime.malloc.count Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_malloc_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime num goroutines	Number of goroutines. This serves as a general system load indicator worth establishing a baseline and thresholds for alerting.	DEPENDENT	vault.metrics.runtime.num_goroutines Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_num_goroutines` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime sys bytes	Number of bytes allocated to Vault. This includes what is being used by Vault's heap and what has been reclaimed but not given back to the operating system.	DEPENDENT	vault.metrics.runtime.sys.bytes Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_sys_bytes` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Runtime GC pause, total	The total garbage collector pause time since Vault was last started.	DEPENDENT	vault.metrics.total.gc.pause Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_total_gc_pause_ns` ⛔️ON_FAIL: `DISCARD_VALUE ->` - MULTIPLIER: `1.0E-9`
Vault	Vault: Runtime GC runs, total	Total number of garbage collection runs since Vault was last started.	DEPENDENT	vault.metrics.runtime.total.gc.runs Preprocessing: - PROMETHEUS_PATTERN: `vault_runtime_total_gc_runs` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token count, total	Total number of service tokens available for use; counts all un-expired and un-revoked tokens in Vault's token store. This measurement is performed every 10 minutes.	DEPENDENT	vault.metrics.token Preprocessing: - PROMETHEUS_TO_JSON: `vault_token_count` - JSONPATH: `$[?(@.name=="vault_token_count")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Token count by auth, total	Total number of service tokens that were created by an auth method.	DEPENDENT	vault.metrics.token.by_auth Preprocessing: - PROMETHEUS_TO_JSON: `vault_token_count_by_auth` - JSONPATH: `$[?(@.name=="vault_token_count_by_auth")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Token count by policy, total	Total number of service tokens that have a policy attached.	DEPENDENT	vault.metrics.token.by_policy Preprocessing: - PROMETHEUS_TO_JSON: `vault_token_count_by_policy` - JSONPATH: `$[?(@.name=="vault_token_count_by_policy")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Token count by ttl, total	Number of service tokens, grouped by the TTL range they were assigned at creation.	DEPENDENT	vault.metrics.token.by_ttl Preprocessing: - PROMETHEUS_TO_JSON: `vault_token_count_by_ttl` - JSONPATH: `$[?(@.name=="vault_token_count_by_ttl")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Token creation, rate	Number of service or batch tokens created.	DEPENDENT	vault.metrics.token.creation.rate Preprocessing: - PROMETHEUS_TO_JSON: `vault_token_creation` - JSONPATH: `$[?(@.name=="vault_token_creation")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0` - CHANGE_PER_SECOND
Vault	Vault: Secret kv entries	Number of entries in each key-value secret engine.	DEPENDENT	vault.metrics.secret.kv.count Preprocessing: - PROMETHEUS_TO_JSON: `vault_secret_kv_count` - JSONPATH: `$[?(@.name=="vault_secret_kv_count")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0`
Vault	Vault: Token secret lease creation, rate	Counts the number of leases created by secret engines.	DEPENDENT	vault.metrics.secret.lease.creation.rate Preprocessing: - PROMETHEUS_TO_JSON: `vault_secret_lease_creation` - JSONPATH: `$[?(@.name=="vault_secret_lease_creation")].value.sum()` ⛔️ON_FAIL: `CUSTOM_VALUE -> 0` - CHANGE_PER_SECOND
Vault	Vault: Storage [{#STORAGE}] {#OPERATION} ops, rate	Number of {#OPERATION} operations against the {#STORAGE} storage backend.	DEPENDENT	vault.metrics.storage.rate[{#STORAGE}, {#OPERATION}] Preprocessing: - PROMETHEUS_PATTERN: `{#PATTERN_C}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Rollback attempt [{#MOUNTPOINT}] ops, rate	Number of operations to perform a rollback operation on the given mount point.	DEPENDENT	vault.metrics.rollback.attempt.rate[{#MOUNTPOINT}] Preprocessing: - PROMETHEUS_PATTERN: `{#PATTERN_C}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Route rollback [{#MOUNTPOINT}] ops, rate	Number of operations to dispatch a rollback operation to a backend, and for that backend to process it. Rollback operations are automatically scheduled to clean up partial errors.	DEPENDENT	vault.metrics.route.rollback.rate[{#MOUNTPOINT}] Preprocessing: - PROMETHEUS_PATTERN: `{#PATTERN_C}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - CHANGE_PER_SECOND
Vault	Vault: Delete WALs, count{#SINGLETON}	Time taken to delete a Write Ahead Log (WAL).	DEPENDENT	vault.metrics.wal.deletewals[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_deletewals_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: GC deleted WAL{#SINGLETON}	Number of Write Ahead Logs (WAL) deleted during each garbage collection run.	DEPENDENT	vault.metrics.wal.gc.deleted[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_gc_deleted` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: WALs on disk, total{#SINGLETON}	Total number of Write Ahead Logs (WAL) on disk.	DEPENDENT	vault.metrics.wal.gc.total[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_gc_total` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Load WALs, count{#SINGLETON}	Time taken to load a Write Ahead Log (WAL).	DEPENDENT	vault.metrics.wal.loadWAL[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_loadWAL_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Persist WALs, count{#SINGLETON}	Time taken to persist a Write Ahead Log (WAL).	DEPENDENT	vault.metrics.wal.persistwals[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_persistwals_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Flush ready WAL, count{#SINGLETON}	Time taken to flush a ready Write Ahead Log (WAL) to storage.	DEPENDENT	vault.metrics.wal.flushready[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `vault_wal_flushready_count` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Stream WAL missing guard, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is not matched/found.	DEPENDENT	vault.metrics.logshipper.streamWALs.missing_guard[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `logshipper_streamWALs_missing_guard` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Stream WAL guard found, count{#SINGLETON}	Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is matched/found.	DEPENDENT	vault.metrics.logshipper.streamWALs.guard_found[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `logshipper_streamWALs_guard_found` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Merkle commit index{#SINGLETON}	The last committed index in the Merkle Tree.	DEPENDENT	vault.metrics.replication.merkle.commit_index[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `replication_merkle_commit_index` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Last WAL{#SINGLETON}	The index of the last WAL.	DEPENDENT	vault.metrics.replication.wal.last_wal[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `replication_wal_last_wal` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Last DR WAL{#SINGLETON}	The index of the last DR WAL.	DEPENDENT	vault.metrics.replication.wal.last_dr_wal[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `replication_wal_last_dr_wal` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Last performance WAL{#SINGLETON}	The index of the last Performance WAL.	DEPENDENT	vault.metrics.replication.wal.last_performance_wal[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `replication_wal_last_performance_wal` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Last remote WAL{#SINGLETON}	The index of the last remote WAL.	DEPENDENT	vault.metrics.replication.fsm.last_remote_wal[{#SINGLETON}] Preprocessing: - PROMETHEUS_PATTERN: `replication_fsm_last_remote_wal` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Vault	Vault: Token [{#TOKEN_NAME}] error	Token lookup error text.	DEPENDENT	vault.token_via_accessor.error["{#ACCESSOR}"] Preprocessing: - JSONPATH: `$.[?(@.accessor == "{#ACCESSOR}")].error.first()` - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Token [{#TOKEN_NAME}] has TTL	The Token has TTL.	DEPENDENT	vault.token_via_accessor.has_ttl["{#ACCESSOR}"] Preprocessing: - JSONPATH: `$.[?(@.accessor == "{#ACCESSOR}")].has_ttl.first()` - BOOL_TO_DECIMAL - DISCARD_UNCHANGED_HEARTBEAT: `1h`
Vault	Vault: Token [{#TOKEN_NAME}] TTL	The TTL period of the token.	DEPENDENT	vault.token_via_accessor.ttl["{#ACCESSOR}"] Preprocessing: - JSONPATH: `$.[?(@.accessor == "{#ACCESSOR}")].ttl.first()`
Zabbix_raw_items	Vault: Get health	-	HTTP_AGENT	vault.get_health Preprocessing: - CHECK_NOT_SUPPORTED ⛔️ON_FAIL: `CUSTOM_VALUE -> {"healthcheck": 0}`
Zabbix_raw_items	Vault: Get leader	-	HTTP_AGENT	vault.get_leader Preprocessing: - CHECK_NOT_SUPPORTED
Zabbix_raw_items	Vault: Get metrics	-	HTTP_AGENT	vault.get_metrics Preprocessing: - CHECK_NOT_SUPPORTED
Zabbix_raw_items	Vault: Clear metrics	-	DEPENDENT	vault.clear_metrics Preprocessing: - CHECK_JSON_ERROR: `$.errors` ⛔️ON_FAIL: `DISCARD_VALUE ->`
Zabbix_raw_items	Vault: Get tokens	Get information about tokens via their accessors. Accessors are defined in the macro "{$VAULT.TOKEN.ACCESSORS}".	SCRIPT	vault.get_tokens Expression: `The text is too long. Please see the template.`
Zabbix_raw_items	Vault: Check WAL discovery	-	DEPENDENT	vault.check_wal_discovery Preprocessing: - PROMETHEUS_TO_JSON: `{__name__=~"^vault_wal_(?:.+)$"}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - JAVASCRIPT: `return JSON.stringify(value !== "[]" ? [{'{#SINGLETON}': ''}] : []);` - DISCARD_UNCHANGED_HEARTBEAT: `15m`
Zabbix_raw_items	Vault: Check replication discovery	-	DEPENDENT	vault.check_replication_discovery Preprocessing: - PROMETHEUS_TO_JSON: `{__name__=~"^replication_(?:.+)$"}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - JAVASCRIPT: `return JSON.stringify(value !== "[]" ? [{'{#SINGLETON}': ''}] : []);` - DISCARD_UNCHANGED_HEARTBEAT: `15m`
Zabbix_raw_items	Vault: Check storage discovery	-	DEPENDENT	vault.check_storage_discovery Preprocessing: - PROMETHEUS_TO_JSON: `{name=~"^vault_(?:.+)_(?:get
Zabbix_raw_items	Vault: Check mountpoint discovery	-	DEPENDENT	vault.check_mountpoint_discovery Preprocessing: - PROMETHEUS_TO_JSON: `{__name__=~"^vault_rollback_attempt_(?:.+?)_count$"}` ⛔️ON_FAIL: `DISCARD_VALUE ->` - JAVASCRIPT: `The text is too long. Please see the template.` - DISCARD_UNCHANGED_HEARTBEAT: `15m`
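
The WAL and replication discovery checks above share one pattern: keep only the Prometheus samples whose metric name matches a regular expression, then emit a single discovery entity with an empty {#SINGLETON} macro if anything matched, or an empty list otherwise. The following is a minimal Python sketch of that logic, not the template's actual implementation (Zabbix performs it with its PROMETHEUS_TO_JSON and JAVASCRIPT preprocessing steps). The function name `singleton_discovery` and the sample metrics text are hypothetical.

```python
import json
import re


def singleton_discovery(metrics_text: str, name_pattern: str) -> str:
    """Return LLD JSON: one entity with an empty {#SINGLETON} macro if any
    metric name matches name_pattern, otherwise an empty list."""
    name_re = re.compile(name_pattern)
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first '{' (labels) or whitespace (value).
        name = re.split(r"[{\s]", line, maxsplit=1)[0]
        if name_re.search(name):
            return json.dumps([{"{#SINGLETON}": ""}])
    return json.dumps([])


# Hypothetical /sys/metrics output in Prometheus text format.
sample = (
    "# TYPE vault_wal_flushready summary\n"
    "vault_wal_flushready_count 12\n"
    "vault_core_unsealed 1\n"
)
print(singleton_discovery(sample, r"^vault_wal_(?:.+)$"))
print(singleton_discovery("vault_core_unsealed 1\n", r"^vault_wal_(?:.+)$"))
```

Because the discovered macro is empty, item prototypes such as `vault.metrics.wal.flushready[{#SINGLETON}]` resolve to a single item each, and they are created only on Vault instances that actually expose the matching metrics.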

Triggers

Name	Description	Expression	Severity	Dependencies and additional info
Vault: Vault server is sealed	https://www.vaultproject.io/docs/concepts/seal	`last(/HashiCorp Vault by HTTP/vault.health.sealed)=1`	AVERAGE
Vault: Version has changed (new version: {ITEM.VALUE})	Vault version has changed. Ack to close.	`last(/HashiCorp Vault by HTTP/vault.health.version,#1)<>last(/HashiCorp Vault by HTTP/vault.health.version,#2) and length(last(/HashiCorp Vault by HTTP/vault.health.version))>0`	INFO	Manual close: YES
Vault: Vault server is not responding	-	`last(/HashiCorp Vault by HTTP/vault.health.check)=0`	HIGH
Vault: Failed to get metrics (error: {ITEM.VALUE})	-	`length(last(/HashiCorp Vault by HTTP/vault.get_metrics.error))>0`	WARNING	Depends on: - Vault: Vault server is sealed
Vault: Current number of open files is too high (over {$VAULT.OPEN.FDS.MAX.WARN}% for 5m)	-	`min(/HashiCorp Vault by HTTP/vault.metrics.process.open.fds,5m)/last(/HashiCorp Vault by HTTP/vault.metrics.process.max.fds)*100>{$VAULT.OPEN.FDS.MAX.WARN}`	WARNING
Vault: has been restarted (uptime < 10m)	Uptime is less than 10 minutes	`last(/HashiCorp Vault by HTTP/vault.metrics.process.uptime)<10m`	INFO	Manual close: YES
Vault: High frequency of leadership setup failures (over {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} for 1h)	There have been more than {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} Vault leadership setup failures in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h))>{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}`	AVERAGE
Vault: High frequency of leadership losses (over {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} for 1h)	There have been more than {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} Vault leadership losses in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h))>{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}`	AVERAGE
Vault: High frequency of leadership step downs (over {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} for 1h)	There have been more than {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} Vault leadership step downs in the past 1h.	`(max(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h))>{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}`	AVERAGE
Vault: Token [{#TOKEN_NAME}] lookup error occurred	-	`length(last(/HashiCorp Vault by HTTP/vault.token_via_accessor.error["{#ACCESSOR}"]))>0`	WARNING	Depends on: - Vault: Vault server is sealed
Vault: Token [{#TOKEN_NAME}] will expire soon (less than {$VAULT.TOKEN.TTL.MIN.CRIT})	-	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.CRIT}`	AVERAGE
Vault: Token [{#TOKEN_NAME}] will expire soon (less than {$VAULT.TOKEN.TTL.MIN.WARN})	-	`last(/HashiCorp Vault by HTTP/vault.token_via_accessor.has_ttl["{#ACCESSOR}"])=1 and last(/HashiCorp Vault by HTTP/vault.token_via_accessor.ttl["{#ACCESSOR}"])<{$VAULT.TOKEN.TTL.MIN.WARN}`	WARNING	Depends on: - Vault: Token [{#TOKEN_NAME}] will expire soon (less than {$VAULT.TOKEN.TTL.MIN.CRIT})
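
To make the arithmetic behind the trigger expressions above easier to follow, here is a minimal Python sketch of three of them: the leadership counters (max minus min over the window), the open file descriptor percentage, and the token TTL thresholds. It is an illustration under stated assumptions, not how Zabbix evaluates triggers. All function names are hypothetical, the sample lists stand in for item history, and the defaults mirror the macro defaults listed in "Macros used".

```python
# Zabbix time suffixes and their length in seconds.
_SUFFIX_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def to_seconds(value: str) -> int:
    """Convert a value such as '3d' or '600' to seconds."""
    if value[-1] in _SUFFIX_SECONDS:
        return int(value[:-1]) * _SUFFIX_SECONDS[value[-1]]
    return int(value)


def counter_increase_exceeds(samples, threshold=5):
    """Leadership triggers: (max(window) - min(window)) > threshold."""
    return (max(samples) - min(samples)) > threshold


def open_fds_too_high(open_fds_5m, max_fds, warn_percent=90):
    """Open files trigger: min(open fds, 5m) / last(max fds) * 100 > threshold."""
    return min(open_fds_5m) / max_fds * 100 > warn_percent


def token_severity(has_ttl, ttl_seconds, crit="3d", warn="7d"):
    """Token TTL triggers. The WARNING trigger depends on the CRIT one, so
    only the more severe problem is reported when both conditions hold."""
    if has_ttl != 1:
        return None  # tokens without a TTL never expire
    if ttl_seconds < to_seconds(crit):
        return "AVERAGE"
    if ttl_seconds < to_seconds(warn):
        return "WARNING"
    return None


print(counter_increase_exceeds([10, 12, 17]))    # counter grew by 7 in the window
print(open_fds_too_high([950, 930, 990], 1000))  # lowest usage in 5m is 93%
print(token_severity(1, 5 * 86400))              # 5 days left
```

Using `min()` over the 5-minute window means the open files trigger fires only when usage stayed above the threshold for the whole window, which filters out short spikes.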

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help with it at the Zabbix forums.
