Zabbix Code Guidelines




Zabbix template guidelines


This document should not be considered a strict set of rules that everybody must follow. Instead, it only reflects our current approach to template building, and any rule or best practice described here may evolve into something else or be abandoned in the future.

Current document status: Version 1.0


Zabbix approach to monitoring. Resource vs service

Before we can talk about what a good template is, let’s briefly discuss monitoring in general. Monitoring is a common and ambiguous term, and there is no true RFC definition of what it is. Some may say that monitoring is when you constantly look for mentions of your company on social networks, while others may say it is when you collect sensor readings about your plant soil nutrition and moisture. As Zabbix is a universal monitoring solution, we provide a toolset for any kind of monitoring project. But in this document, we will focus on our views on monitoring services provided by modern distributed systems.

Everything can be seen as a service or as a resource.

A service is something that your organization provides to the outside world, like an online retail store or a public email service. If your store is online and customers can successfully purchase something from it, your service is available. A service can also be something that your department provides to the rest of the company, like computation resources provided to other departments on demand. Such an internal service can be a dependency of your company’s online store service. As you can see, service issues clearly affect the real world. Monitoring service availability and performance is what you should always do first. As Google suggests in the SRE book, service unavailability should be considered a red-hot situation and the responsible person must be paged immediately.

A resource is a component that helps to provide services. It can be a server, a virtual machine, a container, a database, a middleware app, a custom microservice app, a hardware controller, a network, or anything else. You can break resources down into even smaller bits, like splitting a server into CPU, RAM, and IO subsystem, or splitting a network link into layer 3 and layer 2 connectivity and the physical link present on both sides. A modern distributed system might be a complex, dynamic set of different resources that are added and removed on demand based on the service load, just like in a Kubernetes cluster or AWS. Resources require your monitoring attention too, but differently. Because resource unavailability doesn’t necessarily affect the real world, keep paging people on resource failures to a minimum. Create tickets instead that can be solved during working hours.

Service monitoring is considered a project level monitoring – it is not something you can get out of the box or get a template on - it is something you need to create yourself using Zabbix features. That is because all services are different, have different SLOs, have different architecture and so on, so it’s hard to prepare a common blueprint for service monitoring. But consider the following approaches:

  • Try synthetic monitoring: emulate user activity (a person or another application that interacts with your service) on a regular basis and check for symptoms with the black-box approach:
    • Use Web monitoring in Zabbix and go through common user scenarios, for example, try to log in and purchase a test item from the store.
    • Simpler HTTP check will also work – just make sure your website URL returns HTTP 200 OK
    • If your service provides a REST API, write a script that will emulate some common service activity.
  • Try real user monitoring approach: gather and read transactions from logs, network, or database about your real users.
    • Count the ratio of success/failure requests
    • Collect request rate per second/per minute to your service
    • Calculate min/max/average response times to your service or create a histogram.

Situations such as a zero request rate (or a sudden drop in rate) or a high error ratio (HTTP 500 everywhere) most likely indicate serious service problems.
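
The real user monitoring indicators listed above can be sketched in a few lines. This is only an illustration of the arithmetic, assuming request records have already been parsed from logs; the `Request` type and the `summarize` function are hypothetical names, not part of Zabbix.

```python
from dataclasses import dataclass


@dataclass
class Request:
    status: int      # HTTP status code of the response
    duration: float  # response time in seconds


def summarize(requests, window_seconds):
    """Return request rate, error ratio and response-time stats for one window."""
    total = len(requests)
    if total == 0:
        # A zero request rate is itself a symptom worth alerting on.
        return {"rate": 0.0, "error_ratio": 0.0, "min": None, "max": None, "avg": None}
    errors = sum(1 for r in requests if r.status >= 500)
    durations = [r.duration for r in requests]
    return {
        "rate": total / window_seconds,   # requests per second
        "error_ratio": errors / total,    # share of HTTP 5xx responses
        "min": min(durations),
        "max": max(durations),
        "avg": sum(durations) / total,
    }


sample = [Request(200, 0.12), Request(200, 0.30), Request(500, 1.50), Request(200, 0.08)]
stats = summarize(sample, window_seconds=60)
```

In a real project these values would typically become Zabbix items, with triggers on a zero rate or a high error ratio.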

But why do you need to monitor resources if service monitoring is set up? There are multiple reasons but the most important one is this: once you know your service is down (symptom) you need to isolate the root cause of the problem.

Resources are common and generic across many projects and architectures. That’s where templates can thrive. Do you really need to spend time creating your own monitoring solution for Linux OS? Or for a MySQL database, a Cisco router, or a Docker host? Maybe you can spend your time more efficiently by preparing service-level monitoring instead.

What does it take to become a good resource template?

Let’s try to define some properties of what a good resource template is, some key principles we follow in Zabbix when building templates:

1. Flexible and reusable

In Zabbix, a template is a monitoring solution for some specific object. It’s a sort of container used to transfer configuration, that is, a monitoring solution, between Zabbix server instances. A good quality template is something that Zabbix users create, use for their own good, and then share with the Zabbix community, so that the next person can download the template, reuse it, and update it with newer ideas and approaches, contributing to the common cause. So, the first thing that comes to mind when you ask what makes a good Zabbix template is how flexible and reusable it is. If other Zabbix users can download it and use it without changing half of it, that’s a really good sign.

Here are a few rules of thumb on how to achieve it:

  • Use low-level discovery as much as possible to avoid unsupported items or triggers. If you have some metric in your situation, that doesn’t mean it will be available in someone else’s case: the hardware, software version, or configuration could be different.
  • Use user macros in triggers and items. For example, use {$NGINX.URL} for the Nginx stub status URL, or use {$TEMP.MAX.CRIT} in a temperature controlling trigger. This allows users to configure and fine-tune templates and linked hosts instead of changing the template itself and breaking compatibility with future versions.
  • Avoid adding rare metrics/triggers required for your project/service into the resource template. Move project/service level metrics that you feel are very specific to another template and link it to the generic template.
  • Avoid external dependencies where possible. Use internal Zabbix data collection and processing possibilities to collect data first. Use HTTP agent, powerful preprocessing steps such as JavaScript, JSONPath, JMX and so on. This ensures that the template is easy to install, and all its configuration and processing are defined within the template. Resort to external scripts only if there is no alternative available.

2. Knowledge and expertise

We also think that a good template is not just a set of metrics (items), thresholds (triggers), and dashboards bundled together. The most important ingredient to a good template is how much expertise and knowledge about a monitored object is contained within. And by expertise and knowledge we mean, not the number of metrics someone knows how to collect – but knowing what metrics are useful and important, and which are just useless, or what thresholds should be used to be notified only about problems that matter without too much noise.

While a very minimalistic template may not provide all the information you need, bloated, oversized templates are bad as well: users lose focus on the most important metrics and get overwhelmed with problem noise. So:

  • Avoid adding too many metrics. Keep it simple. Don’t try to do benchmarking or profiling, or to collect deep debugging level metrics. This creates unnecessary load on Zabbix and on the monitored object. Let Zabbix isolate the problem, then do profiling and debugging using specialized tools on the hosts that really require it.
  • Avoid creating too much problem noise with triggers in the template. Make sure that the problems created from the trigger require immediate (page) or postponed (ticket) action. Avoid ‘for your information’ and ‘this looks weird’ level triggers.

3. Modularity and scope

The last thing that is very important is the template scope. We already talked about services and resources, so, generally, the good template is the one that has a scope of the single resource:

  • Vendor-specific hardware server
  • A temperature sensor
  • Operating system templates like Linux OS or Windows OS
  • Applications like Nginx, Apache, Tomcat, RabbitMQ
  • DBs like MySQL, PostgreSQL, Oracle, DB2, Redis, Mongo, and so on
  • Cloud providers such as AWS, Azure, or others
  • Virtualization providers such as VMware clusters, Hyper-V
  • Container orchestration systems such as Kubernetes
  • A network device or network controller
  • Some custom applications

If you keep a template scope within a single resource, it will be much easier to share such templates and they will be useful to people who have the same building blocks in their architecture. Also, avoid merging resources of different layers – do not add metrics for Linux OS and PostgreSQL into the single template.

But what about the ‘inner’ metrics scope? What metric types and classes should you be collecting? Surely you can do monitoring for various reasons, including collecting business indicators or looking for security breaches, but when creating a generic resource Zabbix template, try to adopt the following approach:

3.1 Always start with fault monitoring or availability monitoring. The most popular and very important answer people want to get from monitoring – is my system up and running? So, try to address that in your template first.

That is, prefer the black-box monitoring approach here: simple or not so simple health checks are essential and are the first thing you or any user wants to know about. Add items and triggers to your template that help you make sure the thing you are monitoring is accessible and is up and running. Use ICMP ping, check that a TCP port is open, check that an API returns HTTP 200 OK, and so on.

The second problem that is addressed by fault monitoring is an imminent failure. For example,

  • the hardware is overheated and is about to shut down
  • you are running very low on disk space and very soon your DB will refuse to write new data as there is no space left.

Add items and triggers that will help you to intervene and prevent such a drastic outcome.

Also, if your monitored object can detect faults on its own, use it! Many systems can report faults directly using logs, SNMP traps, and so on. That's the kind of expertise we talked about earlier, provided to you by the developers, vendors, and authors of the system you want to control, and nobody knows it better than them. So make sure your Zabbix template can relay the faults detected by the system itself.

3.2 Once your template can check the health of your system - proceed with performance monitoring. This is where you will need to open the box wide open (white-boxing). There are really nice methods out there to help you choose what metrics to collect first: USE, Four golden signals from Google or RED for request-driven services. Just make sure you extend the template with items and triggers to help solve the following use cases:

  • My system is slow. It is up but the response time is unsatisfactory. Performance has degraded.
  • We just had a big outage. We need to investigate and do retrospective analysis to find out what happened to make sure it never happens again. To do such analysis we need helpful metrics collected beforehand.

3.3 Inventory and state control

While Zabbix is not an inventory system, it can still collect lots of information about the resource and, most importantly, detect changes, such as the system being restarted outside of a maintenance window, the version being updated or becoming outdated, and so on. So, make it part of your template checklist.

3.4 Security

If you know how to properly detect security issues with the resource, for example:

  • The resource version in use is subject to a CVE. Consider updating.
  • Misconfiguration causes the system to be publicly available without proper authentication when it should not be.

Then consider adding such items and triggers to your template as well.

4. Follow guidelines

Finally, there is the style of the template. How should you name your items? Templates? Triggers? If we all followed the same style when creating Zabbix templates, it wouldn’t really matter who made a template (you, Zabbix, or another community member from the other side of the globe), as template contents and layout would be predictable.

Following style guidelines and template core principles means that we can reuse each other’s templates as building bricks for our monitoring projects, saving time and gaining someone else’s knowledge of the monitored object.

That concludes the introduction to Zabbix template guidelines, a comprehensive set of rules for how we build templates in Zabbix.

I recommend reading the guide if:

  • You want to share your template with the rest of the world
  • You want to avoid common mistakes when creating a template
  • As a hardware or software vendor, you want to provide a Zabbix template for your solution

Style guide

1.1 General

1.1.1 Avoid extensive template tuning

Try to keep everything default and simple in the template for as long as possible, for example, item attributes such as update interval, history, and trends. Change them only if there is a good reason to. Don’t waste time deciding whether the update interval should be 1 minute, 2 minutes, or maybe 2.5 minutes. Use the Pareto principle to get 80% of the template efficiency with 20% of the effort. Don’t over-engineer it unless there is a reason for it.

1.1.2 Template language

All template descriptions, names, and so on, must be created in the English language first. If you need a template in another language – consider maintaining two copies – English and localized version.

1.1.3 Everything enabled

All items, triggers, LLD rules, and other configuration entities should be enabled by default to make the template useful out of the box.

1.1.4 Avoid global macros

If user macros are used, define them in the template itself instead of using global macros - that way users get either the default values or an example of what the macro names are. If global macros are used, they are not exported along with the template.

1.1.5 Avoid global regexes

Avoid using global regexes in templates if possible, as they are not exported with the template. If global regex is used, document in the README what global regex with what values must be used with the template. (Note that since 4.0 you can use NOT to filter out negative results in LLD filters, see ZBXNEXT-2788)

1.1.6 Avoid trigger dependencies for triggers from different templates

Avoid trigger dependencies for triggers from different templates. Use global correlation and event tags instead.

1.1.7 Keep templates modular. Profile templates

Generally, to keep the template reusable and modular, a single template should be capable of monitoring a single resource or an inseparable set of resources only.

If you need to monitor multiple resources on the host (and you probably do) – consider creating a so-called ‘Profile’ or ‘Meta’ empty template and then link multiple resource templates to it.

Good: “Linux” and the other resource templates are all linked to a profile template named “LAMP”. Then, “LAMP” is linked to hosts “lamp1” and “lamp2”.
Bad: “Linux” and the other resource templates are all linked directly to hosts “lamp1” and “lamp2”.

It is also a good place to redefine user macros on the profile template level if needed.
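
To illustrate why redefining a macro on the profile template works, here is a minimal sketch of macro precedence: host-level values win, then templates linked closer to the host, then more distant ones. The function name and the macro `{$CPU.UTIL.CRIT}` with its values are hypothetical; the sketch models only the lookup order, not Zabbix itself.

```python
def effective_macros(host_macros, linked_templates):
    """Merge user macros; linked_templates is ordered from closest to most distant."""
    resolved = {}
    # Walk from the most distant template level to the closest one,
    # so that closer levels overwrite more distant ones.
    for level in reversed(linked_templates):
        resolved.update(level)
    # Host-level macros have the highest priority.
    resolved.update(host_macros)
    return resolved


linux = {"{$CPU.UTIL.CRIT}": "90"}         # default in the resource template
lamp_profile = {"{$CPU.UTIL.CRIT}": "80"}  # redefined on the profile template

on_host = effective_macros({}, [lamp_profile, linux])
tuned_host = effective_macros({"{$CPU.UTIL.CRIT}": "95"}, [lamp_profile, linux])
```

Here every host linked to the profile gets the value 80, while a single host can still override it without touching either template.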

1.2 Templates

1.2.1 Template name

All parts are separated by spaces; underscores should not be used.

All names (group, template, item, trigger, graph, tag, dashboard, discovery) use normal case inside the specific part – for example, “Zabbix server”.

To distinguish templates, the most popular data collection method can be stated as an extra suffix at the end of the name, for example: “by SNMPv1”, “by SNMPv2”, “by SNMPv3”, “by Zabbix agent”, “by Zabbix agent active”, “by IPMI”, “by JMX”, “by ODBC” and so on.

Since Zabbix 5.2, standard template names no longer include the word “Template” and the <Category short name> prefix. A name should start with the <Template name> itself (the specific part) and may optionally be followed by the data collection method.
Prior to 5.2: Template App Nginx by HTTP, Template DB MySQL
Since 5.2: Nginx by HTTP, MySQL

Good:
  • Nginx by HTTP
  • Brocade switch by SNMPv2

Bad:
  • Brocade switch SNMPv2
  • Brocade switch_by SNMPv2
  • Agent 2 template NGINX
  • Template DB MySQL
  • SNMP Brocade switch
  • Brocade switch Template
  • Template Brocade Switch

1.2.2 Template visible name

Currently, we suggest leaving the template visible name empty.

1.2.3 Template description

Use this field to provide a short overview of the template, including:

  • Short description
  • Template homepage URL (wherever the template is hosted)
  • Template author
  • If documentation is quite short – documentation can be provided inline
  • Current template version
  • The simple changelog can be provided as well

1.2.4 Choosing a template group

All templates must be added into a template subgroup called Templates/<Category Full Name>.

Good: “Cisco by SNMP” added into “Templates/Network Devices”
Bad: “Cisco by SNMP” added into “Datacenter/Network” host group

1.3 Items

1.3.1 Naming an item

Choose a simple, descriptive name for each item.

Prefix item names (metric) with object name (metric location):

<metric location>: <metric name>, for example:

  • Interface eth0: Bits in
  • Interface eth0: Bits out

You may use “#” if the metric location is just a number or index:

  • #0: CPU utilization
  • #1: CPU utilization

Consider adding suffixes like “per second”, “per hour” etc to describe the metric better.

Do not use user macros or $1 positional macros in item names: they are deprecated, and Zabbix 6.0 removed support for them.

Consider prefixing your item name with “Get” if this is a master item, to highlight that the item is a collector, not the final metric.

1.3.2 Keys

Keys should use the hierarchical dotted format.


Required to split metrics of one template from another. In the simplest case, this may be a short product name.

e.g: nginx, pgsql, pgbouncer, docker


Component or sub-resource of the monitored object. It could be hierarchical as well.

e.g: upstream, pool, db, db.table, db.client


For example: max_reached.

If possible, prefer to name metrics just as they are named in the monitored object itself with an exception if metric format there is completely different or metric name there is totally confusing and not human-friendly.

Every key must start with a letter and must use only Latin letters in the lower case in the base part.

If you need a space, replace it with an underscore “_”, e.g: response_time.

Remember that the max key length is 255 chars (including user params).

e.g: request_time, request_count

Consider appending .get for collectors, items that are responsible for retrieving data to be used in dependent items. (master items)

e.g: pgsql.db.get_connection […], nginx.get_stub, nginx.get_logs

Consider using .rate for per second metrics.

e.g: nginx.connections.accepted.rate

Consider using .total for accumulators.


In params, mandatory params come first and optional ones follow.
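
The key rules above lend themselves to an automated check. The following is a rough sketch, not an official validator: the `check_key` function is hypothetical, and it assumes that digits, dots, and underscores are acceptable in the base part after the first letter.

```python
import re

MAX_KEY_LENGTH = 255
# Assumption: after the leading lowercase letter, the base part may contain
# lowercase Latin letters, digits, dots and underscores.
BASE_PART = re.compile(r"[a-z][a-z0-9_.]*")


def check_key(key):
    """Return a list of guideline violations for an item key (empty if none)."""
    problems = []
    if len(key) > MAX_KEY_LENGTH:
        problems.append("key is longer than 255 characters")
    base = key.split("[", 1)[0]  # the part before the optional [params]
    if not BASE_PART.fullmatch(base):
        problems.append("base part must start with a letter and be lowercase")
    if "." not in base:
        problems.append("base part is not in hierarchical dotted format")
    return problems
```

For example, `check_key("nginx.connections.accepted.rate")` returns an empty list, while `check_key("Nginx.Connections")` reports the uppercase letters.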

1.3.3 Item description field

Use this field to describe:

  • Extended description of the item, for example, taken from vendor description of the metric
  • Why it is important
  • Provide a reference to the documentation if possible
  • Any additional information about how this item collects data or how it can be tuned, configured

1.3.4 Units

Don't forget to provide units wherever possible.

Add your units to the blacklist to stop automatic conversion where conversion is silly.

For example:

Use “!requests/s” to prevent “Krequests/s” from appearing.

Use preprocessing to transform GB, MB, KB to B (Bytes).

Use preprocessing to transform ms, minutes, hours to seconds.
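
The arithmetic behind such conversions is a simple multiplication. The sketch below shows it in Python for illustration only; in a template the same multipliers would be applied in a preprocessing step. The function names are hypothetical, and the use of binary multiples (1 KB = 1024 B) is an assumption that depends on how the monitored object reports sizes.

```python
# Multipliers to the base unit. Assumption: sizes are binary multiples.
TO_BYTES = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
TO_SECONDS = {"ms": 0.001, "s": 1, "minutes": 60, "hours": 3600}


def to_bytes(value, unit):
    """Convert a size expressed in B, KB, MB or GB to bytes."""
    return value * TO_BYTES[unit]


def to_seconds(value, unit):
    """Convert a duration expressed in ms, s, minutes or hours to seconds."""
    return value * TO_SECONDS[unit]
```

Storing base units lets Zabbix apply its own unit suffixes consistently in graphs and problem names.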

1.3.5 Value mapping

Always use value mappings where applicable, for example, when collecting discrete states.

1.3.6 Type of information

Take type restrictions into account when choosing which one to use.

Type of data as stored in the database after performing conversions, if any:

  • Numeric (unsigned) - 64bit unsigned integer
  • Numeric (float) - floating-point number. Negative values can be stored. Allowed range: -999999999999.9999 to 999999999999.9999. Starting with Zabbix 2.2, receiving values in scientific notation is also supported, e.g. 1e+7, 1e-4.
  • Character - short text data
  • Log - long text data with optional log related properties (timestamp, source, severity, logeventid)
  • Text - long text data

Limits of text data are described in the Zabbix documentation.

If your item is a rate (i.e., “Change per second” preprocessing is applied) – use Numeric(float).

Additionally, don’t forget to use Numeric(float) if you need to store negative integers.

1.3.7 Using time suffixes in update intervals, calculated item formulas

Always use time (1m, 5m, 1d…) suffixes in update intervals, history storage period, trends storage period, calculated item formulas to improve readability. Remember, that you can use them in user macros too.

By default, use:

  • Update interval: 1m
  • History: 7d
  • Trends: 365d

Also, consider using preprocessing steps 'Discard unchanged (with heartbeat)' when collecting items that change rarely like statuses or configuration data (e.g. serial numbers or hostname):

If the item is a health check:

1m with the heartbeat of 1h

If the item is an inventory item:

15m with the heartbeat of 24h
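
To make the effect of the heartbeat concrete, here is a small simulation of the idea behind 'Discard unchanged (with heartbeat)': a value is kept when it differs from the last kept value, or when the heartbeat period has passed since the last kept value. It is a simplified model under those assumptions, and the `throttle` function is a hypothetical name, not Zabbix code.

```python
def throttle(samples, heartbeat):
    """Return the (timestamp, value) pairs that would be stored.

    samples: list of (timestamp_in_seconds, value), ordered by time.
    heartbeat: period in seconds after which an unchanged value is stored anyway.
    """
    kept = []
    last_value = None
    last_kept_at = None
    for ts, value in samples:
        first = last_kept_at is None
        changed = not first and value != last_value
        heartbeat_due = not first and ts - last_kept_at >= heartbeat
        if first or changed or heartbeat_due:
            kept.append((ts, value))
            last_value = value
            last_kept_at = ts
    return kept


# A health check polled every 1m for two hours that always returns 1.
health = [(t, 1) for t in range(0, 7201, 60)]
stored = throttle(health, heartbeat=3600)
```

With a 1m interval and a 1h heartbeat, the 121 collected values shrink to three stored ones, while any status change is still stored immediately.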

If the item has a tag “data: raw” (master items or items only needed for other calculated items, see below) - set history to 0 and trends to 0, as you don't need to keep such intermediate values.

Please also note: never set an update interval of more than 1d, as you will not see such data in ‘Latest data’, since the Zabbix frontend does not consider values received more than 24h ago to be the latest.

1.3.9 Tags

Use tags to logically group items using the recommended tagging model.

Each regular item must have a mandatory set of tags:

target — specifies the monitoring object. For example, linux, nginx, mongodb, docker

  • Good: target: mysql. Bad: target: database
  • Good: target: vmware. Bad: target: vm

resource — specifies the monitored resource of the monitoring object. For example, cpu, memory, filesystem, network, database, battery, temperature, connection

  • Good: resource: cpu. Bad: resource: cpu_system
  • Good: resource: connection. Bad: resource: total_connections

transport — defines the data collection method. For example, http, agent, snmp, script, ipmi, dependent

  • Good: transport: http. Bad: transport: request_to_server

type — specifies the metric type. For example, gauge, counter, rate, state

  • gauge — current measurements, such as bytes of memory currently used or the number of items in a queue.
  • counter — measure discrete events. Common examples are the number of HTTP requests received, CPU seconds spent, or bytes sent.
  • rate — calculate value change per second. For example, the number of HTTP requests per second.
  • state — state of the monitored object, e.g. Up/Down, Enabled/Disabled

data: raw (optional) — assigned for the items whose values are needed for dependent or computable items.
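
A check for the mandatory tag set can be sketched as follows. The function name is hypothetical, and treating the four listed metric types as the only valid values for the type tag is an assumption made for this example.

```python
REQUIRED_TAGS = ("target", "resource", "transport", "type")
# Assumption: only the metric types listed above are accepted.
METRIC_TYPES = {"gauge", "counter", "rate", "state"}


def tag_problems(tags):
    """Return guideline violations for one item; tags maps tag name to value."""
    problems = [f"missing tag: {name}" for name in REQUIRED_TAGS if name not in tags]
    if "type" in tags and tags["type"] not in METRIC_TYPES:
        problems.append(f"unknown metric type: {tags['type']}")
    return problems


item_tags = {"target": "nginx", "resource": "connection", "transport": "http", "type": "rate"}
```

Here `tag_problems(item_tags)` returns an empty list, while an item without a transport tag is reported.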

Each LLD item requires adding a tag with the name of a discovered entity. In most cases, the name of the discovered entity should be defined in the name tag.

  • Good: name: {#DBNAME}. Bad: database: {#DBNAME}
  • Good: name: {#IFNAME}. Bad: interface: {#IFNAME}

In cases where the discovered entities require more than one property (e.g. name and instance), additional tags may be defined. For example, name: a, instance: b1

  • Good: name: {#NAME}, instance: {#INSTANCE}. Bad: name: “{#INSTANCE}_{#NAME}”

1.3.10 Calculated items

Use newlines and spaces to make long formulas human readable.

1.3.11 SNMP

SNMP OID field should not use any MIB objects, so templates would be working without MIBs imported. At the same time, provide metric name from MIB as an item key parameter and in the item description.

Leading '.' in OID should not be used.

Good: a plain numeric OID without a leading '.'
Bad: FOUNDRY-SN-AGENT-MIB::snAgGblCpuUtil1MinAvg.0, or a numeric OID with a leading '.'

Leave item field ‘Port’ empty for SNMP items. If left empty, then the port will be used from the SNMP host interface.
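
The two OID rules (numeric form only, no leading dot) can be checked with a short regular expression. This is a sketch with a hypothetical function name; the sample OID 1.3.6.1.2.1.1.3.0 (sysUpTime.0) is used only for illustration.

```python
import re

# Digits separated by dots, with no leading dot and no MIB object names.
NUMERIC_OID = re.compile(r"\d+(\.\d+)+")


def is_portable_oid(oid):
    """True if the OID works without MIBs imported and has no leading '.'."""
    return NUMERIC_OID.fullmatch(oid) is not None
```

Such a check could be run over all SNMP items of a template before sharing it.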

1.4 Discovery rules (LLD)

1.4.1 Naming

Choose a simple, descriptive name for each discovery rule. Make sure it always ends with the “discovery” word.

Good:
  • Network interface discovery
  • CPU core discovery

Bad:
  • Discovery of CPU cores

Items, triggers, graphs names generated from LLD should be prefixed with the discovery entity name they belong to. The only exception is the singleton discovery pattern.

1.4.2 Update interval

Use 1h. For advanced usage, see the best practices section.

1.4.3 Keep lost resources period

Keep it to default: 30d.

1.4.4 Filters

Use user macros in filters

Consider using user macros in filters, two for each useful LLD macro.


Good: {$NET.IF.IFNAME.NOT_MATCHES} = (^Software Loopback Interface|^NULL[0-9.]*$|^[Ll]o[0-9.]*$|^[Ss]ystem$|^Nu[0-9.]*$)

Bad: {#IFNAME} MATCHES "@Network interfaces for discovery"

That way, filters can be redefined on a linked template or host level without changing the template itself.
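
The behaviour of such a macro pair can be tried out before putting it into a template. The sketch below applies the {$NET.IF.IFNAME.NOT_MATCHES} pattern shown in this section with Python's `re` module, which handles this particular pattern the same way. The "matches everything" default for the companion macro and the interface names are assumptions for the example.

```python
import re

# Pattern of {$NET.IF.IFNAME.NOT_MATCHES} from this section.
NOT_MATCHES = (
    r"(^Software Loopback Interface|^NULL[0-9.]*$|^[Ll]o[0-9.]*$"
    r"|^[Ss]ystem$|^Nu[0-9.]*$)"
)
# Assumption: the companion "matches" macro accepts everything by default.
MATCHES = r"^.*$"


def passes_filter(ifname, matches=MATCHES, not_matches=NOT_MATCHES):
    """True if the interface would be discovered with the given filter values."""
    return (re.search(matches, ifname) is not None
            and re.search(not_matches, ifname) is None)


interfaces = ["eth0", "lo", "Software Loopback Interface 1", "NULL0", "ens192"]
discovered = [name for name in interfaces if passes_filter(name)]
```

Overriding either value on a host, for example passing `matches=r"^eth"`, narrows discovery without changing the template.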


When discovering SNMP OIDs, make sure to use regexes that match both MIB-translated values and raw values. This makes discovery filters work even if the proper MIBs are not loaded into Zabbix server or proxy.

For example, a filter for discovering only disks of resource type hrStorageFixedDisk:

Good: MATCHES .*(\.4|hrStorageFixedDisk)$
Bad: MATCHES .*(hrStorageFixedDisk)$
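
The difference between the two filters is easy to verify. In the sketch below, the sample values are assumptions about what a device may return for the storage type: the translated name when the MIB is loaded, and the raw OID ending in .4 (hrStorageFixedDisk) or .2 (hrStorageRam) when it is not.

```python
import re

BOTH_FORMS = r".*(\.4|hrStorageFixedDisk)$"
NAME_ONLY = r".*(hrStorageFixedDisk)$"

values = [
    ".1.3.6.1.2.1.25.2.1.4",                    # raw value, fixed disk
    "HOST-RESOURCES-TYPES::hrStorageFixedDisk",  # translated value, fixed disk
    ".1.3.6.1.2.1.25.2.1.2",                    # raw value, RAM
    "HOST-RESOURCES-TYPES::hrStorageRam",        # translated value, RAM
]

matched_both = [v for v in values if re.search(BOTH_FORMS, v)]
matched_name_only = [v for v in values if re.search(NAME_ONLY, v)]
```

The name-only filter silently discovers nothing on a server or proxy without the MIB, which is why the first form is preferred.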

1.5 Triggers and problems

1.5.1 Naming

Trigger names must be prefixed with the LLD object they belong to.

Trigger names should not use {HOST.NAME} macro to keep names shorter. Consider getting this data from the host column.

Avoid using {ITEM.LASTVALUE} in trigger name

Don’t use {ITEM.LASTVALUE1-9} macros directly in trigger names. As of 4.0, these macros are expanded to values when the problem name is generated, and the name stays that way afterwards.

Use them in the operational data field instead (available since Zabbix 4.4).

Explain the threshold in name

Consider explaining why trigger fired (threshold) in parenthesis ().

Good:
  • Temperature is too high (over 35 C for 5m)
  • CPU load is too high (over 1.5)
  • MySQL: Refused connections (max_connections limit reached)

Bad:
  • Temperature is too high (now: 40)
  • CPU load is too high
  • MySQL: Refused connections

1.5.2 Trigger description

Use this field to describe:

  • Describe the problem in more detail. But do not just copy the text from the trigger name.
  • Why it is important to check this
  • Describe the probable root cause of the problem if possible and which actions should be taken
  • Provide a reference to the documentation if any

1.5.3 Expressions

Trigger expressions should be reasonably flap-resistant - that is, not relying on the last value only but checking last 5 or 10 minutes instead. On the other hand, do not make the expressions overly complex - for example, do not use trigger hysteresis unless it really adds significant value.

Prefer to use user macros in trigger expressions to allow thresholds tuning.

Good: {template:temperature.last()}>{$TEMP.MAX.WARN}
Bad: {template:temperature.last()}>30

Use newlines and spaces to make long trigger expressions more human-readable.
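
Flap resistance can be illustrated with a small comparison: a check on the last value fires on a single spike, while a check that requires every value in the window to exceed the threshold does not. This is a plain Python sketch of the idea, not trigger syntax; the function names and sample values are made up, and the threshold 1.5 is taken from the CPU load example in the naming section.

```python
THRESHOLD = 1.5


def fires_on_last(values, threshold=THRESHOLD):
    """Fire as soon as the most recent value exceeds the threshold."""
    return values[-1] > threshold


def fires_on_window_min(values, threshold=THRESHOLD):
    """Fire only if every value in the window exceeded the threshold."""
    return min(values) > threshold


# One-minute CPU load samples over the last 5 minutes.
spike = [0.4, 0.5, 0.6, 0.5, 2.1]      # a single outlier
sustained = [1.8, 2.0, 2.1, 1.9, 2.2]  # a real, lasting problem
```

Both checks fire for the sustained load, but only the last-value check fires for the spike, which is exactly the kind of noise a template should avoid.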

1.5.4 Using time and data suffixes in triggers

Always use time (1m, 5m, 1d…) and size suffixes (1K, 1B, 1G) in trigger expressions and problem names, trigger description, operational data to improve readability. Remember, that you can use them in user macros, too.

Good: {<{$MEM_FREE.WARN} where {$MEM_FREE.WARN} = 100M
Bad: {<{$MEM_FREE.WARN} where {$MEM_FREE.WARN} = 104857600

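
The two forms above are numerically identical, which the following sketch demonstrates. It expands the memory suffixes (K, M, G, T as powers of 1024) and time suffixes (s, m, h, d, w); the `expand_suffix` function is a hypothetical helper that handles only whole numbers.

```python
SIZE = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}
TIME = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def expand_suffix(text):
    """Convert a value such as '100M' or '5m' to a plain integer."""
    suffix = text[-1]
    if suffix in SIZE:   # uppercase: size multiplier
        return int(text[:-1]) * SIZE[suffix]
    if suffix in TIME:   # lowercase: time multiplier
        return int(text[:-1]) * TIME[suffix]
    return int(text)     # no suffix
```

Note that case matters: `100M` is a size, while `100m` is one hundred minutes.
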
1.5.5 Severity

Triggers created in the templates are mapped to the standard Zabbix severity scale. Consider choosing the severity assigned to the trigger with the following in mind:

For each severity: description, examples, and expected reaction type and time (not always true, given as example only).

Not classified
  • Description: Not used under normal circumstances

Info
  • Description: The event happened that is not an alarm at all. This is the info that might be helpful in the future for retrospective analysis or for auditing.
  • Examples: s/n changed, user logged in, etc
  • Expected reaction: None

Warning
  • Description: A minor alarm that could lead to some more serious problem if left without attention.
  • Examples: Disk space is low but there is still some room
  • Expected reaction: React during working hours, no notification is expected.

Average
  • Description: Performance alarms: Average alarm that indicates serious performance problems or key service degradation. Fault alarms: partial resource failure or warnings that if left without attention might lead to complete device fault.
  • Examples: CPU utilization is high, Low memory, High device temperature, Disk health failure in the disk array, Website is slow.
  • Expected reaction: React during working hours, create an issue ticket if the problem stays for hours.

High
  • Description: Performance alarms: Key service is not available. Fault alarms: The device is not functioning or not available.
  • Examples: No ICMP PING, Website is down.
  • Expected reaction: React off working hours with the page if it affects services. React with a ticket during working hours otherwise.

Disaster
  • Description: Reserved for alarms indicating blackouts, disasters, global business service faults. There should be no triggers with disaster level severity in resource templates.
  • Examples: Riga DC is down, Level core network is down, >50% of users cannot purchase anything from our website.
  • Expected reaction: Always react by paging the responsible person.

1.6 Graphs and dashboards

1.6.1 Graph names

Graph names must be prefixed with the low-level discovery object they belong to.

Graph names can also be prefixed with a resource.

1.7 User macros

User macros and low-level macros accept only uppercase characters, that is [A-Z0-9._].

Consider using template specific prefix (namespace) to avoid potential conflicts with other templates.


Use macro context in objects from LLD. This way you can change and tune macros not only on the host level but on the LLD entity level.
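
How a macro with context falls back to its default can be sketched as a simple lookup. This models only exact-match contexts; the `resolve` function, the macro name `{$FS.PUSED.MAX.CRIT}`, and its values are hypothetical examples.

```python
def resolve(macros, name, context=None):
    """Return the context-specific value if defined, else the default value."""
    if context is not None:
        key = f'{{${name}:"{context}"}}'  # e.g. {$FS.PUSED.MAX.CRIT:"/var/log"}
        if key in macros:
            return macros[key]
    return macros[f"{{${name}}}"]         # e.g. {$FS.PUSED.MAX.CRIT}


host_macros = {
    "{$FS.PUSED.MAX.CRIT}": "90",             # default for every filesystem
    '{$FS.PUSED.MAX.CRIT:"/var/log"}': "95",  # tuned for one discovered entity
}
```

A trigger prototype that passes the LLD macro value as the context then uses 95 for /var/log and 90 for every other discovered filesystem.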


Use only widely accepted word shortenings in macro names, such as:


If there is no good short-form - prefer to set macro name long but clearly understandable.

1.7.1 Trigger macros

For macros used in trigger expressions (thresholds) use form:


Use MAX|MIN when you need to highlight whether it is the high or low threshold.


1.8 Files

Share your work as an XML file. The filename should start with the word “template”, followed by the category short name and then the exact template name. Use lowercase with spaces replaced by “_”.

Bad: Template App Nginx.xml

Store each template file in its own, separate directory. Create a README file or similar in this directory to describe what this template does and how to install it. Place user parameter files or any other files required to run this template into this directory as well.
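
The file naming rule is mechanical enough to express as a one-line transformation. The helper below is a hypothetical sketch of it, using the category short names from the category table that follows.

```python
def template_filename(category_short, template_name):
    """Build a file name: 'template', category short name, template name."""
    full_name = f"template {category_short} {template_name}"
    # Lowercase everything and replace spaces with underscores.
    return full_name.lower().replace(" ", "_") + ".xml"
```

For example, `template_filename("OS", "Linux by Zabbix agent active")` gives `template_os_linux_by_zabbix_agent_active.xml`.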

1.8.1 Pick a template category

You can create your own template categories. But first, consider using one of the recommended categories:

| Category full name | Category short name | Description | Template name → File name |
| --- | --- | --- | --- |
| Modules | Module | For all templates not intended for direct host linkage but often used as a dependency for other templates | Generic SNMPv2 → template_module_generic_snmpv2.xml |
| | | | HOST-RESOURCES-MIB SNMPv2 → template_module_host-resources-mib_snmpv2.xml |
| | | | Interfaces SNMPv2 → template_module_interfaces_snmpv2.xml |
| | | | Interfaces simple SNMPv2 → template_module_interfaces_simple_snmpv2.xml |
| | | | ICMP ping → template_module_icmp_ping.xml |
| Network devices | Net | For all network devices (or software) whose main role is networking, including switches, routers, wireless, firewalls, etc. | Generic device SNMPv2 → template_net_generic_device_snmpv2.xml |
| | | | Juniper SNMPv2 → template_net_juniper_snmpv2.xml |
| | | | Mikrotik SNMPv2 → template_net_mikrotik_snmpv2.xml |
| | | | Dell Force S-Series SNMPv1 → template_net_dell_force_s-series_snmpv1.xml |
| | | | Brocade FC SNMPv1 → template_net_brocade_fc_snmpv1.xml |
| Storage devices | Storage | For FC and other storage devices | IBM Storwize by SNMPv1 → template_storage_ibm_storwize_by_snmpv1.xml |
| | | | EMC VNX → template_storage_emc_vnx.xml |
| Server hardware | Server | For server hardware (iLO, IMM, blades and so on) | IBM IMM2 by SNMPv2 → template_server_ibm_imm2_by_snmpv2.xml |
| | | | IBM IMM2 by IPMI → template_server_ibm_imm2_by_ipmi.xml |
| | | | HP iLO by SNMPv2 → template_server_hp_ilo_by_snmpv2.xml |
| Operating systems | OS | For server operating systems (Windows, Linux, OSX, ESXi by SNMP, Solaris and so on) | Linux → template_os_linux.xml |
| | | | Linux by Zabbix agent active → template_os_linux_by_zabbix_agent_active.xml |
| | | | Linux by SNMPv2 → template_os_linux_by_snmpv2.xml |
| | | | Linux VMware → template_os_linux_vmware.xml |
| | | | ESXi SNMPv2 → template_os_esxi_snmpv2.xml |
| | | | Solaris → template_os_solaris.xml |
| | | | Windows → template_os_windows.xml |
| | | | Windows XP by SNMPv2 → template_os_windows_xp_by_snmpv2.xml |
| Databases | DB | For all SQL, NoSQL and key-value storages | MySQL → template_db_mysql.xml |
| | | | Redis → template_db_redis.xml |
| | | | Oracle by ODBC → template_db_oracle_by_odbc.xml |
| Power | Power | For UPSes and other power category devices | Generic UPS by SNMPv2 → template_power_generic_ups_by_snmpv2.xml |
| | | | APC by SNMPv2 → template_power_apc_by_snmpv2.xml |
| | | | Eaton SNMPv2 → template_power_eaton_snmpv2.xml |
| Telephony | Tel | For hardware and software telephony systems (Asterisk, Panasonic, Avaya, etc.) including IP phones | Asterisk by SNMPv3 → template_tel_asterisk_by_snmpv3.xml |
| | | | Avaya → template_tel_avaya.xml |
| Virtualization | VM | For VMs, Hyper-V, VMware, Xen, KVM… | VMWare → template_vm_vmware.xml |
| | | | Hyper-V → template_vm_hyper-v.xml |
| | | | Xen → template_vm_xen.xml |
| Printers | Printer | For printers and MFPs | Printer generic by SNMPv2 → template_printer_generic_by_snmpv2.xml |
| | | | HP LaserJet → template_printer_hp_laserjet.xml |
| Applications | App | For software that doesn't fit in any category above | Generic Java JMX → template_app_generic_java_jmx.xml |
| | | | RabbitMQ → template_app_rabbitmq.xml |
| | | | Apache Tomcat JMX → template_app_apache_tomcat_jmx.xml |
| | | | Apache ActiveMQ → template_app_apache_activemq.xml |
| | | | Docker → template_app_docker.xml |
| | | | Apache2 → template_app_apache2.xml |
| | | | Nginx by HTTP → template_app_nginx_by_http.xml |
| Hardware | HW | For other hardware that doesn't fit in any category above | Netbotz by SNMPv2 → template_hw_netbotz_by_snmpv2.xml |
| | | | Siemens PLC by Modbus → template_hw_siemens_plc_by_modbus.xml |
| | | | Skycontrol by SNMPv2 → template_hw_skycontrol_by_snmpv2.xml |
| | | | Skycontrol SNMPv1 → template_hw_skycontrol_snmpv1.xml |
| | | | Netping → template_hw_netping.xml |

1.8.2 Readme file structure

It is very important to provide a clear explanation of what your template does and how it can be installed, configured, and tuned. Consider providing such documentation in a README file. The README file should contain the following sections:


Overview

Describe what this template is about and which versions of the monitored object it was tested on.


Setup

Provide clear step-by-step instructions on how to install the template.

Zabbix configuration

Provide info here on how the template can be tuned using macros and so on.

Template links

List all template links if any.

Discovery rules

List discovery rules with filters applied.

Items collected

List all items being collected.


Triggers

List all triggers.


Feedback

Describe how to provide feedback.


Demo

Optional. Provide some screenshots of the template in action.

Known issues

Describe all known limitations here.


References

Optional. Provide links to any templates that inspired you to create this one, or references to the official documentation of the monitored object.

Best practices

2.1 Discovering items and tackling unsupported items

Use low-level discovery as much as possible. This helps to avoid unsupported items and improves template flexibility.

Good: Discovering temperature sensors using LLD
Good: Discovering network interfaces using LLD
Good: Discovering CPU cores using LLD
Bad: Creating items such as CPU core #1 utilization, Sensor 1 temperature value, or network interface Fa0/0 statically, without using LLD

2.1.1 Discovery frequency

Low-level discovery is considered a heavy operation in Zabbix, so its frequency should be low. Consider starting at once per hour.

If discovery uses another frequently updated item as a source (item type = dependent item), apply “Discard unchanged with heartbeat” preprocessing to such discovery. You can use this preprocessing for other discoveries as well.

In such a case, you can also use discovery preprocessing to filter out the frequently changing parts of the low-level discovery data. For example, for data coming from the master item:

[
  {
    "volume_name": "my disk1",
    "volume_size": 1000000000000,
    "volume_used": 800000000000,
    "volume_updated_at": "2019-07-01 00:00:00"
  },
  {
    "volume_name": "my disk2",
    "volume_size": 1000000000000,
    "volume_used": 800000000000,
    "volume_updated_at": "2019-07-01 00:00:00"
  }
]

For such output, consider transforming this array using JavaScript or JSONPath preprocessing to:

[
  {"volume_name": "my disk1"},
  {"volume_name": "my disk2"}
]

Do this before applying the throttling (discard unchanged) rule, so that changing sizes and timestamps do not defeat it.
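
A minimal sketch of such a transformation is shown below, assuming the master item returns a JSON array of objects with a volume_name field as in the sample. The function name keepVolumeNames is hypothetical.

```javascript
// Keeps only the stable field needed for discovery and drops the fields
// (sizes, timestamps) that change on every poll.
function keepVolumeNames(value) {
    return JSON.stringify(JSON.parse(value).map(function (volume) {
        return {"volume_name": volume.volume_name};
    }));
}

var input = JSON.stringify([
    {"volume_name": "my disk1", "volume_used": 800000000000, "volume_updated_at": "2019-07-01 00:00:00"},
    {"volume_name": "my disk2", "volume_used": 800000000000, "volume_updated_at": "2019-07-01 00:00:00"}
]);
var stable = keepVolumeNames(input);
// stable is '[{"volume_name":"my disk1"},{"volume_name":"my disk2"}]'
```

In a Zabbix JavaScript preprocessing step the incoming data is available as value, so the step itself would only need the function and a final return keepVolumeNames(value);.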

2.1.2 Discovery with Zabbix trapper

When pushing items via Zabbix trapper protocol – consider pushing low-level discovery data as well since discovery items support it.

2.1.3 Use preprocessing to build low-level discovery on the fly

With JavaScript preprocessing and other powerful features, you can create low-level discovery data on the fly. Prefer this method over external discovery scripts:

  • To keep discovery rules clearly observable by all future template users
  • To keep discovery as a part of the monitoring solution – easily transferable as part of the template
  • To avoid external dependencies such as external discovery scripts

Example 1

Get Nginx Plus zones stats using Zabbix HTTP agent from URL such as this:

{
  "": {
    "processing": 0,
    "requests": 175276,
    "responses": {
      "1xx": 0,
      "2xx": 162948,
      "3xx": 10117,
      "4xx": 2125,
      "5xx": 8,
      "total": 175198
    },
    "discarded": 78,
    "received": 50484208,
    "sent": 7356417338
  },
  "": {
    "processing": 7,
    "requests": 448613,
    "responses": {
      "1xx": 0,
      "2xx": 305562,
      "3xx": 87065,
      "4xx": 23136,
      "5xx": 5127,
      "total": 420890
    },
    "discarded": 27716,
    "received": 137307886,
    "sent": 3989556941
  }
}

Feed this output to a discovery rule via a dependent item and apply JavaScript preprocessing such as this:

//parsing NGINX plus output:
var output = Object.keys(JSON.parse(value)).map(function(zone){
    return {"{#NGINX_ZONE}": zone};
});
return JSON.stringify({"data": output});

This turns the original JSON object into a fully LLD-compatible JSON array that can be used for NGINX zone discovery.
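
To see what this step produces, the same mapping can be run outside Zabbix. In the sketch below the wrapper function nginxZonesToLld and the zone names zone_a and zone_b are hypothetical, and the per-zone statistics are abbreviated.

```javascript
// Same mapping as in the preprocessing step, wrapped in a function so it
// can be run outside Zabbix.
function nginxZonesToLld(value) {
    var output = Object.keys(JSON.parse(value)).map(function (zone) {
        return {"{#NGINX_ZONE}": zone};
    });
    return JSON.stringify({"data": output});
}

// Hypothetical input with two zones.
var sample = JSON.stringify({
    "zone_a": {"processing": 0, "requests": 175276},
    "zone_b": {"processing": 7, "requests": 448613}
});
var lld = nginxZonesToLld(sample);
console.log(lld);
// prints {"data":[{"{#NGINX_ZONE}":"zone_a"},{"{#NGINX_ZONE}":"zone_b"}]}
```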

Example 2

Get disks stats using Zabbix agent vfs.file.contents[/proc/diskstats] item:

   7       0 loop0 2 0 10 0 0 0 0 0 0 0 0
   7       1 loop1 0 0 0 0 0 0 0 0 0 0 0
   7       2 loop2 0 0 0 0 0 0 0 0 0 0 0
   7       3 loop3 0 0 0 0 0 0 0 0 0 0 0
   7       4 loop4 0 0 0 0 0 0 0 0 0 0 0
   7       5 loop5 0 0 0 0 0 0 0 0 0 0 0
   7       6 loop6 0 0 0 0 0 0 0 0 0 0 0
   7       7 loop7 0 0 0 0 0 0 0 0 0 0 0
   8       0 sda 192218 21315 11221888 13020540 28630719 8482221 801446972 388811708 0 265066852 401774948
   8       1 sda1 252 59 11294 5424 6 0 12 464 0 4160 5888
   8       2 sda2 4 0 8 72 0 0 0 0 0 72 72
   8       5 sda5 191918 21256 11208378 13014352 22872982 8482221 801446960 215739516 0 99497600 228699704
 252       0 dm-0 186763 0 10985130 22979168 31930494 0 799946248 396490524 0 265080476 419505356
 252       1 dm-1 26897 0 220608 688352 187589 0 1500712 23501956 0 212608 24190464

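Feed this output to a regular item and then apply JavaScript preprocessing such as this:


var parsed = value.split("\n").reduce(function(acc, line) {
  // fields: major, minor, device name, then the statistics counters
  var fields = line.trim().split(/\s+/);
  if (fields.length < 3) {
    return acc; // skip blank or malformed lines
  }
  var devname = fields[2];
  acc["values"][devname] = fields;
  acc["lld"].push({"{#DEVNAME}": devname});
  return acc;
}, {"values":{}, "lld": []});

return JSON.stringify(parsed);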

Create new discovery rule with the item above as the master item. Apply additional preprocessing to this discovery rule:
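
The exact step is not reproduced here. As one possible sketch, assuming the master item returns the object built above with its values and lld members, a JavaScript step could pass only the lld array to the discovery rule (a JSONPath step with $.lld would be an equivalent). The function name extractLld is hypothetical.

```javascript
// Assumes the master item value has the shape {"values": {...}, "lld": [...]}
// and returns only the LLD array as a JSON string.
function extractLld(value) {
    return JSON.stringify(JSON.parse(value).lld);
}

var master = JSON.stringify({
    "values": {"sda": ["8", "0", "sda", "192218"]},
    "lld": [{"{#DEVNAME}": "sda"}]
});
var discovery = extractLld(master);
// discovery is '[{"{#DEVNAME}":"sda"}]'
```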


2.1.4 Singleton discovery

While low-level discovery was designed to automate the creation of items, triggers, and graphs for multiple similar entities such as network interfaces or disks, it can also be used as a simple filter for exclusive entities that either don’t exist or exist as a single instance.

This approach allows keeping a template clean, without users facing unsupported items when a template is applied to hosts with different configurations or versions of the monitored object.

To use the singleton pattern, you need to do the following:

  • Create a discovery rule. Use regular items or dependent items to get some value that is not in LLD format. As a brief example, let it be a regular item that returns the text ‘found’ or ‘missing’.
  • Use preprocessing in the low-level discovery rule:
    • Check that the received value matches your conditions and that items should be created
    • Using JavaScript preprocessing, add an empty LLD macro named {#SINGLETON} inside an LLD array of length 1

These two steps can be combined in a single line of JavaScript that would generate an LLD array.

return JSON.stringify(value === 'found' ? [{'{#SINGLETON}': ''}] : []);

Use the {#SINGLETON} macro inside the square brackets of all item prototype keys.

Append this macro to any graph prototype name.

The empty macro is required so that Zabbix can differentiate an item or graph from its prototype. When the macro is expanded after discovery, only the clean item or graph name remains, identical to the one you would define statically.

See MPM event discovery in the Zabbix 4.4 “Template App Apache by HTTP” template as an example. We will also describe it in more detail in our blog.

Good: MPM event singleton discovery in Template App Apache by HTTP (Zabbix 4.4)
Bad: Templates that monitor Apache HTTP server without the singleton approach, leaving MPM event metrics unsupported when the MPM event module is disabled

Note the constraint that the LLD macro must be inside the square brackets of the item key.

2.2 Getting items

2.2.1 Minimize external libraries dependencies when writing external scripts/modules if possible

If you need to resort to external scripts – think about making them portable and easy to install as well.

2.2.2 Preprocessing

Prefer Zabbix preprocessing over complex data parsing with scripts on the agent side:

  • To keep the Zabbix agent footprint as small as possible
  • To keep preprocessing rules clearly observable by all future template users
  • To keep preprocessing rules as a part of the monitoring solution – easily transferable as part of the template
  • To avoid maintaining two sets of preprocessing rules on Windows and Linux platforms

2.2.3 Master item + dependent items/preprocessing

Prefer a Zabbix master item with dependent items over multiple separate calls:

  • To keep the Zabbix footprint as small as possible, with fewer calls to the monitored objects

Reuse master item contents to create low-level discovery rules. Then reuse the master item values again in the items created from prototypes.

Master item history storage period

Master item values may be very large (ZBXNEXT-223), while they are only needed for preprocessing in dependent items. So, shrink the master item's history storage period to the minimum non-zero value, which is 1h.

But what if we set history = 0? This setting will give you some additional issues:

  • master item’s nodata() trigger will give you false positives
  • less info available for troubleshooting data collection

So we suggest 1h as a safer choice at this point.

2.2.4 Security and authentication

While passing passwords as user macros may sound convenient, avoid it as much as you can.

If you need to authenticate in order to gather metrics, prefer to create a user named zbx_monitor with read-only access.

2.2.5 Getting data with user parameters/external check

Prefer using user parameters/external check or modules with dependent items/preprocessing over Zabbix trapper if you can, since when using Zabbix trapper you have less control over data collection.

2.2.6 Getting data with Zabbix trapper

Prefer using Zabbix trapper over user parameters/external check if any of the following statements is true:

  • You need to send metrics from your own custom applications
  • Data collection is irregular (backup job, alarm signal, etc)
  • You need to send data with shifted timestamp
  • Data collection script can take more than 30 seconds to complete

2.2.7 Getting data with HTTP agent and web.page.get

Removing HTTP headers in web.page.get

As of Zabbix 4.4, web.page.get returns the HTTP headers and body together as the item value. So, to get valid JSON/XML data with this Zabbix agent key, use Regular expression preprocessing to remove the HTTP headers:

Regular expression
pattern: \n\s?\n(.*)
output: \1

This returns JSON/XML that you can then easily parse with JSONPath or XML XPath preprocessing.
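
As an alternative sketch, the same result can be obtained with a JavaScript preprocessing step that cuts the value at the first blank line, which separates the HTTP headers from the body. This is not the method described above; the function name stripHttpHeaders and the sample response are made up for illustration.

```javascript
// Returns everything after the first blank line (the header/body separator).
// Handles both \r\n and \n line endings; if no separator is found,
// the value is returned unchanged.
function stripHttpHeaders(value) {
    var match = /\r?\n\r?\n/.exec(value);
    return match ? value.slice(match.index + match[0].length) : value;
}

var response = "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n{\"status\":\"ok\"}";
var body = stripHttpHeaders(response);
// body is '{"status":"ok"}'
```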

2.3 Healthchecks and discrete states

Always use value mappings for discrete states passed as integers.

Consider using “Boolean to decimal” preprocessing if the item check result can only have two states, such as YES/NO or TRUE/FALSE, to preserve DB space, and then apply a simple value mapping.

Consider using “Discard unchanged with heartbeat” preprocessing for discrete states. This lets you poll more often, and so react to state changes much faster, without putting additional load on the Zabbix DB. Start with something like 10s/5m or 1m/30m (update interval/heartbeat). Note though that trigger functions such as count() or diff() may work differently, because unchanged values are not stored.
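
To illustrate why frequent polling stays cheap, here is a simplified simulation of this preprocessing step, assuming a value is kept when it differs from the previous one or when the heartbeat period has passed since the last kept value. The function name and the sample data are hypothetical; this is a sketch, not the Zabbix implementation.

```javascript
// Simulates "Discard unchanged with heartbeat": a sample is kept when its
// value differs from the last kept one, or when at least `heartbeat`
// seconds have passed since the last kept sample.
function discardUnchangedWithHeartbeat(samples, heartbeat) {
    var kept = [];
    var last = null;
    samples.forEach(function (sample) {
        var changed = last === null || sample.value !== last.value;
        if (changed || sample.clock - last.clock >= heartbeat) {
            kept.push(sample);
            last = sample;
        }
    });
    return kept;
}

// Status polled every 10 seconds for 10 minutes with a 5-minute heartbeat.
// The status is 1 (OK) except for a single failure at second 420.
var samples = [];
for (var clock = 0; clock <= 600; clock += 10) {
    samples.push({clock: clock, value: clock === 420 ? 0 : 1});
}
var kept = discardUnchangedWithHeartbeat(samples, 300);
// Only 4 of the 61 samples are kept: clocks 0, 300, 420 and 430.
```

The failure at second 420 and the recovery at second 430 are both stored immediately, while the unchanged values in between are discarded.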

2.3.1 Healthchecks and discrete states triggers

For health check triggers consider using simple trigger expression:


If your health check metric returns only integer values and not text statuses, you may also use:


for simplicity.

If your health check can return multiple different values, try to map them to the following triggers of different severity (simplified scale):

| Level | Suggested Zabbix severity | Trigger name | Trigger dependencies | Sample expressions |
| --- | --- | --- | --- | --- |
| Not OK | Information | Service X is not OK | depends on warning and critical level triggers | {TEMPLATE_NAME:METRIC.count(#1,{$SERVICE.STATUS.OK},ne)}=1 |
| Warning | Warning | Service X is in warning state | depends on critical level trigger | {TEMPLATE_NAME:METRIC.count(#1,{$SERVICE.STATUS.WARN},eq)}=1 |
| Critical | High or Average | Service X is in critical state | | {TEMPLATE_NAME:METRIC.count(#1,{$SERVICE.STATUS.CRIT},eq)}=1 |

Use the 'Not OK' level if there are too many bad statuses or not all of them are known.


Note 'ne' in Not OK expression.

If there are multiple metric values all indicating critical level, put them together in the single expression:

{TEMPLATE_NAME:METRIC.count(#1,{$SERVICE.STATUS.CRIT:"not_responding"},eq)}=1 or {TEMPLATE_NAME:METRIC.count(#1,{$SERVICE.STATUS.CRIT:"timeout"},eq)}=1

Note that you may use macro context to label different statuses.

For noisy items, consider adding recovery expression:


2.4 Collecting inventory and text description states

Consider using “Discard unchanged with heartbeat” preprocessing for inventory and other textual data that rarely changes. This lets you poll more often, and so notice inventory changes much faster, without putting additional load on the Zabbix DB. Start with something like 15m/1d. Note though that trigger functions such as count() or diff() may work differently, because unchanged values are not stored.

Always use this preprocessing step if rarely changing inventory field is collected from a general master item that is frequently polled.

2.5 Use trigger snippets

Check the following trigger snippets library and consider reusing configuration to avoid reinventing the wheel.

Case: Something has just been restarted

Trigger: <resource> has just been restarted (uptime < 10m)

Applicable for Uptime counters for a device, host, or running software/service
Name <resource> has been restarted (uptime < 10m)
Description <resource> uptime is less than 10 minutes
Expression {TEMPLATE_NAME:METRIC.last()}<10m
Recovery expression -
Recovery mode -
Manual close Yes
Severity Warning for the host. Info for all others.
Depends on -

Case: Any master item + preprocessing in dependent items

Trigger: Master item is not responding

<resource>: Failed to get items (no data for 30m)

Applicable for Any type of items used for bulk data collection
Expression {TEMPLATE_NAME:METRIC.nodata(30m)}=1
Recovery expression -
Recovery mode -
Manual close Yes
Severity Warning
Depends on If present: <Proc> is not running

Case: HTTP item + regex preprocessing in dependent items

Trigger: HTTP item is not responding

Applicable for HTTP items that provide output for future regex preprocessing.
Use ‘Headers and Body’ mode in the item.
Expression {TEMPLATE_NAME:METRIC.str("HTTP/1.1 200")}=0 or {TEMPLATE_NAME:METRIC.nodata(30m)}=1
Recovery expression -
Manual close Yes
Severity Warning
Depends on If present: <Proc> is not running

Case: <VALUE> is too high (over X)/ is too low (under X) for slow to change values

For slow-changing values (e.g. temperature), use max() for high thresholds and min() for low thresholds to get an immediate response with a delayed (confirmed) recovery.

Trigger: <VALUE> is too high (over X)

Applicable for High temperature (slow to change)
Expression {TEMPLATE_NAME:METRIC.max(5m)} > X

Trigger: <VALUE> is too low (under X)

Applicable for Low temperature (slow to change)
Expression {TEMPLATE_NAME:METRIC.min(5m)} < X

Case: <VALUE> is too high (over X for 5m)/ is too low (under X for 5m) for quick-to-change and jumpy values

For jumpy values, use min() for high thresholds and max() for low thresholds to make triggers more tolerant of spikes and noise.

Trigger: <VALUE> is too high (over X for 5m)

Applicable for CPU utilization (jumpy), signal strength(jumpy), network utilization
Expression {TEMPLATE_NAME:METRIC.min(5m)} > X

Trigger: <VALUE> is too low (under X for 5m)

Applicable for CPU utilization (jumpy), signal strength(jumpy), network utilization
Expression {TEMPLATE_NAME:METRIC.max(5m)} < X
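
The difference between the two approaches for a high threshold can be illustrated with a small simulation. This is a simplified sketch, assuming one sample per minute and a window that covers the samples of the last 5 minutes; the function name evaluateOver, the sample values, and the threshold of 90 are made up for illustration.

```javascript
// Evaluates "<func>(window) > threshold" at every sample, where func is
// "max" or "min" over the samples collected during the last `window` seconds.
function evaluateOver(samples, func, window, threshold) {
    return samples.map(function (sample) {
        var values = samples.filter(function (other) {
            return other.clock <= sample.clock && other.clock > sample.clock - window;
        }).map(function (other) {
            return other.value;
        });
        return Math[func].apply(null, values) > threshold;
    });
}

// One sample per minute: a single spike to 95 at minute 2,
// then a sustained rise to 95 from minute 10 onwards.
var samples = [];
for (var minute = 0; minute < 18; minute++) {
    samples.push({clock: minute * 60, value: (minute === 2 || minute >= 10) ? 95 : 50});
}

var withMax = evaluateOver(samples, "max", 300, 90);
var withMin = evaluateOver(samples, "min", 300, 90);
// withMax fires at minute 2 (on the spike) and stays in problem state until minute 6.
// withMin ignores the spike and fires only at minute 14, after 5 minutes above 90.
```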

Case: Serial number has changed on the device

Trigger: Serial numbers controls

Applicable for Serial numbers items
Name <resource> has been replaced (new serial number received)
Description <resource> serial number has changed. Ack to close
Expression {TEMPLATE_NAME:METRIC.diff()}=1 and {TEMPLATE_NAME:METRIC.strlen()}>0
Recovery expression -
Recovery mode None
Manual close Yes
Severity Info
Depends on -

Case: Software version has changed on the device

Trigger: Version controls

Applicable for Software version items
Name <resource> version has changed (new version: {ITEM.VALUE})
Description <resource> version has changed. Ack to close
Expression {TEMPLATE_NAME:METRIC.diff()}=1 and {TEMPLATE_NAME:METRIC.strlen()}>0
Recovery expression -
Recovery mode None
Manual close Yes
Severity Info
Depends on -

Case: Control how much disk space is left

Trigger: Filesystem space is critically low with timeleft with macro


Applicable for Filesystems
Name Disk space is critically low (used > {$VFS.FS.PUSED.MAX.CRIT:"__RESOURCE__"})
Description Space used: {ITEM.VALUE3} of {ITEM.VALUE2} ({ITEM.VALUE1}), time left till full: < 24h.

Two conditions should match: First, space utilization should be above {$VFS.FS.PUSED.MAX.CRIT:"__RESOURCE__"}.

Second condition should be one of the following:
- The disk free space is less than 5G.
- The disk will be full in less than 24 hours.
Expression {TEMPLATE_NAME:vfs.fs.pused.last()}>{$VFS.FS.PUSED.MAX.CRIT:"__RESOURCE__"} and (({}-{TEMPLATE_NAME:vfs.fs.used.last()})<5G or {TEMPLATE_NAME:vfs.fs.pused.timeleft(1h,,100)}<1d)
Recovery expression -
Recovery mode None
Manual close Yes
Severity Average
Depends on -

Trigger: Filesystem space is low with timeleft with macro


Applicable for Filesystems
Name Disk space is low (used > {$VFS.FS.PUSED.MAX.WARN:"__RESOURCE__"})
Description Space used: {ITEM.VALUE3} of {ITEM.VALUE2} ({ITEM.VALUE1}), time left till full: < 24h.

Two conditions should match: First, space utilization should be above {$VFS.FS.PUSED.MAX.WARN:"__RESOURCE__"}.

Second condition should be one of the following:
- The disk free space is less than 10G.
- The disk will be full in less than 24 hours.
Expression {TEMPLATE_NAME:vfs.fs.pused.last()}>{$VFS.FS.PUSED.MAX.WARN:"__RESOURCE__"} and (({}-{TEMPLATE_NAME:vfs.fs.used.last()})<10G or {TEMPLATE_NAME:vfs.fs.pused.timeleft(1h,,100)}<1d)
Recovery expression -
Recovery mode None
Manual close Yes
Severity Warning
Depends on Disk space is critically low.
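
The logic of these two triggers can be sketched as a plain function. The code below only mirrors the structure of the conditions described above. The function name, the field names, and the 90% threshold are assumptions made for illustration, and the free space is passed in directly rather than calculated.

```javascript
var GIB = 1024 * 1024 * 1024;

// Utilization must exceed the threshold AND
// (free space is below a floor OR the disk is predicted to fill up soon).
function isDiskSpaceLow(fs, limits) {
    var overThreshold = fs.pused > limits.pusedMax;
    var littleFree = fs.freeBytes < limits.minFreeBytes;
    var fullSoon = fs.secondsUntilFull < limits.minSecondsLeft;
    return overThreshold && (littleFree || fullSoon);
}

// Hypothetical limits for the "critically low" trigger: 90%, 5G, 24 hours.
var critical = {pusedMax: 90, minFreeBytes: 5 * GIB, minSecondsLeft: 24 * 3600};

// A large disk that is 95% full but still has 500G free and is not
// filling up quickly does not fire.
var largeDisk = isDiskSpaceLow({pused: 95, freeBytes: 500 * GIB, secondsUntilFull: 30 * 24 * 3600}, critical);

// A small disk that is 95% full with 3G free does fire.
var smallDisk = isDiskSpaceLow({pused: 95, freeBytes: 3 * GIB, secondsUntilFull: 10 * 24 * 3600}, critical);
```

Combining a percentage threshold with an absolute floor and a time-left prediction keeps large, slowly filling filesystems from raising problems too early.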

2.6 Visualization, graphs, and dashboards

Consider adding custom graphs for items that could be correlated.

Good: Graph containing all items of different CPU modes(user, system…)

Consider adding dashboards (screens) to provide monitored object summary or a quick overview.

Good: Template App Zabbix Server

2.7 Usage of event tags

This section will be filled in the next version of the document.