Zabbix reacts to events by executing a set of operations. An action can be defined for any event or set of events generated by Zabbix.
Action attributes:
Parameter | Description |
---|---|
Name | Unique action name. |
Event source | Source of the event. Currently three sources are supported: Triggers - events generated by trigger status changes; Discovery - events generated by the network discovery module; Auto registration - events generated by new active agents |
Enable escalations | Enable escalations. If enabled, the action will be escalated according to operation steps defined for operations. |
Period (seconds) | Time period after which the escalation step is increased. |
Default subject | Default notification subject. The subject may contain macros. |
Default message | Default notification message. The message may contain macros. |
Recovery message | If enabled, Zabbix will send a recovery message after the original problem is resolved. The messages will be sent only to those who received any message regarding this problem before. |
Recovery subject | Subject of the recovery message. It may contain macros. |
Recovery message | Recovery message. It may contain macros. |
Status | Action status: Enabled - action is active; Disabled - action is disabled |
An action is executed only if an event matches the defined set of conditions.
The following conditions can be defined for trigger based events:
Condition type | Supported operators | Description |
---|---|---|
Application | =, like, not like | = - event came from a trigger that refers to an item linked to the specified application; like - event came from a trigger that refers to an item linked to an application whose name contains the string; not like - event came from a trigger that refers to an item linked to an application whose name does not contain the string |
Host group | =, <> | Compare against the host group of the trigger that generated the event. = - event came from this host group; <> - event did not come from this host group |
Host template | =, <> | Compare against the host template the trigger belongs to. = - event came from a trigger inherited from this host template; <> - event did not come from a trigger inherited from this host template |
Host | =, <> | Compare against the host of the trigger that generated the event. = - event came from this host; <> - event did not come from this host |
Trigger | =, <> | Compare against the trigger that generated the event. = - event was generated by this trigger; <> - event was generated by another trigger |
Trigger description (name) | like, not like | Compare against the name of the trigger that generated the event. like - string can be found in the trigger name (case sensitive); not like - string cannot be found in the trigger name (case sensitive). Note: the entered value is compared to the trigger description (name) with all macros expanded. |
Trigger severity | =, <>, >=, <= | Compare with the trigger severity. = - equal to trigger severity; <> - not equal to trigger severity; >= - greater than or equal to trigger severity; <= - less than or equal to trigger severity |
Trigger value | = | Compare with the trigger value. = - equal to trigger value (OK or PROBLEM) |
Time period in | in | Event is within the time period. in - event time matches the time period. See the Time period specification page for a description of the format. |
Maintenance status | =, <> | Check whether the host is in maintenance. = - host is in maintenance mode; <> - host is not in maintenance mode |
Trigger value:
Trigger changes status from OK to PROBLEM (trigger value is PROBLEM); trigger changes status from PROBLEM to OK (trigger value is OK).
Status change OK→UNKNOWN→PROBLEM is treated as OK→PROBLEM, and PROBLEM→UNKNOWN→OK as PROBLEM→OK.
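The rule above can be illustrated with a short Python sketch. It is not Zabbix code: the function name `effective_transitions` and the list-of-strings input are hypothetical, and the handling of OK→UNKNOWN→OK (no event) is an assumption, since the text only describes the two cases that end in a different state.

```python
def effective_transitions(values):
    """Return (old, new) pairs for OK/PROBLEM changes, ignoring UNKNOWN.

    `values` is a chronological list of trigger states, for example
    ["OK", "UNKNOWN", "PROBLEM"].
    """
    transitions = []
    previous = None
    for value in values:
        if value == "UNKNOWN":
            # UNKNOWN is skipped, so OK -> UNKNOWN -> PROBLEM collapses
            # into a single OK -> PROBLEM change.
            continue
        if previous is not None and value != previous:
            transitions.append((previous, value))
        previous = value
    return transitions


print(effective_transitions(["OK", "UNKNOWN", "PROBLEM"]))
print(effective_transitions(["PROBLEM", "UNKNOWN", "OK"]))
```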
The following conditions can be defined for Discovery based events:
Condition type | Supported operators | Description |
---|---|---|
Host IP | =, <> | Check whether the IP address of a discovered host is in the range of IP addresses. = - host IP is in the range; <> - host IP is out of the range |
Service type | =, <> | Check whether a discovered service matches the specified service type. = - matches the discovered service; <> - event came from a different service |
Service port | =, <> | Check whether the TCP port number of a discovered service is in the range of ports. = - service port is in the range; <> - service port is out of the range |
Discovery status | = | Up - matches Host Up and Service Up events; Down - matches Host Down and Service Down events |
Uptime/Downtime | >=, <= | Downtime for Host Down and Service Down events, uptime for Host Up and Service Up events. >= - uptime/downtime is greater than or equal to the value; <= - uptime/downtime is less than or equal to the value. The parameter is given in seconds. |
Received value | =, <>, >=, <=, like, not like | Compare with the value received from an agent (Zabbix, SNMP). String comparison. = - equal to the value; <> - not equal to the value; >= - greater than or equal to the value; <= - less than or equal to the value; like - contains the substring; not like - does not contain the substring. The parameter is given as a string. |
For example this set of conditions (calculation type: AND/OR):
is evaluated as
(Host group = Oracle servers or Host group = MySQL servers) and (Trigger name like 'Database is down' or Trigger name like 'Database is unavailable')
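The grouping in the expression above (OR between conditions of the same type, AND between different types) can be sketched in Python. This is only an illustration under stated assumptions: `matches_and_or`, the `(type, predicate)` pairs and the event dictionary layout are hypothetical and do not reflect how Zabbix stores conditions.

```python
from collections import defaultdict


def matches_and_or(conditions, event):
    """Evaluate conditions with the AND/OR calculation type.

    Conditions of the same type are ORed together; the resulting
    groups are then ANDed.
    """
    groups = defaultdict(list)
    for condition_type, predicate in conditions:
        groups[condition_type].append(predicate(event))
    return all(any(results) for results in groups.values())


# Hypothetical representation of the four conditions from the example.
conditions = [
    ("host group", lambda e: e["host_group"] == "Oracle servers"),
    ("host group", lambda e: e["host_group"] == "MySQL servers"),
    ("trigger name", lambda e: "Database is down" in e["trigger_name"]),
    ("trigger name", lambda e: "Database is unavailable" in e["trigger_name"]),
]

event = {"host_group": "MySQL servers", "trigger_name": "Database is down on db1"}
result = matches_and_or(conditions, event)
print(result)
```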
An operation or a set of operations is executed when an event matches the conditions.
Zabbix supports the following operations:
Additional operations available for discovery events:
When adding a host, its name is determined by the standard gethostbyname function. If the host can be resolved, the resolved name is used; if not, the IP address is used. If an IPv6 address must be used as a host name, all “:” (colons) are replaced by “_” (underscores), since colons are not allowed in host names.
Operation attributes:
Parameter | Description |
---|---|
Step | If escalation is enabled for this action, escalation settings: From - execute for each step starting from this one; To - until this step (0 means all steps starting from From); Period - increase the step number after this period (0 means use the default period). |
Operation type | Type of action: Send message - send a message to a user; Execute command - execute a remote command |
Event Source | |
Send message to | Send message to: Single user - a single user; User group - all members of a group |
Default message | If selected, default message will be used. |
Subject | Subject of the message. The subject may contain macros. |
Message | The message itself. The message may contain macros. |
Remote command | List of remote commands. |
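
The Step settings (From, To, Period) can be read as the following small Python sketch. The function names are hypothetical; the only rules encoded are the ones stated in the table: a To value of 0 means "all steps starting from From", and a Period of 0 means "use the default period".

```python
def operation_runs_at(step, step_from, step_to):
    """Return True if an operation should be executed at escalation `step`."""
    if step < step_from:
        return False
    # step_to == 0 means there is no upper bound.
    return step_to == 0 or step <= step_to


def step_duration(operation_period, default_period):
    """Return how long (in seconds) to wait before the next step."""
    return operation_period if operation_period != 0 else default_period


# An operation configured with From=2, To=3 and Period=0, with a
# default action period of 3600 seconds.
for step in range(1, 5):
    print(step, operation_runs_at(step, 2, 3), step_duration(0, 3600))
```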
The macros can be used for more efficient reporting.
Subject:
{TRIGGER.NAME}: {TRIGGER.STATUS}
Message subject will be replaced by something like:
Processor load is too high on server zabbix.zabbix.com: PROBLEM
Message:
Processor load is: {zabbix.zabbix.com:system.cpu.load[,avg1].last(0)}
The message will be replaced by something like:
Processor load is: 1.45
Message:
Latest value: {{HOSTNAME}:{TRIGGER.KEY}.last(0)} MAX for 15 minutes: {{HOSTNAME}:{TRIGGER.KEY}.max(900)} MIN for 15 minutes: {{HOSTNAME}:{TRIGGER.KEY}.min(900)}
The message will be replaced by something like:
Latest value: 1.45 MAX for 15 minutes: 2.33 MIN for 15 minutes: 1.01
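To make the replacement step concrete, here is a minimal Python sketch of how a subject template could be expanded. It is not how Zabbix implements macros: `expand` and the `values` dictionary are hypothetical, the sketch covers only plain `{NAME}` macros (not `{host:key.func(param)}` expressions), and leaving unknown macros untouched is an assumption.

```python
import re

# Plain built-in style macros such as {TRIGGER.NAME} or {TRIGGER.STATUS}.
MACRO = re.compile(r"\{([A-Z][A-Z0-9_.]*)\}")


def expand(template, values):
    """Replace {NAME} macros with entries from `values`.

    Macros without a known value are left as they are (an assumption
    of this sketch).
    """
    return MACRO.sub(lambda m: str(values.get(m.group(1), m.group(0))), template)


subject = expand(
    "{TRIGGER.NAME}: {TRIGGER.STATUS}",
    {
        "TRIGGER.NAME": "Processor load is too high on server zabbix.zabbix.com",
        "TRIGGER.STATUS": "PROBLEM",
    },
)
print(subject)
```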
Zabbix supports a number of macros which may be used in various situations. Effective use of macros saves time and makes Zabbix configuration more transparent.
The table contains a complete list of macros supported by Zabbix. X means “supported”.
MACRO | 1 (Notifications and commands) | 2 (Discovery notifications) | 3 (Auto registration notifications) | 4 (GUI Scripts) | 5 (Item key's parameters) | 6 (Map labels1) | 7 (Trigger expressions) | 8 (Trigger names) | 9 (Item descriptions) | DESCRIPTION |
---|---|---|---|---|---|---|---|---|---|---|
{DATE} | X | X | X | Current date in yyyy.mm.dd. format. | ||||||
{DISCOVERY.DEVICE.IPADDRESS} | X | IP address of the discovered device. Available always, does not depend on host being added. | ||||||||
{DISCOVERY.DEVICE.STATUS} | X | Status of the discovered device: can be either UP or DOWN. | ||||||||
{DISCOVERY.DEVICE.UPTIME} | X | Time since the last change of discovery status for a particular device. For example: 1h 29m. For devices with status DOWN, this is the period of their downtime. | ||||||||
{DISCOVERY.RULE.NAME} | X | Name of the discovery rule that discovered the presence or absence of the device or service. | ||||||||
{DISCOVERY.SERVICE.NAME} | X | Name of the service that was discovered. For example: HTTP. | ||||||||
{DISCOVERY.SERVICE.PORT} | X | Port of the service that was discovered. For example: 80. | ||||||||
{DISCOVERY.SERVICE.STATUS} | X | Status of the discovered service: can be either UP or DOWN. | ||||||||
{DISCOVERY.SERVICE.UPTIME} | X | Time since the last change of discovery status for a particular service. For example: 1h 29m. For services with status DOWN, this is the period of their downtime. | ||||||||
{ESC.HISTORY} | X | Escalation history. Log of previously sent messages. Shows previously sent notifications, on which escalation step they were sent and their status (sent, in progress or failed). | ||||||||
{EVENT.ACK.HISTORY} | X | |||||||||
{EVENT.ACK.STATUS} | X | |||||||||
{EVENT.AGE} | X | X | X | Age of the event. Useful in escalated messages. | ||||||
{EVENT.DATE} | X | X | X | Date of the event. | ||||||
{EVENT.ID} | X | X | X | Numeric event ID which triggered this action. | ||||||
{EVENT.TIME} | X | X | X | Time of the event. | ||||||
{HOSTNAME<1-9>} | X | X | X | X | X | Host name of the Nth item of the trigger which caused a notification. Supported in auto registration notifications since 1.8.4. | ||||
{HOST.CONN<1-9>} | X | X | X | X | IP and host DNS name depending on host settings. | |||||
{HOST.DNS<1-9>} | X | X | X | X | Host DNS name. | |||||
{IPADDRESS<1-9>} | X | X | X | X | IP address of the Nth item of the trigger which caused a notification. | |||||
{ITEM.ID<1-9>} | X | Numeric ID of the Nth item of the trigger which caused a notification. Supported since 1.8.12. | ||||||||
{ITEM.LASTVALUE<1-9>} | X | X | The latest value of the Nth item of the trigger expression which caused a notification. Supported from Zabbix 1.4.3. It is alias to {{HOSTNAME}:{TRIGGER.KEY}.last(0)} | |||||||
{ITEM.LOG.AGE<1-9>} | X | |||||||||
{ITEM.LOG.DATE<1-9>} | X | |||||||||
{ITEM.LOG.EVENTID<1-9>} | X | |||||||||
{ITEM.LOG.NSEVERITY<1-9>} | X | |||||||||
{ITEM.LOG.SEVERITY<1-9>} | X | |||||||||
{ITEM.LOG.SOURCE<1-9>} | X | |||||||||
{ITEM.LOG.TIME<1-9>} | X | |||||||||
{ITEM.NAME<1-9>} | X | Name of the Nth item of the trigger which caused a notification. | ||||||||
{ITEM.VALUE<1-9>} | X | X | The latest value of the Nth item of the trigger expression if used for displaying triggers. Historical (when the event happened) value of the Nth item of the trigger expression if used for displaying events and notifications. Supported from Zabbix 1.4.3. | |||||||
{NODE.ID<1-9>} | X | X | X | |||||||
{NODE.NAME<1-9>} | X | X | X | |||||||
{PROFILE.CONTACT<1-9>} | X | Contact from host profile. | ||||||||
{PROFILE.DEVICETYPE<1-9>} | X | Device type from host profile. | ||||||||
{PROFILE.HARDWARE<1-9>} | X | Hardware from host profile. | ||||||||
{PROFILE.LOCATION<1-9>} | X | Location from host profile. | ||||||||
{PROFILE.MACADDRESS<1-9>} | X | MAC address from host profile. | ||||||||
{PROFILE.NAME<1-9>} | X | Name from host profile. | ||||||||
{PROFILE.NOTES<1-9>} | X | Notes from host profile. | ||||||||
{PROFILE.OS<1-9>} | X | OS from host profile. | ||||||||
{PROFILE.SERIALNO<1-9>} | X | Serial No from host profile. | ||||||||
{PROFILE.SOFTWARE<1-9>} | X | Software from host profile. | ||||||||
{PROFILE.TAG<1-9>} | X | Tag from host profile. | ||||||||
{PROXY.NAME<1-9>} | X | X | X | Proxy name of the Nth item of the trigger which caused a notification. Supported since 1.8.4. | ||||||
{TIME} | X | X | X | Current time in hh:mm:ss. | ||||||
{TRIGGER.COMMENT} | X | Trigger comment. | ||||||||
{TRIGGER.EVENTS.UNACK} | X | X | Number of unacknowledged events for a map element in maps, or for the trigger which generated current event in notifications. Supported in map element labels since 1.8.3. | |||||||
{TRIGGER.EVENTS.PROBLEM.UNACK} | X | X | Number of unacknowledged PROBLEM events for all triggers disregarding their state. Supported since 1.8.3. | |||||||
{TRIGGER.PROBLEM.EVENTS.PROBLEM.UNACK} | X | Number of unacknowledged PROBLEM events for triggers in PROBLEM state. Supported since 1.8.3. | ||||||||
{TRIGGER.EVENTS.ACK} | X | X | Number of acknowledged events for a map element in maps, or for the trigger which generated current event in notifications. Supported since 1.8.3. | |||||||
{TRIGGER.EVENTS.PROBLEM.ACK} | X | X | Number of acknowledged PROBLEM events for all triggers disregarding their state. Supported since 1.8.3. | |||||||
{TRIGGER.PROBLEM.EVENTS.PROBLEM.ACK} | X | Number of acknowledged PROBLEM events for triggers in PROBLEM state. Supported since 1.8.3. | ||||||||
{TRIGGER.EXPRESSION} | X | Trigger expression. Supported since 1.8.12. | ||||||||
{TRIGGER.ID} | X | Numeric trigger ID which triggered this action. | ||||||||
{TRIGGER.KEY<1-9>} | X | Key of the Nth item of the trigger which caused a notification. | ||||||||
{TRIGGER.NAME} | X | Name (description) of the trigger. | ||||||||
{TRIGGER.NSEVERITY} | X | Numerical trigger severity. Possible values: 0 - Not classified, 1 - Information, 2 - Warning, 3 - Average, 4 - High, 5 - Disaster. Supported starting from Zabbix 1.6.2. | ||||||||
{TRIGGER.SEVERITY} | X | Trigger severity. Possible values: Not classified, Information, Warning, Average, High, Disaster, Unknown | ||||||||
{TRIGGER.STATUS} | X | Trigger state. Can be either PROBLEM or OK. {STATUS} is deprecated. | ||||||||
{TRIGGER.URL} | X | Trigger URL. | ||||||||
{TRIGGER.VALUE} | X | X | Current trigger value: 0 - trigger is in OK state, 1 - trigger is in PROBLEM state, 2 - trigger is in UNKNOWN state. This macro can also be used in trigger expressions. | |||||||
{TRIGGERS.UNACK} | X | Number of unacknowledged triggers for a map element, disregarding trigger state. Trigger is considered to be unacknowledged if at least one of its PROBLEM events is unacknowledged. | ||||||||
{TRIGGERS.PROBLEM.UNACK} | X | Number of unacknowledged PROBLEM triggers for a map element. Trigger is considered to be unacknowledged if at least one of its PROBLEM events is unacknowledged. Supported since 1.8.3. | ||||||||
{TRIGGERS.ACK} | X | Number of acknowledged triggers for a map element, disregarding trigger state. Trigger is considered to be acknowledged if all of its PROBLEM events are acknowledged. Supported since 1.8.3. | ||||||||
{TRIGGERS.PROBLEM.ACK} | X | Number of acknowledged PROBLEM triggers for a map element. Trigger is considered to be acknowledged if all of its PROBLEM events are acknowledged. Supported since 1.8.3. | ||||||||
{host:key.func(param)} | X | X2 | X | Simple macros as used in trigger expressions. | ||||||
{$MACRO} | X | X | X | X | User macros. Supported in trigger names and item descriptions since 1.8.4. | |||||
Macro {TRIGGER.ID} is supported in trigger URL since Zabbix 1.8.8.
For greater flexibility, Zabbix supports user macros, which can be defined on global, template and host level. These macros have a special syntax: {$MACRO}.
The macros can be used in:
The following characters are allowed in the macro names: A-Z , 0-9 , _ , .
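As a quick check of the naming rule, a user macro reference can be validated with a regular expression. This is a minimal sketch: `is_valid_user_macro` is a hypothetical helper, and rejecting an empty name (`{$}`) is an assumption not stated in the text.

```python
import re

# {$NAME} where NAME consists only of A-Z, 0-9, "_" and ".".
USER_MACRO = re.compile(r"\{\$[A-Z0-9_.]+\}")


def is_valid_user_macro(text):
    """Return True if `text` is a well-formed user macro reference."""
    return USER_MACRO.fullmatch(text) is not None


for candidate in ["{$SSH_PORT}", "{$MAX_CPULOAD}", "{$max_cpuload}", "{SSH_PORT}"]:
    print(candidate, is_valid_user_macro(candidate))
```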
Zabbix substitutes macros according to the following precedence:
In other words, if a macro does not exist for a host, Zabbix will try to find it in host templates of increasing depth. If it is still not found, a global macro will be used, if it exists.
If Zabbix is unable to find a macro, the macro will not be substituted.
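The lookup order described above (host, then templates of increasing depth, then global) can be sketched in Python. The data layout is hypothetical: hosts and templates are plain dictionaries with `macros` and `templates` keys, template links are assumed to be acyclic, and the order among templates at the same depth (first listed wins) is an assumption of this sketch.

```python
def resolve_user_macro(name, host, global_macros):
    """Resolve a user macro such as "{$SSH_PORT}" for a host."""
    # 1. Host-level macros win.
    if name in host.get("macros", {}):
        return host["macros"][name]
    # 2. Templates, one depth level at a time.
    level = host.get("templates", [])
    while level:
        for template in level:
            if name in template.get("macros", {}):
                return template["macros"][name]
        level = [parent for t in level for parent in t.get("templates", [])]
    # 3. Global macros; if still not found, the macro is not substituted.
    return global_macros.get(name, name)


base = {"macros": {"{$SSH_PORT}": "22", "{$MAX_CPULOAD}": "5"}, "templates": []}
linux = {"macros": {"{$MAX_CPULOAD}": "3"}, "templates": [base]}
host = {"macros": {"{$SSH_PORT}": "2222"}, "templates": [linux]}
global_macros = {"{$CPULOAD_PERIOD}": "300"}

print(resolve_user_macro("{$MAX_CPULOAD}", host, global_macros))
```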
To define user macros, go to the corresponding locations in the frontend:
Most common use cases of global and host macros:
Use of host macro in item “Status of SSH daemon” key:
ssh,{$SSH_PORT}
Use of host macro in trigger “CPU load is too high”:
{ca_001:system.cpu.load[,avg1].last(0)}>{$MAX_CPULOAD}
Such a trigger would be created on the template, not edited in individual hosts.
Use of two macros in trigger “CPU load is too high”:
{ca_001:system.cpu.load[,avg1].min({$CPULOAD_PERIOD})}>{$MAX_CPULOAD}
Note that a macro can be used as a parameter of a trigger function, in this example the function min().
An application is a set of host items. For example, application 'MySQL Server' may contain all items which are related to the MySQL server: availability of MySQL, disk space, processor load, transactions per second, number of slow queries, etc.
An item may be linked with one or more applications.
Applications are used in Zabbix front-end to group items.
Custom (user defined) graphs allow the creation of complex graphs.
These graphs, once configured, can be easily accessed via Monitoring→Graphs.
Configuration of custom graphs can be accessed by navigating to Configuration→Templates or Configuration→Hosts and clicking on Graphs link for corresponding template or host.
1. if the first item was from a template, only from that template;
2. if the first item was from any host, from any host (but not from templates anymore)
A medium is a delivery channel for Zabbix alerts. Zero, one or more media types can be assigned to a user.
Email notification.
Notifications using Jabber messaging.
When sending notifications, Zabbix tries to look up a Jabber SRV record first, and if that fails, it uses an address record for that domain. Among Jabber SRV records, the one with the highest priority and maximum weight is chosen. If it fails, other records are not tried.
Looking up Jabber SRV records is supported since Zabbix 1.8.6. Prior to that Zabbix only tried an address record.
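The record selection can be sketched as below. This is an illustration only: `pick_srv_record` and the tuple layout are hypothetical, and "highest priority" is interpreted as the lowest numeric priority value, following the usual DNS SRV convention (RFC 2782). That interpretation is an assumption, as the text does not spell it out.

```python
def pick_srv_record(records):
    """Choose one SRV record from (priority, weight, target, port) tuples.

    Returns None when there are no SRV records, in which case the caller
    would fall back to an address record for the domain.
    """
    if not records:
        return None
    # Assumption: a lower priority number means a more preferred record;
    # among equal priorities the largest weight is taken.
    return min(records, key=lambda r: (r[0], -r[1]))


records = [
    (10, 5, "backup.example.com", 5222),
    (5, 1, "xmpp1.example.com", 5222),
    (5, 20, "xmpp2.example.com", 5222),
]
chosen = pick_srv_record(records)
print(chosen)
```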
Custom media scripts are executed from the path defined in the Zabbix server configuration file variable AlertScriptsPath. The script has three command line variables passed to it:
Environment variables are not preserved or created for the script, so they should be handled explicitly.
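A custom media script could be structured like the following Python sketch. The list of the three parameters is not reproduced above, so the order used here (recipient, subject, message) is an assumption; `parse_alert_args`, `main` and the `PATH` value are likewise hypothetical.

```python
import os


def parse_alert_args(argv):
    """Split the command line into the three values passed by the server.

    Assumption: the arguments are recipient, subject and message,
    in that order.
    """
    if len(argv) != 4:
        raise ValueError("expected exactly three arguments")
    recipient, subject, message = argv[1:4]
    return recipient, subject, message


def main(argv):
    recipient, subject, message = parse_alert_args(argv)
    # The environment is not preserved for the script, so anything it
    # relies on is set explicitly (the value below is only an example).
    os.environ.setdefault("PATH", "/usr/bin:/bin")
    return "to=%s subject=%s body=%s" % (recipient, subject, message)


# A real script would end with: print(main(sys.argv)), after importing sys.
print(main(["notify.py", "admin@example.com", "PROBLEM", "Processor load is: 1.45"]))
```

The script file itself would be placed in the directory named by AlertScriptsPath.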
Zabbix supports sending SMS messages using a serial GSM modem connected to the Zabbix server's serial port.
Make sure that:
Zabbix has been tested with the following GSM modems:
Use of templates is an excellent way of making maintenance of Zabbix much easier.
A template can be linked to a number of hosts. Items, triggers and graphs of the template are automatically added to the linked hosts. Change the definition of a template item (trigger, graph) and the change is automatically applied to the hosts.
Host template attributes:
Parameter | Description |
---|---|
Name | Unique template (host) name. The name must be unique within a Zabbix node. |
Groups | List of host groups the template belongs to. |
New group | Assign new host group to the template. |
Link with template | Used to create hierarchical templates. |
A host group may have zero, one or more hosts.
Host group attributes:
Parameter | Description |
---|---|
Group name | Unique host group name. The name must be unique within a Zabbix node. |
Hosts | List of hosts of this group. |
Zabbix does not support host dependencies. Host dependencies can be defined using a more flexible option: trigger dependencies.
How does it work?
A trigger may have a list of one or more triggers it depends on. The trigger will still change its status regardless of the state of the triggers in the list, but it will not generate notifications or actions if any trigger in the list is in the PROBLEM state.
Host dependency
Suppose you have two hosts: a router and a server. The server is behind the router. We want to receive only one notification if the router is down:
“The router is down”
instead of:
“The router is down” and “The host is down”
In order to achieve this, we create a trigger dependency:
"The host is down" depends on "The router is down"
If both the server and the router are down, Zabbix will not execute actions for trigger “The host is down”.
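The dependency check from this example can be sketched in a few lines of Python. The function and the two dictionaries are hypothetical; the sketch only encodes the rule stated above: actions are suppressed when any trigger in the dependency list is in the PROBLEM state.

```python
def should_run_actions(trigger, states, depends_on):
    """Return True if actions may be executed for `trigger`.

    `states` maps trigger names to "OK" or "PROBLEM";
    `depends_on` maps a trigger name to the triggers it depends on.
    """
    return not any(
        states.get(dependency) == "PROBLEM"
        for dependency in depends_on.get(trigger, [])
    )


depends_on = {"The host is down": ["The router is down"]}
states = {"The router is down": "PROBLEM", "The host is down": "PROBLEM"}

print(should_run_actions("The host is down", states, depends_on))
print(should_run_actions("The router is down", states, depends_on))
```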
An item is a single performance or availability check (metric).
A flexible parameter is a parameter which accepts an argument. For example, vfs.fs.size[*] is a flexible parameter. '*' is any string that will be passed as an argument to the parameter. Correct definition examples:
Item key format, including key parameters, must follow syntax rules. The following illustrations depict supported syntax. Allowed elements and characters at each point can be determined by following the arrows - if some block can be reached through the line, it is allowed, if not - it is not allowed.
Item key
To construct a valid item key, one starts with specifying the key name, then there's a choice to either have parameters or not - as depicted by the two lines that could be followed.
Key name
The key name itself has a limited range of allowed characters, which just follow each other. Allowed characters are:
0-9a-zA-Z_-.
Which means:
Key parameters
An item key can have multiple parameters that are comma separated.
Individual key parameter
Each key parameter can be either a quoted string, an unquoted string or an array.
The parameter can also be left empty, thus using the default value. In that case, the appropriate number of commas must be added if any further parameters are specified. For example, item key icmpping[,,200,,500] would specify that the interval between individual pings is 200 milliseconds, timeout - 500 milliseconds, and all other parameters are left at their defaults.
Parameter - quoted string
If the key parameter is a quoted string, any Unicode character is allowed, and included double quotes must be backslash escaped.
Parameter - unquoted string
If the key parameter is an unquoted string, any Unicode character is allowed except comma and right square bracket (]).
Parameter - array
If the key parameter is an array, it is again enclosed in square brackets, where individual parameters come following multiple parameters specifying rules and syntax.
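The syntax rules above can be combined into a small parser. This is a simplified Python sketch rather than the parser Zabbix uses: the function names are hypothetical, whitespace around parameters is not treated specially, and arrays are allowed only one level deep (both are assumptions of this sketch).

```python
import re

KEY_NAME = re.compile(r"[0-9a-zA-Z_.-]+")


def _parse_quoted(text, pos):
    """Parse a quoted string starting after the opening double quote."""
    out = []
    while pos < len(text):
        ch = text[pos]
        if ch == "\\" and pos + 1 < len(text) and text[pos + 1] == '"':
            out.append('"')  # backslash-escaped double quote
            pos += 2
        elif ch == '"':
            return "".join(out), pos + 1
        else:
            out.append(ch)
            pos += 1
    raise ValueError("unterminated quoted string")


def _parse_params(text, pos, allow_array):
    """Parse comma-separated parameters up to and including the closing ']'."""
    params = []
    while True:
        if pos >= len(text):
            raise ValueError("missing ']'")
        if text[pos] == '"':
            value, pos = _parse_quoted(text, pos + 1)
        elif text[pos] == "[":
            if not allow_array:
                raise ValueError("nested arrays are not handled in this sketch")
            value, pos = _parse_params(text, pos + 1, allow_array=False)
        else:
            end = pos
            while end < len(text) and text[end] not in ",]":
                end += 1
            value, pos = text[pos:end], end  # may be empty (default value)
        params.append(value)
        if pos >= len(text):
            raise ValueError("missing ']'")
        if text[pos] == ",":
            pos += 1
        elif text[pos] == "]":
            return params, pos + 1
        else:
            raise ValueError("unexpected character %r" % text[pos])


def parse_item_key(key):
    """Return (name, params); array parameters are returned as lists."""
    match = KEY_NAME.match(key)
    if not match:
        raise ValueError("invalid key name")
    name, pos = match.group(0), match.end()
    if pos == len(key):
        return name, []
    if key[pos] != "[":
        raise ValueError("invalid character in key name")
    params, pos = _parse_params(key, pos + 1, allow_array=True)
    if pos != len(key):
        raise ValueError("trailing characters after ']'")
    return name, params


print(parse_item_key("icmpping[,,200,,500]"))
```

Empty strings in the result mark parameters that were left at their defaults.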
The parameter “encoding” is used to specify encoding for processing corresponding item checks, so that data acquired will not be corrupted. For a list of supported encodings (code page identifiers), please consult respective documentation, such as documentation for libiconv (GNU Project) or Microsoft Windows SDK documentation for “Code Page Identifiers”. If an empty “encoding” parameter is passed, then ANSI with system specific extension (Windows) or UTF-8 (default locale for newer Unix/Linux distributions, see your system's settings) is used by default.
An item can become unsupported if its value cannot be retrieved for some reason. Such items are still rechecked at a fixed interval, configurable in the Administration section.
Parameter / system | 1 (Windows) | 2 (Linux 2.4) | 3 (Linux 2.6) | 4 (FreeBSD) | 5 (Solaris) | 6 (HP-UX) | 7 (AIX) | 8 (Tru64) | 9 (Mac OS X) | 10 (OpenBSD) | 11 (NetBSD) | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
agent.hostname | X | X | X | X | X | X | X | X | X | X | X | |
agent.ping | X | X | X | X | X | X | X | X | X | X | X | |
agent.version | X | X | X | X | X | X | X | X | X | X | X | |
kernel.maxfiles | - | X | X | X | - | - | - | ? | X | X | X | |
kernel.maxproc | - | - | X | X | X | - | - | ? | X | X | X | |
log[file,<regexp>,<encoding>,<maxlines>] | X | X | X | X | X | X | X | X | X | X | X | |
logrt[file_format,<regexp>,<encoding>,<maxlines>] | X | X | X | X | X | X | X | X | X | X | X | |
eventlog[name,<regexp>,<severity>,<source>,<eventid>, <maxlines>] | X | - | - | - | - | - | - | - | - | - | - | |
net.if.collisions[if] | - | X | X | X | X | - | X | - | - | X | r | |
net.if.in[if,<mode>] | X | X | X | X | X | - | X | - | - | X | r | |
mode ▲ | bytes (default) | X | X | X | X | X1 | - | X | - | - | X | r |
packets | X | X | X | X | X | - | X | - | - | X | r | |
errors | X | X | X | X | X1 | - | X | - | - | X | r | |
dropped | X | X | X | X | - | - | - | - | - | X | r | |
net.if.list | X | - | - | - | - | - | - | - | - | - | - | |
net.if.out[if,<mode>] | X | X | X | X | X | - | X | - | - | X | r | |
mode ▲ | bytes (default) | X | X | X | X | X1 | - | X | - | - | X | r |
packets | X | X | X | X | X | - | X | - | - | X | r | |
errors | X | X | X | X | X1 | - | X | - | - | X | r | |
dropped | X | X | X | - | - | - | - | - | - | - | - | |
net.if.total[if,<mode>] | X | X | X | X | X | - | X | - | - | X | r | |
mode ▲ | bytes (default) | X | X | X | X | X1 | - | X | - | - | X | r |
packets | X | X | X | X | X | - | X | - | - | X | r | |
errors | X | X | X | X | X1 | - | X | - | - | X | r | |
dropped | X | X | X | - | - | - | - | - | - | - | - | |
net.tcp.dns[<ip>,zone] | - | X | X | X | X | X | X | X | X | X | X | |
net.tcp.dns.query[<ip>,zone,<type>] | - | X | X | X | X | X | X | X | X | X | X | |
net.tcp.listen[port] | X | X | X | X | X | - | - | - | - | - | - | |
net.tcp.port[<ip>,port] | X | X | X | X | X | X | X | X | X | X | X | |
net.tcp.service[service,<ip>,<port>] | X | X | X | X | X | X | X | X | - | X | X | |
net.tcp.service.perf[service,<ip>,<port>] | X | X | X | X | X | X | X | X | - | X | X | |
net.udp.listen[port] | - | X | X | - | - | - | - | - | - | - | - | |
proc.mem[<name>,<user>,<mode>,<cmdline>] | - | X | X | X | X | - | X | X | ? | X | X | |
mode ▲ | sum (default) | - | X | X | X | X | - | X | X | ? | X | X |
avg | - | X | X | X | X | - | X | X | ? | X | X | |
max | - | X | X | X | X | - | X | X | ? | X | X | |
min | - | X | X | X | X | - | X | X | ? | X | X | |
proc.num[<name>,<user>,<state>,<cmdline>] | X | X | X | X | X | - | X | X | ? | X | X | |
state ▲ | all (default) | - | X | X | X | X | - | X | X | ? | X | X |
sleep | - | X | X | X | X | - | X | X | ? | X | X | |
zomb | - | X | X | X | X | - | X | X | ? | X | X | |
run | - | X | X | X | X | - | X | X | ? | X | X | |
sensor[device,sensor,<mode>] | - | X | - | - | - | - | - | - | - | X | - | |
services[<type>,<state>,<exclude>] | X | - | - | - | - | - | - | - | - | - | - | |
system.boottime | - | X | X | X | X | - | - | - | - | X | X | |
system.cpu.intr | - | X | X | X | X | - | X | - | - | X | X | |
system.cpu.load[<cpu>,<mode>] | X | X | X | X | X | X | - | X | ? | X | X | |
mode ▲ | avg1 (default) | X | X | X | X | X | X | - | X | ? | X | X |
avg5 | X | X | X | X | X | X | - | X | ? | X | X | |
avg15 | X | X | X | X | X | X | - | X | ? | X | X | |
system.cpu.num[<type>] | X | X | X | X | X | X | X | - | - | X | X | |
type ▲ | online (default) | X | X | X | X | X | X | X | - | - | X | X |
max | - | X | X | X | X | - | - | - | - | - | - | |
system.cpu.switches | - | X | X | X | X | - | X | - | - | X | X | |
system.cpu.util[<cpu>,<type>,<mode>] | X | X | X | X | X | X | X | X | ? | X | X | |
type ▲ | user (default) | - | X | X | X | X | X | X | X | ? | X | X |
nice | - | X | X | X | - | X | - | X | ? | X | X | |
idle | - | X | X | X | X | X | X | X | ? | X | X | |
system | X | X | X | X | - | X | X | X | ? | X | X | |
kernel | - | - | - | - | X | - | - | - | - | - | - | |
iowait | - | - | X | - | - | - | X | - | - | - | - | |
wait | - | - | - | - | X | - | - | - | - | - | - | |
interrupt | - | - | X | X | - | - | - | - | - | X | - | |
softirq | - | - | X | - | - | - | - | - | - | - | - | |
steal | - | - | X | - | - | - | - | - | - | - | - | |
mode ▲ | avg1 (default) | X | X | X | X | - | X | X | X | ? | X | - |
avg5 | X | X | X | X | - | X | X | - | ? | X | - | |
avg15 | X | X | X | X | - | X | X | - | ? | X | - | |
system.hostname[<type>] | X | X | X | X | X | X | X | X | X | X | X | |
system.localtime | X | X | X | X | X | X | X | X | X | X | X | |
type ▲ | utc (default) | X | X | X | X | X | X | X | X | X | X | X |
local | X | X | X | X | X | X | X | X | X | X | X | |
system.run[command,<mode>] | X | X | X | X | X | X | X | X | X | X | X | |
mode ▲ | wait (default) | X | X | X | X | X | X | X | X | X | X | X |
nowait | X | X | X | X | X | X | X | X | X | X | X | |
system.stat[resource,<type>] | - | - | - | - | - | - | X | - | - | - | - | |
system.swap.in[<device>,<type>] (specifying a device is only supported under Linux) | - | X | X | - | X | - | - | - | - | X | - | |
type ▲ (pages will only work if device was not specified) | count (default under all except Linux) | - | X | X | - | X | - | - | - | - | X | - |
sectors | - | X | X | - | - | - | - | - | - | - | - | |
pages (default under Linux) | - | X | X | - | X | - | - | - | - | X | - | |
system.swap.out[<device>,<type>] (specifying a device is only supported under Linux) | - | X | X | - | X | - | - | - | - | X | - | |
type ▲ (pages will only work if device was not specified) | count (default under all except Linux) | - | X | X | - | X | - | - | - | - | X | - |
sectors | - | X | X | - | - | - | - | - | - | - | - | |
pages (default under Linux) | - | X | X | - | X | - | - | - | - | X | - | |
system.swap.size[<device>,<type>] | X | X | X | X | X | - | - | X | ? | X | - | |
type ▲ | free (default) | X | X | X | X | X | - | - | X | ? | X | - |
total | X | X | X | X | X | - | - | X | ? | X | - | |
used | - | X | X | X | - | - | - | - | - | X | - | |
pfree | - | X | X | X | X | - | - | - | ? | X | - | |
pused | - | X | X | X | X | - | - | - | ? | X | - | |
system.uname | X | X | X | X | X | X | X | X | - | X | X | |
system.uptime | X | X | X | X | X | - | X | ? | ? | X | X | |
system.users.num | - | X | X | X | X | X | X | X | - | X | X | |
vfs.dev.read[<device>,<type>,<mode>] | - | X | X | X | X | - | - | - | - | X | - | |
type ▲ (defaults are different under various OSes) | sectors | - | X | X | - | - | - | - | - | - | - | - |
operations | - | X | X | X | X | - | - | - | - | X | - | |
bytes | - | - | - | X | X | - | - | - | - | X | - | |
sps | - | X | X | - | - | - | - | - | - | - | - | |
ops | - | X | X | X | - | - | - | - | - | - | - | |
bps | - | - | - | X | - | - | - | - | - | - | - | |
mode ▲ (compatible only with type in: sps, ops, bps) | avg1 (default) | - | X | X | X | - | - | - | - | - | i | - |
avg5 | - | X | X | X | - | - | - | - | - | i | - | |
avg15 | - | X | X | X | - | - | - | - | - | i | - | |
vfs.dev.write[<device>,<type>,<mode>] | - | X | X | X | X | - | - | - | - | X | - | |
type ▲ (defaults are different under various OSes) | sectors | - | X | X | - | - | - | - | - | - | - | - |
operations | - | X | X | X | X | - | - | - | - | X | - | |
bytes | - | - | - | X | X | - | - | - | - | X | - | |
sps | - | X | X | - | - | - | - | - | - | - | - | |
ops | - | X | X | X | - | - | - | - | - | - | - | |
bps | - | - | - | X | - | - | - | - | - | - | - | |
mode ▲ (compatible only with type in: sps, ops, bps) | avg1 (default) | - | X | X | X | - | - | - | - | - | i | - |
avg5 | - | X | X | X | - | - | - | - | - | i | - | |
avg15 | - | X | X | X | - | - | - | - | - | i | - | |
vfs.file.cksum[file] | X | X | X | X | X | X | X | X | - | X | X | |
vfs.file.exists[file] | X | X | X | X | X | X | X | X | X | X | X | |
vfs.file.md5sum[file] | X | X | X | X | X | X | X | X | - | X | X | |
vfs.file.regexp[file,regexp,<encoding>] | X | X | X | X | X | X | X | X | - | X | X | |
vfs.file.regmatch[file,regexp,<encoding>] | X | X | X | X | X | X | X | X | - | X | X | |
vfs.file.size[file] | X | X | X | X | X | X | X | X | - | X | X | |
vfs.file.time[file,<mode>] | X | X | X | X | X | X | X | X | - | X | X | |
mode ▲ | modify (default) | X | X | X | X | X | X | X | X | - | X | X |
access | X | X | X | X | X | X | X | X | - | X | X | |
change | X | X | X | X | X | X | X | X | - | X | X | |
vfs.fs.inode[fs,<mode>] | - | X | X | X | X | X | X | X | ? | X | X | |
mode ▲ | total (default) | - | X | X | X | X | X | X | X | ? | X | X |
free | - | X | X | X | X | X | X | X | ? | X | X | |
used | - | X | X | X | X | X | X | X | ? | X | X | |
pfree | - | X | X | X | X | X | X | X | ? | X | X | |
pused | - | X | X | X | X | X | X | X | ? | X | X | |
vfs.fs.size[fs,<mode>] | X | X | X | X | X | X | X | X | ? | X | X | |
mode ▲ | total (default) | X | X | X | X | X | X | X | X | ? | X | X |
free | X | X | X | X | X | X | X | X | ? | X | X | |
used | X | X | X | X | X | X | X | X | ? | X | X | |
pfree | X | X | X | X | X | X | X | X | ? | X | X | |
pused | X | X | X | X | X | X | X | X | ? | X | X | |
vm.memory.size[<mode>] | X | X | X | X | X | X | X | X | ? | X | X | |
mode ▲ | total (default) | X | X | X | X | X | X | X | X | ? | X | X |
free | X | X | X | X | X | X | X | X | ? | X | X | |
used | - | - | - | X | - | - | - | - | - | X | X | |
shared | - | X | X | X | - | - | - | - | ? | X | X | |
buffers | - | X | X | - | - | - | - | - | ? | X | X | |
cached | X | X | X | X | - | - | X | - | ? | X | X | |
pfree | X | X | X | X | - | - | - | - | - | X | X | |
pused | - | - | - | X | - | - | - | - | - | X | X | |
available | - | X | X | - | - | - | - | - | - | - | - | |
web.page.get[host,<path>,<port>] | X | X | X | X | X | X | X | X | X | X | X | |
web.page.perf[host,<path>,<port>] | X | X | X | X | X | X | X | X | X | X | X | |
web.page.regexp[host,<path>,<port>,<regexp>,<length>] | X | X | X | X | X | X | X | X | X | X | X | |
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
[1] The values for these items are not supported for loopback interfaces on Solaris systems prior to Solaris 10 6/06, as byte, error and utilisation statistics are not stored and/or reported by the kernel. However, if you're monitoring a Solaris system via net-snmp, values may still be returned: net-snmp carries legacy code from cmu-snmp, dating back as far as 1997, that, upon failing to read byte values from the interface statistics, returns the packet counter (which does exist on loopback interfaces) multiplied by an arbitrary value of 308. This assumes that the average packet length is 308 octets, which is a very rough estimate, as the MTU limit on Solaris systems for loopback interfaces is 8892 bytes.
These values should not be assumed to be correct or even approximately accurate. They are guesstimates. The Zabbix agent does not do any guesswork, but net-snmp will return a value for these fields.
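To get a feel for how rough the net-snmp fallback is, the arithmetic from the footnote can be reproduced in a few lines. This is only an illustration: the function name and the packet count are made up, and the only figures taken from the text are the factor of 308 and the loopback MTU of 8892 bytes.

```python
# Average packet length (in octets) that the legacy net-snmp code assumes.
ASSUMED_PACKET_OCTETS = 308

# Loopback MTU on Solaris as quoted in the footnote above.
LOOPBACK_MTU_BYTES = 8892


def estimated_octets(packet_count: int) -> int:
    """Byte counter reported when only the packet counter is available."""
    return packet_count * ASSUMED_PACKET_OCTETS


# Hypothetical traffic: 1000 packets, every one filling the loopback MTU.
packets = 1000
reported = estimated_octets(packets)
worst_case_actual = packets * LOOPBACK_MTU_BYTES
print(reported, worst_case_actual)
```

In this worst case the reported value understates the real byte count by more than an order of magnitude, which is why such values should not be used for alerting.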
List of supported parameters
Key | ||||
---|---|---|---|---|
▲ | Description | Return value | Parameters | Comments |
agent.hostname | ||||
Returns agent host name. | String value | - | Returns the actual value of the agent hostname from a configuration file. This item is supported starting from version 1.8.13. |
|
agent.ping | ||||
Check the agent availability. | Returns '1' if agent is available, nothing if unavailable. | - | Use function nodata() to check for host unavailability. | |
agent.version | ||||
Version of Zabbix Agent. | String | - | Example of returned value: 1.8.2 | |
kernel.maxfiles | ||||
Maximum number of opened files supported by OS. | Number of files. Integer. | |||
kernel.maxproc | ||||
Maximum number of processes supported by OS. | Number of processes. Integer. | |||
log[file,<regexp>,<encoding>,<maxlines>] | ||||
Monitoring of log file. | Log. | file – full file name regexp – regular expression for pattern encoding - Code Page identifier maxlines - Maximum number of new lines per second the agent will send to Zabbix Server or Proxy. This parameter overrides the 'MaxLinesPerSecond' option in zabbix_agentd.conf | Must be configured as an Active Check. Example: log[/home/zabbix/logs/logfile,,,100] See detailed description. |
|
logrt[file_pattern,<regexp>,<encoding>,<maxlines>] | ||||
Monitoring of log file with log rotation support. | Log. | file_pattern – absolute path to file and regexp describing the file name pattern regexp – regular expression describing the required content pattern encoding - Code Page identifier maxlines - Maximum number of new lines per second the agent will send to Zabbix Server or Proxy. This parameter overrides the 'MaxLinesPerSecond' option in zabbix_agentd.conf | Must be configured as an Active Check. Examples: logrt["/home/zabbix/logs/^logfile[0-9]{1,3}$",,,100] - will match a file like "logfile1" (will not match ".logfile1") logrt["/home/user/logfile_.*_[0-9]{1,3}","pattern_to_match","UTF-8",100] - will collect data from files such as “logfile_abc_1” or “logfile__001”. Log rotation is based on last modification times of files. See detailed description. |
|
eventlog[name,<regexp>,<severity>,<source>,<eventid>,<maxlines>] | ||||
Monitoring of event logs. | Log. | name – event log name regexp – regular expression severity – regular expression The parameter accepts the following values: “Information”, “Warning”, “Error”, “Failure Audit”, “Success Audit” source - Source identifier eventid - regular expression maxlines - Maximum number of new lines per second the agent will send to Zabbix Server or Proxy. This parameter overrides the 'MaxLinesPerSecond' option in zabbix_agentd.conf | Must be configured as an Active Check. Examples: eventlog[Application] eventlog[Security,,"Failure Audit",,529|680] eventlog[System,,"Warning|Error"] eventlog[System,,,,^1$] eventlog[System,,,,@TWOSHORT] - here custom regular expression TWOSHORT is defined as type Result is TRUE and expression itself is ^1$|^70$. |
|
net.if.collisions[if] | ||||
Out-of-window collision. | Number of collisions. Integer. | if - interface | ||
net.if.in[if,<mode>] | ||||
Network interface incoming statistic. | Integer. | if - interface mode – bytes number of bytes (default) packets number of packets errors number of errors dropped number of dropped packets | Multi-byte interface names on Windows supported since Zabbix agent version 1.8.6. Examples: net.if.in[eth0,errors] net.if.in[eth0] You may use this key with Delta (speed per second) in order to get bytes per second statistics. |
|
net.if.list | ||||
List of network interfaces: Type Status IPv4 Description | String | Supported since Zabbix agent version 1.8.1. Multi-byte interface names supported since Zabbix agent version 1.8.6. Disabled interfaces are not listed. Note that enabling/disabling some components may change their ordering in the Windows interface name. |
||
net.if.out[if,<mode>] | ||||
Network interface outgoing statistic. | Integer. | if - interface mode – bytes number of bytes (default) packets number of packets errors number of errors dropped number of dropped packets | Multi-byte interface names on Windows supported since Zabbix agent version 1.8.6. Examples: net.if.out[eth0,errors] net.if.out[eth0] You may use this key with Delta (speed per second) in order to get bytes per second statistics. |
|
net.if.total[if,<mode>] | ||||
Sum of network interface incoming and outgoing statistics. | Integer. | if - interface mode – bytes number of bytes (default) packets number of packets errors number of errors dropped number of dropped packets | Examples: net.if.total[eth0,errors] net.if.total[eth0] You may use this key with Delta (speed per second) in order to get bytes per second statistics. Note that dropped packets are supported only if both net.if.in and net.if.out work for dropped packets on your platform. |
|
net.tcp.dns[<ip>,zone] | ||||
Checks if DNS service is up. | 0 - DNS is down 1 - DNS is up | ip - IP address of DNS server (ignored) zone - zone to test the DNS | Example: net.tcp.dns[127.0.0.1,zabbix.com] Internationalized domain names are not supported, please use IDNA encoded names instead. |
|
net.tcp.dns.query[<ip>,zone,<type>] | ||||
Performs a query for the supplied DNS record type. | On success returns a character string with the required type of information. | ip - IP address of DNS server (ignored) zone - zone to test the DNS type - Record type to be queried (default is SOA) | Example: net.tcp.dns.query[127.0.0.1,zabbix.com,MX] type can be one of: A, NS, CNAME, MB, MG, MR, PTR, MD, MF, MX, SOA, NULL, WKS, HINFO, MINFO, TXT, SRV SRV record type is supported on Unix since Zabbix agent version 1.8.6. Internationalized domain names are not supported, please use IDNA encoded names instead. |
|
net.tcp.listen[port] | ||||
Checks if this TCP port is in LISTEN state. | 0 - it is not 1 - it is in LISTEN state | port - TCP port number | Example: net.tcp.listen[80] On Linux supported since Zabbix agent version 1.8.4 |
|
net.tcp.port[<ip>,port] | ||||
Checks whether a TCP connection can be made to the specified port. | 0 - cannot connect 1 - can connect | ip - IP address (default is 127.0.0.1) port - port number | Example: net.tcp.port[,80] can be used to test availability of web server running on port 80. Old naming: check_port[*] For simple TCP performance testing use net.tcp.service.perf[tcp,<ip>,<port>] Note that these checks may result in additional messages in system daemon logfiles (SMTP and SSH sessions being logged usually). |
|
net.tcp.service[service,<ip>,<port>] | ||||
Check if service is running and accepting TCP connections. | 0 - service is down 1 - service is running | service - one of ssh, ntp, ldap, smtp, ftp, http, pop, nntp, imap, tcp ip - IP address (default is 127.0.0.1) port - port number (by default standard service port number is used) | Example: net.tcp.service[ftp,,45] can be used to test availability of FTP server on TCP port 45. Old naming: check_service[*] Note that before Zabbix version 1.8.3 service.ntp should be used instead of ntp. Note that these checks may result in additional messages in system daemon logfiles (SMTP and SSH sessions being logged usually). Checking of encrypted protocols (like IMAP on port 993 or POP on port 995) is currently not supported. As a workaround, please use net.tcp.port for checks like these. Checking of LDAP by Windows agent is currently not supported. |
|
net.tcp.service.perf[service,<ip>,<port>] | ||||
Check performance of service | 0 - service is down sec - number of seconds spent while connecting to the service | service - one of ssh, ntp, ldap, smtp, ftp, http, pop, nntp, imap, tcp ip - IP address (default is 127.0.0.1) port - port number (by default standard service port number is used) | Example: net.tcp.service.perf[ssh] can be used to test speed of initial response from SSH server. Old naming: check_service_perf[*] Note that before Zabbix version 1.8.3 service.ntp should be used instead of ntp. Checking of encrypted protocols (like IMAP on port 993 or POP on port 995) is currently not supported. As a workaround, please use net.tcp.service.perf[tcp,<ip>,<port>] for checks like these. Checking of LDAP by Windows agent is currently not supported. |
|
net.udp.listen[port] | ||||
Checks if this UDP port is in LISTEN state. | 0 - it is not 1 - it is in LISTEN state | port - UDP port number | Example: net.udp.listen[68] On Linux supported since Zabbix agent version 1.8.4 |
|
proc.mem[<name>,<user>,<mode>,<cmdline>] | ||||
Memory used by process name running under user user | Memory used by process. | name - process name user - user name (default is all users) mode - one of avg, max, min, sum (default) cmdline - filter by command line | Example: proc.mem[,root] - memory used by all processes running under user “root”. proc.mem[zabbix_server,zabbix] - memory used by all processes zabbix_server running under user zabbix proc.mem[,oracle,max,oracleZABBIX] - memory used by most memory hungry process running under oracle having oracleZABBIX in its command line |
|
proc.num[<name>,<user>,<state>,<cmdline>] | ||||
Number of processes name having state running under user user | Number of processes. | name - process name user - user name (default is all users) state - one of all (default), run, sleep, zomb cmdline - filter by command line | Example: proc.num[,mysql] - number of processes running under user mysql proc.num[apache2,www-data] - number of apache2 running under user www-data proc.num[,oracle,sleep,oracleZABBIX] - number of processes in sleep state running under oracle having oracleZABBIX in its command line On Windows, only name and user arguments are supported. |
|
sensor[device,sensor,<mode>] | ||||
Hardware sensor reading. | device - device name (if <mode> is used, it is a regular expression) sensor - sensor name (if <mode> is used, it is a regular expression) mode - one of avg, max, min (if omitted, device and sensor are treated verbatim). | On Linux 2.4, reads /proc/sys/dev/sensors. Example: sensor[w83781d-i2c-0-2d,temp1] Prior to Zabbix 1.8.4, format sensor[temp1] was used. On OpenBSD, reads hw.sensors MIB. Example: sensor[cpu0,temp0] - temperature of one CPU sensor["cpu[0-2]$",temp,avg] - average temperature of the first three CPUs Supported on OpenBSD since Zabbix 1.8.4. |
||
system.boottime | ||||
Timestamp of system boot. | Integer. | Time in seconds. | ||
system.cpu.intr | ||||
Device interrupts. | Integer. | |||
system.cpu.load[<cpu>,<mode>] | ||||
CPU load. | Processor load. Float. | cpu - CPU number (default is all CPUs, only default “all” is supported) mode - one of avg1 (default), avg5 (average within 5 minutes), avg15 | Example: system.cpu.load[] Old naming: system.cpu.loadX |
|
system.cpu.num[<type>] | ||||
Number of CPUs. | Number of available processors. | type - one of online (default), max | Example: system.cpu.num |
|
system.cpu.switches | ||||
Context switches. | Switches count. | Old naming: system[switches] | ||
system.cpu.util[<cpu>,<type>,<mode>] | ||||
CPU(s) utilisation. | Processor utilisation in percent | cpu - CPU number (default is all CPUs) type - one of idle, nice, user (default), system, kernel, iowait, interrupt, softirq, steal mode - one of avg1 (default), avg5 (average within 5 minutes), avg15 | Old naming: system.cpu.idleX, system.cpu.niceX, system.cpu.systemX, system.cpu.userX Example: system.cpu.util[0,user,avg5] |
|
system.hostname[<type>] | ||||
Returns host name. | String value | type (only on Windows, ignored on other systems) - netbios (default) or host | On Windows the value is acquired from either GetComputerName() (for netbios) or gethostname() (for host) function and from “hostname” command on other systems. Example of returned value www.zabbix.com Parameter for this item is supported starting from version 1.8.6. |
|
system.localtime | ||||
System time. | Integer or string value. | utc - (default) the time since the Epoch (00:00:00 UTC, January 1, 1970), measured in seconds. local - the time in the 'yyyy-mm-dd,hh:mm:ss.nn,+hh:mm' format | ||
system.run[command,<mode>] | ||||
Run specified command on the host. | Text result of the command | command - command for execution mode - one of wait (default, wait end of execution), nowait (do not wait) | Example: system.run[ls -l /] - detailed file list of root directory. Note: To enable this functionality, agent configuration file must have EnableRemoteCommands=1 option. |
|
system.stat[resource,<type>] | ||||
Virtual memory statistics | Numeric value | ent - number of processor units this partition is entitled to receive (float) kthr,<type> - information about kernel thread states: r - average number of runnable kernel threads (float) b - average number of kernel threads placed in the Virtual Memory Manager wait queue (float) memory,<type> - information about the usage of virtual and real memory: avm - active virtual pages (integer) fre - size of the free list (integer) page,<type> - information about page faults and paging activity: fi - file page-ins per second (float) fo - file page-outs per second (float) pi - pages paged in from paging space (float) po - pages paged out to paging space (float) fr - pages freed (page replacement) (float) sr - pages scanned by page-replacement algorithm (float) faults,<type> - trap and interrupt rate: in - device interrupts (float) sy - system calls (float) cs - kernel thread context switches (float) cpu,<type> - breakdown of percentage usage of processor time: us - user time (float) sy - system time (float) id - idle time (float) wa - idle time during which the system had outstanding disk/NFS I/O request(s) (float) pc - number of physical processors consumed (float) ec - the percentage of entitled capacity consumed (float) lbusy - indicates the percentage of logical processor(s) utilization that occurred while executing at the user and system level (float) app - indicates the available physical processors in the shared pool (float) disk,<type> - disk statistics: bps - indicates the amount of data transferred (read or written) to the drive in bytes per second (integer) tps - indicates the number of transfers per second that were issued to the physical disk/tape (float) This item is supported starting from version 1.8.1. |
||
system.swap.in[<device>,<type>] | ||||
Swap in (from device, into memory) statistics | Numeric value | device - swap device (default is all), type - one of count (number of swapins), sectors (sectors swapped in), pages (pages swapped in). See supported by platform for details on defaults. | Example: system.swap.in[,pages] Old naming: swap[in] |
|
system.swap.out[<device>,<type>] | ||||
Swap out (from memory, onto device) statistics | Numeric value | device - swap device (default is all), type - one of count (number of swapouts), sectors (sectors swapped out), pages (pages swapped out). See supported by platform for details on defaults. | Example: system.swap.out[,pages] Old naming: swap[out] |
|
system.swap.size[<device>,<type>] | ||||
Swap space. | Number of bytes or percentage1 | device - swap device (default is all), type - one of free (default, free swap space), total (total swap space), pfree (free swap space, percentage), pused (used swap space, percentage) | Example: system.swap.size[,pfree] - percentage of free swap space Old naming: system.swap.free, system.swap.total |
|
system.uname | ||||
Returns detailed host information. | String value | Example of returned value: FreeBSD localhost 4.4-RELEASE FreeBSD 4.4-RELEASE #0: Tue Sep 18 11:57:08 PDT 2001 [email protected]: /usr/src/sys/compile/GENERIC i386 |
||
system.uptime | ||||
System's uptime in seconds. | Number of seconds | Use Units s or uptime to get readable values. | ||
system.users.num | ||||
Number of users connected. | Number of users | Command who is used on agent side. | ||
vfs.dev.read[<device>,<type>,<mode>] | ||||
Disk read statistics. | Integer for type in: sectors, operations, bytes Float for type in: sps, ops, bps | device - disk device (default is “all”2) type - one of sectors, operations, bytes, sps, ops, bps (must specify exactly which parameter to use, since defaults are different under various OSes). sps, ops, bps means: sectors, operations, bytes per second respectively mode - one of avg1 (default),avg5 (average within 5 minutes), avg15. Compatible only with type in: sps, ops, bps | Default values of 'type' parameter for different OSes: FreeBSD - bps Linux - sps OpenBSD - operations Solaris - bytes Example: vfs.dev.read[,operations] Old naming: io[*] The type parameters ops, bps and sps on supported platforms are limited to 8 devices (7 individual devices and one “all”). Supports LVM since Zabbix 1.8.6. Until Zabbix 1.8.6, only relative device names may be used (for example, sda), since 1.8.6 optional /dev/ prefix may be used (for example, /dev/sda) |
|
vfs.dev.write[<device>,<type>,<mode>] | ||||
Disk write statistics. | Integer for type in: sectors, operations, bytes Float for type in: sps, ops, bps | device - disk device (default is “all”2) type - one of sectors, operations, bytes, sps, ops, bps (must specify exactly which parameter to use, since defaults are different under various OSes). sps, ops, bps means: sectors, operations, bytes per second respectively mode - one of avg1 (default),avg5 (average within 5 minutes), avg15. Compatible only with type in: sps, ops, bps | Default values of 'type' parameter for different OSes: FreeBSD - bps Linux - sps OpenBSD - operations Solaris - bytes Example: vfs.dev.write[,operations] Old naming: io[*] The type parameters ops, bps and sps on supported platforms are limited to 8 devices (7 individual devices and one “all”). Supports LVM since Zabbix 1.8.6. Until Zabbix 1.8.6, only relative device names may be used (for example, sda), since 1.8.6 optional /dev/ prefix may be used (for example, /dev/sda) |
|
vfs.file.cksum[file] | ||||
Calculate file checksum | File checksum, calculated by algorithm used by UNIX cksum. | file - full path to file | Example of returned value: 1938292000 Example: vfs.file.cksum[/etc/passwd] Old naming: cksum |
|
vfs.file.exists[file] | ||||
Check if file exists | 1 - regular file or a link (symbolic or hard) to regular file exists. 0 - otherwise | file - full path to file | Example: vfs.file.exists[/tmp/application.pid] The return value depends on what S_ISREG POSIX macro returns. |
|
vfs.file.md5sum[file] | ||||
File's MD5 checksum | MD5 hash of the file. | file - full path to file | Example of returned value: b5052decb577e0fffd622d6ddc017e82 Example: vfs.file.md5sum[/etc/zabbix/zabbix_agentd.conf] The file size limit (64 MB) for this item was removed in version 1.8.6. |
|
vfs.file.regexp[file,regexp,<encoding>] | ||||
Find string in a file | Matched string or EOF if expression not found | file - full path to file regexp - GNU regular expression encoding - Code Page identifier | Only the first matching line is returned. Example: vfs.file.regexp[/etc/passwd,zabbix] |
|
vfs.file.regmatch[file,regexp,<encoding>] | ||||
Find string in a file | 0 - expression not found 1 - found | file - full path to file regexp - GNU regular expression encoding - Code Page identifier | Example: vfs.file.regmatch[/var/log/app.log,error] | |
vfs.file.size[file] | ||||
File size | Size in bytes. | file - full path to file | File must have read permissions for user zabbix Example: vfs.file.size[/var/log/syslog] |
|
vfs.file.time[file,<mode>] | ||||
File time information. | Unix timestamp. | file - full path to the file mode - one of modify (default, modification time), access - last access time, change - last change time | Example: vfs.file.time[/etc/passwd,modify] | |
vfs.fs.inode[fs,<mode>] | ||||
Number of inodes | Numeric value | fs - filesystem mode - one of total (default), free, used, pfree (free, percentage), pused (used, percentage) | Example: vfs.fs.inode[/,pfree] Old naming: vfs.fs.inode.free[*], vfs.fs.inode.pfree[*], vfs.fs.inode.total[*] | |
vfs.fs.size[fs,<mode>] | ||||
Disk space | Disk space in bytes | fs - filesystem mode - one of total (default), free, used, pfree (free, percentage), pused (used, percentage) | In case of a mounted volume, disk space for local file system is returned. Example: vfs.fs.size[/tmp,free] Old naming: vfs.fs.free[*], vfs.fs.total[*], vfs.fs.used[*], vfs.fs.pfree[*], vfs.fs.pused[*] | |
vm.memory.size[<mode>] | ||||
Memory size | Memory size in bytes | mode - one of total (default), shared, free, buffers, cached, pfree, available | Old naming: vm.memory.buffers, vm.memory.cached, vm.memory.free, vm.memory.shared, vm.memory.total | |
web.page.get[host,<path>,<port>] | ||||
Get content of web page | Web page source as text | host - hostname path - path to HTML document (default is /) port - port number (default is 80) | Returns EOF on fail. Example: web.page.get[www.zabbix.com,index.php,80] |
|
web.page.perf[host,<path>,<port>] | ||||
Get timing of loading full web page | Time in seconds | host - hostname path - path to HTML document (default is /) port - port number (default is 80) | Returns 0 on fail. Example: web.page.perf[www.zabbix.com,index.php,80] |
|
web.page.regexp[host,<path>,<port>,<regexp>,<length>] | ||||
Get first occurrence of regexp in web page | Matched string | host - hostname path - path to HTML document (default is /) port - port number (default is 80) regexp - GNU regular expression length - maximum number of characters to return | Returns EOF in case of no match or any other failures (such as timeout, failed connection, etc). Example: web.page.regexp[www.zabbix.com,index.php,80,OK,2] |
See this section for the difference of an item being performed as a passive or an active check.
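The keys in the table above all follow the same `key[param1,param2,...]` shape, with optional parameters left empty and parameters containing commas or brackets wrapped in double quotes. The following is a minimal sketch of how such a key could be split into its name and parameters. It is an illustration only, not the agent's own parser: the function name is hypothetical, the `\"` escape inside quotes is an assumption, and whitespace around parameters is kept as-is.

```python
def parse_item_key(key: str):
    """Split 'name[p1,p2,...]' into (name, [params]). Simplified sketch."""
    name, bracket, rest = key.partition("[")
    if not bracket:
        return name, []  # key without parameters, e.g. agent.ping
    if not rest.endswith("]"):
        raise ValueError("missing closing bracket: " + key)
    body = rest[:-1]

    params, current, in_quotes = [], [], False
    i = 0
    while i < len(body):
        ch = body[i]
        if in_quotes:
            if ch == "\\" and i + 1 < len(body) and body[i + 1] == '"':
                current.append('"')  # assumed escape for a literal quote
                i += 1
            elif ch == '"':
                in_quotes = False    # closing quote ends the quoted part
            else:
                current.append(ch)   # commas and brackets are literal here
        elif ch == '"' and not current:
            in_quotes = True         # a quote only opens at parameter start
        elif ch == ",":
            params.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    params.append("".join(current))
    return name, params


print(parse_item_key('logrt["/home/zabbix/logs/^logfile[0-9]{1,3}$",,,100]'))
print(parse_item_key("net.tcp.port[,80]"))
```

An empty string in the result means the parameter was omitted, so the default documented for that key applies.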
This section contains descriptions of parameters supported by Zabbix Windows agent only.
Key | |||
---|---|---|---|
▲ | Description | Return value | Comments |
perf_counter[counter,<interval>] | |||
Value of any performance counter, where “counter” is the counter path, and “interval” is the time period for storing the average value. | Average value of the “counter” during last “interval” seconds. The “interval” must be between 1 and 900 seconds (included) and the default value is 1. | Performance Monitor can be used to obtain list of available counters. Until version 1.6 this parameter will return correct value only for counters that require just one sample (like \System\Threads). It will not work as expected for counters that require more than one sample, such as CPU utilisation. Since 1.6 interval is used, so the check returns an average value for last “interval” seconds every time. | |
service_state[service] | |||
State of service. Parameter is service name. | 0 – running 1 – paused 2 - start pending 3 - pause pending 4 - continue pending 5 - stop pending 6 – stopped 7 - unknown 255 – no such service | Parameter must be real service name as seen in service properties under “Name:” or name of EXE file. | |
services[<type>,<state>,<exclude>] | |||
List of services, separated by a newline or 0, if list would be empty. | type - one of all (default), automatic, manual, disabled state - one of all (default), stopped, started, start_pending, stop_pending, running, continue_pending, pause_pending, paused exclude - list of services to exclude from the result. Excluded services should be written in double quotes, separated by comma, without spaces. This parameter is supported starting from version 1.8.1. | Examples: services[,started] - list of started services services[automatic, stopped] - list of stopped services, that should be run services[automatic, stopped, "service1,service2,service3"] - list of stopped services, that should be run, excluding services with names service1, service2 and service3 |
|
proc_info[process,<attribute>,<type>] | |||
Different information about specific process(es). | process - process name attribute - requested process attribute. type - representation type (meaningful when more than one process with the same name exists) | The following attributes are currently supported: vmsize - Size of process virtual memory in Kbytes wkset - Size of process working set (amount of physical memory used by process) in Kbytes pf - Number of page faults ktime - Process kernel time in milliseconds utime - Process user time in milliseconds io_read_b - Number of bytes read by process during I/O operations io_read_op - Number of read operations performed by process io_write_b - Number of bytes written by process during I/O operations io_write_op - Number of write operations performed by process io_other_b - Number of bytes transferred by process during operations other than read and write operations io_other_op - Number of I/O operations performed by process, other than read and write operations gdiobj - Number of GDI objects used by process userobj - Number of USER objects used by process Valid types are: min - minimal value among all processes named <process> max - maximal value among all processes named <process> avg - average value for all processes named <process> sum - sum of values for all processes named <process> Examples: 1. In order to get the amount of physical memory taken by all Internet Explorer processes, use the following parameter: proc_info[iexplore.exe,wkset,sum] 2. In order to get the average number of page faults for Internet Explorer processes, use the following parameter: proc_info[iexplore.exe,pf,avg] Note: All io_xxx, gdiobj and userobj attributes are available only on Windows 2000 and later versions of Windows, not on Windows NT 4.0. |
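When values of service_state are processed outside Zabbix (for example in a report script), the numeric codes listed in the table above can be translated back into readable names. A minimal sketch, assuming a hypothetical helper name; the codes themselves are the ones documented for service_state:

```python
# Return codes of service_state[service], as listed in the table above.
SERVICE_STATES = {
    0: "running",
    1: "paused",
    2: "start pending",
    3: "pause pending",
    4: "continue pending",
    5: "stop pending",
    6: "stopped",
    7: "unknown",
    255: "no such service",
}


def describe_service_state(value: int) -> str:
    """Translate a service_state value into its documented meaning."""
    return SERVICE_STATES.get(value, f"unexpected value {value}")


print(describe_service_state(6))
print(describe_service_state(255))
```

Note that 0 means the service is running, so a trigger on this item should fire on non-zero values.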
Zabbix must be configured with SNMP support in order to be able to retrieve data provided by SNMP agents.
The following steps have to be performed in order to add monitoring of SNMP parameters:
Create a host for the SNMP device.
Enter an IP address. Set the host Status to NOT MONITORED. You can use one of the SNMP templates (Template_SNMPv1_Device, Template_SNMPv2_Device), which will automatically add the set of items. However, the template may not be compatible with the host.
Find out the SNMP string of the item you want to monitor.
After creating the host, use 'snmpwalk' (part of ucd-snmp/net-snmp software which you should have installed as part of the Zabbix installation) or equivalent tool:
shell> snmpwalk <host or host IP> public
This will give you a list of SNMP strings and their last values. If it doesn't, the SNMP 'community' may be different from the standard public, in which case you will need to find out what it is. Then go through the list until you find the string you want to monitor. For example, to monitor the bytes coming in to your switch on port 3 you would use:
interfaces.ifTable.ifEntry.ifInOctets.3 = Counter32: 614794138
You should now use the snmpget command to find the OID for interfaces.ifTable.ifEntry.ifInOctets.3:
shell> snmpget -On 10.62.1.22 public interfaces.ifTable.ifEntry.ifInOctets.3
where the last number in the string is the port number you are looking to monitor. This should give you something like the following:
.1.3.6.1.2.1.2.2.1.10.3 = Counter32: 614794138
Again, the last number in the OID is the port number.
3COM seem to use port numbers in the hundreds, e.g. port 1 = port 101, port 3 = port 103, but Cisco use regular numbers, e.g. port 3 = 3.
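Since only the last number of the OID changes from port to port, the numeric OIDs for several ports can be generated rather than looked up one by one. The sketch below assumes the incoming-octets prefix shown above; the function name and the `index_offset` parameter are hypothetical, and the offset of 100 merely mirrors the 3COM example. The real index for a given device should always be confirmed with snmpwalk.

```python
# Numeric prefix of the incoming-octets counter, taken from the snmpget
# output above (.1.3.6.1.2.1.2.2.1.10.<index>).
IF_IN_OCTETS_PREFIX = ".1.3.6.1.2.1.2.2.1.10"


def if_in_octets_oid(port: int, index_offset: int = 0) -> str:
    """Build the numeric OID for the incoming byte counter of a port.

    index_offset covers devices that number their interfaces differently,
    e.g. 100 for a device where port 3 appears as index 103.
    """
    return f"{IF_IN_OCTETS_PREFIX}.{port + index_offset}"


print(if_in_octets_oid(3))        # device using plain port numbers
print(if_in_octets_oid(3, 100))   # device using indexes in the hundreds
```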
Create an item for monitoring.
Now go back to Zabbix and click on Items, selecting the SNMP host you created earlier. Depending on whether you used a template when creating your host, you will have either a list of SNMP items associated with your host or just a new item box. We will assume that you are going to create the item yourself, using the information you have just gathered with snmpwalk and snmpget. Enter a plain English description in the 'Description' field of the new item box. Make sure the 'Host' field has your switch/router in it and change the 'Type' field to “SNMPv* agent”. Enter the community (usually public) and enter the numeric OID that you retrieved earlier into the 'SNMP OID' field, i.e. .1.3.6.1.2.1.2.2.1.10.3
Enter the 'SNMP port' as 161 and the 'Key' as something meaningful, e.g. SNMP-InOctets-Bps. Choose a Multiplier if you want one and enter an 'update interval' and 'keep history' if you want them to be different from the default. Set the 'Status' to Monitored, the 'Type of information' to Numeric (float) and the 'Store value' to DELTA (this is important, otherwise you will get cumulative values from the SNMP device instead of the latest change).
Now save the item and go back to the hosts area of Zabbix. From here check that the SNMP device Status shows 'Monitored' and check in Latest data for your SNMP data!
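The reason for storing the value as DELTA is that interface counters only ever grow: the raw value is a running total, and the useful figure is the change between two polls. The sketch below illustrates that calculation as a per-second rate. It is not Zabbix's implementation; the function name is hypothetical, and the handling of a 32-bit counter wrapping around is an assumption added for the example.

```python
# A Counter32 value wraps back to zero after reaching 2**32 - 1.
COUNTER32_MODULUS = 2 ** 32


def rate_per_second(prev_value: int, prev_time: float,
                    value: int, time: float) -> float:
    """Change of a cumulative counter between two polls, per second."""
    elapsed = time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # The modulo keeps the difference correct across a single wrap-around.
    delta = (value - prev_value) % COUNTER32_MODULUS
    return delta / elapsed


# Two hypothetical polls of the counter, 30 seconds apart.
print(rate_per_second(614794138, 0, 614824138, 30))
```

With the raw counter stored instead, graphs would show an ever-increasing line rather than the traffic level.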
General example
Parameter | Description |
---|---|
Community | public |
OID | 1.2.3.45.6.7.8.0 (or .1.2.3.45.6.7.8.0) |
Key | <Unique string to be used as reference to triggers> For example, “my_param”. |
Note that OID can be given in either numeric or string form. However, in some cases, string OID must be converted to numeric representation. Utility snmpget may be used for this purpose:
shell> snmpget -On localhost public enterprises.ucdavis.memory.memTotalSwap.0
Monitoring of SNMP parameters is possible if either --with-net-snmp or --with-ucd-snmp flag was specified while configuring Zabbix sources.
Monitoring of Uptime
Parameter | Description |
---|---|
Community | public |
OID | MIB::sysUpTime.0 |
Key | router.uptime |
Value type | Float |
Units | uptime |
Multiplier | 0.01 |
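The multiplier of 0.01 is needed because SNMP reports sysUpTime in hundredths of a second (TimeTicks), while the uptime unit expects seconds. A short sketch of the conversion follows; the function names and the output format are illustrative assumptions and do not reproduce how the Zabbix frontend renders uptime.

```python
def timeticks_to_seconds(ticks: int, multiplier: float = 0.01) -> float:
    """Apply the item multiplier to a raw sysUpTime value."""
    return ticks * multiplier


def format_uptime(seconds: float) -> str:
    """Render seconds as days and hh:mm:ss (illustrative format only)."""
    total = int(round(seconds))
    days, remainder = divmod(total, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{days} days, {hours:02d}:{minutes:02d}:{secs:02d}"


# Hypothetical raw value returned by the device.
raw_ticks = 9006100
print(format_uptime(timeticks_to_seconds(raw_ticks)))
```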
Simple checks are normally used for agent-less monitoring or for remote checks of services. Note that Zabbix agent is not needed for simple checks. Zabbix server is responsible for processing of simple checks (making external connections, etc).
All simple checks, except tcp and tcp_perf, accept one optional parameter, the port number (shown as <port> in the list below). For tcp and tcp_perf the port is mandatory.
Examples of using simple checks:
ftp,155
http
http_perf,8080
List of supported simple checks:
Key | Description | Return value |
---|---|---|
ftp,<port> | Checks if FTP server is running and accepting connections | 0 - FTP server is down; 1 - FTP server is running |
ftp_perf,<port> | Checks if FTP server is running and accepting connections | 0 - FTP server is down; otherwise, number of seconds spent connecting to FTP server. |
http,<port> | Checks if HTTP server is running and accepting connections | 0 - HTTP server is down; 1 - HTTP server is running |
http_perf,<port> | Checks if HTTP (web) server is running and accepting connections | 0 - HTTP (web) server is down; otherwise, number of seconds spent connecting to HTTP server. |
icmpping[<target>,<packets>,<interval>,<size>,<timeout>] | Checks if server is accessible by ICMP ping. target - host IP or DNS name; packets - number of packets; interval - time between successive packets in milliseconds; size - packet size in bytes; timeout - timeout in milliseconds | 0 - ICMP ping fails; 1 - ICMP ping successful. Example: icmpping[,4] - if at least one packet of the four is returned, the item will return 1. |
icmppingloss[<target>,<packets>,<interval>,<size>,<timeout>] | Return percentage of lost packets. target - host IP or DNS name; packets - number of packets; interval - time between successive packets in milliseconds; size - packet size in bytes; timeout - timeout in milliseconds | Loss of packets in percents |
icmppingsec[<target>,<packets>,<interval>,<size>,<timeout>,<mode>] | Return ICMP ping response time. target - host IP or DNS name; packets - number of packets; interval - time between successive packets in milliseconds; size - packet size in bytes; timeout - timeout in milliseconds; mode - one of min, max, avg (default) | Number of seconds. If host is not available (timeout reached), the item will return 0. |
imap,<port> | Checks if IMAP server is running and accepting connections | 0 - IMAP server is down; 1 - IMAP server is running |
imap_perf,<port> | Checks if IMAP server is running and accepting connections | 0 - IMAP server is down; otherwise, number of seconds spent connecting to IMAP server. |
ldap,<port> | Checks if LDAP server is running and accepting connections | 0 - LDAP server is down; 1 - LDAP server is running |
ldap_perf,<port> | Checks if LDAP server is running and accepting connections | 0 - LDAP server is down; otherwise, number of seconds spent connecting to LDAP server. |
nntp,<port> | Checks if NNTP server is running and accepting connections | 0 - NNTP server is down; 1 - NNTP server is running |
nntp_perf,<port> | Checks if NNTP server is running and accepting connections | 0 - NNTP server is down; otherwise, number of seconds spent connecting to NNTP server. |
ntp,<port> | Checks if NTP server is running and accepting connections | 0 - NTP server is down; 1 - NTP server is running |
ntp_perf,<port> | Checks if NTP server is running and accepting connections | 0 - NTP server is down; otherwise, number of seconds spent connecting to NTP server. |
pop,<port> | Checks if POP server is running and accepting connections | 0 - POP server is down; 1 - POP server is running |
pop_perf,<port> | Checks if POP server is running and accepting connections | 0 - POP server is down; otherwise, number of seconds spent connecting to POP server. |
smtp,<port> | Checks if SMTP server is running and accepting connections | 0 - SMTP server is down; 1 - SMTP server is running |
smtp_perf,<port> | Checks if SMTP server is running and accepting connections | 0 - SMTP server is down; otherwise, number of seconds spent connecting to SMTP server. |
ssh,<port> | Checks if SSH server is running and accepting connections | 0 - SSH server is down; 1 - SSH server is running |
ssh_perf,<port> | Checks if SSH server is running and accepting connections | 0 - SSH server is down; otherwise, number of seconds spent connecting to SSH server. |
tcp,port | Checks if TCP service is running and accepting connections | 0 - TCP service is down; 1 - TCP service is running |
tcp_perf,port | Checks if TCP service is running and accepting connections | 0 - the service on the port is down; otherwise, number of seconds spent connecting to the TCP service. |
Zabbix will not process a simple check longer than Timeout seconds defined in Zabbix server configuration file.
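The return values of the tcp and tcp_perf checks can be pictured with a small Python sketch. It is only an illustration of the semantics listed above (0 when the connection fails, otherwise 1 or the connection time), not the server's actual code; the function names and the default timeout are assumptions.

```python
import socket
import time


def tcp_check(host, port, timeout=3.0):
    """Return 1 if a TCP connection can be opened, 0 otherwise (like 'tcp,port')."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return 1
    except OSError:
        return 0


def tcp_perf_check(host, port, timeout=3.0):
    """Return seconds spent connecting, or 0 if the service is down (like 'tcp_perf,port')."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return 0
```

The timeout argument plays the role of the Timeout setting: a connection attempt that takes longer is treated as a failed check.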
Zabbix uses external utility fping for processing of ICMP pings. The utility is not part of Zabbix distribution and has to be additionally installed. If the utility is missing, has wrong permissions or its location does not match FpingLocation defined in configuration file, ICMP pings (icmpping, icmppingsec and icmppingloss) will not be processed.
fping must be executable by the user that the Zabbix daemons run as, and it must be setuid root. Run these commands as user root to set up the correct permissions:
shell> chown root:zabbix /usr/sbin/fping
shell> chmod 4710 /usr/sbin/fping
After performing the two commands above check ownership of the fping executable. In some cases the ownership can be reset by executing the chmod command.
The default values for ICMP checks parameters:
Parameter | Value | Description | fping flag | Min | Max |
---|---|---|---|---|---|
packets | 3 | pings to the target | -C | 1 | 10000 |
interval | 1000 | milliseconds, “fping” default | -p | 20 | |
size | 56 or 68 | bytes, “fping” default; 56 bytes on x86, 68 bytes on x86_64 | -b | 24 | 65507 |
timeout | 500 | milliseconds, “fping” default | -t | 50 |
Zabbix writes addresses to be checked to a temporary file, which is then passed to fping. If items have different parameters, only ones with identical parameters are written to a single file.
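The grouping of targets by identical parameters can be sketched in Python as follows. The defaults and fping flags come from the table above; the dictionary layout and function names are hypothetical, and the packet size is left unset so that fping's own default applies.

```python
from collections import defaultdict

# Defaults from the table above; size is left to fping's platform default.
DEFAULTS = {"packets": 3, "interval": 1000, "timeout": 500}


def group_targets(items):
    """Group ping targets so that each group shares one set of fping parameters."""
    groups = defaultdict(list)
    for item in items:
        params = (
            item.get("packets", DEFAULTS["packets"]),
            item.get("interval", DEFAULTS["interval"]),
            item.get("size"),
            item.get("timeout", DEFAULTS["timeout"]),
        )
        groups[params].append(item["target"])
    return dict(groups)


def fping_args(params):
    """Translate one parameter set into fping command line flags."""
    packets, interval, size, timeout = params
    args = ["-C", str(packets), "-p", str(interval), "-t", str(timeout)]
    if size is not None:
        args += ["-b", str(size)]
    return args
```

Each resulting group corresponds to one temporary file of addresses and one fping invocation.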
Internal checks allow monitoring of the internals of Zabbix. Internal checks are calculated by Zabbix server.
Key | Description | Comments |
---|---|---|
zabbix[boottime] | Startup time of Zabbix server process in seconds. | In seconds since the epoch. |
zabbix[history] | Number of values stored in table HISTORY | Do not use if MySQL InnoDB, Oracle or PostgreSQL is used! |
zabbix[history_log] | Number of values stored in table HISTORY_LOG | Do not use if MySQL InnoDB, Oracle or PostgreSQL is used! This item is supported starting from version 1.8.3. |
zabbix[history_str] | Number of values stored in table HISTORY_STR | Do not use if MySQL InnoDB, Oracle or PostgreSQL is used! |
zabbix[history_text] | Number of values stored in table HISTORY_TEXT | Do not use if MySQL InnoDB, Oracle or PostgreSQL is used! This item is supported starting from version 1.8.3. |
zabbix[history_uint] | Number of values stored in table HISTORY_UINT | Do not use if MySQL InnoDB, Oracle or PostgreSQL is used! This item is supported starting from version 1.8.3. |
zabbix[items] | Number of items in Zabbix database | |
zabbix[items_unsupported] | Number of unsupported items in Zabbix database | |
zabbix[log] | Stores warning and error messages generated by Zabbix server. | Character. Add item with this key to have Zabbix internal messages stored. |
zabbix[process,<type>,<mode>,<state>] | Time a particular Zabbix process or a group of processes (identified by <type> and <mode>) spent in <state> in percentage. It is calculated for last minute only. If <mode> is Zabbix process number that is not running (for example, with 5 pollers running <mode> is specified to be 6), such an item will turn into unsupported state. Minimum and maximum refers to the usage percentage for a single process. So if in a group of 3 pollers usage percentages per process were 2, 18 and 66, min would return 2 and max would return 66. Processes report what they are doing in shared memory and the self-monitoring process summarizes that data each second. State changes (busy/idle) are registered upon change - thus a process that becomes busy registers as such and doesn't change or update the state until it becomes idle. This ensures that even fully hung processes will be correctly registered as 100% busy. Currently, “busy” means “not sleeping”, but in the future additional states might be introduced - waiting for locks, performing database queries, etc. On Linux and most other systems, resolution is 1/100 of a second. | The following process types are currently supported: alerter - process for sending notifications; configuration syncer - process for managing in-memory cache of configuration data; db watchdog - sender of a warning message in case DB is not available; discoverer - process for discovery of devices; escalator - process for escalation of actions; history syncer - history DB writer; http poller - web monitoring poller; housekeeper - process for removal of old historical data; icmp pinger - poller for icmpping checks; ipmi poller - poller for IPMI checks; node watcher - process for sending historical data and configuration changes between nodes; self-monitoring - process for collecting internal server statistics; poller - normal poller for passive checks; proxy poller - poller for passive proxies; timer - process for evaluation of time-related trigger functions and maintenances; trapper - trapper for active checks, traps, inter-node and -proxy communication; unreachable poller - poller for unreachable devices. Note: You can also see these process types in a server log file. Valid modes are: avg - average value for all processes of a given type (default); count - returns number of forks for a given process type, <state> should not be specified; max - maximum value; min - minimum value; <process number> - process number (between 1 and the number of pre-forked instances). For example, if 4 trappers are running, the value is between 1 and 4. Valid states are: busy - process is in busy state, for example, processing request (default); idle - process is in idle state doing nothing. Examples: zabbix[process,poller,avg,busy] - average time of poller processes spent doing something during the last minute; zabbix[process,"icmp pinger",max,busy] - maximum time spent doing something by any ICMP pinger process during the last minute; zabbix[process,trapper,count] - amount of currently running trapper processes. This item is supported starting from version 1.8.5. |
zabbix[proxy,<name>,<param>] | Access to Proxy related information. | <name> - Proxy name. List of supported parameters (<param>): lastaccess - timestamp of last heart beat message received from Proxy. For example, zabbix[proxy,"Germany",lastaccess]. Trigger function fuzzytime() can be used to check availability of proxies. |
zabbix[queue,<from>,<to>] | Number of server monitored items in the Queue which are delayed by <from> to <to> seconds, inclusive. | <from> - default: 6 seconds; <to> - default: infinity. Suffixes s,m,h,d,w are supported for these parameters. Parameters from and to are supported starting from version 1.8.3. |
zabbix[requiredperformance] | Required performance of the Zabbix server, in new values per second expected. | Approximately correlates with “Required server performance, new values per second” in Reports → Status of Zabbix. Supported since Zabbix 1.6.2. |
zabbix[trends] | Number of values stored in table TRENDS | Do not use if MySQL InnoDB, Oracle or PostgreSQL is used! |
zabbix[trends_uint] | Number of values stored in table TRENDS_UINT | Do not use if MySQL InnoDB, Oracle or PostgreSQL is used! This item is supported starting from version 1.8.3. |
zabbix[triggers] | Number of triggers in Zabbix database | |
zabbix[uptime] | Uptime of Zabbix server process in seconds. | |
zabbix[wcache,<cache>,<mode>]
Cache | Mode | Description | Comments |
---|---|---|---|
values | all | Number of values processed by Zabbix server, except not supported. | Counter. |
values | float | | Counter. |
values | uint | | Counter. |
values | str | | Counter. |
values | log | | Counter. |
values | text | | Counter. |
values | not supported | Number of processed not supported items. | Counter. This item is supported starting from version 1.8.6. |
history | pfree | Free space in the history buffer in percentage. | Low number indicates performance problems on the database side. |
history | total | | |
history | used | | |
history | free | | |
trend | pfree | | |
trend | total | | |
trend | used | | |
trend | free | | |
text | pfree | | |
text | total | | |
text | used | | |
text | free | | |
zabbix[rcache,<cache>,<mode>]
Cache | Mode | Description | Comments |
---|---|---|---|
buffer | pfree | | |
buffer | total | | |
buffer | used | | |
buffer | free | | |
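The <mode> parameter of zabbix[process,<type>,<mode>,<state>] can be illustrated with the poller example given in the item description (per-process usage of 2, 18 and 66 percent). The Python sketch below is only a model of the documented behaviour; the function name is hypothetical, and raising an error stands in for the item becoming unsupported.

```python
def process_usage(busy_percent, mode="avg"):
    """Summarise per-process busy percentages the way the <mode> parameter describes."""
    if mode == "count":
        return len(busy_percent)
    if mode == "min":
        return min(busy_percent)
    if mode == "max":
        return max(busy_percent)
    if mode == "avg":
        return sum(busy_percent) / len(busy_percent)
    # Otherwise the mode is a 1-based process number.
    index = int(mode)
    if not 1 <= index <= len(busy_percent):
        raise ValueError("process %d is not running; the item would become unsupported" % index)
    return busy_percent[index - 1]


pollers = [2, 18, 66]
```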
Aggregate checks do not require any agent running on a host being monitored. Zabbix server collects aggregate information by doing direct database queries.
Syntax of an aggregate item's key
groupfunc["Host group","Item key",itemfunc,parameter]
Multiple host groups may be used since Zabbix 1.8.2 by inserting a comma-delimited array.
Supported group functions:
GROUP FUNCTION | DESCRIPTION |
---|---|
grpavg | Average value |
grpmax | Maximum value |
grpmin | Minimum value |
grpsum | Sum of values |
Supported item functions:
ITEM FUNCTION | DESCRIPTION |
---|---|
avg | Average value |
count | Number of values |
last | Last value |
max | Maximum value |
min | Minimum value |
sum | Sum of values |
Examples of keys for aggregate items:
Total disk space of host group 'MySQL Servers'.
grpsum["MySQL Servers","vfs.fs.size[/,total]",last,0]
Average processor load of host group 'MySQL Servers'.
grpavg["MySQL Servers","system.cpu.load[,avg1]",last,0]
Average (5min) number of queries per second for host group 'MySQL Servers'
grpavg["MySQL Servers",mysql.qps,avg,300]
Average CPU load on all hosts in multiple host groups.
grpavg[["Servers A","Servers B","Servers C"],system.cpu.load,last,0]
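For the item function last, the group functions simply combine the most recent value of the item on every host in the group. The following Python sketch models that case only; host names and values are invented, the function name is hypothetical, and the real server obtains the values with database queries rather than from a dictionary.

```python
GROUP_FUNCS = {
    "grpavg": lambda values: sum(values) / len(values),
    "grpmax": max,
    "grpmin": min,
    "grpsum": sum,
}


def aggregate_last(group_func, last_values):
    """Apply a group function to the last value of an item on each host."""
    return GROUP_FUNCS[group_func](list(last_values.values()))


# Hypothetical last values of system.cpu.load[,avg1] in group 'MySQL Servers'.
loads = {"db1": 0.5, "db2": 1.5, "db3": 1.0}
average_load = aggregate_last("grpavg", loads)
```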
External check is a check executed by Zabbix Server by running a shell script or a binary.
External checks do not require any agent running on a host being monitored.
Syntax of item's key:
script[parameters]
* script - name of the script.
* parameters - list of command line parameters. Parameters will be used in command line without any changes.
If you don't want to pass your parameters to the script you may use:
script[]
or
script
The simplified syntax without brackets is supported starting from Zabbix 1.8.1.
Zabbix server will find and execute the command in the directory defined in configuration parameter ExternalScripts in zabbix_server.conf. The command will be executed as the user Zabbix server runs as, so any access permissions or environment variables should be handled in a wrapper script, if necessary, and permissions on the command should allow that user to execute it. Only commands in the specified directory are available.
The first command line parameter is the host IP address or DNS name; the remaining command line parameters are taken from parameters in the item key.
Zabbix uses the first line (trimmed from trailing whitespace) in the standard output of the script as the value. The following lines, standard error and the exit code are discarded.
Execute script check_oracle.sh with parameters “-h 192.168.1.4”. Host DNS name 'www1.company.com'.
check_oracle.sh[-h 192.168.1.4]
Zabbix will execute:
check_oracle.sh www1.company.com -h 192.168.1.4
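How the command line is assembled and how the value is taken from the script output can be sketched in Python. Both function names are hypothetical, and the sketch only mirrors the rules stated in this section (host first, parameters unchanged, first output line with trailing whitespace trimmed).

```python
def build_command(script, host, parameters=""):
    """Host IP or DNS name goes first, followed by the item key parameters unchanged."""
    parts = [script, host]
    if parameters:
        parts.append(parameters)
    return " ".join(parts)


def value_from_output(stdout):
    """Use the first line of standard output, without trailing whitespace, as the value."""
    lines = stdout.splitlines()
    return lines[0].rstrip() if lines else ""


command = build_command("check_oracle.sh", "www1.company.com", "-h 192.168.1.4")
```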
Zabbix must be configured with SSH2 support.
SSH checks are used for agent-less monitoring. Note that Zabbix agent is not needed for SSH checks.
Actual commands to be executed must be placed in the Executed script field in the item configuration. Multiple commands can be executed one after another by placing them on a new line.
Key | Description | Comments |
---|---|---|
ssh.run[<unique short description>,<ip>,<port>,<encoding>] | Run a command by using SSH remote session |
Telnet checks are used for agent-less monitoring. Zabbix agent is not needed for Telnet checks.
Actual commands to be executed must be placed in the Executed script field in the item configuration. Multiple commands can be executed one after another by placing them on a new line.
Till version 1.8.1, supported characters that the prompt can end with:
Zabbix version 1.8.2 adds support for additional character:
Key | Description | Comments |
---|---|---|
telnet.run[<unique short description>,<ip>,<port>,<encoding>] | Run a command on a remote device using telnet connection |
With calculated items you can create calculations on the basis of other items. Thus, calculated items are a way of creating virtual data sources. Item values will be periodically calculated based on an arithmetical expression.
Resulting data will be stored in the Zabbix database as for any other item - this means storing both history and trends values for fast graph generation. Calculated items may be used in trigger expressions, referenced by macros or other entities same as any other item type.
To use calculated items, choose the item type Calculated. The key is a unique item identifier (per host). You can create any key name using supported symbols. Calculation definition should be entered in the Formula field (named Expression in 1.8.1 and 1.8.2). There is virtually no connection between the formula and key. The key parameters are not used in formula in any way - variables may be passed to the formula with user macros.
The correct syntax of a simple formula is:
func(<key>|<hostname:key>,<parameter1>,<parameter2>,...)
Where:
ARGUMENT | DEFINITION |
---|---|
func | One of the functions supported in trigger expressions: last, min, max, avg, count, etc |
key | The key of another item whose data you want to use. It may be defined as key or hostname:key. Note: Putting the whole key in double quotes (“…”) is strongly recommended to avoid incorrect parsing because of spaces or commas within the key. If there are also quoted parameters within the key, those double quotes must be escaped by using the backslash (\). See Examples 5 and 6 below. |
parameter(s) | Any additional parameters that may be required. See Example 5 below. |
A more complex formula may use a combination of functions, operators and brackets. You could use all functions and operators supported in trigger expressions. Note that syntax is slightly different, however logic and operator precedence are exactly the same.
Supported characters for a hostname:
a..zA..Z0..9 ._-
Supported characters for a key:
a..zA..Z0..9.,_
Supported characters for a function:
a..zA..Z0..9_
Unlike trigger expressions, Zabbix processes calculated items according to item update interval, not upon receiving a new value.
A calculated item may become unsupported in several cases:
Example 1
Calculate percentage of free disk space on '/'.
Use of function last:
100*last("vfs.fs.size[/,free]")/last("vfs.fs.size[/,total]")
Zabbix will take the latest values for free and total disk spaces and calculate percentage according to the given formula.
Example 2
Calculate 10 minute average number of values processed by Zabbix.
Use of function avg:
avg("Zabbix Server:zabbix[wcache,values]",600)
Note that extensive use of calculated items with long time periods may affect performance of the Zabbix Server.
Example 3
Calculate total bandwidth on eth0.
Sum of two functions:
last("net.if.in[eth0,bytes]")+last("net.if.out[eth0,bytes]")
Example 4
Calculate percentage of incoming traffic.
More complex expression:
100*last("net.if.in[eth0,bytes]")/(last("net.if.in[eth0,bytes]")+last("net.if.out[eth0,bytes]"))
Example 5
Calculate count of records in a log file for last 10 minutes.
Take note of how double quotes are escaped within the quoted key and first function parameter is required:
count("logrt[\"/tmp/test.log\",\"some words pattern\"]",600)
Example 6
Using aggregated items correctly within a calculated item.
Take note of how double quotes are escaped within the quoted key:
last("grpsum[\"video\",\"net.if.out[eth0,bytes]\",\"last\",\"0\"]") / last("grpsum[\"video\",\"nginx_stat.sh[active]\",\"last\",\"0\"]")
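The quoting rule for keys (wrap the whole key in double quotes and escape any double quotes inside it with a backslash) is easy to get wrong by hand. The Python helpers below are a hypothetical convenience for generating formula text; they are not part of Zabbix, and they do not address other special characters.

```python
def quote_key(key):
    """Wrap an item key in double quotes, escaping the double quotes it contains."""
    return '"' + key.replace('"', '\\"') + '"'


def formula(func, key, *params):
    """Build a simple calculated item formula: func("key",param1,...)."""
    args = [quote_key(key)] + [str(p) for p in params]
    return "%s(%s)" % (func, ",".join(args))


log_count = formula("count", 'logrt["/tmp/test.log","some words pattern"]', 600)
```

The value of log_count is the same text as the formula shown for counting log records over the last 10 minutes.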
Functionality of Zabbix agents can be enhanced by defining user parameters (UserParameter configuration parameter) in agent's configuration file. Once user parameters are defined, they can be accessed in the same way as any other agent items by using the key, specified in the parameter definition.
User parameters are commands executed by Zabbix agent. /bin/sh is used as a command line interpreter under UNIX operating systems.
See a step-by-step tutorial on making use of user parameters.
In order to define a new parameter for monitoring, one line has to be added to configuration file of Zabbix agent and the agent must be restarted.
User parameter has the following syntax:
UserParameter=key,command
Parameter | Description |
---|---|
Key | Unique item key. |
Command | Command to be executed to evaluate value of the Key. |
Simple command
UserParameter=ping,echo 1
The agent will always return '1' for item with key 'ping'.
More complex example
UserParameter=mysql.ping,mysqladmin -uroot ping | grep -c alive
The agent will return '1', if MySQL server is alive, '0' - otherwise.
Flexible user parameters can be used for more control and flexibility.
Flexible user parameters have the following syntax:
UserParameter=key[*],command
Parameter | Description |
---|---|
Key | Unique item key. The [*] defines that this key accepts parameters. |
Command | Command to be executed to evaluate value of the Key. Zabbix parses content of [] and substitutes $1,…,$9 in the command. $0 will be substituted by the original command (prior to expansion of $0,…,$9) to be run. |
Something very simple
UserParameter=ping[*],echo $1
We may define unlimited number of items for monitoring all having format ping[something].
Let's add more sense!
UserParameter=mysql.ping[*],mysqladmin -u$1 -p$2 ping | grep -c alive
This parameter can be used for monitoring availability of MySQL database. We can pass user name and password:
mysql.ping[zabbix,our_password]
How many lines in a file match a regular expression?
UserParameter=wc[*],grep -c "$2" $1
This parameter can be used to count the lines in a file that match a regular expression.
wc[/etc/passwd,root]
wc[/etc/services,zabbix]
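The substitution of $1,…,$9 can be modelled with a few lines of Python. This is a simplified sketch, not the agent's parser: the function name is hypothetical, parameters are split on plain commas without any quoting support, and treating a missing parameter as an empty string is an assumption.

```python
import re


def expand_user_parameter(command, item_key):
    """Substitute $1..$9 in a flexible UserParameter command with the key's parameters."""
    match = re.fullmatch(r"[^\[\]]+\[(.*)\]", item_key)
    args = match.group(1).split(",") if match else []

    def substitute(m):
        index = int(m.group(1))
        if index == 0:
            return command  # $0 stands for the original, unexpanded command
        return args[index - 1] if index <= len(args) else ""

    return re.sub(r"\$([0-9])", substitute, command)


expanded = expand_user_parameter('grep -c "$2" $1', "wc[/etc/passwd,root]")
```

Here the item key wc[/etc/passwd,root] expands to a grep command that counts lines containing root in /etc/passwd.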
Trigger is defined as a logical expression and represents system state.
A trigger may have the following values:
VALUE | DESCRIPTION |
---|---|
PROBLEM | Normally means that something happened. For example, processor load is too high. Called TRUE in older Zabbix versions. |
OK | This is a normal trigger state. Called FALSE in older Zabbix versions. |
UNKNOWN | In this case, Zabbix cannot evaluate trigger expression. This may happen because of several reasons: server is unreachable; trigger expression cannot be evaluated; trigger expression has been recently changed. |
Triggers are evaluated based on history data only; trend data are never considered.
The expressions used in triggers are very flexible. You can use them to create complex logical tests regarding monitored statistics.
The following operators are supported for triggers (descending priority of execution):
PRIORITY | OPERATOR | DEFINITION |
---|---|---|
1 | / | Division |
2 | * | Multiplication |
3 | - | Arithmetical minus |
4 | + | Arithmetical plus |
5 | < | Less than. The operator is defined as: A<B ⇔ (A<=B-0.000001) |
6 | > | More than. The operator is defined as: A>B ⇔ (A>=B+0.000001) |
7 | # | Not equal. The operator is defined as: A#B ⇔ (A<=B-0.000001) \| (A>=B+0.000001) |
8 | = | Is equal. The operator is defined as: A=B ⇔ (A>B-0.000001) & (A<B+0.000001) |
9 | & | Logical AND |
10 | \| | Logical OR |
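The comparison operators are defined with a tolerance of 0.000001, so values that differ by less than that are treated as equal. A direct Python transcription of the four definitions in the table (the function names are hypothetical):

```python
EPSILON = 0.000001


def lt(a, b):
    """A<B is defined as A<=B-0.000001."""
    return a <= b - EPSILON


def gt(a, b):
    """A>B is defined as A>=B+0.000001."""
    return a >= b + EPSILON


def ne(a, b):
    """A#B is defined as (A<=B-0.000001) | (A>=B+0.000001)."""
    return lt(a, b) or gt(a, b)


def eq(a, b):
    """A=B is defined as (A>B-0.000001) & (A<B+0.000001)."""
    return a > b - EPSILON and a < b + EPSILON
```

For example, 5.0000001 is not considered greater than 5, because the difference is below the tolerance.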
Trigger functions allow you to reference collected values, current time and other factors.
Trigger status (expression) is recalculated every time Zabbix server receives new value, if this value is part of this expression. If time based functions are used in the expression, it is recalculated every 30 seconds by a zabbix timer process. If both time-based and non-time-based functions are used in an expression, it is recalculated when a new value is received and every 30 seconds.
Time based functions are:
The following functions are supported:
FUNCTION | Parameter(s) | Supported value types | Definition |
---|---|---|---|
abschange | ignored | float, int, str, text, log | Returns absolute difference between last and previous values. For strings: 0 - values are equal; 1 - values differ |
avg | sec or #num | float, int | Average value for period of time. Parameter defines length of the period in seconds. The function accepts a second, optional parameter time_shift. It is useful when there is a need to compare the current average value with the average value time_shift seconds back. For instance, avg(3600,86400) will return the average value for an hour one day ago. Parameter time_shift is supported from Zabbix 1.8.2. |
change | ignored | float, int, str, text, log | Returns difference between last and previous values. For strings: 0 - values are equal; 1 - values differ |
count | sec or #num | float, int, str, text, log | Number of historical values for period of time in seconds or number of last #num values matching condition. The function accepts second optional parameter pattern, third parameter operator, and fourth parameter time_shift. For example, count(600,12) will return exact number of values equal to '12' stored in the history. Integer items: exact match. Float items: match within 0.000001. String, text and log items: operators like (default), eq, ne are supported. Supported operators: eq - equal; ne - not equal; gt - greater; ge - greater or equal; lt - less; le - less or equal; like (textual search only) - matches if contains pattern. For example, count(600,12,"gt") will return exact number of values which are more than '12' stored in the history for the last 600 seconds. Another example: count(#10,12,"gt",86400) will return exact number of values which are larger than '12' stored in the history among last 10 values 24 hours ago. If there is a need to count arbitrary values, for instance, for the last 600 seconds 24 hours ago, count(600,,,86400) should be used. Parameter #num is supported from Zabbix 1.6.1. Parameter time_shift and string operators are supported from Zabbix 1.8.2. See function avg for an example of using time_shift. |
date | ignored | any | Returns current date in YYYYMMDD format. For example: 20031025 |
dayofmonth | ignored | any | Returns day of month in range of 1 to 31. This function is supported since Zabbix 1.8.5. |
dayofweek | ignored | any | Returns day of week in range of 1 to 7. Mon - 1, Sun - 7. |
delta | sec or #num | float, int | Same as max()-min(). Since Zabbix 1.8.2, the function supports a second, optional parameter time_shift. See function avg for an example of its use. |
diff | ignored | float, int, str, text, log | Returns: 1 - last and previous values differ; 0 - otherwise |
fuzzytime | sec | float, int | Returns 1 if timestamp (item value) does not differ from Zabbix server time for more than N seconds, 0 - otherwise. Usually used with system.localtime to check that local time is in sync with local time of Zabbix server. |
iregexp | 1st - string, 2nd - sec or #num | str, log, text | This function is non case-sensitive analogue of regexp. |
last | sec or #num | float, int, str, text, log | Last (most recent) value. Parameter: sec - ignored; #num - Nth value. For example, last(0) is always equal to last(#1); last(#3) - third most recent value. The function also supports a second optional time_shift parameter. For example, last(0,86400) will return the most recent value one day ago. Zabbix does not guarantee exact order of values if more than two values exist within one second in history. Parameter #num is supported starting from Zabbix 1.6.2. Parameter time_shift is supported starting from Zabbix 1.8.2. See function avg for an example of its use. |
logeventid | string | log | Check if Event ID of the last log entry matches a regular expression. Parameter defines the regular expression, POSIX extended style. Returns: 0 - does not match; 1 - matches. This function is supported since Zabbix 1.8.5. |
logseverity | ignored | log | Returns log severity of the last log entry. Parameter is ignored. 0 - default severity; N - severity (integer, useful for Windows event logs). Zabbix takes log severity from field Information of Windows event log. |
logsource | string | log | Check if log source of the last log entry matches parameter. 0 - does not match; 1 - matches. Normally used for Windows event logs. For example, logsource("VMware Server"). |
max | sec or #num | float, int | Maximal value for period of time. Parameter defines length of the period in seconds. Since Zabbix 1.8.2, the function supports a second, optional parameter time_shift. See function avg for an example of its use. |
min | sec or #num | float, int | Minimal value for period of time. Parameter defines length of the period in seconds. Since Zabbix 1.8.2, the function supports a second, optional parameter time_shift. See function avg for an example of its use. |
nodata | sec | any | Returns: 1 - if no data received during period of time in seconds. The period should not be less than 30 seconds; 0 - otherwise |
now | ignored | any | Returns number of seconds since the Epoch (00:00:00 UTC, January 1, 1970). |
prev | ignored | float, int, str, text, log | Returns previous value. Parameter is ignored. Same as last(#2) |
regexp | 1st - string, 2nd - sec or #num | str, log, text | Check if last value matches regular expression. Parameter defines regular expression, POSIX extended style. Second optional parameter is number of seconds or number of lines to analyse. In this case more than one value will be processed. This function is case-sensitive. Returns: 1 - found; 0 - otherwise |
str | 1st - string, 2nd - sec or #num | str, log, text | Find string in last (most recent) value. Parameter defines string to find. Case sensitive! Second optional parameter is number of seconds or number of lines to analyse. In this case more than one value will be processed. Returns: 1 - found; 0 - otherwise |
strlen | sec or #num | str, log, text | Length of the last (most recent) value in characters (not bytes). Parameters are the same as for function last. For example, strlen(0) is equal to strlen(#1); strlen(#3) - length of the third most recent value; strlen(0,86400) - length of the most recent value one day ago. This function is supported since Zabbix 1.8.4. |
sum | sec or #num | float, int | Sum of values for period of time. Parameter defines length of the period in seconds. Since Zabbix 1.8.2, the function supports a second, optional parameter time_shift. See function avg for an example of its use. |
time | ignored | any | Returns current time in HHMMSS format. Example: 123055 |
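The interplay of the period, pattern, operator and time_shift parameters of count can be modelled in Python for numeric values. This is a sketch under assumptions, not the server's implementation: history is passed in as (timestamp, value) pairs, only the sec form of the first parameter is covered, the window boundaries are chosen arbitrarily, and the function name is hypothetical.

```python
import operator

OPERATORS = {
    "eq": lambda value, pattern: abs(value - pattern) < 0.000001,
    "ne": lambda value, pattern: abs(value - pattern) >= 0.000001,
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


def count(samples, now, sec, pattern=None, op="eq", time_shift=0):
    """Count values in a window of sec seconds that ends time_shift seconds before now."""
    end = now - time_shift
    start = end - sec
    window = [value for stamp, value in samples if start < stamp <= end]
    if pattern is None:
        return len(window)  # like count(600,,,86400): count arbitrary values
    return sum(1 for value in window if OPERATORS[op](value, pattern))
```

With this model, count(samples, now, 600, 12, "gt") corresponds to count(600,12,"gt"), and count(samples, now, 600, time_shift=86400) to count(600,,,86400).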
Most numeric functions accept a number of seconds as an argument. You may also use the prefix # to specify that the argument has a different meaning:
FUNCTION CALL | MEANING |
---|---|
sum(600) | Sum of all values within 600 seconds |
sum(#5) | Sum of the last 5 values |
Function last uses a different meaning for values, prefixed with the hash mark - it makes it choose n-th previous value, so given values (from most recent to least recent) 3, 7, 2, 6, 5, last(#2) would return 7 and last(#5) would return 5.
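The difference between the two argument forms can be shown with a short Python sketch that uses the values from the paragraph above. The function names are hypothetical; history is modelled as a list ordered from most recent to least recent, or as (timestamp, value) pairs for the time-based form.

```python
def last(history, n=1):
    """last(#n): the n-th most recent value; history is ordered most recent first."""
    return history[n - 1]


def sum_last(history, num):
    """sum(#num): sum of the last num values."""
    return sum(history[:num])


def sum_period(samples, now, sec):
    """sum(sec): sum of all values received within the last sec seconds."""
    return sum(value for stamp, value in samples if stamp > now - sec)


values = [3, 7, 2, 6, 5]  # most recent first
```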
Trigger expressions support using various multipliers as suffixes.
A simple useful expression might look like:
{<server>:<key>.<function>(<parameter>)}<operator><constant>
A parameter must be given even for those functions which ignore it. Example: last(0)
Processor load is too high on www.zabbix.com
{www.zabbix.com:system.cpu.load[all,avg1].last(0)}>5
'www.zabbix.com:system.cpu.load[all,avg1]' gives a short name of the monitored parameter. It specifies that the server is 'www.zabbix.com' and the key being monitored is 'system.cpu.load[all,avg1]'. By using the function 'last()', we are referring to the most recent value. Finally, '>5' means that the trigger is in the PROBLEM state whenever the most recent processor load measurement from www.zabbix.com is greater than 5.
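The parts of such a simple expression can be pulled apart with a regular expression. The Python sketch below is only an illustration of the {<server>:<key>.<function>(<parameter>)}<operator><constant> layout; it is a hypothetical helper that handles a single function call with one comparison, not the parser used by Zabbix.

```python
import re

SIMPLE_EXPRESSION = re.compile(
    r"^\{(?P<server>[^:{}]+):(?P<key>.+)\.(?P<function>\w+)"
    r"\((?P<parameter>[^()]*)\)\}(?P<operator>[<>=#])(?P<constant>.+)$"
)


def parse_simple_expression(expression):
    """Split a simple trigger expression into its named parts."""
    match = SIMPLE_EXPRESSION.match(expression)
    if match is None:
        raise ValueError("not a simple trigger expression: %r" % expression)
    return match.groupdict()


parts = parse_simple_expression("{www.zabbix.com:system.cpu.load[all,avg1].last(0)}>5")
```

The function name is taken to be whatever follows the last dot before the parentheses, which is why dots inside the key do not confuse the split.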
www.zabbix.com is overloaded
{www.zabbix.com:system.cpu.load[all,avg1].last(0)}>5|{www.zabbix.com:system.cpu.load[all,avg1].min(600)}>2
The expression is true when either the current processor load is more than 5 or the processor load has stayed above 2 during the last 10 minutes.
/etc/passwd has been changed
Use of function diff:
{www.zabbix.com:vfs.file.cksum[/etc/passwd].diff(0)}=1
The expression is true when the previous value of checksum of /etc/passwd differs from the most recent one.
Similar expressions could be useful to monitor changes in important files, such as /etc/passwd, /etc/inetd.conf, /kernel, etc.
Someone is downloading a large file from the Internet
Use of function min:
{www.zabbix.com:net.if.in[eth0,bytes].min(300)}>100K
The expression is true when the minimum number of received bytes on eth0 over the last 5 minutes is more than 100 KB.
Both nodes of clustered SMTP server are down
Note use of two different hosts in one expression:
{smtp1.zabbix.com:net.tcp.service[smtp].last(0)}=0&{smtp2.zabbix.com:net.tcp.service[smtp].last(0)}=0
The expression is true when the SMTP service is down on both smtp1.zabbix.com and smtp2.zabbix.com.
Zabbix agent needs to be upgraded
Use of function str():
{zabbix.zabbix.com:agent.version.str("beta8")}=1
The expression is true if Zabbix agent has version beta8 (presumably 1.0beta8).
Server is unreachable
{zabbix.zabbix.com:icmpping.count(1800,0)}>5
The expression is true if host “zabbix.zabbix.com” is unreachable more than 5 times in the last 30 minutes.
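As a rough illustration of what count(1800,0) computes, the following Python sketch counts the stored values equal to 0 within the last 1800 seconds. The function name, the sample data and the exact handling of the window boundaries are assumptions made for this example.

```python
def count(samples, now, sec, pattern):
    """Count samples from the last `sec` seconds whose value equals `pattern`.

    `samples` is a list of (timestamp, value) pairs. Treating the window as
    (now - sec, now] is an assumption of this sketch.
    """
    return sum(
        1 for ts, value in samples
        if now - sec < ts <= now and value == pattern
    )


# Hypothetical icmpping results, one per minute: 1 = reachable, 0 = unreachable.
samples = [(t, 0 if t % 240 == 0 else 1) for t in range(60, 1801, 60)]

failures = count(samples, now=1800, sec=1800, pattern=0)
print(failures > 5)  # the trigger condition from the example
```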
No heartbeats within last 3 minutes
Use of function nodata():
{zabbix.zabbix.com:tick.nodata(180)}=1
'tick' must have type 'Zabbix trapper'. In order to make this trigger work, item 'tick' must be defined. The host should periodically send data for this parameter using zabbix_sender. If no data is received within 180 seconds, the trigger value becomes PROBLEM.
CPU activity at night time
Use of function time():
{zabbix:system.cpu.load[all,avg1].min(300)}>2&{zabbix:system.cpu.load[all,avg1].time(0)}>000000&{zabbix:system.cpu.load[all,avg1].time(0)}<060000
The trigger may change its status to true only at night (00:00-06:00).
Check if client local time is in sync with Zabbix server time
Use of function fuzzytime():
{MySQL_DB:system.localtime.fuzzytime(10)}=0
The trigger changes to the PROBLEM state when the local time on server MySQL_DB and the time on the Zabbix server differ by more than 10 seconds.
Trigger dependencies can be used to define relationships between triggers.
Trigger dependencies are a very convenient way of limiting the number of messages sent when an event affects several resources.
For example, a host Host is behind router Router2, and Router2 is behind Router1.
Zabbix - Router1 - Router2 - Host
If Router1 is down, then obviously Host and Router2 are also unreachable. One does not want to receive three notifications about Host, Router1 and Router2. This is where trigger dependencies come in handy.
In this case, we define these dependencies:
trigger 'Host is down' depends on trigger 'Router2 is down'
trigger 'Router2 is down' depends on trigger 'Router1 is down'
Before changing status of trigger 'Host is down', Zabbix will check if there are corresponding trigger dependencies defined. If so, and one of the triggers is in PROBLEM state, then trigger status will not be changed and thus actions will not be executed and notifications will not be sent.
Zabbix performs this check recursively. If Router1 or Router2 is unreachable, the Host trigger won't be updated.
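The recursive check can be sketched in Python as follows. This is only an illustration of the behaviour described above, not the server's actual implementation; the function name and data structures are hypothetical.

```python
def blocked_by_dependency(trigger, depends_on, in_problem, _seen=None):
    """Return True if any direct or indirect dependency is in PROBLEM state."""
    seen = set() if _seen is None else _seen
    for parent in depends_on.get(trigger, ()):
        if parent in seen:
            continue  # guard against circular dependencies
        seen.add(parent)
        if parent in in_problem:
            return True
        if blocked_by_dependency(parent, depends_on, in_problem, seen):
            return True
    return False


# Dependencies from the example: Host -> Router2 -> Router1.
depends_on = {
    "Host is down": ["Router2 is down"],
    "Router2 is down": ["Router1 is down"],
}

# Router1 is down, so the status of 'Host is down' must not be changed.
print(blocked_by_dependency("Host is down", depends_on, {"Router1 is down"}))  # True
```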
Trigger severity defines how important a trigger is. Zabbix supports the following trigger severities:
SEVERITY | DEFINITION | COLOR |
---|---|---|
Not classified | Unknown severity. | Gray. |
Information | For information purposes. | Light green. |
Warning | Be warned. | Light yellow. |
Average | Average problem. | Dark red. |
High | Something important has happened. | Red. |
Disaster | Disaster. Financial losses, etc. | Bright red. |
The severities are used to:
Sometimes a trigger must have different conditions for different states. For example, we would like to define a trigger which becomes PROBLEM when the server room temperature is higher than 20C and stays in that state until the temperature drops to 15C or lower.
In order to do this, we define the following trigger:
Temperature in server room is too high
({TRIGGER.VALUE}=0&{server:temp.last(0)}>20)| ({TRIGGER.VALUE}=1&{server:temp.last(0)}>15)
Note use of macro {TRIGGER.VALUE}. The macro returns current trigger value.
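The effect of {TRIGGER.VALUE} in the temperature expression can be traced with a small Python sketch. The function name and the list of readings are made up for illustration; the logic mirrors the expression above, where 1 stands for PROBLEM and 0 for OK.

```python
def next_trigger_value(current_value, temperature):
    """Evaluate ({TRIGGER.VALUE}=0 & temp>20) | ({TRIGGER.VALUE}=1 & temp>15)."""
    fires = (current_value == 0 and temperature > 20) or (
        current_value == 1 and temperature > 15
    )
    return 1 if fires else 0


value = 0  # the trigger starts in the OK state
states = []
for reading in [18, 21, 19, 16, 14, 18]:
    value = next_trigger_value(value, reading)
    states.append(value)

print(states)  # [0, 1, 1, 1, 0, 0]
```

Note that the reading of 18 leaves the trigger in PROBLEM while it is already in that state, but does not raise it from OK.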
Free disk space is too low
Problem: it is less than 10GB for last 5 minutes
Recovery: it is more than 40GB for last 10 minutes
({TRIGGER.VALUE}=0&{server:vfs.fs.size[/,free].max(5m)}<10G) | ({TRIGGER.VALUE}=1&{server:vfs.fs.size[/,free].min(10m)}<40G)
Note use of macro {TRIGGER.VALUE}. The macro returns current trigger value.
Zabbix screens allow grouping of various information for quick access and display on one screen. An easy-to-use screen builder makes creating screens easy and intuitive.
A screen is a table which may contain the following elements in each cell:
The number of elements per screen is unlimited.
You can configure screens in Configuration → Screens and view them in Monitoring → Screens as well as include your favourite screens in the favourites section of Monitoring → Dashboard.
A slide show is a series of screens, which will be automatically rotated according to configured update intervals.
You can configure slide shows in Configuration → Slides.
PARAMETER | Description |
---|---|
Name | Name of slide show. |
Update interval (in sec) | This parameter defines the default interval between screen rotation, in seconds. |
Slides | List of individual slides (screens) |
Screen | Screen name |
Delay | How long the screen will be displayed, in seconds. If set to 0, Update Interval of the slide show will be used. |
Slide show “Zabbix administrators”
The slide show consists of two screens which will be displayed in the following order:
Zabbix Server ⇒ Pause 60 seconds ⇒ Zabbix Server2 ⇒ Pause 30 seconds ⇒ Zabbix Server ⇒ Pause 60 seconds ⇒ Zabbix Server2 ⇒ …
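The rotation order can be reproduced with a short Python sketch. It assumes a slide show update interval of 60 seconds, a delay of 0 for "Zabbix Server" and a delay of 30 for "Zabbix Server2"; these settings are an assumption chosen to match the sequence above, and the function name is hypothetical.

```python
from itertools import cycle, islice


def rotation(slides, update_interval):
    """Yield (screen, seconds) pairs endlessly.

    A delay of 0 falls back to the slide show's update interval.
    """
    for screen, delay in cycle(slides):
        yield screen, (delay if delay else update_interval)


slides = [("Zabbix Server", 0), ("Zabbix Server2", 30)]

for screen, seconds in islice(rotation(slides, update_interval=60), 4):
    print(f"{screen}: pause {seconds} seconds")
```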
IT Services are intended for those who want to get a high-level (business) view of monitored infrastructure. In many cases, we are not interested in low-level details, like lack of disk space, high processor load, etc. What we are interested in is the availability of services provided by our IT department. We can also be interested in identifying weak places of IT infrastructure, the SLA of various IT services, the structure of the existing IT infrastructure, and other higher-level information.
Zabbix IT Services provide answers to all mentioned questions.
IT Services is a hierarchical representation of monitored data.
A very simple IT service structure may look like:
IT Service
|
|-Workstations
| |
| |-Workstation1
| |
| |-Workstation2
|
|-Servers
Each node of the structure has attribute status. The status is calculated and propagated to upper levels according to selected algorithm. At the lowest level of IT Services are triggers. The status of individual nodes is affected by the status of their triggers.
To configure IT Services, go to Configuration → IT Services.
On this screen you can build a hierarchy of your monitored infrastructure. The highest-level parent service is 'root'. You can build your hierarchy downward by adding lower-level parent services and then individual nodes to them.
Click on a service to add services to it or edit the service. A form is displayed where you can edit service attributes.
IT Service attributes:
Parameter | Description |
---|---|
Name | Service name. |
Parent service | Parent service the service belongs to. |
Depends on | List of child services the service depends on. |
Status calculation algorithm | Method of calculating service status: Do not calculate - do not calculate service status; Problem, if at least one child has a problem - considered to be a problem as soon as one child service has a problem; Problem, if all children have problems - considered to be a problem only if all child services have problems |
Calculate SLA | Enable SLA calculation and display. |
Acceptable SLA (in %) | SLA percentage that is acceptable for this service. Used for reporting. |
Service times | By default, all services are expected to operate 24x7x365. If exceptions are needed, add new service times. |
New service time | Service times: One-time downtime - a single downtime. Service state within this period does not affect SLA. Uptime - service uptime Downtime - service state within this period does not affect SLA. Add the respective hours. Note: Service times affect only the service they are configured for. Thus, a parent service will not take into account the service time configured on a child service (unless a corresponding service time is configured on the parent service as well). |
Link to trigger | Linkage to trigger: None - no linkage trigger name - linked to the trigger, thus depends on the trigger status Services of the lowest level must be linked to triggers. (Otherwise their state will not be represented accurately.) |
Sort order | Sort order for display, lowest comes first. |
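How the status calculation algorithms propagate problems up the hierarchy can be sketched in Python. This is a simplified illustration: it tracks only problem/no problem rather than severities, and the dictionary keys and constant names are hypothetical.

```python
DO_NOT_CALCULATE = "do not calculate"
ANY_CHILD = "at least one child"  # Problem, if at least one child has a problem
ALL_CHILDREN = "all children"     # Problem, if all children have problems


def has_problem(service):
    """Return True if the service is considered to have a problem."""
    algorithm = service.get("algorithm", ANY_CHILD)
    if algorithm == DO_NOT_CALCULATE:
        return False
    children = service.get("children", [])
    if not children:
        # Lowest-level service: the status comes from its linked trigger.
        return bool(service.get("trigger_problem", False))
    results = [has_problem(child) for child in children]
    return all(results) if algorithm == ALL_CHILDREN else any(results)


ws1 = {"name": "Workstation1", "trigger_problem": True}
ws2 = {"name": "Workstation2", "trigger_problem": False}
workstations = {"name": "Workstations", "algorithm": ALL_CHILDREN, "children": [ws1, ws2]}
servers = {"name": "Servers", "trigger_problem": False}
root = {"name": "IT Service", "algorithm": ANY_CHILD, "children": [workstations, servers]}

print(has_problem(root))  # False: only one of the two workstations has a problem

ws2["trigger_problem"] = True
print(has_problem(root))  # True: all workstations now have problems
```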
To monitor IT Services, go to Monitoring → IT Services.
A list of the existing IT services is displayed along with data of their status and SLA. From the dropdown in the upper right corner you can select a desired period for display.
Displayed data:
Parameter | Description |
---|---|
Service | Service name. |
Status | Status of service: OK - no problems (trigger colour and severity) - indicates a problem and its severity |
Reason | Indicates the reason for the problem (if any). |
SLA (period) | Displays SLA bar. Green/red ratio indicates the proportion of availability/problems. |
SLA | Displays acceptable SLA/current SLA value. If current value is below the acceptable level, it is displayed in red. |
Graph | Contains link to a graph of availability data. |
You can also click on the green/red SLA bar to access the IT Services Availability Report.
Here you can assess IT service availability data over a longer period of time on daily/weekly/monthly/yearly basis.
All Zabbix users access the Zabbix application through the Web-based front end. Each Zabbix user is assigned a unique login name and a password. All user passwords are encrypted and stored in the Zabbix database. Users cannot use their user id and password to log directly into the UNIX server unless they have also been set up accordingly on the UNIX server. Communication between the Web server and the user's browser can be protected using SSL.
Access permissions to screens within the menu may be set for each user. By default, no permissions to a screen are granted when a user is registered in Zabbix.
Note that a user is automatically disconnected after 30 minutes of inactivity.
Zabbix has a flexible user permission schema which can be efficiently used to manage user permission within one Zabbix installation or in a distributed environment.
Permissions are granted to user groups on a host group level.
Zabbix supports several types of users. The type controls which administrative functions a user is permitted to use.
User types are used to define access to administrative functions and to specify default permissions.
User type | Description |
---|---|
Zabbix User | The user has access to Monitoring menu. The user has no access to any resources by default. Permissions to host groups must be explicitly assigned. |
Zabbix Admin | The user has access to Monitoring and Configuration. The user has no access to any host groups by default. Permissions to host groups must be explicitly given. |
Zabbix Super Admin | The user has access to everything: Monitoring, Configuration and Administration. The user has Read-Write access to all host groups. Permissions cannot be revoked by denying access to specific host groups. |
Zabbix Queue displays items that are waiting for a refresh. The Queue is just a logical representation of data from the database. There is no IPC queue or any other queue mechanism in Zabbix.
Statistics shown by the Queue are a good indicator of the performance of Zabbix server.
The Queue on a standalone application or when displayed for a master node shows items waiting for a refresh.
In this case, we see that we have three items of type Zabbix agent waiting to be refreshed for 0-5 seconds, and one item of type Zabbix agent (active) waiting for more than five minutes (perhaps the agent is down?). Note that information displayed for a child node is not up to date. The master node receives historical data with a certain delay (normally, up to 10 seconds for inter-node data transfer), so the information is delayed.
On the screenshot we see that there are 93 items waiting more than 5 minutes for refresh on node “Child”, however we should not trust the information as it depends on:
The scripts are used to automatically start/stop Zabbix processes during system's start-up/shutdown.
The scripts are located under directory misc/init.d.
The script is used to receive SNMP traps. The script must be used in combination with snmptrapd, which is part of package net-snmp.
Configuration guide:
traphandle default /bin/bash /home/zabbix/bin/snmptrap.sh
Complex regular expressions can be created and tested in the Zabbix frontend by going to Administration → General → Regular expressions.
After a regular expression has been created, it can be used everywhere regular expressions are supported by referring to its name, prefixed with @, for example, @mycustomregexp.
All regular expressions in Zabbix, whether created with the advanced editor, or entered manually, support POSIX extended regular expressions.
While many things in the frontend can be configured using the frontend itself, some customisations are currently only possible by editing a definitions file. Located in the frontend directory, this file is include/defines.inc.php. Parameters in this file that could be of interest to users:
For how long to show triggers in OK state after their state changed from PROBLEM, in seconds.
Default: 1800
For how long a trigger should blink after its state changed, in seconds.
Default: 1800
Default graph period, in seconds. One hour by default.
Minimum graph period, in seconds. One hour by default.
Maximum graph period, in seconds. Two years by default since 1.6.7, one year before that.
Default location of Y axis in simple graphs and default value for drop down box when adding items to custom graphs. Possible values: 0 - left, 1 - right.
Default: 0
Threshold value for roundoff constants. Values less than the threshold are rounded to ZBX_UNITS_ROUNDOFF_LOWER_LIMIT digits after the decimal point; values greater than the threshold are rounded to ZBX_UNITS_ROUNDOFF_UPPER_LIMIT digits.
Default: 0.01
Number of digits after the decimal point when the value is greater than the roundoff threshold.
Default: 2
Number of digits after the decimal point when the value is less than the roundoff threshold.
Default: 6
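The interplay of the three roundoff settings can be illustrated with a minimal Python sketch using the default values listed above. The constant names, the use of the absolute value and the treatment of a value exactly equal to the threshold are assumptions of this example, not taken from the frontend code.

```python
ROUNDOFF_THRESHOLD = 0.01  # default threshold from the text (name is hypothetical)
ROUNDOFF_UPPER_LIMIT = 2   # digits used when the value is greater than the threshold
ROUNDOFF_LOWER_LIMIT = 6   # digits used when the value is less than the threshold


def round_for_display(value):
    """Round a value to the number of digits selected by the threshold."""
    if abs(value) > ROUNDOFF_THRESHOLD:
        digits = ROUNDOFF_UPPER_LIMIT
    else:
        digits = ROUNDOFF_LOWER_LIMIT
    return round(value, digits)


print(round_for_display(3.14159))        # rounded to 2 digits
print(round_for_display(0.00123456789))  # rounded to 6 digits
```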
Number of days that influences the frontend's choice of which history or trends table to process for the selected period when graphing data. When this define is:
This define could be useful for partitioned history data storage.
Default: -1
Enables support for Zapcat Zabbix Java JMX bridge item keys syntax
Default: false
It is possible to simplify Zabbix trigger expressions or item keys by using suffixes.
The following table summarises available standard multipliers in Zabbix frontend and server:
Component | Until 1.8.2 | Additional in 1.8.2 |
---|---|---|
Server | K (Kilo) M (Mega) G (Giga) | T (Tera) |
Frontend | K (Kilo) M (Mega) G (Giga) T (Tera) | P (Peta) E (Exa) Z (Zetta) Y (Yotta) |
Since Zabbix version 1.8.2 the following time-related multipliers are available:
These are supported in trigger expression constants and function parameters, as well as in the internal item zabbix[queue,<from>,<to>] parameters.
These multipliers make it possible to write expressions that are easier to understand and maintain. For example, the following expressions:
{host:zabbix[proxy,zabbix_proxy,lastaccess]}>120
{host:system.uptime[].last(0)}<86400
{host:system.cpu.load.avg(600)}<10
could be changed to:
{host:zabbix[proxy,zabbix_proxy,lastaccess]}>2m
{host:system.uptime.last(0)}<1d
{host:system.cpu.load.avg(10m)}<10
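The substitutions above can be checked with a small Python sketch that expands a suffixed constant into a plain number. The function name is hypothetical. Only the time suffixes used in the examples (m and d) are included, and the factor of 1024 for K, M, G and T is an assumption of this sketch.

```python
# Size multipliers; the base of 1024 is an assumption of this sketch.
SIZE_SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
# Time multipliers as used in the examples above: 2m = 120, 1d = 86400.
TIME_SUFFIXES = {"m": 60, "d": 86400}


def expand(constant):
    """Convert a constant such as '10m' or '100K' into a plain number."""
    suffix = constant[-1]
    factor = SIZE_SUFFIXES.get(suffix) or TIME_SUFFIXES.get(suffix)
    if factor is None:
        return float(constant)  # no known suffix, plain number
    return float(constant[:-1]) * factor


print(expand("2m"))   # 120.0
print(expand("1d"))   # 86400.0
print(expand("10m"))  # 600.0
```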
Time period has the following format:
d-d,hh:mm-hh:mm
You can specify more than one time period using a semicolon (;) separator:
d-d,hh:mm-hh:mm;d-d,hh:mm-hh:mm...
Format | Description |
---|---|
d | Day of week: 1 - Monday, 2 - Tuesday, …, 7 - Sunday |
hh | Hours: 00-24 |
mm | Minutes: 00-59 |
An empty time specification is equivalent to 01-07,00:00-24:00
Working hours. Monday - Friday from 9:00 till 18:00:
1-5,09:00-18:00
Working hours plus weekend. Monday - Friday from 9:00 till 18:00 and Saturday, Sunday from 10:00 till 16:00:
1-5,09:00-18:00;6-7,10:00-16:00
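To make the format concrete, here is a minimal Python sketch that parses a time period specification and checks whether a given moment falls inside it. The function names are hypothetical, only the d-d,hh:mm-hh:mm form is handled, and treating the end time as exclusive is an assumption of this example.

```python
from datetime import datetime


def to_minutes(hhmm):
    """Convert 'hh:mm' into minutes since midnight."""
    hh, mm = hhmm.split(":")
    return int(hh) * 60 + int(mm)


def parse_periods(spec):
    """Parse 'd-d,hh:mm-hh:mm;...' into (first_day, last_day, start, end) tuples."""
    periods = []
    for part in spec.split(";"):
        days, hours = part.split(",")
        first_day, last_day = (int(d) for d in days.split("-"))
        start, end = hours.split("-")
        periods.append((first_day, last_day, to_minutes(start), to_minutes(end)))
    return periods


def in_period(spec, moment):
    """Return True if `moment` falls within any of the periods in `spec`."""
    day = moment.isoweekday()  # 1 = Monday ... 7 = Sunday, as in the table above
    minute = moment.hour * 60 + moment.minute
    return any(
        first <= day <= last and start <= minute < end
        for first, last, start, end in parse_periods(spec)
    )


spec = "1-5,09:00-18:00;6-7,10:00-16:00"
print(in_period(spec, datetime(2024, 1, 1, 10, 0)))  # Monday 10:00 -> True
print(in_period(spec, datetime(2024, 1, 6, 9, 30)))  # Saturday 09:30 -> False
```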