2 Global event correlation

Overview

Global event correlation makes it possible to correlate events across all metrics monitored by Zabbix.

It is possible to correlate events created by completely different triggers and apply the same operations to them all. By creating intelligent correlation rules it is actually possible to save yourself from thousands of repetitive notifications and focus on root causes of a problem!

Global event correlation is a powerful mechanism that frees you from problem and resolution logic based on a single trigger. Without it, a problem event is created by one trigger and we depend on that same trigger to resolve the problem; a problem created by one trigger cannot be resolved by another trigger. With event correlation based on event tagging, it can.

For example, a log trigger may report application problems, while a polling trigger may report the application to be up and running. Using event tags, you can tag the log trigger as Status: Down and the polling trigger as Status: Up. Then, in a global correlation rule, you can relate these triggers and assign an appropriate operation to the correlation, such as closing the old events.

In another use, global correlation can identify similar triggers and apply the same operation to them. What if we could get only one problem report per network port problem? No need to report them all. That is also possible with global event correlation.

Global event correlation is configured in correlation rules. A correlation rule defines how new problem events are paired with existing problem events and what to do in case of a match: close the new event, or close the matched old events by generating corresponding OK events. If a problem is closed by global correlation, this is reported in the Info column of Monitoring → Problems.
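The pairing of a new event with existing problems can be pictured with a small model. The following is a minimal sketch in Python, not Zabbix code: the `Event` class, the `correlate` function and the `Application` tag are hypothetical names introduced only for illustration. It models the Status: Down / Status: Up case with a "close old events" operation, and uses the extra `Application` tag to keep the correlation scope narrow.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stand-in for a problem event with its tags."""
    name: str
    tags: dict
    is_open: bool = True


def correlate(new_event, problems):
    """Close old 'Status: Down' problems matched by a new 'Status: Up' event.

    Only problems whose (assumed) Application tag equals that of the new
    event are closed, so unrelated problems stay open.
    """
    closed = []
    if new_event.tags.get("Status") != "Up":
        return closed  # the new-event condition is not met
    for old in problems:
        if (old.is_open
                and old.tags.get("Status") == "Down"
                and old.tags.get("Application") == new_event.tags.get("Application")):
            old.is_open = False  # models generating the corresponding OK event
            closed.append(old)
    return closed


problems = [
    Event("Errors in shop log", {"Status": "Down", "Application": "shop"}),
    Event("Errors in billing log", {"Status": "Down", "Application": "billing"}),
]
up = Event("Shop responds to polling", {"Status": "Up", "Application": "shop"})
print([e.name for e in correlate(up, problems)])
```

Without the check on the `Application` tag, the same new event would close every open Status: Down problem, which is the kind of overly broad match the rule design should avoid.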

Configuring global correlation rules is available to Zabbix Super Admin level users only.

Event correlation must be configured very carefully, as it can negatively affect event processing performance or, if misconfigured, close more events than was intended (in the worst case even all problem events could be closed).

To configure global correlation safely, observe the following important tips:

  • Reduce the correlation scope. Always set a unique tag for the new event that is paired with old events and use the New event tag correlation condition;
  • Add a condition based on the old event when using the Close old events operation (or else all existing problems could be closed);
  • Avoid using common tag names that may end up being used by different correlation configurations;
  • Keep the number of correlation rules limited to the ones you really need.
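Some of these tips can be checked mechanically before a rule is enabled. The sketch below is an illustration only: the dictionary layout, the condition and operation names, and the `check_rule` function are assumptions made for this example and do not correspond to a Zabbix API. Treating an event tag pair as a condition based on the old event is also an assumption of this sketch.

```python
# Condition kinds that refer to the old event (assumed naming for this sketch).
OLD_EVENT_CONDITIONS = {"old_event_tag", "old_event_tag_value", "event_tag_pair"}
# Condition kinds that narrow the scope through a tag on the new event.
NEW_EVENT_TAG_CONDITIONS = {"new_event_tag", "new_event_tag_value"}


def check_rule(rule):
    """Return a list of warnings for a correlation rule described as a dict."""
    warnings = []
    kinds = {condition["type"] for condition in rule["conditions"]}
    if "close_old_events" in rule["operations"] and not kinds & OLD_EVENT_CONDITIONS:
        warnings.append("Close old events is used without a condition on the old event")
    if not kinds & NEW_EVENT_TAG_CONDITIONS:
        warnings.append("No New event tag condition narrows the correlation scope")
    return warnings


risky_rule = {
    "conditions": [{"type": "new_event_tag", "tag": "Recovered"}],
    "operations": ["close_old_events"],
}
for warning in check_rule(risky_rule):
    print(warning)
```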

See also: known issues.

Configuration

To configure event correlation rules globally:

  • Go to Configuration → Event correlation
  • Click on Create correlation to the right (or on the correlation name to edit an existing rule)
  • Enter parameters of the correlation rule in the form

correlation_rule.png

All mandatory input fields are marked with a red asterisk.

Parameter | Description

Name | Unique correlation rule name.

Type of calculation | The following options for calculating conditions are available:
  • And - all conditions must be met
  • Or - it is enough if one condition is met
  • And/Or - AND with different condition types and OR with the same condition type
  • Custom expression - a user-defined calculation formula for evaluating the conditions. It must include all conditions (represented as uppercase letters A, B, C, ...) and may include spaces, tabs, brackets ( ), and (case sensitive), or (case sensitive), not (case sensitive).

Conditions | List of conditions, as selected from the New condition field.

New condition | Select the conditions upon which events are correlated and click on Add. The following conditions are available:
  • Old event tag - match new event to the old event(s) that have the corresponding old event tag
  • New event tag - match new event that has the corresponding event tag to old event(s)
  • New event host group - match new event that belongs to the corresponding host group to old event(s)
  • Event tag pair - match new event to the old event(s) if the values of the specified tags in both events match. Tag names need not match. This option is useful for matching runtime values, which may not be known at the time of configuration (see also Example 1)
  • Old event tag value - match new event to the old event(s) that:
      • = - have the corresponding old event tag value
      • <> - do not have the corresponding old event tag value
      • like - have the corresponding string in the old event tag value
      • not like - do not have this string in the corresponding old event tag value
  • New event tag value - match new event to old event(s) if the new event:
      • = - has the corresponding new event tag value
      • <> - does not have the corresponding new event tag value
      • like - has the corresponding string in the new event tag value
      • not like - does not have this string in the corresponding new event tag value

Description | Correlation rule description.

Enabled | If you mark this checkbox, the correlation rule will be enabled.

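To illustrate how a Custom expression such as A and (B or C) combines individual condition results, here is a minimal sketch in Python. It is not how Zabbix evaluates expressions internally; the function name `evaluate_custom_expression` is hypothetical, and operator precedence is assumed to follow Python's (not, then and, then or). The sketch accepts only the tokens listed above and requires that every condition label appears in the expression.

```python
import re

# Allowed tokens: brackets, lowercase operators, uppercase condition labels.
TOKEN_RE = re.compile(r"\(|\)|\band\b|\bor\b|\bnot\b|\b[A-Z]+\b")


def evaluate_custom_expression(expression, results):
    """Evaluate an expression like 'A and (B or C)' against condition results.

    `results` maps each condition label to True or False.
    """
    tokens = TOKEN_RE.findall(expression)
    # Anything left over after removing whitespace is an unsupported token.
    if "".join(tokens) != re.sub(r"\s+", "", expression):
        raise ValueError("expression contains unsupported tokens")
    labels = {token for token in tokens if token.isupper()}
    if labels != set(results):
        raise ValueError("expression must include all conditions and no others")
    # Substitute labels with their results; only validated tokens reach eval().
    substituted = " ".join(str(results[t]) if t in results else t for t in tokens)
    return eval(substituted, {"__builtins__": {}}, {})


print(evaluate_custom_expression("A and (B or C)", {"A": True, "B": False, "C": True}))
# True
```

Because the operators are case sensitive, an expression written as A AND B is rejected here: AND is read as a condition label that has no matching condition.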
  • Select the operation of the correlation rule in the form

Parameter | Description

Operations | List of operations, selected from the New operation field.

New operation | Select the operation to perform when an event is correlated and click on Add. The following operations are available:
  • Close old events - close old events when a new event happens. Always add a condition based on the old event when using the Close old events operation, or all existing problems could be closed.
  • Close new event - close the new event when it happens

Misconfiguration is possible when similar event tags are created for unrelated problems, so please review the cases outlined below!

  • Actual tags and tag values only become visible when a trigger fires. If the regular expression used is invalid, it is silently replaced with an *UNKNOWN* string. If the initial problem event with an *UNKNOWN* tag value is missed, subsequent OK events with the same *UNKNOWN* tag value may appear and close problem events that they should not have closed.
  • If a user uses the {ITEM.VALUE} macro without macro functions as the tag value, the 255-character limitation applies. When log messages are long and the first 255 characters are non-specific, this may also result in similar event tags for unrelated problems.
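The effect of the 255-character limit can be reproduced with plain string slicing. The sketch below assumes that the tag value is simply the first 255 characters of the item value; the `tag_value` helper and the sample log lines are made up for illustration.

```python
TAG_VALUE_LIMIT = 255  # limit stated above for {ITEM.VALUE} used as a tag value


def tag_value(item_value):
    """Model the tag value as the first 255 characters of the item value."""
    return item_value[:TAG_VALUE_LIMIT]


# Two different log messages that share a long, non-specific prefix.
prefix = "2022-10-01 12:00:00 app-server worker: " + "x" * 300
message_a = prefix + " ERROR database connection lost"
message_b = prefix + " ERROR disk full"

print(message_a == message_b)                        # False
print(tag_value(message_a) == tag_value(message_b))  # True
```

The two messages describe unrelated problems, yet they yield the same tag value, so a correlation rule matching on that tag would treat them as the same problem.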

Examples

Example 1

Stop repetitive problem events from the same network port.

This global correlation rule will correlate problems if Host and Port tag values exist on the trigger and they are the same in the original event and the new one.

This operation will close new problem events on the same network port, keeping only the original problem open.
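For reference, a rule like this could also be described as a JSON-RPC request for the Zabbix API. The sketch below only builds and prints the request body; it does not send anything. The method name `correlation.create`, the field names and the numeric codes are written as recalled from the Zabbix API reference and should be treated as assumptions to verify against the documentation for your version, as should the use of the `auth` field. The function name and token placeholder are hypothetical. The And calculation type is chosen here because, as described above, And/Or would combine two conditions of the same type with OR.

```python
import json

# Numeric codes below are assumptions to verify in the Zabbix API reference.
EVALTYPE_AND = 1               # "And" type of calculation
CONDITION_EVENT_TAG_PAIR = 3   # "Event tag pair" condition
OPERATION_CLOSE_NEW_EVENT = 1  # "Close new event" operation


def build_port_correlation_request(auth_token, request_id=1):
    """Build a request body for a rule matching equal Host and Port tag values."""
    return {
        "jsonrpc": "2.0",
        "method": "correlation.create",
        "params": {
            "name": "Stop repetitive problem events from the same network port",
            "filter": {
                "evaltype": EVALTYPE_AND,
                "conditions": [
                    {"type": CONDITION_EVENT_TAG_PAIR, "oldtag": "Host", "newtag": "Host"},
                    {"type": CONDITION_EVENT_TAG_PAIR, "oldtag": "Port", "newtag": "Port"},
                ],
            },
            "operations": [{"type": OPERATION_CLOSE_NEW_EVENT}],
        },
        "auth": auth_token,  # placeholder; the authentication method may differ by version
        "id": request_id,
    }


print(json.dumps(build_port_correlation_request("YOUR_API_TOKEN"), indent=2))
```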