Zabbix Documentation 3.4


manual:config:event_correlation

Differences

This shows you the differences between two versions of the page.

manual:config:event_correlation [2017/03/31 08:07]
martins-v description for correlation rule operations
manual:config:event_correlation [2017/08/25 06:58] (current)
martins-v removing auto numbering
Line 1: Line 1:
-===== - #5 Event correlation =====
+===== 5 Event correlation =====
  
 === Overview ===
  
-While generally OK events close all problem events in Zabbix, there are cases when a more detailed approach is needed. For example, when monitoring log files you may want to discover certain problems in a log file and close them individually rather than all together.
+Event correlation allows problem events to be correlated to their resolution in a precise and flexible manner.
  
-This is the case with triggers that have //Multiple Problem Event Generation// enabled. Such triggers are normally used for log monitoring, trap processing, etc.
+Event correlation can be defined:
  
-It is possible in Zabbix to relate problem events based on the [[:manual/config/triggers/event_tags|event tags]]. Tags are used to extract values and create identification for problem events. Taking advantage of that, problems can also be closed individually based on a matching tag.
+  * [[:manual/config/event_correlation/trigger|on trigger level]] - one trigger may be used to relate separate problems to their solution
-
+  * [[:manual/config/event_correlation/global|globally]] - problems can be correlated to their solution from a different trigger/polling method using global correlation rules
-In other words, the same trigger can create separate events identified by the event tag. Therefore problem events can be identified one-by-one and closed separately based on the identification by the event tag.
-
-Correlation can be defined in:
-
-  * trigger configuration - one trigger may be used to relate problems to their solution
-  * globally - it is possible to relate problems to their solution from a different trigger/polling method using global correlation rules
-
-=== How it works ===
-
-In log monitoring you may encounter lines similar to these:
-
-  Line1: Application 1 stopped
-  Line2: Application 2 stopped
-  Line3: Application 1 was restarted
-  Line4: Application 2 was restarted
-
-The idea of event correlation is to be able to match the problem event from Line1 to the resolution from Line3 and the problem event from Line2 to the resolution from Line4, and close these problems one by one:
-
-  Line1: Application 1 stopped
-  Line3: Application 1 was restarted #problem from Line1 closed
-
-  Line2: Application 2 stopped
-  Line4: Application 2 was restarted #problem from Line2 closed
-
-To do this you need to tag these related events as, for example, "Application 1" and "Application 2". That can be done by applying a regular expression to the log line to extract the tag value. Then, when events are created, they are tagged "Application 1" and "Application 2" respectively, and each problem can be matched to its resolution.
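The matching idea described above can be sketched in a few lines of Python. This only illustrates the concept and is not how Zabbix implements it. The regular expression, the ''extract_tag'' and ''correlate'' helper names, and the in-memory dictionary of open problems are all assumptions made for the example.

```python
import re

# Hypothetical pattern: captures the application name at the start of a message.
APP_PATTERN = re.compile(r"^(Application \d+) ")

def extract_tag(message):
    """Return the tag value extracted from a log message, or None if nothing matches."""
    match = APP_PATTERN.match(message)
    return match.group(1) if match else None

def correlate(lines):
    """Pair each 'stopped' line with the later 'was restarted' line carrying the same tag."""
    open_problems = {}   # tag value -> line number of the unresolved problem
    closed = []          # (problem line, recovery line, tag value)
    for number, message in enumerate(lines, start=1):
        tag = extract_tag(message)
        if tag is None:
            continue  # untagged lines are ignored in this sketch
        if message.endswith("stopped"):
            # Simplification: only one open problem per tag value is tracked.
            open_problems[tag] = number
        elif message.endswith("was restarted") and tag in open_problems:
            closed.append((open_problems.pop(tag), number, tag))
    return closed

log = [
    "Application 1 stopped",
    "Application 2 stopped",
    "Application 1 was restarted",
    "Application 2 was restarted",
]
for problem, recovery, tag in correlate(log):
    print(f"Line{problem} closed by Line{recovery} (tag: {tag})")
```

Keying the open problems by tag value is what lets each recovery line close only the problem that shares its tag, rather than every open problem.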
-
-=== Configuration ===
-
-To configure event correlation on trigger level:
-
-  * go to the trigger [[:manual/config/triggers/trigger|configuration form]]
-
-{{:manual:config:matching_tags_trigger.png|}}
-
-  * select 'Problem event generation mode' as //Multiple//
-  * set 'OK event closes' to //All problems if tag values match//
-  * enter the name of the tag for event matching
-  * configure the [[:manual/config/triggers/event_tags|tags]] to extract tag values from log lines
-
-If configured successfully you will be able to see problem events tagged by application and matched to their resolution in //Monitoring// -> //Problems//.
-
-{{:manual:config:matched_problems.png?600|}}
-
-<note warning>Misconfiguration is possible, in which similar event tags are created for **unrelated** problems. Please review the cases outlined below!</note>
-
-  * With two applications writing error and recovery messages to the same log file, a user may decide to use two //Application// tags in the same trigger with different tag values by using separate regular expressions in the tag values to extract the names of, say, application A and application B from the {ITEM.VALUE} macro (e.g. when the message formats differ). However, this may not work as planned if there is no match to the regular expressions. Non-matching regexps will yield empty tag values, and a single empty tag value in both problem and OK events is enough to correlate them. So a recovery message from application A may accidentally close an error message from application B.
-
-  * Actual tags and tag values only become visible when a trigger fires. If the regular expression used is invalid, it is silently replaced with an *UNKNOWN* string. If the initial problem event with an *UNKNOWN* tag value is missed, subsequent OK events with the same *UNKNOWN* tag value may appear and close problem events that they should not have closed.
-
-  * If a user uses the {ITEM.VALUE} macro without macro functions as the tag value, the 255-character limitation applies. When log messages are long and the first 255 characters are non-specific, this may also result in similar event tags for unrelated problems.
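The first case above can be reproduced with a small Python sketch. It assumes, as described, that a non-matching regular expression yields an empty tag value. The two patterns, the sample messages and the ''tag_value'' and ''event_tags'' helpers are hypothetical and only serve the illustration.

```python
import re

def tag_value(pattern, message):
    """Mimic tag extraction: a non-matching pattern yields an empty value."""
    match = re.search(pattern, message)
    return match.group(1) if match else ""

# Hypothetical patterns for two applications with different message formats.
PATTERN_A = r"^appA\[(\w+)\]"
PATTERN_B = r"^appB <(\w+)>"

def event_tags(message):
    """Return the set of values of the two 'Application' tags for one event."""
    return {tag_value(PATTERN_A, message), tag_value(PATTERN_B, message)}

problem_b = event_tags("appB <billing> error: connection lost")
recovery_a = event_tags("appA[frontend] recovered")

# Each event carries one real value plus one empty value.
print(problem_b)
print(recovery_a)

# A single shared value is enough to correlate, and the empty string is shared.
print(bool(problem_b & recovery_a))
```

In this sketch the recovery from application A and the problem from application B have nothing in common except the empty value, yet that is enough for them to match.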
-
-=== Configuring global correlation ===
-
-In a slightly different scenario, you may have different triggers for problem and resolution. For example, a log trigger may report application problems, while a polling trigger may report the application to be up and running.
-
-Taking advantage of event tags, you can tag the log trigger as //Status: Down// and the polling trigger as //Status: Up//. Then, in a global correlation rule, you can relate these triggers and assign operations to this correlation, such as closing old events or closing new events.
-
-To configure event correlation rules globally:
-
-  * go to //Configuration// -> //Event correlation//
-  * click on //Create correlation// to the right (or on the correlation name to edit an existing rule)
-  * enter parameters of the correlation rule in the form
-
-{{:manual:config:correlation_rule.png}}
-
-^Parameter^Description^
-|//Name//                |Unique correlation rule name.  |
-|//Type of calculation// |The following options of calculating conditions are available:\\ **And** - all conditions must be met\\ **Or** - enough if one condition is met\\ **And/Or** - AND with different condition types and OR with the same condition type\\ **Custom expression** - a user-defined calculation formula for evaluating the conditions. It must include all conditions (represented as uppercase letters A, B, C, ...) and may include spaces, tabs, brackets ( ), **and** (case sensitive), **or** (case sensitive).  |
-|//Conditions//          |List of conditions, selected from the //New condition// field.  |
-|//New condition//       |Select conditions upon which events are correlated and click on //Add//. The following conditions are available:\\ **Old event tag** - match new events to the specified old event tag\\ **New event tag** - match old events to the specified new event tag\\ **New event host group** - filter matched events to the specified new event host group(s)\\ **Event tag pair** - match events if the **values** of the specified tag pair match\\ **Old event tag value** - match new events to the specified old event tag value\\ **New event tag value** - match old events to the specified new event tag value  |
-|//Description//         |Correlation rule description.  |
-|//Enabled//             |If you mark this checkbox, the correlation rule will be enabled.  |
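The **And/Or** calculation type in the table above can be read as: conditions of the same type are combined with OR, and the resulting groups are combined with AND. A minimal Python sketch of that reading follows. The ''(type, result)'' pair representation and the ''evaluate_and_or'' name are assumptions for illustration, not Zabbix internals.

```python
from itertools import groupby

def evaluate_and_or(conditions):
    """Evaluate (condition_type, bool_result) pairs.

    OR is applied within each condition type, AND across the types.
    """
    # groupby only groups adjacent items, so sort by condition type first.
    ordered = sorted(conditions, key=lambda c: c[0])
    return all(
        any(result for _, result in group)
        for _, group in groupby(ordered, key=lambda c: c[0])
    )

# The two "new event tag" conditions are OR-ed; the result is AND-ed
# with the single "old event tag" condition.
conditions = [
    ("old event tag", True),
    ("new event tag", False),
    ("new event tag", True),
]
print(evaluate_and_or(conditions))  # True
```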
-
-  * select the operation of the correlation rule in the form
-
-{{:manual:config:correlation_rule2.png|}}
-
-^Parameter^Description^
-|//Operations//          |List of operations, selected from the //New operation// field.  |
-|//New operation//       |Select the operation to perform when an event is correlated and click on //Add//. The following operations are available:\\ **Close old events** - close old events when a new event happens\\ **Close new event** - close the new event when it happens  |
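To make the two operations concrete, here is a hedged Python sketch of a single rule built on the //Status: Down// / //Status: Up// example above. The event dictionaries, the ''apply_rule'' function and the operation strings are hypothetical; the sketch only models which events remain open after a new event arrives.

```python
def apply_rule(open_events, new_event, operations):
    """Apply a hypothetical rule: old event tag Status=Down, new event tag Status=Up.

    Returns the list of events still open after the new event arrives.
    """
    new_matches = new_event["tags"].get("Status") == "Up"
    matched_old = [
        event for event in open_events
        if new_matches and event["tags"].get("Status") == "Down"
    ]
    if not matched_old:
        return open_events + [new_event]   # the rule does not apply
    remaining = list(open_events)
    if "close old events" in operations:
        remaining = [event for event in remaining if event not in matched_old]
    if "close new event" not in operations:
        remaining.append(new_event)        # the new event stays open
    return remaining

open_events = [
    {"id": 1, "tags": {"Status": "Down"}},
    {"id": 2, "tags": {"Service": "db"}},
]
new_event = {"id": 3, "tags": {"Status": "Up"}}

both = apply_rule(open_events, new_event, {"close old events", "close new event"})
print([event["id"] for event in both])  # [2]
```

With both operations selected, the correlated old event and the new event are closed, and only the unrelated event remains open.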
-
-<note warning>Misconfiguration is possible, in which similar event tags are created for **unrelated** problems. Please review the cases outlined below!</note>
-
-  * Actual tags and tag values only become visible when a trigger fires. If the regular expression used is invalid, it is silently replaced with an *UNKNOWN* string. If the initial problem event with an *UNKNOWN* tag value is missed, subsequent OK events with the same *UNKNOWN* tag value may appear and close problem events that they should not have closed.
-
-  * If a user uses the {ITEM.VALUE} macro without macro functions as the tag value, the 255-character limitation applies. When log messages are long and the first 255 characters are non-specific, this may also result in similar event tags for unrelated problems.