Hi All,
I am using Zabbix 5.4 with no proxy. I recently noticed a problem with ICMP items: whenever a device goes down, the ICMP item stops collecting data according to its "Update interval" if the device also has an SNMP template attached. While the device is up, everything works by the book. If the device has only the ICMP templates attached, data is collected at the update interval (every 1m) whether the device is "Up" or "Down". Our trigger expression is based on a count (max of the last 15 values), and because of this issue the "PROBLEM" event is created very late, when it should have fired after 15 minutes (#15 values) of "Down".
From Latest data, I couldn't find a clear pattern in how often history is logged once the device is "Down", but it is mostly every 1-2 hours. We recently had an incident where the PROBLEM event was delayed by almost a day, because it took that long for the trigger expression to receive 15 "Down" values. How can we decouple these ICMP checks from SNMP? Devices that have only ICMP monitoring work perfectly fine.
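To make the delay concrete, here is a rough back-of-the-envelope sketch of the arithmetic, not anything taken from Zabbix itself. The function name `detection_delay` is made up for illustration, and it simply assumes the trigger needs 15 consecutive "Down" values arriving at a fixed interval:

```python
from datetime import timedelta


def detection_delay(values_needed: int, interval: timedelta) -> timedelta:
    """Approximate time for a count-based trigger (#N values) to collect
    enough samples, assuming one sample arrives per interval.

    Hypothetical helper for illustration only; not part of Zabbix.
    """
    return values_needed * interval


# Expected behaviour: one value per minute, as configured in "Update interval".
expected = detection_delay(15, timedelta(minutes=1))

# Observed behaviour once the device is down: roughly one value every 1-2 hours.
observed_best = detection_delay(15, timedelta(hours=1))
observed_worst = detection_delay(15, timedelta(hours=2))

for label, delay in [
    ("expected (1m interval)", expected),
    ("observed (1h interval)", observed_best),
    ("observed (2h interval)", observed_worst),
]:
    # Print the delay in hours so the three cases are easy to compare.
    print(f"{label}: {delay.total_seconds() / 3600:.2f} hours")
```

With values arriving only every 1-2 hours, collecting 15 of them takes somewhere between 15 and 30 hours, which matches the "almost a day" delay we saw.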
Please feel free to ask any questions related to this.
NOTE: I didn't see anything specific when I enabled debug=5 for the "icmp pinger" process, and we have never had alerts about the icmp pinger processes being busy. There are also no preprocessing rules on this item. We are using the out-of-the-box templates and modules with only minor customizations on duration.