Hello, I have this problem:
We are collecting a bunch of metrics from applications using external Zabbix agent scripts. To avoid invoking these scripts once for each individual metric, each script collects all application metrics in a single run and stores them as Zabbix trapper items.
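For illustration, a minimal Python sketch of this "collect once, push many" pattern. It is not our actual script: the host name, item keys and the function name are made up, and it only builds the `<host> <key> <value>` lines that zabbix_sender reads from an input file (`-i`). Quoting every value is an assumption made to keep values containing spaces intact.

```python
def format_sender_lines(host, metrics):
    """Build zabbix_sender input lines ('<host> <key> <value>'), one per metric.

    `host` and the metric keys are hypothetical; keys are assumed to contain
    no whitespace, so only the value is quoted.
    """
    lines = []
    for key, value in metrics.items():
        # Escape backslashes and double quotes, then wrap the value in quotes
        # so that values with embedded spaces stay a single field.
        text = str(value).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{host} {key} "{text}"')
    return lines


if __name__ == "__main__":
    # One script run gathers every metric, then reports them in one batch.
    collected = {"app.queue.length": 17, "app.status": "running ok"}
    for line in format_sender_lines("apphost01", collected):
        print(line)
    # The resulting lines could then be written to a file and sent with
    # something like: zabbix_sender -z <server> -i <file>
```

Each key would have to exist on the host as a trapper item, otherwise the server rejects the value.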
Now I need to monitor whether all metrics are periodically collected and reported. To do so, I have two options: 1) monitor the status of the metric which invokes the Zabbix agent script, 2) monitor the timestamp of the last metric sample (e.g. using a nodata() trigger).
When a Zabbix agent script crashes (for whatever reason), no status is returned, and the Zabbix agent invocation metric just becomes unsupported instead of firing the trigger connected to it, so method 1) is not reliable.
But method 2) comes with another problem: if the monitored environment is not running 24/7, this happens when the environment is started in the morning:
The nodata() trigger is evaluated before today's metrics are collected, finds that the metric was last collected yesterday, and all hell breaks loose (SMS flooding, a freaked-out CEO calling with 'doomsday has come', etc. :-) ).
My idea was to enable the mymetric.nodata() trigger only under the condition "system.uptime.nodata(<system.uptime metric collection interval>)=0 and system.uptime.last() > <application metrics collection interval>", but alas!
You cannot just refer to another template from a trigger. And if you make this application template dependent on a common template, you can no longer load the common template by default and then load the application template on top of it, because that would cause a metric key conflict.
(We are deploying Zabbix in a CI/CD pipeline with automatic template assignment based on the host role.)
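To make the intended gating logic explicit, here is a small Python simulation of it. This is only a sketch of the condition described above, not Zabbix trigger syntax: the function and parameter names are made up, all times are in seconds, and nodata() is approximated as "no sample within the given period".

```python
def should_alert(now, last_metric_ts, last_uptime_ts, uptime_seconds,
                 uptime_interval, metric_interval):
    """Return True only when a missing application metric is a real problem.

    Hypothetical model of the gated trigger:
      system.uptime.nodata(uptime_interval)=0   -> host is up and reporting
      system.uptime.last() > metric_interval    -> host has been up long enough
      mymetric.nodata(metric_interval)=1        -> application metric is stale
    """
    uptime_is_fresh = (now - last_uptime_ts) <= uptime_interval
    up_long_enough = uptime_seconds > metric_interval
    metric_is_stale = (now - last_metric_ts) > metric_interval
    return uptime_is_fresh and up_long_enough and metric_is_stale


if __name__ == "__main__":
    now = 100_000
    # Morning start: last sample is from yesterday, host booted 30 s ago.
    print(should_alert(now, now - 50_000, now - 10, 30, 60, 300))
    # Host up for 10 minutes and the metric is still missing.
    print(should_alert(now, now - 50_000, now - 10, 600, 60, 300))
```

The first call models the morning start and stays quiet, the second models a collector that really stopped reporting. The open question is how to express the same thing in a template without the key conflict.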
Has any of you solved this, and if so, how did you do it?