Note: Server version 2.2.7
I am currently getting a VERY frustrating set of behaviors from Zabbix. A trigger will be working perfectly one second and then, the next, flip to "Unknown" status with a corresponding error pop-up of "Cannot evaluate function". Unfortunately, I cannot tell which function it cannot evaluate, because the rollover pop-up is too small to display the whole error message; further, the server logs give no hint.
Consider the following trigger (some values replaced for security reasons):
Code:
{hostname.our.org:log[/var/log/postgres/postgresql-today.log,"(PANIC|WARNING|ERROR|FATAL)"].nodata(3600)}=0
This is meant to fire the trigger if there is data for the last hour in the given logfile. This seems to work well enough. Now we add...
Code:
<<first part>> & {hostname.our.org:log[/var/log/postgres/postgresql-today.log,"(PANIC|WARNING|ERROR|FATAL)"].regexp("FATAL",3600)}=1
This is expected to mean (combined with the first expression): "if the postgresql-today log contains a logged error message received within the last 3600 seconds, and such a message matches the regexp FATAL, the trigger evaluates to true".
Now, whenever I create this trigger, it will seemingly work: everything is green. After some time, however, it will "flip" to the error status. To get this trigger back to the "enabled" state, I must simply edit it to...
Code:
<<first part>> & {hostname.our.org:log[/var/log/postgres/postgresql-today.log,"(PANIC|WARNING|ERROR|FATAL)"].regexp("FATAL")}=1
I think we have the confluence of three bugs. First, an expression failure that is not seen at trigger-editing time but only later. Second, an insufficient level of reporting (to the user, in some way) of such a failure. Third, the underlying problem: the broken "evaluation period" feature of the regexp function.
The same kind of situation happens when the function is count instead of regexp.
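To pin down what the combined expression is supposed to mean, here is a rough Python model of the intent. It is only a sketch of the logic as described in this post, not of how the Zabbix server actually evaluates nodata or regexp; the function names and the (timestamp, line) representation of log values are made up for illustration.

```python
import re


def nodata(entries, now, sec):
    """Model of nodata(sec): 1 if no value arrived in the last `sec` seconds, else 0."""
    return 0 if any(now - sec <= ts <= now for ts, _ in entries) else 1


def regexp_in_period(entries, now, pattern, sec):
    """Model of regexp(pattern, sec): 1 if any value in the period matches, else 0."""
    matched = any(
        now - sec <= ts <= now and re.search(pattern, line)
        for ts, line in entries
    )
    return 1 if matched else 0


def intended_trigger(entries, now, sec=3600):
    """Intended result: data arrived in the period AND at least one line matches FATAL."""
    return (
        nodata(entries, now, sec) == 0
        and regexp_in_period(entries, now, "FATAL", sec) == 1
    )


# Hypothetical log values as (unix timestamp, line) pairs.
log = [(1000, "ERROR: something failed"), (5000, "FATAL: could not connect")]
```

With these sample values, intended_trigger(log, now=6000) is true, because the FATAL line is less than an hour old, while intended_trigger(log, now=10000) is false, because nothing arrived in the last 3600 seconds.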
If someone could please take the time to confirm or disconfirm this as a bug, I will follow up accordingly (through the bug-reporting system).
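Since the pop-up truncates the message, one possible way to read the full error text is the JSON-RPC API. The sketch below builds a trigger.get request asking for the error field of triggers in the unknown state. It assumes the API is reachable and that, as I understand the 2.2 API, triggers carry state and error properties that can be requested and filtered on; the URL and token in the comments are placeholders, not real values.

```python
import json
import urllib.request


def build_unknown_triggers_request(auth_token, request_id=1):
    """Build a trigger.get payload asking for the error text of unknown triggers."""
    return {
        "jsonrpc": "2.0",
        "method": "trigger.get",
        "params": {
            # Only the fields needed to identify the trigger and read its error.
            "output": ["triggerid", "description", "state", "error"],
            # Assumption: state 1 means "unknown" in this API version.
            "filter": {"state": 1},
        },
        "auth": auth_token,
        "id": request_id,
    }


def call_api(url, payload):
    """POST a JSON-RPC payload and return the decoded "result" member."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json-rpc"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["result"]


# Example use (placeholder URL and token, not executed here):
#   payload = build_unknown_triggers_request("<token from user.login>")
#   for trigger in call_api("http://zabbix.example.org/api_jsonrpc.php", payload):
#       print(trigger["description"], "->", trigger["error"])
```

The token would come from a prior user.login call. Printing the error field this way should show the complete message rather than the clipped version in the frontend.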