Hi, after an item's value returns to a good value, the trigger still shows a problem. This is the second time I've had this kind of problem; last time it was a different host and trigger. Previously I unlinked and cleared the template containing the trigger, which resolved that case, but that is not a real solution. Is there a way to debug this problem? I've checked the log but haven't found any clue.
Trigger still shows problem despite item giving good value.
-
I have also started having this problem recently, in the past couple of weeks.
My Zabbix server version is 2.4.4, running on CentOS 6, with a MySQL 5.1.73 database.
Out of 421 enabled hosts (all being polled, no active checks), only 2 have displayed this behaviour.
The first time it happened, I manually updated the "value" and "lastchange" entries in the "triggers" table for the affected trigger IDs, after waiting 12 hours for it to clear by itself. This fixed it, but a few days later it happened again for one of the triggers.
From the main Zabbix dashboard (dashboard.php), if I click on the affected hostname to get the pop-up menu and then click on "Triggers" (tr_status.php), then I see the trigger listed with a status of PROBLEM.
On that trigger status page, if I click on the trigger's name to get the pop-up menu and click on "Events" (events.php), no events are shown.
On the trigger status page, if I choose the "History" entry from the pop-up menu instead of "Events", the correct data is shown in the graph, showing values that indicate the trigger should not be active (and the line on the graph showing the trigger threshold also confirms this).
From the dashboard page, if I click on the affected hostname to get the pop-up menu and then click on "Latest data" (latest.php), then the correct latest data is shown, including data for the affected item which shows it shouldn't be activating the trigger any more.
The trigger in question is from the "Template OS Linux" template, and is an autodiscovered filesystem "free disk space is less than 20%" rule.
Housekeeping is enabled, set to the default 365 days. The setting "Refresh unsupported items" is 600.
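For anyone who wants to check the database for the same symptom (a trigger marked PROBLEM with no events behind it), here is a minimal sketch. It runs against an in-memory SQLite database with cut-down stand-ins for the "triggers" and "events" tables. Apart from "value" and "lastchange", the column names (triggerid, description, source, object, objectid, clock) and the meaning of source = 0 / object = 0 are my assumptions about the 2.4 schema, so verify them against your own database before adapting the query.

```python
import sqlite3

# Simplified stand-ins for the Zabbix tables; the real ones have many more columns.
SCHEMA = """
CREATE TABLE triggers (triggerid INTEGER PRIMARY KEY, description TEXT,
                       value INTEGER, lastchange INTEGER);
CREATE TABLE events (eventid INTEGER PRIMARY KEY, source INTEGER, object INTEGER,
                     objectid INTEGER, clock INTEGER, value INTEGER);
"""

# Triggers in PROBLEM state (value = 1) that have no trigger event
# (assumed: source = 0, object = 0) recorded at or after "lastchange".
QUERY = """
SELECT t.triggerid, t.description, t.lastchange
FROM triggers t
WHERE t.value = 1
  AND NOT EXISTS (
      SELECT 1 FROM events e
      WHERE e.source = 0 AND e.object = 0
        AND e.objectid = t.triggerid
        AND e.clock >= t.lastchange)
ORDER BY t.triggerid
"""

def find_orphaned_problems(conn):
    """Return (triggerid, description, lastchange) for suspicious triggers."""
    return conn.execute(QUERY).fetchall()

def demo():
    """Build a tiny made-up data set and run the check against it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO triggers VALUES (?, ?, ?, ?)", [
        (1, "Free disk space is less than 20% on /", 1, 1000),     # has an event
        (2, "Free disk space is less than 20% on /var", 1, 2000),  # no event
        (3, "Some other trigger", 0, 3000),                        # OK state
    ])
    conn.execute("INSERT INTO events VALUES (10, 0, 0, 1, 1000, 1)")
    return find_orphaned_problems(conn)
```

Calling demo() lists only the second trigger, the one in PROBLEM state without a matching event. The query is read-only, which seems safer than editing the "triggers" table by hand.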
Is your trigger also on an autodiscovered item? I'm wondering if the item disappearing and being recreated later due to autodiscovery is causing this.
-
@ivarch My current Zabbix version is 2.4.5, on Ubuntu 14 with an Oracle DB.
While I was still on 2.4.4, I had one problem with the "free disk space is less than 20%" rule: 20 minutes later, after some cleanup, free space was at 34%, but the trigger only went back to OK a day later.
The second time it was "Zabbix Agent is down on"; the problem cleared after I unlinked and cleared the template.
The third time it was a custom trigger that checks whether a service has 2 processes. The trigger went to OK after another team restarted the host, so I don't know whether restarting only the Zabbix agent would have helped.
Housekeeping is disabled.
-
I tried waiting 24 hours, but the same condition triggered again (the disk space drops briefly during each night), so I don't know if it would have cleared by itself.
I also tried restarting the Zabbix agent on the server that the trigger was reported about, but this made no difference.
If you are having the same problem I have, then I guess we can rule out housekeeping as the cause, since you have it disabled and I have it enabled.
-
I've had another instance of this happening, this time on an auto-discovered IPMI based trigger rule.
The trigger expression is:
{HOSTNAME:ipmi_proliant.pl["Disk 6 Drive 6",sensor,'{$ILO}',{$ILO_USER},{$ILO_PASS},discrete].diff(0)}>0 and {HOSTNAME:ipmi_proliant.pl["Disk 6 Drive 6",sensor,'{$ILO}',{$ILO_USER},{$ILO_PASS},discrete].strlen(0)}>2
That host has 15 drives and they all changed state, triggering alerts on all of them, but all the other drives' triggers have correctly gone back to "OK".
This one has been incorrectly showing a warning for 24 hours and 40 minutes now.
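To make it easier to reason about when an expression of this shape should flip back to "OK", here is a rough sketch that replays it over a list of received values. It assumes diff(0) is 1 when the last value differs from the previous one and strlen(0) is the length of the last value, and that the expression is re-evaluated once per received value. The sensor strings are made up, and this is a simplification rather than how the server actually evaluates triggers.

```python
def evaluate(previous, last):
    """Return True (PROBLEM) or False (OK) for a single evaluation."""
    changed = 1 if last != previous else 0  # stand-in for diff(0)
    length = len(last)                      # stand-in for strlen(0)
    return changed > 0 and length > 2

def replay(values):
    """Evaluate once per received value, skipping the first one
    because it has no previous value to be compared with."""
    return [evaluate(prev, last) for prev, last in zip(values, values[1:])]

# Hypothetical sensor readings, oldest first.
HISTORY = ["ok", "ok", "failed", "failed", "ok"]
```

With these readings, replay(HISTORY) is PROBLEM only at the evaluation where the value changes to "failed" and is OK again at the next one. If no further value were processed after the change (replay(["ok", "failed"])), the last computed state would remain PROBLEM, so it may be worth checking whether the affected item kept receiving values.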
-
I've not had any more of these since I managed to drastically reduce the number of "Deadlock found when trying to get lock; try restarting transaction" errors in zabbix_server.log.
Running "SHOW ENGINE INNODB STATUS" on the database server helped to find the cause of most of these errors - in my case it was a separate program that was locking up the database from time to time.
So maybe this problem is caused when a database query fails due to a deadlock at an unfortunate time.
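If you want to see whether your stuck triggers line up with deadlock errors, a small script along these lines can count them per hour in zabbix_server.log. It is only a sketch: the function name is mine, and it assumes each log line starts with a "pid:YYYYMMDD:HHMMSS.mmm" prefix; lines that don't match are counted under "unknown".

```python
import re
from collections import Counter

# Error text as it appears in zabbix_server.log.
DEADLOCK_TEXT = "Deadlock found when trying to get lock"

# Assumed line prefix: "<pid>:<YYYYMMDD>:<HHMMSS.mmm> message".
PREFIX = re.compile(r"^\s*(\d+):(\d{8}):(\d{2})\d{4}\.\d{3}\s")

def count_deadlocks_per_hour(lines):
    """Count deadlock errors per 'YYYYMMDD HH:00' bucket."""
    counts = Counter()
    for line in lines:
        if DEADLOCK_TEXT not in line:
            continue
        match = PREFIX.match(line)
        if match is None:
            key = "unknown"  # prefix did not match the assumed format
        else:
            key = f"{match.group(2)} {match.group(3)}:00"
        counts[key] += 1
    return counts
```

Pass it an open file object for wherever your zabbix_server.log lives, then compare the busiest hours with the "lastchange" time of the stuck trigger.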