Hello, I hope someone can explain this issue and odd status I see.
Zabbix 2.4.3 running on CentOS 7
This is a big server with a large internal array. The hard drives are SMART capable, so smartd prints errors to the /var/log/messages log that I would like to catch and report on.
Item is set to Zabbix agent (active), type Log, checking every 2 seconds (maybe that is too frequent?) and key is:
logrt[/var/log/messages*,"SMART Fail",,,skip,]
I am using logrt as I expect CentOS to rotate this syslog messages file on some periodic basis.
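One thing worth checking with that key: as far as I understand the Zabbix documentation, the file-name part of a logrt key is treated as a regular expression, not a shell glob. The Python sketch below only illustrates the difference between the two readings of `messages*`. The file list is made up for illustration, and the unanchored regex search is an assumption about how the agent matches names, not a confirmed description of its behaviour.

```python
import fnmatch
import re

# Hypothetical directory listing, for illustration only.
candidates = ["messages", "messages-20150614", "messages.1", "message", "maillog"]

# The file-name part of the key from the post.
pattern = "messages*"


def regex_matches(names, pat):
    """Names selected when pat is read as an unanchored regular expression."""
    rx = re.compile(pat)
    return [n for n in names if rx.search(n)]


def glob_matches(names, pat):
    """Names selected when pat is read as a shell-style glob."""
    return [n for n in names if fnmatch.fnmatchcase(n, pat)]


if __name__ == "__main__":
    print("regex:", regex_matches(candidates, pattern))
    print("glob: ", glob_matches(candidates, pattern))
```

Read as a regex, `messages*` means "message" followed by zero or more "s", so it also picks up a file named `message` and any rotated file whose name contains that text. A tighter expression such as `^messages$` would be the regex way to say "only the current file", if that is what is wanted.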
The item comes up as 'unsupported' and the Info column red X reports a permission error on a file that doesn't exist (/var/log/messages-20150614). I think that file may have existed a few days ago, but I moved all messages.* files out of /var/log before I had this active check configured. I have also restarted the zabbix agent several times.
If I go to Monitoring>Latest Data for this server I see the same error reported in the info column, however "History" is active, and I am able to see the most recent "SMART Fail" syslog messages. One odd issue is that the History reports duplicate lines ... essentially 2 lines reported in history for each occurrence of the line matched ... why would that be?
Finally, I have a Trigger to report this as High priority:
{Zabbix server:logrt[/var/log/messages*,"SMART Fail",,,skip,].count(3600)}>0
The intent is to keep the High status active as long as any "SMART Fail" messages have been found within the last hour. The SMART failure message is expected to pop up at least a few times per hour until it is resolved. Again, in this screen the status is "unknown" and the info reports the same error on the non-existent file.
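To spell out the intended trigger logic, here is a minimal Python sketch of "count of matches in the last 3600 seconds is greater than 0". It is only a model of the intent. It is not how Zabbix evaluates count() internally, and the function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def count_recent(timestamps, now, window_seconds=3600):
    """Count entries whose timestamp falls within the last window_seconds."""
    cutoff = now - timedelta(seconds=window_seconds)
    return sum(1 for t in timestamps if cutoff < t <= now)


def trigger_is_problem(timestamps, now, window_seconds=3600):
    """True while at least one matching log line is inside the window."""
    return count_recent(timestamps, now, window_seconds) > 0


if __name__ == "__main__":
    # Made-up timestamps of matched "SMART Fail" lines.
    now = datetime(2015, 6, 20, 12, 0, 0)
    matches = [now - timedelta(minutes=90), now - timedelta(minutes=10)]
    print(trigger_is_problem(matches, now))
```

With this logic the problem state clears by itself once an hour passes without a new matching line.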
So far it seems like the Trigger and alert are working. However, how do I clear the item's "unsupported" status and the erroneous error message referring to a non-existent file? Also, how might I prevent the history from being populated with duplicate log lines?
Maybe I missed something in the documentation? Thanks