Zabbix Log File Monitoring False Positives

  • rcollier
    Member
    • Sep 2013
    • 53

    #1

    Zabbix Log File Monitoring False Positives

    Greetings all,

    Since yesterday, I have received two false positive alerts for a log file monitoring item that I configured about a week ago. Each time I receive the alert, I immediately go to the log file and scan for the error string to verify that there is indeed a problem, but the log file contains no entries with the string I am monitoring for. The latest data for this host confirms this: it shows only the timestamp of when the item supposedly found the string, with no actual data (just a blank line) in the latest value field.

    My Environment
    - Zabbix Server Version 2.4.5
    - MySQL Version 14.14

    Item in Question
    - log[/apps/scope/profile-root/lpls_prodinstance/logs/domserver/scpp-dom.log,"Async IO operation failed (3), reason: RC: 73 Connection reset by peer","ISO-8859-1",200]

    Trigger Expression
    - {HOSTNAME:log[/apps/scope/profile-root/lpls_prodinstance/logs/domserver/scpp-dom.log,"Async IO operation failed (3), reason: RC: 73 Connection reset by peer","ISO-8859-1",200].nodata(35)}=0
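
    To make the trigger logic concrete, here is a rough Python model of how I understand nodata(35)=0 to behave. It is only a simplified sketch, not Zabbix's actual implementation; the function names and the list of timestamps are made up for illustration. The point is that the expression looks only at whether any value arrived in the last 35 seconds and never at the content of the value, so even an empty value would fire the trigger.

```python
import time


def nodata(timestamps, period, now):
    """Simplified model: 1 if no value arrived in the last `period` seconds, else 0."""
    return 0 if any(0 <= now - t <= period for t in timestamps) else 1


def trigger_fires(timestamps, period=35, now=None):
    """Model of the expression nodata(period)=0: fires when any value arrived recently.

    `timestamps` is a hypothetical list of arrival times (epoch seconds) for the item.
    The value content is deliberately ignored, because the expression never inspects it.
    """
    now = time.time() if now is None else now
    return nodata(timestamps, period, now) == 0
```

    With this model, a single value received 20 seconds ago fires the trigger, and the trigger clears once 35 seconds pass without a new value.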

    I've used this log file monitoring method for quite some time and had never come across any false positives until I upgraded our Zabbix environment from 2.2 to 2.4.5. Does anyone have an idea why I might be receiving false positive alerts? Did log file monitoring change in 2.4.5?

    EDIT:

    I found these two entries in the Zabbix agent log:
    - 475276:20150708:080156.377 cannot open '/apps/scope/profile-root/lpls_prodinstance/logs/domserver/scpp-dom.log': [2] No such file or directory

    - 475276:20150708:080156.387 active check "log[/apps/scope/profile-root/lpls_prodinstance/logs/domserver/scpp-dom.log,"Async IO operation failed (3), reason: RC: 73 Connection reset by peer","ISO-8859-1",200]" is not supported

    I think I'm starting to get closer to understanding why this item became unsupported. Our application admin enabled additional logging, which quickly fills up the log file that I am monitoring. Once the log reaches 10MB in size, a new log file is created with the same name as the one I am monitoring.
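
    One way to check whether the rotation lines up with the alerts is to watch the file from outside Zabbix. The sketch below is a hypothetical helper, not part of Zabbix: it compares two snapshots of the file's inode and size and labels what happened in between. It assumes a POSIX filesystem where a newly created file gets a new inode.

```python
import os


def file_identity(path):
    """Return (inode, size) for path, or None if the file does not exist right now."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_ino, st.st_size)


def classify_change(prev, curr):
    """Label the change between two file_identity() snapshots."""
    if curr is None:
        return "missing"    # the window in which an open would fail with errno 2
    if prev is None:
        return "created"    # file appeared after being absent
    if curr[0] != prev[0]:
        return "rotated"    # same name, different inode: a new file replaced the old one
    if curr[1] < prev[1]:
        return "truncated"  # same inode but smaller: file was emptied in place
    return "unchanged"
```

    Calling file_identity() on the log path every second or so and logging any result from classify_change() other than "unchanged", together with a timestamp, should show whether "missing" or "rotated" events coincide with the alert times.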
    Last edited by rcollier; 08-07-2015, 18:20. Reason: Additional additional findings