Looking for a good log file monitoring method

  • calebwong
    Junior Member
    • Dec 2011
    • 10

    #1

    Looking for a good log file monitoring method

    Dear All,
    As the title says, I am looking for a log file monitoring method. FYI, I have read the forums, the wiki and the online documentation, and I have found several ways to do what I am looking for:
    1. Zabbix agent (active) (such as: http://87.110.183.172/forum/showthread.php?t=20211)
    2. UserParameters
    3. Zabbix trapper items (push)
    4. Third-party extensions, such as Zbxlog
    5. snmp4J -> with Zabbix SNMP trap

    After going through all this information, I still do not know how to set up log file monitoring.

    I have tested the first one, but I found that Zabbix does not respond quickly (even when I set the "Update interval" to 1 sec).

    My testing platform: Zabbix Server 1.8.10 (Zabbix Appliance / openSUSE VM), testing client 1.8.5 (Windows XP running in a VM on ESXi 5)

    My case:
    [-] Monitoring log files updated by a mail server / programs (the programs keep updating the files forever with their own status, not only with errors)
    [-] The log file includes different message patterns I want to detect, such as "user 1 is removed" / "quota of user 2 is full." (Is this done by configuring items and triggers with an "Expression"?)
    [-] The monitored systems are Windows-based / Linux-based (installing the Zabbix agent is allowed)

    Maybe I misunderstood the forum / documentation. Please do not hesitate to let me know if I got something wrong.

    Thanks all~!
  • zarbazan
    Junior Member
    • Mar 2012
    • 9

    #2
    It took me some time to understand how Zabbix treats logs, so I'll share my methods before I forget them. Here are examples that work (well, for me at least).

    === Method 1, "nodata" trigger ===

    Goal: I want to be alerted whenever my application server writes a memory exception to the AppServer.log file. After that, the trigger should go back to the OK state.

    Item: log["/opt/local/mydomain/servers/myserver_int_1/logs/myserver_int_1.log","memory","ISO-8859-1",200]

    Trigger: {myhost:log["/opt/local/mydomain/servers/myserver_int_1/logs/myserver_int_1.log","memory","ISO-8859-1",200].nodata(35)}=0

    How it works: Whenever the "memory" keyword is found in the log, the Trigger is set to PROBLEM, so its Action sends a message. After that, if the keyword is not found for more than 35 sec, the Trigger goes back to OK.
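
    To make the timing easier to follow, here is a small Python sketch that mimics the "nodata(35)=0" logic outside of Zabbix. It is only a model: the timestamps are made up, the function names are mine, and the exact boundary behaviour and evaluation schedule of a real Zabbix server are not reproduced.

```python
from typing import List

NODATA_WINDOW = 35.0  # seconds, mirrors nodata(35) in the trigger above


def nodata(match_times: List[float], now: float, window: float = NODATA_WINDOW) -> int:
    """Return 1 if no matching log line arrived in the last `window` seconds, else 0.

    This is a simplified model of the Zabbix function, not its implementation.
    """
    recent = [t for t in match_times if now - window < t <= now]
    return 0 if recent else 1


def trigger_state(match_times: List[float], now: float) -> str:
    """The trigger expression is `nodata(35) = 0`: true means PROBLEM."""
    return "PROBLEM" if nodata(match_times, now) == 0 else "OK"


# Hypothetical timeline: lines containing "memory" were matched at t=100 and t=110.
matches = [100.0, 110.0]

for now in (90.0, 100.0, 120.0, 144.0, 146.0):
    print(f"t={now:>5}: {trigger_state(matches, now)}")
```

    In this model the trigger stays in PROBLEM while matches keep arriving and drops back to OK once 35 seconds pass without a new one.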

    === Method 2, "regexp" trigger ===

    Goal: My application generates all kinds of exceptions during its normal run, but I want to be alerted only when it writes an OutOfMemory exception to the AppServer.log file. After that, the trigger should go back to the OK state.

    Item: log["/opt/local/mydomain/servers/myserver_int_1/logs/myserver_int_1.log","Exception","UTF-8",100]

    Trigger: {TRIGGER.VALUE}=0 & {myserver:log["/opt/local/mydomain/servers/myserver_int_1/logs/myserver_int_1.log","Exception","UTF-8",100].regexp(OutOfMemory)}=1

    How it works: Whenever the "Exception" keyword is found in the log, the matching line gets analyzed by the trigger's regexp. If it contains "OutOfMemory", the Action is activated. {TRIGGER.VALUE}=0 means that the trigger must be in the OK state in order to proceed with the regexp. After that, the trigger will be in the PROBLEM state until a new non-OutOfMemory Exception is thrown.
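
    The two-step filtering can be modelled in a few lines of Python. This is just a sketch that evaluates the trigger expression literally for each new item value. The log lines are invented, the item's pattern is reduced to a plain substring check, and a real Zabbix server may behave differently in corner cases.

```python
import re
from typing import List, Tuple


def evaluate(trigger_value: int, last_line: str, pattern: str = "OutOfMemory") -> int:
    """Model of `{TRIGGER.VALUE}=0 & regexp(pattern)=1` for one new item value."""
    regexp_result = 1 if re.search(pattern, last_line) else 0
    return 1 if (trigger_value == 0 and regexp_result == 1) else 0


def run(lines: List[str], keyword: str = "Exception") -> List[Tuple[str, str]]:
    """Feed log lines through the item filter, then through the trigger."""
    state = 0  # 0 = OK, 1 = PROBLEM
    history = []
    for line in lines:
        # Step 1 (item): only lines containing the keyword reach the server.
        if keyword not in line:
            continue
        # Step 2 (trigger): re-evaluated only when a new value arrives.
        state = evaluate(state, line)
        history.append((line, "PROBLEM" if state else "OK"))
    return history


# Hypothetical log content, for illustration only.
sample_log = [
    "INFO server started",
    "java.lang.NullPointerException at Foo.bar",
    "Exception in thread main: java.lang.OutOfMemoryError",
    "INFO heartbeat",
    "java.io.IOException: disk not ready",
]

for line, state in run(sample_log):
    print(f"{state:8} {line}")
```

    Lines without the item keyword never reach the trigger, so in this model the state can change only when another "Exception" line shows up.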

    === Which Method to Employ ===

    Method 1 does one-step parsing and is good for monitoring irregular, well-defined log events. It may fit monitoring log files updated by "mail server / programs" (Method 2 will work for them too, of course).

    Method 2 does two-step "fine-grained" parsing and is better for records generated on a regular basis. It may work for "user 1 is removed" / "quota of user 2 is full." In this case, the Item should watch for the "user" keyword, and triggers can be defined with regexp for "removed" and "quota". However, if "quota of user 2 is full" and "quota of user 3 is full" are generated one after another in a row, you'll be notified only for "user 2". For multiple alerts, "Multiple + Normal" should be selected in the trigger.
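
    Here is a rough Python model of that "user" example: one item keyword and two triggers, each with its own regexp. It evaluates the Method 2 expression literally and counts only transitions into PROBLEM as alerts. The trigger names are hypothetical, and the "Multiple" event generation option is not modelled.

```python
import re
from typing import Dict, List

# Hypothetical trigger names mapped to the regexp used in each trigger.
TRIGGERS: Dict[str, str] = {
    "user removed": "removed",
    "quota full": "quota",
}


def alerts_for(lines: List[str], keyword: str = "user") -> List[str]:
    """Return one entry per OK -> PROBLEM transition of any trigger."""
    state = {name: 0 for name in TRIGGERS}  # 0 = OK, 1 = PROBLEM
    alerts = []
    for line in lines:
        # Item filter: only lines with the primary keyword are sent.
        if keyword not in line:
            continue
        for name, pattern in TRIGGERS.items():
            matched = 1 if re.search(pattern, line) else 0
            # Literal model of `{TRIGGER.VALUE}=0 & regexp(pattern)=1`.
            new_state = 1 if (state[name] == 0 and matched == 1) else 0
            if new_state == 1:
                alerts.append(f"{name}: {line}")
            state[name] = new_state
    return alerts


log = [
    "user 1 is removed",
    "quota of user 2 is full.",
    "quota of user 3 is full.",
]

for alert in alerts_for(log):
    print(alert)
```

    With these three lines the model raises one alert for the removal and one for "user 2", and stays silent for "user 3", which matches the limitation described above.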

    === Why Resetting Log Trigger State is That Tricky ===

    Because while monitoring logs, the Zabbix agent sends a signal only when it detects a keyword defined in the Item. Otherwise, it doesn't say anything, so we have to use workarounds to reset the trigger state to OK. Method 1 uses a timeout; Method 2 uses the next log event with the "primary" keyword (the "secondary" keyword would be the one in the Trigger regexp).
    Last edited by zarbazan; 30-03-2012, 14:04.

    • calebwong
      Junior Member
      • Dec 2011
      • 10

      #3
      Thanks for your reply. I will try it and post an update later on.

      • caraconan
        Junior Member
        • Oct 2012
        • 8

        #4
        Just in case you want to test a solution based on SNMP without the Zabbix agent:



        Regards

        • shimas
          Junior Member
          • May 2012
          • 8

          #5
          Originally posted by calebwong
          Thanks for your reply, I will try it, and update the relate information later on.
          Have you figured it out with those triggers?

          I ask because I have the same situation, where the log file includes different message patterns I want to detect.

          • otheus
            Member
            • Mar 2009
            • 53

            #6
            @zarbazan: question:

            === Method 2, "regexp" trigger ===

            ...

            log["/opt/local/mydomain/servers/myserver_int_1/logs/myserver_int_1.log","Exception","UTF-8",100]

            Trigger: {TRIGGER.VALUE}=0 & ...
            How does the trigger value get reset and back into the OK state?
