how to get OK status

  • wlord
    Junior Member
    • May 2015
    • 10

    #1

    how to get OK status

    I am using Zabbix to monitor a specific event for my backup application, and it is working correctly.

    This trigger emails me and remains in the dashboard:

    {hostname:eventlog[Backup,,Error,,190].logeventid(190)}=1


    Usually my backup fails because of transient network issues and then succeeds after a retry. This may sound silly to more advanced admins, but I'm new to this. How do I configure Zabbix to set that trigger to "OK" on the following successful event?

    Thank you in advance for the direction. I have been searching for a day.
  • wlord
    Junior Member
    • May 2015
    • 10

    #2
    update

    wow no replies but my own...lonely. lol

    So I followed some documentation here.


    More specifically, the last example seemed like it would solve what I wanted:

    (({TRIGGER.VALUE}=1 and {WIN-Test:eventlog[Backup,,Error,,190].logeventid(190)}=1) or
    {TRIGGER.VALUE}=0) and {WIN-Test:eventlog[Backup,,Information,,190].logeventid(190)}=1

    It half works: I get the problem trigger for the Error event ID 190, but it does not recover to 'OK' on the Information event ID 190.





    {edit}
    I do get closer to my desired results with this:

    {WIN-Test:eventlog[Backup,,Error,,190].logeventid(190)}=1 and {WIN-Test:eventlog[Backup,,Error,,190].nodata(180)}=1

    The issue is that my 'OK' status is somewhat false: it appears not because my backup succeeded, but only because the data in the log hasn't changed in 180 seconds.
    Last edited by wlord; 12-06-2015, 14:48.


    • wlord
      Junior Member
      • May 2015
      • 10

      #3
      Is it even possible?

      At this point I would just like to know whether this is even possible. I've found I'm not the only person questioning whether this is feasible. Those questions went unanswered as well.






      • BDiE8VNy
        Senior Member
        • Apr 2010
        • 680

        #4
        Not sure whether I have understood you correctly.

        Anyhow, how about this?
        Code:
        {TRIGGER.VALUE}=0 and {WIN-Test:eventlog[Backup,,Information|Error,,190].logseverity()}=4 or
        {TRIGGER.VALUE}=1 and {WIN-Test:eventlog[Backup,,Information|Error,,190].logseverity()}<>1
        The trigger is in PROBLEM state if:
        - its value was OK and a log message with "Error" severity occurred
        - or its value is PROBLEM and no log message with "Information" severity occurred.

        So basically trigger fires when "Error" occurred and keeps the state until "Information" occurs.

        I should mention that I have neither tested the suggested trigger expression nor am I experienced with Windows-related stuff.
        So it might be nonsense.
        Last edited by BDiE8VNy; 18-06-2015, 19:20. Reason: replaced '#' operator by '<>'


        • wlord
          Junior Member
          • May 2015
          • 10

          #5
          Close but no

          Thank you very much! I hadn't thought of log severity; I was using the event ID. I had to tweak it a little because I was getting expression errors. Apparently it doesn't like the # at the end.

          Code:
          {TRIGGER.VALUE}=0 and {WIN-Test:eventlog[Backup,,Information|Error,,190].logseverity()}=4 or
          {TRIGGER.VALUE}=1 and {WIN-Test:eventlog[Backup,,Information|Error,,190].logseverity()}=1
          Unfortunately it's the same result I had earlier. The trigger fires into the 'problem' state when the Error 190 event appears in the event log, but when the retry completes successfully with an Information 190 event, Zabbix doesn't recover.

          A similar try, with the same results:

          Code:
          (({TRIGGER.VALUE}=1 and {WIN-Test:eventlog[Backup,,Error,,190].logeventid(190)}=1) or 
          {TRIGGER.VALUE}=0) and {WIN-Test:eventlog[Backup,,Information,,190].logeventid(190)}=1
          My Key item
          Code:
          agent active
          log
          eventlog[Backup,,Information|Error,,190]
          I'll keep at it.


          • BDiE8VNy
            Senior Member
            • Apr 2010
            • 680

            #6
            The syntax changed slightly with Zabbix release 2.4, and I used the 2.2 operator instead of the 2.4 one for 'not equal'.

            I edited the original post and corrected the operator accordingly.


            • wlord
              Junior Member
              • May 2015
              • 10

              #7
              That did it!!!!!!

              This is exactly what I wanted to have happen; I just couldn't get that last piece working. I will link this thread for others I've seen having this issue, and hopefully it'll help future searches.

              I see, but don't fully understand, where my logic is flawed (although it definitely is). Recovery happens on Information event 190, so why the 'not equal'? As you can see, in all of my efforts I ended with =1 as opposed to <>1, my thinking being that recovery happens when you 'match' Information event 190.

              Thank you for all your help.


              Not 100% there yet. I need to clear on both Information 190 AND Warning 190 (warnings are a known issue). I'll post back if I get that one easily.
              Last edited by wlord; 18-06-2015, 21:04.


              • wlord
                Junior Member
                • May 2015
                • 10

                #8
                Looks like I've got it. Here are the requirements, in case this helps someone else.

                --Based on Windows event ID and severity
                -Backup application is running, then errors out (say, network loss) (Error event ID 190) = Trigger a problem.
                -Backup application retries over and over = Remains in dashboard, no additional email notifications.
                -Backup application succeeds at some point later on (Information event ID 190) = Recover from trigger / email the 'OK'
                -Backup application succeeds but with warnings that can be safely ignored (Warning event ID 190) = Recover from trigger / email the 'OK'

                Here is the final product.

                Code:
                ((({TRIGGER.VALUE}=0 and {WIN-Test:eventlog[Backup,,Information|Warning|Error|,,190].logseverity()}=4) or 
                {TRIGGER.VALUE}=1) and {WIN-Test:eventlog[Backup,,Information|Warning|Error|,,190].logseverity()}<>1) and {WIN-Test:eventlog[Backup,,Information|Warning|Error|,,190].logseverity()}<>2
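
A minimal Python sketch of the final expression as plain Boolean logic, enumerating every combination of the previous trigger value and the severity of the newest log entry. It assumes the severity codes used in this thread (1 = Information, 2 = Warning, 4 = Error); the function and variable names are hypothetical, and this only mirrors the logic, it does not talk to Zabbix.

```python
from itertools import product

# Severity codes as used in the thread's expressions.
SEVERITY = {1: "Information", 2: "Warning", 4: "Error"}


def final_expression(trigger_value: int, severity: int) -> bool:
    """True means PROBLEM; mirrors the parenthesisation of the expression above."""
    return (((trigger_value == 0 and severity == 4) or trigger_value == 1)
            and severity != 1) and severity != 2


def truth_table():
    """List (previous trigger value, newest severity, resulting state) rows."""
    rows = []
    for trigger_value, severity in product((0, 1), SEVERITY):
        state = "PROBLEM" if final_expression(trigger_value, severity) else "OK"
        rows.append((trigger_value, SEVERITY[severity], state))
    return rows


for row in truth_table():
    print(row)
```

Only the two rows with Error severity come out as PROBLEM, so both Information and Warning entries clear the trigger, which is the stated requirement.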


                • BDiE8VNy
                  Senior Member
                  • Apr 2010
                  • 680

                  #9
                  Originally posted by wlord
                  [...]
                  I see but don't fully understand where my logic is flawed (although it definitely is) Recovery happens on information event 190, so why the 'not equal'?
                  [...]
                  Trigger hysteresis is indeed quite confusing in the beginning. See ZBXNEXT-2118

                  In fact, one has to keep in mind that the expression describes when a trigger should turn to or stay in PROBLEM state:

                  The trigger is in PROBLEM state if:
                  - its value was OK and a log message with "Error" severity occurred
                  - or its value is PROBLEM and no log message with "Information" severity occurred.


                  The first condition controls when the trigger may turn to PROBLEM state, and the second condition controls when to keep the trigger in PROBLEM state.

                  The latter keeps the trigger in PROBLEM state as long as no "Information" message has occurred. Otherwise the expression is FALSE and the trigger turns back to OK. That is why 'not equal' is used.
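
The difference between ending the second condition with `=1` and with `<>1` can be replayed in a few lines of Python. This is only a sketch under simplifying assumptions: the expression is re-evaluated once per new log entry, severities are 1 = Information and 4 = Error as in the thread, and all function names are hypothetical.

```python
# Severity codes as used in the thread's expressions.
INFORMATION, ERROR = 1, 4


def corrected(trigger_value: int, severity: int) -> int:
    """Expression ending in '<>1'; returns 1 for PROBLEM, 0 for OK."""
    return int((trigger_value == 0 and severity == ERROR)
               or (trigger_value == 1 and severity != INFORMATION))


def match_variant(trigger_value: int, severity: int) -> int:
    """Variant ending in '=1' instead of '<>1'."""
    return int((trigger_value == 0 and severity == ERROR)
               or (trigger_value == 1 and severity == INFORMATION))


def replay(expression, severities):
    """Feed severities one by one, carrying the trigger value forward."""
    value, history = 0, []
    for severity in severities:
        value = expression(value, severity)
        history.append("PROBLEM" if value else "OK")
    return history


events = [ERROR, INFORMATION]
print(replay(corrected, events))      # ['PROBLEM', 'OK']
print(replay(match_variant, events))  # ['PROBLEM', 'PROBLEM']
```

With `=1`, the Information entry is exactly what makes the second condition TRUE, so the trigger stays in PROBLEM instead of recovering. That matches the behaviour reported earlier in the thread.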


                  • wlord
                    Junior Member
                    • May 2015
                    • 10

                    #10
                    Thank you for the clarification. This will probably help me with more triggers in the future.


                    • jgshier
                      Junior Member
                      • Mar 2016
                      • 13

                      #11
                      I know this is a little old, but on the same subject: what if you have two different event IDs? One is a "Warning" with event ID 3001 and the other is an "Information" with event ID 3002. We want to clear the 3001 trigger only if we get a 3002. Is that possible?
                      I tried:
                      (({TRIGGER.VALUE}=0 and ({TestPC:eventlog[Application,,"Information|Warning|Error",ZabbixTest,,,skip].logseverity()}=2 and {TestPC:eventlog[Application,,"Information|Warning|Error",ZabbixTest,,,skip].logeventid(3001)}=1)) or
                      ({TRIGGER.VALUE}=1 and ({TestPC:eventlog[Application,,"Information|Warning|Error",ZabbixTest,,,skip].logseverity()}<>1 and {TestPC:eventlog[Application,,"Information|Warning|Error",ZabbixTest,,,skip].logeventid(3002)}<>1)))
                      This works with my two actions for down and up, but any "other" event ID will clear the dashboard, which we don't want. How can I make the second part false for all others? Am I doing this wrong, and do I really need to create an item with the event IDs in it?

                      Thanks for your help.
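
One way to see why unrelated events clear the trigger is to model the posted expression as Boolean logic in Python. The sketch below is an illustration only, not tested against a Zabbix server: it assumes severity 1 = Information and 2 = Warning, one evaluation per new log entry, and a made-up unrelated event ID 5000. The `with_or` variant is a hypothetical alternative that stays in PROBLEM unless the entry is both Information and ID 3002.

```python
# Severity codes as used in the thread's expressions.
INFORMATION, WARNING = 1, 2


def as_posted(value: int, severity: int, event_id: int) -> int:
    """Stay in PROBLEM only if severity is not Information AND the ID is not 3002."""
    return int((value == 0 and severity == WARNING and event_id == 3001)
               or (value == 1 and (severity != INFORMATION and event_id != 3002)))


def with_or(value: int, severity: int, event_id: int) -> int:
    """Hypothetical variant: leave PROBLEM only on Information with ID 3002."""
    return int((value == 0 and severity == WARNING and event_id == 3001)
               or (value == 1 and (severity != INFORMATION or event_id != 3002)))


def replay(expression, events):
    """Feed (severity, event_id) pairs one by one, carrying the trigger value forward."""
    value, history = 0, []
    for severity, event_id in events:
        value = expression(value, severity, event_id)
        history.append("PROBLEM" if value else "OK")
    return history


# Warning 3001, then an unrelated Information event (ID 5000 is made up), then Information 3002.
events = [(WARNING, 3001), (INFORMATION, 5000), (INFORMATION, 3002)]
print(replay(as_posted, events))  # ['PROBLEM', 'OK', 'OK']
print(replay(with_or, events))    # ['PROBLEM', 'PROBLEM', 'OK']
```

In this model the posted form drops to OK on any Information entry because the two "not equal" tests are joined with `and`. Joining them with `or` is the De Morgan form of "not (Information and 3002)". Whether the same change behaves this way in a real trigger would need to be verified.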



                      • Serkoning
                        Serkoning commented
                        This seems to work for me with two different event IDs:

                        Problem: {TestPC:eventlog[Application,,Warning,"ZabbixTest",3001,,].nodata(300)}=0
                        Recovery: {TestPC:eventlog[Application,,Information,"ZabbixTest",3002,,].logeventid(3002)}=1 and {TestPC:eventlog[Application,,Warning,"ZabbixTest",3001,,].nodata(300)}=1