Zabbix 1.1.3 action {TRIGGER.NAME}: {STATUS} wrong after status change

  • netod
    Member
    • Nov 2006
    • 36

    #1

    Hi guys, I'm using version 1.1.3 for both server and agentd. I have an action with the following template:

    Trigger severity = "High"
    Trigger severity = "Disaster"
    Trigger value = "ON"


    Severity: {TRIGGER.SEVERITY}
    Time: {TIME} - {DATE}
    Host: {HOSTNAME}
    IP: {IPADDRESS}

    {TRIGGER.NAME}: {STATUS}
    Trigger key: {TRIGGER.KEY}
    Value: {{HOSTNAME}:{TRIGGER.KEY}.last(0)}

    When the trigger changes to ON, everything is fine and the email reports:

    Severity: High
    Time: 12:07:20 - 2006.11.14
    Host: XXXXXXXXXXXXXXXXXXXX
    IP: xxx.xxx.xxx.xxx

    Server XXXXXXXXXXXXXXXXXX is unreachable: ON
    Trigger key: status
    Value: 2

    For when the trigger changes back to OFF, I have another action with the following template:

    Trigger severity = "High"
    Trigger severity = "Disaster"
    Trigger value = "OFF"

    Severity: {TRIGGER.SEVERITY}
    Time: {TIME} - {DATE}
    Host: {HOSTNAME}
    IP: {IPADDRESS}

    {TRIGGER.NAME}: {STATUS}
    Trigger key: {TRIGGER.KEY}
    Value: {{HOSTNAME}:{TRIGGER.KEY}.last(0)}



    When the status changes to OFF I get an email with the following:


    Severity: High
    Time: 12:08:52 - 2006.11.14
    Host: XXXXXXXXXXXXXXXXXXXXXX
    IP: xxx.xxx.xxx.xxx

    Server XXXXXXXXXXXXXXXXXXX is unreachable: ON <------------------- Should be OFF?
    Trigger key: status
    Value: 0 <------------ notice the value which indicates that the trigger is OFF...
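For comparison, here is a minimal sketch of the expansion I would expect for the OFF notification. It is an illustration only, not Zabbix's actual code: the `render` function, the host name `web01`, and the rule that {STATUS} follows the trigger value at send time are all assumptions.

```python
import re

SIMPLE_MACRO = re.compile(r"\{([A-Z][A-Z.]*)\}")
LAST_VALUE = re.compile(r"\{([^{}:]+):([^{}]+)\.last\(0\)\}")


def render(template, trigger_is_on, fields, last_values):
    """Expand a notification template (hypothetical helper, not Zabbix code)."""
    macros = dict(fields)
    # Assumption: {STATUS} is derived from the trigger value at send time.
    macros["STATUS"] = "ON" if trigger_is_on else "OFF"

    # Pass 1: simple macros such as {HOSTNAME}; unknown ones are left untouched.
    text = SIMPLE_MACRO.sub(lambda m: macros.get(m.group(1), m.group(0)), template)

    # Pass 2: {host:key.last(0)} references, resolved from a lookup table.
    def lookup(m):
        return str(last_values.get((m.group(1), m.group(2)), m.group(0)))

    return LAST_VALUE.sub(lookup, text)


TEMPLATE = """{TRIGGER.NAME}: {STATUS}
Trigger key: {TRIGGER.KEY}
Value: {{HOSTNAME}:{TRIGGER.KEY}.last(0)}"""

fields = {
    "TRIGGER.NAME": "Server web01 is unreachable",
    "TRIGGER.KEY": "status",
    "HOSTNAME": "web01",
}

# Trigger has gone back to OFF and the last status value is 0.
recovered = render(TEMPLATE, False, fields, {("web01", "status"): 0})
print(recovered)
```

With the trigger OFF and a last value of 0, the first line of the rendered text ends in OFF, which is what the second email above should have shown.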


    Has anyone else seen this kind of problem? It seems like an issue with states, but I can't work out why it's happening. Any help would be appreciated.
    Last edited by netod; 14-11-2006, 03:10.
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    I cannot confirm this, everything works fine here in our test environment.
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga

    • disgruntleddutch
      Member
      • Oct 2006
      • 34

      #3
      From what I can tell, 0 is what you are supposed to get back for a status value if the host is seen as available, whereas 2 means it's not available. I get 0 back for status on all my monitored hosts.

      • netod
        Member
        • Nov 2006
        • 36

        #4
        Originally posted by disgruntleddutch
        From what I can tell, 0 is what you are supposed to get back for a status value if the host is seen as available, whereas 2 means it's not available. I get 0 back for status on all my monitored hosts.
        Yeah, the action email seems to contain the correct value (0 if the host is OK, 2 if it is unreachable), but the reported status is not correct: it should be OFF, yet I get ON in both emails. Can anyone else confirm this kind of behavior?

        • netod
          Member
          • Nov 2006
          • 36

          #5
          Originally posted by Alexei
          I cannot confirm this, everything works fine here in our test environment.

          I tested this again with a different host running version 1.1.3 agentd. The ON trigger specified above sends out 3 repeats. The OFF trigger acts only once.


          Content of ON email

          Severity: High
          Time: 14:59:36 - 2006.11.15
          Host: XXXXXXXXXXXXXXXXXXX
          IP: 202.xxx.xxx.xxx

          Server XXXXXXXXXXXXXXXXXXXX is unreachable: ON <----------- CORRECT value
          Trigger key: status
          Value: 2 <------------------ CORRECT VALUE


          Content of OFF email:

          Severity: High
          Time: 15:31:41 - 2006.11.15
          Host: XXXXXXXXXXXXXXXXXXX
          IP: 202.xxx.xxx.xxx

          Server XXXXXXXXXXXXXXXXXXX is unreachable: ON <------------ INCORRECT Value
          Trigger key: status
          Value: 0 <-------------------- CORRECT value


          These tests were run by manually stopping and starting the Zabbix agentd on the given host. Can you think of any reason why this would be happening?
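To make the mismatch easy to spot when repeating the test, here is a rough sketch of a checker for captured notification text. It assumes the message layout used in this thread and the status semantics mentioned above (0 = available, 2 = unreachable); `check_notification` and `EXPECTED_STATUS` are hypothetical names, not part of Zabbix.

```python
import re

# Assumption from this thread: status 0 = host available, 2 = unreachable.
EXPECTED_STATUS = {0: "OFF", 2: "ON"}


def check_notification(body):
    """Return (reported, expected) status for a 'status'-keyed notification."""
    # The "{TRIGGER.NAME}: {STATUS}" line ends in ON or OFF.
    reported = re.search(r":\s*(ON|OFF)\s*$", body, re.MULTILINE)
    # The "Value: N" line carries the last item value.
    value = re.search(r"^\s*Value:\s*(\d+)\s*$", body, re.MULTILINE)
    if not reported or not value:
        raise ValueError("body does not look like the template in this thread")
    return reported.group(1), EXPECTED_STATUS.get(int(value.group(1)))


off_email = """Severity: High
Time: 15:31:41 - 2006.11.15
Host: web01
IP: 192.0.2.10

Server web01 is unreachable: ON
Trigger key: status
Value: 0
"""

reported, expected = check_notification(off_email)
if reported != expected:
    print(f"mismatch: email says {reported}, value implies {expected}")
```

The host name and IP in the sample are placeholders; paste the real message body in their place.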

          • Alexei
            Founder, CEO
            Zabbix Certified Trainer
            Zabbix Certified Specialist
            Zabbix Certified Professional
            • Sep 2004
            • 5654

            #6
            Please could you post your trigger expression as well?

            • netod
              Member
              • Nov 2006
              • 36

              #7
              Originally posted by Alexei
              Please could you post your trigger expression as well?

              Sure.. The trigger in question is for a host unreachable scenario

              ({Unix_t:status.last(0)}=2)&({Unix_t:status.min(600)}=2)&({Unix_t:system.uptime.nodata(600)}=1)
              (Above is redundant)

              • Alexei
                Founder, CEO
                Zabbix Certified Trainer
                Zabbix Certified Specialist
                Zabbix Certified Professional
                • Sep 2004
                • 5654

                #8
                Originally posted by netod
                Sure.. The trigger in question is for a host unreachable scenario

                ({Unix_t:status.last(0)}=2)&({Unix_t:status.min(600)}=2)&({Unix_t:system.uptime.nodata(600)}=1)
                (Above is redundant)
                Your trigger depends on nodata() as well, which means it may become OFF regardless of the value of the status. I wouldn't use the status with the nodata() function; use a processor load item, TCP ping, whatever. It makes no sense to use the status with 'nodata'.

                • netod
                  Member
                  • Nov 2006
                  • 36

                  #9
                  Originally posted by Alexei
                  Your trigger depends on nodata() as well, which means it may become OFF regardless of the value of the status. I wouldn't use the status with the nodata() function; use a processor load item, TCP ping, whatever. It makes no sense to use the status with 'nodata'.
                  Well I thought the way the trigger worked in words is this:

                  if ( last status = 2 ) AND (minimum value of status in last 600 seconds = 2) AND ( uptime value has not received data in the last 600 seconds = 1 ) then call the action.

                  This trigger would become true when the host status is 2 AND host status is 2 in last 600 seconds AND there was no uptime data received from host in last 600 seconds.

                  Am I wrong on this?

                  The nodata function is on the uptime item, which should make sense since uptime is polled and is not an internal value. No?
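Spelled out as code, my reading of the expression looks like the sketch below. This is only how I understand last(0), min(600) and nodata(600), not how the server evaluates them internally; `trigger_fires` and its arguments are names I made up, and returning False when there is no status data at all is my own assumption.

```python
def trigger_fires(status_samples, uptime_times, now, window=600):
    """Evaluate the 'host unreachable' expression from this thread.

    status_samples: list of (timestamp, value) pairs for the 'status' item.
    uptime_times:   list of timestamps at which 'system.uptime' data arrived.
    """
    if not status_samples:
        return False  # assumption: no status data, expression cannot be true

    # last(0): the most recent status value.
    last_status = max(status_samples, key=lambda s: s[0])[1]

    # min(600): the smallest status value seen in the last `window` seconds.
    recent = [value for ts, value in status_samples if now - ts <= window]
    if not recent:
        return False  # assumption: nothing in the window to take a minimum of

    # nodata(600) = 1: no uptime data received in the last `window` seconds.
    no_uptime_data = not any(now - ts <= window for ts in uptime_times)

    # '&' in the trigger expression is a logical AND of the three conditions.
    return last_status == 2 and min(recent) == 2 and no_uptime_data


now = 1000
# Status stuck at 2 for the whole window, last uptime data 700 seconds ago.
print(trigger_fires([(500, 2), (700, 2), (900, 2)], [300], now))
# Status back to 0: the first condition fails, so the trigger should be OFF.
print(trigger_fires([(500, 2), (900, 0)], [300], now))
```

On this reading, a status value of 0 makes the first condition false, so the trigger cannot stay ON once the host reports as available again.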
                  Last edited by netod; 16-11-2006, 23:40.

                  • Alexei
                    Founder, CEO
                    Zabbix Certified Trainer
                    Zabbix Certified Specialist
                    Zabbix Certified Professional
                    • Sep 2004
                    • 5654

                    #10
                    My apologies! I didn't notice the 'uptime', I thought 'nodata' was defined for the 'status'. You're absolutely right!

                    • netod
                      Member
                      • Nov 2006
                      • 36

                      #11
                      Originally posted by Alexei
                      My apologies! I didn't notice the 'uptime', I thought 'nodata' was defined for the 'status'. You're absolutely right!
                      So that means that the trigger should not be the main cause of this weirdness? Not sure how else I can try and debug it short of going through the code. What I might do is recompile 1.1.4 and see if that helps with this issue.

                      • Alexei
                        Founder, CEO
                        Zabbix Certified Trainer
                        Zabbix Certified Specialist
                        Zabbix Certified Professional
                        • Sep 2004
                        • 5654

                        #12
                        Yes, something is wrong here. More testing will be performed prior to the release of 1.1.5. I will keep you informed in this thread.

                        • netod
                          Member
                          • Nov 2006
                          • 36

                          #13
                          Originally posted by Alexei
                          Yes, something is wrong here. More testing will be performed prior to the release of 1.1.5. I will keep you informed in this thread.
                          I have just upgraded to version 1.1.4 (server only) and I'm having the same problem. When the trigger becomes ON:

                          Severity: High
                          Time: 10:01:27 - 2006.11.17
                          Host: dev.XXXXXXXXXX.com.au
                          IP: 203.xxx.xxx.xxx

                          Server dev.XXXXXXXXXX.com.au unreachable: ON <--------CORRECT
                          Trigger key: status
                          Value: 2 <---------- CORRECT


                          Severity: High
                          Time: 10:02:45 - 2006.11.17
                          Host: dev.XXXXXXXXXX.com.au
                          IP: 203.xxx.xxx.xxx

                          Server dev.XXXXXXXXXX.com.au is unreachable: ON <-------- should be OFF
                          Trigger key: status
                          Value: 0 <--------- CORRECT

                          • bbrendon
                            Senior Member
                            • Sep 2005
                            • 870

                            #14
                            Is the problem discussed here related to repeats? I'm on 1.1.2, and if I set up repeats on an action with Trigger value = ON, I still get the repeats even if the trigger value goes to OFF before the repeat would run.
                            Unofficial Zabbix Expert

                            • peter_field
                              Member
                              • Jun 2006
                              • 71

                              #15
                              I third this

                              I am intermittently experiencing this also. I haven't looked into it too far, but I can't see why it works properly sometimes and not others. Perhaps it has something to do with the trigger going into UNKNOWN? Just throwing it out there.

                              Services Not Running: ON <-- Correct
                              2006.11.27 23:50:26
                              Trigger key: uServicesNotRunning
                              Value: AE Tax Publisher <-- Correct
                              Host: ffsrv

                              Services Not Running: ON <-- Should be OFF.
                              2006.11.28 11:50:26
                              Trigger key: uServicesNotRunning
                              Value: None <-- Correct
                              Host: ffsrv

                              <edit to post more detail>
                              Sorry, here are the details of my config.

                              The action is generic and works fine for most triggers. The trigger that has the most trouble (it does work properly sometimes) is listed below. One thing I have noticed with this trigger is that its status is quite often UNKNOWN (which is a matter for another thread).

                              Item Configuration:
                              Desc: Products Outdated
                              Type: Zabbix Agent
                              Key: uProdChk
                              Type: INT64
                              Units: (none)
                              Mult: Do not use
                              Interval/Hist/Trends: 86400/30/365
                              Status: Monitored
                              Store: As is
                              Show: As is

                              Trigger configuration:
                              Name: Products Outdated
                              Expr: {Windows_t:uProdChk.last(0)}>0
                              Deps: no dependences
                              Severity: Average
                              No comments/URL/not disabled

                              Action configuration:
                              Type: send message
                              Source: trigger
                              Conditions: Trigger severity >= "Warning"
                              Send to: single user
                              User: admin
                              Subject: FF:{STATUS}:{HOSTNAME}:{TRIGGER.NAME}
                              Message:
                              {TRIGGER.NAME}: {STATUS}
                              {DATE} {TIME}
                              Trigger key: {TRIGGER.KEY}
                              Value: {{HOSTNAME}:{TRIGGER.KEY}.last(0)}
                              Host: {HOSTNAME}

                              ON = Problem is occurring.
                              OFF = Problem has been rectified.
                              Repeat: No repeats
                              Status: Enabled

                              I hope this helps.

                              Peter
                              Last edited by peter_field; 29-11-2006, 01:33.
