Can't get this trigger to do what I want it to do

  • frater
    Senior Member
    • Oct 2010
    • 340

    #1

    Can't get this trigger to do what I want it to do

    I am monitoring a Windows server that is sometimes unable to resolve certain domains. On this server I am monitoring DNS performance with several perf_counter items polled every 60 seconds.

    Today the customer complained again, and the problem was solved by restarting the Windows 2008 R2 DNS server. Because it happens quite often, the customer is not amused...

    Checking my graphs I can clearly see a spike when it happened.
    So far, so good....

    I want to create a trigger that shows me when it happens, and thought of using 'count'.

    Code:
    {ITEM.LASTVALUE1} Occurrences of more than 1 Failed DNS Queries/minute within the last 2 hours on {HOSTNAME}
    Code:
    {Template_Windows_English:perf_counter[\DNS\Recursive Query Failure].count(7200,1,”gt”)}#0
    Does anyone know why it doesn't trigger?


    Note: I'm using Zabbix 1.9.10
    Attached Files
    Zabbix agents on Linux, FreeBSD, Windows, AVM-Fritz!box, DD-WRT and QNAP
  • frater
    Senior Member
    • Oct 2010
    • 340

    #2
    Besides the problem with this trigger....
    I still don't know how I can get the value (which has to be non-zero) into the trigger description....

    Neither {ITEM.VALUE1} nor {ITEM.LASTVALUE1} gives me that value....

    This means it will only work when I use .last(0) in my trigger....

    • EnigmA-X
      Senior Member
      Zabbix Certified Specialist
      • Oct 2010
      • 116

      #3
      At this time, I cannot really explain why you didn't see the trigger fire, unless your host is not linked to the template that defines this particular trigger.

      Did you already check your log files for errors (Zabbix server)?

      It might be that the 'not equals' trigger expression doesn't work with count, but I'm not sure about that. Check whether your version of Zabbix supports this (simply follow my suggestion below as a test).

      Following that thought, why are you using 'not equals 0' instead of 'more than 0' as your trigger expression? You aren't worried that it happens -1 times, right?

      And finally, why do you want to know how many times it happened in the last hour? To me it looks like you should be interested in *if* it happens; it should not matter whether it happens once or 1000 times...


      • frater
        Senior Member
        • Oct 2010
        • 340

        #4
        Originally posted by EnigmA-X
        At this time, I cannot really explain why you didn't see the trigger fire, unless your host is not linked to the template that defines this particular trigger.
        Thanks for looking at it and trying to help.
        The graph is from that specific host, which proves the host is linked to the template.
        In the meantime I've defined a somewhat simpler trigger that does work and is based on the same data.


        Originally posted by EnigmA-X

        Did you already check your log files for errors (Zabbix server)?

        It might be that the 'not equals' trigger expression doesn't work with count, but I'm not sure about that. Check whether your version of Zabbix supports this (simply follow my suggestion below as a test).
        I'm not sure anymore... I did look at the log and even enabled debugging, but I don't know if it was for this. Normally the trigger would turn invalid if it were incorrect.


        Originally posted by EnigmA-X
        Following that thought, why are you using 'not equals 0' instead of 'more than 0' as your trigger expression? You aren't worried that it happens -1 times, right?
        I expected this trigger to work immediately after I read the wiki page that explains the "count macro", so I used ">5". I replaced it with '#0' when I noticed it wasn't working.

        I'm now using "max(3600)>5" and that trigger is working....


        Originally posted by EnigmA-X

        And finally, why do you want to know how many times it happened in the last hour? To me it looks like you should be interested in *if* it happens; it should not matter whether it happens once or 1000 times...
        It really does matter. I examined other hosts and several of them have these failures from time to time. AFAIK those clients don't have the problems this specific client has. I really want a threshold of 3 or 5....

        It's also because I might be using Zabbix in a different way. We're only using the dashboard (no active notification by email) and we're not looking at it all the time. With the trigger configured the way I intend, I will get a notification immediately after it happens, and if it's only a spike the dashboard will show it for at least an hour. If it keeps occurring and I notice it 20 minutes too late, I can immediately see it's rather serious, as the trigger will then show me that it happened 15 or 18 times, and I will see the start time of the trigger....
        In 1 sentence I will have all the info to get started.
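
        To illustrate the difference between a count-based and a max-based trigger, here is a minimal Python sketch. It is not Zabbix code and only approximates the idea: the function names, the sample data and the window boundary handling are assumptions made for the example.

```python
# Hypothetical stand-ins for the idea behind count(sec, threshold, "gt") and max(sec).
# Samples are (unix_timestamp, value) pairs; boundary handling is an assumption.

def count_gt(samples, now, window, threshold):
    """Count samples inside the last `window` seconds whose value exceeds `threshold`."""
    return sum(1 for ts, value in samples
               if now - window <= ts <= now and value > threshold)


def max_in_window(samples, now, window):
    """Largest value inside the last `window` seconds, or None if there is no data."""
    values = [value for ts, value in samples if now - window <= ts <= now]
    return max(values) if values else None


# Invented data: one sample per minute for two hours, with three spikes.
now = 7200.0
spikes = {3600: 7.0, 3660: 2.0, 7200: 9.0}
samples = [(float(t), spikes.get(t, 0.0)) for t in range(60, 7260, 60)]

print(count_gt(samples, now, 7200, 1))   # how many minutes had more than 1 failure
print(max_in_window(samples, now, 3600)) # worst minute in the last hour
```

        The count-style result says how often the threshold was exceeded, while the max-style result only says that it was exceeded at least once in the window.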
        Last edited by frater; 10-03-2012, 04:31.

        • frater
          Senior Member
          • Oct 2010
          • 340

          #5
          It turns out the double quotes around gt (greater than) were the culprit.
          I only saw this after I enabled debug logging for zabbix_server.
          In my opinion this should also be visible at log level 3.
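
          A quick way to catch this kind of problem before saving a trigger is to scan the pasted expression for typographic quotes. The sketch below assumes the offending characters were curly quotes such as U+201D copied from a web page; the helper names are hypothetical and this is not part of Zabbix.

```python
# Typographic quotes that often sneak in when copying expressions from web pages,
# mapped to their plain ASCII equivalents.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}


def find_smart_quotes(expression):
    """Return (index, character) pairs for every typographic quote in the text."""
    return [(i, ch) for i, ch in enumerate(expression) if ch in SMART_QUOTES]


def normalize_quotes(expression):
    """Replace typographic quotes with plain ASCII quotes."""
    return "".join(SMART_QUOTES.get(ch, ch) for ch in expression)


pasted = "count(7200,1,\u201dgt\u201d)"
print(find_smart_quotes(pasted))   # positions of the curly quotes
print(normalize_quotes(pasted))    # same expression with ASCII quotes
```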


          The examples given here are wrong: http://www.zabbix.com/documentation/...onfig/triggers

          For example,
          count(600,12,”gt”) will return exact number of values which are more than '12' stored in the history for the last 600 seconds.
          Another example:
          count(#10,12,”gt”,86400) will return exact number of values which are larger than '12' stored in the history among last 10 values 24 hours ago.
          PS

          I don't know if this is of any influence, but I have this flag set in the agent.
          It's not needed for my Windows agents, but I do need it for my Linux agents...
          Code:
          UnsafeUserParameters=1
          Last edited by frater; 12-03-2012, 22:00.

          • richlv
            Senior Member
            Zabbix Certified Trainer
            Zabbix Certified Specialist
            Zabbix Certified Professional
            • Oct 2005
            • 3112

            #6
            what did you see in the debug log, btw ?
            Zabbix 3.0 Network Monitoring book


            • frater
              Senior Member
              • Oct 2010
              • 340

              #7
              Code:
               25963:20120312:185807.827 evaluate_expressions():expression [{21590}#0] cannot be evaluated: Evaluation failed for function: {sss-server:perf_counter[\DNS\Recursive Query Failure].count(7200,0,â
               25963:20120312:185807.827 evaluate_expressions():expression [{21594}#0] cannot be evaluated: Evaluation failed for function: {sss-server:perf_counter[\DNS\Total Query Received].count(7200,100,â
               25961:20120312:185833.373 In evaluate_function() function:'mr-wolfserver:perf_counter[\DNS\Recursive Query Failure].count(7200,0,â
               25961:20120312:185833.374 In evaluate_function() function:'mr-wolfserver:perf_counter[\DNS\Total Query Received].count(7200,100,â
               25961:20120312:185833.417 evaluate_expressions():expression [{21592}#0] cannot be evaluated: Evaluation failed for function: {mr-wolfserver:perf_counter[\DNS\Recursive Query Failure].count(7200,0,â
               25961:20120312:185833.417 evaluate_expressions():expression [{21596}#0] cannot be evaluated: Evaluation failed for function: {mr-wolfserver:perf_counter[\DNS\Total Query Received].count(7200,100,â
               25961:20120312:185848.857 In evaluate_function()
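
              Lines like these can be picked out of a debug log with a small script. The following Python sketch is based only on the line format shown above; the regular expression and helper names are assumptions, not an official parser. It also checks whether the failure text ends in "â", which is how the first byte of a UTF-8 encoded U+201D (right curly quote) appears when read as Latin-1.

```python
import re

# Matches lines of the form:
# <pid>:<yyyymmdd>:<hhmmss.mmm> evaluate_expressions():expression [<expr>] cannot be evaluated: <reason>
FAILED = re.compile(
    r"^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}\.\d{3}) "
    r"evaluate_expressions\(\):expression \[(?P<expr>.*?)\] cannot be evaluated: "
    r"(?P<reason>.*)$"
)

# First UTF-8 byte of U+201D (0xE2) rendered as a Latin-1 character.
TRUNCATED_QUOTE = "\u201d".encode("utf-8")[:1].decode("latin-1")


def failed_expressions(lines):
    """Return a dict of named fields for every 'cannot be evaluated' line."""
    hits = []
    for line in lines:
        match = FAILED.match(line)
        if match:
            hits.append(match.groupdict())
    return hits


def ends_with_truncated_smart_quote(reason):
    """True if the failure text stops at what looks like a mangled curly quote."""
    return reason.endswith(TRUNCATED_QUOTE)


sample = [
    r" 25963:20120312:185807.827 evaluate_expressions():expression [{21590}#0] cannot be evaluated: "
    r"Evaluation failed for function: {sss-server:perf_counter[\DNS\Recursive Query Failure].count(7200,0," + "\u00e2",
    " 25961:20120312:185848.857 In evaluate_function()",
]
for hit in failed_expressions(sample):
    print(hit["expr"], ends_with_truncated_smart_quote(hit["reason"]))
```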

              • richlv
                Senior Member
                Zabbix Certified Trainer
                Zabbix Certified Specialist
                Zabbix Certified Professional
                • Oct 2005
                • 3112

                #8
                interesting. could you please report a new ZBX, so that it is not forgotten in case there's a serious problem

                • frater
                  Senior Member
                  • Oct 2010
                  • 340

                  #9
                  @richlv

                  Could you also comment on my wish for some macro so I can get the outcome of the expression that's in the trigger?

                  For a nice output in my dashboard I really want this...

                  I just wrote a little script that returns the age of the latest file in a directory. For flexibility and precision I decided to return the age in seconds. Most of the time one would only need days or hours....

                  My trigger now says:
                  {Zabbix server:vfs.file.latestfiledate[/var/www/vhosts/abelgoldschmidt.nl/private, 2, Abel.7z].last(0)}>604800

                  But I would rather use:
                  {Zabbix server:vfs.file.latestfiledate[/var/www/vhosts/abelgoldschmidt.nl/private, 2, Abel.7z].last(0)/86400}>7

                  In the Description I then want to use:
                  FTP backup of Abel Goldschmidt on {HOSTNAME} is older than a week ({TRIGGER.RESULT1} days)

                  This would really make trigger descriptions cleaner and easier to use.
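
                  For readers who want to reproduce the idea, here is a minimal Python sketch of a script that reports the age of the newest file in a directory in seconds, plus the seconds-to-days conversion discussed above. It is not the script mentioned in this post; the function names are hypothetical and it ignores the extra parameters of the custom item key.

```python
import os
import time

SECONDS_PER_DAY = 86400


def latest_file_age_seconds(directory, now=None):
    """Age in seconds of the most recently modified regular file, or None if there are no files."""
    now = time.time() if now is None else now
    with os.scandir(directory) as entries:
        mtimes = [entry.stat().st_mtime for entry in entries if entry.is_file()]
    if not mtimes:
        return None
    return now - max(mtimes)


def seconds_to_days(seconds):
    """Convert an age in seconds to (fractional) days."""
    return seconds / SECONDS_PER_DAY


# 604800 seconds is the one-week threshold used in the trigger above.
print(seconds_to_days(604800))
```

                  Returning seconds keeps the item precise; the division by 86400 is only needed when presenting the value.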

                  • richlv
                    Senior Member
                    Zabbix Certified Trainer
                    Zabbix Certified Specialist
                    Zabbix Certified Professional
                    • Oct 2005
                    • 3112

                    #10
                    Originally posted by frater
                    Could you also comment on my wish for some macro so I can get the outcome of the expression that's in the trigger?
                    sorry, that's overloading the topic a bit... i'd suggest starting a new thread instead