Something wrong with triggers

  • viadmin
    Junior Member
    • Jan 2015
    • 7

    #1

    Something wrong with triggers

    Hi, guys!

    I have a trigger "Server is down". Condition: no data received from the server for 90 seconds (processor usage and RAM usage). Some time ago I started receiving emails from my Zabbix system reporting the problem "Server is down". But when I look at the graphs, there are no pauses or missing data; everything looks OK. What could it be?
    P.S. I receive data from the servers every 15 seconds. I'd be very grateful for an answer.
  • gleepwurp
    Senior Member
    • Mar 2014
    • 119

    #2
    Hi,

    can you send us the exact trigger definition?

    Sometimes functions don't actually work the way we expect them to.

    G.

    • tchjts1
      Senior Member
      • May 2008
      • 1605

      #3
      Personally, I think using 90 seconds for the threshold is very aggressive.
      I have my "Agent unreachable" alerts set to 5 minutes of no data.

      • viadmin
        Junior Member
        • Jan 2015
        • 7

        #4
        For me, 90 seconds may even be a little longer than I need ))) Because if a server reboots (and 90 seconds could be enough for a reboot), there is no guarantee that all services will start normally and that there will be no downtime. It's really critical for our business. You could say that I can set up monitoring of service states; I know, and those checks are already set up. But after a reboot I have to check that replication between the databases works well, and so on.

        You asked about the trigger definition. Here it is:
        Code:
        {mrss-02:perf_counter[\Processor(_Total)\% Processor Time,15].nodata(90)}=1 and {mrss-02:perf_counter[\Memory\Available Bytes,15].nodata(90)}=1
        By the way, after a little tuning of MySQL I started receiving fewer emails :-)
        Could it be a problem with the database, or with the speed of reading from and writing to the database?
        Last edited by tchjts1; 12-01-2015, 03:47. Reason: Inserted code tags

        • gleepwurp
          Senior Member
          • Mar 2014
          • 119

          #5
          Hi,

          This is what I use for detecting server reboots, a trigger on the server's uptime:

          Server was rebooted:
          Code:
          {Template Windows:system.uptime.change(0)}<0
          As for "server down", a trigger on the Zabbix agent not reporting any pings, combined with ICMP pings, is usually enough:

          Agent is Down:
          Code:
          {Template Windows:agent.ping.nodata(5m)}=1
          Server is Down:
          Code:
          {Template ICMP Ping:icmpping.max(#3)}=0
          G.

          • ingus.vilnis
            Senior Member
            Zabbix Certified Trainer
            Zabbix Certified Specialist
            Zabbix Certified Professional
            • Mar 2014
            • 908

            #6
            Hi,

            Just to add an opinion on the original question: I would say the false alerts could be caused by performance issues on your Zabbix server and database. Check the performance graphs and do some tuning. There is plenty of material on this topic here in the forum.

            From what I have seen, data may be stored on the Zabbix server with a delay. You said that you use nodata triggers set to 90 seconds. If the data is stored on the Zabbix server later than that, the nodata triggers will fire. But then the data arrives after, e.g., 110 seconds, and therefore you see no gaps in the actual graphs.
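
            The timing effect described above can be illustrated with a minimal Python sketch. It is a toy model, not Zabbix code: the names nodata_fires and simulate are hypothetical, and it assumes that values keep their collection timestamp (which is what the graph shows), that values collected during a 110-second stall only reach the server when the stall ends, and that the nodata check is re-evaluated every 30 seconds.

```python
from typing import List, Tuple


def nodata_fires(arrivals: List[int], now: int, window: int) -> bool:
    """Return True if no value reached the server in the last `window` seconds."""
    return not any(now - window < t <= now for t in arrivals)


def simulate(interval: int, window: int, stall_start: int, stall_end: int,
             horizon: int, check_every: int = 30) -> Tuple[List[int], int]:
    """Hypothetical model: values collected during a stall are stored only when it ends."""
    # Collection timestamps, as they would later appear on the graph.
    collected = list(range(0, horizon + 1, interval))
    # Arrival times on the server; stalled values all land at stall_end.
    arrivals = [stall_end if stall_start <= t < stall_end else t for t in collected]
    # Times at which the periodic nodata check finds nothing recent (assumed 30 s cadence).
    alerts = [now for now in range(window, horizon + 1, check_every)
              if nodata_fires(arrivals, now, window)]
    # Largest distance between neighbouring points on the graph.
    largest_gap = max(b - a for a, b in zip(collected, collected[1:]))
    return alerts, largest_gap


alerts, largest_gap = simulate(interval=15, window=90,
                               stall_start=80, stall_end=190, horizon=300)
print(alerts)       # [180]
print(largest_gap)  # 15
```

            In this model the check at second 180 fires because nothing has arrived for more than 90 seconds, yet once the backlog is stored the graph still shows a point every 15 seconds.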

            An alternative way to check availability is to use the Zabbix internal item with key "zabbix[host,agent,available]" on each server you monitor. This item gives you an explicit 1 if up or 0 if down, whereas the agent.ping item gives you 1 if up and nothing if down, so it may reduce false alarms.

            Even so, the performance problem remains. I would check that first.

            Best Regards,
            Ingus
