Zabbix Queue too BIG

  • martins.felipe
    Member
    • Jul 2006
    • 34

    #1

    Zabbix Queue too BIG

    Dear all,

    I'm seeing something I find strange in Zabbix. My Zabbix server monitors about 80 network nodes running a mix of operating systems: Windows 2000, Windows 2003, Linux and Solaris. All monitoring is done over SNMP.
    I have about 100 triggers per server, though not all are enabled on every server; some have 50, some have 100, and so on.
    I think my Zabbix queue is getting too big, because every day the "5 minutes" queue climbs to more than 3000 items. Is that normal? If not, what can I do to fix it or improve performance?
    Below is a sample of my zabbix queue screen:

    QUEUE OF ITEMS TO BE UPDATED
    Delay Count
    5 seconds 0
    10 seconds 0
    30 seconds 76
    1 minute 840
    5 minutes 3210
    More than 5 minutes 238


    []'s
    Thanks in advance.
  • ad@kbc-clearing.com
    Member
    • Sep 2005
    • 77

    #2
    Take a look at http://www.zabbix.com/forum/search.php?searchid=232754


    • martins.felipe
      Member
      • Jul 2006
      • 34

      #3
      Queue is working fine but it is too big

      The Zabbix queue is working fine; I just find it strange that it holds so many items, more than 3000. No host is blocking another from being monitored. Everything works, but given the number of items I monitor, it seems strange to see 3000 in the queue.

      [ ]'s



      • alj
        Senior Member
        • Aug 2006
        • 188

        #4
        Originally posted by martins.felipe
        The Zabbix queue is working fine; I just find it strange that it holds so many items, more than 3000. No host is blocking another from being monitored. Everything works, but given the number of items I monitor, it seems strange to see 3000 in the queue.

        [ ]'s
        I see that on one machine that has slower disk I/O than the others. It basically means the database engine cannot keep up with the data you receive. You have to increase check intervals, disable unneeded monitors, shorten the data retention period, and so on.

        I was also wondering whether PostgreSQL would be a better fit for Zabbix here, because PostgreSQL has significantly better write performance than InnoDB, and write performance matters more than anything else for Zabbix.
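
To put the advice about check intervals in rough numbers: the write load on the database scales with the number of enabled items divided by their check interval. The sketch below is only an illustration of that arithmetic. The item counts and intervals are hypothetical and are not taken from anyone's setup in this thread.

```python
# Rough estimate of how many new values per second the database must absorb.
# All item counts and intervals below are hypothetical, not from the thread.

def values_per_second(items_by_interval):
    """items_by_interval maps a check interval in seconds to the number
    of enabled items polled at that interval."""
    return sum(count / interval for interval, count in items_by_interval.items())


# Hypothetical setup: 8000 items, all polled every 30 seconds.
before = values_per_second({30: 8000})

# Same items, but 6000 of them moved to a 5-minute interval.
after = values_per_second({30: 2000, 300: 6000})

print(f"before: {before:.1f} values/s")
print(f"after:  {after:.1f} values/s")
```

Lengthening the interval on items that change slowly reduces the insert rate without disabling anything, which is usually the cheapest change to try first.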


        • martins.felipe
          Member
          • Jul 2006
          • 34

          #5
          I agree !!!

          I agree with you, maybe PostgreSQL has less overhead.
          I've already switched off every item I don't use; there may be 2 or 3 that I've forgotten, but no more. I like your idea, though: I hadn't thought of changing the check intervals. I'm going to do that to make sure everything goes OK with the queue.

          Thanks
          [ ] 's


          • fmtaylor2
            Member
            • May 2006
            • 66

            #6
            Smooth as glass setup

            With 18 pollers, using MySQL on RHEL4 x86_64, 4 GB RAM and 2 dual-core Intel(R) Xeon(TM) CPUs at 3.20GHz

            Values stored 312524215
            Trends stored 12988598
            Number of hosts (monitored/not monitored/templates/deleted) 136(117/7/12/0)
            Number of items (monitored/disabled/not supported)[trapper] 13089(7429/3085/2575)[0]
            Number of triggers (enabled/disabled)[true/unknown/false] 4273(3678/595)[9/90/3579]
            Number of alarms 80704
            Number of alerts 506

            5 seconds 634
            10 seconds 333
            30 seconds 752
            1 minute 5
            5 minutes 0
            More than 5 minutes 0

            top - 15:59:46 up 3 days, 7:40, 2 users, load average: 3.70, 3.75, 4.80
            Tasks: 147 total, 1 running, 146 sleeping, 0 stopped, 0 zombie
            Cpu(s): 51.5% us, 6.9% sy, 0.0% ni, 34.4% id, 6.3% wa, 0.1% hi, 0.8% si
            Mem: 4041584k total, 4026132k used, 15452k free, 24796k buffers
            Swap: 4072436k total, 176k used, 4072260k free, 3478428k cached

            04:01:09 PM CPU %user %nice %system %iowait %irq %soft %idle intr/s
            04:01:09 PM all 19.44 0.01 2.48 4.49 0.04 0.27 73.28 2139.26
            04:01:09 PM 0 17.40 0.01 3.18 2.22 0.00 1.01 76.18 616.34
            04:01:09 PM 1 18.09 0.01 1.87 2.12 0.01 0.04 77.86 142.91
            04:01:09 PM 2 19.20 0.01 2.36 5.53 0.07 0.24 72.60 268.26
            04:01:09 PM 3 20.12 0.01 2.78 8.63 0.07 0.29 68.10 295.16
            04:01:09 PM 4 21.00 0.01 2.94 4.39 0.04 0.23 71.39 229.17
            04:01:09 PM 5 19.40 0.01 2.06 2.74 0.02 0.03 75.73 147.82
            04:01:09 PM 6 19.38 0.01 1.98 2.75 0.02 0.10 75.76 175.64
            04:01:09 PM 7 20.93 0.01 2.65 7.52 0.06 0.23 68.60 263.97

            avg-cpu: %user %nice %sys %iowait %idle
            19.44 0.01 2.79 4.49 73.27

            sda is a hardware raid 10 for the database. The OS is on sdb.

            Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
            sda 898.50 1781.31 4071.04 511091782 1168060416
            sda1 4299.32 1781.30 4070.92 511090214 1168028160

            If I increase the number of pollers it gets slower. If I get too many items that don't respond, the queue backs up, especially simple check pings that don't answer.

            It's a balancing act, but it's working for me.
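
One way to see why unresponsive items back up the queue is a simplified model in which each poller handles one check at a time and a check that never answers holds its poller until the timeout expires. The sketch below uses the 18 pollers and 7429 monitored items from the post above; the 60-second interval, the 0.05-second check time, the 5-second timeout and the 200 unresponsive items are assumptions made up for illustration. It also ignores database and CPU contention, which may be why adding pollers made things slower on this machine.

```python
# Simplified poller load model: a value above 1.0 means the pollers cannot
# finish all checks within one interval, so the queue grows.
# interval_s, check_s, timeout_s and the unresponsive count are assumptions.

def poller_load(pollers, responsive, unresponsive, interval_s,
                check_s=0.05, timeout_s=5.0):
    """Fraction of total poller time needed to poll every item once
    per interval."""
    busy_seconds = responsive * check_s + unresponsive * timeout_s
    return busy_seconds / interval_s / pollers


# All 7429 monitored items answer quickly.
healthy = poller_load(18, 7429, 0, 60)

# Same item count, but 200 of them never answer and run into the timeout.
degraded = poller_load(18, 7229, 200, 60)

print(f"healthy:  {healthy:.2f}")
print(f"degraded: {degraded:.2f}")
```

In this model a few hundred timing-out items cost more poller time than several thousand healthy ones, so disabling or fixing dead checks tends to help more than raising the poller count.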
