Housekeeper and DB performance

  • techaca
    Junior Member
    • Nov 2017
    • 2

    #1

    Hello!

    We are using Zabbix 3.2 to monitor our network, which currently has 320 hosts and 550 NVPS.
    This has been up and running fine for months. While cleaning up issues with a few hosts, I had to delete those hosts and re-create them. Afterwards they started reporting correctly to the Zabbix server, but now we are seeing performance issues.

    The issue arises when the housekeeper runs. It starts 30 minutes after the server is rebooted and runs for about 7 hours before idling for an hour. After that hour the housekeeper starts again for another 7 hours, and the cycle repeats.

    It appears that nothing was deleted after running for that long.

    Just looking for some advice on troubleshooting or tuning this DB issue.

    Thanks in advance

    1781:20171108:151722.403 housekeeper [deleted 0 hist/trends, 3125000 items, 0 events, 168 problems, 0 sessions, 0 alarms, 0 audit items in 26197.656955 sec, idle for 1 hour(s)]
    1781:20171108:151722.403 __zbx_zbx_setproctitle() title:'housekeeper [deleted 0 hist/trends, 3125000 items, 0 events, 0 sessions, 0 alarms, 0 audit items in 26197.656955 sec, idle for 1 hour(s)]'
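The summary line above already carries enough numbers for a quick sanity check. The following is a minimal sketch, assuming the log format shown in this post; the function name `parse_housekeeper_line` is made up for illustration. It pulls out the per-category counts and the elapsed time and turns them into a run time in hours and an average delete rate.

```python
import re

# Summary line copied from the log excerpt in this post.
LOG_LINE = (
    "1781:20171108:151722.403 housekeeper [deleted 0 hist/trends, 3125000 items, "
    "0 events, 168 problems, 0 sessions, 0 alarms, 0 audit items "
    "in 26197.656955 sec, idle for 1 hour(s)]"
)


def parse_housekeeper_line(line):
    """Return (counts_by_category, elapsed_seconds) from a housekeeper summary line."""
    match = re.search(r"\[deleted (.+) in ([\d.]+) sec", line)
    if match is None:
        raise ValueError("not a housekeeper summary line")
    counts = {}
    for part in match.group(1).split(", "):
        number, label = part.split(" ", 1)  # e.g. "0 audit items" -> "0", "audit items"
        counts[label] = int(number)
    return counts, float(match.group(2))


counts, elapsed = parse_housekeeper_line(LOG_LINE)
total_rows = sum(counts.values())
print(f"run time: {elapsed / 3600:.2f} h")
print(f"rows deleted: {total_rows}")
print(f"average rate: {total_rows / elapsed:.0f} rows/s")
```

With these numbers the run takes a little over 7 hours and averages roughly 120 deleted rows per second. The `items` count of 3125000 is also exactly 625 × 5000, which would be consistent with the `MaxHousekeeperDelete=5000` setting below being applied per cleanup task, although that reading is an assumption and not something the log states.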

                 total       used       free     shared    buffers     cached
    Mem:         64352      63175       1176       3237        386      61243
    -/+ buffers/cache:       1545      62806
    Swap:       131067          0     131067

    LogFile=/var/log/zabbixs/zabbix-server.log
    DebugLevel=3
    StartPollers=75
    StartPollersUnreachable=20
    StartTrappers=25
    StartPingers=10
    MaxHousekeeperDelete=5000
    CacheSize=128M
    StartDBSyncers=4
    HistoryCacheSize=128M
    HistoryIndexCacheSize=16M
    TrendCacheSize=16M
    ValueCacheSize=64M
    Timeout=4
    AlertScriptsPath=/usr/share/zabbix/alertscripts
    LogSlowQueries=3000
    top - 15:29:49 up 1 day, 1:48, 1 user, load average: 1.34, 1.95, 3.87
    Tasks: 386 total, 1 running, 385 sleeping, 0 stopped, 0 zombie
    %Cpu(s): 0.6 us, 0.4 sy, 0.0 ni, 97.7 id, 1.0 wa, 0.0 hi, 0.3 si, 0.0 st
    KiB Mem: 65896648 total, 64702512 used, 1194136 free, 395908 buffers
    KiB Swap: 13421363+total, 0 used, 13421363+free. 62723152 cached Mem

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    1201 mysql 20 0 6931328 392696 15944 S 6.623 0.596 297:09.47 mysqld
    1694 zabbixs 20 0 451780 35568 31264 S 0.662 0.054 1:21.61 zabbix-server
    1628 zabbixs 20 0 452064 35868 31264 S 0.331 0.054 1:20.80 zabbix-server
    1634 zabbixs 20 0 451936 35936 31364 S 0.331 0.055 1:18.89 zabbix-server
    1635 zabbixs 20 0 451720 35988 31092 S 0.331 0.055 1:20.12 zabbix-server
    1637 zabbixs 20 0 452092 35936 31216 S 0.331 0.055 1:21.03 zabbix-server
    1641 zabbixs 20 0 451932 35688 31116 S 0.331 0.054 1:20.04 zabbix-server
    1647 zabbixs 20 0 452428 36760 31200 S 0.331 0.056 1:21.51 zabbix-server
    1648 zabbixs 20 0 452040 35948 31284 S 0.331 0.055 1:20.27 zabbix-server
    1650 zabbixs 20 0 451756 35612 31228 S 0.331 0.054 1:21.09 zabbix-server
    1654 zabbixs 20 0 451696 36116 31248 S 0.331 0.055 1:20.60 zabbix-server
    1660 zabbixs 20 0 452040 35952 31284 S 0.331 0.055 1:21.50 zabbix-server
    1661 zabbixs 20 0 452076 36620 31368 S 0.331 0.056 1:20.20 zabbix-server
    1663 zabbixs 20 0 451796 36336 31476 S 0.331 0.055 1:20.50 zabbix-server
    1664 zabbixs 20 0 452064 35968 31364 S 0.331 0.055 1:20.25 zabbix-server
    1678 zabbixs 20 0 451784 36124 31276 S 0.331 0.055 1:20.63 zabbix-server
    1679 zabbixs 20 0 452328 36160 31332 S 0.331 0.055 1:20.38 zabbix-server
    1683 zabbixs 20 0 452048 36500 31368 S 0.331 0.055 1:20.51 zabbix-server
    1685 zabbixs 20 0 451916 35900 31384 S 0.331 0.054 1:21.37 zabbix-server
    1686 zabbixs 20 0 452044 35912 31300 S 0.331 0.054 1:20.26 zabbix-server
    1688 zabbixs 20 0 451692 35644 31324 S 0.331 0.054 1:21.68 zabbix-server
    1689 zabbixs 20 0 452072 35936 31236 S 0.331 0.055 1:20.42 zabbix-server
    1692 zabbixs 20 0 452200 35968 31152 S 0.331 0.055 1:20.20 zabbix-server
    1698 zabbixs 20 0 451756 36416 31484 S 0.331 0.055 1:20.82 zabbix-server
    1701 zabbixs 20 0 452408 36932 31364 S 0.331 0.056 1:20.21 zabbix-server
    1713 zabbixs 20 0 451144 30992 28580 S 0.331 0.047 0:24.04 zabbix-server
    1793 zabbixs 20 0 470540 62276 37792 S 0.331 0.095 1:02.12 zabbix-server
    1795 zabbixs 20 0 474876 66676 37904 S 0.331 0.101 1:01.60 zabbix-server



  • kaspars.mednis
    Senior Member
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Oct 2017
    • 349

    #2
    Hi!

    This problem has nothing to do with NVPS or host count; the problem is DB size and disk I/O performance.
    DELETE queries against very large tables become very slow and resource hungry.

    You can shorten your history/trends storage period, move to faster hardware with more IOPS (RAID10, SSD disks), or use partitioning.


    Pros of partitioning: you will be able to drop history data for a whole period in a few seconds.
    Cons: you will lose the per-item fine tuning of history storage periods. For example, if you choose to keep 14 days of data in the history_uint table, ALL items of type numeric (unsigned) will be kept for 14 days, ignoring the period specified in the item. That is because an entire table partition, covering a whole day, is dropped at once. In return you get a HUGE performance gain over standard DELETE queries.
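To make the trade-off concrete, here is a minimal sketch of what daily partition maintenance could look like. It only builds the MySQL `ALTER TABLE ... ADD PARTITION` and `DROP PARTITION` statements as strings and does not talk to a database. The partition naming scheme (`pYYYY_MM_DD`), the helper names, and the use of UTC midnight as the day boundary are all assumptions made for illustration; it also assumes the table is already range-partitioned on its `clock` column.

```python
from datetime import date, datetime, timedelta, timezone


def partition_name(day):
    """Hypothetical naming scheme: one partition per day, e.g. p2017_11_08."""
    return f"p{day:%Y_%m_%d}"


def upper_bound(day):
    """Unix timestamp of the UTC midnight that ends `day` (exclusive bound)."""
    nxt = day + timedelta(days=1)
    return int(datetime(nxt.year, nxt.month, nxt.day, tzinfo=timezone.utc).timestamp())


def add_partition_sql(table, day):
    return (
        f"ALTER TABLE {table} ADD PARTITION "
        f"(PARTITION {partition_name(day)} VALUES LESS THAN ({upper_bound(day)}));"
    )


def drop_partition_sql(table, day):
    return f"ALTER TABLE {table} DROP PARTITION {partition_name(day)};"


def daily_maintenance(table, today, keep_days, ahead_days=1):
    """Create the partition `ahead_days` from now and drop the one whose
    newest row is already at least `keep_days` old."""
    return [
        add_partition_sql(table, today + timedelta(days=ahead_days)),
        drop_partition_sql(table, today - timedelta(days=keep_days + 1)),
    ]


for statement in daily_maintenance("history_uint", date(2017, 11, 22), keep_days=14):
    print(statement)
```

Dropping a partition removes a whole day of rows as one metadata operation, which is why the retention period ends up being the same for every item stored in that table.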

    Regards,
    Kaspars

    • techaca
      Junior Member
      • Nov 2017
      • 2

      #3
      Thank you for your reply!

      That makes sense. I thought the issue was related to the MySQL DB, I just wasn't sure of the best way to resolve it.
      We are looking at going the partitioning route. Are there any noticeable disadvantages to this other than the fine tuning you mentioned?

      • kaspars.mednis
        Senior Member
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Oct 2017
        • 349

        #4
        I can't remember any other disadvantages. I have used partitioning with Zabbix for years without problems.

        As long as new partitions are created in time, everything works fine.
        The only time I had problems was when the scheduler was broken (because of human intervention): no new partition was created, and Zabbix was unable to write history data (Zabbix will complain about it in the zabbix_server.log file). I created a new partition by hand and everything started working again immediately.
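The failure described here (no partition ready for incoming data) is easy to check for ahead of time. Below is a minimal sketch, assuming a hypothetical one-partition-per-day naming scheme of `pYYYY_MM_DD`; the function names are invented for illustration. The list of existing partition names is passed in as plain strings; in practice it could come, for example, from the `PARTITION_NAME` column of MySQL's `information_schema.PARTITIONS` table.

```python
from datetime import date, timedelta


def expected_partitions(today, days_ahead):
    """Names of the partitions that should exist for today and the next `days_ahead` days."""
    names = []
    for offset in range(days_ahead + 1):
        day = today + timedelta(days=offset)
        names.append(f"p{day:%Y_%m_%d}")  # hypothetical naming scheme
    return names


def missing_partitions(existing, today, days_ahead=2):
    """Return the expected partition names that are not in `existing`, oldest first."""
    present = set(existing)
    return [name for name in expected_partitions(today, days_ahead) if name not in present]


# Example: the scheduler stopped after creating the partition for 8 November.
existing = ["p2017_11_06", "p2017_11_07", "p2017_11_08"]
gaps = missing_partitions(existing, date(2017, 11, 8))
if gaps:
    print("WARNING: missing partitions:", ", ".join(gaps))
```

Running a check like this from cron or as a monitored item gives a warning while there is still time to create the partition by hand, before writes start failing.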

        The MySQL DB design will become a little more complex, but it's worth it.

        Regards,
        Kaspars
