zabbix history syncer processes more than 75% busy

  • joszif
    Junior Member
    • Jan 2018
    • 5

    #16
    Hi Kloczek,

    sorry, but I'm a newbie to storage performance issues.
    I installed fio and ran random read and write performance tests. If I'm not mistaken, the write rate is about half the read rate.


    random read:

    bi@zabbix:/tmp$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randread
    test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
    fio-2.2.10
    Starting 1 process
    Jobs: 1 (f=1): [r(1)] [100.0% done] [6533KB/0KB/0KB /s] [1633/0/0 iops] [eta 00m:00s]
    test: (groupid=0, jobs=1): err= 0: pid=38667: Fri Feb 9 13:03:34 2018
    read : io=4096.0MB, bw=8550.2KB/s, iops=2137, runt=490554msec
    cpu : usr=0.76%, sys=2.61%, ctx=270801, majf=0, minf=73
    IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
    submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
    issued : total=r=1048576/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
    latency : target=0, window=0, percentile=100.00%, depth=64

    Run status group 0 (all jobs):
    READ: io=4096.0MB, aggrb=8550KB/s, minb=8550KB/s, maxb=8550KB/s, mint=490554msec, maxt=490554msec

    Disk stats (read/write):
    sda: ios=1058742/124229, merge=0/23899, ticks=32882792/272992, in_queue=33156740, util=100.00%



    random write:


    bi@zabbix:/tmp$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randwrite
    test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
    fio-2.2.10
    Starting 1 process
    Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/8516KB/0KB /s] [0/2129/0 iops] [eta 00m:00s]
    test: (groupid=0, jobs=1): err= 0: pid=48603: Fri Feb 9 14:03:33 2018
    write: io=4096.0MB, bw=4117.3KB/s, iops=1029, runt=1018716msec
    cpu : usr=0.36%, sys=1.49%, ctx=317704, majf=0, minf=9
    IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
    submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
    issued : total=r=0/w=1048576/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
    latency : target=0, window=0, percentile=100.00%, depth=64

    Run status group 0 (all jobs):
    WRITE: io=4096.0MB, aggrb=4117KB/s, minb=4117KB/s, maxb=4117KB/s, mint=1018716msec, maxt=1018716msec

    Disk stats (read/write):
    sda: ios=9362/1122857, merge=7/20915, ticks=1006788/54816900, in_queue=55825364, util=100.00%



    mixed random read/write (rwmixread=75):


    bi@zabbix:/tmp$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
    test: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
    fio-2.2.10
    Starting 1 process
    Jobs: 1 (f=1): [m(1)] [100.0% done] [5966KB/1874KB/0KB /s] [1491/468/0 iops] [eta 00m:00s]
    test: (groupid=0, jobs=1): err= 0: pid=56500: Fri Feb 9 14:37:32 2018
    read : io=3071.7MB, bw=4857.5KB/s, iops=1214, runt=647541msec
    write: io=1024.4MB, bw=1619.9KB/s, iops=404, runt=647541msec
    cpu : usr=0.72%, sys=2.29%, ctx=444454, majf=0, minf=9
    IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
    submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
    issued : total=r=786347/w=262229/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
    latency : target=0, window=0, percentile=100.00%, depth=64

    Run status group 0 (all jobs):
    READ: io=3071.7MB, aggrb=4857KB/s, minb=4857KB/s, maxb=4857KB/s, mint=647541msec, maxt=647541msec
    WRITE: io=1024.4MB, aggrb=1619KB/s, minb=1619KB/s, maxb=1619KB/s, mint=647541msec, maxt=647541msec

    Disk stats (read/write):
    sda: ios=787076/417154, merge=0/45641, ticks=34827636/6285268, in_queue=41114504, util=100.00%
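
    A quick sanity check of that ratio, using the aggregate bandwidths from the pure random-read and random-write runs above (assuming bc is installed):

    Code:
    echo "scale=2; 8550/4117" | bc   # ~2.08, i.e. the disk sustains roughly twice as much read bandwidth as write bandwidth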





    • bbrendon
      Senior Member
      • Sep 2005
      • 870

      #17
      3.4.4 was killing me with some template issues, and then 3.4.6 killed my disk.

      I'm hoping my problems are gone with 3.4.7rc1.

      Here is a packaged version of 3.4.7rc1 that I built using the Zabbix-branded package rules.

      • Breorn
        Junior Member
        • Dec 2017
        • 2

        #18
        History syncer and Housekeeper problem - v3.4.6

        Hello. I have the same problem too. Does anybody know when v3.4.7 for RHEL 7 will be released? These issues are really killing me. I really like Zabbix, but right now I cannot use it as a primary monitoring system, because it reports nonsense (the Zabbix server is overloaded and cannot write its data to MariaDB). v3.4.4 was OK. Thanks a lot. There is a v3.4.7 section in the Zabbix manual, but this version is not yet present in the official repo.


        • tcilmo
          Senior Member
          • Nov 2016
          • 122

          #19
          Any estimates on when 3.4.7 will be available?


          • tcilmo
            Senior Member
            • Nov 2016
            • 122

            #20
            Re:

            Bbrendon,

            Will you please go into detail about what happened to your disk/environment when you were on 3.4.6?

            -Tony


            • tcilmo
              Senior Member
              • Nov 2016
              • 122

              #21
              Originally posted by Breorn
              Hello. I have the same problem too. Does anybody know when v3.4.7 for RHEL 7 will be released? These issues are really killing me. I really like Zabbix, but right now I cannot use it as a primary monitoring system, because it reports nonsense (the Zabbix server is overloaded and cannot write its data to MariaDB). v3.4.4 was OK. Thanks a lot. There is a v3.4.7 section in the Zabbix manual, but this version is not yet present in the official repo.
              We were just told that Zabbix is going to prepare 3.4.7 packages early next week!


              • tcilmo
                Senior Member
                • Nov 2016
                • 122

                #22
                Originally posted by kaspars.mednis
                3.4.5 and 3.4.6 are affected by ZBX-13343, which can be the cause of your issue.

                Regards,
                Kaspars
                We were just told by Zabbix that the bug (ZBX-13343) exists in all versions prior to 3.4.7.


                • bbrendon
                  Senior Member
                  • Sep 2005
                  • 870

                  #23
                  Originally posted by tcilmo
                  Bbrendon,

                  Will you please go into detail about what happened to your disk/environment when you were on 3.4.6?

                  -Tony
                  It's detailed in the beginning of this thread by others.


                  • kloczek
                    Senior Member
                    • Jun 2006
                    • 1771

                    #24
                    Originally posted by joszif
                    Hi Kloczek,

                    sorry, but I'm a newbie to storage performance issues.
                    I installed fio and ran random read and write performance tests. If I'm not mistaken, the write rate is about half the read rate.


                    random read:

                    bi@zabbix:/tmp$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randread
                    test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
                    fio-2.2.10
                    I've been asking you not to generate an IO workload with a fixed ratio like that, but to look at the actual ratio between the read and write IOs generated by the DB engine used by your Zabbix server.
                    Do you have monitoring of the IOs on this host?
                    http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
                    https://kloczek.wordpress.com/
                    zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
                    My zabbix templates https://github.com/kloczek/zabbix-templates
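
                    For illustration, per-process I/O from sysstat's pidstat is one way to see the mix the database engine itself generates (a sketch, assuming sysstat is installed and the DB process is named mysqld; substitute your own process name, e.g. postgres):

                    Code:
                    # per-process I/O every 2 seconds for commands whose name contains "mysqld"
                    pidstat -d -C mysqld 2
                    # compare the kB_rd/s and kB_wr/s columns to see the actual read/write mix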


                    • kacareu
                      Junior Member
                      • Feb 2018
                      • 1

                      #25
                      Originally posted by joszif
                      Hi Kloczek,

                      sorry, but I'm a newbie to storage performance issues.
                      I installed fio and ran random read and write performance tests. If I'm not mistaken, the write rate is about half the read rate.



                      Hi joszif,

                      What Kloczek means is that you need to check the read and write operation counts that are actually generated by the Zabbix server.

                      You can use the sysstat package for this purpose. For example:

                      Code:
                      iostat -x 2
                      The output gives you the r/s and w/s values for each disk in your system. Check those values to see the read/write ratio. You may post the output here too, so we can check it together.
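
                      If it helps, here is a rough one-liner that pulls those two columns for a single disk and prints the ratio (a sketch: it assumes the database lives on sda and the sysstat 10.x output format shipped with RHEL 7, where r/s and w/s are the 4th and 5th fields of the device line; adjust the device name and field numbers for your setup):

                      Code:
                      # extended stats for sda, 3 samples 2 s apart; keep the last sample and print the read:write ratio
                      iostat -dx sda 2 3 | awk '$1=="sda" {r=$4; w=$5} END {if (w+0 > 0) printf "r/s=%s w/s=%s read:write=%.2f\n", r, w, r/w}'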


                      • db100
                        Member
                        • Feb 2023
                        • 61

                        #26
                        I have a similar issue; however, in my case the Zabbix server works quite smoothly during regular operation, but as soon as it is restarted, for some reason all processes go crazy... here are the metrics. Is there a way to avoid this wild behavior?

                        [attached screenshot: image.png (process metrics after the restart)]



                        • db100
                          db100 commented
                          Browsing through the forum, I found that the history write cache seems to be the problem; you can see it suddenly jumping to 100%... I believe this might be due to some disk backpressure and/or CPU load, but again I can't guess why the behavior changes so suddenly.
                          Would increasing HistoryCacheSize and the number of DB/history syncer processes solve this problem?
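
                          If it helps, a minimal sketch of the zabbix_server.conf parameters involved (the values below are illustrative guesses, not tested recommendations; they only take effect after restarting zabbix-server, and extra syncers only help if the database and disks can absorb the additional writes):

                          Code:
                          # /etc/zabbix/zabbix_server.conf (excerpt)
                          HistoryCacheSize=256M        # history write cache, default 16M
                          HistoryIndexCacheSize=32M    # index for the history cache, default 4M
                          StartDBSyncers=8             # number of history syncer (DB syncer) processes, default 4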