Problem with too much average load while running zabbix_server 1.4.4

  • pascalp
    Junior Member
    • Jan 2008
    • 5

    #1

    Hello,

    I recently installed Zabbix 1.4.4 server and clients on my machines. My server is a virtual Debian machine with a guaranteed 1500 MHz Xeon and nearly 800 MB RAM. I'm monitoring 15 Linux servers (most of them, about 10, running SuSE 9.3, 10.0, 10.1 or 10.2, and a few running Debian 3.1 or 4). They all use zabbix-agentd. The SuSE servers use active checks because I don't have any access to the router. On each server I try to monitor 15 items.

    My problem:

    From time to time the load average on my monitoring server rises to 6, even 7. When I restart the server, the load stays below 1 for a while, but often after a few minutes it starts rising again, to 5-6 and sometimes 7. I think, but I'm not sure, that this rise in load is triggered by an alert on one of my servers. I mean, after the first alert, the load starts climbing.

    At first I thought I was trying to monitor too many items on too many servers, so I reduced it to 15 servers with at most 15 items and 15 triggers each. Next I set a minimum interval of 60 seconds for each item, and longer for most of them. Nothing changed.
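
As a rough sanity check on whether a configuration of this size could explain the load, the upper bound on the collection rate can be worked out. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it ignores trigger evaluation and database writes.

```c
/* Upper bound on values collected per second when every item on every
 * host is polled at the shortest configured interval.
 * Hypothetical helper for illustration; not part of Zabbix. */
double max_values_per_second(int hosts, int items_per_host, int min_interval_s)
{
    return (double)(hosts * items_per_host) / (double)min_interval_s;
}
```

With 15 hosts, 15 items each, and a 60-second minimum interval, this gives 225 / 60 = 3.75 values per second at most, which suggests the item count alone is unlikely to account for a load of 6 or 7.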

    I'm not sure now whether this is a bug or a bad configuration, but if anyone has an idea about either, I would be grateful.

    N.B. My configuration files are unchanged (except for the server address, of course).

    Thanks,
    pascal
  • Petya
    Member
    • Dec 2007
    • 37

    #2
    same

    I have the same issue.

    I've checked `top` during high LA and some zabbix_server
    processes ate memory while others did not.

    I've attached `strace` to the one process which ate memory
    and here's the listing:

    accept(4, 0xbfffd130, [16]) = -1 EBADF (Bad file descriptor)
    read(4, 0xbfffd19c, 5) = -1 EBADF (Bad file descriptor)
    accept(4, 0xbfffd130, [16]) = -1 EBADF (Bad file descriptor)
    read(4, 0xbfffd19c, 5) = -1 EBADF (Bad file descriptor)
    accept(4, 0xbfffd130, [16]) = -1 EBADF (Bad file descriptor)
    read(4, 0xbfffd19c, 5) = -1 EBADF (Bad file descriptor)
    accept(4, 0xbfffd130, [16]) = -1 EBADF (Bad file descriptor)
    read(4, 0xbfffd19c, 5) = -1 EBADF (Bad file descriptor)

    and so on.

    I suspect a loop in child_trapper_main in zabbix_server/trapper/trapper.c
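
A minimal sketch of why such a loop would spin, assuming the listening descriptor was closed somewhere while the loop kept using it (an assumption drawn from the strace output, not confirmed against the Zabbix source here): accept() on a closed descriptor fails immediately with EBADF instead of blocking, so every iteration returns at once and burns CPU. The function name is made up for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns how many of `attempts` accept() calls fail with EBADF once
 * the descriptor has been closed, or -1 if no socket could be created.
 * Hypothetical helper for illustration; not Zabbix code. */
int count_ebadf_after_close(int attempts)
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    close(fd);  /* simulate the descriptor being closed elsewhere */

    int failures = 0;
    for (int i = 0; i < attempts; i++) {
        /* Never blocks: the descriptor number is no longer valid. */
        if (accept(fd, NULL, NULL) == -1 && errno == EBADF)
            failures++;
    }
    return failures;
}
```

Every call fails without waiting, which matches the repeating accept/read pairs in the listing above. A loop that treats EBADF as fatal and exits, rather than retrying, would not spin this way.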


    • Alexei
      Founder, CEO
      Zabbix Certified Trainer
      Zabbix Certified Specialist
      Zabbix Certified Professional
      • Sep 2004
      • 5654

      #3
      The problem is fixed in pre 1.4.5. Hopefully ZABBIX 1.4.5 will be released within 2-3 weeks.
      Alexei Vladishev
      Creator of Zabbix, Product manager
      New York | Tokyo | Riga

      • bbrendon
        Senior Member
        • Sep 2005
        • 870

        #4
        Is there any chance I can back-port the fix to v1.4.4? It's quite an annoying problem. I'm having to restart Zabbix every few days.

        Sounds very similar to this:
        Last edited by bbrendon; 14-01-2008, 19:52.
        Unofficial Zabbix Expert
        Blog, Corporate Site


        • bbrendon
          Senior Member
          • Sep 2005
          • 870

          #5
          Bump. I don't see this mentioned in SVN changelogs. Help!

          • Alexei
            Founder, CEO
            Zabbix Certified Trainer
            Zabbix Certified Specialist
            Zabbix Certified Professional
            • Sep 2004
            • 5654

            #6
            Here is the patch:

            Code:
            Index: src/zabbix_server/trapper/active.c
            ===================================================================
            --- src/zabbix_server/trapper/active.c    (revision 5199)
            +++ src/zabbix_server/trapper/active.c    (revision 5209)
            @@ -100,7 +100,6 @@
                     if( zbx_tcp_send_raw(sock,s) != SUCCEED )
                     {
                         zabbix_log( LOG_LEVEL_WARNING, "Error while sending list of active checks");
            -            zbx_tcp_close(sock);
                         return  FAIL;
                     }
                 }
            @@ -114,7 +113,6 @@
                 if( zbx_tcp_send_raw(sock,s) != SUCCEED )
                 {
                     zabbix_log( LOG_LEVEL_WARNING, "Error while sending list of active checks");
            -        zbx_tcp_close(sock);
                     return  FAIL;
                 }
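
The strace output in post #2 is consistent with a descriptor being closed in an error path and then used again by the caller. The sketch below illustrates that general pattern with plain POSIX calls. `send_list_buggy`, `send_list_fixed` and `fd_survives_failed_send` are hypothetical names, and this is not the Zabbix code. In the buggy variant the helper closes a descriptor it does not own, so the caller's next call on it fails with EBADF. The fixed variant only reports the failure and leaves closing to the owner, which is what removing the two zbx_tcp_close(sock) lines appears to do.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: on a (simulated) send failure it closes a
 * descriptor that belongs to the caller. */
static int send_list_buggy(int fd, int send_ok)
{
    if (!send_ok) {
        close(fd);      /* the kind of line the patch removes */
        return -1;
    }
    return 0;
}

/* Fixed variant: report the failure, leave the descriptor alone. */
static int send_list_fixed(int fd, int send_ok)
{
    (void)fd;
    return send_ok ? 0 : -1;
}

/* Returns 1 if the caller's descriptor is still usable after a failed
 * send, 0 if it has become invalid (EBADF), -1 if setup failed. */
int fd_survives_failed_send(int use_fixed)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    if (use_fixed)
        send_list_fixed(p[1], 0);
    else
        send_list_buggy(p[1], 0);

    /* The caller keeps using the descriptor, as a server loop would. */
    int ok = !(write(p[1], "x", 1) == -1 && errno == EBADF);

    close(p[0]);
    if (ok)
        close(p[1]);    /* otherwise the helper already closed it */
    return ok;
}
```

The design point is single ownership: whichever code opened or accepted the descriptor is the only place that closes it.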

            • bbrendon
              Senior Member
              • Sep 2005
              • 870

              #7
              Thanks. I used 1.4 from SVN yesterday, and today something went wrong because all my "server down" triggers went off at 4:25 AM.

              I'll use 1.4.4 with the above patch instead. Hopefully I'll have better luck.

              • Alexei
                Founder, CEO
                Zabbix Certified Trainer
                Zabbix Certified Specialist
                Zabbix Certified Professional
                • Sep 2004
                • 5654

                #8
                Originally posted by infinity005
                I'll use 1.4.4 with the above patch instead. Hopefully I'll have better luck.
                Please keep me updated.

                • NOB
                  Senior Member
                  Zabbix Certified Specialist
                  • Mar 2007
                  • 469

                  #9
                  Hi Pascal

                  Originally posted by pascalp
                  Hello,

                  I recently installed Zabbix 1.4.4 server and clients on my machines. My server is a virtual Debian machine with a guaranteed 1500 MHz Xeon and nearly 800 MB RAM.

                  [more info removed]

                  My problem:

                  From time to time the load average on my monitoring server rises to 6, even 7. When I restart the server, the load stays below 1 for a while, but often after a few minutes it starts rising again, to 5-6 and sometimes 7. I think, but I'm not sure, that this rise in load is triggered by an alert on one of my servers. I mean, after the first alert, the load starts climbing.

                  [more details removed]
                  Which virtual environment are you using?

                  Are you aware that time does not move forward smoothly on VMware ESX 2.x?
                  It seems to be slightly better with 3.x, but the clock will still be
                  a problem for the scheduler.

                  The time is the same for several seconds and then, suddenly, the
                  time "jumps" a whole minute into the future.
                  So, all tasks scheduled for this minute have to be done immediately.

                  This can be another source of heavy loads from time to time, I guess.
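
A small sketch of the effect described above, under the simplifying assumption that the scheduler just runs every task whose scheduled time falls between the previous clock reading and the current one. `newly_due` is a made-up name; this is not how the Zabbix scheduler is actually written.

```c
#include <stddef.h>

/* Count tasks that became due between two clock readings, i.e. whose
 * scheduled time lies in the interval (prev, now].
 * Hypothetical helper for illustration only. */
size_t newly_due(const long *next_run, size_t n, long prev, long now)
{
    size_t due = 0;
    for (size_t i = 0; i < n; i++) {
        if (next_run[i] > prev && next_run[i] <= now)
            due++;
    }
    return due;
}
```

With six tasks scheduled at seconds 10, 20, 30, 40, 50 and 60, a clock that ticks from 9 to 10 makes one task due, while a clock that jumps from 0 straight to 60 makes all six due in the same instant.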

                  Because we want to use ZABBIX for distributed monitoring, we will
                  not install a ZABBIX-Server in such an environment, not even for
                  testing purposes. AFAIK the synchronisation between the ZABBIX-Servers
                  depends on accurate time (clocks) on all involved systems.

                  Regards

                  Norbert.


                  • bbrendon
                    Senior Member
                    • Sep 2005
                    • 870

                    #10
                    Ahhh yes. Good ol' VMware and time problems. A nightmare.

                    Maybe use VirtualBox or Xen. I didn't realize ESX was plagued with the same problems as VMware Server.

                    Out of VMware Server, VirtualBox, and Xen, I've been most impressed with VirtualBox. The only thing wrong with VirtualBox is that its features are geared toward desktop use. It can be made into a server solution, but it takes more work from an admin to script it all nicely.

                    • pascalp
                      Junior Member
                      • Jan 2008
                      • 5

                      #11
                      Hello Norbert,

                      thanks for the tip. Until now I didn't know anything about the virtualization of our servers :P but my colleague said that our provider is using "Virtuozzo" and not VMware. Because I don't know very much about virtualization, I'll now do a little bit of homework and spam my provider to see if this is or will be a problem (after trying the patch mentioned above).

                      So, thanks for your help.
                      Regards,
                      Pascal


                      • bbrendon
                        Senior Member
                        • Sep 2005
                        • 870

                        #12
                        Originally posted by Alexei
                        Please keep me updated.
                        Those two lines seem to be a winner! I've been humming along nicely. Thanks for helping out!

                        • pascalp
                          Junior Member
                          • Jan 2008
                          • 5

                          #13
                          I confirm! Zabbix_server has now been running for a few hours and the load has generally not gone above about 1.5.

                          Thanks for the help,
                          Pascal


                          • rolandsym
                            Member
                            • Jul 2007
                            • 76

                            #14
                            zabbix patch?

                            Alexei,
                            Has this patch been incorporated into the development build yet?

                            Rolandsym


                            • Alexei
                              Founder, CEO
                              Zabbix Certified Trainer
                              Zabbix Certified Specialist
                              Zabbix Certified Professional
                              • Sep 2004
                              • 5654

                              #15
                              Sure, it was!