Zabbix-proxy and data arrival delay.

  • TriCK
    Junior Member
    • Feb 2010
    • 7

    #1

    Zabbix-proxy and data arrival delay.

    Our organization uses Zabbix as its main monitoring solution. The Zabbix server monitors about 100 hosts directly, and some boxes (~15-20) are monitored through zabbix-proxy.

    We are seeing very strange behavior from zabbix-proxy. When we start the proxy, the main server begins receiving current data from the hosts behind it. Everything goes well for some time, but a few hours later the "last check" time for the proxied hosts begins to lag behind the actual time. Right after the proxy starts, the lag is 1-2 minutes (which is fine), but within 24 hours it grows to 20-25 minutes, and it keeps growing. Restarting the proxy solves the problem for a while...
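    As a rough sanity check on those numbers, the sketch below estimates how fast the lag grows and what fraction of the collected data reaches the server per unit of time. It assumes the midpoints of the quoted ranges (1.5 and 22.5 minutes) and a steady growth rate; the function name is hypothetical and nothing here is measured from a real system.

```python
# Rough estimate of how fast the proxy falls behind, using the figures
# quoted in the post. The midpoints and the steady rate are assumptions.

def lag_growth(start_lag_min, end_lag_min, elapsed_hours):
    """Return (minutes of lag gained per hour, delivered/collected ratio)."""
    gained = end_lag_min - start_lag_min
    per_hour = gained / elapsed_hours
    # If the lag grows by `gained` minutes over the period, the newest value
    # on the server advanced by only (elapsed - gained) minutes of data.
    elapsed_min = elapsed_hours * 60
    ratio = (elapsed_min - gained) / elapsed_min
    return per_hour, ratio


per_hour, ratio = lag_growth(start_lag_min=1.5, end_lag_min=22.5, elapsed_hours=24)
print(f"lag grows by about {per_hour:.2f} min/hour")
print(f"server receives about {ratio:.1%} of the data timeline per unit time")
```

    Under these assumptions the proxy delivers data only slightly slower than it collects it, which would fit a slow backlog building up rather than a hard stop.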

    I couldn't find a solution to this exact problem on this forum, so I decided to start my own thread. The changelog for the newer Zabbix 1.8 doesn't mention such a bugfix.

    Zabbix-server version is 1.6.6
    /etc/zabbix/zabbix_server.conf
    ############ GENERAL PARAMETERS #################

    #NodeID=0
    StartPollers=32
    StartIPMIPollers=10
    StartPollersUnreachable=3
    StartTrappers=32
    StartPingers=20
    #StartDiscoverers=1
    #StartHTTPPollers=1
    #ListenPort=10051
    #ListenIP=127.0.0.1
    #HousekeepingFrequency=1
    SenderFrequency=30
    #DisableHousekeeping=1
    DebugLevel=3
    Timeout=10
    #TrapperTimeout=5
    #UnreachablePeriod=45
    #UnavailableDelay=15
    #UnavailableDelay=60
    PidFile=/var/run/zabbix-server/zabbix_server.pid
    LogFile=/var/log/zabbix-server/zabbix_server.log
    #LogFileSize=1
    AlertScriptsPath=/etc/zabbix/alert.d/
    #FpingLocation=/usr/sbin/fping
    #PingerFrequency=60

    DBHost=localhost
    DBName=zabbix
    DBUser=zabbix
    DBPassword=passwd
    #DBSocket=/tmp/mysql.sock
    Zabbix-proxy version 1.6.5

    ############ GENERAL PARAMETERS #################

    Server=/correct and checked server name/
    ServerPort=10051
    Hostname=/correct proxy hostname/

    StartPollers=5
    #StartIPMIPollers=0
    StartPollersUnreachable=5
    #StartTrappers=5
    StartPingers=5
    #StartDiscoverers=1
    #StartHTTPPollers=1
    #ListenPort=10051
    #SourceIP=
    #ListenIP=127.0.0.1
    #HeartbeatFrequency=60
    ConfigFrequency=180
    HousekeepingFrequency=1
    #SenderFrequency=30
    #ProxyLocalBuffer=2
    ProxyOfflineBuffer=2
    DebugLevel=3
    Timeout=5
    #TrapperTimeout=5
    #UnreachablePeriod=45
    #UnavailableDelay=15
    #UnavailableDelay=60
    PidFile=/var/run/zabbix-proxy/zabbix_proxy.pid
    LogFile=/var/log/zabbix-proxy/zabbix_proxy.log
    #LogFileSize=1
    AlertScriptsPath=/home/zabbix/bin/
    #ExternalScripts=/etc/zabbix/externalscripts
    FpingLocation=/usr/sbin/fping
    #Fping6Location=/usr/sbin/fping6
    #TmpDir=/tmp
    #PingerFrequency=60
    DBHost=localhost
    #... DBPassword are ignored.
    DBName=zabbix_proxy
    DBUser=zabbix_proxy
    DBPassword=passwd
    #DBSocket=/tmp/mysql.sock
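    Since most lines in these files are commented out, it can help to list only the settings that are actually active. The sketch below does that for a few lines copied from the proxy config above; `active_settings` is a hypothetical helper, and it assumes the simple `Key=Value` format with `#` comments shown in this post.

```python
def active_settings(conf_text):
    """Return the uncommented Key=Value pairs of a Zabbix-style config."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not Key=Value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# A few lines copied from the proxy config in this post.
proxy_conf = """
StartPollers=5
#StartTrappers=5
ConfigFrequency=180
#SenderFrequency=30
#ProxyLocalBuffer=2
ProxyOfflineBuffer=2
Timeout=5
"""

for key, value in active_settings(proxy_conf).items():
    print(f"{key} = {value}")
```

    Anything missing from the output runs with its built-in default.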
    Zabbix agent versions are 1.4.6 (CentOS distribution package) and 1.6.8 (Windows).
    CentOS agent config:
    Server=::ffff:192.168.125.178,192.168.125.178
    ServerPort=10051
    Hostname=/correct hostname/
    ListenPort=10050
    StartAgents=5
    DebugLevel=3
    PidFile=/var/run/zabbix/zabbix_agentd.pid
    LogFile=/var/log/zabbix/zabbix_agentd.log
    Timeout=3
    Note: DNS doesn't work on the proxied hosts (they can only resolve their own hostname from /etc/hosts), but it works on the host running zabbix-proxy.

    Proxy log after start
    29657:20100209:112535 Starting zabbix_proxy. ZABBIX 1.6.5 (revision 7442).
    29657:20100209:112535 **** Enabled features ****
    29657:20100209:112535 SNMP monitoring: YES
    29657:20100209:112535 WEB monitoring: YES
    29657:20100209:112535 ODBC: YES
    29657:20100209:112535 IPv6 support: YES
    29657:20100209:112535 **************************
    29659:20100209:112535 server #1 started [Configuration syncer]
    29660:20100209:112535 server #2 started [Datasender]
    29663:20100209:112535 server #3 started [Poller. SNMP:YES]
    29670:20100209:112535 server #9 started [Trapper]
    29671:20100209:112535 server #10 started [Trapper]
    29672:20100209:112535 server #11 started [Trapper]
    29677:20100209:112535 server #15 started [ICMP pinger]
    29674:20100209:112535 server #13 started [ICMP pinger]
    29673:20100209:112535 server #12 started [Trapper]
    29678:20100209:112535 server #16 started [ICMP pinger]
    29679:20100209:112535 server #17 started [ICMP pinger]
    29666:20100209:112535 server #5 started [Poller. SNMP:YES]
    29687:20100209:112535 server #18 started [Housekeeper]
    29687:20100209:112535 Executing housekeeper
    29667:20100209:112535 server #6 started [Poller. SNMP:YES]
    29665:20100209:112535 server #4 started [Poller. SNMP:YES]
    29669:20100209:112535 server #8 started [Trapper]
    29690:20100209:112535 server #21 started [Poller for unreachable hosts. SNMP:YES]
    29675:20100209:112535 server #14 started [ICMP pinger]
    29691:20100209:112535 server #22 started [Poller for unreachable hosts. SNMP:YES]
    29692:20100209:112535 server #23 started [Poller for unreachable hosts. SNMP:YES]
    29657:20100209:112535 server #0 started [Heartbeat sender]
    29689:20100209:112535 server #20 started [Poller for unreachable hosts. SNMP:YES]
    29700:20100209:112535 server #24 started [HTTP Poller]
    29668:20100209:112535 server #7 started [Poller. SNMP:YES]
    29701:20100209:112535 server #25 started [Discoverer. SNMP:YES]
    29688:20100209:112535 server #19 started [Poller for unreachable hosts. SNMP:YES]
    29687:20100209:112536 Deleted 30191 records from history [0.731471 seconds]
    29668:20100209:112537 Item [b137.organization.com:vfs.dev.write[sda,,avg1]] error: Not supported by ZABBIX agent
    29668:20100209:112537 Parameter [vfs.dev.write[sda,,avg1]] is not supported by agent on host [b137.organization.com] Old status [0]
    29668:20100209:112537 Item [b137.organization.com:vfs.file.time[/var/run/puppet/puppetd.stamp]] error: Not supported by ZABBIX agent
    29668:20100209:112537 Parameter [vfs.file.time[/var/run/puppet/puppetd.stamp]] is not supported by agent on host [b137.organization.com] Old status [0]
    29666:20100209:112538 Item [b134.organization.com:perf_counter[\Physical Disk(_Total)\Avg. Disk Read Queue Length]] error: Not supported by ZABBIX agent
    29666:20100209:112538 Parameter [perf_counter[\Physical Disk(_Total)\Avg. Disk Read Queue Length]] is not supported by agent on host [b134.organization.com] Old status [0]
    29665:20100209:112548 Item [b127.organization.com:proc.num[sshd]] error: Get value from agent failed: ZBX_TCP_READ() failed [Interrupted system call]
    29665:20100209:112548 Host [b127.organization.com]: first network error, wait for 15 seconds
    29665:20100209:112548 Parameter [proc.num[sshd]] will be checked after 240 seconds on host [b127.organization.com]
    29666:20100209:112548 Item [b127.organization.com:proc.num[httpd]] error: Get value from agent failed: ZBX_TCP_READ() failed [Interrupted system call]
    29666:20100209:112548 Host [b127.organization.com]: first network error, wait for 15 seconds
    29666:20100209:112548 Parameter [proc.num[httpd]] will be checked after 240 seconds on host [b127.organization.com]
    I know there are a lot of incorrect keys and triggers, but I don't think they are the source of the problem.

    The Zabbix agents on CentOS write this line to their log:
    20456:20100209:111421 Getting list of active checks failed. Will retry after 60 seconds
    But the parameters are still monitored. I don't know how to stop this message.

    Please help me solve these problems.
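
    To see whether the same hosts keep producing errors as the lag grows, the proxy log can be summarized per host. The sketch below is a minimal example that assumes the "PID:YYYYMMDD:HHMMSS message" line format and the message wording seen in the excerpt above; `summarize` is a hypothetical helper, not part of Zabbix.

```python
import re
from collections import Counter

# Matches the "PID:YYYYMMDD:HHMMSS message" prefix seen in the log excerpt.
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6})\s+(.*)$")
NET_ERR_RE = re.compile(r"Host \[([^\]]+)\]: first network error")
UNSUPPORTED_RE = re.compile(r"is not supported by agent on host \[([^\]]+)\]")


def summarize(lines):
    """Count network errors and unsupported items per host."""
    net_errors, unsupported = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not a log record
        message = m.group(4)
        hit = NET_ERR_RE.search(message)
        if hit:
            net_errors[hit.group(1)] += 1
        hit = UNSUPPORTED_RE.search(message)
        if hit:
            unsupported[hit.group(1)] += 1
    return net_errors, unsupported


# Three lines copied from the proxy log in this post.
sample = [
    "29668:20100209:112537 Parameter [vfs.dev.write[sda,,avg1]] is not supported by agent on host [b137.organization.com] Old status [0]",
    "29668:20100209:112537 Parameter [vfs.file.time[/var/run/puppet/puppetd.stamp]] is not supported by agent on host [b137.organization.com] Old status [0]",
    "29666:20100209:112548 Host [b127.organization.com]: first network error, wait for 15 seconds",
]

net_errors, unsupported = summarize(sample)
print("network errors:", dict(net_errors))
print("unsupported items:", dict(unsupported))
```

    For a real run, an open file object for the proxy's LogFile (here /var/log/zabbix-proxy/zabbix_proxy.log) can be passed in place of `sample`.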
  • stemasie
    Junior Member
    • Sep 2009
    • 18

    #2
    hey,

    take a look at this: http://www.zabbix.com/forum/showthread.php?t=12184
    maybe it's the same problem as with sqlite. try recreating the DB file manually

    • TriCK
      Junior Member
      • Feb 2010
      • 7

      #3
      Originally posted by stemasie
      hey,

      take a look at this.
      maybe it's the same problem with sqlite. try to recreate the DB file manually

      http://www.zabbix.com/forum/showthread.php?t=12184
      I have MySQL. That topic doesn't have any real solutions, only guesses and recommendations that didn't solve the problem. But thanks for the comment.

      • stemasie
        Junior Member
        • Sep 2009
        • 18

        #4
        it solved the problem for me perfectly

        • TriCK
          Junior Member
          • Feb 2010
          • 7

          #5
          Originally posted by stemasie
          it solved the problem for me perfectly
          Are you using sqlite or mysql?

          I do not use autocreation. I use MySQL. I created the MySQL database manually, ran "cat initial_db.sql | mysql zabbix_proxy" by hand, and granted all privileges on zabbix_proxy.* to the zabbix_proxy user.

          • stemasie
            Junior Member
            • Sep 2009
            • 18

            #6
            I'm using mysql too (as of 5 minutes ago...).
            try creating a new DB (for example, zabbix2) and running the following steps again:

            shell> mysql -u<username> -p<password>
            mysql> create database zabbix2 character set utf8;
            mysql> quit;
            shell> cd create/schema
            shell> cat mysql.sql | mysql -u<username> -p<password> zabbix2

            ---edit---
            don't forget to set the new DBName in zabbix_proxy.conf

            • stemasie
              Junior Member
              • Sep 2009
              • 18

              #7
              I have the same issue with mysql now...

              • TriCK
                Junior Member
                • Feb 2010
                • 7

                #8
                Originally posted by stemasie
                I have the same issue with mysql now...
                So your solution of recreating the DB does not help?

                I can't find any difference between your recreation steps and the initial setup I did at the beginning... and the solution did not help as it was supposed to.

                Guys, do you have any ideas?

                • stemasie
                  Junior Member
                  • Sep 2009
                  • 18

                  #9
                  it works after a recreation... but only for a while; 1-2 times a week I have the problem again.
                  I've started troubleshooting again; I assume the problem depends on some cache option or a connection limit.

                  @developers: what do you think? it's not the first thread about this problem. do you have any ideas?

                  • fjrial
                    Senior Member
                    • Feb 2010
                    • 140

                    #10
                    not getting values

                    Hi:

                    I think my problem is similar to yours. I'm in the process of moving our NOC software to Zabbix.

                    I'm monitoring several elements. One of them is a Juniper M20 router, monitored via SNMP.
                    I monitor all of the router's interfaces and have created graphs for all of them.

                    Zabbix starts monitoring everything fine, but after some time it stops getting values from some interfaces, so the graphs have gaps. If I clear the history of the graphed item, it starts monitoring fine again, but after an indeterminate time it stops monitoring the interface.

                    Any help will be appreciated.
                    thanks

                    • stemasie
                      Junior Member
                      • Sep 2009
                      • 18

                      #11
                      That sounds similar...
                      Have you tried raising the "max_connections" parameter in the MySQL configuration file (my.cnf)?

                      • TriCK
                        Junior Member
                        • Feb 2010
                        • 7

                        #12
                        Originally posted by stemasie
                        That sounds similar...
                        Have you tried raising the "max_connections" parameter in the MySQL configuration file (my.cnf)?
                        Guys, let's talk about zabbix-proxy.

                        I haven't found a solution yet. I fixed the error messages about active checks on the Zabbix agents, but they weren't the cause of the lag.

                        • stemasie
                          Junior Member
                          • Sep 2009
                          • 18

                          #13
                          Which database engine are you using with zabbix-proxy?
                          --edit--
                          I have this problem once a day now, at the same time.
                          Last edited by stemasie; 18-02-2010, 10:02.

                          • TriCK
                            Junior Member
                            • Feb 2010
                            • 7

                            #14
                            Originally posted by stemasie
                            Which database engine are you using with zabbix-proxy?
                            --edit--
                            I have this problem once a day now, at the same time.
                            As I said, I use MySQL from Debian Lenny.

                            • TriCK
                              Junior Member
                              • Feb 2010
                              • 7

                              #15
                              Zabbix gurus, please take a look at this problem.
