Zabbix Server not stable

  • yjong
    Junior Member
    • Jan 2013
    • 4

    #1

    Zabbix Server not stable

    Hello,

    I installed Zabbix 3.0.14 on a Centos 7.4 system, but it is not performing stable. I am not totally new with Zabbix, but I didn't use it for a few years. On the host page I continuously see SNMP timeout on ip and port 161. Graphs come with gaps. First I though I might missed something with SELinux, but even when I totally disabled SELinux the behavior is till there.

    Below some lines of the Zabbix_Server.log. Hope that some of you can help me out, where to look or what to investigate further. I did build the same on an Ubuntu 16.04 LTS system and it is working fine, also this is not our linux standard.

    Code:
    1763:20180109:100611.330 SNMP agent item "fgHardInvHostname" on host "nl-nw-fw01" failed: another network error, wait for 15 seconds
      1766:20180109:100613.326 SNMP agent item "fgHaStatsHostname" on host "nl-nw-fw02" failed: another network error, wait for 15 seconds
      1764:20180109:100615.327 SNMP agent item "fgVdEntName" on host "nl-nw-fw01" failed: another network error, wait for 15 seconds
      1749:20180109:100617.328 SNMP agent item "fgHwSensorEntName_FAN" on host "nl-nw-fw02" failed: another network error, wait for 15 seconds
      1755:20180109:100618.326 resuming SNMP agent checks on host "sw-external": connection restored
      1754:20180109:100619.329 SNMP agent item "fgHwSensorEntName_PSStatus" on host "nl-nw-fw01" failed: another network error, wait for 15 seconds
      1760:20180109:100621.327 SNMP agent item "fgVdEntName" on host "nl-nw-fw02" failed: another network error, wait for 15 seconds
      1750:20180109:100623.332 SNMP agent item "fgHwSensorEntName_FAN" on host "nl-nw-fw01" failed: another network error, wait for 15 seconds
      1757:20180109:100630.328 SNMP agent item "ifAdminStatus[GigabitEthernet1/0/27]" on host "sw-external" failed: first network error, wait for 15 seconds
      1773:20180109:100638.381 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1770:20180109:100638.381 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1772:20180109:100638.381 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1771:20180109:100638.410 cannot send list of active checks to "127.0.0.1": host [5wzab01.cloud.local] not found
      1759:20180109:100644.336 temporarily disabling SNMP agent checks on host "nl-nw-fw02": host unavailable
      1757:20180109:100650.339 temporarily disabling SNMP agent checks on host "nl-nw-fw01": host unavailable
      1769:20180109:100653.341 SNMP agent item "ifAdminStatus[GigabitEthernet1/0/28]" on host "sw-external" failed: another network error, wait for 15 seconds
      1752:20180109:100657.341 SNMP agent item "ifAdminStatus[GigabitEthernet2/0/25]" on host "sw-external" failed: another network error, wait for 15 seconds
      1767:20180109:100701.342 SNMP agent item "ifAdminStatus[GigabitEthernet2/0/7]" on host "sw-external" failed: another network error, wait for 15 seconds
      1745:20180109:100716.343 resuming SNMP agent checks on host "sw-external": connection restored
      1756:20180109:100728.342 SNMP agent item "ifAdminStatus[StackPort1]" on host "sw-external" failed: first network error, wait for 15 seconds
      1763:20180109:100751.351 SNMP agent item "ifAdminStatus[GigabitEthernet2/0/8]" on host "sw-external" failed: another network error, wait for 15 seconds
      1747:20180109:100755.352 SNMP agent item "ifAdminStatus[GigabitEthernet1/0/23]" on host "sw-external" failed: another network error, wait for 15 seconds
      1762:20180109:100759.353 SNMP agent item "ifAdminStatus[GigabitEthernet1/0/14]" on host "sw-external" failed: another network error, wait for 15 seconds
      1764:20180109:100814.354 resuming SNMP agent checks on host "sw-external": connection restored
      1755:20180109:100826.357 SNMP agent item "ifAdminStatus[GigabitEthernet1/0/11]" on host "sw-external" failed: first network error, wait for 15 seconds
      1772:20180109:100838.395 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1773:20180109:100838.395 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1771:20180109:100838.395 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1770:20180109:100838.422 cannot send list of active checks to "127.0.0.1": host [5wzab01.cloud.local] not found
      1752:20180109:100849.365 SNMP agent item "ifAdminStatus[GigabitEthernet2/0/23]" on host "sw-external" failed: another network error, wait for 15 seconds
      1755:20180109:100853.367 SNMP agent item "ifAdminStatus[GigabitEthernet2/0/5]" on host "sw-external" failed: another network error, wait for 15 seconds
      1747:20180109:100857.366 SNMP agent item "ifAdminStatus[GigabitEthernet2/0/3]" on host "sw-external" failed: another network error, wait for 15 seconds
      1767:20180109:100916.369 resuming SNMP agent checks on host "sw-external": connection restored
      1762:20180109:100920.372 SNMP agent item "ifAdminStatus[Vlan1]" on host "sw-external" failed: first network error, wait for 15 seconds
      1763:20180109:100943.378 SNMP agent item "ifAdminStatus[GigabitEthernet1/0/28]" on host "sw-external" failed: another network error, wait for 15 seconds
      1745:20180109:100947.382 SNMP agent item "ifAdminStatus[StackSub-St1-2]" on host "sw-external" failed: another network error, wait for 15 seconds
      1759:20180109:100951.380 SNMP agent item "ifAdminStatus[GigabitEthernet1/0/1]" on host "sw-external" failed: another network error, wait for 15 seconds
      1761:20180109:101014.384 temporarily disabling SNMP agent checks on host "sw-external": host unavailable
      1749:20180109:101014.384 enabling SNMP agent checks on host "sw-external": host became available
      1766:20180109:101018.385 SNMP agent item "ifAdminStatus[GigabitEthernet1/0/5]" on host "sw-external" failed: first network error, wait for 15 seconds
      1768:20180109:101024.387 enabling SNMP agent checks on host "nl-nw-fw02": host became available
      1747:20180109:101036.392 SNMP agent item "fgIdsVdEntName" on host "nl-nw-fw02" failed: first network error, wait for 15 seconds
      1773:20180109:101038.408 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1771:20180109:101038.408 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1772:20180109:101038.408 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1770:20180109:101038.434 cannot send list of active checks to "127.0.0.1": host [5wzab01.cloud.local] not found
      1765:20180109:101041.394 SNMP agent item "ifAdminStatus[GigabitEthernet1/0/17]" on host "sw-external" failed: another network error, wait for 15 seconds
      1753:20180109:101045.395 SNMP agent item "ifAdminStatus[GigabitEthernet2/0/25]" on host "sw-external" failed: another network error, wait for 15 seconds
      1756:20180109:101049.395 SNMP agent item "ifAdminStatus[StackPort1]" on host "sw-external" failed: another network error, wait for 15 seconds
      1763:20180109:101050.393 enabling SNMP agent checks on host "nl-nw-fw01": host became available
      1767:20180109:101059.397 SNMP agent item "fgHardInvHostname" on host "nl-nw-fw02" failed: another network error, wait for 15 seconds
      1768:20180109:101103.400 SNMP agent item "fgAvVdEntName" on host "nl-nw-fw02" failed: another network error, wait for 15 seconds
      1745:20180109:101106.398 SNMP agent item "fgHaStatsHostname" on host "nl-nw-fw01" failed: first network error, wait for 15 seconds
      1763:20180109:101107.400 SNMP agent item "fgHaStatsHostname" on host "nl-nw-fw02" failed: another network error, wait for 15 seconds
      1764:20180109:101112.402 temporarily disabling SNMP agent checks on host "sw-external": host unavailable
      1748:20180109:101130.407 temporarily disabling SNMP agent checks on host "nl-nw-fw02": host unavailable
      1755:20180109:101133.405 SNMP agent item "ifName" on host "nl-nw-fw01" failed: another network error, wait for 15 seconds
      1761:20180109:101137.408 SNMP agent item "fgHwSensorEntName_FAN" on host "nl-nw-fw01" failed: another network error, wait for 15 seconds
      1758:20180109:101141.411 SNMP agent item "fgVpnTunEntPhase1Name" on host "nl-nw-fw01" failed: another network error, wait for 15 seconds
      1753:20180109:101145.413 SNMP agent item "fgHardInvHostname" on host "nl-nw-fw01" failed: another network error, wait for 15 seconds
      1767:20180109:101212.410 temporarily disabling SNMP agent checks on host "nl-nw-fw01": host unavailable
      1746:20180109:101220.410 enabling SNMP agent checks on host "sw-external": host became available
      1752:20180109:101232.418 SNMP agent item "ifAdminStatus[StackSub-St1-1]" on host "sw-external" failed: first network error, wait for 15 seconds
      1774:20180109:101238.421 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1770:20180109:101238.421 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1771:20180109:101238.421 cannot send list of active checks to "172.16.30.32": host [5wzab01.cloud.local] not found
      1773:20180109:101238.446 cannot send list of active checks to "127.0.0.1": host [5wzab01.cloud.local] not found
      1752:20180109:101255.426 SNMP agent item "ifAdminStatus[GigabitEthernet2/0/16]" on host "sw-external" failed: another network error, wait for 15 seconds
      1769:20180109:101259.428 SNMP agent item "ifAdminStatus[GigabitEthernet2/0/13]" on host "sw-external" failed: another network error, wait for 15 seconds
      1768:20180109:101303.428 SNMP agent item "ifAdminStatus[GigabitEthernet1/0/10]" on host "sw-external" failed: another network error, wait for 15 seconds
    Last edited by yjong; 09-01-2018, 17:41.
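
    A log excerpt like the one above is easier to read once the events are counted per host. The following is only a minimal sketch, assuming the line format shown in the excerpt; the names `LINE_RE`, `classify` and `tally` are made up for illustration and are not part of Zabbix.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt in this post:
#   <pid>:<yyyymmdd>:<hhmmss.mmm> <message containing: on host "<name>">
LINE_RE = re.compile(
    r'^\s*\d+:\d{8}:\d{6}\.\d{3}\s+(?P<before>.*?)on host "(?P<host>[^"]+)"(?P<after>.*)$'
)


def classify(message):
    """Map a log message to a coarse event kind (hypothetical categories)."""
    if "network error" in message:
        return "network error"
    if "temporarily disabling" in message:
        return "disabled"
    if "resuming" in message or "enabling" in message:
        return "restored"
    return "other"


def tally(lines):
    """Count events per (host, kind); lines without 'on host "..."' are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        message = match.group("before") + match.group("after")
        counts[(match.group("host"), classify(message))] += 1
    return counts
```

    Calling `tally(open(path))` on the server log (the path depends on your installation) gives one counter entry per host and event kind, which makes it easier to see whether all SNMP hosts flap at the same rate or only some of them.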
  • AndrewSummer
    Junior Member
    • May 2017
    • 20

    #2
    Did SNMP ever work on this device?

    These network errors are not really uncommon. They can mean almost anything...

    • yjong
      Junior Member
      • Jan 2013
      • 4

      #3
      SNMP is working well: if I run snmpwalk -v 2c -c <community string> <ip address>, it reports back all the values of the devices without any issues.

      I installed two Zabbix environments, one on CentOS 7.4 and an older one on Ubuntu 16.04 LTS. I did exactly the same things, as far as that is possible on different operating systems. On Ubuntu everything works, while on CentOS 7.4 it fails for firewalls, switches, and even Linux hosts that I try to monitor via SNMP.

      As CentOS is the Linux standard for our company, I need to get it running on that operating system, so I am still investigating. My assumption is that differences in packages might be the issue, because CentOS uses different packages.
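
      A manual snmpwalk applies the command-line tool's own timeout and retry defaults, so a successful walk does not by itself show how quickly the device answers. Below is a rough sketch for timing a walk with an explicit timeout and no retries, using net-snmp's `-t` and `-r` options. The helper names, the address 192.0.2.1, and the community "public" are placeholders, and the chosen values are assumptions, not what the Zabbix poller uses internally.

```python
import subprocess
import time


def snmpwalk_cmd(host, community, oid, timeout_s=3, retries=0):
    """Build an snmpwalk command line with an explicit timeout and retry count."""
    return [
        "snmpwalk", "-v", "2c", "-c", community,
        "-t", str(timeout_s), "-r", str(retries),
        host, oid,
    ]


def timed_run(cmd, hard_limit_s=60):
    """Run cmd, discard its output, and return (exit code or None, elapsed seconds)."""
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=hard_limit_s,
        )
        code = proc.returncode
    except subprocess.TimeoutExpired:
        code = None  # the command ran longer than hard_limit_s
    return code, time.monotonic() - start


if __name__ == "__main__":
    # 1.3.6.1.2.1.2.2.1.7 is ifAdminStatus; host and community are placeholders.
    cmd = snmpwalk_cmd("192.0.2.1", "public", "1.3.6.1.2.1.2.2.1.7")
    print(timed_run(cmd))
```

      Running this from both the CentOS and the Ubuntu server against the same device would show whether response times differ between the two systems.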

      • AndrewSummer
        Junior Member
        • May 2017
        • 20

        #4
        Yes, probably the versions of snmpd and the SNMP libs are outdated on CentOS.

        I had severe issues with an old libipmi on Ubuntu 16.04; after switching to 17.10 I had no issues at all with the new libs.

        So I would suggest you try to get newer versions of snmpd and the required libs on your CentOS host.

        Good luck!

        • Mechanix
          Member
          • Jan 2017
          • 92

          #5
          I have noticed this behaviour when a couple of (host) items are not supported. Check your items for that particular host.

          • yjong
            Junior Member
            • Jan 2013
            • 4

            #6
            I updated the SNMP utils packages from 5.7.2 to 5.7.3, but this doesn't make any difference.

            As Mechanix says, there are some items that are not supported. Strange that this is handled fine on the Ubuntu system but causes issues on the CentOS system. Does anyone have ideas on what to investigate further and maybe how to solve it?

            Mechanix: I have noticed this behaviour when a couple of (host) items are not supported. Check your items for that particular host.

            • Mechanix
              Member
              • Jan 2017
              • 92

              #7
              Maybe some of the MIBs are missing on the system. On Ubuntu you can install them with
              Code:
              sudo apt-get install snmp-mibs-downloader
              On CentOS I think there is no equivalent package, so you have to make sure you've got the right MIBs in /usr/share/snmp/mibs
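
              One way to check this is to scan the MIB directory for the module names your templates rely on. This is only a sketch: the helper names are invented, it assumes each MIB file declares its module with the usual `NAME DEFINITIONS ::= BEGIN` header, and the module names in the usage line are guesses based on the item names in the log.

```python
import re
from pathlib import Path

# A MIB module normally starts with "<MODULE-NAME> DEFINITIONS ::= BEGIN".
MODULE_RE = re.compile(
    r"^\s*([A-Za-z][A-Za-z0-9-]*)\s+DEFINITIONS\s*::=\s*BEGIN",
    re.MULTILINE,
)


def installed_modules(mib_dir):
    """Return the set of module names declared by files directly in mib_dir."""
    found = set()
    for path in Path(mib_dir).iterdir():
        if not path.is_file():
            continue
        text = path.read_text(errors="replace")
        found.update(MODULE_RE.findall(text))
    return found


def missing_modules(mib_dir, wanted):
    """Return the wanted module names that no file in mib_dir declares, sorted."""
    return sorted(set(wanted) - installed_modules(mib_dir))


if __name__ == "__main__":
    # Module names here are assumptions, adjust them to your templates.
    print(missing_modules("/usr/share/snmp/mibs", ["IF-MIB", "FORTINET-FORTIGATE-MIB"]))
```

              An empty list means every listed module was found in that directory; it does not prove that the SNMP library actually loads them.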

              • AndrewSummer
                Junior Member
                • May 2017
                • 20

                #8
                Don't forget to configure your /etc/snmp/snmp.conf to accept the new MIBs,
                and of course restart snmpd and zabbix-server.

                • kaspars.mednis
                  Senior Member
                  Zabbix Certified Trainer
                  Zabbix Certified Specialist
                  Zabbix Certified Professional
                  • Oct 2017
                  • 349

                  #9
                  Hi,

                  What are your timeout settings in zabbix_server.conf?

                  This timeout affects SNMP polling as well; maybe it is too small to wait for the data to be gathered?

                  Regards,
                  Kaspars
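
                  To see which value is in effect, you can read the Timeout parameter straight from the config file. The sketch below is hedged: `read_timeout` is a made-up helper, the fallback of 3 seconds reflects my understanding of the documented default, and letting the last uncommented Timeout line win is an assumption about how duplicates are handled.

```python
def read_timeout(conf_text, default=3):
    """Return the Timeout value from zabbix_server.conf text, or default if unset."""
    value = default
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out parameters
        key, sep, val = line.partition("=")
        if sep and key.strip() == "Timeout":
            value = int(val.strip())  # assumption: the last occurrence wins
    return value
```

                  For example, `read_timeout(open("/etc/zabbix/zabbix_server.conf").read())` (the path may differ on your system) returns the configured number of seconds, which you can then compare between the CentOS and the Ubuntu installation.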
