Incomplete SNMP Graphs

  • ptera
    Senior Member
    • Oct 2014
    • 109

    #1

    Incomplete SNMP Graphs

    Zabbix 3.0.3. © 2001–2016, Zabbix SIA



    Number of hosts (enabled/disabled/templates) 288 250 / 0 / 38
    Number of items (enabled/disabled/not supported) 12703 4592 / 7883 / 228
    Number of triggers (enabled/disabled [problem/ok]) 343 343 / 0 [1 / 342]
    Number of users (online) 3 2
    Required server performance, new values per second 68.81

    Can anyone help?

    I do not have this problem on my 2.4.5 installation:

    Number of hosts (enabled/disabled/templates) 983 917 / 11 / 55
    Number of items (enabled/disabled/not supported) 10743 8771 / 1668 / 304
    Number of triggers (enabled/disabled [problem/ok]) 3531 3524 / 7 [2 / 3522]
    Number of users (online) 8 3
    Required server performance, new values per second 146.68

    The server log on the 3.0 system is full of entries like these:
    1623:20160817:094540.850 SNMP agent item "ifOperStatus[wifi0]" on host ".47 Ptera138" failed: first network error, wait for 15 seconds
    1652:20160817:094555.402 resuming SNMP agent checks on host ".47 Ptera138": connection restored
    1633:20160817:094605.044 SNMP agent item "ifOutOctets[eth0]" on host ".44 Ptera143" failed: first network error, wait for 15 seconds
    1654:20160817:094620.418 resuming SNMP agent checks on host ".44 Ptera143": connection restored
    1625:20160817:095446.238 SNMP agent item "ifOutOctets[lo]" on host ".44 Ptera130" failed: first network error, wait for 15 seconds
    1664:20160817:095501.653 resuming SNMP agent checks on host ".44 Ptera130": connection restored
    1630:20160817:100444.798 SNMP agent item "ifAdminStatus[lo]" on host ".44 Ptera130" failed: first network error, wait for 15 seconds
    1646:20160817:100459.059 resuming SNMP agent checks on host ".44 Ptera130": connection restored
    1630:20160817:101312.801 SNMP agent item "ifAdminStatus[tunl0]" on host ".46 Ptera125" failed: first network error, wait for 15 seconds
    1651:20160817:101327.987 resuming SNMP agent checks on host ".46 Ptera125": connection restored
    1628:20160817:101516.478 SNMP agent item "ifInOctets[br0]" on host ".47 Ptera138" failed: first network error, wait for 15 seconds
    1638:20160817:101531.052 resuming SNMP agent checks on host ".47 Ptera138": connection restored
    1628:20160817:102412.167 SNMP agent item "ifOutErrors[wifi0]" on host ".47 Ptera131" failed: first network error, wait for 15 seconds
    1663:20160817:102427.173 resuming SNMP agent checks on host ".47 Ptera131": connection restored
    1626:20160817:102512.848 SNMP agent item "ifInErrors[eth0]" on host ".47 Ptera138" failed: first network error, wait for 15 seconds
    1655:20160817:102527.212 resuming SNMP agent checks on host ".47 Ptera138": connection restored
    1635:20160817:102634.709 SNMP agent item "ifOperStatus[wifi0]" on host ".44 Ptera143" failed: first network error, wait for 15 seconds
    1654:20160817:102649.230 resuming SNMP agent checks on host ".44 Ptera143": connection restored
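
To see whether the errors cluster on a few hosts, the log lines can be tallied per host. The following is only a minimal sketch, assuming Python 3 and the log line format shown above; the function name `count_network_errors` and the `SAMPLE` list are made up for illustration and are not part of Zabbix.

```python
import re
from collections import Counter

# Matches failure lines in the format shown in the post, for example:
# 1623:20160817:094540.850 SNMP agent item "ifOperStatus[wifi0]" on host ".47 Ptera138" failed: first network error, wait for 15 seconds
FAILURE = re.compile(
    r'^\s*\d+:\d{8}:\d{6}\.\d{3} '
    r'(?P<kind>SNMP|IPMI) agent item "(?P<item>[^"]+)" '
    r'on host "(?P<host>[^"]+)" failed: (?:first|another) network error'
)


def count_network_errors(lines):
    """Return a Counter mapping host name to number of network-error lines."""
    per_host = Counter()
    for line in lines:
        match = FAILURE.match(line)
        if match:
            per_host[match.group("host")] += 1
    return per_host


# A few lines copied from the log above; "resuming" lines are ignored.
SAMPLE = [
    '1623:20160817:094540.850 SNMP agent item "ifOperStatus[wifi0]" on host ".47 Ptera138" failed: first network error, wait for 15 seconds',
    '1652:20160817:094555.402 resuming SNMP agent checks on host ".47 Ptera138": connection restored',
    '1633:20160817:094605.044 SNMP agent item "ifOutOctets[eth0]" on host ".44 Ptera143" failed: first network error, wait for 15 seconds',
    '1628:20160817:101516.478 SNMP agent item "ifInOctets[br0]" on host ".47 Ptera138" failed: first network error, wait for 15 seconds',
]

if __name__ == "__main__":
    for host, errors in count_network_errors(SAMPLE).most_common():
        print(f"{errors:3d}  {host}")
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the path to the server log depends on the installation.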

    I followed all the instructions that I could find on the forum and by googling.
    Or is this something we need to pay for to get fixed?

    Thanks
  • Ruddimaster
    Member
    • Dec 2016
    • 49

    #2
    I have exactly the same issue...

    I think the hardware is ok (HP DL380 Gen9)...
    top - 19:00:15 up 4:37, 1 user, load average: 0,09, 0,07, 0,07
    Tasks: 457 total, 1 running, 455 sleeping, 0 stopped, 1 zombie
    %CPU(s): 0,5 us, 0,1 sy, 0,0 ni, 99,3 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
    KiB Mem : 16298484 total, 6355028 free, 3209660 used, 6733796 buff/cache


    ------------------------------------
    1294:20170105:194111.062 SNMP agent item "ifDescr" on host "Server2" failed: first network error, wait for 30 seconds
    1294:20170105:194133.082 resuming SNMP agent checks on host "Switch1": connection restored
    1296:20170105:194141.270 resuming SNMP agent checks on host "Switch2": connection restored
    1298:20170105:194203.261 SNMP agent item "hrProcessorLoad" on host "Switch2": first network error, wait for 30 seconds
    1360:20170105:194203.756 IPMI agent item "02-CPU1" on host "Server2" failed: first network error, wait for 30 seconds
    1351:20170105:194247.114 item "30456734-1234-1234-3230-353042285300:Fan1" became supported
    1294:20170105:194303.144 temporarily disabling SNMP agent checks on host "Switch2": host unavailable
    1294:20170105:194303.150 resuming IPMI agent checks on host "Server1": connection restored
    1294:20170105:194304.152 IPMI agent item "03-CPU2" on host "Server1" failed: first network error, wait for 30 seconds
    1353:20170105:194308.136 item "30456734-1234-1234-3230-353042285300":Fans" became supported
    ------------




    Set in zabbix_server.conf:
    Timeout=30
    StartPollers=100

    ------------------------
    In Administration -> Queue I sometimes see 3 items queued.

    Number of hosts (enabled/disabled/templates) 143 81 / 2 / 60
    Number of items (enabled/disabled/not supported) 6811 6277 / 23 / 511
    Number of triggers (enabled/disabled [problem/ok]) 2581 2551 / 30 [3 / 2548]
    Number of users (online) 9 2
    Required server performance, new values per second 100.41
    -----------------------------
    I have the same issue with IPMI.

    Zabbix 3.0.7.

    Zabbix Queue
    2017-01-05 19:37:52 1483641472 3
    2017-01-05 19:27:52 1483640872 98
    2017-01-05 19:17:52 1483640272 184
    2017-01-05 19:07:52 1483639672 187
    2017-01-05 18:57:52 1483639072 96
    2017-01-05 18:47:52 1483638472 184
    2017-01-05 18:37:52 1483637872 88
    2017-01-05 18:27:52 1483637272 97
    2017-01-05 18:17:52 1483636672 4
    2017-01-05 18:07:52 1483636072 0
    2017-01-05 17:57:52 1483635472 1
    2017-01-05 17:47:52 1483634872 1
    2017-01-05 17:37:52 1483634272 14
    2017-01-05 17:27:52 1483633672 2
    2017-01-05 17:17:52 1483633072 1
    2017-01-05 17:07:52 1483632472 0
    2017-01-05 16:57:52 1483631872 184
    2017-01-05 16:47:52 1483631272 0
    2017-01-05 16:37:52 1483630672 2
    2017-01-05 16:27:52 1483630072 1
    2017-01-05 16:17:52 1483629472 0
    2017-01-05 16:07:52 1483628872 93
    2017-01-05 15:57:52 1483628272 2
    2017-01-05 15:47:52 1483627672 0
    2017-01-05 15:37:52 1483627072 2

    Values processed by Zabbix server per second
    2017-01-05 19:39:57 1483641597 96.7451
    2017-01-05 19:38:57 1483641537 93.5207
    2017-01-05 19:37:57 1483641477 96.3181
    2017-01-05 19:36:57 1483641417 95.2895
    2017-01-05 19:35:57 1483641357 97.5815
    2017-01-05 19:34:57 1483641297 94.3979
    2017-01-05 19:33:57 1483641237 95.1267
    2017-01-05 19:32:57 1483641177 95.0379
    2017-01-05 19:31:57 1483641117 95.9417
    2017-01-05 19:30:57 1483641057 106.195
    2017-01-05 19:29:57 1483640997 92.7352
    2017-01-05 19:28:57 1483640937 96.6114
    2017-01-05 19:27:57 1483640877 95.9258
    2017-01-05 19:26:57 1483640817 95.5866
    2017-01-05 19:25:57 1483640757 95.3177
    2017-01-05 19:24:57 1483640697 94.2188
    2017-01-05 19:23:57 1483640637 95.1791
    2017-01-05 19:22:57 1483640577 95.4446
    2017-01-05 19:21:57 1483640517 95.6287
    2017-01-05 19:20:57 1483640457 94.8533
    2017-01-05 19:19:57 1483640397 96.9811
    2017-01-05 19:18:57 1483640337 96.8871

    Ubuntu 16.04.


    iftop within 2 minutes:
    TX:    cum: 6,42MB  peak: 835Kb
    RX:    cum: 9,32MB  peak: 1,80Mb
    TOTAL: cum: 15,7MB  peak: 2,56Mb

    MySQL is on the same device.

    • Ruddimaster
      Member
      • Dec 2016
      • 49

      #3
      I had another issue with MySQL.

      I set max_connections to 2000, but the value does not take effect. It stays stuck at 214.
      I found these articles about it:

      and
      https://support.plesk.com/hc/en-us/articles/213393029

      Could Zabbix be affected by the same problem?
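
A commonly reported explanation for the value 214 is the open-files limit of the mysqld process: when the limit is too low for the requested settings, MySQL lowers max_connections at startup. The arithmetic below is only a sketch under stated assumptions, namely a soft limit of 1024 file descriptors, 10 reserved descriptors, and a minimum table cache of 400 tables at two descriptors each. These constants come from descriptions of MySQL's startup behaviour and were not confirmed in this thread; `effective_max_connections` is a hypothetical helper name.

```python
# Assumed constants (verify against the MySQL version in use).
RESERVED_DESCRIPTORS = 10      # descriptors MySQL keeps for itself
MIN_TABLE_OPEN_CACHE = 400     # smallest table cache MySQL falls back to
DESCRIPTORS_PER_TABLE = 2      # descriptors counted per cached table


def effective_max_connections(requested: int, open_files_limit: int) -> int:
    """Estimate the max_connections value left after the descriptor budget."""
    ceiling = (
        open_files_limit
        - RESERVED_DESCRIPTORS
        - DESCRIPTORS_PER_TABLE * MIN_TABLE_OPEN_CACHE
    )
    # Never report less than one connection, never more than requested.
    return max(1, min(requested, ceiling))


if __name__ == "__main__":
    # Requested 2000 connections with an assumed limit of 1024 descriptors.
    print(effective_max_connections(2000, 1024))  # 214
```

If this is the cause, raising the open-files limit for the MySQL service should let the configured value take effect; how to do that depends on the init system and distribution.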

      • Ruddimaster
        Member
        • Dec 2016
        • 49

        #4
        Now I have a fresh installation on VMWare from scratch
        Ubuntu 16.04.1
        Zabbix 3.0.7


        I followed this guide (German)


        I added 6 HPE iLOs with VM hypervisors on them, plus 2 switches, 4 routers, and the Zabbix server itself.

        -------------------
        Zabbix server is running Yes localhost:10051
        Number of hosts (enabled/disabled/templates) 80 13 / 23 / 44
        Number of items (enabled/disabled/not supported) 2121 1882 / 0 / 239
        Number of triggers (enabled/disabled [problem/ok]) 382 326 / 56 [3 / 323]
        Number of users (online) 2 1
        Required server performance, new values per second 23.54
        --------------------
        (I have disabled the discovered guests)


        Now I have the same issue:
        2923:20170107:180028.651 SNMP agent item "hrProcessorLoad" on host "IBNMJU11" failed: first network error, wait for 15 seconds
        2924:20170107:180029.990 SNMP agent item "hrStorageDescr" on host "IBNMJU12" failed: first network error, wait for 15 seconds
        2928:20170107:180055.653 SNMP agent item "hrProcessorLoad" on host "IBNMJU11" failed: another network error, wait for 15 seconds
        2926:20170107:180057.007 SNMP agent item "ifDescr" on host "IBNMJU12" failed: another network error, wait for 15 seconds
        2929:20170107:180059.898 SNMP agent item "ifDescr" on host "IBNMJU11" failed: another network error, wait for 15 seconds
        2930:20170107:180102.961 SNMP agent item "hrProcessorLoad" on host "IBNMJU12" failed: another network error, wait for 15 seconds
        2961:20170107:180105.957 item "123456789-1234-1234-1234-343330335235:C1P1IBay1" became supported
        2927:20170107:180107.009 SNMP agent item "hrStorageDescr" on host "IBNMJU12" failed: another network error, wait for 15 seconds
        2959:20170107:180108.961 item "123456789-1234-1234-1234-343330335235:C1P1IBay4" became supported
        2961:20170107:180112.966 item "123456789-1234-1234-1234-343330335235:C1P2IBay8" became supported
        2926:20170107:180126.020 temporarily disabling SNMP agent checks on host "IBNMJU11": host unavailable
        2970:20170107:180127.590 IPMI agent item "Fan3DutyCycle" on host "123456789-1234-1234-1234-343330335235" failed: first network error, wait for 15 seconds
        2929:20170107:180134.917 temporarily disabling SNMP agent checks on host "IBNMJU12": host unavailable
        2927:20170107:180143.203 resuming IPMI agent checks on host "123456789-1234-1234-1234-343330335235": connection restored
        2963:20170107:180146.985 item "123456789-1234-1234-1234-343330335235:SysHealthLED" became supported
        2951:20170107:180217.575 executing housekeeper
        2951:20170107:180217.593 housekeeper [deleted 0 hist/trends, 0 items, 0 events, 0 sessions, 0 alarms, 0 audit items in 0.017156 sec, idle for 1 hour(s)]
        2968:20170107:180232.644 IPMI agent item "Fan2" on host "123456789-1234-1234-1234-35304b485351" failed: first network error, wait for 15 seconds
        2926:20170107:180234.237 enabling SNMP agent checks on host "IBNMJU11": host became available
        2929:20170107:180242.962 enabling SNMP agent checks on host "IBNMJU12": host became available
        2927:20170107:180250.234 SNMP agent item "hrProcessorLoad" on host "IBNMJU11" failed: first network error, wait for 15 seconds
        2926:20170107:180250.255 IPMI agent item "Fan3" on host "123456789-1234-1234-1234-35304b485351" failed: another network error, wait for 15 seconds
        2929:20170107:180254.972 SNMP agent item "hrStorageDescr" on host "IBNMJU12" failed: first network error, wait for 15 seconds
        2927:20170107:180305.257 IPMI agent item "Fan5" on host "123456789-1234-1234-1234-35304b485351" failed: another network error, wait for 15 seconds
        2926:20170107:180317.270 SNMP agent item "ifDescr" on host "IBNMJU11" failed: another network error, wait for 15 seconds
        2927:20170107:180321.272 SNMP agent item "hrStorageDescr" on host "IBNMJU11" failed: another network error, wait for 15 seconds
        2928:20170107:180321.715 SNMP agent item "ifDescr" on host "IBNMJU12" failed: another network error, wait for 15 seconds
        2928:20170107:180324.016 resuming IPMI agent checks on host "123456789-1234-1234-1234-35304b485351": connection restored

        I get the same log entries as in the production environment (described above).

        Could this be an issue with Ubuntu 16.04.1 LTS?

        • bbrendon
          Senior Member
          • Sep 2005
          • 870

          #5
          Originally posted by Ruddimaster
          I had another issue with MySQL.

          I set max_connections to 2000, but the value does not take effect. It stays stuck at 214.
          I found these articles about it:

          and
          https://support.plesk.com/hc/en-us/articles/213393029

          Could Zabbix be affected by the same problem?
          The number of connections Zabbix makes to MySQL is a little less than the number of zabbix_server processes.
          Unofficial Zabbix Expert
          Blog, Corporate Site

          • Ruddimaster
            Member
            • Dec 2016
            • 49

            #6
            Yes, you are right...

            ...but now it is set to 20000 and Zabbix uses 600, and I still have this problem.
            The new installation (with fewer pollers) also has this issue.

            Am I the only one with this issue on the combination of Ubuntu 16.04.1 and Zabbix 3.0.7?

            • Mechanix
              Member
              • Jan 2017
              • 92

              #7
              I have the same issue with 3.2 and it's driving me nuts.

              • kloczek
                Senior Member
                • Jun 2006
                • 1771

                #8
                Originally posted by Mechanix
                I have the same issue with 3.2 and it's driving me nuts.
                Such issues have been discussed many times.
                It is not a Zabbix issue but a performance issue on the SNMP agent side.
                Query the same OID or set of OIDs 10 or more times and you will see the same timeouts even with snmpget/snmpwalk.
                You need to contact the support of HPE, or whichever hardware vendor you use, to resolve this issue.
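
The repeated-query test suggested above can be scripted. This is only a rough sketch, assuming Python 3, the Net-SNMP `snmpget` command on the PATH, SNMP v2c, and a community string of `public`; none of these were stated in the thread, and the helper names `snmpget_once` and `count_failed_probes` are made up for illustration.

```python
import subprocess
import sys


def snmpget_once(host, oid, community="public", timeout_s=2, retries=0):
    """Run one snmpget; return True on success, False on timeout or error.

    The community string and SNMP version are assumptions; adjust as needed.
    """
    cmd = [
        "snmpget", "-v", "2c", "-c", community,
        "-t", str(timeout_s), "-r", str(retries),
        host, oid,
    ]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            # Safety margin on top of snmpget's own timeout handling.
            timeout=timeout_s * (retries + 1) + 5,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def count_failed_probes(probe, attempts=10):
    """Call probe() the given number of times and count the False results."""
    return sum(1 for _ in range(attempts) if not probe())


if __name__ == "__main__":
    if len(sys.argv) == 3:
        host, oid = sys.argv[1], sys.argv[2]
        failed = count_failed_probes(lambda: snmpget_once(host, oid))
        print(f"{failed} of 10 snmpget calls to {host} failed")
    else:
        print("usage: probe.py HOST OID")
```

If a noticeable share of the calls fail here too, with Zabbix not involved at all, that points at the device's SNMP agent rather than at the Zabbix server.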
                http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
                https://kloczek.wordpress.com/
                zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
                My zabbix templates https://github.com/kloczek/zabbix-templates
