debugging problem with gaps in graphs

  • couker
    Junior Member
    • Jun 2010
    • 5

    #1

    Hello,
    I am also having a problem with gaps in graphs. I searched the forum and found that I am not alone. I tried all the proposed solutions, such as tuning MySQL, but without success. My setup is small: just 100 hosts, 1500 items and 800 triggers. New values per second: 10.42.

    I tried to debug it with debug level 4.
    Here is what I found:

    The item on the monitored switch has itemid 22766 in the database.

    The Zabbix server correctly receives the value over SNMP. Here is the relevant part of the log file:
    Code:
    10894:20100808:152926.545  SNMP [[email protected]:161:161]
     10894:20100808:152926.546 End of snmp_open_session()
     10894:20100808:152926.546 Standard processing
     10894:20100808:152926.546 In snmp_normalize(oid:1.3.6.1.2.1.2.2.1.16.98)
     10894:20100808:152926.546 End of nmp_normalize():1.3.6.1.2.1.2.2.1.16.98
     10894:20100808:152926.546 In get_snmp(oid:1.3.6.1.2.1.2.2.1.16.98)
     10894:20100808:152926.548 Status send [0]
     10894:20100808:152926.548 AV loop OID [1.3.6.1.2.1.2.2.1.16.98] Type  0x41] 'Counter32: 1154862166'
     10894:20100808:152926.548 End of get_snmp():SUCCEED
     10894:20100808:152926.548 In snmp_close_session()
     10894:20100808:152926.548 End of snmp_close_session()
     10894:20100808:152926.548 End of get_value_snmp():SUCCEED
     10894:20100808:152926.548 End of get_value():SUCCEED
     10894:20100808:152926.548 In calculate_item_nextcheck  22766,300,"",1281274166)
     10894:20100808:152926.548 End calculate_item_nextcheck (result:1281274466)
    So I assume that the value at 15:29 was received correctly. The next check is scheduled 5 minutes later, so that is also correct.
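
The scheduling arithmetic in the last two log lines can be checked by hand. The following is a minimal sketch, not Zabbix code: it only assumes that the second argument of calculate_item_nextcheck (300) is the item interval in seconds and that the fourth argument and the result are Unix timestamps.

```python
from datetime import datetime, timezone

# Values copied from the log excerpt above.
NOW = 1281274166        # timestamp passed to calculate_item_nextcheck
NEXTCHECK = 1281274466  # result reported by calculate_item_nextcheck
DELAY = 300             # assumed to be the item interval in seconds


def to_utc(ts):
    """Format a Unix timestamp as a UTC date and time string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print("checked at :", to_utc(NOW), "UTC")
    print("next check :", to_utc(NEXTCHECK), "UTC")
    print("difference :", NEXTCHECK - NOW, "seconds")
```

The difference is exactly 300 seconds, which matches the 5 minute interval. The check time comes out as 13:29:26 UTC, so the 15:29:26 in the log would correspond to a server clock running at UTC+2.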

    A little later there are several updates that look like this:
    Code:
    update items set lastclock=1281274168,prevvalue=lastvalue,prevorgvalue='2461665672.000000',lastvalue='18.333333' where itemid=22468;
    update items set lastclock=1281274165,prevvalue=lastvalue,prevorgvalue='693846650.000000',lastvalue='2300722.866667' where itemid=22765;
    update items set lastclock=1281274166,prevorgvalue='1154862166.000000' where itemid=22766;
    I don't know why the update for item 22766 is different (it sets only prevorgvalue, not lastvalue), but maybe that's correct.

    After these updates there is a big insert into the history table.

    Code:
    insert into history (itemid,clock,value) values (22765,1281274165,2300722.866667) .......
    The problem is that itemid 22766 is missing from this insert! Also, if I query the history table in the MySQL console, I get this:

    Code:
    select itemid, FROM_UNIXTIME(clock), value from history where itemid=22766 and clock>1281273331 limit 30;
    +--------+----------------------+---------------+
    | itemid | FROM_UNIXTIME(clock) | value         |
    +--------+----------------------+---------------+
    |  22766 | 2010-08-08 15:19:26  |  2326047.3933 |
    |  22766 | 2010-08-08 15:44:26  | 12320578.1237 |
    |  22766 | 2010-08-08 16:09:27  |   162304.7633 |
    |  22766 | 2010-08-08 16:14:26  | 12754024.0936 |
    |  22766 | 2010-08-08 16:29:26  |  1422888.1467 |
    .
    .
    .
    The interval for the switch is set to 5 minutes, so you can see that several values are missing, which I assume is what creates the gaps in the graphs.
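
To put a number on the missing values, the timestamps from the query output can be compared against the 5 minute interval. This is only an illustrative sketch: the function name find_gaps is made up for this example, and it assumes that a value was expected every 300 seconds and that a second or two of jitter between polls is normal.

```python
from datetime import datetime

INTERVAL = 300  # item interval in seconds (5 minutes)

# Timestamps copied from the history query output above.
TIMESTAMPS = [
    "2010-08-08 15:19:26",
    "2010-08-08 15:44:26",
    "2010-08-08 16:09:27",
    "2010-08-08 16:14:26",
    "2010-08-08 16:29:26",
]


def find_gaps(timestamps, interval=INTERVAL):
    """Return (previous, current, missing_count) for each pair of
    consecutive rows that has at least one expected value missing."""
    times = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]
    gaps = []
    for prev, cur in zip(times, times[1:]):
        elapsed = (cur - prev).total_seconds()
        # Rounding tolerates small jitter in the polling time.
        missing = round(elapsed / interval) - 1
        if missing > 0:
            gaps.append((prev, cur, missing))
    return gaps


if __name__ == "__main__":
    for prev, cur, missing in find_gaps(TIMESTAMPS):
        print(f"{prev} -> {cur}: {missing} value(s) missing")
```

For the five rows shown, this reports gaps of 4, 4 and 2 missing values, so 10 expected values are absent in roughly 70 minutes.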

    The server is CentOS running in VMware with 2 GB RAM. CPU load is between 0.1 and 0.2. Everything is upgraded to the latest stable version. The database is MySQL with InnoDB, and Zabbix is 1.8.3 (but I had the same problem with previous versions).

    Any ideas what could cause this problem and how I can solve it? Thanks for your help.
  • Yello
    Senior Member
    • Apr 2011
    • 309

    #2
    Hi,
    I have seen data loss and distorted timings in a virtual environment. We've discussed this internally and have decided to move Zabbix to a physical server.

    I have started another thread in this forum discussing just this point, and you may want to look it up. In summary, my position is this:

    1. Web frontend - ok for virtual
    2. Zabbix server process - physical only
    3. MySQL - physical only

    So I think you can guess where I suspect at least some of your issues lie. Virtualization is not a panacea...


    Regards,
    David

    • Zaniwoop
      Senior Member
      • Jan 2010
      • 232

      #3
      One of the major causes of this that I have seen is a problem with a single SNMP item.

      For example, if an item has the wrong community string, or the specific OID does not exist for that item on the host, the Zabbix server decides to stop ALL SNMP checks for that host for 10 minutes.

      You will see this in the logs*. The log will tell you that SNMP is being stopped for the host. The problem is that it stops all SNMP items because of one faulty item.


      *tail -f /tmp/zabbix_server.log | grep -i hostname

      • couker
        Junior Member
        • Jun 2010
        • 5

        #4
        Originally posted by Yello
        So i think you can guess where I suspect at least some of your issues lie. Virtualization is not a panacea...

        Hi David,
        thanks for your reply. I actually solved it. The problem was overflowing 32-bit SNMP counters. See this post: http://www.zabbix.com/forum/showthre...2997#post72997

        Regards,

        c.
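
The linked post is not reproduced here, so the following is only a hedged sketch of why an overflowing Counter32 could produce this pattern. It assumes the item is stored as a per-second delta and that a sample whose delta comes out negative is discarded; the function names are hypothetical and this is not Zabbix's actual code. That assumption would be consistent with the update in post #1 that sets prevorgvalue for item 22766 but not lastvalue.

```python
COUNTER32_MODULUS = 2 ** 32  # a Counter32 wraps back to 0 after 4294967295


def naive_rate(prev, cur, seconds):
    """Per-second rate without wrap handling.

    Returns None when the counter went backwards, modelling the
    assumption that such a sample is dropped instead of stored.
    """
    delta = cur - prev
    if delta < 0:
        return None
    return delta / seconds


def wrap_aware_rate(prev, cur, seconds):
    """Per-second rate assuming at most one wrap between two samples."""
    delta = (cur - prev) % COUNTER32_MODULUS
    return delta / seconds


def seconds_to_wrap(rate_per_second):
    """How long a Counter32 takes to wrap at a constant rate."""
    return COUNTER32_MODULUS / rate_per_second


if __name__ == "__main__":
    prev, cur = 4294967000, 704  # counter wrapped between the two polls
    print(naive_rate(prev, cur, 300))       # None: the sample is lost
    print(wrap_aware_rate(prev, cur, 300))  # 1000 units over 300 seconds
    # Highest rate visible in the history output of post #1.
    print(seconds_to_wrap(12754024.0936))
```

If the stored history values are counter units per second, a Counter32 fills up in under six minutes at the highest rates shown in post #1. With a 5 minute interval, a wrap between two polls would then be common during busy periods, and the correction above stops being reliable once more than one wrap can occur between polls.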
