Zabbix - MySQL Database Fields & Troubleshooting

  • nobody
    Junior Member
    • Jul 2013
    • 17

    #1

    Zabbix - MySQL Database Fields & Troubleshooting

    Hello Zabbix Forums!

    I am trying to figure out why Zabbix is not logging data for a few very specific hosts, primarily SNMP data. I have adjusted pollers accordingly; I set StartPollers to 750, which is much more than I need. My queue is relatively quiet, except that some pollers are 40-60% busy (still working on that one). The rest of the pollers are less than 5% busy, most being ~0.5% busy.

    At first I thought it was an issue with these specific devices/appliances. However, after querying them over SNMP at a faster rate than Zabbix is configured for, I can state that the appliance itself is not at fault, as it reports the values properly.

    I am trying to track down why SNMP data fails to be logged to the database. I don't suspect a MySQL issue, because I do not see any errors reported by MySQL, but I do not know yet whether it is the culprit (I need to figure that out with further troubleshooting).

    Can someone from the dev team confirm what the "clock" field is in the history_uint table in MySQL? I can't figure out what format it is saved in (it's not epoch time, or anything else that I can identify).

    EG:
    Code:
    mysql> select * from history_uint where itemid = '28137';
    +--------+------------+-------+-----------+
    | itemid | clock      | value | ns        |
    +--------+------------+-------+-----------+
    |  28137 | 1483736051 |    38 | 302856145 |
    |  28137 | 1483736101 |    38 | 459001023 |
    |  28137 | 1483736222 |    38 | 332469004 |
    |  28137 | 1483742748 |    23 | 923644207 |
    |  28137 | 1483744143 |    23 | 896642668 |
    ....
    |  28137 | 1485473217 |    37 | 632053880 |
    |  28137 | 1485473697 |    37 | 348355062 |
    |  28137 | 1485474777 |    37 | 593664092 |
    |  28137 | 1485475257 |    38 | 293999401 |
    |  28137 | 1485475497 |    37 | 431714379 |
    |  28137 | 1485475857 |    38 |  72490487 |
    |  28137 | 1485476217 |    38 | 238922567 |
    |  28137 | 1485476457 |    38 | 414810309 |
    |  28137 | 1485477417 |    37 | 272730653 |
    ....
    These items are set to a 120s polling interval, but at random intervals no data is inserted for upwards of 10 minutes at a time. Very strange. (Confirmed with the Zabbix graphs and with the database dump above.)
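The gaps in a dump like the one above can be listed mechanically. Below is a minimal Python sketch, assuming the clock values are Unix timestamps in seconds and using the second batch of rows from the query output. The function name `find_gaps` and the 1.5x tolerance are illustrative choices, not anything Zabbix provides.

```python
from datetime import datetime, timezone

# clock values copied from the query output above (itemid 28137, second batch)
CLOCKS = [
    1485473217, 1485473697, 1485474777, 1485475257, 1485475497,
    1485475857, 1485476217, 1485476457, 1485477417,
]

EXPECTED_INTERVAL = 120  # seconds, the configured polling interval


def find_gaps(clocks, interval, tolerance=1.5):
    """Return (start, end, missed) for each pair of consecutive samples
    whose spacing exceeds interval * tolerance (tolerance is an assumption)."""
    gaps = []
    ordered = sorted(clocks)
    for prev, curr in zip(ordered, ordered[1:]):
        delta = curr - prev
        if delta > interval * tolerance:
            # Polls that should have landed strictly between the two samples.
            missed = round(delta / interval) - 1
            gaps.append((prev, curr, missed))
    return gaps


def fmt(ts):
    """Format a Unix timestamp as a UTC date and time."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    for start, end, missed in find_gaps(CLOCKS, EXPECTED_INTERVAL):
        print(f"{fmt(start)} -> {fmt(end)} UTC: {end - start}s, ~{missed} missed polls")
```

In this sample every spacing between consecutive rows is a whole multiple of 120 seconds. That would be consistent with polls being attempted on schedule and some attempts not producing a stored value, though the dump alone cannot prove it.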

    I look forward to your replies.

    Regards,
    James

    P.S. Specifics about my environment can be seen in this post:

    Zabbix 3.0.7 -- Agents aren't causing issues, only SNMP.
    Last edited by nobody; 27-01-2017, 20:26.
  • batchenr
    Senior Member
    • Sep 2016
    • 440

    #2
    Hi,
    The clock is Unix time. Try converting it here: http://www.onlineconversion.com/unix_time.htm
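The same conversion can be done locally. This is a small Python sketch, assuming `clock` holds whole Unix seconds and `ns` holds the nanoseconds within that second. The helper name `clock_to_utc` is made up for illustration.

```python
from datetime import datetime, timezone


def clock_to_utc(clock, ns=0):
    """Convert a history row's clock (Unix seconds) and ns (assumed to be
    nanoseconds within that second) to a timezone-aware UTC datetime."""
    # datetime only carries microsecond precision, so ns is truncated.
    base = datetime.fromtimestamp(clock, tz=timezone.utc)
    return base.replace(microsecond=ns // 1000)


if __name__ == "__main__":
    # First row from the query output in post #1.
    print(clock_to_utc(1483736051, 302856145).isoformat())
    # prints 2017-01-06T20:54:11.302856+00:00
```

Working in UTC avoids confusion when the database server and the workstation are in different time zones.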

    Second, can you tell me which item is not working?
    We can try testing it through the snmpget command.

    Moreover, I have a server with approximately 200-300 hosts and I use
    StartPollers=30
    and it works fine.

    • nobody
      Junior Member
      • Jul 2013
      • 17

      #3
      Hi batchenr!!!


      Originally posted by batchenr
      the clock is unix time, try to convert here : http://www.onlineconversion.com/unix_time.htm

      How I have never come across "UNIX" time before is beyond me. Thank you for the link; that was exactly it. Those timestamps support my "data not being logged to the database" theory. Now to find out why...

      Originally posted by batchenr
      second- can you tell me what item is not working ?

      Various SNMP pollers aren't working; they work for some hosts under the same template and periodically fail for others. I verified that the OIDs exist on each device, and verified the community strings and firmware versions of those devices. Some OIDs are always pulled, though, which is very strange. This is across a multitude of our devices (APs, switches, routers); a few of them have never skipped any SNMP data, as far as I can tell. There is too much data to actually check, with 400+ hosts, fast sample rates, and ~20300 items.

      Originally posted by batchenr
      we can try to test it throw snmpget command ?

      I even ran snmpget in a fast loop (every 5 seconds), and it returned every OID I requested in a test against the device that, according to its graphs in Zabbix, has the worst SNMP polling. I tried querying the appliance both in bulk and individually: both worked as expected. Thus it's not likely an appliance problem.
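A loop like the one described can be scripted so that each query is timed and failures are recorded. The Python sketch below shells out to the Net-SNMP `snmpget` command with SNMP v2c. The host address, community string, OID list, and the 3-second timeout are placeholders, not values from this thread.

```python
import subprocess
import time

# All of these values are placeholders for illustration.
HOST = "192.0.2.10"
COMMUNITY = "public"
OIDS = ["1.3.6.1.2.1.1.3.0"]  # sysUpTime.0


def build_snmpget_cmd(host, community, oids, timeout_s=3, retries=0):
    """Build a Net-SNMP snmpget command line for SNMP v2c."""
    return [
        "snmpget", "-v2c", "-c", community,
        "-t", str(timeout_s), "-r", str(retries),
        host, *oids,
    ]


def poll_once(cmd, run=subprocess.run):
    """Run one query and return (ok, elapsed_seconds, output)."""
    started = time.monotonic()
    try:
        result = run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # snmpget is not installed, or the process hung past the hard timeout.
        return False, time.monotonic() - started, str(exc)
    elapsed = time.monotonic() - started
    ok = result.returncode == 0
    return ok, elapsed, (result.stdout if ok else result.stderr).strip()


def poll_loop(cmd, every_s=5, count=10):
    """Query repeatedly and print one timestamped line per attempt."""
    for _ in range(count):
        ok, elapsed, output = poll_once(cmd)
        print(f"{time.strftime('%H:%M:%S')} ok={ok} {elapsed:.3f}s {output}")
        time.sleep(every_s)


if __name__ == "__main__":
    poll_loop(build_snmpget_cmd(HOST, COMMUNITY, OIDS))
```

Running this from the Zabbix server itself, with no retries and a timeout close to what the server uses, may show slow or dropped responses that a more forgiving manual test hides.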

      Sometimes other values aren't logged either, like sys.uptime for other devices in other templates. I'm starting to think you might be right: I might need FEWER pollers. I recall increasing it from 400/500 to 550 to reduce my busy poller numbers.

      Thanks. When I have some time I'll try to reduce the number and perform more testing.

      I greatly appreciate your response!!!

      Regards,
      James
      Last edited by nobody; 30-01-2017, 23:25.

      • Pada
        Senior Member
        • Apr 2012
        • 236

        #4
        In MySQL you can simply use the from_unixtime function to convert it to a human-readable format.
        E.g.
        Code:
        select itemid, from_unixtime(clock), value from history_uint where itemid = '28137';
        My only other suggestion is to start a Zabbix proxy on a different host and monitor these problematic hosts through it.

        If the problem still persists, all I can think of is an SNMP timeout or some SNMP setting (e.g. community string, port) that is wrong; that should be logged for the item in the Zabbix server log.

        If none of the above helps, all I can think of is to start using Wireshark or something along those lines to debug.

        • batchenr
          Senior Member
          • Sep 2016
          • 440

          #5
          I hope it works for you. Another thing I remembered that can interfere with SNMP polling
          is having a lot of unsupported items on these devices. If you see any, disable them.

          Anyway, keep us updated.

          • nobody
            Junior Member
            • Jul 2013
            • 17

            #6
            batchenr!

            I reduced the number of StartPollers, and it increased the poller busy percentage by about 15 percent. I'm not overly concerned.

            My busy unreachable poller percentage is averaging about 50 percent; and yes, I have LOTS of unusable items on some of my hosts due to firmware differences and the like (maybe about 300 or so, consistently).

            I'll need to find a way to reduce the busy unreachable poller percentage... and hope that this fixes my "gappy data" problem.
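As a rough sanity check on poller counts, the number of busy pollers can be estimated as checks per second multiplied by seconds per check. The Python sketch below uses the ~20300 items mentioned earlier in the thread. The uniform 120-second interval, the per-check durations, and the 50% busy target are assumptions for illustration, and the estimate ignores unreachable pollers and timeouts entirely.

```python
import math


def pollers_needed(items, interval_s, avg_check_s, target_busy=0.5):
    """Rough number of pollers so each is busy about target_busy of the time."""
    checks_per_second = items / interval_s
    # Little's law: checks in flight = arrival rate * time per check.
    busy_pollers = checks_per_second * avg_check_s
    return math.ceil(busy_pollers / target_busy)


if __name__ == "__main__":
    # ~20300 items is from the thread; the interval and the per-check
    # durations below are assumptions for illustration only.
    for avg_check_s in (0.05, 0.5, 3.0):
        print(f"{avg_check_s}s per check -> {pollers_needed(20300, 120, avg_check_s)} pollers")
```

The estimate is dominated by how long an average check takes, so a modest share of checks that wait for a full timeout can move the result far more than the raw item count does.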

            Regards,
            James

            • nobody
              Junior Member
              • Jul 2013
              • 17

              #7
              SNMP traps don't scale well either. I can't actively monitor interface states using most SNMPTRAP implementations.

              Regards,
              James

              • kloczek
                Senior Member
                • Jun 2006
                • 1771

                #8
                Originally posted by nobody
                SNMP Traps don't scale well either . Can't actively monitor interface states using most SNMPTRAP implementation .
                I've deleted my comment because I realised (too late) that you are dealing (mostly?) with SNMP monitoring, so my comment was not relevant to you.
                http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
                https://kloczek.wordpress.com/
                zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
                My zabbix templates https://github.com/kloczek/zabbix-templates

                • batchenr
                  Senior Member
                  • Sep 2016
                  • 440

                  #9
                  Originally posted by nobody
                  SNMP Traps don't scale well either . Can't actively monitor interface states using most SNMPTRAP implementation .

                  Regards,
                  James
                  When you run your SNMP v3 command on the Zabbix server itself, do you get a valid response?

                  Please post the command and the results,
                  and then post the trigger you have set plus the item.
