System.uptime overflow causing issues?

  • CVGlennS
    Junior Member
    • Oct 2014
    • 7

    #1

    System.uptime overflow causing issues?

    Hey all,

    So, this is an issue I have run into in both old versions of Zabbix (1.4.x) as well as a brand new install that we just finished (2.4.1). Essentially, it seems like system.uptime overflows on Windows hosts and causes issues in ALL items / triggers / graphs on that host, not just uptime.

    The setup:
    The stock/default 'System uptime' item that comes in the 'Template OS Windows' template with Zabbix (Item.jpg).
    A Windows host with a long uptime; in our case it is currently 511 days (LatestData.jpg).
    This causes dropouts in data and graphs, and triggers firing incorrectly, usually the "this host is unreachable / unable to contact Zabbix agent for 5 minutes" one. A good example is the gaps in the graphs (Graph.jpg).

    Logs:
    Host shows nothing out of the ordinary
    Server shows a few lines like this:
    12636:20141014:155738.603 [Z3005] query failed: [2006] MySQL server has gone away [begin;]

    My theory:
    I assume that this is some sort of overflow issue, as ALL hosts exhibit exactly the same behavior at around the same uptime: a little before 500 days (it seems to be about 497). It seems that Zabbix uses seconds for uptime, so with a quick calculation: 497 days * 24 hours * 60 minutes * 60 seconds = 42,940,800 seconds.
    That number has almost the same digits as the upper bound of a 32-bit unsigned integer (2^32 = 4,294,967,296), just off by two decimal places, i.e. a factor of 100. The only thing I can think of is that somewhere, Zabbix is using a 32-bit integer (or some other data type) that overflows right at this value, for example if the uptime is being counted in hundredths of a second rather than seconds.
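    The arithmetic above can be checked with a short sketch. It computes how long an unsigned 32-bit counter takes to wrap at a few candidate tick rates. The helper name `wrap_days` and the tick rates tried are assumptions for illustration only; nothing here is taken from the Zabbix source, and it does not show where such a counter actually lives.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def wrap_days(bits: int, ticks_per_second: int) -> float:
    """Days until an unsigned counter of `bits` bits wraps at the given tick rate."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_DAY


# Uptime at which the problem was observed, expressed in seconds.
uptime_seconds = 497 * SECONDS_PER_DAY
print(f"497 days = {uptime_seconds:,} seconds")

# Hypothetical tick rates: which one puts the 32-bit wrap point near 497 days?
for label, rate in [
    ("seconds", 1),
    ("hundredths of a second", 100),
    ("milliseconds", 1000),
]:
    days = wrap_days(32, rate)
    print(f"32-bit counter in {label}: wraps after {days:.1f} days")
```

    Only the hundredths-of-a-second rate puts the wrap point at roughly 497 days, which is consistent with the "off by two decimal places" observation.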

    The worst part about this is: even if you totally disable / remove the system.uptime item, the issue still happens. No matter what I try, every single one of our hosts is going to cause Zabbix to freak out right at ~497 days of uptime.

    So, any ideas? Anything I can do to avoid this behavior (other than rebooting the host, of course)? I tried playing around with the "Store value" option, but that just always stores "1m" or whatever the interval is set to.

    It seems like this issue should have come up more often, but I am unable to find any info on it. Thanks!
  • tchjts1
    Senior Member
    • May 2008
    • 1605

    #2
    Can you take a look at the very last paragraph (and the graphs) of this post, and then see how your internal processes are being utilized?

    I would be interested in seeing your graphs with a 1 day time period.

    • CVGlennS
      Junior Member
      • Oct 2014
      • 7

      #3
      I apologize - I don't understand what you are asking. "look at the very last paragraph (and the graphs) of this post" - which posts/graphs? The ones from my post above? "see how your internal processes are being utilized" - what internal processes? Are we talking business processes or some Zabbix processes?

      As for the graphs that span one day, they look okay at that timespan. I assume Zabbix is trending the data well enough to cover the gaps, but the missing data still occurs, and incorrect triggers are still set.

      • tchjts1
        Senior Member
        • May 2008
        • 1605

        #4
        Geez, I forgot to give you the link. Sorry.
        Here it is: https://www.zabbix.com/forum/showthread.php?t=41219

        Check the last paragraph regarding internal Zabbix processes.

        • CVGlennS
          Junior Member
          • Oct 2014
          • 7

          #5
          Ah, no worries.

          I read that post (and looked at those graphs) before I posted the thread. From what I can tell (if I am reading them correctly), they don't seem to indicate any issues - images attached.
          • tchjts1
            Senior Member
            • May 2008
            • 1605

            #6
            Those all look good.

            I think you should open a bug report indicating your suspicion of an overflow on the system.uptime value.
            That link would be this: https://support.zabbix.com/secure/Dashboard.jspa
            Last edited by tchjts1; 15-10-2014, 15:37.

            • CVGlennS
              Junior Member
              • Oct 2014
              • 7

              #7
              Thanks, I have done so: https://support.zabbix.com/browse/ZBX-8906

              Appreciate the help.
