Problem with net.if.* items after upgrade to 1.6.5

  • Emir Imamagic
    Member
    • Mar 2008
    • 67

    #1

    We just upgraded Zabbix to 1.6.5 because 1.6.4 with a PostgreSQL database was horribly unstable. We did a test run on a development server and it looked OK.

    At first glance, the server was working smoothly. However, we then realized that there is a problem with net.if.* items on a random set of servers. All items are defined as Float and stored as "Delta (speed per second)". Here are some error messages:
    Code:
    Item [node1.local:net.if.in[eth0,bytes]] error: Type of received value [14535276550645] is not suitable for value type [Numeric (float)]
    Item [node2.local:net.if.out[eth0,bytes]] error: Type of received value [9963079288286] is not suitable for value type [Numeric (float)]
    Item [node3.local:net.if.in[lo,bytes]] error: Type of received value [3581532085262] is not suitable for value type [Numeric (float)]
    I don't have a clue where these enormous figures are coming from. The worst thing is that the affected hosts run different OSes (Debian, CentOS).

    Has someone noticed something similar? Any ideas?

    Thanks in advance,
    emir
  • Emir Imamagic
    Member
    • Mar 2008
    • 67

    #2
    Important detail: we only upgraded the server. Clients are still using Agent 1.6.4.

    Cheers,
    emir


    • Emir Imamagic
      Member
      • Mar 2008
      • 67

      #3
      OK, I understand the problem now. The trick is that we're using PostgreSQL. You added checks in src/libs/zbxsysinfo/sysinfo.c to verify the range of the result, in order to avoid sending invalid numbers to PostgreSQL.

      For delta items this is incorrect, because the received value is not the number that is going to be stored in the DB. The DBchk_uint64 and DBchk_double functions should be invoked after the delta is calculated.
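The ordering problem described above can be illustrated with a small sketch. This is not the Zabbix source: `fits_float_column` is only a stand-in for a DBchk_double-style check, the 1e12 limit is an assumption (picked because every rejected value quoted in this thread exceeds it), and the previous counter reading is made up.

```python
# Hypothetical upper bound for a stored float value. The real limit is not
# stated in the thread; 1e12 is assumed for illustration only.
ASSUMED_FLOAT_LIMIT = 1e12


def fits_float_column(value):
    """Stand-in for a DBchk_double-style range check (assumed semantics)."""
    return -ASSUMED_FLOAT_LIMIT < value < ASSUMED_FLOAT_LIMIT


def speed_per_second(prev_value, prev_clock, value, clock):
    """Delta (speed per second) between two counter samples."""
    return (value - prev_value) / (clock - prev_clock)


def process_check_first(prev_value, prev_clock, value, clock):
    """Problematic order: validate the raw counter, then compute the delta."""
    if not fits_float_column(value):
        return None  # the item would be reported as not suitable
    return speed_per_second(prev_value, prev_clock, value, clock)


def process_delta_first(prev_value, prev_clock, value, clock):
    """Suggested order: compute the delta, then validate what gets stored."""
    delta = speed_per_second(prev_value, prev_clock, value, clock)
    return delta if fits_float_column(delta) else None


# Raw counter taken from the first error message, sampled 30 s after a
# made-up previous reading that is 3,000,000 bytes lower.
raw = 14535276550645
previous = raw - 3_000_000

rejected = process_check_first(previous, 0, raw, 30)
stored = process_delta_first(previous, 0, raw, 30)
```

With these assumed numbers the check-first path drops the sample, while the delta-first path yields a small per-second rate that passes the same check.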


      • Emir Imamagic
        Member
        • Mar 2008
        • 67

        #4
        I implemented the suggested changes:
        - removed DBchk_* calls from set_result_type (src/libs/zbxsysinfo/sysinfo.c)
        - added checks before storing the data to DB (src/libs/zbxserver/functions.c).

        I don't fully understand how this affects the DBCache, so I didn't add DBchk_* anywhere in the DBCache modules.

        Also, I simply copied the DBchk_* functions to functions.c because it seemed the most efficient solution (and required the fewest changes).

        We tested the zabbix_server with these changes and delta items are working fine now.

        Patches to these two files are attached. We would be very grateful if someone would review them and confirm that this is a valid solution. We would like to put this into production because we're currently losing network interface data on many servers.

        Thanks in advance,
        emir
        Attached Files


        • justincase
          Junior Member
          • Jan 2008
          • 8

          #5
          Using 1.6.6 I am seeing this same thing on a proxy using sqlite3. Here is an example:

          18209:20090915:091249 Item [HOSTNAME.REMOVED:net.if.out[eth0,bytes]] error: Type of received value [1748823363753] is not suitable for value type [Numeric (float)]

          /sbin/ifconfig (or /proc/net/dev) on this host does confirm that it has output 1.5 TB. The host is centos 5.x with a 64bit kernel running 2.6.18.
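For anyone who wants to cross-check the raw counters the same way, here is a minimal sketch that reads byte counters from text in the /proc/net/dev layout. The function name `parse_net_dev` and the sample text are illustrative; only the two large byte counts are taken from this thread, and the remaining fields are made up.

```python
# Sample text in the /proc/net/dev layout (two header lines, then one line
# per interface). Values other than the two large byte counts are made up.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 3581532085262 1000 0 0 0 0 0 0 3581532085262 1000 0 0 0 0 0 0
  eth0: 900000 2000 0 0 0 0 0 0 1748823363753 3000 0 0 0 0 0 0
"""


def parse_net_dev(text):
    """Return {interface: {"rx_bytes": int, "tx_bytes": int}}."""
    counters = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        fields = rest.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = {
            "rx_bytes": int(fields[0]),
            "tx_bytes": int(fields[8]),
        }
    return counters


eth0 = parse_net_dev(SAMPLE)["eth0"]
```

On a Linux host the same function could be fed the contents of /proc/net/dev instead of `SAMPLE`.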

          I can see why the item is a float, since the result of the delta calculation could be a float, but the raw number can exceed the limit imposed on floats. Should this be some sort of two-step process, or does someone have another solution? I am not even sure whether this data is stored in its raw form before the delta is calculated. I guess that is another area to examine.

          I will try your patches on the proxy, but I assume these limits were put in place because of real limits in sqlite, so that may not work.
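As a quick sanity check on where the limit comes from, the sketch below shows that the rejected counter is exactly representable as a double-precision float, which suggests the rejection comes from an imposed range check rather than from the float type itself. The 1e12 limit is again an assumption for illustration, not a value confirmed in this thread.

```python
# Raw counter from the log line quoted above.
raw = 1748823363753

# A double represents every integer up to 2**53 exactly.
as_double = float(raw)
exactly_representable = int(as_double) == raw
below_exact_integer_bound = raw < 2 ** 53

# Assumed range limit (hypothetical); the raw counter is above it even
# though the double type can hold the value without loss.
ASSUMED_RANGE_LIMIT = 1e12
exceeds_assumed_limit = as_double >= ASSUMED_RANGE_LIMIT
```

So a two-step approach (keep the raw counter as an integer, range-check only the computed delta) would not lose precision for counters of this size.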

          Any feedback would be most helpful.

          Thanks!

          /justin


          • elvar
            Senior Member
            • Feb 2008
            • 226

            #6
            I'm using 1.8 on the server and 1.8 on my agents as well, and one of my recently added hosts is having this problem. The server uses PostgreSQL, and both the client and server are 64-bit Ubuntu Server. Do I need to apply these patches even in 1.8?


            • chwlls
              Junior Member
              • Jan 2010
              • 1

              #7
              Originally posted by elvar
              I'm using 1.8 on the server and 1.8 on my agents as well, and one of my recently added hosts is having this problem. The server uses PostgreSQL, and both the client and server are 64-bit Ubuntu Server. Do I need to apply these patches even in 1.8?
              The bug still exists in Zabbix 1.8, but regrettably the patch no longer applies. I am not sure how much work it will be to adapt the patch; the 1.8 changelog says that work has been done on the uint64 handling with PostgreSQL.

              I have raised the issue as a bug here:



              So hopefully it will be properly fixed before long.
