Wrong values in latest data Zabbix 3.0.7

  • sinai65
    Junior Member
    • Jun 2016
    • 3

    #1

    Wrong values in latest data Zabbix 3.0.7

    Hi all,
    I have a Zabbix server monitoring 8 switches and 40 Linux servers. Monitoring works well overall, but in some cases I get wrong values in Latest data.
    I have created a script that reports the RAID status of my Linux servers and added it to zabbix_agentd.conf with the proper syntax. So far everything looks OK: when I check the status of a hard disk in the RAID array with the zabbix_get command, I see the value "Online". However, when I check the same hard disk in Latest data, I see the value "Failed"!
    The value fluctuates: the item is checked once every 5 minutes, and it shows "Online", then after 2 or 3 more checks it shows "Failed", and after a few more checks it shows "Online" again.
    I am using LLD to discover the RAID type and the hard disks inside the arrays. The problem does not occur on all hosts; only a few are affected.
    Any help is appreciated.
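
For readers unfamiliar with LLD, here is a minimal sketch of how a discovery script might emit the JSON that Zabbix 3.0 expects. The {#ARRAY} macro name is the one that appears in the discovery output later in this thread; the function name lld_json is hypothetical, and the real script's internals are not shown in the thread.

```shell
#!/bin/bash
# Hypothetical helper: print Zabbix LLD JSON for the disk IDs passed as arguments.
# Only the {#ARRAY} macro name comes from the thread; the rest is illustrative.
lld_json() {
    local first=1 id
    printf '{"data":['
    for id in "$@"; do
        # Separate entries with a comma, but not before the first one.
        [ "$first" -eq 1 ] || printf ','
        printf '{"{#ARRAY}":"%s"}' "$id"
        first=0
    done
    printf ']}\n'
}

# prints {"data":[{"{#ARRAY}":"2I:1:1"},{"{#ARRAY}":"2I:1:2"}]}
lld_json 2I:1:1 2I:1:2
```

Each discovered ID then becomes the parameter of one item prototype, so every disk gets its own item and its own check.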
  • batchenr
    Senior Member
    • Sep 2016
    • 440

    #2
    Sometimes this error occurs when a device has too many unsupported items. If you see any on this specific device, disable them all.

    What is the item interval?

    • sinai65
      Junior Member
      • Jun 2016
      • 3

      #3
      There were some unsupported items already, but I removed them all and nothing changed.
      The item interval is set to 5 minutes. The weird thing is that when I check the status of a hard disk with the zabbix_get command, it always returns the right value (i.e. online), but the Zabbix frontend shows the wrong value (i.e. faulty). I also changed the values to 1 and 0 instead of online and faulty, because I thought it might be an issue with string values, but nothing changed.
      Moreover, I checked the history_uint table for the stored values of this item, and the values are wrong in the database too.
      For example, the latest data for one of my servers is as follows:

      Name                     Last check            Last value    Change
      Hard No 2I:1:1 status    2017-02-22 08:31:14   Healthy (1)
      Hard No 2I:1:2 status    2017-02-22 08:31:15   Faulty (0)
      Hard No 2I:1:3 status    2017-02-22 08:31:17   Healthy (1)   +1
      Hard No 2I:1:4 status    2017-02-22 08:31:17   Healthy (1)

      As you can see, "Hard No 2I:1:2 status" is shown as Faulty, and "Hard No 2I:1:3 status" was Faulty at the previous check and is now Healthy. In fact, both of them should be Healthy.
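
Because the wrong value is intermittent, a single zabbix_get call may simply miss it. A minimal sketch of a polling loop that repeats a check and timestamps each result is shown below. The function name poll_check is hypothetical; the example invocation uses the serverip placeholder and the item key that appear in post #5.

```shell
#!/bin/bash
# poll_check COUNT INTERVAL COMMAND [ARGS...]
# Runs COMMAND COUNT times, INTERVAL seconds apart, and prints one
# "timestamp value" line per run so intermittent results become visible.
poll_check() {
    local count=$1 interval=$2 i=1 value
    shift 2
    while [ "$i" -le "$count" ]; do
        # Capture stderr too, so error messages show up next to the timestamp.
        value=$("$@" 2>&1)
        printf '%s %s\n' "$(date '+%F %T')" "$value"
        # Sleep between runs, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}

# Example (not run here): 60 checks, 5 seconds apart.
# poll_check 60 5 zabbix_get -s serverip -k 'p410.hardcheck[2I:1:2]'
```

If a 0 or an error message ever appears in that output, the problem can be reproduced outside the Zabbix server, which narrows it down to the agent side or the script itself.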

      • batchenr
        Senior Member
        • Sep 2016
        • 440

        #4
        Are these values coming from the RAID script you created?
        If so, switch to the zabbix user on the Zabbix server:
        # su zabbix

        and run the script with bash -x /scriptname.sh

        Tell me if you see any errors.
        Is the Zabbix server log showing anything? Maybe a lost connection?

        • sinai65
          Junior Member
          • Jun 2016
          • 3

          #5
          Here is the output of the checks run from the Zabbix server with zabbix_get, and of my script run directly on the monitored server as the zabbix user:

          [root@irm ~]# zabbix_get -s serverip -k p410.hardcheck[2I:1:1]
          1
          [root@irm ~]# zabbix_get -s serverip -k p410.hardcheck[2I:1:2]
          1
          [root@irm ~]# zabbix_get -s serverip -k p410.hardcheck[2I:1:3]
          1
          [root@irm ~]# zabbix_get -s serverip -k p410.hardcheck[2I:1:4]
          1

          [root@myserver ~]# su zabbix --shell=/bin/bash

          bash-4.1$ /usr/local/sbin/P410_Discover.sh Discover
          {"data":[{"{#ARRAY}":"2I:1:1"},{"{#ARRAY}":"2I:1:2"},{"{#ARRAY}":"2I:1:3"},{"{#ARRAY}":"2I:1:4"}]}

          bash-4.1$ /usr/local/sbin/P410_Discover.sh 2I:1:1
          1
          bash-4.1$ /usr/local/sbin/P410_Discover.sh 2I:1:2
          1
          bash-4.1$ /usr/local/sbin/P410_Discover.sh 2I:1:3
          1
          bash-4.1$ /usr/local/sbin/P410_Discover.sh 2I:1:4
          1

          As you can see, there are no errors, and there is nothing about a lost connection in the Zabbix server log files either.
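
One difference between these manual tests and the real checks is timing: the commands above ran one at a time, while the Latest data timestamps in post #3 show all four disks being checked within about three seconds of each other. Whether overlapping runs affect this particular script is an assumption, not something the thread confirms, but it can be tested. A minimal sketch that starts the per-disk checks at the same moment (run_parallel is a hypothetical helper name):

```shell
#!/bin/bash
# run_parallel COMMAND ID [ID...]
# Starts COMMAND once per ID in the background, all at the same time,
# then prints "ID value" for each ID once every run has finished.
run_parallel() {
    local cmd=$1 id n tmpdir
    shift
    tmpdir=$(mktemp -d) || return 1
    n=0
    for id in "$@"; do
        n=$((n + 1))
        # One output file per run, numbered in argument order.
        "$cmd" "$id" >"$tmpdir/$n" 2>&1 &
    done
    wait    # block until all background runs are done
    n=0
    for id in "$@"; do
        n=$((n + 1))
        printf '%s %s\n' "$id" "$(cat "$tmpdir/$n")"
    done
    rm -rf "$tmpdir"
}

# Example (not run here), as the zabbix user on the monitored host:
# run_parallel /usr/local/sbin/P410_Discover.sh 2I:1:1 2I:1:2 2I:1:3 2I:1:4
```

If some disks report 0 only when the checks overlap, the script or the RAID tool it calls may not cope with concurrent runs; if every disk still reports 1, this explanation can be ruled out.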
