problem with SMART by Zabbix agent 2 template

  • mircea.voicu
    Junior Member
    • Jul 2024
    • 2

    #1

    problem with SMART by Zabbix agent 2 template

    Hello,

    We have two servers running the SMART by Zabbix agent 2 template.
    It works correctly on one of them but not on the other.
    I would expect the output to be similar on both, but on the failing server the discovery of the MegaRAID disks fails.
    Can someone tell me how to solve this?

    Below are some commands I ran for debugging:
    NOTE:
    s14 is the working server
    s16 is the problem server

    Code:
    [root@s14 ~]# sudo -n smartctl --scan -d sat
    /dev/sda -d scsi # /dev/sda, SCSI device
    /dev/sdb -d scsi # /dev/sdb, SCSI device
    /dev/bus/0 -d megaraid,0 # /dev/bus/0 [megaraid_disk_00], SCSI device
    /dev/bus/0 -d megaraid,1 # /dev/bus/0 [megaraid_disk_01], SCSI device
    /dev/bus/0 -d megaraid,2 # /dev/bus/0 [megaraid_disk_02], SCSI device
    /dev/bus/0 -d megaraid,3 # /dev/bus/0 [megaraid_disk_03], SCSI device
    Code:
    ...
    2024/07/29 11:12:23.404053 [Smart] command sudo -n smartctl -a /dev/bus/0 -d megaraid,3 -j  smartctl raw response: {
      "json_format_version": [
    ...
        "flags": {
          "value": 0,
          "remainder_scan_enabled": false
        },
        "power_up_scan_resume_minutes": 0
      }
    }
    ....
    Code:
    [root@s16 ~]# sudo -n smartctl --scan -d sat
    /dev/sda -d scsi # /dev/sda, SCSI device
    /dev/sdb -d scsi # /dev/sdb, SCSI device
    /dev/bus/1 -d megaraid,0 # /dev/bus/1 [megaraid_disk_00], SCSI device
    /dev/bus/1 -d megaraid,1 # /dev/bus/1 [megaraid_disk_01], SCSI device
    /dev/bus/1 -d megaraid,2 # /dev/bus/1 [megaraid_disk_02], SCSI device
    /dev/bus/1 -d megaraid,3 # /dev/bus/1 [megaraid_disk_03], SCSI device
    Code:
    ...
    2024/07/29 11:12:30.605948 [Smart] command sudo -n smartctl -a /dev/bus/1 -d megaraid,0 -j  smartctl raw response:   Pending defect count:{
      "json_format_version": [
    ...
        }
      },
      "pending_defects": {
        "count": 0
      }
    }
    2024/07/29 11:12:30.605972 [Smart] failed to unmarshal megaraid device with name /dev/bus/1 -d megaraid,0, invalid character 'P' looking for beginning of value
    .....
    Code:
    [root@s14 ~]# sudo -n smartctl -a /dev/bus/1 -d megaraid,0 -j | head
    {
      "json_format_version": [
        1,
        0
      ],
      "smartctl": {
         "version": [
           7,
           2
         ],
    Code:
    [root@s16 ~]# sudo -n smartctl -a /dev/bus/1 -d megaraid,0 -j | head
      Pending defect count:{
      "json_format_version": [
        1,
        0
      ],
      "smartctl": {
        "version": [
          7,
          2
        ],
    Code:
    [root@s14 ~]# smartctl --version
    smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.14.0-162.22.2.el9_1.x86_64] (local build)
    Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
    
    smartctl comes with ABSOLUTELY NO WARRANTY. This is free
    software, and you are welcome to redistribute it under
    the terms of the GNU General Public License; either
    version 2, or (at your option) any later version.
    See http://www.gnu.org for further details.
    
    smartmontools release 7.2 dated 2020-12-30 at 16:48:30 UTC
    smartmontools SVN rev 5155 dated 2020-12-30 at 16:49:18
    smartmontools build host: x86_64-redhat-linux-gnu
    smartmontools build with: C++17, GCC 11.3.1 20221121 (Red Hat 11.3.1-4)
    smartmontools configure arguments: '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-selinux' '--with-libcap-ng=yes' '--with-libsystemd' '--with-systemdsystemunitdir=/usr/lib/systemd/system' '--sysconfdir=/etc/smartmontools/' '--with-systemdenvfile=/etc/sysconfig/smartmontools' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CXX=g++' 'CXXFLAGS=-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'LDFLAGS=-Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 ' 'CC=gcc' 'CFLAGS=-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'​
    Code:
    [root@s16 ~]# smartctl --version
    smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.14.0-427.20.1.el9_4.x86_64] (local build)
    Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
    
    smartctl comes with ABSOLUTELY NO WARRANTY. This is free
    software, and you are welcome to redistribute it under
    the terms of the GNU General Public License; either
    version 2, or (at your option) any later version.
    See http://www.gnu.org for further details.
    
    smartmontools release 7.2 dated 2020-12-30 at 16:48:30 UTC
    smartmontools SVN rev 5155 dated 2020-12-30 at 16:49:18
    smartmontools build host: x86_64-redhat-linux-gnu
    smartmontools build with: C++17, GCC 11.4.1 20231218 (Red Hat 11.4.1-3)
    smartmontools configure arguments: '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-selinux' '--with-libcap-ng=yes' '--with-libsystemd' '--with-systemdsystemunitdir=/usr/lib/systemd/system' '--sysconfdir=/etc/smartmontools/' '--with-systemdenvfile=/etc/sysconfig/smartmontools' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CXX=g++' 'CXXFLAGS=-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'LDFLAGS=-Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 ' 'CC=gcc' 'CFLAGS=-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'​
    Code:
    [root@s14 ~]# env | grep -i smart
    [root@s14 ~]#
    Code:
    [root@s16 ~]# env | grep -i smart
    [root@s16 ~]#
    Code:
    [root@s14 ~]# cat /etc/smartmontools/smartd.conf  | grep -v -e '^#' -e '^$'
    DEVICESCAN -H -m root -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q
    Code:
    [root@s16 ~]# grep -v -e '^#' -e '^$' /etc/smartmontools/smartd.conf
    DEVICESCAN -H -m root -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q
  • mircea.voicu
    Junior Member
    • Jul 2024
    • 2

    #2
    Hello,

    I forgot to mention the main issue.
    On s14 smart.disk.discovery returns all disks.
    On s16 smart.disk.discovery returns only the NVMe disks; the SATA disks are missing.
    Both servers have identical configuration.

    Code:
    [root@s14 ~]# zabbix_agent2 -t 'smart.disk.discovery' | sed 's/^smart\.disk\.discovery\s*\[s|//' | sed 's/\]$//' | jq
    [
      {
        "{#NAME}": "nvme0",
        "{#DISKTYPE}": "nvme",
        "{#MODEL}": "SAMSUNG MZQLB7T6HMLA-00007",
        "{#SN}": "S4BGNW0R702720",
        "{#PATH}": "/dev/nvme0",
        "{#RAIDTYPE}": "",
        "{#ATTRIBUTES}": ""
      },  
      {
        "{#NAME}": "nvme1",
        "{#DISKTYPE}": "nvme",
        "{#MODEL}": "SAMSUNG MZQLB7T6HMLA-00007",
        "{#SN}": "S4BGNW0R702722",
        "{#PATH}": "/dev/nvme1",
        "{#RAIDTYPE}": "",
        "{#ATTRIBUTES}": ""
      },
      {
        "{#NAME}": "nvme2",
        "{#DISKTYPE}": "nvme",
        "{#MODEL}": "SAMSUNG MZQLB7T6HMLA-00007",
        "{#SN}": "S4BGNW0R702725",
        "{#PATH}": "/dev/nvme2",
        "{#RAIDTYPE}": "",
        "{#ATTRIBUTES}": ""
      },
      {
        "{#NAME}": "nvme3",
        "{#DISKTYPE}": "nvme",
        "{#MODEL}": "SAMSUNG MZQLB7T6HMLA-00007",
        "{#SN}": "S4BGNW0R702716",
        "{#PATH}": "/dev/nvme3",
        "{#RAIDTYPE}": "",
        "{#ATTRIBUTES}": ""
      },
      {
        "{#NAME}": "bus/0 megaraid,0",
        "{#DISKTYPE}": "hdd",
        "{#MODEL}": "ST16000NM000J-2TW103",
        "{#SN}": "ZR547BZ8",
        "{#PATH}": "/dev/bus/0",
        "{#RAIDTYPE}": "megaraid,0",
        "{#ATTRIBUTES}": "Raw_Read_Error_Rate Spin_Up_Time Start_Stop_Count Reallocated_Sector_Ct Seek_Error_Rate Power_On_Hours Spin_Retry_Count Power_Cycle_Count Reported_Uncorrect Command_Timeout Airflow_Temperature_Cel Power-Off_Retract_Count Load_Cycle_Count Temperature_Celsius Current_Pending_Sector Offline_Uncorrectable UDMA_CRC_Error_Count Multi_Zone_Error_Rate Head_Flying_Hours Total_LBAs_Written Total_LBAs_Read"
      },
      {
        "{#NAME}": "bus/0 megaraid,1",
        "{#DISKTYPE}": "hdd",
        "{#MODEL}": "ST16000NM000J-2TW103",
        "{#SN}": "ZR53X72A",
        "{#PATH}": "/dev/bus/0",
        "{#RAIDTYPE}": "megaraid,1",
        "{#ATTRIBUTES}": "Raw_Read_Error_Rate Spin_Up_Time Start_Stop_Count Reallocated_Sector_Ct Seek_Error_Rate Power_On_Hours Spin_Retry_Count Power_Cycle_Count Reported_Uncorrect Command_Timeout Airflow_Temperature_Cel Power-Off_Retract_Count Load_Cycle_Count Temperature_Celsius Current_Pending_Sector Offline_Uncorrectable UDMA_CRC_Error_Count Multi_Zone_Error_Rate Head_Flying_Hours Total_LBAs_Written Total_LBAs_Read"
      },
      {
        "{#NAME}": "bus/0 megaraid,2",
        "{#DISKTYPE}": "hdd",
        "{#MODEL}": "ST20000NM007D-3DJ103",
        "{#SN}": "ZVT5KC2Y",
        "{#PATH}": "/dev/bus/0",
        "{#RAIDTYPE}": "megaraid,2",
        "{#ATTRIBUTES}": "Raw_Read_Error_Rate Spin_Up_Time Start_Stop_Count Reallocated_Sector_Ct Seek_Error_Rate Power_On_Hours Spin_Retry_Count Power_Cycle_Count Reported_Uncorrect Command_Timeout Airflow_Temperature_Cel Power-Off_Retract_Count Load_Cycle_Count Temperature_Celsius Current_Pending_Sector Offline_Uncorrectable UDMA_CRC_Error_Count Multi_Zone_Error_Rate Head_Flying_Hours Total_LBAs_Written Total_LBAs_Read"
      },
      {
        "{#NAME}": "bus/0 megaraid,3",
        "{#DISKTYPE}": "hdd",
        "{#MODEL}": "ST20000NM007D-3DJ103",
        "{#SN}": "ZVT5ZBYJ",
        "{#PATH}": "/dev/bus/0",
        "{#RAIDTYPE}": "megaraid,3",
        "{#ATTRIBUTES}": "Raw_Read_Error_Rate Spin_Up_Time Start_Stop_Count Reallocated_Sector_Ct Seek_Error_Rate Power_On_Hours Spin_Retry_Count Power_Cycle_Count Reported_Uncorrect Command_Timeout Airflow_Temperature_Cel Power-Off_Retract_Count Load_Cycle_Count Temperature_Celsius Current_Pending_Sector Offline_Uncorrectable UDMA_CRC_Error_Count Multi_Zone_Error_Rate Head_Flying_Hours Total_LBAs_Written Total_LBAs_Read"
      }
    ]
    Code:
    [root@s16 ~]# zabbix_agent2 -t 'smart.disk.discovery' | sed 's/^smart\.disk\.discovery\s*\[s|//' | sed 's/\]$//' | jq
    [
      {
        "{#NAME}": "nvme0",
        "{#DISKTYPE}": "nvme",
        "{#MODEL}": "SAMSUNG MZQL27T6HBLA-00A07",
        "{#SN}": "S6CKNT0WA14973",
        "{#PATH}": "/dev/nvme0",
        "{#RAIDTYPE}": "",
        "{#ATTRIBUTES}": ""
      },
      {
        "{#NAME}": "nvme1",
        "{#DISKTYPE}": "nvme",
        "{#MODEL}": "SAMSUNG MZQL27T6HBLA-00A07",
        "{#SN}": "S6CKNT0WA14972",
        "{#PATH}": "/dev/nvme1",
        "{#RAIDTYPE}": "",
        "{#ATTRIBUTES}": ""
      },
      {
        "{#NAME}": "nvme2",
        "{#DISKTYPE}": "nvme",
        "{#MODEL}": "SAMSUNG MZQL27T6HBLA-00A07",
        "{#SN}": "S6CKNT0WA14974",
        "{#PATH}": "/dev/nvme2",
        "{#RAIDTYPE}": "",
        "{#ATTRIBUTES}": ""
      },
      {
        "{#NAME}": "nvme3",
        "{#DISKTYPE}": "nvme",
        "{#MODEL}": "SAMSUNG MZQL27T6HBLA-00A07",
        "{#SN}": "S6CKNT0WA14975",
        "{#PATH}": "/dev/nvme3",
        "{#RAIDTYPE}": "",
        "{#ATTRIBUTES}": ""
      }
    ]


    • Markku
      Senior Member
      Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
      • Sep 2018
      • 1782

      #3
      Did you notice the extra "Pending defect count:" at the beginning of the smartctl output? That is not JSON, so the smartctl output itself is broken. If you cannot figure out how to get smartctl to output just JSON, you could try removing that prefix in a preprocessing rule.

      Markku
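
      For illustration, the logic behind "remove the prefix, then parse" can be sketched in Python. This is not Zabbix configuration and not the plugin's own code: the function name `parse_smartctl_json` and the shortened `sample` string are assumptions made up for this example, modeled on the s16 output quoted in post #1. Since the log there shows the agent itself failing to unmarshal the response, where such a cleanup would actually have to run (a wrapper around smartctl, or a preprocessing step) is left open here.

      ```python
      import json


      def parse_smartctl_json(raw: str) -> dict:
          """Parse `smartctl -j` output, tolerating stray text before the JSON object.

          Hypothetical helper for illustration only. It assumes the stray
          prefix contains no "{" character of its own.
          """
          start = raw.find("{")
          if start == -1:
              raise ValueError("no JSON object found in smartctl output")
          # raw_decode parses a single JSON value beginning at index `start`
          # and returns it together with the index where parsing stopped.
          obj, _end = json.JSONDecoder().raw_decode(raw, start)
          return obj


      # Shortened sample modeled on the s16 output in the thread (not the full response).
      sample = (
          '  Pending defect count:{\n'
          '  "json_format_version": [1, 0],\n'
          '  "smartctl": {"version": [7, 2]},\n'
          '  "pending_defects": {"count": 0}\n'
          '}\n'
      )

      # A strict parser rejects the text because it does not start with a JSON value.
      try:
          json.loads(sample)
      except json.JSONDecodeError as exc:
          print("strict parse fails:", exc.msg)

      # Skipping everything before the first "{" makes the same text parseable.
      data = parse_smartctl_json(sample)
      print(data["smartctl"]["version"])
      ```

      Searching for the first "{" is deliberately simple. If the stray text could itself contain a brace, a stricter rule, such as matching the exact "Pending defect count:" prefix, would be the safer choice.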
