Discussion thread for official Zabbix SMART Disk monitoring

  • PavelZ
    Senior Member
    • Dec 2024
    • 162

    #91
    There are a few workarounds that I decided to share.
    GitHub repository pavlozt/somezabbixtemplates (original Zabbix templates collection).

    For now, this should work around the problems with the long surface test (ZBX-22770) and the ioctl storm (ZBX-25632).

    Comment

    • evert
      Junior Member
      • Jun 2022
      • 4

      #92
      Why does Zabbix query several types of controllers for each HDD?

      Code:
      Feb 24 00:42:14 app sudo[2115688]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdc -d cciss,0 -j
      Feb 24 00:42:14 app sudo[2115683]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sde -d scsi -j
      Feb 24 00:42:14 app sudo[2115684]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdb -d scsi -j
      Feb 24 00:42:14 app sudo[2115685]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sda -d cciss,0 -j
      Feb 24 00:42:14 app sudo[2115687]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdd -d areca,1 -j
      Feb 24 00:42:14 app sudo[2115692]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdc -d 3ware,0 -j
      Feb 24 00:42:14 app sudo[2115690]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/nvme0 -j
      Feb 24 00:42:14 app sudo[2115694]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdc -d areca,1 -j
      Feb 24 00:42:14 app sudo[2115696]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdb -d cciss,0 -j
      Feb 24 00:42:14 app sudo[2115699]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sda -d areca,1 -j
      Feb 24 00:42:14 app sudo[2115689]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sde -d 3ware,0 -j
      Feb 24 00:42:14 app sudo[2115702]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sda -d scsi -j
      Feb 24 00:42:14 app sudo[2115705]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdd -d 3ware,0 -j
      Feb 24 00:42:14 app sudo[2115697]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sda -d 3ware,0 -j
      Feb 24 00:42:14 app sudo[2115704]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sde -d areca,1 -j
      Feb 24 00:42:14 app sudo[2115700]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdc -d scsi -j
      Feb 24 00:42:14 app sudo[2115698]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sde -d cciss,0 -j
      Feb 24 00:42:14 app sudo[2115709]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdd -d cciss,0 -j
      Feb 24 00:42:14 app sudo[2115701]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdb -d areca,1 -j
      Feb 24 00:42:14 app sudo[2115703]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdb -d sat -j
      Feb 24 00:42:14 app sudo[2115686]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdb -d 3ware,0 -j
      Feb 24 00:42:14 app sudo[2115708]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdc -d sat -j
      Feb 24 00:42:14 app sudo[2115695]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdd -d scsi -j
      Feb 24 00:42:14 app sudo[2115693]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sda -d sat -j
      Feb 24 00:42:14 app sudo[2115691]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdd -d sat -j
      Feb 24 00:42:14 app sudo[2115707]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/nvme1 -j
      Feb 24 00:42:14 app sudo[2115706]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sde -d sat -j
      Can I configure the correct type for each disk, to prevent all these extra queries? Or, perhaps even easier, disable the whole '-d XXX' parameter being sent?
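To make the pattern in the log above easier to see, here is a minimal Python sketch that groups the logged smartctl calls by device and collects the `-d` types tried for each one. The helper name `probes_by_device` and the regular expression are illustrative assumptions, not part of Zabbix or smartmontools; the sample lines are copied from the log above.

```python
import re
from collections import defaultdict

# Matches the smartctl invocations that sudo logs, for example:
# "COMMAND=/usr/sbin/smartctl -a /dev/sdc -d cciss,0 -j"
# The "-d <type>" part is optional because NVMe devices are queried without it.
PATTERN = re.compile(
    r"COMMAND=\S*smartctl -a (?P<dev>\S+)(?: -d (?P<type>\S+))? -j"
)


def probes_by_device(log_lines):
    """Return {device: set of '-d' types tried}; None means no '-d' was passed."""
    probes = defaultdict(set)
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            probes[match.group("dev")].add(match.group("type"))
    return dict(probes)


sample = [
    "Feb 24 00:42:14 app sudo[2115688]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdc -d cciss,0 -j",
    "Feb 24 00:42:14 app sudo[2115700]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/sdc -d scsi -j",
    "Feb 24 00:42:14 app sudo[2115690]:   zabbix : PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a /dev/nvme0 -j",
]

for device, types in sorted(probes_by_device(sample).items()):
    print(device, sorted(str(t) for t in types))
```

Run against the full log, this should show every /dev/sdX being tried with each of scsi, sat, cciss,0, areca,1 and 3ware,0, while the NVMe devices are queried once without `-d`.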

      Comment

      • PavelZ
        Senior Member
        • Dec 2024
        • 162

        #93
        That's the idea behind this plugin. But it only does this during autodiscovery, doesn't it?

        By the way, if you don't like the large number of these messages in the log, you can turn them off in the sudoers configuration.

        Comment

        • evert
          Junior Member
          • Jun 2022
          • 4

          #94
          Originally posted by PavelZ
          That's the idea behind this plugin. But it only does this during autodiscovery, doesn't it?
          I get them once an hour, as far as I can tell. (At which interval does Autodiscovery run?)

          Originally posted by PavelZ
          By the way, if you don't like the large number of these messages in the log, you can turn it off by configuration of sudoers file
          How do I do this? And does this only suppress the messages, or does it actually keep smartctl from executing the variations?

          Comment

          • PavelZ
            Senior Member
            • Dec 2024
            • 162

            #95
            The company cannot afford to reduce security and publish such instructions, but we can)

            The old template also has a good example in its README:
            Code:
            Cmnd_Alias SMARTCTL = /usr/sbin/smartctl
            zabbix ALL = (ALL) NOPASSWD: SMARTCTL
            Defaults!SMARTCTL !logfile, !syslog, !pam_session
            The commands will continue to run; they just will no longer clutter the logs.

            Comment

            • evert
              Junior Member
              • Jun 2022
              • 4

              #96
              Originally posted by PavelZ
              The company cannot afford to reduce security and publish such instructions, but we can)
              Does the method you describe reduce security? In what way?
              Last edited by evert; 27-02-2025, 11:53.

              Comment

              • INFinite
                Junior Member
                • Feb 2025
                • 10

                #97
                Hello.
                I ran into this problem when setting up SMART monitoring on Windows (I tried it on several Windows versions):
                Code:
                Cannot fetch data.: got error executing worker pool: failed to execute smartctl: "{\r\n \"json_format_version\": [\r\n 1,\r\n 0\r\n ],\r\n \"smartctl\":
                {\r\n \"version\": [\r\n 7,\r\n 4\r\n ],\r\n \"pre_release\": false,\r\n \"svn_revision\": \"5530\",\r\n \"platform_info\": \"x86_64-w64-mingw32-2012r2\",
                \r\n \"build_info\": \"(sf-7.4-1)\",\r\n \"argv\": [\r\n \"smartctl\",\r\n \"-a\",\r\n \"/dev/sda\",\r\n \"-j\"\r\n ],\r\n \"exit_status\": 4\r\n },\r\n \"local_time\":
                {\r\n \"time_t\": 1740725274,\r\n \"asctime\": \"Fri Feb 28 09:47:54 2025 RTZ\"\r\n },\r\n \"device\": {\r\n \"name\": \"/dev/sda\",\r\n \"info_name\":
                \"/dev/sda\",\r\n \"type\": \"scsi\",\r\n \"protocol\": \"SCSI\"\r\n },\r\n \"scsi_vendor\": \"Intel\",\r\n \"scsi_product\": \"Raid 1 Volume\",\r\n
                \"scsi_model_name\": \"Intel Raid 1 Volume\",\r\n \"scsi_revision\": \"1.0.\",\r\n \"scsi_version\": \"SPC-3\",\r\n \"user_capacity\": {\r\n \"blocks\": 111357952,
                \r\n \"bytes\": 57015271424\r\n },\r\n \"logical_block_size\": 512,\r\n \"scsi_lb_provisioning\": {\r\n \"name\": \"thin provisioned\",\r\n \"value\": 2,\r\n
                \"management_enabled\": {\r\n \"name\": \"LBPME\",\r\n \"value\": -1\r\n },\r\n \"read_zeros\": {\r\n \"name\": \"LBPRZ\",\r\n \"value\": 0\r\n }\r\n },\r\n
                \"rotation_rate\": 0,\r\n \"logical_unit_id\": \"0x61fa116d01000000001517ffff0aeb84\",\r\n \"device_type\": {\r\n \"scsi_terminology\":
                \"Peripheral Device Type [PDT]\",\r\n \"scsi_value\": 0,\r\n \"name\": \"disk\"\r\n },\r\n \"smart_support\": {\r\n \"available\": false\r\n },\r\n
                \"temperature\": {\r\n \"current\": 0,\r\n \"drive_trip\": 0\r\n },\r\n \"seagate_farm_log\": {\r\n \"supported\": false\r\n }\r\n}\r": exit status 4.
                The data is coming, but it does not have the correct format.
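For readability: the escaped dump above is ordinary smartctl JSON wrapped inside the agent's error string. Below is a minimal Python sketch, assuming the JSON has been captured separately (for example by running `smartctl -a /dev/sda -j` by hand). The `summarize` helper is hypothetical, and the sample is a trimmed copy of the fields shown above, not a full report.

```python
import json


def summarize(raw):
    """Pull out the few fields that matter for this error from smartctl JSON."""
    report = json.loads(raw)
    return {
        "exit_status": report.get("smartctl", {}).get("exit_status"),
        "device_type": report.get("device", {}).get("type"),
        "model": report.get("scsi_model_name"),
        "smart_available": report.get("smart_support", {}).get("available"),
    }


# Trimmed sample built from the values in the error message above.
sample = json.dumps(
    {
        "smartctl": {"version": [7, 4], "exit_status": 4},
        "device": {"name": "/dev/sda", "type": "scsi", "protocol": "SCSI"},
        "scsi_model_name": "Intel Raid 1 Volume",
        "smart_support": {"available": False},
    }
)

print(summarize(sample))
```

In this output, /dev/sda is reported as an "Intel Raid 1 Volume" with `smart_support.available` set to false, and smartctl itself exited with status 4.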

                Ubuntu 24
                zabbix server 7.2.4
                zabbix agent2 7.2.3
                smartmontools 7.4
                template is SMART by Zabbix agent 2

                Help me please
                Last edited by INFinite; 28-02-2025, 14:34.

                Comment

                • PavelZ
                  Senior Member
                  • Dec 2024
                  • 162

                  #98
                  Originally posted by evert

                  Does the method you describe reduce security? In what way?
                  Any action you take affects security, so I will not take responsibility for it.

                  I am simply pointing out that this is an alternative solution, described right in the README as the recommended one.

                  Comment

                  • PavelZ
                    Senior Member
                    • Dec 2024
                    • 162

                    #99
                    INFinite,
                    There are some changes in the newer agent versions.
                    Would you like to try a slightly older one?

                    I suggest version 7.0.7.

                    Comment


                    • INFinite
                      INFinite commented
                      Thanks for the hint in which direction to look.
                  • INFinite
                    Junior Member
                    • Feb 2025
                    • 10

                    #100
                    Originally posted by PavelZ
                    INFinite ,
                    There are some changes in the new versions of the agent.
                    Would you like to try slightly older ones?

                    I propose a version 7.0.7
                    After these changes, it gives the following error:

                    Code:
                    Cannot fetch data.: got error executing worker pool: smartctl returned error: unknown error from smartctl.

                    Comment

                    • INFinite
                      Junior Member
                      • Feb 2025
                      • 10

                      #101
                      Originally posted by INFinite
                      Hello.
                      I ran into this problem, when setting up SMART monitoring on windows(I tried it on different versions OS windows):
                      Code:
                      Cannot fetch data.: got error executing worker pool: failed to execute smartctl: "{\r\n \"json_format_version\": [\r\n 1,\r\n 0\r\n ],\r\n \"smartctl\":
                      {\r\n \"version\": [\r\n 7,\r\n 4\r\n ],\r\n \"pre_release\": false,\r\n \"svn_revision\": \"5530\",\r\n \"platform_info\": \"x86_64-w64-mingw32-2012r2\",
                      \r\n \"build_info\": \"(sf-7.4-1)\",\r\n \"argv\": [\r\n \"smartctl\",\r\n \"-a\",\r\n \"/dev/sda\",\r\n \"-j\"\r\n ],\r\n \"exit_status\": 4\r\n },\r\n \"local_time\":
                      {\r\n \"time_t\": 1740725274,\r\n \"asctime\": \"Fri Feb 28 09:47:54 2025 RTZ\"\r\n },\r\n \"device\": {\r\n \"name\": \"/dev/sda\",\r\n \"info_name\":
                      \"/dev/sda\",\r\n \"type\": \"scsi\",\r\n \"protocol\": \"SCSI\"\r\n },\r\n \"scsi_vendor\": \"Intel\",\r\n \"scsi_product\": \"Raid 1 Volume\",\r\n
                      \"scsi_model_name\": \"Intel Raid 1 Volume\",\r\n \"scsi_revision\": \"1.0.\",\r\n \"scsi_version\": \"SPC-3\",\r\n \"user_capacity\": {\r\n \"blocks\": 111357952,
                      \r\n \"bytes\": 57015271424\r\n },\r\n \"logical_block_size\": 512,\r\n \"scsi_lb_provisioning\": {\r\n \"name\": \"thin provisioned\",\r\n \"value\": 2,\r\n
                      \"management_enabled\": {\r\n \"name\": \"LBPME\",\r\n \"value\": -1\r\n },\r\n \"read_zeros\": {\r\n \"name\": \"LBPRZ\",\r\n \"value\": 0\r\n }\r\n },\r\n
                      \"rotation_rate\": 0,\r\n \"logical_unit_id\": \"0x61fa116d01000000001517ffff0aeb84\",\r\n \"device_type\": {\r\n \"scsi_terminology\":
                      \"Peripheral Device Type [PDT]\",\r\n \"scsi_value\": 0,\r\n \"name\": \"disk\"\r\n },\r\n \"smart_support\": {\r\n \"available\": false\r\n },\r\n
                      \"temperature\": {\r\n \"current\": 0,\r\n \"drive_trip\": 0\r\n },\r\n \"seagate_farm_log\": {\r\n \"supported\": false\r\n }\r\n}\r": exit status 4.
                      The data is coming, but it does not have the correct format.

                      Ubuntu 24
                      zabbix server 7.2.4
                      zabbix agent2 7.2.3
                      smartmontools 7.4
                      template is SMART by Zabbix agent 2

                      Help me please
                      Experimentally, I found out that the "SMART by Zabbix agent 2" template works with agent version 6.4.21 and possibly earlier. It does not work with versions 7.0.X or 7.2.X.

                      Comment

                      • PavelZ
                        Senior Member
                        • Dec 2024
                        • 162

                        #102
                        Wait, you do understand that templates also need to be updated? The template version has to be kept under control as well.
                        I'm not in a position to debug this remotely, but there are some obvious things to check.
                        Try to get both the template and the agent to version 7.0.

                        Also, I suggest focusing on smartctl first and making the "exit status 4" error go away.
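As background on that exit status: according to the smartctl man page, the exit status is a bitmask, so 4 means only bit 2 is set. A minimal Python sketch of decoding it follows; the `decode_exit_status` helper is illustrative, and the bit descriptions are paraphrased from the man page rather than quoted.

```python
# Meaning of each bit in the smartctl exit status, paraphrased from the
# "EXIT STATUS" section of the smartctl man page.
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed, no IDENTIFY DEVICE data, or device in low-power mode",
    2: "a SMART or other ATA command failed, or a SMART data checksum error",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes found at or below threshold",
    5: "attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_exit_status(status):
    """Return the descriptions of every bit set in a smartctl exit status."""
    return [
        text for bit, text in SMARTCTL_EXIT_BITS.items() if status & (1 << bit)
    ]


for reason in decode_exit_status(4):
    print(reason)
```

So exit status 4 by itself says that some command sent to the device failed, which is worth checking by running smartctl by hand before looking at the agent or the template.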
                        Last edited by PavelZ; 04-03-2025, 12:00.

                        Comment

                        • INFinite
                          Junior Member
                          • Feb 2025
                          • 10

                          #103
                          Originally posted by PavelZ
                          Wait, you do understand that templates also need to be updated? The template version is also subject of control.
                          I'm not ready to remote debug this problem, but there are obvious considerations.
                          Try to achieve a situation where both the template is version 7.0 and the agent is version 7.0

                          Also, I suggest focusing on smartctl first and making the error go away
                          Template version 7.2-1.
                          Can you tell me where I can find the old versions of the template?

                          When replacing files from the archive with version 6.4, everything immediately worked.

                          exit status 4.
                          It occurs rarely; it just happened to be in the example. I attribute it to polling the data too frequently. The update interval was 10s.
                          Last edited by INFinite; 04-03-2025, 13:22.

                          Comment

                          • PavelZ
                            Senior Member
                            • Dec 2024
                            • 162

                            #104
                            Technically, all historical versions are available. But it's not that easy to figure it out if you're not used to working with Git.
                            The situation is complicated by the presence of several branches. You need to switch branches and download along this path:
                            https://github.com/zabbix/zabbix/commits/master/templates/server/smart_agent2/template_module_smart_agent2.yaml



                            The update interval was 10s.
                            But in the standard template this interval is 5 minutes. It is unknown what other errors such frequent polling may cause.

                            Comment

                            • INFinite
                              Junior Member
                              • Feb 2025
                              • 10

                              #105
                              Originally posted by PavelZ
                              Technically, all historical versions are available. But it's not that easy to figure it out if you're not used to working with Git.
                              The situation is complicated by the presence of several branches. You need to switch branches and download along this path:
                              https://github.com/zabbix/zabbix/commits/master/templates/server/smart_agent2/template_module_smart_agent2.yaml


                              I installed template version 7.0 with agent version 7.0.7; the result is the same:
                              Code:
                              Cannot fetch data.: got error executing worker pool: smartctl returned error: unknown error from smartctl.

                              Comment
