Discussion thread for official Zabbix SMART Disk monitoring

  • AlexL
    Zabbix Certified Specialist
    • Aug 2019
    • 55

    #1

    This thread is intended for discussion of the upcoming official Zabbix template for SMART disk monitoring.
    The template and its details will be available in the Git repository.

    Zabbix is always looking for ways to improve our services and to make our users happier.
    We pride ourselves on doing our best each and every day, but we know that there is always something more to learn.
    We would like to hear from you about what you liked and what you would improve in the template.
  • sirbusby
    Junior Member
    • Jul 2017
    • 7

    #2
    Hello!
    zabbix_server (Zabbix) 5.4.0beta1
    Revision 2d05a56dbb 1 March 2021, compilation time: Mar 1 2021 09:09:06

    zabbix_agent2 (Zabbix) 5.4.0beta1
    Revision 2d05a56dbb 1 March 2021, compilation time: Mar 1 2021 12:04:58

    smartmontools:
    smartctl -V
    smartctl 7.1 2020-04-05 r5049 [x86_64-linux-4.18.0-240.15.1.el8_3.x86_64] (local build)

    OS:
    CentOS 8
    Linux centos8 4.18.0-240.15.1.el8_3.x86_64

    From Zabbix Server:
    zabbix_get -s 192.168.20.141 -p 10050 -k smart.disk.discovery
    Code:
    ZBX_NOTSUPPORTED: Failed to scan for devices: Cannot unmarshal JSON: invalid character 'W' looking for beginning of value..
    zabbix_get -s 192.168.20.141 -p 10050 -k smart.attribute.discovery
    Code:
    ZBX_NOTSUPPORTED: Failed to scan for devices: Cannot unmarshal JSON: invalid character 'W' looking for beginning of value..
    zabbix_get -s 192.168.20.141 -p 10050 -k smart.disk.get
    Code:
    ZBX_NOTSUPPORTED: Failed to scan for devices: Cannot unmarshal JSON: invalid character 'W' looking for beginning of value..
    From Zabbix Agent:
    zabbix_agent2 -t smart.disk.discovery
    smart.disk.discovery [s|[{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#MODEL}":"Samsung SSD 860 EVO 500GB","{#SN}":"S4FNNF0N140769M"}]]
    zabbix_agent2 -t smart.attribute.discovery
    Code:
    smart.attribute.discovery [s|[{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":5,"{#ATTRNAME}":"Reallocated_Sector_Ct","{#THRESH}":10},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":9,"{#ATTRNAME}":"Power_On_Hours","{#THRESH}":0},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":12,"{#ATTRNAME}":"Power_Cycle_Count","{#THRESH}":0},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":177,"{#ATTRNAME}":"Wear_Leveling_Count","{#THRESH}":0},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":179,"{#ATTRNAME}":"Used_Rsvd_Blk_Cnt_Tot","{#THRESH}":10},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":181,"{#ATTRNAME}":"Program_Fail_Cnt_Total","{#THRESH}":10},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":182,"{#ATTRNAME}":"Erase_Fail_Count_Total","{#THRESH}":10},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":183,"{#ATTRNAME}":"Runtime_Bad_Block","{#THRESH}":10},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":187,"{#ATTRNAME}":"Uncorrectable_Error_Cnt","{#THRESH}":0},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":190,"{#ATTRNAME}":"Airflow_Temperature_Cel","{#THRESH}":0},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":195,"{#ATTRNAME}":"ECC_Error_Rate","{#THRESH}":0},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":199,"{#ATTRNAME}":"CRC_Error_Count","{#THRESH}":0},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":235,"{#ATTRNAME}":"POR_Recovery_Count","{#THRESH}":0},{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#ID}":241,"{#ATTRNAME}":"Total_LBAs_Written","{#THRESH}":0}]]
    zabbix_agent2 -t smart.disk.get
    Code:
    smart.disk.get [s|[{"ata_sct_capabilities":{"data_table_supported":true,"error_recovery_control_supported":true,"feature_control_supported":true,"value":61},"ata_smart_attributes":{"revision":1,"table":[{"flags":{"auto_keep":true,"error_rate":false,"event_count":true,"performance":false,"prefailure":true,"string":"PO--CK ","updated_online":true,"value":51},"id":5,"name":"Reallocated_Sector_Ct","raw":{"string":"0","value":0},"thresh":10,"value":100,"when_failed":"","worst":100},{"flags":{"auto_keep":true,"error_rate":false,"event_count":true,"performance":false,"prefailure":false,"string":"-O--CK ","updated_online":true,"value":50},"id":9,"name":"Power_On_Hours","raw":{"string":"5020","value":5020},"thresh":0,"value":99,"when_failed":"","worst":99},{"flags":{"auto_keep":true,"error_rate":false,"event_count":true,"performance":false,"prefailure":false,"string":"-O--CK ","updated_online":true,"value":50},"id":12,"name":"Power_Cycle_Count","raw":{"string":"20","value":20},"thresh":0,"value":99,"when_failed":"","worst":99},{"flags":{"auto_keep":false,"error_rate":false,"event_count":true,"performance":false,"prefailure":true,"string":"PO--C- ","updated_online":true,"value":19},"id":177,"name":"Wear_Leveling_Count","raw":{"string":"9","value":9},"thresh":0,"value":99,"when_failed":"","worst":99},{"flags":{"auto_keep":false,"error_rate":false,"event_count":true,"performance":false,"prefailure":true,"string":"PO--C- ","updated_online":true,"value":19},"id":179,"name":"Used_Rsvd_Blk_Cnt_Tot","raw":{"string":"0","value":0},"thresh":10,"value":100,"when_failed":"","worst":100},{"flags":{"auto_keep":true,"error_rate":false,"event_count":true,"performance":false,"prefailure":false,"string":"-O--CK ","updated_online":true,"value":50},"id":181,"name":"Program_Fail_Cnt_Total","raw":{"string":"0","value":0},"thresh":10,"value":100,"when_failed":"","worst":100},{"flags":{"auto_keep":true,"error_rate":false,"event_count":true,"performance":false,"prefailure":false,"string":"-O--CK ","updated_online":true,"value":50},"id":182,"name":"Erase_Fail_Count_Total","raw":{"string":"0","value":0},"thresh":10,"value":100,"when_failed":"","worst":100},{"flags":{"auto_keep":false,"error_rate":false,"event_count":true,"performance":false,"prefailure":true,"string":"PO--C- ","updated_online":true,"value":19},"id":183,"name":"Runtime_Bad_Block","raw":{"string":"0","value":0},"thresh":10,"value":100,"when_failed":"","worst":100},{"flags":{"auto_keep":true,"error_rate":false,"event_count":true,"performance":false,"prefailure":false,"string":"-O--CK ","updated_online":true,"value":50},"id":187,"name":"Uncorrectable_Error_Cnt","raw":{"string":"0","value":0},"thresh":0,"value":100,"when_failed":"","worst":100},{"flags":{"auto_keep":true,"error_rate":false,"event_count":true,"performance":false,"prefailure":false,"string":"-O--CK ","updated_online":true,"value":50},"id":190,"name":"Airflow_Temperature_Cel","raw":{"string":"26","value":26},"thresh":0,"value":74,"when_failed":"","worst":47},{"flags":{"auto_keep":false,"error_rate":true,"event_count":true,"performance":false,"prefailure":false,"string":"-O-RC- ","updated_online":true,"value":26},"id":195,"name":"ECC_Error_Rate","raw":{"string":"0","value":0},"thresh":0,"value":200,"when_failed":"","worst":200},{"flags":{"auto_keep":true,"error_rate":true,"event_count":true,"performance":true,"prefailure":false,"string":"-OSRCK ","updated_online":true,"value":62},"id":199,"name":"CRC_Error_Count","raw":{"string":"0","value":0},"thresh":0,"value":100,"when_failed":"","worst":100},{"flags":{"auto_keep":false,"error_rate":false,"event_count":true,"performance":false,"prefailure":false,"string":"-O--C- ","updated_online":true,"value":18},"id":235,"name":"POR_Recovery_Count","raw":{"string":"13","value":13},"thresh":0,"value":99,"when_failed":"","worst":99},{"flags":{"auto_keep":true,"error_rate":false,"event_count":true,"performance":false,"prefailure":false,"string":"-O--CK ","updated_online":true,"value":50},"id":241,"name":"Total_LBAs_Written","raw":{"string":"8951886830","value":8951886830},"thresh":0,"value":99,"when_failed":"","worst":99}]},"ata_smart_data":{"capabilities":{"attribute_autosave_enabled":true,"conveyance_self_test_supported":false,"error_logging_supported":true,"exec_offline_immediate_supported":true,"gp_logging_supported":true,"offline_is_aborted_upon_new_cmd":false,"offline_surface_scan_supported":false,"selective_self_test_supported":true,"self_tests_supported":true,"values":[83,3]},"offline_data_collection":{"completion_seconds":0,"status":{"string":"was never started","value":0}},"self_test":{"polling_minutes":{"extended":85,"short":2},"status":{"passed":true,"string":"completed without error","value":0}}},"ata_smart_error_log":{"summary":{"count":0,"revision":1}},"ata_smart_selective_self_test_log":{"current_read_scan":{"lba_max":65535,"lba_min":0,"status":{"string":"was never started","value":0}},"flags":{"remainder_scan_enabled":false,"value":0},"power_up_scan_resume_minutes":0,"revision":1,"table":[{"lba_max":0,"lba_min":0,"status":{"string":"Not_testing","value":0}},{"lba_max":0,"lba_min":0,"status":{"string":"Not_testing","value":0}},{"lba_max":0,"lba_min":0,"status":{"string":"Not_testing","value":0}},{"lba_max":0,"lba_min":0,"status":{"string":"Not_testing","value":0}},{"lba_max":0,"lba_min":0,"status":{"string":"Not_testing","value":0}}]},"ata_smart_self_test_log":{"standard":{"count":0,"revision":1}},"ata_version":{"major_value":2556,"minor_value":94,"string":"ACS-4 T13/BSR INCITS 529 revision 5"},"device":{"info_name":"/dev/sda [SAT]","name":"/dev/sda","protocol":"ATA","type":"sat"},"disk_name":"sda sat","disk_type":"ssd","firmware_version":"RVT04B6Q","form_factor":{"ata_value":3,"name":"2.5 inches"},"in_smartctl_database":true,"interface_speed":{"current":{"bits_per_unit":100000000,"sata_value":3,"string":"6.0 Gb/s","units_per_second":60},"max":{"bits_per_unit":100000000,"sata_value":14,"string":"6.0 Gb/s","units_per_second":60}},"json_format_version":[1,0],"local_time":{"asctime":"Thu Mar 4 17:40:46 2021 +05","time_t":1614861646},"logical_block_size":512,"model_family":"Samsung based SSDs","model_name":"Samsung SSD 860 EVO 500GB","physical_block_size":512,"power_cycle_count":20,"power_on_time":{"hours":5020},"rotation_rate":0,"sata_version":{"string":"SATA 3.2","value":255},"serial_number":"S4FNNF0N140769M","smart_status":{"passed":true},"smartctl":{"argv":["smartctl","-a","-d","sat","/dev/sda","-j"],"build_info":"(local build)","exit_status":0,"platform_info":"x86_64-linux-4.18.0-240.15.1.el8_3.x86_64","svn_revision":"5049","version":[7,1]},"temperature":{"current":26},"user_capacity":{"blocks":976773168,"bytes":500107862016},"wwn":{"id":62546942647,"naa":5,"oui":9528}}]]

    [Image: Screenshot_2.png]
    Last edited by sirbusby; 04-03-2021, 14:48.


    • sirbusby
      Junior Member
      • Jul 2017
      • 7

      #3
      [Image: Screenshot_3.png]

      I added the following line to /etc/sudoers, and now it works:
      zabbix ALL=(ALL) NOPASSWD:/usr/sbin/smartctl
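The `invalid character 'W'` error reported earlier is what a JSON parser produces when the text it receives starts with a letter instead of `[` or `{`. A minimal Python sketch of that failure mode follows. The plain-text message in it is made up, since the thread does not show what was actually printed before the sudoers line was added, and `describe_non_json` is a hypothetical helper, not part of Zabbix.

```python
import json


def describe_non_json(output):
    """Return 'ok' if the text parses as JSON, otherwise say where parsing stopped."""
    try:
        json.loads(output)
        return "ok"
    except json.JSONDecodeError as err:
        # err.pos is the offset of the first character the parser could not accept.
        bad = output[err.pos] if err.pos < len(output) else "<end of input>"
        return f"not JSON: unexpected {bad!r} at offset {err.pos}"


# Hypothetical plain-text output standing in for whatever the agent received.
print(describe_non_json("We could not run smartctl"))
# Discovery output in the shape the working agent returns.
print(describe_non_json('[{"{#NAME}":"sda sat"}]'))
```

Running the failing command by hand as the zabbix user and feeding its output to a check like this would show whether the agent is receiving JSON at all.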

      From Zabbix Server
      zabbix_get -s 192.168.20.141 -p 10050 -k smart.disk.discovery
      Code:
      [{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#MODEL}":"Samsung SSD 860 EVO 500GB","{#SN}":"S4FNNF0N140769M"}]
      However, the item's last value shows VALUE instead of RAW_VALUE.

      [Image: Screenshot_6.png]



      • JimP
        JimP commented
        This is still a problem. I used the template as of 5 July 2021, and Zabbix picks up the wrong value, as described above. Here is the smartctl output: the template takes the first "value" field, which is wrong; it should take the value under "raw".
        {
          "id": 231,
          "name": "SSD_Life_Left",
          "value": 100,
          "worst": 100,
          "thresh": 0,
          "when_failed": "",
          "flags": {
            "value": 19,
            "string": "PO--C- ",
            "prefailure": true,
            "updated_online": true,
            "performance": false,
            "error_rate": false,
            "event_count": true,
            "auto_keep": false
          },
          "raw": {
            "value": 82,
            "string": "82"
          }
        },
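To make the distinction concrete, here is a small Python sketch that reads both numbers from the attribute entry above: the top-level `value` (100) and `raw.value` (82). It only illustrates the JSON layout and is not the template's actual preprocessing; `normalized_and_raw` is a hypothetical helper name.

```python
import json

# The attribute entry posted above, with the trailing comma dropped so it parses on its own.
ATTRIBUTE_JSON = """
{
  "id": 231,
  "name": "SSD_Life_Left",
  "value": 100,
  "worst": 100,
  "thresh": 0,
  "when_failed": "",
  "flags": {"value": 19, "string": "PO--C- ", "prefailure": true,
            "updated_online": true, "performance": false,
            "error_rate": false, "event_count": true, "auto_keep": false},
  "raw": {"value": 82, "string": "82"}
}
"""


def normalized_and_raw(attribute):
    """Return (top-level value, raw value) for one entry of the smartctl attribute table."""
    return attribute["value"], attribute["raw"]["value"]


attribute = json.loads(ATTRIBUTE_JSON)
normalized, raw = normalized_and_raw(attribute)
print(f"{attribute['name']}: value={normalized}, raw={raw}")
```

Note that `flags` also contains a key named `value`, so any extraction that searches for the first `value` key without a full path can pick up the wrong number.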
    • garymol
      Junior Member
      • Mar 2021
      • 7

      #4
      This is a very clean solution for collecting S.M.A.R.T. attributes and setting thresholds, given the wide variety of manufacturers' disks. However, there are some attributes for which I (and others) may wish to trigger on any change in value rather than wait for the drive to "fail" by exceeding its threshold, for example attribute 5.
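The idea of alerting on any change can be sketched in a few lines of Python. This is not a Zabbix trigger expression, only the logic such a trigger would encode; the sample data, the five-day window, and the function name `changed_within` are assumptions made for illustration.

```python
DAY = 86400  # seconds


def changed_within(samples, now, window_seconds):
    """Return True if the raw value took more than one distinct value inside the window.

    samples is a list of (unix_timestamp, raw_value) pairs.
    A change between the last sample before the window and the first one
    inside it is not detected by this simple version.
    """
    recent = [raw for ts, raw in samples if now - ts <= window_seconds]
    return len(set(recent)) > 1


# Hypothetical daily readings of attribute 5: stable, then three sectors reallocated.
readings = [(0 * DAY, 0), (1 * DAY, 0), (2 * DAY, 3)]
print(changed_within(readings, now=2 * DAY, window_seconds=5 * DAY))
```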


      • Schamberger
        Junior Member
        • Mar 2021
        • 1

        #5
        Originally posted by sirbusby
        [Image: Screenshot_3.png]

        I added the following line to /etc/sudoers, and now it works:
        zabbix ALL=(ALL) NOPASSWD:/usr/sbin/smartctl

        From Zabbix Server
        zabbix_get -s 192.168.20.141 -p 10050 -k smart.disk.discovery
        Code:
        [{"{#NAME}":"sda sat","{#DISKTYPE}":"SSD","{#MODEL}":"Samsung SSD 860 EVO 500GB","{#SN}":"S4FNNF0N140769M"}]
        However, the item's last value shows VALUE instead of RAW_VALUE.

        [Image: Screenshot_6.png]
        I will try to figure this out further. Please keep sharing informative posts like this.

        www.myaarpmedicare.com
        Last edited by Schamberger; 22-03-2021, 06:10.


        • cstackpole
          Senior Member
          Zabbix Certified Specialist
          • Oct 2006
          • 225

          #6
          Greetings,

          First - Thank you for working on improving SMART data metrics!

          I've got loads of legacy servers and my big storage devices are packed with spinning rust. I rely pretty heavily on Zabbix and good SMART data. I've been using nobodysu's smartmontools template for a while (from Zabbix 4.x to 5.0).


          I decided to give the Zabbix one a try on my recent Zabbix 5.2 server install and provide feedback. I've attached 4 screenshots: one of Zabbix template's data, one of Zabbix template's triggers, one of nobodysu's template's data, and one of nobodysu's template's triggers. Unfortunately, my test system only has SSDs, so some of the useful items are blank.

          I'm listing out a few differences between them and where I would really like to see the Zabbix template improved.


          Data:
          Firmware version: 99% of the time I don't care about this, but it is helpful when an issue that is prematurely bricking drives is discovered. I filter on this data to help me know how to target/plan for firmware updates.

          Form Factor: I never care about this field

          Model family: This is actually useful in the big storage systems. I don't use it that much, but I do use it. Especially when I need to go grab a replacement disk for a failed drive in a RAID array.

          Rotation rate: I don't ever use it unless I'm trying to determine performance of a drive. At that point I could just drop to the command line. I've got drives between 5400 and 15K and many in between. It's nice to have but I could do without.

          SATA Version: SUPER IMPORTANT for me. For most drives this field has data like: "SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)". It really helps knowing if the drive is an old SATA, or SATA II, or a newer SATA III device. The revision has been useful to me in the past. But the biggest thing is knowing that it's a 6Gb/s device that is currently only at 3Gb/s. Is that because the computer SATA is an older version (in this case yes) or is it because the drive is failing? A change in speed is often an indicator of pre-fail; I've got a drive on my desk right now that passes SMART but under heavy writes it drops to 1.5Gb/s about an hour or so before the computer locks up!

          SMART Status: That's kinda implied already. I don't really use this field.

          User capacity: This has been very useful when a drive fails and it's not giving me any data at all but I need to know what size drive to replace it with before driving into the data center.


          Triggers

          Drive exit status codes are CRAZY useful. The exit codes for smartctl (all documented in the man page) are very comprehensive and very very useful for me. These trigger events alone are fantastic reasons to use nobodysu's template.
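For readers who have not used them, the smartctl exit status is a bit mask, so one number can report several conditions at once. The Python sketch below decodes it. The bit descriptions are paraphrased from the smartctl man page and should be checked against the installed version; `decode_exit_status` is a hypothetical helper, not something either template ships.

```python
# Bit meanings paraphrased from the "EXIT STATUS" section of the smartctl man page.
EXIT_STATUS_BITS = {
    0: "Command line did not parse",
    1: "Device open failed, or device did not identify itself, or it is in a low-power mode",
    2: "A SMART or other ATA command failed, or a SMART data structure had a checksum error",
    3: "SMART status check returned DISK FAILING",
    4: "Prefail attributes found at or below threshold",
    5: "SMART status OK, but some attributes were at or below threshold in the past",
    6: "The device error log contains records of errors",
    7: "The device self-test log contains records of errors",
}


def decode_exit_status(status):
    """Return the description of every bit set in a smartctl exit status."""
    return [text for bit, text in EXIT_STATUS_BITS.items() if status & (1 << bit)]


# 64 + 4 = bits 6 and 2 set: logged errors plus a failed command.
for line in decode_exit_status(68):
    print(line)
```

The `smartctl -j` output posted earlier in this thread carries the same number in `smartctl.exit_status` (0 in that sample), so it could in principle be decoded without a separate script.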

          SMART available but disabled : very useful on my older SCSI drives

          SMART unavailable - device lacks capacity : When bringing up older servers, I target these drives for removal. I use this as one of my first server checks.

          Current bandwidth is less than maximum supported by the drive : I find this very very useful.
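The same check can be sketched against the `interface_speed` object that appears in the `smartctl -j` output earlier in this thread. The field names come from that output; the degraded sample and the function names are assumptions for illustration only.

```python
def link_speed_bits_per_second(speed):
    """Convert one interface_speed entry (current or max) to bits per second."""
    return speed["bits_per_unit"] * speed["units_per_second"]


def is_below_max_speed(interface_speed):
    """Return True if the negotiated link speed is lower than the drive's maximum."""
    current = link_speed_bits_per_second(interface_speed["current"])
    maximum = link_speed_bits_per_second(interface_speed["max"])
    return current < maximum


# Values as posted for the Samsung SSD 860 EVO above: 6.0 Gb/s current and max.
healthy = {
    "current": {"bits_per_unit": 100000000, "units_per_second": 60},
    "max": {"bits_per_unit": 100000000, "units_per_second": 60},
}
# Hypothetical drive that negotiated 3.0 Gb/s on a 6.0 Gb/s capable link.
degraded = {
    "current": {"bits_per_unit": 100000000, "units_per_second": 30},
    "max": {"bits_per_unit": 100000000, "units_per_second": 60},
}
print(is_below_max_speed(healthy), is_below_max_speed(degraded))
```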

          The "has changed within the past 5 days" is VERY useful. If the Reallocated_Sector_Ct starts climbing dramatically, it's nice that I know I need to start planning a replacement before the drive fails on me. All of these "has changed" triggers are useful.

          This doesn't show up in the image, but the "Template is assigned, but no data recieved on {HOST.NAME}" is very useful too. Usually indicates an issue with smartctl or permissions.

          Things that this template does better:
          It's a lot faster. I've not tested on my big drive servers where >32 drives are common but I'm impressed with how much faster it is in gathering data.

          Needs a lot less package overhead. I didn't have to deploy the python package and the python scripts plus setting a bunch of SELinux permissions and sudoer permissions like I do with nobodysu's version. The Zabbix version just needed one line in sudoers.

          That's my feedback. Thank you for working on an integrated SMART check! I am SUPER grateful to everyone who has contributed to SMART metric monitoring in Zabbix!



          • Demin Sergey
            Demin Sergey commented
            Hello!
            Help please!!!

            zabbix_server (Zabbix) 5.4.
            OS:
            Windows 10
            Zabbix Agent 2 - 5.4.
            smartctl 7.1

            From Zabbix Server:
            zabbix_get -s 192.168.0.141 -p 10050 -k smart.disk.discovery

            Failed to scan for devices: Cannot unmarshal JSON: invalid character 'P' in string escape code..
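One guess at the cause: a JSON string may contain a backslash only as part of a valid escape sequence, so a Windows path such as `C:\Program Files\...` placed unescaped inside JSON fails at the `\P`. The Python sketch below reproduces that class of error. Whether this is what happens on the poster's machine is an assumption, and the path shown is hypothetical.

```python
import json

# Hypothetical JSON text containing a Windows path with single (unescaped) backslashes.
broken = r'{"path": "C:\Program Files\smartmontools\bin\smartctl.exe"}'
# The same text with every backslash doubled, which is how JSON expects it.
fixed = broken.replace("\\", "\\\\")


def parses(text):
    """Return True if the text is valid JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses(broken), parses(fixed))
print(json.loads(fixed)["path"])
```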
        • jkvint
          Junior Member
          • Mar 2019
          • 2

          #7
          Hello, how can I add devices that are not found by smart, such as megaraid and cciss?


          • miles
            Junior Member
            • May 2021
            • 1

            #8
            Hello!
            I'm using SMART monitoring in a Docker container, and I need help.
            Details are as follows:

            Problem encountered:
            When executing this inside the Zabbix agent 2 container, it returns:
            [Image: QQ拼音截图20210522215930.png]

            Zabbix agent:
            [Image: QQ拼音截图20210522220256.png]

            docker-compose.yml:
            [Image: QQ拼音截图20210522220257.png]

            smartmontools:
            After starting the container, I entered it and installed smartmontools with apk add smartmontools. This is its version:
            [Image: QQ拼音截图20210522220840.png]

            So I don't know what causes this status 127 when the agent executes smartctl, even though I can execute smartctl normally myself.
            I'm new to zabbix, and have no idea how to get more details about the reason it fails, and how to fix it.
            So, any help will be appreciated. Thanks!
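By shell convention, exit status 127 means the command was not found, which can happen when the process launching smartctl has a different PATH from an interactive shell. A minimal Python sketch of that check follows. It only illustrates the lookup, the search path shown is hypothetical, and it does not reproduce how the agent itself resolves the binary.

```python
import shutil


def find_executable(name, search_path=None):
    """Return the full path of name if it is found on the given PATH, else None."""
    return shutil.which(name, path=search_path)


# With search_path=None the current process's PATH is used.
print(find_executable("smartctl"))
# Hypothetical restricted PATH, standing in for a service environment.
print(find_executable("smartctl", search_path="/usr/local/bin"))
```

Running the equivalent lookup as the same user and in the same environment as the agent process would show whether smartctl is visible to it.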

            • tuxmartin
              Junior Member
              • Jan 2017
              • 12

              #9
              Originally posted by jkvint
              Hello, how can I add devices that are not found by smart, such as megaraid and cciss?
              Hi, I have the same problem.

              Code:
              # zabbix_agent2 -t smart.attribute.discovery
              smart.attribute.discovery [m|ZBX_NOTSUPPORTED] [Unknown metric smart.attribute.discovery]
              I need to run smartctl with the -d (--device) parameter: smartctl -a /dev/sda -d sat+megaraid,00
              How can I set -d in Zabbix?


              • tuxmartin
                Junior Member
                • Jan 2017
                • 12

                #10
                I increased agent timeout to 30 seconds and now it works (megaraid).



                • elemay
                  elemay commented
                  Out of curiosity, what did you do to make it recognize megaraid-attached devices? Did you just increase the timeout, or did you add something to the template?
              • dominicpratt
                Junior Member
                • Dec 2018
                • 14

                #11
                Can anyone point me in the right direction how I can disable some attributes?

                I know there are {$SMART.ATTRIBUTE.ID.MATCHES} and {$SMART.DISK.NAME.MATCHES} but I can't figure out what to put in there...


                • gergely.szocs
                  Junior Member
                  • Jun 2021
                  • 4

                  #12
                  Originally posted by dominicpratt
                  Can anyone point me in the right direction how I can disable some attributes?

                  I know there are {$SMART.ATTRIBUTE.ID.MATCHES} and {$SMART.DISK.NAME.MATCHES} but I can't figure out what to put in there...
                  Hello All,

                  I have the same problem. I would like to monitor SMART data only for sda and sdb. I configured things as shown below, but it did not solve the issue: it keeps monitoring sd[c-z]. Can you please help me? I have been stuck on this for many days.


                  - I created filters in both discovery rules, as shown below.

                  Disk discovery > Filters

                  {#NAME} matches {$SMART.DISK.NAME.MATCHES}

                  Attribute discovery > Filters

                  {#NAME} matches {$SMART.ATTRIBUTE.ID.MATCHES}

                  - After that I modified the existing macros on template level as per below.

                  SMART by Zabbix agent 2 > Macros

                  {$SMART.ATTRIBUTE.ID.MATCHES} (sd[a-b]).sat
                  {$SMART.DISK.NAME.MATCHES} (sd[a-b]).sat
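As a quick sanity check of the pattern itself, the Python sketch below applies `(sd[a-b]).sat` to disk names in the `sda sat` form returned by discovery earlier in this thread. It assumes the filter performs an unanchored regular-expression search and that Python's `re` module behaves like Zabbix's regex engine for this simple pattern; the list of names is made up.

```python
import re

# Hypothetical {#NAME} values in the form the discovery key returns.
DISK_NAMES = ["sda sat", "sdb sat", "sdc sat", "sdz sat"]
PATTERN = r"(sd[a-b]).sat"

# re.search finds the pattern anywhere in the name (unanchored match).
matching = [name for name in DISK_NAMES if re.search(PATTERN, name)]
print(matching)
```

In this sketch the pattern selects only sda and sdb, so if extra disks still appear, the pattern is probably not the part that is failing. Note also that the attribute discovery filter above compares {#NAME} against a macro named for attribute IDs, which may not be what the template intends.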


                  Thank you in advance,

                  Regards,
                  Gregory


                  • Mamukerka
                    Junior Member
                    • May 2021
                    • 2

                    #13
                    Hello! I recently upgraded my Zabbix server from 5.0 to 5.2.6 and then to 5.4. I want to use the out-of-the-box SMART monitoring.

                    After the update everything seems fine, but I can't add to Zabbix the new templates that would show the SMART parameters. I tried to download them from the Git branch; the templates there are in *.yaml format:



                    When I try to import one into Zabbix (Configuration >> Templates >> Import), I get an error: "Import failed. Cannot read YAML: Invalid YAML file contents". The version shown at the bottom of the Zabbix GUI is correct: 5.4.0. The host machine has Zabbix agent 2 and smartmontools 7.2 installed, and everything works fine.
                    In the manual I saw that the template must be in .xml format:



                    "Steps to ensure correct operation of templates that collect metrics with Zabbix agent 2:

                    1. Make sure that the agent 2 is installed on the host, and that the installed version contains required plugin. Follow the steps on this page if you need to update the agent 2.
                    2. Link the template to a target host (if the template is not available in your Zabbix installation, you may need to import the template's .xml file first - see Templates out-of-the-box section for instructions).
                    3. Adjust the values of mandatory macros as needed. Note, that user macros can be used to override configuration parameters.
                    4. Configure the instance being monitored to allow sharing data with Zabbix - see instructions in the Additional steps/comments column."


                    Sorry if this is too basic a question; maybe I'm just doing something wrong?


                    • gergely.szocs
                      Junior Member
                      • Jun 2021
                      • 4

                      #14
                      Originally posted by gergely.szocs

                      Hello All,

                      I have the same problem. I would like to monitor SMART data only for sda and sdb. I configured things as shown below, but it did not solve the issue: it keeps monitoring sd[c-z]. Can you please help me? I have been stuck on this for many days.


                      - I created filters in both discovery rules, as shown below.

                      Disk discovery > Filters

                      {#NAME} matches {$SMART.DISK.NAME.MATCHES}

                      Attribute discovery > Filters

                      {#NAME} matches {$SMART.ATTRIBUTE.ID.MATCHES}

                      - After that I modified the existing macros on template level as per below.

                      SMART by Zabbix agent 2 > Macros

                      {$SMART.ATTRIBUTE.ID.MATCHES} (sd[a-b]).sat
                      {$SMART.DISK.NAME.MATCHES} (sd[a-b]).sat


                      Thank you in advance,

                      Regards,
                      Gregory
                      Would you please help me with this issue? I am really stuck here, and I have already put a lot of effort into solving it, without success. Thanks in advance.


                      • kabassanov
                        Junior Member
                        • Jun 2021
                        • 2

                        #15
                        Hi,

                        In smartfs.go, I can see the following comment on the getRaidDevices function. However, a disk number does not always directly follow the previous one. Even worse, the numbering sometimes does not start at 0. This is true in particular when direct pd mapping is not allowed, as with DELL MD-1400 storage enclosures.
                        So this function will return partial information (or nothing).

                        // getRaidDevices sets raid device information returned by smartctl.
                        // Works by incrementing raid disk number till there is an error from smartctl.
                        // Sets device data to runner 'devices' field.
                        // If jsonRunner is true, sets raw json outputs to runner 'jsonDevices' map instead.
                        // It logs an error when there is an issue with getting or parsing results from smartctl.
                        func (r *runner) getRaidDevices(jsonRunner bool) { ...
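The difference between the two enumeration strategies can be shown with a short Python sketch. It is not the agent's Go code: `probe_until_error` only paraphrases the behaviour the comment describes, `probe_raid_ids` is one hypothetical alternative, and the set of present disk numbers and the `max_id` bound are made up.

```python
def probe_until_error(probe):
    """Count up from 0 and stop at the first disk number that does not respond."""
    found = []
    disk_id = 0
    while probe(disk_id):
        found.append(disk_id)
        disk_id += 1
    return found


def probe_raid_ids(probe, max_id=32):
    """Try every disk number from 0 to max_id and keep the ones that respond."""
    return [disk_id for disk_id in range(max_id + 1) if probe(disk_id)]


# Hypothetical enclosure whose disks are numbered 4, 5 and 9.
present = {4, 5, 9}


def responds(disk_id):
    """Stand-in for running smartctl against one RAID disk number."""
    return disk_id in present


print(probe_until_error(responds))
print(probe_raid_ids(responds, max_id=16))
```

Scanning a fixed range finds disks behind gaps, at the cost of one smartctl call per candidate number, which may be relevant to the timeout increase mentioned earlier in this thread.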
