Zabbix 6.4.13 - Disk Space Alerts Do Not Stay Active, Resolved but Unaddressed

This topic has been answered.
  • michael.armstrong
    Junior Member
    • May 2024
    • 13

    #1

    Hello,
    We recently replaced an older Zabbix 4.0 server (which served me well for years) with a new Zabbix 6.4.13 server.

    There is 1 major issue I can't seem to figure out with all disk space alerts.
    All disk space alerts ("low" or "critically low") clear/resolve some time later and never stay active as they did on the Zabbix 4.0 server.
    I have made no adjustments to the disk space alerts - this is the out-of-the-box template "Windows by Zabbix agent".

    This is a major problem for me because the disk space may still be critically low and need intervention, but the Zabbix interface doesn't show it.

    The alerts are
    Problem started at 22:45:08 on 2024.07.24
    Problem name: (C) Disk space is low (used > 80%)
    Host: SERVERNAME
    Severity: Warning
    Operational data: Space used: 81.71 GB of 99.46 GB (82.15 %)
    Original problem ID: 89723

    Problem has been resolved at 23:19:08 on 2024.07.24
    Problem name: (C) Disk space is low (used > 80%)
    Problem duration: 34m 0s
    Host: SERVERNAME
    Severity: Warning
    Original problem ID: 89723

    The space used on SERVERNAME remained at 82.15%, but the problem resolved itself.
    Why would active alerts get resolved and how can I fix this?

    Thanks for your time!
  • Answer selected by michael.armstrong at 29-07-2024, 15:26: post #9 by Markku.

    • 5h3rmz
      Junior Member
      • Jul 2024
      • 1

      #2
      I also am experiencing this issue on Windows Servers and Zabbix 6.4.15. I have tried updating the Windows Agent to match the Server Version and still have the issue. We don't see this issue with our Linux Servers being monitored by the same Zabbix server. Would also like to get this fixed.


      • vsergione
        Junior Member
        • Oct 2023
        • 28

        #3
        Can you please export the alert and post it here so we can take a look at it?


        • Markku
          Senior Member
          Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
          • Sep 2018
          • 1784

          #4
          Please post the actual trigger expression that this item currently has, otherwise we cannot know the exact configuration (because the templates are modified from version to version, for example your problem name does not seem to match the current 6.4 branch template).

          Markku


          • michael.armstrong
            Junior Member
            • May 2024
            • 13

            #5
            Thanks for the responses - I'm working on this and will report back as soon as I have more information.
            @5h3rmz - if you can get additional trigger information sooner, feel free to post it. I'm sure our triggers are the same since the versions are nearly the same (which makes me wonder if I need to update versions already).
            More information shortly...


            • michael.armstrong
              Junior Member
              • May 2024
              • 13

              #6
              OK - below is the expression for both "Disk space is low" and "Disk space is critically low" - both of which have the problem of clearing/resolving shortly after triggering.

              MACROS
              {$VFS.FS.PUSED.MAX.CRIT} = 90
              {$VFS.FS.PUSED.MAX.WARN} = 80

              EXPRESSION "Disk space is low"
              last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused])>{$VFS.FS.PUSED.MAX.WARN:"{#FSNAME}"} and
              ((last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},total])-last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},used]))<{$VFS.FS.FREE.MIN.WARN:"{#FSNAME}"} or timeleft(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused],1h,100)<1d)

              EXPRESSION "Disk space is critically low"
              last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused])>{$VFS.FS.PUSED.MAX.CRIT:"{#FSNAME}"} and
              ((last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},total])-last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},used]))<{$VFS.FS.FREE.MIN.CRIT:"{#FSNAME}"} or timeleft(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused],1h,100)<1d)

              [Attached image: screenshot.jpg]
              I am still reviewing the request to export the alert. Thanks everyone!
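
One plausible reading of why the expressions above can clear while the disk is still over the percentage threshold: the `pused` check is combined with `and (free < FREE.MIN or timeleft < 1d)`, so once the disk stops growing and the free space is still above the minimum, the second half turns false. The Python sketch below only models that boolean logic. The 10 GB value for `{$VFS.FS.FREE.MIN.WARN}` and the use of infinity for a flat `timeleft()` forecast are assumptions made for illustration, not values taken from this installation.

```python
from dataclasses import dataclass

DAY_SECONDS = 86_400
GB = 1024 ** 3


@dataclass
class DiskSample:
    pused: float             # percent of the volume in use
    total_bytes: float
    used_bytes: float
    timeleft_seconds: float  # stand-in for timeleft(...,1h,100)


def original_trigger(sample, pused_max_warn=80.0, free_min_warn_bytes=10 * GB):
    """Model of: last(pused) > WARN and (total - used < FREE.MIN or timeleft < 1d).

    free_min_warn_bytes is a hypothetical value; check the macro on your host.
    """
    free_bytes = sample.total_bytes - sample.used_bytes
    return sample.pused > pused_max_warn and (
        free_bytes < free_min_warn_bytes
        or sample.timeleft_seconds < DAY_SECONDS
    )


# Disk still filling up: the forecast says 100% is reached within a day.
growing = DiskSample(82.15, 99.46 * GB, 81.71 * GB, timeleft_seconds=6 * 3600)
# Disk usage flat at the same level: the forecast never reaches 100%.
flat = DiskSample(82.15, 99.46 * GB, 81.71 * GB, timeleft_seconds=float("inf"))

print(original_trigger(growing))
print(original_trigger(flat))
```

Under these assumptions the problem fires while usage is climbing and resolves once usage levels off, even though `pused` never dropped below 80%.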


              • michael.armstrong
                Junior Member
                • May 2024
                • 13

                #7
                I'm not finding an easy way to export the alert as requested - I need some advice on that one.
                I already posted the details from the alert email, and I found 2 new examples in the history from last night.
                They clearly show that the alerts resolved in under 10 minutes while the actual free disk space remained under 20%.
                [Attached image: screenshot2.jpg]
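
One possible way to get the trigger definition in a postable form is the Zabbix JSON-RPC API method `trigger.get`. The sketch below only builds the request body and does not send it. The helper name `build_trigger_get_request` is hypothetical, and authentication (an API token or session, depending on the Zabbix version) is deliberately left out and must be added as the API documentation for your version describes.

```python
import json


def build_trigger_get_request(host, name_fragment, request_id=1):
    """Build a JSON-RPC body for trigger.get (hypothetical helper, no network I/O)."""
    return {
        "jsonrpc": "2.0",
        "method": "trigger.get",
        "params": {
            # Only triggers that belong to this host.
            "host": host,
            # Substring match on the trigger name.
            "search": {"description": name_fragment},
            "output": ["triggerid", "description", "expression", "opdata"],
            # Return the expression in its readable, expanded form.
            "expandExpression": True,
        },
        "id": request_id,
    }


request_body = build_trigger_get_request("SERVERNAME", "Disk space is low")
print(json.dumps(request_body, indent=2))
```

The printed JSON would be sent as a POST to the frontend's `api_jsonrpc.php` endpoint, and the `expression` field in the reply is the text that can be pasted into a forum post.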


                • michael.armstrong
                  Junior Member
                  • May 2024
                  • 13

                  #8
                  I've also glanced through bug fixes for versions above 6.4.13 on https://www.zabbix.com/release_notes and didn't see anything for vfs, disk, space or utilization that stood out.


                  • Markku
                    Senior Member
                    Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
                    • Sep 2018
                    • 1784

                    #9
                    The latest 6.4 "Windows by Zabbix agent" template at https://git.zabbix.com/projects/ZBX/...Frelease%2F6.4 has a simpler expression:

                    min(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused],5m)>{$VFS.FS.PUSED.MAX.WARN:"{#FSLABEL}({#FSNAME} )"}

                    Markku
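
To illustrate how the `min(...,5m)>threshold` form behaves, here is a small Python model, assuming one sample per minute. The function name `min_window_trigger` is hypothetical, and returning False for an empty window is a simplification (Zabbix handles missing data differently).

```python
def min_window_trigger(samples, threshold, now, window_seconds=300):
    """Model of min(item,5m) > threshold.

    samples is a list of (timestamp_seconds, pused) tuples.
    """
    recent = [value for ts, value in samples if now - window_seconds < ts <= now]
    if not recent:
        # Simplification: treat "no data in the window" as not firing.
        return False
    return min(recent) > threshold


# Usage flat at 82.15% for ten minutes, one sample per minute.
steady = [(t, 82.15) for t in range(0, 600, 60)]
# Same history, then the disk is extended and usage drops to 65%.
after_extend = steady + [(600, 65.0)]

print(min_window_trigger(steady, 80.0, now=540))
print(min_window_trigger(after_extend, 80.0, now=600))
```

In this model the problem stays active for as long as every sample in the last five minutes is above the threshold, and it clears as soon as a single sample in the window falls to or below it.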


                    • michael.armstrong
                      Junior Member
                      • May 2024
                      • 13

                      #10
                      I will see if I can simply disable the original triggers and set 2 new triggers and report back after some testing. Thanks!


                      • michael.armstrong
                        Junior Member
                        • May 2024
                        • 13

                        #11
                        Updated; I will now let it run for a while.

                        MACROS
                        {$VFS.FS.PUSED.MAX.CRIT} = 90
                        {$VFS.FS.PUSED.MAX.WARN} = 80

                        Names:
                        {#FSLABEL}({#FSNAME}): Disk space is critically low (v2)
                        {#FSLABEL}({#FSNAME}): Disk space is low (v2)

                        Expressions
                        (critically low): min(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused],5m)>{$VFS.FS.PUSED.MAX.CRIT:"{#FSLABEL}({#FSNAME} )"}
                        (low): min(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused],5m)>{$VFS.FS.PUSED.MAX.WARN:"{#FSLABEL}({#FSNAME} )"}

                        [Attached image: screenshot3.jpg]


                        • michael.armstrong
                          Junior Member
                          • May 2024
                          • 13

                          #12
                          Alerts are coming in and they're accurately showing the active disk space alerts.
                          Also, the alert resolved once I extended the disk and free space rose above 20%. (hooray!)

                          BUT there is something missing with the Operational data that I need to decipher... I just copy/pasted from the other discovery trigger prototypes and the alert shows like this:

                          Alert:
                          Problem started at 10:27:07 on 2024.07.26
                          Problem name: Application(F: Disk space is low (v2) (used > 80%)
                          Host: SERVERNAME
                          Severity: Warning
                          Operational data: Space used: *UNKNOWN* of *UNKNOWN* (80.4 %) ** Area I have to review
                          Original problem ID: 93741

                          Resolved looks good (after I extended the disk):
                          Problem has been resolved at 10:32:07 on 2024.07.26
                          Problem name: Application(F: Disk space is low (v2) (used > 80%)
                          Problem duration: 5m 0s
                          Host: SERVERNAME
                          Severity: Warning
                          Original problem ID: 93741

                          So far - an ideal direction!


                          • michael.armstrong
                            Junior Member
                            • May 2024
                            • 13

                            #13
                            (edit - removed hyperlinks)

                            Small added realization -
                            I just tested a Linux VM, and on my installation the problem is the same with the "Linux by Zabbix agent" template, @5h3rmz.
                            Maybe there were some improvements in your version or you're using a different template.

                            The default trigger for this template is

                            last(/Linux by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused])>{$VFS.FS.PUSED.MAX.WARN:"{#FSNAME}"} and ((last(/Linux by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},total])-last(/Linux by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},used]))<{$VFS.FS.FREE.MIN.WARN:"{#FSNAME}"} or timeleft(/Linux by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused],1h,100)<1d)

                            So also lengthy... I realized this because the Zabbix server itself also had a nearly full volume that I didn't catch until now... Sheesh...


                            • michael.armstrong
                              Junior Member
                              • May 2024
                              • 13

                              #14
                              The only item remaining is the Operational Data conundrum.
                              I can't seem to find enough information on where the variables are defined or why they wouldn't work on the new trigger prototypes when only the expression is changed. (all other settings cloned)

                              Space used: {ITEM.LASTVALUE3} of {ITEM.LASTVALUE2} ({ITEM.LASTVALUE1})
                              shows as:
                              Operational data: Space used: *UNKNOWN* of *UNKNOWN* (81.63 %)

                              Therefore {ITEM.LASTVALUE1} works, but not {ITEM.LASTVALUE2} or {ITEM.LASTVALUE3}

                              I tried a rediscovery execution, and even removed a host and re-added it as new.
                              Any thoughts while I keep looking?

                              Thanks again - this expression update was exactly what I wanted for functionality similar to the older Zabbix...
                              (I wonder who finds the original expressions useful?)


                              • Markku
                                Senior Member
                                Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
                                • Sep 2018
                                • 1784

                                #15
                                https://www.zabbix.com/documentation...ASTVALUE#items says:

                                The latest value of the Nth item in the trigger expression that caused a notification.
                                In other words, since your trigger only has one item, there is no ITEM.LASTVALUE2 macro value available.

                                You may want to consider just importing the updated template from the repository and not changing all details manually.

                                Markku
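
The point about the Nth item can be sketched in Python. This is an illustration only, not how Zabbix is implemented: the helper names are hypothetical, the regular expression only handles item references of the form seen in this thread, the threshold macros are written without their context part for brevity, and repeated references to the same item are simply collapsed to the first appearance.

```python
import re

# Matches "/host/key[params]" references as they appear in this thread.
ITEM_REF = re.compile(r"/[^/]+/([\w.]+\[[^\]]*\])")
MACRO = re.compile(r"\{ITEM\.LASTVALUE([1-9])\}")


def item_keys(expression):
    """Distinct item keys in order of first appearance."""
    keys = []
    for key in ITEM_REF.findall(expression):
        if key not in keys:
            keys.append(key)
    return keys


def resolve_opdata(template, expression, last_values):
    """Replace {ITEM.LASTVALUE<N>} with the Nth item's value, else *UNKNOWN*."""
    keys = item_keys(expression)

    def substitute(match):
        index = int(match.group(1)) - 1
        if index >= len(keys):
            return "*UNKNOWN*"
        return last_values.get(keys[index], "*UNKNOWN*")

    return MACRO.sub(substitute, template)


PREFIX = "/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},"
OLD = (
    "last(" + PREFIX + "pused])>{$VFS.FS.PUSED.MAX.WARN} and ((last("
    + PREFIX + "total])-last(" + PREFIX + "used]))<{$VFS.FS.FREE.MIN.WARN}"
    " or timeleft(" + PREFIX + "pused],1h,100)<1d)"
)
NEW = "min(" + PREFIX + "pused],5m)>{$VFS.FS.PUSED.MAX.WARN}"
VALUES = {
    "vfs.fs.dependent.size[{#FSNAME},pused]": "82.15 %",
    "vfs.fs.dependent.size[{#FSNAME},total]": "99.46 GB",
    "vfs.fs.dependent.size[{#FSNAME},used]": "81.71 GB",
}
OPDATA = "Space used: {ITEM.LASTVALUE3} of {ITEM.LASTVALUE2} ({ITEM.LASTVALUE1})"

print(resolve_opdata(OPDATA, OLD, VALUES))
print(resolve_opdata(OPDATA, NEW, VALUES))
```

With the original three-item expression all three macros resolve, while with the single-item `min()` expression only `{ITEM.LASTVALUE1}` has an item to refer to.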
