How to create an alert for a specific filesystem

This topic has been answered.
  • adminjerry
    Junior Member
    • May 2022
    • 18

    #1

    How to create an alert for a specific filesystem

    Normally, I have all of my servers alerting for low space at 90% and critical space at 95%. This is done by setting the macros in the Linux by Zabbix agent template. Let's say I have serverX with /data and I want it to alert at 96% and 99% instead. I have looked at various posts and tutorials but cannot figure out how to do this. Granted, my understanding of LLD and discovery is limited. Can someone please provide the steps I could take to create this type of alert? I've spent all day on this and still do not have a workable solution.
    Thanks.
  • Answer selected by adminjerry at 29-12-2022, 16:48.

    • tim.mooney
      Senior Member
      • Dec 2012
      • 1427

      #2
      You don't say what version of Zabbix you're using. That may be relevant, as the templates have changed over the years to have additional functionality.

      Assuming the "Linux by Zabbix agent" template you're using is this one: https://git.zabbix.com/projects/ZBX/...lates/os/linux

      You probably want to read the comments about the triggers, and then, on the macros page for that specific host (serverX), look at adjusting:

      Code:
      {$VFS.FS.PUSED.MAX.CRIT:"/data"} => 99.0
      {$VFS.FS.PUSED.MAX.WARN:"/data"} => 96.0
      Because of the way the templates are written, you may also need to adjust the macros for the amount of free space too. Looking at the triggers and the documentation for the templates is recommended.
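
If many hosts need the same overrides, the two context macros above could also be created through the Zabbix API rather than the web UI. The following is only a sketch: it builds the JSON-RPC request bodies for the `usermacro.create` method and does not send them. The host ID `10084` is a made-up placeholder, and authentication is deliberately left out because how the token is passed depends on the Zabbix version.

```python
import json


def build_macro_requests(hostid, fsname, warn, crit):
    """Build JSON-RPC bodies that create per-filesystem context macros.

    hostid is a placeholder supplied by the caller; look up the real ID
    (for example with host.get) before using these bodies.
    Authentication is intentionally omitted from the request bodies.
    """
    macros = {
        f'{{$VFS.FS.PUSED.MAX.WARN:"{fsname}"}}': warn,
        f'{{$VFS.FS.PUSED.MAX.CRIT:"{fsname}"}}': crit,
    }
    return [
        {
            "jsonrpc": "2.0",
            "method": "usermacro.create",
            "params": {"hostid": hostid, "macro": name, "value": str(value)},
            "id": request_id,
        }
        for request_id, (name, value) in enumerate(macros.items(), start=1)
    ]


if __name__ == "__main__":
    # Hypothetical host ID; /data thresholds taken from the post above.
    for body in build_macro_requests("10084", "/data", 96.0, 99.0):
        print(json.dumps(body))
```

The context part of the macro name (`"/data"`) is what limits the override to that one filesystem; other filesystems on the host keep the template defaults.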


      • adminjerry
        Junior Member
        • May 2022
        • 18

        #3
        Thank you for the response. I am using version 6.2.3. The code is what I needed. Since I had several servers with the same filesystems that needed to be changed, I created the macros in the Linux by Zabbix agent template. I tested it and it worked great.


        • yurtesen
          Senior Member
          • Aug 2008
          • 130

          #4
          tim.mooney and adminjerry: the trigger is faulty and will never work. When you have a large partition that is 98% full, Zabbix will still think there is too much free space and the trigger will not warn you. I reported this over a year ago and nothing has been done to fix this very serious issue.

          If you check the Zabbix 6.2 template -> https://git.zabbix.com/projects/ZBX/...Frelease%2F6.2
          The description of the trigger is very clear:
          Two conditions should match:

          1. The first condition - utilization of space should be above {$VFS.FS.PUSED.MAX.CRIT:"{#FSNAME}"}.

          2. The second condition should be one of the following:

          - the disk free space is less than {$VFS.FS.FREE.MIN.CRIT:"{#FSNAME}"};

          - the disk will be full in less than 24 hours.
          And the trigger expression:
          last(/Linux by Zabbix agent/vfs.fs.size[{#FSNAME},pused])>{$VFS.FS.PUSED.MAX.CRIT:"{#FSNAME}"} and ((last(/Linux by Zabbix agent/vfs.fs.size[{#FSNAME},total])-last(/Linux by Zabbix agent/vfs.fs.size[{#FSNAME},used]))<{$VFS.FS.FREE.MIN.CRIT:"{#FSNAME}"} or timeleft(/Linux by Zabbix agent/vfs.fs.size[{#FSNAME},pused],1h,100)<1d)
          The problem is that the trigger uses TOTAL - USED to find the FREE space. But this is absurd, because on Linux there is reserved space (5%) which is NOT included in "USED". For a 1000GB drive, 5% is 50GB. So when the disk is "full", TOTAL - USED will be 50GB, which is larger than {$VFS.FS.FREE.MIN.CRIT} (5GB) and {$VFS.FS.FREE.MIN.WARN} (10GB). Therefore the trigger will never work, as the 2nd part of the expression will never be "true".
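
The arithmetic can be checked with a few lines of Python. This is a simplified model of only the free-space comparison in the expression, `(total - used) < {$VFS.FS.FREE.MIN.*}`; the `timeleft()` branch is not modelled, and the function name is made up for this illustration. All sizes are in GB.

```python
def free_space_branch(total_gb, used_gb, free_min_gb):
    """Model of the comparison (total - used) < {$VFS.FS.FREE.MIN.*}.

    Only this one comparison is modelled; the timeleft() alternative in
    the real trigger expression is ignored here.
    """
    return (total_gb - used_gb) < free_min_gb


if __name__ == "__main__":
    total_gb = 1000
    reserved_gb = total_gb * 5 // 100      # 5% reserved blocks -> 50 GB
    used_gb = total_gb - reserved_gb       # disk is "full" for normal users

    # total - used is still 50 GB, so neither threshold is crossed.
    print(free_space_branch(total_gb, used_gb, free_min_gb=5))   # CRIT
    print(free_space_branch(total_gb, used_gb, free_min_gb=10))  # WARN
```

Both calls print `False`: with 50GB reserved, the computed "free" space never drops below the 5GB or 10GB thresholds, even though no space is left for ordinary users.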

          Here the problem is described in detail. You can vote for it if you want it to be fixed.



          • tim.mooney
            tim.mooney commented
            The general issue you're describing for Linux is real; that's why I mentioned adjusting the other macro values too.

            Keep in mind that minfree isn't always 5%. You can lower it when creating the filesystem (I routinely set it to 1 or 2 percent for large volumes), you can lower it after creation for some filesystem types, and some filesystem types use different defaults for minfree depending upon filesystem size.

          • yurtesen
            yurtesen commented
            tim.mooney it is irrelevant whether people set reserved blocks to 1% or 10%. The official Zabbix Linux template must function regardless of what the setting is. In particular, it must function properly with the default Linux ext4 settings. Saying that you use custom settings does not really change the nature of the problem.

            Yes, you did mention adjusting the other macro values too. But with only that information, it is impossible for the OP to adjust them correctly, because you did not tell him that he should check the reserved blocks for the partition, convert that into GB and add it to the macro values. This is also not written in the official Zabbix Linux template documentation. Well, things obviously should not work like this. People should not need to do math when setting these values.

            You seem to assume you set things correctly after changing the percentage, but can you ever be sure? Will you always remember which machine has 1%, 2% or 5%? This can become incredibly complex if you have many machines, many sysadmins and 3rd parties involved.

            If you have a 1000GB ext4 partition and have set the reserved block percentage to 2%, then you have 20GB reserved. With the default settings of $VFS.FS.FREE.MIN.CRIT and $VFS.FS.FREE.MIN.WARN, you will not get any warnings even at 2%! So to get a warning at 10GB available and a critical error at 5GB available, you will need to set the values to 30GB and 25GB. On the next machine, you have a 1500GB partition and you decide to set it to 1%, so you need to set the macros to 25GB and 20GB. You will need to keep checking the partitions and doing math to set the correct values, when this could be avoided if the default Zabbix Linux template were fixed.
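
The per-host math described above is simple but easy to get wrong by hand, so here is a small sketch of it. The function name is invented for this example, and it assumes the reserved space is exactly the given percentage of the total size; on a real system the reserved block count should be read from the filesystem itself.

```python
def adjusted_free_min_gb(total_gb, reserved_percent, desired_free_gb):
    """Value to put in {$VFS.FS.FREE.MIN.*} so the trigger fires when
    desired_free_gb is actually available to normal users.

    Assumes reserved space = total_gb * reserved_percent / 100.
    """
    reserved_gb = total_gb * reserved_percent / 100
    return reserved_gb + desired_free_gb


if __name__ == "__main__":
    # (total size in GB, reserved block percentage) from the examples above
    for total_gb, reserved_percent in [(1000, 2), (1500, 1)]:
        warn = adjusted_free_min_gb(total_gb, reserved_percent, 10)
        crit = adjusted_free_min_gb(total_gb, reserved_percent, 5)
        print(f"{total_gb}GB at {reserved_percent}%: WARN={warn}GB CRIT={crit}GB")
```

For the two machines in the example this gives 30GB/25GB and 25GB/20GB, matching the figures above.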

            The next sysadmin in your company looks at these hosts and wonders WTF is going on, because he does not know what you did with the reserved blocks percentage. He could easily assume that somebody made a mistake by setting different values, and then erroneously set the WARN/CRIT values to be the same on 2 machines doing a similar job... In addition, the monitored machine may be managed by a 3rd party. One day a 3rd-party administrator comes along and sets the reserved blocks to 10% for whatever reason without telling you. What will happen then?
            Last edited by yurtesen; 27-01-2023, 15:22.

          • tim.mooney
            tim.mooney commented
            You're right that I didn't provide excruciating detail about the other macros, just a callout to watch out for them.

            I'm not disputing any of your issues with this particular template or trigger. My site doesn't use this template, and we specifically avoid some of the things this template does in our own templates and triggers.