Disk Space Trigger problems (falsely resolving)

  • cszikszoy
    Junior Member
    • Mar 2020
    • 7

    #1

    Disk Space Trigger problems (falsely resolving)

    I'm observing some strange behavior related to the "Disk Space" triggers for both Linux and Windows hosts. Examining the trigger shows the following conditions:
    [Image: Screenshot from 2020-06-19 13-04-13.png]
    The description of the trigger says:
    Two conditions should match: First, space utilization should be above {$VFS.FS.PUSED.MAX.CRIT:"{#FSNAME}"}.
    Second condition should be one of the following:
    - The disk free space is less than 5G.
    - The disk will be full in less than 24 hours.
    The first and second conditions are straightforward, but I'm looking for some help with the 3rd condition.

    What is the purpose / function of this? It seems to not be working correctly because what I'm seeing is the following:
    1. Host with high disk usage has this trigger asserted
    2. Trigger resolves when disk usage does not increase further, but is still above threshold
    Consider the following event for a particular server:
    [Image: Screenshot from 2020-06-19 13-08-14-scrub.png]
    The event is shown being triggered and cleared multiple times. However, the graph of the data shows this:
    [Image: chart-scrub.png]
    Every spike on the graph corresponds to the trigger being asserted. However, the trigger is resolved when the disk usage stops increasing. It seems like this shouldn't happen. The disk usage never dropped below the threshold (90%), so the trigger should never resolve. I suspect it's the 3rd expression that's causing the trigger to no longer evaluate to true, but why?
  • tim.mooney
    Senior Member
    • Dec 2012
    • 1427

    #2
    Originally posted by cszikszoy
    Every spike on the graph corresponds to the trigger being asserted. However, the trigger is resolved when the disk usage stops increasing. It seems like this shouldn't happen. The disk usage never dropped below the threshold (90%), so the trigger should never resolve. I suspect it's the 3rd expression that's causing the trigger to no longer evaluate to true, but why?

    The logic for the triggers is "A and (B or C)", so the first condition must be true and at least one of the 2nd or 3rd conditions must also be true for the trigger to fire.
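
The "A and (B or C)" combination can be sketched in a few lines of Python. This is only an illustration of the boolean logic described in this reply; the function name `trigger_fires` is hypothetical, and the real trigger is a Zabbix expression, not Python.

```python
def trigger_fires(a: bool, b: bool, c: bool) -> bool:
    """Illustrative only: A must hold, plus at least one of B or C.

    a: utilization is above the critical threshold
    b: free space is below the fixed limit (5G in the description)
    c: the disk is predicted to be full within 24 hours
    """
    return a and (b or c)


# During a usage spike: above threshold, plenty of free space, fast growth.
print(trigger_fires(True, False, True))   # True

# After growth stops: still above threshold, but neither B nor C holds.
print(trigger_fires(True, False, False))  # False
```

The second call is the situation in the original question: condition A alone is not enough to keep the trigger in the problem state.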

    Condition A is true for all of the graph you show.

    Condition B is only true for a short period, when utilization is very close to 100%. Most of the time, it is not true, so it does not contribute to the trigger.

    Condition C is true whenever there's a brief spike in utilization, because the predictive nature of the trigger is looking at growth from a 1 hour time period. Once the growth subsides and you're back to a steady state, Condition C becomes false again, so at that point A is still true, B is false, and C is back to being false, so you have "true and (false or false)", which is false, so the trigger resolves.
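
To see why condition C flips back to false once usage levels off, here is a rough sketch of a "time until full" estimate. It assumes a plain least-squares line fitted to the last hour of samples; that is an assumption made for illustration, and Zabbix's own forecasting may work differently. The names `hours_until_full`, `growing`, and `flat`, and the sample values, are all hypothetical.

```python
import math


def hours_until_full(samples, full=100.0):
    """Estimate hours until usage reaches `full` percent.

    samples: list of (time_in_hours, percent_used) pairs, oldest first.
    Fits a least-squares line (an assumed model, not Zabbix's code) and
    returns math.inf when usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        return math.inf
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        return math.inf
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / denom
    if slope <= 0:
        return math.inf  # no growth, so the disk is never predicted to fill
    intercept = mean_u - slope * mean_t
    last_t = samples[-1][0]
    return max((full - intercept) / slope - last_t, 0.0)


# One hour of samples during a spike: usage climbs 2 points per hour.
growing = [(0.0, 92.0), (0.5, 93.0), (1.0, 94.0)]
# One hour of samples after the spike: usage is steady at 94%.
flat = [(0.0, 94.0), (0.5, 94.0), (1.0, 94.0)]

for label, window in (("growing", growing), ("flat", flat)):
    a = window[-1][1] > 90.0               # condition A: above 90%
    b = False                              # condition B: assume > 5G free
    c = hours_until_full(window) < 24.0    # condition C: full within 24 h
    print(label, a and (b or c))
```

With the growing window the estimate is about 3 hours, so C holds and the trigger fires. With the flat window the slope is zero, the estimate is infinite, C is false, and the trigger resolves even though usage is still at 94%.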


    • cszikszoy
      Junior Member
      • Mar 2020
      • 7

      #3
      Originally posted by tim.mooney
      Condition C is true whenever there's a brief spike in utilization, because the predictive nature of the trigger is looking at growth from a 1 hour time period. Once the growth subsides and you're back to a steady state, Condition C becomes false again, so at that point A is still true, B is false, and C is back to being false, so you have "true and (false or false)", which is false, so the trigger resolves.
      I see - this is the condition that's clearing the trigger. I think I understand the purpose of it, essentially "reset the trigger if disk usage has stopped increasing".

      Originally posted by splitek
      Thank you - those were informative.
