Understanding Zabbix Problem Notifications

This topic has been answered.
  • pitchfork
    Junior Member
    • Aug 2025
    • 4

    #1

    Understanding Zabbix Problem Notifications

    Greetings

    I asked this question a few weeks ago, but unfortunately it and the answer (from cyber, I think) were both deleted when my account was temporarily blocked.

    So, one more time. I am trying to understand the actual meaning of some of the data reported in problem notifications.

    Here is an example. I see a notification entitled: Problem: Windows: 0 C:: Disk is overloaded (util > 95% for 15m)

    Problem started at 16:07:20 on 2025.09.23
    Problem name: Windows: 0 C:: Disk is overloaded (util > 95% for 15m)
    Host: WS2019-PROD
    Severity: Warning
    Operational data: 100 %
    Original problem ID: 48808

    and then I get the resolution email with the title: Resolved in 7m 0s: Windows: 0 C:: Disk is overloaded (util > 95% for 15m)

    Problem has been resolved at 16:14:20 on 2025.09.23
    Problem name: Windows: 0 C:: Disk is overloaded (util > 95% for 15m)
    Problem duration: 7m 0s
    Host: WS2019-PROD
    Severity: Warning
    Original problem ID: 48808

    Now my confusion. If the trigger fired because the disk was utilised at > 95% for 15 minutes, how does the "Problem duration" of 7 minutes come into play? Is it 7 minutes in excess of the 15 in the trigger, or something entirely different?

    TIA

    Nigel.
  • Answer selected by pitchfork at 29-09-2025, 08:17.
    cyber
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Dec 2006
    • 4806

    The problem expression defines the time period after which the trigger fires: that is the 15m. This is usually done when a value may be only temporarily over the threshold, to avoid the trigger flapping. For example, some process writes a temp file, fills the disk for a short time and then deletes it, and disk utilization goes back to normal. That may be perfectly OK behaviour, and you don't need to be informed about it. But if for some reason it fails to do so and disk utilization stays up, then you get a notification.
    That 15m is a sliding window. Your data flows in with every check (1-2m intervals), and the trigger is recalculated each time. In this example, after the problem had been active for 7 minutes, you received a value below the threshold (someone removed a file manually, or something else cleaned up the disk), your trigger evaluated to false and the "problem has been resolved". So yes, your disk was over the threshold for 22 minutes, but the first 15 of them were considered to be OK.

    EDIT: OK, I looked at the actual trigger. It is about physical disk idle time, so if that disk is busy for a short time, it's OK; if it stays busy for over 15m, you get notified. Same as described above for the disk filling up.
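
    To make the sliding window concrete, here is a minimal Python sketch. It is not Zabbix code. It assumes the trigger behaves roughly like "the minimum of the last 15 minutes of values is above 95", with one sample per minute, and it only starts evaluating once the window is full. The function name simulate and the sample data are made up for illustration.

```python
from collections import deque

THRESHOLD = 95.0      # percent, from "util > 95%"
WINDOW_SAMPLES = 15   # 15m window, ASSUMING one check per minute


def simulate(samples):
    """Return (start_minute, end_minute) pairs for each problem episode.

    `samples` holds one utilisation value per minute. A problem starts
    once the minimum of the last 15 samples exceeds the threshold, and
    ends at the first sample that drags that minimum back down.
    """
    window = deque(maxlen=WINDOW_SAMPLES)
    episodes = []
    started = None
    for minute, value in enumerate(samples):
        window.append(value)
        # Simplification: do not evaluate until 15 samples are available.
        window_full = len(window) == WINDOW_SAMPLES
        in_problem = window_full and min(window) > THRESHOLD
        if in_problem and started is None:
            started = minute                      # "Problem started at ..."
        elif not in_problem and started is not None:
            episodes.append((started, minute))    # "Problem has been resolved"
            started = None
    return episodes


# Hypothetical data: quiet disk, then 21 busy samples, then quiet again.
samples = [10.0] * 5 + [100.0] * 21 + [10.0] * 5
episodes = simulate(samples)
for start, end in episodes:
    print(f"problem from minute {start} to {end}, duration {end - start}m")
```

    With this made-up data the disk goes busy at minute 5, the problem only opens at minute 19 once the whole window is above the threshold, and it closes at minute 26 when the first low value arrives. The reported duration counts only the time the problem was open, not the busy period that led up to it.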
    Last edited by cyber; 25-09-2025, 10:29.

      • pitchfork
        Junior Member
        • Aug 2025
        • 4

        #3
        Thank you again for the explanation. Copied and saved JIC I get banned again <grin>
