Baseline Monitoring

  • nvitaly007
    Junior Member
    • Jan 2013
    • 11

    #1

    Baseline Monitoring

    Hello,

    This has probably been discussed before, and a ZBXNEXT issue should probably be created, but I could not find anything searching for the keywords "Baseline Monitoring".

    I have uplink interface graphs (bytes, flows, packets) that follow the same pattern depending on the day of the week. I want Zabbix to calculate a historical mean and standard deviation for each point based on, say, the last month of usage, so I can test whether the current measurement is within the standard deviation and alarm if it is not.

    The functions could be:

    1. h_weekly_mean(days, window)

    This function would take the average of values within a window (e.g. 10 minutes) around the same time and day of the week as the current reading, and return their mean.

    2. h_weekly_stddev(days, window)

    This function would take the average of values within a window (e.g. 10 minutes) around the same time and day of the week as the current reading, and return their standard deviation.

    These two functions could be combined into one, so the result is returned right away and DB operations are saved.
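As a sketch of what these proposed functions might compute (the function name, the sample format, and the defaults are all my assumptions, not anything that exists in Zabbix):

```python
from datetime import datetime, timedelta
from statistics import mean, stdev

def weekly_window_stats(history, now, days=28, window_min=10):
    """history: list of (datetime, float) samples.
    Collect samples within a +/- (window_min/2)-minute window around the
    same weekday/time-of-day as `now`, for each week over the past `days`
    days, and return (mean, stddev) of those samples."""
    half = timedelta(minutes=window_min / 2)
    picked = []
    for weeks_back in range(1, days // 7 + 1):
        center = now - timedelta(weeks=weeks_back)
        lo, hi = center - half, center + half
        picked.extend(v for t, v in history if lo <= t <= hi)
    if len(picked) < 2:
        return None  # not enough history to estimate a deviation
    return mean(picked), stdev(picked)
```

A trigger would then compare the current reading against `mean ± k * stddev` for some chosen k.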

    Questions:

    Has anyone done this, or is anyone currently working on something similar?
    Would it be useful to anyone?
    Do you have better ideas?


    Depending on the answers, I can start working on it myself.

    Vitaly
  • danrog
    Senior Member
    • Sep 2009
    • 164

    #2
    You can implement the first function using calculations that already exist in Zabbix:

    Example (this is an older trigger, from before 2.0 came along):

    {Template OS Linux:system.cpu.load[,avg1].min(120)}>({Template OS Linux:system.cpu.load[,avg1].max(3600,604740)}+3)

    You can replace max with avg or min. This expression takes the maximum value found in a 1-hour window (3600 seconds) ending one week minus one minute ago (a 604740-second time shift). Since the window is rolling, it always checks 'now - 604740 seconds'. To use this (and this is the key part) you need to make sure your item retains HISTORICAL data (not trend data) for the period you are checking, so basically set the item's history retention to n+1 days, where n is the number of days you look back.

    I haven't tested how 'expensive' this trigger is in terms of DB I/O or CPU, but if you limit the number of these types of triggers you should be fine.
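In plain code, the rolling comparison that trigger performs looks roughly like this (a sketch of the logic only, not Zabbix internals; the function name and sample format are made up for illustration):

```python
def rolling_trigger(samples, now, threshold=3.0):
    """samples: list of (unix_ts, value).
    Mimics the trigger above: fire when the minimum over the last
    120 seconds exceeds the maximum over the 1-hour window that
    ended one week minus one minute ago, plus `threshold`."""
    shift = 604740  # one week minus one minute, in seconds
    recent = [v for t, v in samples if now - 120 <= t <= now]
    baseline = [v for t, v in samples
                if now - shift - 3600 <= t <= now - shift]
    if not recent or not baseline:
        return False  # missing history: cannot evaluate
    return min(recent) > max(baseline) + threshold
```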


    • Alexander Podkopaev
      Junior Member
      • Mar 2014
      • 3

      #3
      Rolling data is a BAD baseline - it changes over time!

      The main reason (at least for my team) for baselining is automated identification of slow, steady changes in a measured item.
      For example, if you have a small increase in memory footprint, say +10 MB per day, you will miss it with a trigger configured as 'compare to last week minus 1 minute, alert if the new value is greater by more than 100 MB'.
      Yes, you'll probably see it on the monthly graph and be surprised, and I bet you'll be disappointed by such a trigger.
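A quick simulation (with made-up numbers) shows why the rolling window misses the leak while a fixed baseline catches it:

```python
# A +10 MB/day leak, sampled once per day for 60 days.
daily = [1000 + 10 * d for d in range(60)]  # memory footprint in MB

# Rolling baseline: compare each day to the same day last week.
rolling_alerts = [d for d in range(7, 60) if daily[d] - daily[d - 7] > 100]

# Fixed baseline: compare each day to the frozen day-0 value.
fixed_alerts = [d for d in range(60) if daily[d] - daily[0] > 100]

# The weekly delta is always 70 MB, so the rolling trigger never fires;
# the fixed baseline crosses 100 MB on day 11.
```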

      So, my idea of 'good baselining' is to have fixed, configured periods:
      • 7 days for a weekly baseline
      • 31 days for a monthly baseline
      • 365 days for an annual baseline


      Then an API function should be created to mark a week/month/year as the baseline, or to copy baseline data into dedicated tables.
      After that, there should be new functions for retrieving baseline data, because triggers would roll over the same data again and again. Something like w_baseline(key) - get the item identified by key from the weekly baseline.
      A possible issue here is the item's polling frequency changing. I think it's better to 'look back' a little and take the closest previous measurement.
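That 'look back a little' lookup could be sketched like this (baseline_lookup is a hypothetical helper, not an existing Zabbix function):

```python
import bisect

def baseline_lookup(times, values, t):
    """Return the baseline measurement at or closest before time t.
    times: sorted ascending timestamps; values: aligned measurements.
    Looking backwards makes the lookup robust when the item's polling
    frequency changes and no sample exists at exactly t."""
    i = bisect.bisect_right(times, t)
    if i == 0:
        return None  # nothing recorded at or before t
    return values[i - 1]
```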

      As I didn't find anything resembling my baseline approach either, we plan to work on this next month.

      SY, Alexander


      • coreychristian
        Senior Member
        Zabbix Certified Specialist
        • Jun 2012
        • 159

        #4
        Outside of calculations in triggers, you can also use calculated items to create the baselines you are looking for, then build triggers on the comparison between the baseline (calculated item) and the live item.

        I wouldn't be disappointed to see more turnkey solutions in Zabbix, though really it would just need to extend some of the trend-data handling that's already built in.


        • js1
          Member
          • Apr 2009
          • 66

          #5
          Sampled statistical variance is relatively easy to compute as a calculated item or as a really ugly trigger expression. From there, a z-score/standard-deviation comparison isn't that difficult.
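For reference, the z-score comparison in ordinary code is just this (a sketch; the alert threshold, commonly 2 or 3, is up to you):

```python
from statistics import mean, stdev

def zscore(current, history):
    """How many sample standard deviations the current reading sits
    from the historical mean; alert when abs(z) exceeds a threshold."""
    m, s = mean(history), stdev(history)
    if s == 0:
        return 0.0  # flat history: no meaningful deviation
    return (current - m) / s
```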

          There is another ZBXNEXT issue asking for better statistical analysis capabilities:



          This shouldn't be that hard, given that all databases have those statistics functions built in.

