How to make a calculated metric showing the duration of abnormality

  • brunoaa
    Junior Member
    • Aug 2021
    • 5

    #1


    I have a Zabbix 5.4 setup monitoring multiple servers. I can't say much about how it is set up, but I'll try to explain what I have and what I need.

    Characters:
    1. Producer: Produces data and exposes an HTTP JSON API that tells me what kind of information it is producing.
      1. Has an execution type ranging from A to Z.
      2. The order is always from A to Z. However:
        1. The time it stays in each type is variable.
        2. Every letter change causes a change in the system load, and how the load changes is unreasonably difficult to hard-code into Zabbix (even as a manually set macro).
      3. No Zabbix agent may be installed here.
    2. Executor: Receives the produced data and executes the related actions (some take longer than others, and they are repetitive).
      1. Does not know whether it is running A, B, C, and so on. However:
      2. It reports the average time messages took to process for each minute.
        1. 5 entries, every 5 minutes.
        2. I.e., with a delay of up to 5 minutes, Zabbix has the average for every minute.

    Problem:
    I want a metric that tells me, for every letter change, how long the Executor took to reach the system load associated with the current letter, without knowing that load in advance.
    For a human, the identification is done by spotting a significant change in the average time the messages took to be processed.
    Here's a synthetic example (I cannot use real examples due to an NDA). To fit the screen, the mode changes are sped up by a lot, and so is the time it takes to reach stability. However, the real graphs do look approximately like this.

    [Image: explain_time.png]

    In this image, I want to show a metric that displays the time between the green bars and the red bars.
    Basically, the timer starts when the HTTP request reports a new type and stops when the variation of the average has settled close to 0.
    What I need to show in a bar graph is close to the following example:

    [Image: explain_time_result.png]

    Basically, until the next mode activates, Zabbix is supposed to show how long it took to reach that stable level. It's OK if the bars grow every minute until the system has reached stability, instead of how I illustrated it.
    In the real data, stability means the values vary within ±5 (but which value is the stable one will change).

    I was told I can assume the system will always reach stability between mode changes.
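The stopping condition described above can be sketched outside Zabbix as a plain function over the Executor's per-minute averages. This is only an illustration under stated assumptions: `settling_time`, the `window` of 5 consecutive samples, and the reading of "±5" as "max minus min of the window is at most 10" are all hypothetical choices, not something the post or Zabbix defines.

```python
from typing import Optional, Sequence, Tuple

# (unix timestamp in seconds, average processing time for that minute)
Sample = Tuple[int, float]


def settling_time(
    samples: Sequence[Sample],
    mode_change_ts: int,
    tolerance: float = 5.0,  # the +-5 band mentioned in the post
    window: int = 5,         # assumption: 5 consecutive samples must agree
) -> Optional[int]:
    """Seconds from the mode change to the start of the first stable window.

    Returns None if no stable window has been observed yet.
    """
    after = sorted(s for s in samples if s[0] >= mode_change_ts)
    for i in range(len(after) - window + 1):
        values = [v for _, v in after[i:i + window]]
        # Stable when every value fits inside a band of width 2 * tolerance.
        if max(values) - min(values) <= 2 * tolerance:
            return after[i][0] - mode_change_ts
    return None
```

Because the band is measured relative to the window itself, the function does not need to know which level is the stable one for a given letter. A larger `window` reduces false positives on a slow ramp but delays detection by the same number of minutes.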

    What I did so far:

    So far, I was able to create an HTTP agent metric that queries the Producer and stores which mode the Producer is in.
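Given the stored history of that mode metric, the start points (the green bars) can be recovered by looking for polls whose value differs from the previous poll. A minimal sketch, assuming the history is available as `(timestamp, mode)` pairs; the function name and the data shape are hypothetical, and how the history is exported from Zabbix is left open:

```python
from typing import List, Sequence, Tuple

# (unix timestamp in seconds, mode letter reported by the Producer)
ModeSample = Tuple[int, str]


def mode_change_times(history: Sequence[ModeSample]) -> List[ModeSample]:
    """Return (timestamp, new_mode) for each poll that differs from the previous one."""
    changes: List[ModeSample] = []
    previous = None
    for ts, mode in sorted(history):
        if previous is not None and mode != previous:
            changes.append((ts, mode))
        previous = mode
    return changes
```

The first poll is never reported as a change, since there is nothing to compare it with. The timestamps are only as precise as the polling interval of the HTTP agent metric.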

    What if Zabbix allowed:
    If I were able to write a JavaScript calculated metric that can read values from other metrics and compute its own value from them, I would have already solved this. Querying for when the type last changed, then querying for recent variations and calculating the difference in time, would be straightforward.
    However, calculated metrics cannot use JavaScript to gather values. JavaScript is only available in the preprocessing step, where I can no longer gather any more data.
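For illustration, the calculation wished for here can be written as a standalone function that takes both histories and returns the value to plot at a given instant. This is a sketch under assumptions, not a Zabbix feature: `duration_metric`, its parameters, the 5-sample `window`, and the idea of running it as an external script that pushes the result back into Zabbix are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def duration_metric(
    now: int,
    modes: Sequence[Tuple[int, str]],       # (timestamp, mode letter)
    averages: Sequence[Tuple[int, float]],  # (timestamp, per-minute average)
    tolerance: float = 5.0,                 # the +-5 band from the post
    window: int = 5,                        # assumption: samples that must agree
) -> Optional[int]:
    """Seconds from the latest mode change to stability, as seen at `now`.

    While no stable window exists yet, returns the time elapsed so far.
    Returns None if no mode change is visible in `modes`.
    """
    seen = sorted(m for m in modes if m[0] <= now)
    last_change = None
    for (_, prev), (ts, cur) in zip(seen, seen[1:]):
        if cur != prev:
            last_change = ts
    if last_change is None:
        return None

    recent = sorted(a for a in averages if last_change <= a[0] <= now)
    for i in range(len(recent) - window + 1):
        values = [v for _, v in recent[i:i + window]]
        if max(values) - min(values) <= 2 * tolerance:
            return recent[i][0] - last_change  # stable: hold this value
    return now - last_change  # not stable yet: keep counting
```

Evaluated once per minute, the value grows until a stable window is confirmed. Since confirmation needs `window` samples, the value can overshoot for a few minutes and then drop back to the start of the stable window, where it stays until the next mode change.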


    So, how can I make a calculated metric in Zabbix that tells me how long the system took to reach processing stability?


    Last edited by brunoaa; 31-08-2021, 09:48.
  • brunoaa
    Junior Member
    • Aug 2021
    • 5

    #2
    Because no one answered, I had to move on and hack my way through.
