Aggregate values across multiple hosts when a host is down

  • nicolasgoudard
    Junior Member
    • Mar 2021
    • 27

    #1

    Hello,
    I have multiple hosts in a cluster group.
    If a host is unavailable (down), the aggregated calculated item returns an absurd value, because for Zabbix "last" means the most recent value Zabbix collected, no matter how old it is. But for a given timestamp, if the value has not been collected, the host should count as 0.

    grpsum["cluster","system.cpu.num",last,0]
    For example, if a host in the "cluster" group has not been online since 8:00 am, the CPU count it contributes at 11:00 am should be zero, not 32, which was the valid CPU count before 8:00 am. The problem is that I still get 32, because the last value collected by Zabbix was 32 at 7:59 am.
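To make the difference concrete, here is a minimal Python sketch (plain Python, not Zabbix code) of the two behaviours described above: summing each host's most recent value regardless of its age, versus summing only values collected inside a recent window. The host names, timestamps and the `sum_last` / `sum_fresh` helpers are made up for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical last collected samples: host -> (timestamp, cpu count).
samples = {
    "node1": (datetime(2021, 7, 28, 10, 59, 30), 32),
    "node2": (datetime(2021, 7, 28, 10, 59, 40), 32),
    "node3": (datetime(2021, 7, 28, 7, 59, 0), 32),  # down since 8:00
}

def sum_last(samples):
    """Sum the most recent value of every host, however old it is."""
    return sum(value for _, value in samples.values())

def sum_fresh(samples, now, window):
    """Sum only the values collected within `window` before `now`."""
    return sum(value for ts, value in samples.values()
               if timedelta(0) <= now - ts <= window)

now = datetime(2021, 7, 28, 11, 0, 0)
print(sum_last(samples))                              # 96: stale node3 still counted
print(sum_fresh(samples, now, timedelta(minutes=1)))  # 64: stale node3 skipped
```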

    Can I achieve this in Zabbix, or do I need an external script: loop over all machines in the cluster via ssh, sum the processors, send the total with zabbix_sender, and receive it in a Zabbix trapper item?
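For reference, the external-script route could look roughly like the Python sketch below. It only shows the summing step and how the `zabbix_sender` command line would be assembled; the server name, the host name `cluster-summary`, the trapper key `cluster.cpu.total` and the polled values are all hypothetical, and the step that actually polls each node over ssh is left out.

```python
import subprocess

def build_sender_command(server, host, key, value):
    """Build a zabbix_sender command line for one trapper value."""
    return ["zabbix_sender", "-z", server, "-s", host, "-k", key, "-o", str(value)]

def total_cpus(cpu_counts):
    """Sum CPU counts, treating unreachable hosts (None) as 0."""
    return sum(count or 0 for count in cpu_counts.values())

# Hypothetical result of polling each node; None means the node did not answer.
cpu_counts = {"node1": 32, "node2": 32, "node3": None}

cmd = build_sender_command("zabbix.example.com", "cluster-summary",
                           "cluster.cpu.total", total_cpus(cpu_counts))
print(" ".join(cmd))

# To actually send the value (requires zabbix_sender to be installed), uncomment:
# subprocess.run(cmd, check=True)
```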

    Thx in advance
    Best regards
  • nicolasgoudard
    Junior Member
    • Mar 2021
    • 27

    #2
    Originally posted by splitek
    You can use other functions besides "last", e.g. get the max or min over the last 1m (this skips hosts that have no data from the last 1m).

    Additionally link to some nice article: https://blog.zabbix.com/zabbix-aggre...xplained/9869/
    Thanks for your reply. I use Zabbix 5.4.1. So what is the syntax to calculate the sum of an attribute across the hosts in a group over the last minute (counting the attribute only where it was available in the last minute, and zero otherwise)?

    Example:
    Host    Attribute value (last minute)
    A       3
    B       unavailable
    C       2
    Total = 5
    Last edited by nicolasgoudard; 28-07-2021, 08:34.

    • nicolasgoudard
      Junior Member
      • Mar 2021
      • 27

      #3
      Originally posted by splitek
      I'm using 5.0 so in 5.0 it will be something like:

      grpsum["host group","item key",max,1m]

      https://www.zabbix.com/documentation...ypes/aggregate

      For 5.4 I think it can be something like:

      sum(max_foreach(/*/net.if.out[eth0,bytes]?[group="video"],1m))

      https://www.zabbix.com/documentation...ated/aggregate

      But I'm not sure it will work as you expect; you must test it.

      Thanks for your reply. I tried it, but I did not get the right result: I expect 11 but get 10. At 11:51 the sum covers only values with a timestamp between 11:50 and 11:51, but for one machine in the group the value was collected at 11:49 (see the picture).
      How can I get the value at the same timestamp on every node? I put scheduling = m0-59 in the item parameters, but it does not work.
      Sorry for my bad English.
      thx
      Attached Files

      • splitek
        Senior Member
        • Dec 2018
        • 101

        #4
        Can you test with scheduling set to m0-59s0 for "normal" items and m0-59s15 for calculated items?
        I added a 15 s offset between the item schedules because we need to be sure that the data for the normal item is "fresh" (collected just before the computation in the calculated item). If both run in the same second, we can't be sure of the order ("collect, then calculate" or "calculate, then collect").

        You can also replace "1m" in the calculated item with "75s", so the calculation will also include "delayed" data. Your calculation uses "max", which means that if the window ("75s") contains two values, it will choose the larger of the two. You can switch "max" to "min" to get different behavior.
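The effect of widening the window and of choosing "max" or "min" can be sketched in plain Python as follows. This is only an illustration of the idea, not how Zabbix evaluates the expression internally; the hosts, timestamps, values and the `windowed_sum` helper are invented.

```python
from datetime import datetime, timedelta

# Hypothetical history: host -> list of (timestamp, value).
history = {
    "node1": [(datetime(2021, 7, 28, 11, 50, 20), 4)],
    "node2": [(datetime(2021, 7, 28, 11, 49, 50), 1)],   # collected a bit late
    "node3": [(datetime(2021, 7, 28, 11, 50, 10), 6),
              (datetime(2021, 7, 28, 11, 50, 55), 5)],   # two values in the window
}

def windowed_sum(history, now, window, pick):
    """Per host, reduce the values inside the window with `pick`, then sum.

    Hosts with no value inside the window contribute nothing.
    """
    total = 0
    for points in history.values():
        in_window = [v for ts, v in points if timedelta(0) <= now - ts <= window]
        if in_window:
            total += pick(in_window)
    return total

now = datetime(2021, 7, 28, 11, 51, 0)
print(windowed_sum(history, now, timedelta(seconds=60), max))  # 10: node2 missed
print(windowed_sum(history, now, timedelta(seconds=75), max))  # 11: node2 included
print(windowed_sum(history, now, timedelta(seconds=75), min))  # 10: node3 counts 5
```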

        If these methods are not good enough, then you need an external script. Or, if you can send the data for all hosts in one file, you can use JavaScript preprocessing (which is still a script, but one that runs inside Zabbix).

        • nicolasgoudard
          Junior Member
          • Mar 2021
          • 27

          #5
          OK, thanks. I have done that, but I have the same problem (see the attached files: 2 nodes are not on the same timestamp, so total = 9). I don't understand why, because the nodes' clocks are well synchronized with chrony.
          I have restarted the Zabbix agents and the Zabbix server service, but the problem persists.
          Attached Files

          • splitek
            Senior Member
            • Dec 2018
            • 101

            #6
            Now I see what may be wrong... Set the update interval to 0, because as the documentation says (https://www.zabbix.com/documentation...om_intervals):

            Scheduling intervals are used to check items at specific times. While flexible intervals are designed to redefine the default item update interval, the scheduling intervals are used to specify an independent checking schedule, which is executed in parallel.
            So the "normal" interval performs additional checks (in parallel) besides the scheduled checks.
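A rough Python model of that point: with a non-zero update interval, a host is polled both on the schedule and on its regular interval, so its newest value may carry a timestamp that differs from other hosts. The `offset` parameter is a hypothetical stand-in for wherever a host's regular interval happens to fall; how Zabbix really phases those checks is not modelled here.

```python
def check_seconds(update_interval, scheduled_second, horizon=180, offset=37):
    """Seconds (counted from t=0) at which an item is polled within `horizon`.

    offset: hypothetical phase of the regular update interval on this host.
    """
    # Scheduled checks: once a minute at the given second (like m0-59s0).
    times = set(range(scheduled_second, horizon, 60))
    if update_interval > 0:
        # Regular interval checks run in parallel with the scheduled ones.
        times |= set(range(offset, horizon, update_interval))
    return sorted(times)

print(check_seconds(60, 0))  # [0, 37, 60, 97, 120, 157]: extra off-schedule checks
print(check_seconds(0, 0))   # [0, 60, 120]: scheduled checks only
```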

            • nicolasgoudard
              Junior Member
              • Mar 2021
              • 27

              #7
              Thanks, I have done that now, but two machines are still not synchronized. I restarted the Zabbix agent and server, but the result is the same.
