
zabbix 2.0.1: problem with "grpsum"

  • ghillan
    Junior Member
    • Jan 2012
    • 20

    #1

    zabbix 2.0.1: problem with "grpsum"

    Hello all,
    I am having trouble understanding exactly how this function works. I am monitoring the total space and free space of several SAN devices using a self-made script.

    For each storage device (I am currently monitoring 50 of them) I get something like:
    32TB total space
    1TB free space

    and I made a graph for each one. Things work well so far, but occasionally (I still can't figure out why) Zabbix fails to retrieve the data:

    Example from the log:
    6433:20120726:113009.064 item [storagename:user_parameter_key[{HOST.HOST1}, LUNS_UNUSED]] became not supported: Received value [] is not suitable for value type [Numeric (unsigned)] and data type [Decimal]
    6433:20120726:114009.918 item [storagename:user_parameter_key[{HOST.HOST1}, LUNS_UNUSED]] became supported
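
The log line above says an empty value was rejected for a Numeric (unsigned) item. As a rough illustration of that kind of check, here is a minimal Python sketch. The function name `is_unsigned_decimal` is made up for this post, and it only approximates the rule; it is not Zabbix's actual parser.

```python
def is_unsigned_decimal(value: str) -> bool:
    """Return True if value looks like a non-negative base-10 integer.

    Hypothetical helper: an approximation of the "Numeric (unsigned)" /
    "Decimal" rule from the log, not the real Zabbix implementation.
    """
    text = value.strip()
    # isdigit() is False for the empty string, so "" is rejected,
    # which mirrors the "Received value []" case in the log.
    return text.isascii() and text.isdigit()
```

A check like this could be applied to the script's output before it is handed back to the agent, so that an empty result is at least noticed on the script side.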

    This is not a big issue by itself, because the items are checked once per hour and the polling of unsupported items picks the data up again well before the next check, so I don't see any holes in any of my graphs.

    The problem came up when I wanted to aggregate all the data and have a graph of the total and available space across all the devices. In this case the numbers I get are not constant. My guess is that the aggregate sums just the last value and is probably hitting a moment where there is no data for a particular storage device.

    My initial aggregate key was:

    grpsum["STORAGE","{$STORAGE_SPACE_TOTAL}","last","0"]

    so I tried to work around it by changing to:
    grpsum["STORAGE","{$STORAGE_SPACE_TOTAL}","max","1d"]


    Since each storage device is checked every hour (and the total space doesn't change anyway), it should get values from all the storage devices, so the line should be constant, but that is not what happens.

    The aggregate graph still goes up and down every hour, exactly as it did with "last","0". So I am wondering if I misunderstood what "max" means. The graphs of the individual storage devices all have data for every hour, so the aggregate should never be unable to find data, since I am giving it a window of 24 hours and I am sure I have the correct value for every hour. It seems to me that the "max" function or the time window for some reason doesn't work at all.
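
To make the expectation concrete, here is a toy Python model of the two behaviours I can imagine. It is only a sketch with made-up data and function names, not how Zabbix actually computes `grpsum`. The first function takes the max of each storage over the window and sums the results, which is what I expected. The second one additionally assumes (unverified) that a storage whose most recent check failed is left out of the sum entirely.

```python
from typing import Dict, List, Optional

# Hourly samples per storage; None marks an hour where the check failed.
Samples = Dict[str, List[Optional[int]]]


def window_max_sum(samples: Samples, hour: int, window: int) -> int:
    """Sum, over all storages, the max value seen in the last `window` hours."""
    total = 0
    start = max(0, hour - window + 1)
    for values in samples.values():
        seen = [v for v in values[start:hour + 1] if v is not None]
        if seen:
            total += max(seen)
    return total


def window_max_sum_skipping_failed(samples: Samples, hour: int, window: int) -> int:
    """Same as above, but ASSUMES a storage whose current check failed is excluded."""
    current_ok = {name: vals for name, vals in samples.items() if vals[hour] is not None}
    return window_max_sum(current_ok, hour, window)


# Made-up data: total space in TB for three storages over four hours.
SAMPLES: Samples = {
    "san1": [32, 32, 32, 32],
    "san2": [32, None, 32, 32],
    "san3": [16, 16, None, 16],
}

expected = [window_max_sum(SAMPLES, h, 24) for h in range(4)]
dipping = [window_max_sum_skipping_failed(SAMPLES, h, 24) for h in range(4)]
print(expected)  # [80, 80, 80, 80]
print(dipping)   # [80, 48, 64, 80]
```

The first list is the flat line I expected from "max","1d". The second list dips in the hours where a check failed, which looks like what my aggregate graph is doing, but whether Zabbix really excludes items this way is just a guess.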


    I know I should try to understand why Zabbix sometimes doesn't retrieve the data, but honestly every time I run the check with zabbix_get it works just fine! Also, the execution time is always less than 0.2 seconds. I tried to find the issue but was never able to reproduce it, even when flooding it with a lot of commands from a script. The script and zabbix_get work just fine, never slowing down.
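
For reference, this is roughly the kind of flooding loop I mean, written as a small Python sketch. The name `probe` and its parameters are made up for this post. It runs a command many times and records every attempt whose output is empty or not an unsigned number, together with how long that attempt took.

```python
import subprocess
import sys
import time
from typing import List, Sequence, Tuple


def probe(command: Sequence[str], attempts: int,
          pause: float = 0.0, timeout: float = 10.0) -> List[Tuple[int, float, str]]:
    """Run `command` repeatedly; return (attempt, seconds, output) for bad results."""
    failures = []
    for attempt in range(attempts):
        started = time.monotonic()
        try:
            result = subprocess.run(list(command), capture_output=True,
                                    text=True, timeout=timeout)
            output = result.stdout.strip()
        except subprocess.TimeoutExpired:
            # Treat a timeout like an empty answer.
            output = ""
        elapsed = time.monotonic() - started
        # Empty or non-numeric output is what the server log complains about.
        if not output.isdigit():
            failures.append((attempt, elapsed, output))
        time.sleep(pause)
    return failures


# Self-test with a stand-in command that always prints a number.
demo_ok = probe([sys.executable, "-c", "print(42)"], attempts=2)
# Stand-in command that prints nothing useful, to show a recorded failure.
demo_bad = probe([sys.executable, "-c", "print('')"], attempts=1)
```

In real use the command would be something like `["zabbix_get", "-s", "<agent host>", "-k", "<item key>"]`, with the placeholders filled in for the storage in question. So far a loop like this has never returned a single failure for me.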