Hi everybody!
Today I developed a _theory_ about how to achieve performance gains by choosing prime numbers for the update interval ('delay') of monitored items. Please let me know what you think, as so far this is just a theory! I don't have a large enough Zabbix installation to try it out in practice...
Background: The update interval determines how often the Zabbix server checks an item (ignoring active checks here!). Commonly the update interval is set to a multiple of 30 seconds, e.g. 30, 60, 120 or 300 seconds. This effectively means that many checks frequently have to be executed at the same time. For instance, if you had 5 checks at a 30-second interval and 5 checks at a 60-second interval, all 10 checks would be executed together every 60 seconds. In between there are 'lulls' of 30 seconds where not much happens. Please correct me if I am wrong here!
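To make the clustering concrete, here is a small Python sketch that counts how many checks fall on each second. It assumes that every item starts at t=0 and fires exactly at multiples of its interval; that is my simplification and not necessarily how the Zabbix server really schedules checks. The function name `checks_per_second` is made up for this illustration.

```python
from collections import Counter

def checks_per_second(intervals, horizon):
    """Count how many checks fire at each second up to `horizon`.

    Assumption: every item starts at t=0 and fires at exact
    multiples of its interval (a simplified model, not Zabbix code).
    """
    load = Counter()
    for interval in intervals:
        for t in range(interval, horizon + 1, interval):
            load[t] += 1
    return load

# 5 checks every 30 seconds and 5 checks every 60 seconds
intervals = [30] * 5 + [60] * 5
load = checks_per_second(intervals, 120)

for t in sorted(load):
    print(t, load[t])
```

In this model only the seconds 30, 60, 90 and 120 carry any load (5, 10, 5 and 10 checks); every other second is idle.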
My idea is to spread out the checks so that they are still performed at regular intervals, but not too many at the same time. In order to achieve this, one has to choose the intervals for the individual checks so that the Least Common Multiple (LCM) of any two of them is as large as possible. In the example above the 10 checks (5 x 30 secs and 5 x 60 secs) share an LCM of 60 (2*30=1*60), so every 60 seconds all checks are run. A better choice would be the following intervals for the ten checks, each of which is a prime number: 23, 29, 31, 37, 41, 47, 53, 59, 61, 67. Since the LCM of two different prime numbers is their product, and assuming all checks start at the same moment, the earliest time at which any two of the ten checks are executed together is after 667 (=23*29) seconds. Otherwise they are nicely spread out over time.
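The arithmetic can be checked with a short Python sketch. It uses the first ten primes from the list above and the same simplified model as before: all items start at t=0 and fire at exact multiples of their interval, which is an assumption and not a statement about the real Zabbix scheduler. The helper names `first_collision` and `peak_load` are made up for this illustration.

```python
from collections import Counter
from itertools import combinations
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def first_collision(intervals):
    """Earliest second at which any two checks fire together
    (assumes all items start at t=0)."""
    return min(lcm(a, b) for a, b in combinations(intervals, 2))

def peak_load(intervals, horizon):
    """Largest number of checks that fire in the same second
    within the first `horizon` seconds."""
    load = Counter()
    for interval in intervals:
        for t in range(interval, horizon + 1, interval):
            load[t] += 1
    return max(load.values())

round_values = [30] * 5 + [60] * 5
primes = [23, 29, 31, 37, 41, 47, 53, 59, 61, 67]

print(first_collision(round_values))  # 30
print(first_collision(primes))        # 667
print(peak_load(round_values, 3600))  # 10
print(peak_load(primes, 3600))        # 2
```

In this model no more than two of the prime-interval checks ever coincide within the first hour, because three of them would need at least 23*29*31 = 20677 seconds to line up. Whether that translates into a measurable gain on a real server is exactly the open question.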
That's the theory anyway. Am I completely wrong here? I wonder whether anybody is willing to try this out on a real Zabbix server. If you prove me wrong, fine, no problem.
It would be nice for everybody if I were right, though.
Markus
Well, I am not trying to break