Heyas all,
I'm hoping someone will be able to point us in the right direction here, or help shed any light on what may be going on.
We have 3400+ servers being monitored, at roughly 1125 new values per second. We have spent hours optimizing the database. We are using version 2.x of Zabbix.
We are about to start over completely for the 4th time. Once the servers all get loaded into the system, everything seems to work pretty well. The problems start once the system starts adding/removing/changing hosts on an hourly basis. There are not a lot of changes, although there can be anywhere from zero to a dozen servers that need to be changed, removed or added every hour.
Once this has been going on for a few days, everything starts slowing down, the housekeeper process sits at 100% for 95% of the day, and then the corruption eventually starts popping up.
By corruption I mean hosts/templates referencing items that can't be removed, hosts using templates that don't exist (at least by name), and dozens of random hosts duplicated between 2 and 5 times. The latter concerns us, as there doesn't seem to be any way to create such duplicates through the interface, configuration import or API.
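For anyone who wants to check their own install for the same symptom, here is a minimal sketch of how the duplicates could be counted. It assumes the host names have already been exported into a plain Python list; the sample names are made up, and `find_duplicates` is just an illustrative helper, not part of Zabbix.

```python
from collections import Counter


def find_duplicates(host_names):
    """Return {name: count} for every host name that appears more than once."""
    counts = Counter(host_names)
    return {name: n for name, n in counts.items() if n > 1}


# Made-up sample data standing in for an export of host names.
sample = ["web01", "web02", "web01", "db01", "web01", "db01"]

for name, n in sorted(find_duplicates(sample).items()):
    print(f"{name} appears {n} times")
```

Anything this reports on a real export would be a host name that exists more than once, which is what we are seeing.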
We are running this beast on a server with a terabyte of space for MySQL alone, 24 GB of RAM and dual quad-core 2.4 GHz Xeons.
After running for a little over a month, the database is over 480 GB in size and still growing.
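To put that size in perspective, here is a rough back-of-envelope sketch relating it to the collection rate. The assumptions are loud ones: "a little over a month" is taken as 30 days, 480 GB is taken as decimal gigabytes, and nothing is assumed to have been purged in that time. The result lumps indexes, trends, events and everything else in the database into the per-value figure, so treat it as an upper bound on storage per value rather than a row size.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def values_stored(nvps, days):
    """Total values collected at a steady rate of nvps over the given days."""
    return nvps * SECONDS_PER_DAY * days


def bytes_per_value(db_bytes, nvps, days):
    """Average database bytes per collected value, assuming nothing was purged."""
    return db_bytes / values_stored(nvps, days)


NVPS = 1125        # new values per second, from the post
DAYS = 30          # assumption: "a little over a month"
DB_BYTES = 480e9   # assumption: 480 GB as decimal gigabytes

print(f"values collected: {values_stored(NVPS, DAYS):,}")
print(f"bytes per value:  {bytes_per_value(DB_BYTES, NVPS, DAYS):.0f}")
```

Under those assumptions this works out to about 2.9 billion values and roughly 165 bytes of database per value.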
Are we doing anything wrong here? Is it inadvisable to make this many changes once everything is set up?

