Hello,
We are stress testing Zabbix in our environment. We're doing it in a particular way, by creating bogus hosts, items and triggers made just for that purpose (we've created 1000 hosts with 100 items and 200 triggers each). All in all, here's what it looks like:
Number of hosts 1062
Number of items 100159
Number of triggers 200089
Our database is PostgreSQL 9.0.4, hosted on a separate machine from the Zabbix server.
We're testing with a script, run on the Zabbix server, that loops around zabbix_sender. The script launches zabbix_sender for a given item, sends a random value (float), and goes on to the next item of the same host. Once every item of a host has received a value, the script moves on to the next host. It keeps going until a fixed duration, given as a parameter, has elapsed.
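For reference, the test loop described above (one zabbix_sender call per item, host by host, until a time limit) can be sketched roughly as follows. This is a minimal Python sketch, not the actual script: the function names, the 0 to 100 value range, and the wrap-around to the first host once the last one is done are all assumptions. The only zabbix_sender options used are -z (server), -s (host), -k (item key) and -o (value).

```python
import random
import subprocess
import time


def sender_command(server, host, key, value):
    """Build the zabbix_sender command line that sends one value for one item."""
    return ["zabbix_sender", "-z", server, "-s", host, "-k", key, "-o", f"{value:.4f}"]


def stress_loop(server, hosts, keys, duration_s, run=subprocess.call, clock=time.monotonic):
    """Send random floats item by item, host by host, until duration_s has elapsed.

    `run` and `clock` are injectable so the loop can be exercised without
    zabbix_sender installed. Returns the number of values sent.
    """
    deadline = clock() + duration_s
    sent = 0
    # Assumption: after the last host, start over from the first one.
    while clock() < deadline:
        for host in hosts:
            for key in keys:
                if clock() >= deadline:
                    return sent
                # Assumption: random floats in [0, 100]; one process per value.
                run(sender_command(server, host, key, random.uniform(0.0, 100.0)))
                sent += 1
    return sent
```

Calling `stress_loop("127.0.0.1", hosts, keys, duration_s=300)` with the names of the first 500 hosts and their item keys would correspond to the 5-minute test. The host and key names have to match whatever the bogus hosts and items are actually called in Zabbix.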
We've noticed that simply running the test on the first 500 hosts for 5 minutes produces a bunch of slow queries. Here's an excerpt:
27106:20110926:120928.425 Slow query: 120.509592 sec, "update ids set nextid=nextid+256 where nodeid=0 and table_name='events' and field_name='eventid'"
27101:20110926:120928.498 Slow query: 66.452517 sec, "update ids set nextid=nextid+256 where nodeid=0 and table_name='events' and field_name='eventid'"
27107:20110926:120931.416 Slow query: 62.458832 sec, "update ids set nextid=nextid+256 where nodeid=0 and table_name='events' and field_name='eventid'"
27123:20110926:120934.875 Slow query: 108.504381 sec, "update ids set nextid=nextid+256 where nodeid=0 and table_name='events' and field_name='eventid'"
Strangely, this doesn't happen when we test only the first 100 hosts.
These slow queries have an impact on Zabbix. In the frontend, we see the queue of items that are supposed to be checked by zabbix_agent (or internal checks) keep growing until the script stops.
We've also noticed that once the queries are done (which happens a couple of minutes after the end of the test, obviously), all graphs get updated.
We've noticed this bug report, but it's marked as fixed as of version 1.8.3 (we're using 1.8.6), and there isn't any information on how to reproduce it, how it was fixed, or what caused the problem.
Has anybody run into this problem, or does anyone have any information on that old bug that might explain the slow queries?
Thanks