
Database load in Zabbix 1.8

Remember how Zabbix 1.8 was supposed to improve performance?

That's a nice promise, but what are users seeing in real production environments? Luckily, we now know. Zabbix user verwilst has shared some graphs from before and after the upgrade from 1.6 (and he has a big monitor). Rough facts about the environment:

  • over 1500 hosts;
  • 125 000 items;
  • 55 000 triggers;
  • 1200+ new values per second.
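To put those figures in proportion, here is a rough back-of-the-envelope sketch. It assumes every item delivers values and that they are spread evenly, which is a simplification (disabled or unsupported items are not accounted for), and the function name is made up for this illustration.

```python
# Figures quoted in the list above.
ITEMS = 125_000
NEW_VALUES_PER_SECOND = 1_200  # the post says "1200+", so this is a lower bound


def average_update_interval(items, values_per_second):
    """Average seconds between two values of the same item, assuming an even spread."""
    return items / values_per_second


if __name__ == "__main__":
    interval = average_update_interval(ITEMS, NEW_VALUES_PER_SECOND)
    # Because the rate is a lower bound, the interval is an upper bound.
    print(f"each item is refreshed roughly every {interval:.0f} seconds or faster")
```

So on average an item in this installation gets a new value somewhere under every two minutes.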

This installation also runs the Zabbix server and the database on separate hosts.

Improvements in SQL query count

And now for the shiny part. Here's a graph of SQL access by the Zabbix server, split up by selects, inserts, updates and deletes. On the left-hand side we can see Zabbix 1.6 operating, then there's a small gap during the upgrade, and then Zabbix 1.8.1 gets to work.

Graph: MySQL queries

So what's the difference? As we can see, most kinds of database access have dropped notably. Selects, for example, dropped to less than half, and updates to a bit more than half. Most significantly, the number of inserts has dropped from a substantial 800 per second to pretty much nothing during a normal run (the last value is 7.71), with only small, insignificant peaks (all below 500). The number of deletes does not seem to have changed much.
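To make the insert numbers concrete, here is a small sketch of the arithmetic. The 800 per second is read off the graph and 7.71 is the last value quoted above, so treat the result as approximate; the helper name is invented for this example.

```python
def reduction(before, after):
    """Return (factor, percent_drop) for a rate that fell from `before` to `after`."""
    factor = before / after
    percent_drop = (1 - after / before) * 100
    return factor, percent_drop


if __name__ == "__main__":
    # Inserts per second: roughly 800 with 1.6, 7.71 (last value) with 1.8.1.
    factor, percent = reduction(800, 7.71)
    print(f"inserts dropped by a factor of about {factor:.0f} ({percent:.0f}% fewer)")
```

In other words, during a normal run the server issues on the order of a hundred times fewer inserts than before.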

Looking at the graph, we can of course appreciate the improvements in the Zabbix server. We can also spot different things happening. The red spikes are quite clearly housekeeper runs, where old data is removed. One run happens when the Zabbix server starts, and then it runs once every hour, which is the default housekeeper interval.

There are also smaller hourly bumps in inserts. These are not aligned with the housekeeper runs, though; they happen on the full hour. These are trend calculations and their inserts into the database. At the same time, updates decrease slightly because the Zabbix server cache is busy with the trends.
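For readers wondering what a trend calculation amounts to, here is a simplified sketch. It assumes that a trend is a per-hour summary (count, minimum, average, maximum) of the values collected for an item; the real work happens inside the Zabbix server, and the function name and sample data below are purely illustrative.

```python
from collections import defaultdict

SECONDS_PER_HOUR = 3600


def hourly_trends(samples):
    """Group (unix_timestamp, value) pairs by hour and summarise each hour."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Round the timestamp down to the start of its hour.
        hour_start = timestamp - timestamp % SECONDS_PER_HOUR
        buckets[hour_start].append(value)
    return {
        hour: {
            "num": len(values),
            "min": min(values),
            "avg": sum(values) / len(values),
            "max": max(values),
        }
        for hour, values in sorted(buckets.items())
    }


if __name__ == "__main__":
    # Made-up values for one item: two in the first hour, one in the second.
    samples = [(0, 1.0), (1800, 3.0), (3600, 10.0)]
    for hour, summary in hourly_trends(samples).items():
        print(hour, summary)
```

Since one summary row per item is written once the hour is over, the inserts naturally bunch up at the full hour rather than being spread out.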

So this single graph both confirms that Zabbix server 1.8 is much more efficient and gives some insight into its daily (or, in this case, hourly) operations. But verwilst was kind enough to share some more graphs showing the impact of the upgrade.

Improvements in CPU load

As a result of the reduced query count, the actual load on the servers dropped as well. Here we can see how CPU load on the database host stabilises at a lower level after the upgrade.

Graph: CPU load on Zabbix database

And here's the CPU load change on the Zabbix server - excellent, that one is also lower with 1.8.

Graph: CPU load on Zabbix server

Improvements in actual data collection

With all the load reduction, there might be some more “production-like” metric we could look at to determine what effect all this had on the efficiency of Zabbix. For that we have the Zabbix server queue size - the number of items that are being worked on at any given moment. So here's the graph.

Graph: Zabbix queue size

Zabbix 1.6 had a lot of items to work on, and some notable backlog. Excluding some larger spikes, the queue seemed to fluctuate around 7 thousand items. As this value was updated less frequently, the Zabbix graph has the upgrade gap filled with a straight line - it does not know what caused the missing data, and the number of missing values is too small to be considered a gap. The upgrade period is marked on the graph for clarity.

Hey, what's that? Is there some problem with Zabbix after the upgrade? The line seems to go too low… Now this is indeed a testament to all the technical improvements we looked at. The Zabbix queue actually dropped from ~7000 to… 19.

So there. Zabbix 1.8 is better, faster, and it might even feed your dog. Confirmed by users.

 
news/2010.02.19-performance_improvements_in_1.8.txt · Last modified: 2010/02/19 21:59 by richlv
 
Except where otherwise noted, content on this wiki is licensed under the following license:CC Attribution-Noncommercial-Share Alike 3.0 Unported