Scaling Zabbix performance
There are many topics to consider when scaling a time-series database beyond its current limits. Zabbix currently faces issues on three points; this post focuses on one of them, back-end scalability.
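Issue 1: Reducing the number of NVPS (new values per second) from agents
Send agent data back only on exception, using a dead band: a value is reported only when it moves outside a tolerance around the last reported value. See ZBXNEXT-113 and the references to Ganglia.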
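To make the dead-band idea concrete, here is a minimal sketch in Python. It is not how the Zabbix agent works today; the function name `deadband_filter` and the parameters `band` and `max_silence` are made up for illustration. The `max_silence` heartbeat is my own assumption, added so the server can still tell a quiet item from a dead agent.

```python
from typing import Iterable, Iterator, Optional, Tuple

Sample = Tuple[int, float]  # (unix timestamp, value)


def deadband_filter(samples: Iterable[Sample], band: float,
                    max_silence: int) -> Iterator[Sample]:
    """Yield only the samples an agent would actually send.

    A sample is sent when it differs from the last sent value by more
    than `band`, or when `max_silence` seconds have passed since the
    last send (hypothetical heartbeat). The first sample is always sent.
    """
    last_sent: Optional[Sample] = None
    for ts, value in samples:
        if (last_sent is None
                or abs(value - last_sent[1]) > band
                or ts - last_sent[0] >= max_silence):
            last_sent = (ts, value)
            yield last_sent


if __name__ == "__main__":
    readings = [(0, 10.0), (60, 10.2), (120, 10.4), (180, 12.0), (240, 12.1)]
    # Only the first reading and the jump to 12.0 leave the agent.
    print(list(deadband_filter(readings, band=0.5, max_silence=3600)))
```

With a flat or slowly drifting item, most polls produce no network traffic and no row in the history table, which is where the NVPS reduction would come from.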
Issue 2: Increasing back-end scalability
Using NoSQL methods for storing and reading time-series data.
Issue 3: Increasing the efficiency of the storage algorithm
See references
Swinging Door
Derivative time series Segment Approximation (DSA). "Effective and efficient similarity search in time series"
Comparison of compression algorithms http://www.castdiv.org/archive/data_compression.pdf
Use a single efficient compression algorithm for the trends tables, instead of the gross approximation that is the one-hour min/max/avg.
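For readers who have not met Swinging Door, here is a rough Python sketch of the idea, written from the textbook description rather than from any Zabbix code. The function name `swinging_door` and the `deviation` parameter are my own; it assumes strictly increasing timestamps. A point is kept only when no single straight line from the last kept point can pass within `deviation` of every point seen since.

```python
from math import inf
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp, value)


def swinging_door(samples: Sequence[Sample], deviation: float) -> List[Sample]:
    """Return the subset of samples kept by swinging-door compression.

    Assumes strictly increasing timestamps. Linear interpolation between
    kept points reproduces every dropped point to within `deviation`.
    """
    points = list(samples)
    if len(points) <= 2:
        return points

    kept = [points[0]]
    t0, v0 = points[0]          # current pivot (last kept point)
    lo, hi = -inf, inf          # slopes that still fit every point so far
    prev = points[0]

    for t, v in points[1:]:
        dt = t - t0
        lo = max(lo, (v - deviation - v0) / dt)
        hi = min(hi, (v + deviation - v0) / dt)
        if lo > hi:
            # The doors have swung past parallel: keep the previous
            # point and restart the corridor from it.
            kept.append(prev)
            t0, v0 = prev
            dt = t - t0
            lo = (v - deviation - v0) / dt
            hi = (v + deviation - v0) / dt
        prev = (t, v)

    kept.append(points[-1])
    return kept


if __name__ == "__main__":
    step = [(0, 0.0), (1, 0.0), (2, 0.0), (3, 10.0), (4, 10.0), (5, 10.0)]
    print(swinging_door(step, deviation=0.5))
```

Unlike an hourly min/max/avg row, the kept points preserve when a change happened, and the error bound is set explicitly by `deviation` instead of by the bucket size.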
=======
Today's topic: Increasing scalability
Using a NoSQL method:
HandlerSocket is a MySQL plugin that permits NoSQL-style read and write methods against the mature InnoDB storage engine, multiplying the number of queries that can actually be run against a database that is not I/O bound. If the database does not fit in memory, HandlerSocket offers no benefit, which is a bummer for Zabbix, as the History and Trends tables easily grow into the hundreds of gigabytes.
References:
Use a NoSQL database for storage and retrieval of time-series data. This allows a high number of queries per second and also benefits from inherent HA and replication for massive data sets. Now we are talking.
References:
Tokyo Cabinet, HBase and other NoSQL databases that are geared to time-series data and that do not require extensive programming modifications.
Strategy for Zabbix:
Use a NoSQL database for the History and Trends tables. The Trends table could then use a more efficient compression algorithm that is allowed to take up more space but provides much better data accuracy.
Keep the SQL engine for all other tables.
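To give an idea of what the History table could look like in a sorted key-value store, here is a small Python sketch. Everything in it is an assumption for discussion: the one-hour bucket, the `row_key` layout, and the in-memory `HistoryStore` class that stands in for a real store such as HBase. The point is only that (itemid, time bucket) as a big-endian key keeps one item's data contiguous, so a graph request becomes a short range scan.

```python
import struct
from typing import Dict, List, Tuple

BUCKET_SECONDS = 3600  # hypothetical: one row per item per hour


def row_key(itemid: int, clock: int) -> bytes:
    """Big-endian (itemid, bucket start) so byte order equals numeric order."""
    bucket = clock - clock % BUCKET_SECONDS
    return struct.pack(">QI", itemid, bucket)


class HistoryStore:
    """In-memory stand-in for a sorted key-value store (illustration only)."""

    def __init__(self) -> None:
        # row key -> {offset within bucket: value}
        self._rows: Dict[bytes, Dict[int, float]] = {}

    def put(self, itemid: int, clock: int, value: float) -> None:
        row = self._rows.setdefault(row_key(itemid, clock), {})
        row[clock % BUCKET_SECONDS] = value

    def scan(self, itemid: int, start: int, end: int) -> List[Tuple[int, float]]:
        """Return (clock, value) pairs for one item with start <= clock <= end."""
        first, last = row_key(itemid, start), row_key(itemid, end)
        out: List[Tuple[int, float]] = []
        for key in sorted(self._rows):
            if first <= key <= last:
                _, bucket = struct.unpack(">QI", key)
                for offset, value in sorted(self._rows[key].items()):
                    clock = bucket + offset
                    if start <= clock <= end:
                        out.append((clock, value))
        return out


if __name__ == "__main__":
    store = HistoryStore()
    store.put(42, 7210, 1.5)
    store.put(42, 7220, 2.5)
    store.put(43, 7210, 7.0)
    print(store.scan(42, 7200, 7300))
```

A real store would do the range scan itself instead of sorting all keys; the sketch only shows the key design. Hosts, items, triggers and the rest of the configuration would stay in SQL as described above.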
What do you think, Alexei?
You mentioned that Zabbix 2.x would look into NoSQL. Is that still in the cards?
Cheers
Tomatos-for-Dollars