ZABBIX Forums  
  #11  
24-03-2017, 11:27
jan.garaj
Senior Member
Zabbix certified specialist
 
Join Date: Jan 2010
Location: United Kingdom, Slovakia, Bulgaria
Posts: 472

I think Kafka would be a better design decision. You can stream the data from Kafka (even multiple times) to the selected DB(s) (InfluxDB, HBase, OpenTSDB, ...) - search for "GrafanaCon 2016: Utkarsh Bhatnagar, Elastic-Monitoring Using Grafana at Sony PlayStation" on YouTube.
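
A minimal sketch of the "stream it more than once" idea, using a toy in-memory log rather than a real Kafka client. The class `MetricLog`, the group names and the record layout are all made up for illustration; the only point is that each consumer group keeps its own read offset, so several DB writers can read the same data independently and any of them can rewind and replay.

```python
from collections import defaultdict


class MetricLog:
    """Toy append-only log standing in for one Kafka topic partition (hypothetical)."""

    def __init__(self):
        self._records = []
        # consumer group name -> index of the next record that group will read
        self._offsets = defaultdict(int)

    def append(self, record):
        self._records.append(record)

    def poll(self, group, max_records=100):
        """Return the next batch for this group and advance only this group's offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch

    def seek_to_beginning(self, group):
        """Rewind one group so it can replay the log from the start."""
        self._offsets[group] = 0


log = MetricLog()
for clock, value in [(1, 0.5), (2, 0.7), (3, 0.9)]:
    log.append({"item": "cpu.load", "clock": clock, "value": value})

# Two independent writers (group names are illustrative) read the same records.
influx_rows = log.poll("influxdb-writer")
opentsdb_rows = log.poll("opentsdb-writer")

# Replaying for one backend does not disturb the other group's position.
log.seek_to_beginning("influxdb-writer")
replayed_rows = log.poll("influxdb-writer")
```

With a real broker the offsets live on the Kafka side and the log is retained for a configured period, but the decoupling between producers and the choice of storage backend is the same idea.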

More valuable resources for Zabbix stress testing are https://monitoringartist.github.io/z...archer/#stress
__________________
Devops Monitoring Expert advice: Dockerize/automate/monitor all the things.
My DevOps stack: Docker / Kubernetes / Mesos / ECS / Terraform / Elasticsearch / Zabbix / Grafana / Puppet / Ansible / Vagrant
  #12  
29-03-2017, 00:04
syndeysider
Senior Member
 
Join Date: Oct 2013
Posts: 114

Quote:
Originally Posted by jan.garaj View Post
I think Kafka would be a better design decision. You can stream the data from Kafka (even multiple times) to the selected DB(s) (InfluxDB, HBase, OpenTSDB, ...) - search for "GrafanaCon 2016: Utkarsh Bhatnagar, Elastic-Monitoring Using Grafana at Sony PlayStation" on YouTube.

More valuable resources for Zabbix stress testing are https://monitoringartist.github.io/z...archer/#stress
Cheers! Very interesting talk by Utkarsh Bhatnagar.

Busy looking at https://github.com/edenhill/librdkafka and building a prototype. It's an extra layer, but you are correct that it would allow some flexibility in the choice of the underlying time-series database.
  #13  
29-03-2017, 01:19
jan.garaj
Senior Member
Zabbix certified specialist
 
Join Date: Jan 2010
Location: United Kingdom, Slovakia, Bulgaria
Posts: 472

Could you please publish your prototype under an open source license on GitHub/GitLab?
  #14  
05-04-2017, 02:30
syndeysider
Senior Member
 
Join Date: Oct 2013
Posts: 114

Quote:
Originally Posted by jan.garaj View Post
Could you please publish your prototype under an open source license on GitHub/GitLab?
Quite possible depending on our policy on publishing code externally. I would like to, but I am new here and will find out!

On another note

https://blog.timescale.com/when-boring-is-awesome-building-a-scalable-time-series-database-on-postgresql-2900ea453ee2


Looks very interesting!
  #15  
05-04-2017, 07:45
bbrendon
Senior Member
 
Join Date: Sep 2005
Posts: 809

As for scaling, I'd say Zabbix suffers from age. Back when Zabbix was born, all these new technologies were still many years away. Zabbix hasn't improved since its inception in terms of architecture and back-end.

I get the feeling there should be something better than Zabbix but I'm not sure what that is. Maybe there isn't.

On the plus side, Zabbix has matured continuously over the years and is still going strong.
__________________
Unofficial Zabbix Expert
Blog, Corporate Site
  #16  
07-04-2017, 12:10
Alexei
Zabbix developer, product manager
 
Join Date: Sep 2004
Location: Riga, Latvia
Posts: 5,642

Quote:
Originally Posted by bbrendon View Post
As for scaling, I'd say Zabbix suffers from age. Back when Zabbix was born, all these new technologies were still many years away. Zabbix hasn't improved since its inception in terms of architecture and back-end.
If everything goes as planned, there will be a number of serious improvements that should bring a much better level of scalability (among other things) to Zabbix 4.0. Some of the improvements will be included in 3.4; the API for history data is one of them.

As for newer technologies, I always ask myself: will it bring any long-term value? I'd like all the architectural decisions we make today to be well justified.

Quote:
Originally Posted by bbrendon View Post
On the plus side, Zabbix has matured continuously over the years and is still going strong.
True. There are also many interesting concepts and ideas we discuss almost every day in our office. I'm afraid most of this activity remains invisible to the Zabbix community. Anyway, nowadays more than ever I'm excited about the future of Zabbix, so much to do!
__________________
Alexei Vladishev
Creator of Zabbix, Product manager
New York | Tokyo | Riga
My Twitter
  #17  
12-04-2017, 10:15
SBO
Zabbix Certified Specialist
 
Join Date: Sep 2015
Location: France
Posts: 194

There are some pretty interesting case studies available on the Zabbix website, but none that can compare to such a big installation.
Will we someday have a case study of an environment similar to the one described by the OP?
  #18  
12-04-2017, 11:48
Alex.S
Senior Member
 
Join Date: Feb 2012
Location: Riga, Latvia
Posts: 116

Someday - for sure

For now though, if anyone is interested in sharing their success story, and I don't mean only if you have 200k+ devices etc, then feel free to drop us a line.
  #19  
13-04-2017, 02:37
kloczek
Senior Member
 
Join Date: Jun 2006
Location: UK/London
Posts: 872

Quote:
Originally Posted by Alex.S View Post
Someday - for sure

For now though, if anyone is interested in sharing their success story, and I don't mean only if you have 200k+ devices etc, then feel free to drop us a line.
The number of devices is not relevant.
What matters is the rate at which metric values are written to the database backend per second.
In your case a single insert of some metric data into conditions() is used.
Zabbix does not work that way, because it writes data to a specific history* table based on the type of that data. In other words, inserting all the data from a single device in one go never happens in Zabbix.
Even with a very high NVPS rate, such as a few tens of thousands, Zabbix server does these operations using a few tens of inserts per second.
Even with 100-200k metric points written to the DB backend you may be doing a few thousand inserts per second, and you can lower the number of those inserts even further by increasing max_allowed_packet (in the case of MySQL).
Bottleneck is somewhere. It is not obvious that insert queries create not only write IOs but also a fairly predictable rate of read operations. On the DB backend side you must have a memory cache big enough to hold all the information needed to find every place that has to be updated or changed.
You can simplify your write path up to the point where your DB backend only streams the data to a file, without adding any metadata that would later let you quickly find an exact subset of the historical data.
Even when streaming all new data to a single file or a small number of files, your OS at the VFS layer will be doing read IOs just to write the new data, so even here writing a huge amount of data may end up in a kind of bottleneck caused by read IOs.

Using Zabbix, which writes new monitoring data as batches in a few inserts/s, is really good enough for the majority of monitoring cases.
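
To make the batching point concrete, here is a rough sketch, not Zabbix's actual code. It uses an in-memory SQLite database instead of MySQL, a heavily simplified three-column schema, and a made-up `table_for` rule; only the idea (route each value to a history* table by type, then write each group with a single multi-row INSERT) comes from the description above.

```python
import sqlite3
from collections import defaultdict


def table_for(value):
    """Pick a history* table from the value type (simplified, illustrative rule)."""
    if isinstance(value, float):
        return "history"
    if isinstance(value, int):
        return "history_uint"
    return "history_str"


def flush(conn, values):
    """Write a batch using one multi-row INSERT per target table.

    Returns the number of INSERT statements issued.
    """
    grouped = defaultdict(list)
    for itemid, clock, value in values:
        grouped[table_for(value)].append((itemid, clock, value))

    for table, rows in grouped.items():
        placeholders = ", ".join(["(?, ?, ?)"] * len(rows))
        params = [field for row in rows for field in row]
        conn.execute(
            f"INSERT INTO {table} (itemid, clock, value) VALUES {placeholders}",
            params,
        )
    conn.commit()
    return len(grouped)


conn = sqlite3.connect(":memory:")
for table in ("history", "history_uint", "history_str"):
    conn.execute(f"CREATE TABLE {table} (itemid INTEGER, clock INTEGER, value)")

# Five incoming values of mixed types from two collection cycles.
incoming = [(1, 100, 0.42), (2, 100, 17), (3, 100, "ok"), (1, 101, 0.40), (2, 101, 18)]
statements = flush(conn, incoming)
float_rows = conn.execute("SELECT COUNT(*) FROM history").fetchone()[0]
uint_rows = conn.execute("SELECT COUNT(*) FROM history_uint").fetchone()[0]
```

Five values end up as three statements here; with thousands of values per flush the ratio of values to inserts grows accordingly, and on MySQL the size of one such statement is what max_allowed_packet caps.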

Once we are talking not only about raw monitoring but also about an alerting layer on top of it, a simplified process that writes new data sequentially may cause big problems when you add that alerting layer, as long as your trigger definitions use historical data to calculate the values of your triggers/alarms.
If that is the case, you will again usually hit a read IO bottleneck rather than a write IO limit.

It is a bit counterintuitive that to reach a very high write rate you sometimes first have to solve the read issues created by MFU/MRU data.
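
As an illustration of why the reads matter, here is a small sketch of keeping the most recently used values in memory so that a trigger based on recent history does not have to go back to the DB. This is only an assumption about how one could do it, not a description of Zabbix internals; `RecentValues`, the window size and the threshold are invented for the example.

```python
from collections import defaultdict, deque


class RecentValues:
    """Keep the last N values per item in memory (hypothetical helper)."""

    def __init__(self, per_item=5):
        # deque(maxlen=N) silently drops the oldest value once N are stored
        self._values = defaultdict(lambda: deque(maxlen=per_item))

    def add(self, itemid, value):
        self._values[itemid].append(value)

    def avg(self, itemid):
        """Average of the cached values, or None if nothing is cached yet."""
        values = self._values[itemid]
        return sum(values) / len(values) if values else None


cache = RecentValues(per_item=3)
for value in [1.0, 2.0, 3.0, 4.0]:
    cache.add(42, value)  # the same values would also go to the history table

# Trigger "average of the last 3 values is above 2.5", evaluated without any DB read.
trigger_fired = cache.avg(42) > 2.5
```

If the window a trigger needs does not fit in such a cache, every evaluation turns into a read against the history tables, which is the read IO pressure described above.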
  #20  
13-04-2017, 10:07
Alexei
Zabbix developer, product manager
 
Join Date: Sep 2004
Location: Riga, Latvia
Posts: 5,642

Quote:
Originally Posted by kloczek View Post
Bottleneck is somewhere.
Currently Zabbix is limited to processing about 50-80K NVPS on average, with all optimizations made, on a decent Intel-based server. This level of performance is sufficient for most applications out there; nevertheless, it's challenging to achieve better performance in 3.2 or earlier releases.

There are a number of places where Zabbix could do a much better job, and we are fully aware of it.

I think Zabbix 3.4 will eliminate any performance issues on history storage side. I think that the next logical step is to look at the existing architecture and figure out how we can make it more efficient without sacrificing all the guarantees Zabbix provides currently. It's about making Zabbix scale both vertically (still very important!) and horizontally.

We hope to deliver visible results of our work in 4.0, hopefully by the end of this year.
All times are GMT +2.