ZABBIX Forums  
  #1  
Old 15-03-2017, 07:55
syndeysider
Senior Member
 
Join Date: Oct 2013
Posts: 114
Largest Zabbix Deployment?

Hello Ladies/Gentlemen

Wondering who here has the largest deployment going on?

I've just started in a new role that involves converging our current siloed monitoring systems into a single platform to provide a platform-agnostic view of our various systems.

It's going to be a problem of scale given that the Network, Systems, Active Directory and Database volume is one of the largest in the Southern Hemisphere.

I've done some very basic numbers, and at an absolute minimum we are looking at:

Number of Hosts : 75k-130k+
Number of proxies : 10-30+
Number of triggers : 1.8Million+
Number of items : 2-3Million+
Number of users : 300-1000
Zabbix DB : 2-6TB
NVPS : between 25k-150k (depending on agreed check intervals)

Anyone running something larger than this?
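
The NVPS range above follows roughly from item count divided by check interval. A minimal sketch of that arithmetic, assuming every item is polled at one uniform interval (the function name and the 20/60/80 second intervals are illustrative assumptions, not agreed figures from this deployment):

```python
def nvps(item_count: int, interval_seconds: float) -> float:
    """Average new values per second if every item uses the same check interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return item_count / interval_seconds


if __name__ == "__main__":
    # Hypothetical scenarios: item counts from the list above, intervals assumed.
    for items, interval in [(2_000_000, 80), (3_000_000, 60), (3_000_000, 20)]:
        print(f"{items:,} items @ {interval}s -> {nvps(items, interval):,.0f} NVPS")
```

With these assumed intervals, 2 million items at 80 seconds gives 25k NVPS and 3 million items at 20 seconds gives 150k NVPS, which brackets the range quoted in the post.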
  #2  
Old 16-03-2017, 19:07
SBO
Zabbix Certified Specialist
 
Join Date: Sep 2015
Location: France
Posts: 195

Hi,

For such a big infra, I honestly think you should contact Zabbix directly; it's really HUGE!
  #3  
Old 20-03-2017, 00:36
syndeysider
Senior Member
 
Join Date: Oct 2013
Posts: 114

I plan to. At this stage we are conceptualizing the design framework and comparing Zabbix to other products such as Icinga 2.
  #4  
Old 21-03-2017, 10:53
Alex.S
Senior Member
 
Join Date: Feb 2012
Location: Riga, Latvia
Posts: 118

Hi syndeysider,

There are larger installations out there in terms of hosts, items, triggers and proxies, but this is still huge.

The only thing I cannot vouch for is the number of concurrent users. I'm not saying it's impossible to handle 1000 users, I just haven't heard of anyone having so many at a time.

Cheers,

Alex.
  #5  
Old 23-03-2017, 07:38
syndeysider
Senior Member
 
Join Date: Oct 2013
Posts: 114

Cheers. This is something that we will probably ship off to Grafana.

I'm hoping that the actual use cases for the front end remain with the SMEs for each internal IT division that uses the platform.

I'm progressing along nicely with the design concept and have a working prototype in place along with an Icinga 2 instance. It's been a very interesting comparison in my opinion. I've been a very big supporter of Zabbix, right back to 2.0.1, but do see some really cool features in Icinga 2, like out-of-the-box support for writing to a big time series backend etc. I think something similar is possible in the current version of Zabbix, but it would have to be developed.

Anyway, I digress. Good to know there are bigger installs out there. I'm hoping that once an exec decision is made I can bring Zabbix SIA on board and get some real-world design principles nailed down.
  #6  
Old 23-03-2017, 17:27
onallion
Senior Member
 
Join Date: Mar 2016
Posts: 128

Should be interesting. What are your current plans for DB? Percona XtraDB?
  #7  
Old 24-03-2017, 03:15
jan.garaj
Senior Member
Zabbix certified specialist
 
Join Date: Jan 2010
Location: United Kingdom, Slovakia, Bulgaria
Posts: 473

Quote:
Originally Posted by syndeysider
I've been a very big supporter of Zabbix, right back to 2.0.1, but do see some really cool features in Icinga 2, like out-of-the-box support for writing to a big time series backend etc.
https://www.zabbix.com/documentation...port_callbacks

Implementation depends on the "DB" used (OpenTSDB, InfluxDB, Mongo, Graphite, Elasticsearch, DalmatinerDB, ScyllaDB, AWS S3, Bigtable, ....). It is usually 10-20 lines of code, a piece of cake. Sending metric values to the external DB is not the problem; the problem is how to process/integrate metric metadata with that external DB.
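
To illustrate the values-versus-metadata point, here is a rough Python sketch of turning raw history values into InfluxDB line protocol. This is not the Zabbix module API itself (loadable modules are written in C); it only shows the shape of the payload an export would need to produce. The metadata table, the function names and the `zabbix_history` measurement name are all hypothetical.

```python
# Hypothetical itemid -> metadata lookup. Raw history values only carry an
# itemid, so host and item key have to be joined in from somewhere else.
ITEM_METADATA = {
    10001: {"host": "db01", "item_key": "system.cpu.load[all,avg1]"},
}


def escape_tag(text: str) -> str:
    """Escape commas, equals signs and spaces, which are special in tags."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line(measurement: str, tags: dict, value: float, clock: int, ns: int = 0) -> str:
    """Format one float value as a line-protocol record with a nanosecond timestamp."""
    tag_str = ",".join(f"{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items()))
    timestamp = clock * 1_000_000_000 + ns
    return f"{measurement},{tag_str} value={float(value)} {timestamp}"


def export(values, metadata=ITEM_METADATA):
    """Join (itemid, clock, ns, value) tuples with metadata; skip unknown items."""
    lines = []
    for itemid, clock, ns, value in values:
        meta = metadata.get(itemid)
        if meta is None:
            continue  # no metadata known for this item, so it cannot be tagged
        lines.append(to_line("zabbix_history", meta, value, clock, ns))
    return lines


if __name__ == "__main__":
    for line in export([(10001, 1490000000, 0, 0.42), (99999, 1490000000, 0, 1.0)]):
        print(line)
```

The formatting really is only a handful of lines; the `ITEM_METADATA` lookup is the part that has to be kept in sync with Zabbix, which is the hard problem described above.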
__________________
Devops Monitoring Expert advice: Dockerize/automate/monitor all the things.
My DevOps stack: Docker / Kubernetes / Mesos / ECS / Terraform / Elasticsearch / Zabbix / Grafana / Puppet / Ansible / Vagrant
  #8  
Old 24-03-2017, 07:31
syndeysider
Senior Member
 
Join Date: Oct 2013
Posts: 114

Quote:
Originally Posted by jan.garaj
https://www.zabbix.com/documentation...port_callbacks

Implementation depends on the "DB" used (OpenTSDB, InfluxDB, Mongo, Graphite, Elasticsearch, DalmatinerDB, ScyllaDB, AWS S3, Bigtable, ....). It is usually 10-20 lines of code, a piece of cake. Sending metric values to the external DB is not the problem; the problem is how to process/integrate metric metadata with that external DB.
Thanks! This is exactly what I plan on testing!! Really awesome to see this available now.
  #9  
Old 24-03-2017, 07:32
syndeysider
Senior Member
 
Join Date: Oct 2013
Posts: 114

Quote:
Originally Posted by onallion
Should be interesting. What are your current plans for DB? Percona XtraDB?
Yes, with only (x) days of history data stored. I'm going to partition off the history_xxxx tables onto some PCIe SSDs which have been write optimized. In this space, I'm keeping a close eye on the new Intel Optane SSDs. This should bring my DB down to 1-2TB, which I think is a bit more manageable both in terms of restore times and cost.
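
A back-of-the-envelope sketch of how history retention drives DB size. The 90 bytes per stored value is an assumed rule of thumb for numeric history rows (real row size depends on the database engine, indexes and value type), and the 7-day retention is purely illustrative since (x) is not specified here:

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_VALUE = 90  # assumption: rough on-disk size of one numeric history row


def history_bytes(nvps: int, retention_days: int, bytes_per_value: int = BYTES_PER_VALUE) -> int:
    """Estimate history table size as values/day * retention * bytes/value."""
    return nvps * SECONDS_PER_DAY * retention_days * bytes_per_value


if __name__ == "__main__":
    # Hypothetical scenario: low end of the NVPS range, one week of history.
    size = history_bytes(nvps=25_000, retention_days=7)
    print(f"{size / 1e12:.2f} TB")
```

Under these assumptions, 25k NVPS with 7 days of history comes to about 1.36 TB, so a 1-2TB target is only plausible with short retention or with NVPS near the low end of the range.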

I'm looking at writing some very basic code to either pull data (read off a slave, with modifications to https://github.com/zensqlmonitor/influxdb-zabbix) or push data (via a module) to Kafka or InfluxDB directly for trend/history retention, because of the sheer volume of data.

So Zabbix would process the basic triggers, alerts etc. and store minimal historical data, with trend-based analysis and service-level triggers sitting on top of InfluxDB, Kapacitor, TICKscript etc. Stitch both together with Grafana and we might have something useful.

I've found these two VERY valuable for stress testing :

http://snmpsim.sourceforge.net/simul...d-traffic.html
https://github.com/vulogov/zas_agent...gent-0.1.1.pdf
  #10  
Old 24-03-2017, 11:24
jan.garaj
Senior Member
Zabbix certified specialist
 
Join Date: Jan 2010
Location: United Kingdom, Slovakia, Bulgaria
Posts: 473

https://github.com/zensqlmonitor/influxdb-zabbix - that's not a good idea: you can't scale it, and it's a single point of failure. The module option is the better solution.

InfluxDB: you'd have 150k NVPS plus additional NVPS from other service checks, and a single node can handle 480k NVPS (https://blog.outlyer.com/time-series...ase-benchmarks). But to be safe, you need a cluster. FYI, InfluxDB had problems reading large amounts of data in 2015 (https://cds.cern.ch/record/2011172/f...K-2015-060.pdf); maybe it's better now.