ZABBIX Forums  

  #1  
16-06-2017, 23:48
wyang
Junior Member

Join Date: Mar 2016
Posts: 25
N Zabbix servers or 1 Zabbix server + N Zabbix proxies

We have been running a single Zabbix server on a bare-metal box, without any Zabbix proxies, for two years. As our infrastructure grows, the server is reaching its capacity, so we plan to migrate Zabbix monitoring to multiple nodes.

Choice 1:
- N Zabbix servers, each server monitors a set of devices.
-- e.g. server 1 monitors network switches, while server 2 monitors Linux servers, etc.

Choice 2:
- 1 Zabbix server + N Zabbix proxies
-- judging from presentations at Zabbix conferences, there may be issues with using proxies.

Could you please share your experience or offer your recommendations? Thanks in advance!
  #2  
20-06-2017, 05:37
syndeysider
Senior Member

Join Date: Oct 2013
Posts: 114

Hi

In the past, I have successfully run over 5k hosts at roughly 3k NVPS:

- on a single Zabbix server (spec'ed correctly)
- with a tuned MySQL backend, set up with table partitioning on fast SSDs for the history* and trends* tables
- with multiple proxies at various locations

No issues. I'm not sure which presentations you are looking at. One server with multiple proxies would be my suggestion.
  #3  
20-06-2017, 18:08
wyang
Junior Member

Join Date: Mar 2016
Posts: 25

Hi syndeysider,

Thanks very much for your advice. It is really helpful.

Somewhere on the forum it was mentioned that MySQL partitioning is already implemented in Zabbix 3.4. Have you heard anything about that? I could not find it in the Zabbix 3.4 documentation.

Thanks very much again!
  #4  
22-06-2017, 23:02
wyang
Junior Member

Join Date: Mar 2016
Posts: 25

In our current environment, a single server without any proxy:
Number of hosts: ~200
Number of items: > 40K
Number of triggers: > 20K
Required server performance, new values per second: > 1K

If we go with the 1 server + N proxies architecture, could you please share your experience or recommendations on the CPU/memory/disk requirements for the server and the proxies? Thanks in advance!
  #5  
23-06-2017, 13:42
kloczek
Senior Member

Join Date: Jun 2006
Location: UK/London
Posts: 872

Quote:
Originally Posted by wyang
In our current environment, a single server without any proxy:
Number of hosts: ~200
Number of items: > 40K
Number of triggers: > 20K
Required server performance, new values per second: > 1K

If we go with the 1 server + N proxies architecture, could you please share your experience or recommendations on the CPU/memory/disk requirements for the server and the proxies? Thanks in advance!

I see a kind of chicken-and-egg issue here.
You seem to be looking for HW sizing factors even though you already know the number of items and the NVPS quite precisely.
Also, 1k NVPS with 40k items means that on average every item is sampled every 40 s. As this is an average, a fairly large number of items must have a quite short sampling interval (less than 5 s, say). Probably you have underestimated the number of items.
Calculating NVPS up front from the number of items and the distribution of sampling rates is quite difficult.
My explanation is that the values you mention are a kind of guess, and it is quite possible that you have somewhat overestimated NVPS. If you are working on estimates, 200 items/host is usually enough for base OS activity monitoring, without any application-layer metrics. Depending on the types of applications that need to be monitored, you may have an additional few hundred to even a few thousand items per host.
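
To make the relationship between item count, sampling interval and NVPS concrete, here is a minimal Python sketch. Once the distribution of update intervals is known the arithmetic is short; the hard part is knowing that distribution. The interval/count split below is purely hypothetical and is not taken from wyang's installation; only the 40K items and 1K NVPS figures come from the thread.

```python
def nvps(interval_counts):
    """Expected new values per second.

    interval_counts maps an update interval in seconds to the number
    of items polled at that interval.
    """
    return sum(count / interval for interval, count in interval_counts.items())


def average_interval(total_items, nvps_value):
    """Average sampling interval (seconds) implied by an item count and NVPS."""
    return total_items / nvps_value


# Hypothetical distribution of 40K items (assumption, for illustration only).
distribution = {30: 12_000, 60: 20_000, 300: 8_000}

total_items = sum(distribution.values())
estimated_nvps = nvps(distribution)

print(f"items: {total_items}, estimated NVPS: {estimated_nvps:.0f}")
# Figures from the thread: 40K items at 1K NVPS.
print(f"average interval at 1K NVPS: {average_interval(40_000, 1_000):.0f} s")
```

With this made-up distribution the estimate comes out below 1K NVPS, which illustrates how a reported NVPS can be checked against the configured intervals.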

Any estimate of the resources needed by the Zabbix stack can be based only on NVPS and on the number of web clients viewing monitoring data through the web frontend.
NVPS determines how big/strong the Zabbix server's DB backend needs to be. For 1k NVPS, 8/16 CPU cores/threads, 48+ GB RAM and 1 TB of SATA SSD storage should be enough (to hold 2 years of trends data + 2 weeks of raw history data).
Frontend requirements scale linearly with the number of web clients; for 20 clients you will need 4/8 cores/threads and 16+ GB RAM.
Requirements for the Zabbix server itself are highly correlated with the number of triggers and the average number of trigger evaluations against the streams of new data. Sadly, Zabbix still does not provide internal metrics for the speed of trigger evaluation and/or the CPU time spent on it.
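
The storage figure above can be sanity-checked with a rough estimate. This is only a sketch under stated assumptions: 90 bytes per stored row is an assumed rule of thumb (the real size depends on the database engine and value type), indexes and all other tables are ignored, and all 40K items are treated as numeric so that each one produces one trend row per hour.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_ROW = 90  # assumption: rough per-row size, varies by DB engine


def history_bytes(nvps, days, bytes_per_row=BYTES_PER_ROW):
    """Raw history: one row per collected value."""
    return nvps * days * SECONDS_PER_DAY * bytes_per_row


def trends_bytes(numeric_items, days, bytes_per_row=BYTES_PER_ROW):
    """Trends: one hourly aggregate row per numeric item."""
    return numeric_items * 24 * days * bytes_per_row


# 1K NVPS, 2 weeks of history, 2 years of trends (figures from the thread).
history = history_bytes(nvps=1_000, days=14)
trends = trends_bytes(numeric_items=40_000, days=730)
total_gb = (history + trends) / 1e9

print(f"history: {history / 1e9:.1f} GB")
print(f"trends:  {trends / 1e9:.1f} GB")
print(f"total:   {total_gb:.1f} GB")
```

Under these assumptions the data fits well within 1 TB, leaving headroom for indexes, growth and the remaining tables.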

HW requirements for Zabbix proxies are at the bottom of the list.

Nevertheless, the requirements for the Zabbix server and proxies are usually secondary/minor. The most important part is the central DB backend.
And one more conclusion: a Zabbix stack with 1k NVPS is a relatively small one.

Last edited by kloczek; 23-06-2017 at 13:46.
  #6  
23-06-2017, 16:42
wyang
Junior Member

Join Date: Mar 2016
Posts: 25

Thanks very much for the recommendations.
  #7  
09-07-2017, 12:43
meetpradeepp
Junior Member

Join Date: Jul 2013
Location: India
Posts: 9
N Zabbix server architecture in our shop

We have an N Zabbix server architecture. With 1000+ nodes on a VM and SAN, it seems to be doing well. Average NVPS is around 300, but we have seen more during bursts. Sometimes we have seen I/O complaints because of the history syncer, but otherwise it works great. We have no users on the frontend; it all goes to ELK.

Keeping configuration in sync and understanding the impact of an event become a challenge, even assuming you have sorted out what goes to which server.

Hope that helps.
  #8  
10-07-2017, 17:50
wyang
Junior Member

Join Date: Mar 2016
Posts: 25

Thanks very much for your help!
  #9  
12-07-2017, 06:28
LenR
Senior Member

Join Date: Sep 2009
Posts: 357

So which part of Zabbix is causing the load? What is the bottleneck: CPU, I/O, network?

We have a Zabbix server & MySQL, 5 main proxies to distribute the load, a few other proxies for network access that carry only minor load, and a separate web frontend server. All are VMs; disk is FC.

Our Zabbix server uses a lot of RAM. I allocated many GB to the InnoDB buffers; buffer hits are I/Os avoided.

Partitioning and disabling housekeeping for history and trends are essential.

8300 hosts, 1,272K items, 5350 NVPS.
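
As an illustration of the partitioning advice, here is a minimal Python sketch that generates MySQL DDL for daily RANGE partitions on the `clock` column (the Unix timestamp column of the history and trends tables). It follows a common community approach and is not LenR's actual setup; the partition naming scheme, the start date and the use of UTC day boundaries are assumptions. The generated statement should be reviewed and tested on a copy of the database before being run anywhere.

```python
from datetime import date, datetime, timedelta, timezone


def daily_partitions(table, first_day, days):
    """Build an ALTER TABLE statement with one RANGE partition per day.

    Each partition holds rows whose clock value is below the start of
    the following day (UTC).
    """
    clauses = []
    for offset in range(days):
        day = first_day + timedelta(days=offset)
        next_midnight = datetime(
            day.year, day.month, day.day, tzinfo=timezone.utc
        ) + timedelta(days=1)
        clauses.append(
            f"PARTITION p{day:%Y_%m_%d} "
            f"VALUES LESS THAN ({int(next_midnight.timestamp())})"
        )
    body = ",\n  ".join(clauses)
    return f"ALTER TABLE {table} PARTITION BY RANGE (clock) (\n  {body}\n);"


# Hypothetical example: three daily partitions for the history table.
sql = daily_partitions("history", date(2017, 7, 12), 3)
print(sql)
```

Old data can then be removed by dropping whole partitions instead of relying on the housekeeper's row-by-row deletes, which is why housekeeping for history and trends is disabled in such setups.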
  #10  
12-07-2017, 16:45
wyang
Junior Member

Join Date: Mar 2016
Posts: 25

Thanks very much!

As mentioned, we have a single Zabbix server configuration.

On our Software Defined Networking (SDN) devices, each device ended up with more than 10K items to be monitored. At that time, the trigger 'zabbix agent is unreachable' fired on many hosts, even though zabbix_get on the hosts reporting the issue still worked. Zabbix reported its poller processes and history syncer processes as busy. Reducing the number of monitored items to fewer than 2K per device resolved the issue.
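
A back-of-envelope way to see why the pollers went busy is to compare the check rate with what the configured poller processes (StartPollers in zabbix_server.conf) can handle. The sketch below is a simple utilization estimate, not a model of Zabbix internals; the device count, update interval, average check duration and poller count are all hypothetical, and only the 10K and 2K items-per-device figures come from the post above.

```python
import math


def checks_per_second(devices, items_per_device, interval_seconds):
    """Passive checks the pollers must perform each second."""
    return devices * items_per_device / interval_seconds


def poller_utilization(rate, avg_check_seconds, pollers):
    """Fraction of total poller time needed; above 1.0 means overload."""
    return rate * avg_check_seconds / pollers


def pollers_needed(rate, avg_check_seconds, target_utilization=0.5):
    """Pollers required to stay at or below the target utilization."""
    return math.ceil(rate * avg_check_seconds / target_utilization)


# Hypothetical values (assumptions, not measurements from the thread).
DEVICES = 5
INTERVAL = 60        # seconds between checks of each item
AVG_CHECK = 0.05     # seconds one check keeps a poller busy
POLLERS = 10         # configured poller processes

rate_before = checks_per_second(DEVICES, 10_000, INTERVAL)
rate_after = checks_per_second(DEVICES, 2_000, INTERVAL)

before = poller_utilization(rate_before, AVG_CHECK, POLLERS)
after = poller_utilization(rate_after, AVG_CHECK, POLLERS)

print(f"10K items/device: utilization {before:.2f}")
print(f" 2K items/device: utilization {after:.2f}")
print(f"pollers needed for 10K items/device: {pollers_needed(rate_before, AVG_CHECK)}")
```

With these made-up numbers the 10K case needs several times more poller time than is available, so checks queue up and hosts can be flagged as unreachable even though the agents respond, while the 2K case fits.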