High utilization of Zabbix proxy (v 6.4). Proxy for SNMP monitoring (Network devices)

  • seroslaw
    Member
    • Apr 2021
    • 32

    #1

    Hi,
    I have been tasked with building a proxy for monitoring the network devices in my company. We will monitor them via SNMP, and some items will also have preprocessing. Initially, about 200 network devices (routers and switches) are to be monitored.
    Our Zabbix server sometimes chokes and freezes, and the number of items will grow in the future. We would like to relieve the Zabbix server of SNMP agent items and move their polling and preprocessing to a separate proxy.

    What I already did:
    I set up two new proxies and started testing them. Both have the same hardware / OS configuration, and I assume both Zabbix proxies will be connected in the same way.
    Both proxies can communicate with the network devices and with the Zabbix server (proxy <-> Zabbix server is working).
    Currently I am testing only one of the proxies. I can see that it is choking badly with only 1329 items (required vps is 10.37).

    What I already see / notice:
    The proxy I am currently testing is heavily utilised. The screenshot covers the last 7 days:


    [Screenshot: image.png]

    Zabbix proxy server performance in vSphere:
    [Screenshot: image.png]

    New Zabbix proxy servers configuration (both the same configuration):

    Code:
    Servers set up in VMware (ESXi 6.7 and later (VM version 14))
    10 CPU
    16GB Memory
    HD1: 20GB
    HD2: 40GB
    Network: standard configuration, everything works on the network.
    Operating system (both the same):
    Ubuntu 20.04.6 LTS

    Currently I am monitoring a narrow section of the network environment (only on one of the two proxies, nothing is connected to the other):
    4 network devices
    Number of items: 1329
    Required vps is at level 10.37
    a dozen or so icmp items
    the rest are SNMP Agents
    Preprocessing: half without preprocessing, half have change per second set, 1/3 additionally have multiplier = 8
    Intervals: 30s, 1m (most), 3m, 5m
    Holding history: 90 days (imposed by the organization)

    Proxy summary:
    [Screenshot: image.png]

    * In the future I plan to monitor the entire environment which will include:
    Item appearance / sample monitoring configurations:
    Items are mainly created from several templates that use discovery.
    Planned total number of items created: ~48000

    Each device has additional ICMP monitoring:
    item type: icmpping - 3 items (loss, ping, response time)
    Preprocessing on items: none

    The rest are SNMP Agent type items.
    Intervals: 30s, 1m (most), 3m, 5m
    Holding history: 90 days (imposed by the organization)
    Preprocessing: about half of items have change per second, and about 1/3 additionally multiplier = 8
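
A rough projection of the load described above, as a minimal sketch. It assumes that required vps scales linearly with the item count, which only holds if the full environment has the same interval mix as the 4 test devices. The function name and the even split across the two proxies are illustrative assumptions, not anything from Zabbix itself:

```python
def project_vps(current_items: int, current_vps: float, planned_items: int) -> float:
    """Scale required vps linearly with the number of items (simplifying assumption)."""
    return current_vps / current_items * planned_items


# Figures from the post: 1329 items at 10.37 vps today, ~48000 items planned.
total_vps = project_vps(1329, 10.37, 48000)

# Assumes the devices are split evenly across the two proxies.
per_proxy_vps = total_vps / 2

print(f"projected total: {total_vps:.1f} vps")
print(f"per proxy:       {per_proxy_vps:.1f} vps")
```

If the linear assumption holds, each proxy would have to handle well over ten times the rate that is already causing trouble in the test.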

    Configuration of my Zabbix server (6.4):
    Role: Monitoring EVERYTHING in the company. We monitor websites, elasticsearch, kubernetes, JMX, databases and SNMP Agents (which we plan to migrate to new proxies).
    The current configuration has been tuned many times. The server now works quite stably, but we sometimes notice slowdowns, for example when searching for items, or Zabbix simply freezes suddenly. Unfortunately, the last such freeze happened a long time ago and I no longer have any data from that event.

    Database: External, PostgreSQL
    psql (PostgreSQL) 14.1 (Ubuntu 14.1-2.pgdg20.04+1)

    Zabbix Server (6.4) config:
    Code:
    ListenPort=10051
    LogType=file
    LogFile=/var/log/zabbix/zabbix_server.log
    LogFileSize=1024
    DebugLevel=3
    PidFile=/run/zabbix/zabbix_server.pid
    SocketDir=/var/run/zabbix
    DBHost=<correct>
    DBName=<correct>
    DBUser=<correct>
    DBPassword=<correct>
    DBPort=<correct>
    AllowUnsupportedDBVersions=1
    StartPollers=600
    StartIPMIPollers=0
    StartPreprocessors=115
    StartConnectors=0
    StartPollersUnreachable=120
    StartHistoryPollers=5
    StartTrappers=16
    StartPingers=160
    StartDiscoverers=0
    StartHTTPPollers=60
    StartTimers=2
    StartEscalators=16
    StartAlerters=50
    JavaGateway=127.0.0.1
    JavaGatewayPort=10052
    StartJavaPollers=1
    StartVMwareCollectors=5
    VMwareFrequency=3600
    VMwarePerfFrequency=21600
    VMwareCacheSize=512M
    VMwareTimeout=10
    SNMPTrapperFile=/tmp/zabbix_snmptrap.log.tmp
    StartSNMPTrapper=1
    ListenIP=0.0.0.0
    HousekeepingFrequency=1
    MaxHousekeeperDelete=250000
    CacheSize=1408M
    CacheUpdateFrequency=30
    StartDBSyncers=6
    HistoryCacheSize=1024M
    HistoryIndexCacheSize=512M
    TrendCacheSize=384M
    TrendFunctionCacheSize=128M
    ValueCacheSize=2G
    Timeout=30
    TrapperTimeout=60
    UnreachablePeriod=300
    UnavailableDelay=120
    UnreachableDelay=300
    AlertScriptsPath=/usr/share/pdagent-integrations/bin/
    ExternalScripts=/usr/lib/zabbix/externalscripts
    FpingLocation=/usr/bin/fping
    Fping6Location=/usr/bin/fping6
    LogSlowQueries=2200
    TmpDir=/tmp
    StartProxyPollers=0
    ProxyConfigFrequency=600
    ProxyDataFrequency=1
    StartLLDProcessors=4
    AllowRoot=0
    User=zabbix
    SSLCertLocation=/etc/zabbix/ssl/certs
    SSLKeyLocation=/etc/zabbix/ssl/
    StatsAllowedIP=127.0.0.1
    StartReportWriters=1
    WebServiceURL=<correct>
    ProblemHousekeepingFrequency=45
    StartODBCPollers=2
    Zabbix Server utilisation (last 7 days)
    [Screenshot: image.png]
    [Screenshot: image.png]


    Configuration of both new proxies, v6.4 (remember that 3 hosts are connected to only one of them; the other is not used):

    Database: Internal = localhost, PostgreSQL
    psql (PostgreSQL) 13.16 (Ubuntu 13.16-1.pgdg20.04+1)

    Code:
    ProxyMode=0
    Server=<correct>
    Hostname=<correct>
    ListenPort=<correct>
    LogType=file
    LogFile=/var/log/zabbix/zabbix_proxy.log
    LogFileSize=128
    DebugLevel=3
    EnableRemoteCommands=1
    LogRemoteCommands=1
    PidFile=/run/zabbix/zabbix_proxy.pid
    SocketDir=/run/zabbix
    DBHost=127.0.0.1
    DBName=<correct>
    DBUser=<correct>
    DBPassword=<correct>
    ProxyLocalBuffer=0
    ProxyOfflineBuffer=12
    HeartbeatFrequency=30
    ConfigFrequency=300
    DataSenderFrequency=5
    StartPollers=100
    StartIPMIPollers=5
    StartPreprocessors=70
    StartPollersUnreachable=20
    StartTrappers=5
    StartPingers=15
    StartDiscoverers=5
    StartHTTPPollers=30
    SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
    StartSNMPTrapper=1
    HousekeepingFrequency=1
    CacheSize=2G
    StartDBSyncers=5
    HistoryCacheSize=1.5G
    HistoryIndexCacheSize=1G
    Timeout=30
    ExternalScripts=/usr/lib/zabbix/externalscripts
    FpingLocation=/usr/bin/fping
    Fping6Location=
    LogSlowQueries=300
    AllowRoot=1
    StatsAllowedIP=127.0.0.1,<correct>
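
For orientation, the cache sizes in the proxy configuration above can be added up and compared with the VM memory. This is only a sketch: `parse_size` is a hypothetical helper written for this example, not the parser Zabbix uses, and it does not check whether Zabbix itself accepts a fractional value such as `1.5G`. The 16GB of VM memory is treated as 16 GiB here.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(value: str) -> float:
    """Convert a size such as '2G' or '1.5G' to bytes (illustrative helper only)."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in UNITS:
        return float(value[:-1]) * UNITS[suffix]
    return float(value)


# Cache parameters copied from the proxy configuration in the post.
proxy_caches = {
    "CacheSize": "2G",
    "HistoryCacheSize": "1.5G",
    "HistoryIndexCacheSize": "1G",
}

total = sum(parse_size(v) for v in proxy_caches.values())
vm_memory = 16 * UNITS["G"]  # assumption: "16GB Memory" means 16 GiB

print(f"{total / UNITS['G']:.1f} GiB of caches = {total / vm_memory:.0%} of VM memory")
```

The sum says nothing about whether these sizes are right or wrong; it only shows how much memory is reserved up front for a proxy that currently handles 1329 items.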


    Can you advise what I can improve in the configuration of this proxy to reduce its utilization with the 4 hosts I am testing, and to prepare both proxies for the much higher SNMP load once I add the rest of the 200 devices (about 100 per proxy)?
    Last edited by seroslaw; 27-12-2024, 13:55.
  • mrnobody
    Member
    • Oct 2024
    • 61

    #2
    Hi
    This is a project, not just a task. It looks easy, but the monitoring world is very complex, as you will see, feel and remember.

    Man, you need proxy resources. In Zabbix 7 you can have proxy groups.
    If you do it on 6.4, every performance change will require a restart of the proxy service, and with a lot of items to process that means monitoring is unavailable.
    In 7, you can do maintenance on only what you need and keep the other proxies running in the meantime, so monitoring stays available.

    After this, everything will be more balanced.


    • seroslaw
      seroslaw commented
      But for several reasons I cannot upgrade to Zabbix server 7 or install Zabbix proxy 7; we are using 6.4.

      Is there any solution or tuning advice for my configuration? Or is adding more resources the only option?

    • mrnobody
      mrnobody commented
      So, you can try some configuration changes as a trial-and-error solution.
      But with an installation of this size, I think you need something more professional: a full analysis over a period of time to understand what needs to be upgraded (software first, then hardware if needed). I can't say exactly what. Maybe someone on the forum can.
  • markfree
    Senior Member
    • Apr 2019
    • 868

    #3
    Your Proxy host seems to be struggling.
    Note that too many processes can be just as bad as too few.

    At the Zabbix Summit 2024, there was a nice presentation on Zabbix performance tuning. It was focused on version 7.0, but the principles apply to 6.4 as well. It has some good tips for tuning and evaluating your environment.
    The presentation file is also available for download.
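
To put a number on the point about process counts, here is a small sketch that adds up the `Start*` values from the proxy configuration posted in #1. It counts only the parameters set explicitly there; the proxy also runs other internal processes that are not included, so treat the result as a lower bound. `count_processes` is a hypothetical helper for this example, and the 10 vCPUs come from the VM description in the first post.

```python
# Start* lines copied from the proxy configuration in post #1.
proxy_conf = """
StartPollers=100
StartIPMIPollers=5
StartPreprocessors=70
StartPollersUnreachable=20
StartTrappers=5
StartPingers=15
StartDiscoverers=5
StartHTTPPollers=30
StartSNMPTrapper=1
StartDBSyncers=5
"""


def count_processes(conf_text: str) -> int:
    """Sum the values of all Start* parameters in a Zabbix-style config text."""
    total = 0
    for line in conf_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.startswith("Start") and value.isdigit():
            total += int(value)
    return total


configured = count_processes(proxy_conf)
vcpus = 10  # from the VM description in post #1

print(f"{configured} configured processes on {vcpus} vCPUs "
      f"({configured / vcpus:.1f} per vCPU)")
```

With 4 hosts and 1329 items, most of these processes have nothing to do, but each one still holds its own database connection and memory.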
