Zabbix is struggling to collect data (crammed agent pollers don't fetch data)

  • joeblade
    Junior Member
    • Jun 2026
    • 8

    #1

    Hi there,

    Could somebody please shed some light on an issue I'm trying to understand and solve?
    I have quite a large deployment where, just recently, monitored items have stopped receiving data on time, which results in huge data gaps.
    The workload is spread across 3 Zabbix proxy servers.
    Right after a fresh reboot of the entire environment there is a burst of activity across those 3 proxies and data comes through.
    However, after some time the whole system starts to deteriorate and comes almost to a full halt, with only occasional items managing to collect data.
    Increasing the number of async agent pollers did not help (I increased it from 8 to 20): they all get clogged up over time.
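For anyone who wants to compare settings, this is roughly how the poller-related parameters can be pulled out of the proxy config. It is only a minimal sketch: the function name is made up for illustration, and the default path /etc/zabbix/zabbix_proxy.conf is an assumption about a standard package install. As far as I understand, MaxConcurrentChecksPerPoller caps the number of concurrent checks per async poller and defaults to 1000, which would line up with the ceiling I'm seeing.

```shell
#!/bin/sh
# Print the active (uncommented) poller-related settings from a Zabbix proxy config.
# show_poller_settings is a hypothetical helper name.
# The default path is an assumption; pass your own path as the first argument.
show_poller_settings() {
    conf="${1:-/etc/zabbix/zabbix_proxy.conf}"
    # Lines starting with '#' are skipped because the pattern is anchored
    # to the parameter name at the start of the line.
    grep -E '^[[:space:]]*(StartAgentPollers|MaxConcurrentChecksPerPoller|Timeout|StartPollersUnreachable|UnreachablePeriod|UnreachableDelay)[[:space:]]*=' "$conf"
}
```

Usage would be `show_poller_settings` on each proxy, or `show_poller_settings /path/to/zabbix_proxy.conf` if the file lives elsewhere.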

    Interestingly, server resources don't seem to be the constraint: CPU and RAM are underutilized and the Zabbix proxy servers appear idle.

    1. How can the "awaiting" values of ~1000, maxed out almost all the time, be explained? Why are those items not picked up and moved out to the queue?

    2. Assuming these are problematic items that occupy the slots because they run close to the allotted timeout (Timeout=10 in my zabbix_proxy.conf, though I also tried lowering it to 3s), why are they not moved to the unreachable pollers for later attempts?

    3. How can I check what is holding those 1000 items on a poller and preventing them from being processed?
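To put a number on how saturated the pollers are, the "awaiting" counts can be tallied from the process titles. This is a minimal sketch, assuming the titles look like the agent poller lines in my summary further down; the function name and the limit of 1000 are my own choices for illustration.

```shell
#!/bin/sh
# Summarise "awaiting" counts from agent poller process titles read on stdin.
# Assumes lines of the form "agent poller #N [... awaiting M]".
# summarise_awaiting is a hypothetical helper; the optional first argument is
# the per-poller limit to compare against (assumed 1000 here).
summarise_awaiting() {
    awk -v limit="${1:-1000}" '
        # Skip HTTP agent pollers, keep plain agent pollers only.
        !/http agent poller/ && /agent poller #[0-9]+ \[/ && match($0, /awaiting [0-9]+/) {
            # "awaiting " is 9 characters long; the rest of the match is the number.
            n = substr($0, RSTART + 9, RLENGTH - 9) + 0
            total += n
            count++
            if (n >= limit) full++
        }
        END { printf "pollers=%d total_awaiting=%d at_limit=%d\n", count, total, full }
    '
}
```

Something like `ps -eo args | summarise_awaiting`, run every few seconds on a proxy, should show whether the backlog ever drains or just sits at the limit.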

    NOTE: When I pick a random host and item (e.g. CPU) in the GUI and press the Test button, the value is returned instantly.
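The same kind of spot check can be done from the proxy's command line with zabbix_get, which rules out the frontend as a factor. A rough sketch follows; the function name is hypothetical, and the host, key and port are placeholders rather than values from my environment. Timing is only to whole seconds.

```shell
#!/bin/sh
# Time a single passive agent check, run from the proxy host.
# time_agent_check is a hypothetical helper.
# Arguments: host (required), item key (default agent.ping), port (default 10050).
time_agent_check() {
    host="$1"
    key="${2:-agent.ping}"
    port="${3:-10050}"
    start=$(date +%s)
    zabbix_get -s "$host" -p "$port" -k "$key"
    rc=$?
    end=$(date +%s)
    # Report the zabbix_get exit code and the elapsed wall-clock seconds.
    echo "exit=$rc elapsed=$((end - start))s"
}
```

For example, `time_agent_check 192.0.2.10 system.cpu.load` (192.0.2.10 being a documentation address, not a real host of mine). If this also returns immediately for hosts whose items are stuck, the agents themselves are probably not the slow part.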

    Below is an example from one of the Zabbix proxy servers, with a summary of the Zabbix processes running on it (see also the screenshots that follow).

    Code:
    Parameter                                           Value       Details
    =========                                           =====       =======
    Zabbix server is running                            Yes         zabbix-srv:10051
    Zabbix server version                               7.0.25      New update available
    Zabbix frontend version                             7.0.25      New update available
    Latest release                                      7.0.26      Release notes
    Number of hosts (enabled/disabled)                  2756        2729 / 27
    Number of templates                                 433    
    Number of items (enabled/disabled/not supported)    301001      282348 / 4351 / 14302
    Number of triggers (enabled/disabled [problem/ok])  103761      91473 / 12288 [526 / 90947]
    Required server performance, new values per second  2486.43    
    High availability cluster                           Disabled


    Code:
    cat /etc/os-release
    NAME="Red Hat Enterprise Linux"
    VERSION="8.10 (Ootpa)"

    Code:
    free -h
                  total        used        free      shared  buff/cache   available
    Mem:           15Gi       9.1Gi       655Mi       129Mi       5.6Gi       5.8Gi
    Swap:           9Gi        49Mi         9Gi

    Code:
    cat /proc/cpuinfo | grep processor
    processor : 0
    processor : 1
    processor : 2
    processor : 3

    Code:
    ===
    Load: load average: 1.35, 1.07, 1.11 CPU idle: 83.3 id
    ===
    ### Agent pollers ###
    Active: 20 Idle: 1
    agent poller #1 [got 1 values, queued 1 in 5 sec, awaiting 1000]
    agent poller #2 [got 7 values, queued 7 in 5 sec, awaiting 1000]
    agent poller #3 [got 14 values, queued 10 in 5 sec, awaiting 996]
    agent poller #4 [got 2 values, queued 0 in 5 sec, awaiting 998]
    agent poller #5 [got 3 values, queued 0 in 5 sec, awaiting 79]
    agent poller #6 [got 1 values, queued 1 in 5 sec, awaiting 666]
    agent poller #7 [got 0 values, queued 3 in 5 sec, awaiting 562]
    agent poller #8 [got 1 values, queued 1 in 5 sec, awaiting 1000]
    agent poller #9 [got 2 values, queued 4 in 5 sec, awaiting 1000]
    agent poller #10 [got 6 values, queued 6 in 5 sec, awaiting 1000]
    agent poller #11 [got 4 values, queued 6 in 5 sec, awaiting 1000]
    agent poller #12 [got 3 values, queued 2 in 5 sec, awaiting 999]
    agent poller #13 [got 1 values, queued 0 in 5 sec, awaiting 794]
    agent poller #14 [got 1 values, queued 1 in 5 sec, awaiting 1000]
    agent poller #15 [got 7 values, queued 14 in 5 sec, awaiting 804]
    agent poller #16 [got 1 values, queued 6 in 5 sec, awaiting 251]
    agent poller #17 [got 1 values, queued 0 in 5 sec, awaiting 998]
    agent poller #18 [got 11 values, queued 0 in 5 sec, awaiting 986]
    agent poller #19 [got 2 values, queued 2 in 5 sec, awaiting 292]
    agent poller #20 [got 1 values, queued 0 in 5 sec, awaiting 620]
    
    ### HTTP agent pollers ###
    Active: 0 Idle: 1
    
    ### SNMP pollers ###
    Active: 0 Idle: 1
    
    ### Classic pollers ###
    Active: 1 Idle: 9
    poller #25 [got 0 values in 0.000016 sec, getting values]
    
    ### Unreachable pollers ###
    Active: 1 Idle: 9
    unreachable poller #15 [got 0 values in 0.000033 sec, getting values]
    
    ### Trappers ###
    Active: 0 Idle: 10
    
    ### Preprocessing manager ###
    preprocessing manager #1 [queued 147, processed 168 values, idle 5.067374 sec during 5.080943 sec]
    
    ### TCP ###
    ESTABLISHED: 11 TIME_WAIT: 7185
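The TCP figures above are per-state socket counts. A minimal sketch of how such a breakdown can be produced from `ss -tan` output is below; the function name is made up, and note that ss spells the states ESTAB and TIME-WAIT rather than the netstat-style names in my summary.

```shell
#!/bin/sh
# Count TCP sockets per state from "ss -tan" output read on stdin.
# count_tcp_states is a hypothetical helper.
# The first line is the column header and is skipped; the state is column 1.
count_tcp_states() {
    awk 'NR > 1 { c[$1]++ } END { for (s in c) print s, c[s] }' | sort
}
```

Usage: `ss -tan | count_tcp_states`. Watching the TIME-WAIT count over time next to the poller backlog might show whether the two grow together.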


    Screenshots: img1.png, img2.png, img3.png, img4.png
    Last edited by joeblade; Today, 22:15.