Any tips on this are much appreciated.
Env
Zabbix server:
Zabbix 6.0.40. © 2001–2025, Zabbix SIA
Standard E4as v4 (4 vcpus, 32 GiB memory)
OS was upgraded from Ubuntu 20.04 to Ubuntu 24.04 in January 2025.
Database server:
Azure Database for MySQL Flexible Server
General Purpose, D4ds_v4, 4 vCores, 16 GiB RAM, 100 storage, 600 IOPS
DB parameters:
innodb_io_capacity=600
innodb_io_capacity_max=4000
Zabbix agent versions on the monitored hosts (Windows Server 2019 Datacenter):
Some agents are:
zabbix_agent2-6.0.26-windows-amd64-openssl.msi
Other agents have been updated to:
zabbix_agent2-6.0.40-windows-amd64-openssl.msi
Monitored hosts: 68
Required server performance, new values per second: 56.16
We have checked the value cache diagnostics:
zabbix_server -c /etc/zabbix/zabbix_server.conf -R diaginfo=valuecache
Summary of the diaginfo=valuecache output:
1. Cache Usage
Total Size: 266,518,696 bytes (~254MB)
Used: 1,689,944 bytes (~1.6MB)
Free: 266,518,696 bytes (~254MB)
Utilization: ~0.63% (extremely low)
2. Items vs. Values
Items: 3,295
Values: 57,279
Ratio: ~17 values per item (normal for active monitoring)
3. Performance
Time: 0.000965s (very fast response)
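The two derived figures in that summary can be re-checked from the raw numbers. A small sketch, taking the Total, Used, Items and Values figures exactly as posted (the variable names are only illustrative):

```python
# Figures copied from the summary above.
total_bytes = 266_518_696
used_bytes = 1_689_944
items = 3_295
values = 57_279

# Share of the value cache in use, as a percentage.
utilization_pct = used_bytes / total_bytes * 100

# Average number of cached values per item.
values_per_item = values / items

if __name__ == "__main__":
    print(f"utilization: {utilization_pct:.2f}%")
    print(f"values per item: {values_per_item:.1f}")
```

Either way, the value cache is nowhere near full, so it does not look like the bottleneck.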
We are seeing an issue like this:
ss -ltn output:
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:12563 0.0.0.0:* LISTEN -
tcp 4097 4096 0.0.0.0:10051 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:10050 0.0.0.0:* LISTEN -
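For context: on a listening TCP socket, Linux reports the current accept-queue length in Recv-Q and the backlog limit in Send-Q, so 4097 against 4096 on port 10051 means the accept queue is full. A minimal Python sketch of a check for that condition, assuming the column layout pasted above (the function name `find_full_listen_queues` is just illustrative, not part of Zabbix):

```python
def find_full_listen_queues(socket_listing: str) -> list:
    """Return listening sockets whose accept queue has reached its backlog limit.

    Assumes whitespace-separated columns in the order shown in the post:
    Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, ...
    """
    full = []
    for line in socket_listing.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a LISTEN row.
        if len(fields) < 6 or fields[5] != "LISTEN":
            continue
        try:
            recv_q, send_q = int(fields[1]), int(fields[2])
        except ValueError:
            continue
        # Recv-Q = connections waiting to be accepted, Send-Q = backlog limit.
        if send_q > 0 and recv_q >= send_q:
            full.append({"local": fields[3], "recv_q": recv_q, "send_q": send_q})
    return full


# The listing from the post, used as sample input.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:12563 0.0.0.0:* LISTEN -
tcp 4097 4096 0.0.0.0:10051 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:10050 0.0.0.0:* LISTEN -
"""

if __name__ == "__main__":
    for sock in find_full_listen_queues(SAMPLE):
        print(sock)
```

Run against the sample, only the 0.0.0.0:10051 socket is flagged. Real `ss` output may order its columns differently, in which case the field indexes would need adjusting.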
This resembles [ZBX-7933] zabbix generate TCP queue overflow - ZABBIX SUPPORT:
From time to time Zabbix generates a TCP queue overflow.
After that, no traffic to or from Zabbix is possible, and only a Zabbix restart helps.
Example agent logs
2025/06/11 00:49:27.024499 [101] cannot receive data from [ZABBIX-IP:10051]: Cannot read message: 'read tcp HOST-NAME:64868->ZABBIX-IP21:10051: i/o timeout'
2025/06/11 00:49:27.024500 [101] active check configuration update from host [HOST-NAME] started to fail
Example Zabbix server log
sudo tail -f zabbix_server.log
# 1357:20240130:133326.485 failed to accept an incoming connection: connection rejected, getpeername() failed: [107] Transport endpoint is not connected.
Trapper process utilization is 0% when it happens, so the trappers have nothing to do.
It would at least explain things if the trappers were more than 75% busy, but utilization stays at 0.
We do not see any Zabbix alerts when Recv-Q is full on TCP port 10051 on the Zabbix server, and the problem is fixed instantly if we stop and start zabbix-server with sudo.
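Since nothing in Zabbix alerts on this, one thing we could watch instead is the kernel's own accept-queue counters. A minimal sketch, assuming a Linux host where /proc/net/netstat exposes ListenOverflows and ListenDrops on its TcpExt lines; the helper name `parse_tcpext` and the sample numbers are made up for illustration:

```python
def parse_tcpext(netstat_text: str) -> dict:
    """Parse the TcpExt header/value line pair from /proc/net/netstat text."""
    lines = netstat_text.splitlines()
    # Each section is a header line followed by a values line with the same prefix.
    for header, values in zip(lines, lines[1:]):
        if header.startswith("TcpExt:") and values.startswith("TcpExt:"):
            names = header.split()[1:]
            numbers = [int(v) for v in values.split()[1:]]
            return dict(zip(names, numbers))
    return {}


# Hypothetical, shortened sample; the real file has many more columns.
SAMPLE_NETSTAT = """\
TcpExt: SyncookiesSent ListenOverflows ListenDrops
TcpExt: 0 1523 1523
IpExt: InNoRoutes InTruncatedPkts
IpExt: 0 0
"""

if __name__ == "__main__":
    try:
        with open("/proc/net/netstat") as fh:
            counters = parse_tcpext(fh.read())
    except OSError:
        # Fall back to the sample when not running on Linux.
        counters = parse_tcpext(SAMPLE_NETSTAT)
    for name in ("ListenOverflows", "ListenDrops"):
        print(name, counters.get(name, 0))
```

These counters are cumulative since boot, so what matters is whether they increase between two readings taken while port 10051 is stuck.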
### Option: StartTrappers
StartTrappers=20
Hoping someone can share some tips here. Zabbix 6.0 is an LTS release, so we are hoping we don't need to upgrade yet.