Zabbix proxy: Utilization of trapper data collector processes is 100

  • gonmd
    Junior Member
    • Aug 2024
    • 9

    #1

    Zabbix proxy: Utilization of trapper data collector processes is 100

    Hi community,

    Our infrastructure:
    Server: Zabbix 6.2.9
    Zabbix proxy: 6.2.9
    Zabbix agent2: 6.2.7 | 6.2.9

    About a month ago our infrastructure had a storage problem in which all VMs became inaccessible for a week. When the problem was solved and the VMs became available again, the agents naturally started communicating with the proxy, and ever since, the trapper data collector utilization has been at 100%, something that had never happened before (but understandable given the circumstances).
    We then left it like this over the weekend (approx. 72 hours) for it to stabilize, which it didn't.
    Some of the troubleshooting we did:
    • All hosts were disabled on the frontend and the agents were stopped, in order to let the trapper utilization normalize.
    • A new proxy VM was created and installed, to make sure the problem wasn't damage to the original proxy or its DB. It wasn't.
    • We re-enabled the hosts one by one (waiting hours, or even a whole day, between them) and saw utilization rise by up to 6% per host for about 30 minutes, then drop to 0% for only 15 minutes, then rise again (flapping).
    • We also unlinked and cleared all templates and re-added them; still no effect.
    • Agent2 was reinstalled, on both the previous and the current version; that didn't work either.
    • A new interface (eth2) was even added to the proxy, to test whether it was an interface issue, but it wasn't.
    • Values in zabbix_proxy.conf were adjusted to see whether things would stabilize, which made the percentage drop to the usual 0.(…)%, but the up-and-down flapping continued.
    Also, this is the message the agent log (tail) shows when the timeout is at the default (3 s); if changed to 15, 20, even 25 s it shows the same message, and it only disappears at 30 s:
    [Screenshot: agent log error (Captura de ecrã 2024-08-27 100822.png)]
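    Given that the error above only clears at a 30-second timeout, the corresponding agent-side setting would be the following. This is just a sketch: 30 is the maximum value the Timeout parameter accepts, so this masks the slowness rather than explaining it.

```ini
# zabbix_agent2.conf (illustrative; the valid Timeout range is 1-30 seconds)
Timeout=30
```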
    Zabbix Server: Utilization of trapper data collector processes:

    [Screenshot: Captura de ecrã 2024-08-26 111323.png]
    Zabbix Proxy: Utilization of trapper data collector processes:
    [Screenshot: Captura de ecrã 2024-08-26 111349.png]
    These are the values we have in our configuration files:

    zabbix_proxy.conf:
    StartPollersUnreachable=5
    StartPingers=5
    CacheSize=64M
    Timeout=5
    Everything else is default

    agent2.conf:
    Timeout=3

    Before the storage issue we'd never had any problems with the agent.
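    Since the alarm is about trapper processes specifically (on a proxy, trappers service incoming active-agent connections), a first round of proxy-side tuning might look like this. This is a sketch; the values are illustrative, not recommendations:

```ini
# zabbix_proxy.conf (illustrative additions)
StartTrappers=10       # default is 5; trappers handle active-agent traffic
LogSlowQueries=3000    # log DB queries taking longer than 3000 ms,
                       # to rule out a slow database after the storage incident
```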

    Can you please share some insight into what the issue could be here?

    Thank you

    Last edited by gonmd; 27-08-2024, 11:09.
  • gonmd
    Junior Member
    • Aug 2024
    • 9

    #2
    Here's some testing:

    In Latest data, this host shows no data with an active agent template and the default timeout:

    [Screenshot: activeagentemplate.png]


    Once changed to a passive agent template, data begins to be collected:

    [Screenshot: passiveagenttemplate.png]

    This is the trapper data collector graph after the template was changed to the passive agent one (still the same behavior):

    [Screenshot: trapperafterpassivetemplate.png]


    And the same error in the agent log shown in the original post is still being displayed.


    • gonmd
      Junior Member
      • Aug 2024
      • 9

      #3
      Could anyone help, please?
      Thanks


      • gonmd
        Junior Member
        • Aug 2024
        • 9

        #4
        Hi Community,

        Does anyone have information to share about this behavior that could help us? We are stuck, and we don't understand this behavior with active monitoring after the problem with the physical storage.

        Regards,


        • mrportatoes
          Junior Member
          • Mar 2023
          • 6

          #5
          We're having a very similar issue. Currently on 7.0.8 with two proxy servers, one dedicated to active agents. We had a power outage, and once the servers came back online we saw the trapper utilization (for active agents) hovering between 90 and 100%, where before it was in the ~1% range. I can also see that CPU utilization has gone way down on the proxy server, almost as if it's trying to process everything one at a time instead of asynchronously, if that makes sense. Furthermore, running "ss -ntl" shows that the Recv-Q is pegged at 128. I tried adding "ListenBacklog" to the proxy conf, which essentially increases the Recv-Q limit; however, even after bumping it up to 4096, it remains pegged at the higher value.
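          For reference, the backlog change described above is a single proxy parameter (value illustrative). Note that the kernel caps the effective listen backlog at net.core.somaxconn, so that sysctl may need raising as well:

```ini
# zabbix_proxy.conf
ListenBacklog=4096
```

          After a proxy restart, the Send-Q column of "ss -ntl" for the LISTEN socket on port 10051 should reflect the effective backlog.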

          In the past we had run into some file descriptor limits, but we have /etc/systemd/system/zabbix-proxy.service.d/filelimit.conf defined with some rather large values, and, more importantly, they haven't changed since the power outage.

          I also bumped our "StartTrappers" value in the proxy conf by a significant amount (300 to 400), and it only dropped utilization by ~5%. Given it was working before (and arguably should have been lowered, given the ~1% utilization), cranking this up to 1000 isn't the answer.

          I also made some adjustments to the TCP parameters, shortening the keepalive and timeout settings, but this hasn't made any difference either:

          sysctl -w net.ipv4.tcp_max_syn_backlog=4096
          sysctl -w net.ipv4.tcp_keepalive_time=600
          sysctl -w net.ipv4.tcp_fin_timeout=30
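          The same tunings in persistent form, applied with "sysctl --system". This is a sketch; the drop-in file name is arbitrary:

```ini
# /etc/sysctl.d/99-zabbix-tcp.conf (file name illustrative)
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_fin_timeout = 30
```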

          If we get this sorted, I'll post the fix here. Likewise, if you happen to have fixed it in your instance, please share anything you can.
          Cheers


          • gonmd
            Junior Member
            • Aug 2024
            • 9

            #6
            Hi,

             Thanks for sharing that info. On our side, the workaround was to convert from active monitoring to passive. We ran several tests but were not able to find a fix.
            If you get this sorted, please post the fix.

            Cheers


            • mrportatoes
              Junior Member
              • Mar 2023
              • 6

              #7
               Just wanted to check in and note we never got a clear fix either. Since we are running two proxies, we grouped them and enabled load balancing. This has brought up some new challenges around host redirects, but in general the TCP queues are clear and our "Available" host numbers are as high as they have ever been. I'm certain this is a capacity/configuration issue, but I can't say definitively what it was.

               I also took a page from your playbook and flipped a few of our PowerShell-based items from Active to Passive. This dramatically dropped the queue count for us, but I can't say it had an impact on agent communication with the proxies.


              • cyber
                Senior Member
                Zabbix Certified Specialist, Zabbix Certified Professional
                • Dec 2006
                • 4807

                #8
                First 2 posts... I would say: issues with the connection from agent to proxy (agent->proxy:10051). The agent log says it cannot get active items from the proxy, so the agent is not able to talk to the proxy (firewall restrictions?). When you changed the items from active to passive, you changed the direction of the queries: now the agent only responds to queries from the proxy, and since you are getting data, your proxy->agent:10050 connection is okay.


                • gonmd
                  Junior Member
                  • Aug 2024
                  • 9

                  #9
                  Originally posted by cyber
                  First 2 posts... I would say: issues with the connection from agent to proxy (agent->proxy:10051). The agent log says it cannot get active items from the proxy, so the agent is not able to talk to the proxy (firewall restrictions?). When you changed the items from active to passive, you changed the direction of the queries: now the agent only responds to queries from the proxy, and since you are getting data, your proxy->agent:10050 connection is okay.
                  Hi,

                  If it's a firewall issue, that's strange: before the storage incident there were no problems with active monitoring. Also, some of the clients are on the same network as the proxy, so no firewall is involved there.

                  [root@somehost ~]# nc -z -v X.X.X.X 10051
                  Ncat: Version 7.70 ( https://nmap.org/ncat )
                  Ncat: Connected to X.X.X.X:10051.
                  Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
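                  A TCP connect like the one above only proves the port is open; the trapper can still fail to answer a real request. Below is a minimal sketch of building an "active checks" request in the Zabbix protocol ("ZBXD" signature, 0x01 flag, 8-byte little-endian body length, then the JSON body). The hostname "testhost" and the output path are illustrative:

```shell
# Build a minimal "active checks" request in the Zabbix protocol.
body='{"request":"active checks","host":"testhost"}'
len=$(printf %s "$body" | wc -c)
{
  printf 'ZBXD\001'                          # 4-byte signature + protocol flag
  i=0
  while [ "$i" -lt 8 ]; do                   # 8-byte little-endian length
    printf "\\$(printf '%03o' $(( (len >> (8 * i)) & 255 )))"
    i=$((i + 1))
  done
  printf %s "$body"
} > /tmp/active_checks.req
# Send it to the proxy and show the JSON part of the reply (header is 13 bytes):
# nc -w 5 X.X.X.X 10051 < /tmp/active_checks.req | tail -c +14
```

                  If the trapper is healthy, the reply should be a JSON object with "response":"success" (or "failed" with an explanatory "info" field for an unknown host), which tells you more than a bare connect does.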
