**Brambo** · 23-12-2024, 10:26

The one thing I can think of is that you have active agent items whose data isn't arriving at the expected interval. So the issue may not be your server but the source hosts, which 'fail' to report.
Maybe an update was running at that time? Besides updates, a backup routine can cause this as well, e.g. if your hosts get suspended while it runs.

**Linwood** · 24-12-2024, 03:00

There's only one item that uses active checks, and it's on only 4 hosts, not the dozen or more that gave an error. So I don't think so. I do have a large number of external checks (I have even replaced ICMP with an external version to collect other info). But I can't see anything that would stall those either, and they certainly didn't time out (that would show in the log).

**Brambo** · 24-12-2024, 08:51

Are you sure that discovery rules / item prototypes don't use these items as dependent items for data processing?
Without more details it's hard to help, e.g. is it every night or just a one-off?

**Linwood** · 25-12-2024, 02:04

Yes. I have historically built most templates for SNMP, added a few non-active Zabbix agent items, and only added an active log file item some time ago. Not for any good reason, I just never did much with active. The one active item has a trigger but nothing dependent on it. There are a LOT of dependent ones for external checks though. Hmmm... let me go look... data collector processes did peak at that time. It didn't even hit 80%, but who knows if that's accurate if it wasn't processing. But I don't know what would do that either. External checks run in poller processes, right?

But I don't think these timed out, as I don't see a single external check that timed out in the log at that time.

PS. It hasn't happened again.

**Linwood** · 25-12-2024, 17:48

Happened again last night, at roughly (not precisely) the same time of day, but not the same day of week (as it might be for some scheduled job). I suspect someone is doing something to the hypervisor, stalling the guest. Off to do more detective work. There's nothing I can find on the Zabbix server related to that time of day.
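One way to check the "stalled guest" theory is to record a heartbeat inside the guest and look for gaps. The snippet below is only a minimal sketch: the function names, the 1 second beat and the 2 second tolerance are made up for illustration, and whether a hypervisor pause actually shows up as a gap depends on how the guest clock is resynced after the pause.

```python
import time


def find_stalls(timestamps, expected_s=1.0, tolerance_s=2.0):
    """Return (start, gap) pairs where two consecutive heartbeats are
    further apart than expected_s + tolerance_s seconds."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > expected_s + tolerance_s:
            stalls.append((prev, gap))
    return stalls


def heartbeat(n, expected_s=1.0):
    """Collect n wall-clock timestamps, sleeping expected_s between them.
    Wall-clock time is used on the assumption that it jumps forward when
    a paused guest resumes."""
    stamps = []
    for _ in range(n):
        stamps.append(time.time())
        time.sleep(expected_s)
    return stamps


# Synthetic example: beats at 0, 1, 2 s, then nothing until 10 s.
sample = [0, 1, 2, 10, 11]
print(find_stalls(sample))  # [(2, 8)]
```

Running `heartbeat()` across the suspect time window and feeding the result to `find_stalls()` would show whether the guest itself lost time, as opposed to only the Zabbix processes being idle.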

**Linwood** · 12-01-2025, 02:21

To put a bit of closure to this: though I do not quite understand how, it appears that the tunnels between locations are going down every day at about the same time (due to a rekey event with a lifetime of 1 day). Why the tunnels go down for about 5 minutes is in the hands of someone else, who doesn't seem to care much, and they want to fix it themselves rather than have me try. The odd part is that the outages drive up the queue as above, but that may be because I'm using so many external checks and they are timing out, and the accumulation of timing-out item checks is what I am actually seeing.
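A back-of-the-envelope sketch of that accumulation, assuming every check that goes through the tunnel blocks a poller for the full timeout while the tunnel is down. The function name and all numbers are hypothetical, not measured on this system, and the model ignores everything else the pollers do.

```python
def overdue_items(items, interval_s, pollers, timeout_s, outage_s):
    """Rough estimate of how many items fall behind during an outage.

    items      -- checks that go through the tunnel
    interval_s -- update interval of those checks, in seconds
    pollers    -- poller processes available to run them
    timeout_s  -- how long one failing check blocks a poller
    outage_s   -- how long the tunnel is down
    """
    demand = items / interval_s     # checks coming due per second
    capacity = pollers / timeout_s  # failing checks the pollers can finish per second
    deficit = max(0.0, demand - capacity) * outage_s
    # An item can only be overdue once, so cap at the item count.
    return min(items, deficit)


# Hypothetical: 200 checks on a 60 s interval, 10 pollers, 30 s timeout,
# 5 minute outage. Demand far exceeds capacity, so every item ends up late.
print(overdue_items(200, 60, 10, 30, 300))  # 200
```

With numbers like these the pollers can finish only a fraction of the checks that come due, so the queue fills within the first minute or so of the outage and drains only once the tunnel is back.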

At any rate, until we can get rid of the rekey issues and see if it solves the problem, this is in limbo - but I do not think it is a Zabbix mystery after all.

Weird period of no items processed and queue growth - cause?
