Hi guys,
Running Zabbix 2.0.5 in a CentOS clustered environment with a remote clustered MySQL database (corosync and pacemaker for both clusters). Our environment is approximately the following:
Number of hosts (monitored/not monitored/templates): 2158 (1999 / 25 / 134)
Number of items (monitored/disabled/not supported): 354330 (234822 / 16500 / 103008)
Number of triggers (enabled/disabled [problem/unknown/ok]): 27151 (21129 / 6022 [142 / 0 / 20987])
Required server performance, new values per second: 1300.24
We have run into a fairly annoying issue.
I've done a bunch of checking as to possible causes, as well as searched online, and I can't seem to find anyone with a similar issue.
Basically, a large number of our checks seem to be returning duplicate results.
For example, a simple free disk space % check:
2013.Oct.30 11:16:29 99.746
2013.Oct.30 11:16:19 99.7459
2013.Oct.30 11:11:28 99.746
2013.Oct.30 11:11:18 99.746
2013.Oct.30 11:06:27 99.7461
2013.Oct.30 11:06:18 99.7461
2013.Oct.30 11:01:27 99.7462
2013.Oct.30 11:01:17 99.7462
2013.Oct.30 10:56:26 99.7462
2013.Oct.30 10:56:17 99.7462
This check is scheduled to run every 5 minutes:
Name: Free disk space on $1 (percentage)
Type: Zabbix agent (active)
Key: vfs.fs.size[{#FSNAME},pfree]
Type of information: Numeric (float)
Units: %
Update interval (in sec): 300
Keep history (in days): 20
Keep trends (in days): 365
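To confirm the same pattern on other items, here is a minimal sketch (assuming history has been exported as timestamp/value lines like the ones above) that flags pairs of samples arriving far closer together than a 300-second update interval allows:

```python
from datetime import datetime

# Sample history lines as pasted above (timestamp, value)
history = """\
2013.Oct.30 11:16:29 99.746
2013.Oct.30 11:16:19 99.7459
2013.Oct.30 11:11:28 99.746
2013.Oct.30 11:11:18 99.746
2013.Oct.30 11:06:27 99.7461
2013.Oct.30 11:06:18 99.7461
2013.Oct.30 11:01:27 99.7462
2013.Oct.30 11:01:17 99.7462
2013.Oct.30 10:56:26 99.7462
2013.Oct.30 10:56:17 99.7462"""

def duplicate_pairs(lines, max_gap=30):
    """Return consecutive samples that arrived within max_gap seconds
    of each other -- i.e. two results for one scheduled check."""
    stamps = []
    for line in lines.splitlines():
        date, time, value = line.split()
        stamps.append((datetime.strptime(f"{date} {time}",
                                         "%Y.%b.%d %H:%M:%S"), float(value)))
    stamps.sort()  # oldest first
    dupes = []
    for (t1, _v1), (t2, _v2) in zip(stamps, stamps[1:]):
        gap = (t2 - t1).total_seconds()
        if gap < max_gap:
            dupes.append((t1, t2, gap))
    return dupes

for t1, t2, gap in duplicate_pairs(history):
    print(f"{t1} / {t2}  ({gap:.0f}s apart)")
```

On the data above this reports a pair roughly 10 seconds apart inside every 5-minute window.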
I didn't notice this right away, because the data points don't always line up so closely. On some machines they are offset/staggered by 30 seconds or so, so you just get a constant flow of data: for example, if a check is scheduled to run every 2 minutes, I might end up getting data every minute (interleaved between the two sets of checks).
I noticed the problem when I started getting really inconsistent graphs for a check:
2013.Oct.30 11:32:39 0
2013.Oct.30 11:32:24 120
2013.Oct.30 11:30:39 0
2013.Oct.30 11:30:24 120
2013.Oct.30 11:28:39 0
2013.Oct.30 11:28:24 120
2013.Oct.30 11:26:39 0
2013.Oct.30 11:26:24 120
2013.Oct.30 11:24:39 0
2013.Oct.30 11:24:24 120
What I've noticed:
- This happens across multiple templates/items and multiple types of checks (both out-of-the-box items and custom-defined UserParameters, some of which run external scripts)
- It has been going on for a while; I just didn't notice it right away
What I've checked so far:
- I've validated that the zabbix_agentd.conf and zabbix_server.conf files are correct (they appear to be)
- I've tried stopping and starting the agent and server services
- I checked that the cluster's secondary node wasn't also polling (its services are disabled unless the primary server goes down)
- I checked that the logs aren't showing any errors
- I tried changing the RefreshActiveChecks parameter to make sure it wasn't that
- I've checked that neither the server nor the agent machines have extra processes running that might return two sets of item data for each request
- I've run the checks manually via zabbix_get, and I get one set of results (sanitized):
[<user>@<server>]~% zabbix_get -s <servername> -k "system.cpu.util[,idle]"
51.507022
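For context, these are the agent-side parameters that govern active checks in 2.0 (hostnames and values below are illustrative, not our actual config). One thing worth mentioning: as I understand it, the agent runs an independent set of active checks for each entry in ServerActive, so a duplicate entry for the same server (e.g. both an IP and a DNS name, or both cluster nodes) would collect every active item twice:

```
# /etc/zabbix/zabbix_agentd.conf (illustrative values)
Server=zabbix.example.com        # passive checks: who may poll this agent
ServerActive=zabbix.example.com  # active checks: each comma-separated entry
                                 # here is served independently, so listing
                                 # the same server twice doubles the data
Hostname=myhost.example.com      # must match the host name in the frontend
RefreshActiveChecks=120          # how often the active item list is re-fetched
```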
The only lead I have so far:
I've found that it seems like it *might* be somehow tied to active checks. When I switch a check to passive, the data comes in as I'd expect. For example:
2013.Oct.30 11:18:14 0.305
2013.Oct.30 11:16:10 0.3575
2013.Oct.30 11:14:05 0.2675
2013.Oct.30 11:12:23 0.0825
2013.Oct.30 11:09:53 0.3275
2013.Oct.30 11:08:12 0.165
2013.Oct.30 11:06:21 0.2425
2013.Oct.30 11:05:07 0.1625 <- First results after I switched the check from active to passive
2013.Oct.30 11:03:48 0.255
2013.Oct.30 11:03:48 0.255
2013.Oct.30 11:01:48 0.27
2013.Oct.30 11:01:48 0.27
2013.Oct.30 10:59:48 0.46
2013.Oct.30 10:59:48 0.46
(Granted, there seems to be more variation in the times the data comes in; it should be every 2 minutes, but it fluctuates more.)
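Interestingly, the duplicates from before the switch share the exact same timestamp (unlike the ~10-second offsets on the disk-space item), which is easy to count from the exported lines above, a sketch:

```python
from collections import Counter

# The active-check rows from before the switch to passive, as pasted above
lines = """\
2013.Oct.30 11:03:48 0.255
2013.Oct.30 11:03:48 0.255
2013.Oct.30 11:01:48 0.27
2013.Oct.30 11:01:48 0.27
2013.Oct.30 10:59:48 0.46
2013.Oct.30 10:59:48 0.46""".splitlines()

# timestamp -> number of rows received at that exact second
counts = Counter(" ".join(line.split()[:2]) for line in lines)
dupes = {ts: n for ts, n in counts.items() if n > 1}
print(dupes)  # every active-check timestamp appears more than once
```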
I've done a bunch of searching, and can't seem to find anyone with the same issue, or anything in the documentation that seems to indicate what it might be.
I figured I'd reach out and see if anyone can think of anything.
Any/all help is appreciated, thanks!
-Zillions