Zabbix Server 1.8
Clients 1.8
MySQL db
Number of clients: about 1,000 (all Linux)
We reinstalled all the clients over the past week. After that was finished, I noticed that about 70 of the nodes were reported as "unreachable". Running "zabbix_get -s node_name -k system.uptime" against them works, and telnetting from the worker node back to the server port works. The client config has the correct hostname and server name.
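For anyone who wants to repeat the zabbix_get check across many nodes at once, here is a minimal sketch in Python. It assumes zabbix_get is on the PATH of the machine running it and that the agents accept connections from that machine. The function names (`zabbix_get`, `check_hosts`) and the 10 second timeout are made up for illustration, not part of Zabbix.

```python
import subprocess


def zabbix_get(host, key, timeout=10):
    """Run `zabbix_get -s host -k key` once; return (ok, output_or_error)."""
    try:
        proc = subprocess.run(
            ["zabbix_get", "-s", host, "-k", key],
            capture_output=True,
            text=True,
            timeout=timeout,  # hypothetical limit, tune as needed
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Binary missing, not executable, or the agent never answered.
        return False, str(exc)
    out = proc.stdout.strip()
    # An agent that does not know the key answers with ZBX_NOTSUPPORTED.
    ok = proc.returncode == 0 and out != "" and not out.startswith("ZBX_NOTSUPPORTED")
    return ok, out or proc.stderr.strip()


def check_hosts(hosts, key="system.uptime", probe=zabbix_get):
    """Probe every host with the same key; return {host: (ok, output)}."""
    return {host: probe(host, key) for host in hosts}
```

Calling `check_hosts(["node001", "node002"])` (hypothetical node names) returns one `(ok, output)` pair per node, so the 70 "unreachable" nodes can be compared against what the agents actually answer. The `probe` argument only exists so the loop can be exercised without a live agent.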
What troubles me is the very high number of items in the "10 minute queue". In addition, clients that are "reachable" are starting to show stale data (2+ hours old) on the latest data page.
Restarting Zabbix and/or MySQL helps for a few hours (the 10 minute queue decreases), but eventually the items queue up again. I can even run zabbix_get against several of the queued items and get data!
Debug logging on the clients shows the requests for information, so I tend to believe the Zabbix server is getting the data, just not entering it in the database.