To start off:
Could you please help me figure out where else to look?
My server is a 2 vCPU, 2 GB RAM CentOS VPS hosted in Singapore.
Current DB size is 17 GB.
The server has been running well for the last 2 years since deployment.
I have 3 proxy servers in different locations:
- proxy1 is 4 vCPU, 4 GB RAM, SSD
- proxy2 is 2 vCPU, 2 GB RAM, HDD
- proxy3 is 2 vCPU, 2 GB RAM, SSD
Suddenly got massive alerts yesterday, 10/23/2019 at 8:00 PM, about agents not checking in.
So I went into those remote locations and confirmed everything is fine EXCEPT that the Zabbix proxies and the server just suddenly stopped talking!
In the log of one of my proxy servers the same errors keep coming (see the excerpt further down). I can confirm there is good ping connectivity and a latency of 40-50 ms between the proxy and the Zabbix server.
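Ping only exercises ICMP, so a TCP-level check against the server's trapper port may be a useful complement. The following is a minimal sketch, not a definitive diagnostic: the function name `tcp_connect_time` is hypothetical, and port 10051 is assumed to be in use because it is the Zabbix server default. It only times the TCP handshake, so it would not by itself reproduce a timeout that happens while data is being written.

```python
import socket
import time


def tcp_connect_time(host, port=10051, timeout=5.0):
    """Return the seconds needed to open a TCP connection, or None on failure.

    Assumption: the Zabbix server listens on its default trapper port 10051.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return None
```

Running `tcp_connect_time("zabbix.bryanit.net")` from each proxy host and comparing the result with the ping latency would show whether TCP connections to the server are being accepted at all.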
I have been scratching my head over what is wrong.
Virtually nothing was changed; the problem just popped up.
Got really high queues for hours and hours, and the graphs and latest data in the Zabbix server are confirmed incomplete.
Agents that don't go through any proxy are working fine.
These are the logs from one of my proxy servers. In the Zabbix server logs I see slow query entries from time to time.
7760:20191024:210943.025 housekeeper [deleted 157585 records in 1.894099 sec, idle for 1 hour(s)]
7757:20191024:211436.640 received configuration data from server at "zabbix.bryanit.net", datalen 1115427
7759:20191024:211922.719 cannot send proxy data to server at "zabbix.bryanit.net": ZBX_TCP_WRITE() timed out
7759:20191024:211933.789 cannot send proxy data to server at "zabbix.bryanit.net": ZBX_TCP_WRITE() failed: [32] Broken pipe
7757:20191024:211956.776 received configuration data from server at "zabbix.bryanit.net", datalen 1115427
7757:20191024:212516.775 received configuration data from server at "zabbix.bryanit.net", datalen 1115427
7759:20191024:212656.165 cannot send proxy data to server at "zabbix.bryanit.net": ZBX_TCP_WRITE() failed: [32] Broken pipe
7757:20191024:213112.790 received configuration data from server at "zabbix.bryanit.net", datalen 1115427
7759:20191024:213203.214 cannot send proxy data to server at "zabbix.bryanit.net": ZBX_TCP_WRITE() failed: [104] Connection reset by peer
7757:20191024:213614.468 received configuration data from server at "zabbix.bryanit.net", datalen 1115427
7757:20191024:214116.723 received configuration data from server at "zabbix.bryanit.net", datalen 1115427
7759:20191024:214205.555 cannot send proxy data to server at "zabbix.bryanit.net": ZBX_TCP_WRITE() timed out
7759:20191024:214216.063 cannot send proxy data to server at "zabbix.bryanit.net": ZBX_TCP_WRITE() failed: [32] Broken pipe
7757:20191024:214618.423 received configuration data from server at "zabbix.bryanit.net", datalen 1115427
7757:20191024:215120.484 received configuration data from server at "zabbix.bryanit.net", datalen 1115427
7759:20191024:215130.160 cannot send proxy data to server at "zabbix.bryanit.net": ZBX_TCP_WRITE() failed: [32] Broken pipe
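To see which failure dominates over a longer stretch of the log, the send errors can be tallied by message. This is a rough sketch that assumes the line format shown in the excerpt above; the names `SEND_ERROR` and `tally_send_errors` are hypothetical, and the regular expression has only been matched against these sample lines.

```python
import re
from collections import Counter

# Matches lines such as:
# 7759:20191024:211922.719 cannot send proxy data to server at "host": <reason>
SEND_ERROR = re.compile(
    r'^\s*\d+:\d{8}:\d{6}\.\d+ cannot send proxy data to server at "[^"]+": (.+)$'
)


def tally_send_errors(lines):
    """Count 'cannot send proxy data' log lines, grouped by failure reason."""
    counts = Counter()
    for line in lines:
        match = SEND_ERROR.match(line)
        if match:
            counts[match.group(1).strip()] += 1
    return counts
```

Passing the lines of the proxy log file to `tally_send_errors` (the file location depends on the `LogFile` setting of the proxy) returns a `Counter` keyed by reasons such as `ZBX_TCP_WRITE() timed out`.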
Querying the backlog on the proxy server gives this result:
MariaDB [zabbix]> select max(id)-(select nextid from ids where table_name = "proxy_history" limit 1) from proxy_history;
+-----------------------------------------------------------------------------+
| max(id)-(select nextid from ids where table_name = "proxy_history" limit 1) |
+-----------------------------------------------------------------------------+
|                                                                      230366 |
+-----------------------------------------------------------------------------+
1 row in set (0.00 sec)