Hello everyone,
I have a zabbix setup that monitors two fairly active servers - both CE7 machines with 5.2 agents setup, first shows proper graphs throughout: http://prntscr.com/v8ljo9, whilst the second one has lapses like this in EVERY possible graph (though not necessarily at the same time): http://prntscr.com/v8ljdy
What can be the cause? The server is not really overloaded (...it's all it does, monitoring these two machines) and, whilst the 2nd machine does exhibit some overusage at points, this doesn't really coincide with the lapses. Moreover, i keep on seeing timeouts on commands that shouldn't take more than a second:
Whilst the server logs something across these lines:
I've increased the timeouts on both the server and agent to no avail and I'm really out of options as to what to investigate/change - i'd appreciate all the help I can get.
LE: the laps in graphs cover ALL zbx agent info, not strictly those MySQL ones - load averages and everything else is missing.
I have a zabbix setup that monitors two fairly active servers - both CE7 machines with 5.2 agents setup, first shows proper graphs throughout: http://prntscr.com/v8ljo9, whilst the second one has lapses like this in EVERY possible graph (though not necessarily at the same time): http://prntscr.com/v8ljdy
What can be the cause? The server is not really overloaded (...it's all it does, monitoring these two machines) and, whilst the 2nd machine does exhibit some overusage at points, this doesn't really coincide with the lapses. Moreover, i keep on seeing timeouts on commands that shouldn't take more than a second:
HTML Code:
3005666:20201028:073209.470 Failed to execute command " mysql -h"localhost" -P"3306" -sN -e "SELECT COALESCE(SUM(DATA_LENGTH + INDEX_LENGTH),0) FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA='DBhere'"": Timeout while executing a shell script. 3005667:20201028:073209.472 Failed to execute command " mysql -h"localhost" -P"3306" -sN -e "SELECT COALESCE(SUM(DATA_LENGTH + INDEX_LENGTH),0) FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA='DBHERE2'"": Timeout while executing a shell script.
HTML Code:
560437:20201102:095325.830 Zabbix agent item "mysql.dbsize["{$MYSQL.HOST}","{$MYSQL.PORT}","dbnamehere "]" on host "HOSTHERE" failed: first network error, wait for 15 seconds
560443:20201102:095340.838 resuming Zabbix agent checks on host "HOSTHERE": connection restored
I've increased the timeouts on both the server and agent to no avail and I'm really out of options as to what to investigate/change - i'd appreciate all the help I can get.
LE: the laps in graphs cover ALL zbx agent info, not strictly those MySQL ones - load averages and everything else is missing.