Zabbix server version 7.0.9
The database is on a separate server running PostgreSQL 13.
We are monitoring 1204 hosts.
We have recently started getting "Utilization of timer processes over 75%" alerts. The problem is intermittent, with most of the alerts occurring overnight.
When the issue started, StartTimers in zabbix_server.conf was set to 10. I've gradually increased it to 30, but that hasn't resolved the issue.
Neither the application server nor the database server is under any CPU or memory load during these times.
I see a connection timeout in the database logs that corresponds to when we receive the alerts: "LOG: could not receive data from client: Connection timed out." A DBA has confirmed that the database is not running out of connections and seems to be in good order.
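To make the correlation between the timeouts and the alerts easier to check, here is a minimal sketch that counts the timeout entries per hour in a PostgreSQL log. It assumes each log line starts with a timestamp like "2024-11-02 03:14:07", which depends on the log_line_prefix setting. The function name timeouts_per_hour and the timestamp format are my own assumptions, not anything Zabbix or PostgreSQL provides.

```python
import re
from collections import Counter

# Assumption: log lines begin with "YYYY-MM-DD HH:MM:SS"; this depends on
# the log_line_prefix configured on the PostgreSQL server.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2}")

# The message text quoted from the database log.
MESSAGE = "could not receive data from client: Connection timed out"


def timeouts_per_hour(lines):
    """Count client-timeout log entries per (date, hour) bucket."""
    counts = Counter()
    for line in lines:
        if MESSAGE not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:  # lines without a leading timestamp are skipped
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

Feeding it the lines of the log file and comparing the busiest hours with the times of the timer alerts should show whether the two really line up, or whether the timeouts also happen at times when no alert fires.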
We have reached out to the network team, but they haven't found anything.
I didn't find anything useful in the Zabbix server log.
Is there anything else I can do to troubleshoot this issue?