We are running about 150 custom home-grown agents (written in Perl) that send data to a Zabbix server (v1.8.3). Each agent sends data about hundreds of apps running on its host; in total, 46511 items are monitored.
On the agent side I can see the following error:
"agent encountered server error 104 during read: Connection reset by peer"
The other messages are either
"agent could not connect to Zabbix server zabbix01:10051"
or
"agent encountered server error 0 during read:"
You can see on the graph that some connections are constantly being refused and some resets are being sent.
I tried increasing the number of trappers (to 100), and I also increased the number of allowed DB connections (to 300). We are using PostgreSQL 9.0.1 running locally on the machine.
As a result of these problems, not all of the data makes it to the database. There are gaps for all the apps, at random, lasting anywhere from a few minutes to hours.
Any idea what might be the reason for such problems? What can I try to improve the performance of our Zabbix setup?
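For reference, those two changes presumably correspond to settings like the ones below. The parameter names are my assumption of the usual ones (StartTrappers in zabbix_server.conf and max_connections in postgresql.conf); only the values 100 and 300 come from what I actually changed.

```
# zabbix_server.conf (assumed parameter for the number of trapper processes)
StartTrappers=100

# postgresql.conf (assumed parameter for the allowed DB connections)
max_connections = 300
```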