Hi all,
I've deployed Zabbix at the company I currently work for. So far it has served us pretty well, but lately we seem to be running into problems with our iptables-based firewalls.
Our Zabbix server monitors about 20 hosts on a different subnet, separated from it by a pair of iptables firewalls. About two days ago the primary unit started logging the following line:
Jul 13 09:40:14 fw01 kernel: ip_conntrack: table full, dropping packet.
Counting the ip_conntrack table entries, it appears that monitoring those 20 hosts (each with about 30 items) adds somewhere between 7000 and 9000 entries to this table. Looking at the individual hosts, there are 100+ connections in TIME_WAIT state at any given moment.
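For reference, here is a minimal sketch of how such a count could be scripted. It assumes the classic /proc/net/ip_conntrack line layout (protocol name, protocol number, remaining timeout, then the TCP state). The function name and the sample entries are made up for illustration and are not taken from our firewall; 10050 is the default Zabbix agent port.

```python
from collections import Counter


def count_tcp_states(lines, dport=None):
    """Count conntrack TCP entries per state, optionally for one destination port."""
    states = Counter()
    for line in lines:
        fields = line.split()
        # Assumed layout: tcp 6 <timeout> <STATE> src=... dst=... sport=... dport=...
        if len(fields) < 4 or fields[0] != "tcp":
            continue
        if dport is not None:
            # The first dport= field belongs to the original (outgoing) direction.
            first_dport = next((f for f in fields if f.startswith("dport=")), None)
            if first_dport != f"dport={dport}":
                continue
        states[fields[3]] += 1
    return states


# Hypothetical sample entries, only to show the expected input shape.
SAMPLE = [
    "tcp      6 98 TIME_WAIT src=10.0.1.5 dst=10.0.2.11 sport=41022 dport=10050 "
    "src=10.0.2.11 dst=10.0.1.5 sport=10050 dport=41022 [ASSURED] use=1",
    "tcp      6 110 TIME_WAIT src=10.0.1.5 dst=10.0.2.12 sport=41023 dport=10050 "
    "src=10.0.2.12 dst=10.0.1.5 sport=10050 dport=41023 [ASSURED] use=1",
    "tcp      6 431990 ESTABLISHED src=10.0.1.7 dst=10.0.2.11 sport=50100 dport=22 "
    "src=10.0.2.11 dst=10.0.1.7 sport=22 dport=50100 [ASSURED] use=1",
    "udp      17 25 src=10.0.1.5 dst=10.0.2.1 sport=53000 dport=53 "
    "src=10.0.2.1 dst=10.0.1.5 sport=53 dport=53000 use=1",
]

if __name__ == "__main__":
    print(count_tcp_states(SAMPLE, dport=10050))
```

On the firewall itself the same function could be fed `open("/proc/net/ip_conntrack")` instead of `SAMPLE`, which would show how many of the tracked entries are Zabbix agent connections sitting in TIME_WAIT.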
I increased the maximum number of entries in the ip_conntrack table, and after lowering tcp_fin_timeout on the monitoring host to 30 seconds things seem to have improved a bit (about 6000 entries in the table now), but I'm wondering whether this is the expected behaviour. I was expecting only a couple of connections to each monitored host. What else can I do to keep this overhead under control?
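As a rough sanity check on these numbers, the steady-state table size can be estimated as new connections per second multiplied by how long each entry stays tracked. The sketch below is only back-of-the-envelope arithmetic under assumptions: that every item check opens its own short-lived TCP connection, that a closed connection stays in the conntrack table for about 120 seconds, and that items are polled every 10 seconds. The interval in particular is a guess, not our actual configuration, and the function name is hypothetical.

```python
def estimated_entries(hosts, items_per_host, interval_s, linger_s):
    """Steady-state tracked connections: arrival rate times linger time."""
    checks_per_second = hosts * items_per_host / interval_s
    return checks_per_second * linger_s


if __name__ == "__main__":
    # 20 hosts and 30 items come from the setup described above;
    # interval_s and linger_s are assumptions for illustration only.
    print(estimated_entries(20, 30, interval_s=10, linger_s=120))
```

If those assumptions are roughly right, a table with thousands of entries would follow from one connection per check rather than from a handful of long-lived connections per host.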