I have 4 servers: 1 master and 3 children. The master is node ID 1; the children are 2, 3, and 4. The children operate fine by themselves, but 2 and 4 fail to sync with the master, giving me:
NODE 1: Sending configuration changes to slave node 4 for node 4 datalen 1738670
NODE 1: Sending configuration changes to slave node 2 for node 2 datalen 2577922
NODE 1: Error while receiving answer from Node [2] error: ZBX_TCP_READ() failed [Interrupted system call]
NODE 1: Error while receiving answer from Node [4] error: ZBX_TCP_READ() failed [Interrupted system call]
I've changed the Zabbix trapper timeout to 300, but the sync doesn't finish in time, and I think the connection is getting dropped.
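For reference, here is a rough back-of-the-envelope check of what the 300-second timeout implies for the two payloads in the log. This is only a sketch: it assumes the "datalen" values are in bytes (I have not confirmed that), and the function name is just something I made up for the calculation.

```python
# Payload sizes taken from the "datalen" values in the log above.
# ASSUMPTION: datalen is reported in bytes.
payloads = {2: 2577922, 4: 1738670}

# The trapper timeout I set, in seconds.
timeout_s = 300


def min_rate_kib_per_s(size_bytes, timeout_seconds):
    """Minimum average throughput (KiB/s) needed to move size_bytes within the timeout."""
    return size_bytes / timeout_seconds / 1024


for node_id, size in sorted(payloads.items()):
    rate = min_rate_kib_per_s(size, timeout_s)
    print(f"node {node_id}: {size / 2**20:.2f} MiB, needs at least {rate:.1f} KiB/s")
```

If the byte assumption holds, both payloads are only a few MiB, so raw transfer speed alone may not explain the timeout; the time could also be going into processing the changes on the receiving node, but I can't tell which from the log.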
Background info.
I was creating some large templates on node 2 for switchgear, and I applied and removed templates several times; I think it just got behind.
I'm not totally sure about node 4, but I have added and removed a lot of hosts at a time and made large configuration changes, and I suspect that it just got behind too.
Node info.
1:
Number of hosts (monitored/not monitored/templates) 1059 195 / 666 / 198
Number of items (monitored/disabled/not supported) 2003 1843 / 131 / 29
Number of triggers (enabled/disabled)[true/unknown/false] 2594 2567 / 27 [6 / 1223 / 1338]
Required server performance, new values per second 25.6744 -
--at one point performance was up to 54
2:
Number of hosts (monitored/not monitored/templates) 567 163 / 6 / 398
Number of items (monitored/disabled/not supported) 1109 1085 / 0 / 24
Number of triggers (enabled/disabled)[true/unknown/false] 1634 1626 / 8 [13 / 14 / 1599]
Required server performance, new values per second 6.6571 -
3:
Number of hosts (monitored/not monitored/templates) 76 24 / 1 / 51
Number of items (monitored/disabled/not supported) 476 476 / 0 / 0
Number of triggers (enabled/disabled)[true/unknown/false] 768 768 / 0 [0 / 0 / 768]
Required server performance, new values per second 9.4667 -
4:
Number of hosts (monitored/not monitored/templates) 763 666 / 0 / 97
Number of items (monitored/disabled/not supported) 1079 933 / 131 / 15
Number of triggers (enabled/disabled)[true/unknown/false] 1040 1021 / 19 [0 / 844 / 177]
Required server performance, new values per second 19.9187 -
I think the transfer is just not finishing within the timeout. I've tried stopping and starting the servers, and stopping node 2 to see whether node 1 would finish, with no luck. I'm looking for a way to purge the configuration data that the master is trying to send to the children. Is this possible? I don't want to rebuild my master if I don't have to. Is there a way to start the daemon so that all it does is sync up? Can I manually export the data from the master and import it into the child? Any help will be appreciated. Thanks.
It seems like I should be able to purge something on the master. I'm not a Zabbix or MySQL guru, but I get along with guidance. If I knew which data to get rid of, I'm sure I could figure out how.
I have a couple of concerns: it looks like the mainstream version was 1.6.6 at the time this possible solution was found. I did find this post(