Hi,
The setup:
Both boxes run Ubuntu Server 12.04 + Zabbix 2;
Box 1 (master node) runs fine on its own and with proxies;
Box 2 (first node, database number 2) runs without problems and TRIES to send its data to box 1, but receives a TCP error after a few seconds:
Code:
19639:20120622:090416.798 cannot send list of active checks to [127.0.0.1]: host [zabbix_node1] not monitored
19647:20120622:090536.807 NODE 2: Sending configuration changes to master node 1 for node 2 datalen 5760
19647:20120622:090543.173 NODE 2: Error while receiving answer from Node [1] error: ZBX_TCP_READ() failed: [104] Connection reset by peer
19647:20120622:090545.477 NODE 2: Error while receiving answer from Node [1] error: ZBX_TCP_READ() failed: [104] Connection reset by peer
19647:20120622:090545.510 NODE 2: Unable to connect to Node [1] error: cannot connect to [[10.11.210.241]:10051]: [111] Connection refused
19647:20120622:090545.521 NODE 2: Unable to connect to Node [1] error: cannot connect to [[10.11.210.241]:10051]: [111] Connection refused
19647:20120622:090545.532 NODE 2: Unable to connect to Node [1] error: cannot connect to [[10.11.210.241]:10051]: [111] Connection refused
19647:20120622:090545.538 NODE 2: Unable to connect to Node [1] error: cannot connect to [[10.11.210.241]:10051]: [111] Connection refused
19647:20120622:090545.544 NODE 2: Unable to connect to Node [1] error: cannot connect to [[10.11.210.241]:10051]: [111] Connection refused
19647:20120622:090545.551 NODE 2: Unable to connect to Node [1] error: cannot connect to [[10.11.210.241]:10051]: [111] Connection refused
19647:20120622:090545.561 NODE 2: Unable to connect to Node [1] error: cannot connect to [[10.11.210.241]:10051]: [111] Connection refused
19647:20120622:090545.577 NODE 2: Unable to connect to Node [1] error: cannot connect to [[10.11.210.241]:10051]: [111] Connection refused
19647:20120622:090545.582 NODE 2: Unable to connect to Node [1] error: cannot connect to [[10.11.210.241]:10051]: [111] Connection refused
19647:20120622:090545.589 NODE 2: Unable to connect to Node [1] error: cannot connect to [[10.11.210.241]:10051]: [111] Connection refused
Last log lines in box 1 (master node):
Code:
root@zabbixone:~# tail /tmp/zabbix_server.log -n100
1470:20120622:053054.113 ******************************
1472:20120622:053054.632 server #1 started [configuration syncer #1]
1480:20120622:053054.655 server #9 started [trapper #1]
1481:20120622:053054.656 server #10 started [trapper #2]
1473:20120622:053054.659 server #2 started [db watchdog #1]
1482:20120622:053054.664 server #11 started [trapper #3]
1474:20120622:053054.670 server #3 started [poller #1]
1484:20120622:053054.691 server #13 started [trapper #5]
1485:20120622:053054.692 server #14 started [icmp pinger #1]
1486:20120622:053054.693 server #15 started [alerter #1]
1487:20120622:053054.694 server #16 started [housekeeper #1]
1487:20120622:053054.695 executing housekeeper
1483:20120622:053054.697 server #12 started [trapper #4]
1477:20120622:053054.700 server #6 started [poller #4]
1476:20120622:053054.701 server #5 started [poller #3]
1478:20120622:053054.703 server #7 started [poller #5]
1475:20120622:053054.707 server #4 started [poller #2]
1494:20120622:053054.716 server #18 started [node watcher #1]
1493:20120622:053054.723 server #17 started [timer #1]
1479:20120622:053054.725 server #8 started [unreachable poller #1]
1499:20120622:053054.732 server #21 started [history syncer #1]
1500:20120622:053054.739 server #22 started [history syncer #2]
1496:20120622:053054.740 server #19 started [http poller #1]
1501:20120622:053054.741 server #23 started [history syncer #3]
1507:20120622:053054.742 server #26 started [ipmi poller #1]
1508:20120622:053054.743 server #27 started [ipmi poller #2]
1505:20120622:053054.747 server #24 started [history syncer #4]
1506:20120622:053054.748 server #25 started [escalator #1]
1470:20120622:053054.757 server #0 started [main process]
1516:20120622:053054.757 server #28 started [proxy poller #1]
1497:20120622:053054.758 server #20 started [discoverer #1]
1517:20120622:053054.763 server #29 started [self-monitoring #1]
1497:20120622:053054.788 fping failed: "10.11.210.0 :"
1487:20120622:053104.017 housekeeper deleted: 38 records from history and trends, 0 records of deleted items, 0 events, 0 alerts, 0 sessions
1482:20120622:053151.326 NODE 1: Received configuration changes from slave node 2 for node 2 datalen 5760
*** stack smashing detected ***: /usr/local/sbin/zabbix_server terminated
1482:20120622:053156.997 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x65373234]. Crashing ...
1482:20120622:053156.998 ====== Fatal information: ======
1482:20120622:053156.998 Program counter: 0xb7034b19
1482:20120622:053156.998 === Registers: ===
1482:20120622:053156.998 gs = 33 = 51 = 51
1482:20120622:053156.998 fs = 0 = 0 = 0
1482:20120622:053156.998 es = 7b = 123 = 123
1482:20120622:053156.998 ds = 7b = 123 = 123
1482:20120622:053156.998 edi = bfb83190 = 3216519568 = -1078447728
1482:20120622:053156.998 esi = bfb830d0 = 3216519376 = -1078447920
1482:20120622:053156.999 ebp = bfb83218 = 3216519704 = -1078447592
1482:20120622:053156.999 esp = bfb83050 = 3216519248 = -1078448048
1482:20120622:053156.999 ebx = b703bff4 = 3070476276 = -1224491020
1482:20120622:053157.000 edx = 65373234 = 1698116148 = 1698116148
1482:20120622:053157.000 ecx = bfb84470 = 3216524400 = -1078442896
1482:20120622:053157.000 eax = bfb83190 = 3216519568 = -1078447728
1482:20120622:053157.000 trapno = e = 14 = 14
1482:20120622:053157.001 err = 4 = 4 = 4
1482:20120622:053157.001 eip = b7034b19 = 3070446361 = -1224520935
1482:20120622:053157.001 cs = 73 = 115 = 115
1482:20120622:053157.001 efl = 210246 = 2163270 = 2163270
1482:20120622:053157.002 uesp = bfb83050 = 3216519248 = -1078448048
1482:20120622:053157.002 ss = 7b = 123 = 123
1482:20120622:053157.002 === Stack frame: ===
1482:20120622:053157.003 +0x40(%ebp) = ebp + 64 = 00000000 = 0 = 0
1482:20120622:053157.003 +0x3c(%ebp) = ebp + 60 = 00000040 = 64 = 64
1482:20120622:053157.003 +0x38(%ebp) = ebp + 56 = 00000004 = 4 = 4
1482:20120622:053157.003 +0x34(%ebp) = ebp + 52 = bfb837a0 = 3216521120 = -1078446176
1482:20120622:053157.004 +0x30(%ebp) = ebp + 48 = 00000000 = 0 = 0
1482:20120622:053157.004 +0x2c(%ebp) = ebp + 44 = 080cebce = 135064526 = 135064526
1482:20120622:053157.004 +0x28(%ebp) = ebp + 40 = bfb83908 = 3216521480 = -1078445816
1482:20120622:053157.004 +0x24(%ebp) = ebp + 36 = 00000000 = 0 = 0
1482:20120622:053157.005 +0x20(%ebp) = ebp + 32 = 00000000 = 0 = 0
1482:20120622:053157.005 +0x1c(%ebp) = ebp + 28 = 080cebcf = 135064527 = 135064527
1482:20120622:053157.005 +0x18(%ebp) = ebp + 24 = 00000000 = 0 = 0
1482:20120622:053157.005 +0x14(%ebp) = ebp + 20 = b70b590b = 3070974219 = -1223993077
1482:20120622:053157.006 +0x10(%ebp) = ebp + 16 = 081225d4 = 135407060 = 135407060
1482:20120622:053157.006 +0x0c(%ebp) = ebp + 12 = bfb8324c = 3216519756 = -1078447540
1482:20120622:053157.006 +0x08(%ebp) = ebp + 8 = b7173f00 = 3071753984 = -1223213312 <--- call arguments
1482:20120622:053157.007 +0x04(%ebp) = ebp + 4 = b7174007 <--- return address
1482:20120622:053157.007 (%ebp) = ebp = bfb83278 <--- saved ebp value
1482:20120622:053157.007 -0x04(%ebp) = ebp - 4 = bfb837a0 = 3216521120 = -1078446176 <--- local variables
1482:20120622:053157.008 -0x08(%ebp) = ebp - 8 = 00000040 = 64 = 64
1482:20120622:053157.008 -0x0c(%ebp) = ebp - 12 = 00000000 = 0 = 0
1482:20120622:053157.008 -0x10(%ebp) = ebp - 16 = 00000000 = 0 = 0
1482:20120622:053157.008 -0x14(%ebp) = ebp - 20 = 00000000 = 0 = 0
1482:20120622:053157.009 -0x18(%ebp) = ebp - 24 = 00000000 = 0 = 0
1482:20120622:053157.009 -0x1c(%ebp) = ebp - 28 = 00000000 = 0 = 0
1482:20120622:053157.009 -0x20(%ebp) = ebp - 32 = 00000000 = 0 = 0
1482:20120622:053157.009 -0x24(%ebp) = ebp - 36 = 00000000 = 0 = 0
1482:20120622:053157.010 -0x28(%ebp) = ebp - 40 = 40000000 = 1073741824 = 1073741824
1482:20120622:053157.010 -0x2c(%ebp) = ebp - 44 = 0806fc60 = 134675552 = 134675552
1482:20120622:053157.010 -0x30(%ebp) = ebp - 48 = 0812fff4 = 135462900 = 135462900
1482:20120622:053157.011 -0x34(%ebp) = ebp - 52 = 00000000 = 0 = 0
1482:20120622:053157.011 -0x38(%ebp) = ebp - 56 = 00000000 = 0 = 0
1482:20120622:053157.011 -0x3c(%ebp) = ebp - 60 = 65373234 = 1698116148 = 1698116148
1482:20120622:053157.012 -0x40(%ebp) = ebp - 64 = bfb84470 = 3216524400 = -1078442896
1482:20120622:053157.012 === Backtrace: ===
1470:20120622:053157.013 One child process died (PID:1482,exitcode/signal:11). Exiting ...
1470:20120622:053159.015 syncing history data...
1470:20120622:053159.122 syncing history data done
1470:20120622:053159.122 syncing trends data...
1470:20120622:053159.945 syncing trends data done
1470:20120622:053159.946 Zabbix Server stopped. Zabbix 2.0.0 (revision 27675).
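In case it helps when reading the master log: below is a rough Python sketch (not part of Zabbix) that works out which server process crashed and what it logged last. It assumes every entry has the form "PID:YYYYMMDD:HHMMSS.mmm message", as in the log above; the names find_crash, ENTRY and STARTED are made up for this example.

```python
import re

# Assumed entry layout, taken from the log above: "PID:YYYYMMDD:HHMMSS.mmm message".
ENTRY = re.compile(r"^\s*(\d+):(\d{8}):(\d{6}\.\d{3}) (.*)$")
# Startup lines such as "server #11 started [trapper #3]" name each process.
STARTED = re.compile(r"server #\d+ started \[(.+)\]")


def find_crash(lines):
    """Return (pid, role, last message before the crash), or None if no crash is logged."""
    roles = {}  # pid -> process role, e.g. "trapper #3"
    last = {}   # pid -> most recent ordinary message logged by that pid
    for line in lines:
        m = ENTRY.match(line)
        if not m:
            continue  # lines without the PID prefix (e.g. glibc's "stack smashing" notice)
        pid, _date, _time, msg = m.groups()
        started = STARTED.search(msg)
        if started:
            roles[pid] = started.group(1)
        elif msg.startswith("Got signal"):
            return pid, roles.get(pid, "unknown"), last.get(pid)
        else:
            last[pid] = msg
    return None


# Four entries copied from the box 1 log above.
sample = [
    "1482:20120622:053054.664 server #11 started [trapper #3]",
    "1482:20120622:053151.326 NODE 1: Received configuration changes from slave node 2 for node 2 datalen 5760",
    "*** stack smashing detected ***: /usr/local/sbin/zabbix_server terminated",
    "1482:20120622:053156.997 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x65373234]. Crashing ...",
]
print(find_crash(sample))
```

On these entries it reports PID 1482, which the startup lines identify as trapper #3, and its last message before the SIGSEGV is the "Received configuration changes from slave node 2" line. So the master seems to die while handling the configuration data sent by node 2, which would explain the "Connection reset by peer" and then "Connection refused" on box 2.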