Hi all,
I'm having a strange issue on my 5.4.0 proxy.
A crontab job restarts the service on a schedule, but sometimes the restart does not complete and leaves the proxy unavailable.
Looking in the log I can see:
Code:
931439:20211021:045201.579 Got signal [signal:11(SIGSEGV),reason:1,refaddr:(nil)]. Crashing ...
931439:20211021:045201.579 ====== Fatal information: ======
931439:20211021:045201.579 Program counter: 0x5575cb24e965
931439:20211021:045201.579 === Registers: ===
931439:20211021:045201.579 r8 = 5575cb760940 = 93964413045056 = 93964413045056
931439:20211021:045201.579 r9 = 7f0b4bca2580 = 139686492906880 = 139686492906880
931439:20211021:045201.579 r10 = 0 = 0 = 0
931439:20211021:045201.579 r11 = 7f0b4bca2600 = 139686492907008 = 139686492907008
931439:20211021:045201.579 r12 = 1 = 1 = 1
931439:20211021:045201.579 r13 = 5575cb3f7863 = 93964409469027 = 93964409469027
931439:20211021:045201.579 r14 = 0 = 0 = 0
931439:20211021:045201.580 r15 = 0 = 0 = 0
931439:20211021:045201.580 rdi = 0 = 0 = 0
931439:20211021:045201.580 rsi = 7ffefe48130c = 140733164557068 = 140733164557068
931439:20211021:045201.580 rbp = 7ffefe47f400 = 140733164549120 = 140733164549120
931439:20211021:045201.580 rbx = 1 = 1 = 1
931439:20211021:045201.580 rdx = 7ffefe48130c = 140733164557068 = 140733164557068
931439:20211021:045201.580 rax = 0 = 0 = 0
931439:20211021:045201.580 rcx = 6 = 6 = 6
931439:20211021:045201.580 rsp = 7ffefe47ebe0 = 140733164547040 = 140733164547040
931439:20211021:045201.580 rip = 5575cb24e965 = 93964407728485 = 93964407728485
931439:20211021:045201.580 efl = 10202 = 66050 = 66050
931439:20211021:045201.580 csgsfs = 2b000000000033 = 12103423998558259 = 12103423998558259
931439:20211021:045201.580 err = 4 = 4 = 4
931439:20211021:045201.580 trapno = e = 14 = 14
931439:20211021:045201.580 oldmask = 0 = 0 = 0
931439:20211021:045201.580 cr2 = 0 = 0 = 0
931439:20211021:045201.580 === Backtrace: ===
931439:20211021:045201.580 === Backtrace: ===
931439:20211021:045201.600 12: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](zbx_backtrace+0x3f) [0x5575cb36bbbf]
931439:20211021:045201.600 11: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](zbx_log_fatal_info+0x141) [0x5575cb36be1c]
931439:20211021:045201.600 10: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](+0x19d608) [0x5575cb36c608]
931439:20211021:045201.600 9: /lib64/libpthread.so.0(+0x12b20) [0x7f0b4dd2eb20]
931439:20211021:045201.600 8: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](+0x7f965) [0x5575cb24e965]
931439:20211021:045201.600 7: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](+0x7fe4e) [0x5575cb24ee4e]
931439:20211021:045201.600 6: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](datasender_thread+0x15b) [0x5575cb24f216]
931439:20211021:045201.600 5: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](zbx_thread_start+0x37) [0x5575cb370d49]
931439:20211021:045201.600 4: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](MAIN_ZABBIX_ENTRY+0xa60) [0x5575cb214e7e]
931439:20211021:045201.600 3: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](daemon_start+0x2ff) [0x5575cb36b7e9]
931439:20211021:045201.600 2: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](main+0x2f1) [0x5575cb2143d4]
931439:20211021:045201.600 1: /lib64/libc.so.6(__libc_start_main+0xf3) [0x7f0b4bb45493]
931439:20211021:045201.600 0: /usr/sbin/zabbix_proxy: data sender [sent 1000 values in 0.711600 sec, sending data](_start+0x2e) [0x5575cb21327e]
931439:20211021:045201.600 === Memory map: ===
931439:20211021:045201.600 5575cb1cf000-5575cb49e000 r-xp 00000000 08:12 16787027 /usr/sbin/zabbix_proxy_sqlite3
931439:20211021:045201.600 5575cb69d000-5575cb750000 r--p 002ce000 08:12 16787027 /usr/sbin/zabbix_proxy_sqlite3
931439:20211021:045201.600 5575cb750000-5575cb75e000 rw-p 00381000 08:12 16787027 /usr/sbin/zabbix_proxy_sqlite3
931439:20211021:045201.600 5575cb75e000-5575cb768000 rw-p 00000000 00:00 0
931439:20211021:045201.600 5575cd100000-5575cd121000 rw-p 00000000 00:00 0 [heap]
931439:20211021:045201.600 5575cd121000-5575cd1b9000 rw-p 00000000 00:00 0 [heap]
931439:20211021:045201.600 5575cd1b9000-5575cef6e000 rw-p 00000000 00:00 0 [heap]
931439:20211021:045201.600 7f0a44a60000-7f0b44a60000 rw-s 00000000 00:05 20054048 /SYSV00000000 (deleted)
931439:20211021:045201.600 7f0b44a60000-7f0b44e60000 rw-s 00000000 00:05 20054047 /SYSV00000000 (deleted)
931439:20211021:045201.600 7f0b44e60000-7f0b45e60000 rw-s 00000000 00:05 20054046 /SYSV00000000 (deleted)
931439:20211021:045201.601 7f0b45e60000-7f0b45e63000 r-xp 00000000 08:12 17245375 /opt/pbis/lib64/libuuid.so.0.0.0
931439:20211021:045201.601 7f0b45e63000-7f0b45f62000 ---p 00003000 08:12 17245375 /opt/pbis/lib64/libuuid.so.0.0.0
931439:20211021:045201.601 7f0b45f62000-7f0b45f63000 rw-p 00002000 08:12 17245375 /opt/pbis/lib64/libuuid.so.0.0.0
931439:20211021:045201.601 7f0b45f63000-7f0b45ff0000 r-xp 00000000 08:12 17177896 /opt/pbis/lib64/liblwbase_nothr.so.0.0.0
931439:20211021:045201.601 7f0b45ff0000-7f0b460ef000 ---p 0008d000 08:12 17177896 /opt/pbis/lib64/liblwbase_nothr.so.0.0.0
931439:20211021:045201.601 7f0b460ef000-7f0b4611d000 rw-p 0008c000 08:12 17177896 /opt/pbis/lib64/liblwbase_nothr.so.0.0.0
931439:20211021:045201.601 7f0b4611d000-7f0b46143000 r-xp 00000000 08:12 17159416 /opt/pbis/lib64/liblsacommon.so.0.0.0
followed by a very long series of these messages, and this has happened a few times.
After some time, on the next restart I see:
Code:
1010469:20211021:083201.518 Got signal [signal:15(SIGTERM),sender_pid:1060958,sender_uid:0 ,reason:0]. Exiting ...
1010564:20211021:083201.519 syncing history data in progress...
1010473:20211021:094152.816 configuration cache reloading is already in progress
1010473:20211021:094313.318 configuration cache reloading is already in progress
1010469:20211021:100856.926 syncing history data...
1010469:20211021:100856.953 syncing history data... 100.000000%
1010469:20211021:100856.953 syncing history data done
You can see that it took more than an hour to sync the history to SQLite, but that is only because I spent the time trying to understand what was going on. Tracing the parent process, I could see that it was waiting on one of the history syncers.
I then traced the history syncer and saw this:
Code:
[root@azitzbxprxp01 ~]# strace -f -p 1010561
strace: Process 1010561 attached
futex(0x7f9332103260, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 3, NULL, FUTEX_BITSET_MATCH_ANY^C
strace: Process 1010561 detached
When I killed this process, the parent process "moved on" to waiting on the next syncer, and so on, until I had killed all 4 syncers and also the availability manager.
After this, the parent zabbix_proxy "ended" correctly and could restart.
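For what it's worth, the length of that stalled shutdown can be read off the timestamps in the SIGTERM log excerpt above. Here is a minimal Python sketch, assuming the log prefix has the form PID:YYYYMMDD:HHMMSS.mmm as in those lines; the helper name parse_ts is only illustrative and not part of Zabbix.

```python
from datetime import datetime, timedelta


def parse_ts(entry: str) -> datetime:
    """Parse the 'PID:YYYYMMDD:HHMMSS.mmm' prefix of one log entry (assumed format)."""
    prefix = entry.split(" ", 1)[0]          # e.g. '1010469:20211021:083201.518'
    _pid, date, time = prefix.split(":")     # the PID is not needed here
    return datetime.strptime(date + time, "%Y%m%d%H%M%S.%f")


# First and last entries of the shutdown, copied from the log excerpt above.
sigterm_entry = (
    "1010469:20211021:083201.518 Got signal "
    "[signal:15(SIGTERM),sender_pid:1060958,sender_uid:0 ,reason:0]. Exiting ..."
)
done_entry = "1010469:20211021:100856.953 syncing history data done"

elapsed = parse_ts(done_entry) - parse_ts(sigterm_entry)
print(elapsed)  # 1:36:55.435000
```

So roughly 1 h 37 min passed between the SIGTERM and "syncing history data done", and that interval includes the time I spent tracing and killing the stuck processes.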
Is this something you have already seen? Is it worth opening an issue?
Pierluigi
P.S. Sorry for my bad English