Zabbix 6.0.18 crashes - child process died exitcode/signal:6

  • stefano_775
    Junior Member
    • Jul 2023
    • 14

    #1

    Hi

    Our Zabbix 6.0.18 server has had three instances where the zabbix_server process exited for no obvious reason.

    Relevant logs:

    .....
    1681176:20230727:023410.101 SNMP agent item "ifHCOutOctets.["46"]" on host <removed> failed: first network error, wait for 15 seconds
    1682860:20230727:023410.425 SNMP agent item "hwWlanRadioChUtilizationRate2.[<removed>]" on host <removed> failed: first network error, wait for 15 seconds
    1613353:20230727:023411.180 One child process died (PID:1681055,exitcode/signal:6). Exiting ...
    1613353:20230727:023411.180 PROCESS EXIT: 1681055
    1613361:20230727:023411.180 HA manager has been paused
    1683126:20230727:023411.306 cannot write to IPC socket: Broken pipe
    1683126:20230727:023411.306 cannot send data to LLD manager service
    1683709:20230727:023412.010 cannot write to IPC socket: Broken pipe
    1683709:20230727:023412.010 cannot retrieve alert results
    1613361:20230727:023412.860 HA manager has been stopped
    1613353:20230727:023412.976 syncing history data...
    1613353:20230727:023413.065 syncing history data... 100.000000%
    1613353:20230727:023413.065 syncing history data done
    1613353:20230727:023413.065 syncing trend data...
    1613353:20230727:023453.129 syncing trend data done
    1613353:20230727:023454.099 Zabbix Server stopped. Zabbix 6.0.18 (revision d2032721bc8).
    3089556:20230727:023504.224 Starting Zabbix Server. Zabbix 6.0.18 (revision d2032721bc8).
    .....

    I looked through several pages of logs preceding the crash, and all of the entries relate to hosts becoming unreachable or resuming.
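
A minimal Python sketch of one way to narrow that review down, assuming the log line format shown in the excerpt above (`<pid>:<date>:<time> <message>`). It finds each "One child process died" line and returns the last few messages written by the PID that died, so only that child's own output needs reading. The names `find_crashes` and `keep` are hypothetical, not part of Zabbix.

```python
import re
from collections import deque

# Line format as seen in the excerpt: "<pid>:<YYYYMMDD>:<HHMMSS.mmm> <message>"
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6}\.\d{3}) (.*)$")
# The crash line written by the main process, as seen in the excerpt
DIED_RE = re.compile(r"One child process died \(PID:(\d+),exitcode/signal:(\d+)\)")


def find_crashes(lines, keep=20):
    """Return a list of (dead_pid, code, last_messages_from_dead_pid)."""
    history = {}   # pid (str) -> deque of that PID's most recent log entries
    crashes = []
    for raw in lines:
        m = LINE_RE.match(raw)
        if not m:
            continue  # skip lines that do not match the assumed format
        pid, date, time, msg = m.groups()
        died = DIED_RE.search(msg)
        if died:
            dead_pid, code = died.groups()
            crashes.append((int(dead_pid), int(code), list(history.get(dead_pid, []))))
        # Remember the last `keep` entries for every PID seen so far
        history.setdefault(pid, deque(maxlen=keep)).append(f"{date}:{time} {msg}")
    return crashes
```

Usage would be along the lines of `find_crashes(open(path))`, where `path` is wherever the server log lives on the node (the location depends on the `LogFile` setting). An empty message list for the dead PID means that child logged nothing in the scanned portion of the file.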

    We are running an HA environment, so the standby node takes over as expected.
    Increasing the log verbosity is not viable, given how infrequently the issue occurs (weeks pass between crashes) and the fairly large size of our installation: ~40000 hosts and 2 million items.

    Any idea what might be causing this, or how to troubleshoot it further?

    Info on system:

    - 2 Zabbix server nodes (6.0.18)
    - 2 Frontends (6.0.18)
    - 3 database nodes (PostgreSQL 14.8 with timeseries). The primary database node is set statically in the Zabbix configuration (no database failover occurs)

    All VMs running Ubuntu 22.04

    Thanks

    Stefano
  • stefano_775
    Junior Member
    • Jul 2023
    • 14

    #2
    bumping


    • tim.mooney
      Senior Member
      • Dec 2012
      • 1427

      #3
      6.0.19 has been out for a month and 6.0.20 was just released today. They both have lots of fixes listed in the release notes, so it seems like they would be worth trying.
