Zabbix 1.1.6 suddenly segfaults

  • bbrendon
    Senior Member
    • Sep 2005
    • 870

    #1

    How interesting. Any ideas how to fix this one?

    zabbix_server[5822]: segfault at 0000000000000000 rip 00002b6ef7f12929 rsp 00007fffb36af708 error 6
    server:/var/log/zabbix-server#
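
For anyone reading the kernel line above: on x86 the `error` value in a segfault message is the page-fault error code, a small bit field. The sketch below decodes it. The function name `decode_page_fault_error` is made up for illustration, and only the three low bits are handled.

```python
def decode_page_fault_error(code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        # bit 0: set if the page was present (protection fault),
        #        clear if the page was not mapped at all
        "page_present": bool(code & 0x1),
        # bit 1: set for a write access, clear for a read
        "write_access": bool(code & 0x2),
        # bit 2: set if the fault happened in user mode
        "user_mode": bool(code & 0x4),
    }


if __name__ == "__main__":
    # "error 6" from the zabbix_server segfault line
    print(decode_page_fault_error(6))
```

Read together with `segfault at 0000000000000000`, error 6 means a user-mode write to an unmapped page at address 0, which is what a write through a NULL pointer looks like. It does not say which function did it.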

    Code:
    015083:20070314:002946 End of update_triggers [20504]
    015083:20070314:002946 Sending back [OK
    ]
    015083:20070314:002946 Length [3]
    015083:20070314:002946 Sockfd [5]
    015083:20070314:002946 After write()
    015083:20070314:002946 Before accept()
    015084:20070314:002946 After accept()
    015084:20070314:002946 In process_trapper_child
    015084:20070314:002946 Before read(65000)
    015084:20070314:002946 Read 160 bytes
    015084:20070314:002946 After read() 3 [160]
    015084:20070314:002946 Got data:<req><host>c2VhYnVyeS5zYjE=</host><key>cGVyZl9jb3VudGVyW1xQaHlzaWNhbERpc2soX1RvdGFsKVxDdXJyZW50IERpc2sgUXVldWUgTGVuZ3RoXQ==</key><data>MS4wMDAwMDA=</data></req>
    015084:20070314:002946 Trapper got [<req><host>c2VhYnVyeS5zYjE=</host><key>cGVyZl9jb3VudGVyW1xQaHlzaWNhbERpc2soX1RvdGFsKVxDdXJyZW50IERpc2sgUXVldWUgTGVuZ3RoXQ==</key><data>MS4wMDAwMDA=</data></req>]
    015084:20070314:002946 XML received [<req><host>c2VhYnVyeS5zYjE=</host><key>cGVyZl9jb3VudGVyW1xQaHlzaWNhbERpc2soX1RvdGFsKVxDdXJyZW50IERpc2sgUXVldWUgTGVuZ3RoXQ==</key><data>MS4wMDAwMDA=</data></req>]
    015084:20070314:002946 In process_data([seabury.sb1],[perf_counter[\PhysicalDisk(_Total)\Current Disk Queue Length]],[1.000000],[])
    015084:20070314:002946 Executing query:select i.itemid,i.key_,h.host,h.port,i.delay,i.description,i.nextcheck,i.type,i.snmp_community,i.snmp_oid,h.useip,h.ip,i.history,i.lastvalue,i.prevvalue,i.hostid,h.status,i.value_type,h.errors_from,i.snmp_port,i.delta,i.prevorgvalue,i.lastclock,i.units,i.multiplier,i.snmpv3_securityname,i.snmpv3_securitylevel,i.snmpv3_authpassphrase,i.snmpv3_privpassphrase,i.formula,h.available,i.status,i.trapper_hosts,i.logtimefmt,i.valuemapid from hosts h, items i where h.status=0 and h.hostid=i.hostid and h.host='seabury.sb1' and i.key_='perf_counter[\\PhysicalDisk(_Total)\\Current Disk Queue Length]' and i.status=0 and i.type in (2,7)
    015084:20070314:002946 In check_security()
    015084:20070314:002946 Processing [1.000000]
    015084:20070314:002946 In process_new_value()
    015084:20070314:002946 In add_history(perf_counter[\PhysicalDisk(_Total)\Current Disk Queue Length],,0,2)
    015084:20070314:002946 In add_history(21363,DOUBLE:1.000000)
    015084:20070314:002946 In add_history()
    015084:20070314:002946 Executing query:insert into history (clock,itemid,value) values (1173857386,21363,1.000000)
    015078:20070314:002946 Executing query:select i.itemid,i.key_,h.host,h.port,i.delay,i.description,i.nextcheck,i.type,i.snmp_community,i.snmp_oid,h.useip,h.ip,i.history,i.lastvalue,i.prevvalue,i.hostid,h.status,i.value_type,h.errors_from,i.snmp_port,i.delta,i.prevorgvalue,i.lastclock,i.units,i.multiplier,i.snmpv3_securityname,i.snmpv3_securitylevel,i.snmpv3_authpassphrase,i.snmpv3_privpassphrase,i.formula,h.available,i.status,i.trapper_hosts,i.logtimefmt,i.valuemapid from hosts h, items i where i.nextcheck<=1173857386 and i.status in (0) and i.type not in (2,7) and h.status=0 and h.disable_until<=1173857386 and h.errors_from!=0 and h.hostid=i.hostid and i.key_ not in ('status','icmpping','icmppingsec','zabbix[log]') order by i.nextcheck
    015078:20070314:002946 Spent 0 seconds while updating values
    015078:20070314:002946 Executing query:select count(*),min(i.nextcheck) as nextcheck from items i,hosts h where i.nextcheck<=1173857386 and i.status in (0) and i.type not in (2,7) and h.status=0 and h.disable_until<=1173857386 and h.errors_from!=0 and h.hostid=i.hostid and i.key_ not in ('status','icmpping','icmppingsec','zabbix[log]') order by nextcheck
    015078:20070314:002946 No items to update for minnextcheck.
    015078:20070314:002946 Nextcheck:-1 Time:1173857386
    015078:20070314:002946 Sleeping for 5 seconds
    015079:20070314:002946 Timeout while executing operation.
    015074:20070314:002946 One server process died. Shutting down...
    015074:20070314:002946 0. Killing PID=[15075]
    015074:20070314:002946 1. Killing PID=[15076]
    015074:20070314:002946 2. Killing PID=[15077]
    015074:20070314:002946 3. Killing PID=[15078]
    015074:20070314:002946 4. Killing PID=[15079]
    015074:20070314:002946 5. Killing PID=[15080]
    015074:20070314:002946 6. Killing PID=[15081]
    015074:20070314:002946 7. Killing PID=[15082]
    015081:20070314:002946 Server [7]. Got QUIT or INT or TERM or PIPE signal. Exiting...
    015075:20070314:002946 Server [1]. Got QUIT or INT or TERM or PIPE signal. Exiting...
    015082:20070314:002946 Server [8]. Got QUIT or INT or TERM or PIPE signal. Exiting...
    015077:20070314:002946 Server [3]. Got QUIT or INT or TERM or PIPE signal. Exiting...
    015083:20070314:002946 Server [9]. Got QUIT or INT or TERM or PIPE signal. Exiting...
    015078:20070314:002946 Server [4]. Got QUIT or INT or TERM or PIPE signal. Exiting...
    015080:20070314:002946 Server [6]. Got QUIT or INT or TERM or PIPE signal. Exiting...
    015074:20070314:002946 8. Killing PID=[15083]
    015074:20070314:002946 9. Killing PID=[15084]
    015084:20070314:002946 Server [10]. Got QUIT or INT or TERM or PIPE signal. Exiting...
    015074:20070314:002946 ZABBIX server is down.
    015079:20070314:002946 Server [5]. Got QUIT or INT or TERM or PIPE signal. Exiting...
    server:/var/log/zabbix-server#
    Unofficial Zabbix Expert
    Blog, Corporate Site
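
The `<req>` payloads in the log above carry base64-encoded fields. A minimal sketch for decoding one, using only the Python standard library, is below. The helper name `decode_trapper_request` is hypothetical, and it assumes every child element of `<req>` holds base64-encoded UTF-8 text, as the three fields in this log do.

```python
import base64
import xml.etree.ElementTree as ET


def decode_trapper_request(xml_text: str) -> dict:
    """Return {tag: decoded text} for each child element of <req>."""
    root = ET.fromstring(xml_text)
    return {
        child.tag: base64.b64decode(child.text or "").decode("utf-8")
        for child in root
    }


# Request copied from the "Got data:" log line in post #1.
REQUEST = (
    "<req><host>c2VhYnVyeS5zYjE=</host>"
    "<key>cGVyZl9jb3VudGVyW1xQaHlzaWNhbERpc2soX1RvdGFsKVxDdXJyZW50"
    "IERpc2sgUXVldWUgTGVuZ3RoXQ==</key>"
    "<data>MS4wMDAwMDA=</data></req>"
)

if __name__ == "__main__":
    for tag, value in decode_trapper_request(REQUEST).items():
        print(f"{tag}: {value}")
```

The decoded values should match what the server prints on its own `In process_data(...)` line, so this is mainly useful when the log level is too low to show that line.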
  • bbrendon
    Senior Member
    • Sep 2005
    • 870

    #2
    I went back to a database backup that is about 1 month old and it fixed the problem.

    I'll have to figure this one out in the morning. At least for now the alerts are back online.

    • Alexei
      Founder, CEO
      Zabbix Certified Trainer
      Zabbix Certified Specialist
      Zabbix Certified Professional
      • Sep 2004
      • 5654

      #3
      Could you please grep the log for '15076' and post the last few lines here?
      Alexei Vladishev
      Creator of Zabbix, Product manager
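
The grep requested above can also be done with a few lines of Python, which is handy when the log is large and only the tail for one process is wanted. This is a rough sketch: `last_lines_for_pid` is a made-up helper, and it assumes each log line starts with the PID zero-padded to six digits followed by a colon, as in the excerpts in this thread.

```python
from collections import deque


def last_lines_for_pid(lines, pid: int, count: int = 5) -> list:
    """Return the last `count` log lines written by process `pid`."""
    prefix = f"{pid:06d}:"  # e.g. 15076 -> "015076:"
    matching = (
        line.rstrip("\n") for line in lines if line.lstrip().startswith(prefix)
    )
    # deque with maxlen keeps only the newest `count` matches in memory
    return list(deque(matching, maxlen=count))


# A few lines copied from the log in post #1.
SAMPLE = [
    "015078:20070314:002946 Sleeping for 5 seconds",
    "015079:20070314:002946 Timeout while executing operation.",
    "015074:20070314:002946 One server process died. Shutting down...",
    "015078:20070314:002946 Server [4]. Got QUIT or INT or TERM or PIPE signal. Exiting...",
]

if __name__ == "__main__":
    for line in last_lines_for_pid(SAMPLE, 15078, count=2):
        print(line)
```

For a real log, pass an open file object instead of `SAMPLE`; the file is read line by line, so it does not need to fit in memory.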

      • bbrendon
        Senior Member
        • Sep 2005
        • 870

        #4
        You are so smart

        How did you know it was #1?

        It appears the remote actions are causing it to crash!

        020009:20070314:104035 Run remote commands START [actionid:36]
        020009:20070314:104035 get_next_command START [command_list: 'box#touch /tmp/remotecommandtest']
        020009:20070314:104035 Result of get_next_command [alias:box, is_group:1, command:touch /tmp/remotecommandtest]
        020009:20070314:104035 Executing query:select distinct h.host from hosts_groups hg,hosts h, groups g where hg.hostid=h.hostid and hg.groupid=g.groupid and g.name='box'
        020009:20070314:104035 run_remote_command START [hostname: 'box.cheetah3', command: 'touch /tmp/remotecommandtest']
        020009:20070314:104035 Executing query:select distinct host,ip,useip,port from hosts where host='box.cheetah3'
        020009:20070314:104035 get_value_agent: host[box.cheetah3] ip[] key [system.run[touch /tmp/remotecommandtest,nowait]]
        020009:20070314:104035 gethostbyname() failed [Unknown host]
        020009:20070314:104035 run_remote_command [result:-3]
        020009:20070314:104035 run_remote_command START [hostname: 'box.elephant', command: 'touch /tmp/remotecommandtest']
        020009:20070314:104035 Executing query:select distinct host,ip,useip,port from hosts where host='box.elephant'
        020009:20070314:104035 get_value_agent: host[box.elephant] ip[] key [system.run[touch /tmp/remotecommandtest,nowait]]
        020009:20070314:104035 gethostbyname() failed [Unknown host]
        020009:20070314:104035 run_remote_command [result:-3]
        020009:20070314:104035 run_remote_command START [hostname: 'box.firewall', command: 'touch /tmp/remotecommandtest']
        020009:20070314:104035 Executing query:select distinct host,ip,useip,port from hosts where host='box.firewall'
        020009:20070314:104035 get_value_agent: host[box.firewall] ip[67.155.108.2] key [system.run[touch /tmp/remotecommandtest,nowait]]
        020007:20070314:104040 1. Killing PID=[20009]
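
On the question of how PID 15076 could be singled out from the first log: one plausible reading (my assumption, the thread does not say) is that the master process killed ten children, but only nine of them logged the "Got QUIT or INT or TERM or PIPE signal" line. The child that stayed silent was most likely already dead. The sketch below automates that comparison; `silent_pids` and the two regular expressions are illustrative names, written against the log format shown in post #1.

```python
import re

KILL_RE = re.compile(r"Killing PID=\[(\d+)\]")
EXIT_RE = re.compile(r"^\s*(\d+):\d+:\d+ Server \[\d+\]\. Got QUIT")


def silent_pids(lines) -> list:
    """Return PIDs the master killed that never logged their own exit."""
    killed, exited = set(), set()
    for line in lines:
        kill = KILL_RE.search(line)
        if kill:
            killed.add(int(kill.group(1)))
        done = EXIT_RE.match(line)
        if done:
            exited.add(int(done.group(1)))
    return sorted(killed - exited)


# Shutdown sequence copied from the end of the log in post #1.
SHUTDOWN_LOG = """\
015074:20070314:002946 0. Killing PID=[15075]
015074:20070314:002946 1. Killing PID=[15076]
015074:20070314:002946 2. Killing PID=[15077]
015074:20070314:002946 3. Killing PID=[15078]
015074:20070314:002946 4. Killing PID=[15079]
015074:20070314:002946 5. Killing PID=[15080]
015074:20070314:002946 6. Killing PID=[15081]
015074:20070314:002946 7. Killing PID=[15082]
015081:20070314:002946 Server [7]. Got QUIT or INT or TERM or PIPE signal. Exiting...
015075:20070314:002946 Server [1]. Got QUIT or INT or TERM or PIPE signal. Exiting...
015082:20070314:002946 Server [8]. Got QUIT or INT or TERM or PIPE signal. Exiting...
015077:20070314:002946 Server [3]. Got QUIT or INT or TERM or PIPE signal. Exiting...
015083:20070314:002946 Server [9]. Got QUIT or INT or TERM or PIPE signal. Exiting...
015078:20070314:002946 Server [4]. Got QUIT or INT or TERM or PIPE signal. Exiting...
015080:20070314:002946 Server [6]. Got QUIT or INT or TERM or PIPE signal. Exiting...
015074:20070314:002946 8. Killing PID=[15083]
015074:20070314:002946 9. Killing PID=[15084]
015084:20070314:002946 Server [10]. Got QUIT or INT or TERM or PIPE signal. Exiting...
015079:20070314:002946 Server [5]. Got QUIT or INT or TERM or PIPE signal. Exiting...
""".splitlines()

if __name__ == "__main__":
    print(silent_pids(SHUTDOWN_LOG))  # prints [15076]
```

The same check applied to the second log would need the full shutdown sequence, which is not included in the excerpt above.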