Segfault after upgrading Zabbix server to 1.8 (CentOS)

  • kodmis
    Junior Member
    • Nov 2009
    • 9

    #1

    Segfault after upgrading Zabbix server to 1.8 (CentOS)

    Everyone else seems to have upgraded without any effort, but my installation refuses to.
    What am I doing wrong?
    System: CentOS 5.4 x64: 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
    I built the server with the same configure options as the previous version, 1.6.5.
    I patched the MySQL DB:
    ./upgrades/dbpatches/1.8/mysql/upgrade -uzabbix zabbix

    After the server starts, the kernel message log shows entries like these:
    Code:
    Dec 15 17:41:43 o10 zabbix_server[17981]: segfault at 00002cf3a467a8ad rip 00000033cf879140 rsp 00007fff6abac1a8 error 4
    Dec 15 17:42:21 o10 zabbix_server[17977]: segfault at 0000000000000004 rip 0000000000426526 rsp 00007fff6abaae60 error 4
    Dec 15 17:47:35 o10 zabbix_server[24100]: segfault at 00002d2b039f58ad rip 00000033cf879140 rsp 00007fff67db57d8 error 4
    And, as far as I understand, the server refuses to work with the agents. Here is what the Zabbix log shows (I have disabled the proxy for now):
    Code:
     24096:20091215:174734.909 Starting zabbix_server. Zabbix 1.8 (revision 8565).
     24096:20091215:174734.909 **** Enabled features ****
     24096:20091215:174734.909 SNMP monitoring:       YES
     24096:20091215:174734.910 IPMI monitoring:        NO
     24096:20091215:174734.910 WEB monitoring:        YES
     24096:20091215:174734.910 Jabber notifications:  YES
     24096:20091215:174734.910 ODBC:                   NO
     24096:20091215:174734.910 SSH2 support:           NO
     24096:20091215:174734.910 IPv6 support:           NO
     24096:20091215:174734.910 **************************
     24100:20091215:174735.135 server #1 started [DB Cache]
     24102:20091215:174735.184 server #3 started [Poller. SNMP:YES]
     24101:20091215:174735.206 server #2 started [Poller. SNMP:YES]
     24103:20091215:174735.224 server #4 started [Poller. SNMP:YES]
     24108:20091215:174735.263 server #9 started [Trapper]
     24109:20091215:174735.263 server #10 started [Trapper]
     24110:20091215:174735.264 server #11 started [Trapper]
     24111:20091215:174735.265 server #12 started [Trapper]
     24104:20091215:174735.267 server #5 started [Poller. SNMP:YES]
     24106:20091215:174735.273 server #7 started [Poller. SNMP:YES]
     24107:20091215:174735.297 server #8 started [Poller. SNMP:YES]
     24112:20091215:174735.297 server #13 started [Trapper]
     24115:20091215:174735.298 server #14 started [ICMP pinger]
     24116:20091215:174735.298 server #15 started [ICMP pinger]
     24118:20091215:174735.298 server #16 started [Alerter]
     24120:20091215:174735.299 server #17 started [Housekeeper]
     24122:20091215:174735.299 server #18 started [Timer]
     24105:20091215:174735.311 server #6 started [Poller. SNMP:YES]
     24128:20091215:174735.340 server #21 started [Node watcher. Node ID:0]
     24129:20091215:174735.340 server #22 started [HTTP Poller]
     24133:20091215:174735.346 server #24 started [DB Syncer]
     24134:20091215:174735.346 server #25 started [Escalator]
     24126:20091215:174735.352 server #20 started [Poller for unreachable hosts. SNMP:YES]
     24123:20091215:174735.355 server #19 started [Poller for unreachable hosts. SNMP:YES]
     24096:20091215:174735.360 server #0 started [Watchdog]
     24131:20091215:174735.391 server #23 started [Discoverer. SNMP:YES]
     24109:20091215:174737.325 Unknown proxy "proxy_on_mxs1"
     24110:20091215:174739.511 Unknown proxy "proxy_on_mxs1"
     24112:20091215:174741.603 Unknown proxy "proxy_on_mxs1"
     24111:20091215:174745.483 Unknown proxy "proxy_on_mxs1"
     24096:20091215:174813.532 One child process died (PID:24112). Exiting ...
     24096:20091215:174815.534 Syncing history data...
     24096:20091215:174815.534 Syncing trends data...
     24096:20091215:174815.534 Syncing trends data...done.
    The process list looks like this:
    Code:
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    ZN     0:00 [zabbix_server] <defunct>
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    SN     0:00 /usr/local/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    Has anyone run into this problem? Can you suggest which direction to dig in?
    Last edited by kodmis; 15-12-2009, 17:20.
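
A note for reading the kernel lines quoted in post #1: on x86 the trailing "error N" is the page-fault error code, a small bit field printed in hex. The sketch below decodes it and pulls the PID and faulting address out of a line in the format shown above. The function names are made up for illustration, and it assumes a POSIX shell with sed.

```shell
# Decode the "error N" field of a kernel segfault line on x86.
# Bit 0: 0 = page not present, 1 = protection violation
# Bit 1: 0 = read access,      1 = write access
# Bit 2: 0 = kernel mode,      1 = user mode
decode_fault_error() {
    code=$(( 0x$1 ))
    if [ $(( code & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    if [ $(( code & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $(( code & 4 )) -ne 0 ]; then mode="user mode"; else mode="kernel mode"; fi
    echo "$mode $access, $cause"
}

# Print "pid address error" for a segfault line in the format quoted above.
parse_segfault_line() {
    echo "$1" | sed -n 's/.*\[\([0-9]*\)]: segfault at \([0-9a-f]*\) rip [0-9a-f]* rsp [0-9a-f]* error \([0-9a-f]*\).*/\1 \2 \3/p'
}

# Sample line copied from post #1.
line='Dec 15 17:42:21 o10 zabbix_server[17977]: segfault at 0000000000000004 rip 0000000000426526 rsp 00007fff6abaae60 error 4'
parse_segfault_line "$line" | while read -r pid addr err; do
    echo "pid=$pid addr=0x$addr: $(decode_fault_error "$err")"
done
```

For the lines in post #1, "error 4" decodes to a user-mode read of an address with no page mapped. That tells you what kind of access failed, not why it happened.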
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    Could you please increase DebugLevel and LogFileSize and send the _full_ log to [email protected]. Thank you!
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga


    • kodmis
      Junior Member
      • Nov 2009
      • 9

      #3
      Thanks for the response. I have sent the archive with the log.


      • igor
        ZABBIX Support Specialist
        • Mar 2009
        • 40

        #4
        Hi!
        Thank you for the log file.
        As we can see from your log file, the process that crashed has ID 17981:

        17977:20091215:174143.514 One child process died (PID:17981). Exiting ...

        And this process is the "DB Cache" module:

        17981:20091215:174143.179 server #1 started [DB Cache]
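
The two-step lookup above (find the PID the watchdog reports as dead, then find that PID's "started" line) can be sketched as a small awk wrapper. This is only an illustration: the function name is hypothetical, and it assumes the log keeps the 1.8 line format quoted in this thread.

```shell
# Print "PID role" for every child the watchdog reported as dead.
# Reads log files given as arguments, or stdin when there are none.
dead_process_roles() {
    awk '
        # " PID:date:time server #N started [Role]": remember the role by PID
        / server #[0-9]+ started \[/ {
            split($1, f, ":")
            role = $0
            sub(/^.* started \[/, "", role)
            sub(/\]$/, "", role)
            roles[f[1]] = role
        }
        # "One child process died (PID:NNN).": look the PID up
        /One child process died \(PID:/ {
            pid = $0
            sub(/^.*\(PID:/, "", pid)
            sub(/\).*$/, "", pid)
            print pid, (pid in roles ? roles[pid] : "unknown")
        }
    ' "$@"
}

# Three lines copied from the excerpt in post #1.
dead_process_roles <<'EOF'
 24112:20091215:174735.297 server #13 started [Trapper]
 24115:20091215:174735.298 server #14 started [ICMP pinger]
 24096:20091215:174813.532 One child process died (PID:24112). Exiting ...
EOF
# prints: 24112 Trapper
```

Note that for the run excerpted in post #1 the same lookup gives a Trapper (PID 24112), not DB Cache, so it may be worth running it over the full log.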

        I don't know why, in your case, the log contains no message about insufficient shared memory, but it seems that you should increase the amount of shared memory on your CentOS system.

        You can do this by increasing the value of the "kernel.shmmax" parameter in the /etc/sysctl.conf file.
        After this change you will need to reboot the system.
        After the reboot you can check the settings using the command "sysctl -a | grep shmmax".
        Please let us know about the results after that.
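
A minimal sketch of the check suggested above, written so it can be fed the output of "sysctl -a". The function name and the 128 MiB threshold are placeholders chosen for illustration only. The thread does not say how much shared memory the server actually requests.

```shell
# Read "sysctl -a"-style text on stdin and succeed (exit status 0) only if
# kernel.shmmax is present and at least the required number of bytes ($1).
shmmax_at_least() {
    required=$1
    current=$(awk -F' *= *' '$1 == "kernel.shmmax" { print $2 }')
    [ -n "$current" ] && [ "$current" -ge "$required" ]
}

# Hypothetical requirement of 128 MiB, checked against a sample line.
if printf 'kernel.shmmax = 33554432\n' | shmmax_at_least 134217728; then
    echo "shmmax is large enough"
else
    echo "shmmax is too small or missing"
fi
```

On a live system the input would come from "/sbin/sysctl -a 2>/dev/null". On most Linux systems "sysctl -p" re-reads /etc/sysctl.conf, so the new value can usually be applied that way as well.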


        • kodmis
          Junior Member
          • Nov 2009
          • 9

          #5
          Now I have these settings:
          Code:
          # /sbin/sysctl -a | grep shm
          vm.hugetlb_shm_group = 0
          kernel.shmmni = 4096
          kernel.shmall = 4294967296
          kernel.shmmax = 68719476736
          
          # ipcs -lm
          
          ------ Shared Memory Limits --------
          max number of segments = 4096
          max seg size (kbytes) = 67108864
          max total shared memory (kbytes) = 17179869184
          min seg size (bytes) = 1
          Do I really need to increase the amount of shared memory?
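
The two outputs above use different units, which makes them hard to compare at a glance: kernel.shmmax is in bytes and kernel.shmall is in pages, while "ipcs -lm" reports kbytes. A quick sketch of the conversion follows. The helper names are made up, and the 4096-byte page size is an assumption (check with "getconf PAGE_SIZE").

```shell
# kernel.shmmax (bytes) -> "max seg size (kbytes)"
bytes_to_kbytes() { echo $(( $1 / 1024 )); }

# kernel.shmall (pages) -> "max total shared memory (kbytes)"
# Optional second argument: page size in bytes (4096 assumed by default).
pages_to_kbytes() { echo $(( $1 * ${2:-4096} / 1024 )); }

bytes_to_kbytes 68719476736    # prints 67108864
pages_to_kbytes 4294967296     # prints 17179869184
```

Both results match the ipcs figures, so the two tools agree: the per-segment limit is 64 GiB and the total limit is 16 TiB.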


          • kodmis
            Junior Member
            • Nov 2009
            • 9

            #6
            By the way, with an empty database the server does not segfault.
            I will try patching the old DB from the backup once more.
