Strange Zabbix crashes since 1.6.3 and 1.6.4

  • vnz
    Junior Member
    • Oct 2008
    • 9

    #1

    hello,

    We've been using Zabbix 1.6.2 without major problems since it was released.

    Big picture:
    zabbix_server and frontend are on host A
    using a mysql database on host B
    every host is running a zabbix_agent (version may vary from 1.4.6 to 1.6.4)

    I upgraded to 1.6.3 and then 1.6.4 to address two problems:
    * the security hole in 1.6.2
    * the huge memory usage of the agent on windows 2008 x64 server

    Since this upgrade, zabbix_server is not working very well.

    1) DNS resolution and web tests: zabbix_server stopped resolving hostnames, and all web test triggers fired. I didn't find any log entry about it. It can't be a temporary network problem or a local resolution problem on that host: the DNS server runs on the same host as zabbix_server, and it works perfectly.

    All I could do was restart zabbix_server, which fixed the problem.

    2) Agent 1.6.4 on Windows 2008 x64 server still uses too much memory: 350 MB a week after upgrading to 1.6.4.

    3) zabbix_server random crash

    13066:20090427:182913 Item [host:agent.ping] error: Get value from agent failed: *** Cannot connect to [x.x.x.x]:10050 [No route to host]
    13051:20090427:182925 One child process died. Exiting ...
    13051:20090427:182927 ZABBIX Server stopped. ZABBIX 1.6.4.
    27397:20090427:182927 Starting zabbix_server. ZABBIX 1.6.4.
    27397:20090427:182927 **** Enabled features ****
    27397:20090427:182927 SNMP monitoring: YES
    27397:20090427:182927 WEB monitoring: YES
    27397:20090427:182927 Jabber notifications: YES
    27397:20090427:182927 ODBC: NO
    27397:20090427:182927 IPv6 support: YES
    27397:20090427:182927 **************************
    27397:20090427:182927 Listener failed with error: zbx_tcp_listen() Fatal error: unable to serve on any address. [[x.x.x.x]:10051].
    /usr/sbin/zabbix_server [9006]: Warning: ZABBIX semaphores already exist, trying to recreate.

    I didn't notice that it had crashed until this morning.
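A crash like this can go unnoticed for hours, so it may help to scan the server log for the two messages shown in the excerpt above. The following is only a minimal sketch: the function name `count_crash_markers` and the log path in the usage comment are hypothetical, and the patterns are simply the strings that appear in the log above.

```shell
#!/bin/sh
# Count lines in a zabbix_server log that contain one of the crash
# messages seen in the excerpt above.
#   $1: path to the log file
count_crash_markers() {
    # grep -c prints the number of matching lines (0 if none).
    grep -c -e 'One child process died' \
            -e 'Listener failed with error' "$1"
}

# Usage (hypothetical log path, adjust to your installation):
#   n=$(count_crash_markers /var/log/zabbix-server/zabbix_server.log)
#   [ "$n" -gt 0 ] && echo "zabbix_server crash markers found: $n"
```

Run from cron, something like this would at least give an early warning, but it does not replace an external check that the server process is alive.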

    I had to clean up the leftover semaphores and shared memory with:
    ipcs -a
    ipcrm -m zabbix_key
    ipcrm -S zabbix_key
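For reference, `ipcrm` takes an id with the lowercase flags (`-s` semaphore id, `-m` shared memory id) and a key with the uppercase ones (`-S`, `-M`); `zabbix_key` above is a placeholder for the real value shown by `ipcs`. The sketch below is one possible way to list the removal commands without running them. It assumes Linux-style `ipcs` output (key, id, owner in the first three columns) and that the server runs as a user named `zabbix`; the function name `ipc_cleanup_cmds` is made up for this example.

```shell
#!/bin/sh
# Print (do not run) ipcrm commands for IPC objects owned by one user.
# Reads Linux-style `ipcs -s` or `ipcs -m` output on stdin.
#   $1: ipcrm flag to emit: -s (semaphore id) or -m (shared memory id)
#   $2: owner name, e.g. the account zabbix_server runs as (assumed "zabbix")
ipc_cleanup_cmds() {
    flag=$1
    owner=$2
    awk -v flag="$flag" -v owner="$owner" '
        # Data rows start with a hex key; column 2 is the id, column 3 the owner.
        $1 ~ /^0x/ && $3 == owner { print "ipcrm", flag, $2 }
    '
}

# Usage: review the output first, then pipe it to sh if it looks right.
#   ipcs -s | ipc_cleanup_cmds -s zabbix
#   ipcs -m | ipc_cleanup_cmds -m zabbix
```

Printing the commands instead of executing them is deliberate: removing the wrong semaphore or segment can break other services on the host, so zabbix_server should be stopped and the list checked by eye first.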

    I've downgraded to 1.6.3 to see whether it has the same problems. I remember hitting problem 1) with 1.6.3.

    So I'm considering downgrading to 1.6.2 if 1.6.3 has the same problem. Even then, I have no solution for the Windows 2008 agents' memory usage (except stopping the agents and being blind).

    Anyone having the same problems?
    What did you do?
  • Cray
    Member
    • Mar 2009
    • 72

    #2
    I'm (fortunately) not experiencing random server crashes, but I have other problems regarding monitoring by proxies (explained in some of my threads).

    Regarding the Windows 2008 x64 memory leak: I confirm this is a major problem. I'm experiencing it too (350 MB and growing for the zabbix process).

    Strangely, I'm not experiencing those memory issues on Windows 2003 x64.

    I will eventually upgrade to Zabbix 1.8 when it comes out next week and see if it solves my problems (another possible solution is to get the latest 1.6.4 build from svn, but I don't have time to check every day whether a new build is available).


    • vnz
      Junior Member
      • Oct 2008
      • 9

      #3
      Originally posted by Cray
      Regarding the Windows 2008 x64 memory leak: I confirm this is a major problem. I'm experiencing it too (350 MB and growing for the zabbix process).

      Strangely, I'm not experiencing those memory issues on Windows 2003 x64.
      Same here. I'm using this version: http://www.suiviperf.com/zabbix/

      (http://www.zabbix.com/forum/showthre...t=12006&page=2 is the original thread about this problem)

      -----------------------------------------------------------

      As expected, I hit the Zabbix 1.6.3 DNS problem again: every web test failed, triggering all the website alerts in the middle of the night.

      zzzzZZZZzzzz *bip* *bip* *bip*

      I downgraded to version 1.6.2 (1.6.2-2 sarge/unstable in Debian's version numbering). Actions stopped working: no more mail or SMS sent.

      In the log:
      32031:20090430:102216 [Z3005] Query failed: [1062] Duplicate entry '22347' for key 1 [insert into alerts (alertid,actionid,eventid,userid,clock,mediatypeid,sendto,subject,message,status,alerttype,esc_step) values (22347,4,71987,3,1241079736,1,'[email protected]','alert title','alert body',0,0,0)]

      ... Great. I had to manually update the ids table with the next (good) alertid value (something like 25xxx) to get email sent again.

      It seems that Zabbix stopped maintaining the values in the ids table after the upgrade to 1.6.3, which left the database out of sync when downgrading.
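The manual fix described above could look roughly like the sketch below, which only prints the SQL so it can be reviewed before being fed to the database. Several things here are assumptions and not taken from the post: that the ids table has columns named `table_name`, `field_name` and `nextid`, that `nextid` should hold the highest id already in use, and that this is a single-node setup (no filter on a node column). The function name `ids_resync_sql` is made up. Check the real schema with `DESCRIBE ids;` and back up the database first.

```shell
#!/bin/sh
# Print an UPDATE statement that re-syncs one row of the ids table with
# the highest id actually present in the target table.
#   $1: table name (e.g. alerts)
#   $2: id column  (e.g. alertid)
# ASSUMPTION: column names table_name, field_name and nextid; verify them
# against your own schema before running the generated statement.
ids_resync_sql() {
    printf "UPDATE ids SET nextid = (SELECT MAX(%s) FROM %s) WHERE table_name = '%s' AND field_name = '%s';\n" \
        "$2" "$1" "$1" "$2"
}

# Usage: print the statement, read it, then run it by hand in the mysql client
# while zabbix_server is stopped.
#   ids_resync_sql alerts alertid
```

Other tables that allocate ids the same way may be out of sync after a downgrade too, so the same statement can be generated for them if similar "Duplicate entry" errors show up in the log.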

      As for the Windows agent, I have no solution other than restarting the Zabbix agent service once a day or once a week.
