Our Zabbix Server suddenly went down.

  • sjoshi0@walmartlabs.com
    Junior Member
    • Jun 2015
    • 9

    #1

    Hi All,

    We run a couple of Zabbix servers in a primary/secondary setup, and today our primary server suddenly went down.
    A PostgreSQL database runs on the same box as the primary server,
    and the secondary server connects to that same database.
    Each server also runs the Java gateway to support JMX metrics.
    I checked that the PostgreSQL server was running normally and found nothing unusual in /var/log/messages or in the dmesg output.

    -bash-4.1$ dmesg | grep -i memory
    initial memory mapped : 0 - 20000000
    init_memory_mapping: 0000000000000000-0000000075ddf000
    init_memory_mapping: 0000000100000000-000000307ffff000
    Reserving 141MB of memory at 48MB for crashkernel (System RAM: 198655MB)
    PM: Registered nosave memory: 0000000000095000 - 0000000000096000
    PM: Registered nosave memory: 0000000000096000 - 0000000000098000
    PM: Registered nosave memory: 0000000000098000 - 00000000000a0000
    PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
    PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
    PM: Registered nosave memory: 0000000075dcc000 - 0000000075dde000
    PM: Registered nosave memory: 0000000075ddf000 - 0000000090000000
    PM: Registered nosave memory: 0000000090000000 - 00000000fec00000
    PM: Registered nosave memory: 00000000fec00000 - 00000000fee10000
    PM: Registered nosave memory: 00000000fee10000 - 00000000ff800000
    PM: Registered nosave memory: 00000000ff800000 - 0000000100000000
    Memory: 198153016k/203423740k available (5336k kernel code, 2263676k absent, 3007048k reserved, 7016k data, 1292k init)
    please try 'cgroup_disable=memory' option if you don't want memory cgroups
    Initializing cgroup subsys memory
    Freeing initrd memory: 18885k freed
    Non-volatile memory driver v1.3
    crash memory driver: version 1.1
    Freeing unused kernel memory: 1292k freed
    Freeing unused kernel memory: 788k freed
    Freeing unused kernel memory: 1568k freed
    IPVS: Connection hash table configured (size=4096, memory=64Kbytes)

    Below are the last lines of zabbix_server.log. They suggest that an out-of-memory error raised at dbconfig.c brought the server down:

    40531:20151104:135839.323 fping failed: cdc-hpcblx009-14.myCompany.com_10.224.162.19 address not found
    40499:20151104:135839.625 __mem_malloc: skipped 0 asked 24 skip_min 4294967295 skip_max 0
    40499:20151104:135839.625 file:dbconfig.c,line:446 zbx_mem_realloc(): out of memory (requested 16 bytes)
    40499:20151104:135839.625 file:dbconfig.c,line:446 zbx_mem_realloc(): please increase CacheSize configuration parameter
    40494:20151104:135839.628 One child process died (PID:40499,exitcode/signal:1). Exiting ...
    40494:20151104:135841.631 syncing history data...
    40494:20151104:135841.751 syncing history data done
    40494:20151104:135841.751 syncing trends data...
    40494:20151104:135844.508 syncing trends data done
    40494:20151104:135844.508 Zabbix Server stopped. Zabbix 2.4.4 (revision 52341).

    Somebody reported a similar abrupt termination of the server earlier in https://support.zabbix.com/browse/ZBX-4415

    Has anyone seen this behavior, and if so, how did you fix it?
  • tatapoum
    Senior Member
    • Jan 2014
    • 185

    #2
    Hi,

    You need to increase the CacheSize parameter in the Zabbix server configuration file, as the log message says. The memory currently allocated to the configuration cache is not enough.
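
    To check what is currently configured before changing anything, something along these lines may help. It is only a rough sketch: the two paths are assumptions (typical package defaults, adjust them to your installation), and the helper names current_cache_size and count_cache_errors are made up for this example.

```shell
# Assumed locations; override via the environment if yours differ.
ZBX_CONF="${ZBX_CONF:-/etc/zabbix/zabbix_server.conf}"
ZBX_LOG="${ZBX_LOG:-/var/log/zabbix/zabbix_server.log}"

# Print the active (uncommented) CacheSize value from the given config file,
# or "unset" if the parameter is absent and the built-in default applies.
current_cache_size() {
    value=$(sed -n 's/^[[:space:]]*CacheSize[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1)
    if [ -n "$value" ]; then
        echo "$value"
    else
        echo "unset"
    fi
}

# Count the log lines in which the server asked for a larger CacheSize.
count_cache_errors() {
    grep -c 'please increase CacheSize configuration parameter' "$1"
}
```

    Run current_cache_size "$ZBX_CONF" and count_cache_errors "$ZBX_LOG" to see the present setting and how often the error was logged. Then set a larger value in the configuration file (for example CacheSize=64M, an arbitrary figure, not a recommendation for your load) and restart the Zabbix server so that the new size takes effect.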
