Hi
We are currently working on a large instance (close to 2k NVPS) and have reached a state where Postgres and zabbix-server run quite smoothly. We have now re-enabled our cmdb-sync script, which synchronizes the CMDB with Zabbix through the Zabbix API. It checks the presence and templates of all hosts and changes the ones that differ. This can lead to quite a lot of changes, at least if the script has not run for a while.
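To make the sync step concrete, here is a minimal sketch of the comparison the script performs. It assumes both sides have already been loaded into plain dictionaries mapping host name to a set of template names. The function name `plan_changes` and the sample hosts are illustrative only, not taken from the actual script, and no Zabbix API calls are shown.

```python
from typing import Dict, List, Set, Tuple


def plan_changes(
    cmdb: Dict[str, Set[str]],
    zabbix: Dict[str, Set[str]],
) -> Tuple[List[str], List[str]]:
    """Compare CMDB hosts with Zabbix hosts (hypothetical helper).

    Returns (hosts missing in Zabbix, hosts whose templates differ).
    Both lists are sorted so the result is deterministic.
    """
    # Hosts known to the CMDB but not present in Zabbix yet.
    to_create = sorted(host for host in cmdb if host not in zabbix)
    # Hosts present on both sides whose template sets are not the same.
    to_update = sorted(
        host for host in cmdb if host in zabbix and cmdb[host] != zabbix[host]
    )
    return to_create, to_update


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the input.
    cmdb_hosts = {
        "web01": {"Linux"},
        "db01": {"Linux", "PostgreSQL"},
        "new01": {"Linux"},
    }
    zabbix_hosts = {
        "web01": {"Linux"},
        "db01": {"Linux"},
    }
    create, update = plan_changes(cmdb_hosts, zabbix_hosts)
    print("create:", create)
    print("update:", update)
```

The longer the script has been disabled, the longer both lists get, which is why a single run can push a large number of host changes through the API at once.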
The issue is that zabbix-server crashes in a very strange way when this happens.
The last log line is "forced reloading of the configuration cache". systemctl shows the service as "active (running)" but the main process as exited (FAILURE). All child processes (e.g. pollers) are still there and have to be killed manually with kill -9. The Postgres server keeps running smoothly, and the web interface and API stay responsive.
We have already increased CacheSize to 4G and set CacheUpdateFrequency to 300. StartDBSyncers is set to 4.
Zabbix server version 4.2.3 on CentOS 7, Postgres 11 with partitioned history and trends tables.
Code:
Jun 25 15:58:59 v1tzabbix.net.be.ch systemd[1]: zabbix-server.service: main process exited, code=killed, status=9/KILL
Jun 25 15:58:59 v1tzabbix.net.be.ch systemd[1]: Stopped Zabbix Server.
Jun 25 15:58:59 v1tzabbix.net.be.ch systemd[1]: Unit zabbix-server.service entered failed state.
Jun 25 15:58:59 v1tzabbix.net.be.ch systemd[1]: zabbix-server.service failed.
Jun 25 15:58:59 v1tzabbix.net.be.ch systemd[1]: Starting Zabbix Server...
Jun 25 15:58:59 v1tzabbix.net.be.ch systemd[1]: zabbix-server.service: Supervising process 1139 which is not our child. We'll most likely not notice when it exits.
Jun 25 15:58:59 v1tzabbix.net.be.ch systemd[1]: Started Zabbix Server.
Jun 25 15:59:37 v1tzabbix.net.be.ch systemd[1]: zabbix-server.service: main process exited, code=killed, status=9/KILL
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: Usage:
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: kill [options] <pid|name> [...]
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: Options:
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: -a, --all do not restrict the name-to-pid conversion to processes
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: with the same uid as the present process
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: -s, --signal <sig> send specified signal
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: -q, --queue <sig> use sigqueue(2) rather than kill(2)
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: -p, --pid print pids without signaling them
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: -l, --list [=<signal>] list signal names, or convert one to a name
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: -L, --table list signal names and numbers
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: -h, --help display this help and exit
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: -V, --version output version information and exit
Jun 25 15:59:38 v1tzabbix.net.be.ch kill[1164]: For more details see kill(1).
Jun 25 15:59:38 v1tzabbix.net.be.ch systemd[1]: zabbix-server.service: control process exited, code=exited status=1
Jun 25 15:59:38 v1tzabbix.net.be.ch systemd[1]: Unit zabbix-server.service entered failed state.
Jun 25 15:59:38 v1tzabbix.net.be.ch systemd[1]: zabbix-server.service failed.
Jun 25 15:59:44 v1tzabbix.net.be.ch systemd[1]: Stopped Zabbix Server.
Jun 25 15:59:44 v1tzabbix.net.be.ch systemd[1]: Starting Zabbix Server...
Jun 25 15:59:44 v1tzabbix.net.be.ch systemd[1]: Started Zabbix Server.