Zabbix Server fails to shutdown when MySQL is entirely dead

  • brian.smith
    Junior Member
    • Apr 2021
    • 2

    #1

    Configuration: a 2-node cluster built following the documentation from the Zabbix Blog, with some extensions to the cluster services.

    To keep things simple: I have a script that monitors whether MySQL is UP or DOWN. If it is DOWN, Pacemaker initiates a failover and attempts to stop the Zabbix Server (systemctl stop zabbix-server).
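    For anyone wanting to reproduce the setup, an UP/DOWN probe of this kind could look roughly like the sketch below. This is not the script from this post: the function name mysql_is_up, the MYSQLADMIN variable, and the 5-second default timeout are assumptions made for illustration. It relies on mysqladmin ping, which exits 0 when the server answers (even if access is denied) and non-zero when it cannot connect.

```shell
#!/bin/sh
# Sketch of an UP/DOWN probe; not the original poster's script.
# MYSQLADMIN lets you point at a different binary (hypothetical knob).
MYSQLADMIN="${MYSQLADMIN:-mysqladmin}"

# mysql_is_up HOST [TIMEOUT_SECONDS]
# Returns 0 if the server answers a ping, non-zero otherwise.
mysql_is_up() {
    host=$1
    timeout=${2:-5}
    "$MYSQLADMIN" --host="$host" --connect-timeout="$timeout" ping >/dev/null 2>&1
}

# Example call (host taken from the log in this post):
#   if mysql_is_up 198.6.1.2; then echo UP; else echo DOWN; fi
```

    The exit status is what a cluster-side check would act on; the UP/DOWN text is only for a human reading the output.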

    At that point, Zabbix Server is still trying to reach MySQL:

    2348716:20210419:090712.740 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to MySQL server on '198.6.1.2' (115)
    2348716:20210419:090712.740 database is down: reconnecting in 10 seconds

    Which makes sense. EVERYTHING that I have read suggests using the default systemd unit from Zabbix, which I am, and there are provisions in it to ensure systemd starts and stops everything in order.

    However, in the event that the MySQL daemon dies, Zabbix Server should still be able to shut down gracefully.

    Of course there are workarounds, e.g., running MySQL in its own cluster independent of Zabbix. But even if I only needed to take the box down for maintenance while MySQL is dead, I would still have to forcefully kill the processes (killall -9 zabbix_server) to avoid waiting.

    Surely there is a way for Zabbix to stop when it receives SIGTERM even if its database backend is down? Any suggestions?
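    In the meantime, the manual "SIGTERM first, kill -9 only if needed" routine can be wrapped in a small helper. The sketch below is a generic shell pattern, not anything shipped with Zabbix; stop_with_deadline and its 20-second default are hypothetical names and values chosen for illustration.

```shell
#!/bin/sh
# Sketch: ask a process to stop, escalate to SIGKILL after a deadline.
# stop_with_deadline is a hypothetical helper, not part of Zabbix.
#
# stop_with_deadline PID [SECONDS]
# Returns 0 if the process is gone without force, 1 if SIGKILL was sent.
stop_with_deadline() {
    pid=$1
    deadline=${2:-20}                 # seconds to wait before escalating

    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    waited=0
    while kill -0 "$pid" 2>/dev/null; do   # kill -0 only tests existence
        if [ "$waited" -ge "$deadline" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 1                  # had to force it
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0                          # exited on its own
}

# Example call (PID lookup is up to you):
#   stop_with_deadline "$zabbix_pid" 20
```

    The return code tells you whether the shutdown was graceful, which is handy if you want to log how often the forced path is taken.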
  • brian.smith
    Junior Member
    • Apr 2021
    • 2

    #2
    Pacemaker Workaround

    pcs resource config ZabbixServer

    Resource: ZabbixServer (class=systemd type=zabbix-server)
    Operations: start interval=0s timeout=100 (ZabbixServer-start-interval-0s)
    stop interval=0s on-fail=ignore timeout=20s (ZabbixServer-stop-interval-0s)
    monitor interval=10s on-fail=standby (ZabbixServer-monitor-interval-10s)


    I decreased the stop timeout to 20s. The default was 100; I guessed minutes since it had no 's', but Pacemaker treats a bare number as seconds. My system isn't very busy, so 20s works for me. Note that I also changed "on-fail" to ignore, because I expect the stop to fail.
    On the "monitor" operation, I also told it to put the node into standby in the event of failure.

    The actual commands used:
    pcs resource update ZabbixServer op stop interval=0s timeout=20s on-fail=ignore
    pcs resource update ZabbixServer op monitor interval=10s on-fail=standby


    What exactly does this resolve?
    In the event MySQL is dead and you need to initiate a "pcs node standby" to fail over to the other host, a failed stop of the "zabbix-server" daemon will no longer cause the failover to fail.

    However, we should be able to stop the daemon gracefully!
    Last edited by brian.smith; 20-04-2021, 17:46.
