Deadlock between mariadb.service and zabbix-server.service

  • jros
    Junior Member
    • Mar 2023
    • 16

    #1

    Deadlock between mariadb.service and zabbix-server.service

    Hi!

    After several days of problems with Zabbix and the server that hosts it, I have finally worked out what is happening.

    A week ago I arrived at the office in the morning and found a warning saying "Database error: No such file or directory [Retry]"

    [Screenshot: chrome_80ToTwhMkS.png]

    On the server I could see that the "mariadb.service" service was stopped (using "service mariadb status").
    I tried to start the service ("service mariadb start" or "systemctl start mariadb.service"), but the command hung. I interrupted it with Ctrl+C and rebooted the system with the "reboot" command.
    The reboot hung too, while trying to stop the zabbix-server service:
    [Screenshot: KHpvwVYF00.png]

    I had to force a restart of the machine via VMware.

    After some investigation I can see that the problem always occurs after the system runs its daily update, that is, after "apt upgrade" is executed, so I decided to run "apt upgrade" interactively.
    When I ran "apt upgrade", several packages were installed, but the process hung when it tried to restart the services:
    [Screenshot: eTfzDstYiT.png]

    In another session I looked at zabbix_server.log and saw this:
    [Screenshot: MkkrRpFhbA.png]

    It seems that when the zabbix-server service stops, it needs to access the database and does not finish stopping until it can, but MariaDB is already stopped. As a result, the service restart never completes.

    After rebooting, I restarted the services to try to reproduce the problem:
    [Screenshot: ahNx3kSXuc.png]
    Sure enough, the process hangs: MariaDB is stopped and zabbix-server keeps running, waiting for the DB, and the UI then shows the "Database error" message.

    Any help?

    Thanks in advance.

    P.S.:
    Linux version:

    Distributor ID: Ubuntu
    Description: Ubuntu Noble Numbat (development branch)
    Release: 24.04
    Codename: noble

    Zabbix version 6.4.13

    MariaDB: mariadb-server-10.6/now 1:10.6.16+maria~ubu2204 amd64 [installed, local]
    Last edited by jros; 19-04-2024, 11:27.
  • toj
    Junior Member
    • Oct 2023
    • 8

    #2
    What do "journalctl -u mariadb.service" and the MariaDB error logs tell you?

    • jros
      Junior Member
      • Mar 2023
      • 16

      #3
      Nothing special (I think) when the services are restarted:
      Code:
      abr 22 07:27:47 SZabbix systemd[1]: Stopping mariadb.service - MariaDB 10.6.16 database server...
      abr 22 07:27:47 SZabbix mariadbd[1125]: 2024-04-22 7:27:47 0 [Note] /usr/sbin/mariadbd (initiated by: unknown): Normal shutdown
      abr 22 07:27:47 SZabbix mariadbd[1125]: 2024-04-22 7:27:47 0 [Note] InnoDB: FTS optimize thread exiting.
      abr 22 07:27:47 SZabbix mariadbd[1125]: 2024-04-22 7:27:47 0 [Note] InnoDB: Starting shutdown...
      abr 22 07:27:47 SZabbix mariadbd[1125]: 2024-04-22 7:27:47 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
      abr 22 07:27:47 SZabbix mariadbd[1125]: 2024-04-22 7:27:47 0 [Note] InnoDB: Restricted to 2028 pages due to innodb_buf_pool_dump_pct=25
      abr 22 07:27:47 SZabbix mariadbd[1125]: 2024-04-22 7:27:47 0 [Note] InnoDB: Buffer pool(s) dump completed at 240422 7:27:47
      abr 22 07:27:48 SZabbix mariadbd[1125]: 2024-04-22 7:27:48 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
      abr 22 07:27:48 SZabbix mariadbd[1125]: 2024-04-22 7:27:48 0 [Note] InnoDB: Shutdown completed; log sequence number 549926178946; transaction id 425890260
      abr 22 07:27:48 SZabbix mariadbd[1125]: 2024-04-22 7:27:48 0 [Note] /usr/sbin/mariadbd: Shutdown complete
      abr 22 07:27:48 SZabbix systemd[1]: mariadb.service: Deactivated successfully.
      abr 22 07:27:48 SZabbix systemd[1]: Stopped mariadb.service - MariaDB 10.6.16 database server.
      abr 22 07:27:48 SZabbix systemd[1]: mariadb.service: Consumed 1h 38min 26.601s CPU time.

      • toj
        Junior Member
        • Oct 2023
        • 8

        #4
        Restart? If this is the full log, then MariaDB is not running.

        What does "systemctl status mariadb.service" say?

        • zimma
          Junior Member
          • Aug 2024
          • 3

          #5
          I just ran into this too.

          Code:
          Restarting services...
           /etc/needrestart/restart.d/systemd-manager
           systemctl restart mysql.service nginx.service opensmtpd.service packagekit.service php8.3-development-fpm.service php8.3-fpm.service php8.3-production-fpm.service rabbitmq-server.service redis-server.service ssh.service systemd-journald.service systemd-networkd.service systemd-resolved.service systemd-udevd.service zabbix-agent2.service zabbix-server.service
          It looks like systemd issued a stop command to mysql, then a stop command to zabbix_server, which hung while shutting down because mysql was gone.
          systemd never issued the start command for mysql because it was waiting for zabbix_server to shut down.

          I wonder whether having both After= and Requires= in the systemd unit definition would prevent this.

          • zimma
            Junior Member
            • Aug 2024
            • 3

            #6
            I have just recreated this.

            Code:
            sudo systemctl restart mysql.service nginx.service opensmtpd.service packagekit.service php8.3-development-fpm.service php8.3-fpm.service php8.3-production-fpm.service rabbitmq-server.service redis-server.service ssh.service systemd-journald.service systemd-networkd.service systemd-resolved.service systemd-udevd.service zabbix-agent2.service zabbix-server.service
            Proof that the mysql start job is waiting for zabbix-server to shut down:

            Code:
            ubuntu@megumi:~$ sudo systemctl list-jobs
            JOB    UNIT                  TYPE    STATE
            217645 zabbix-server.service restart running
            216322 mysql.service         start   waiting
            
            2 jobs listed.
            ubuntu@megumi:~$ sudo systemctl cancel 217645
            ubuntu@megumi:~$ sudo systemctl list-jobs
            No jobs running.
            ubuntu@megumi:~$ ps -ax | grep mysql
            783412 ? Ssl 0:01 /usr/sbin/mysqld
            783466 pts/2 S+ 0:00 grep --color=auto mysql
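
            A minimal sketch for spotting this state from a script, assuming the `systemctl list-jobs` column layout shown above (JOB, UNIT, TYPE, STATE). The helper name `waiting_jobs` is made up for illustration; it reads the job list on stdin and prints the units whose job is still in the waiting state:

            ```shell
            # Hypothetical helper: print units whose queued job is still "waiting".
            # Reads `systemctl list-jobs` output on stdin, so it can be tried on saved output.
            waiting_jobs() {
                # Job rows start with a numeric job ID; the 4th column is the job state.
                awk '$1 ~ /^[0-9]+$/ && $4 == "waiting" { print $2 }'
            }

            # Usage on a live system (assumes systemd):
            #   systemctl list-jobs --no-pager | waiting_jobs
            ```

            Fed the job list from this post, it should print only mysql.service. The header and the "2 jobs listed." footer are skipped because they do not have a numeric job ID together with a state column.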

            • zimma
              Junior Member
              • Aug 2024
              • 3

              #7
              Adding a systemd override that declares a Requires= dependency on mysql.service fixes this for me.

              Code:
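              # Create the drop-in directory first; tee fails if it does not exist
              sudo mkdir -p /etc/systemd/system/zabbix-server.service.d
              # Add Requires=mysql.service on top of the packaged unit file
              sudo tee /etc/systemd/system/zabbix-server.service.d/override.conf > /dev/null <<'EOF'
              [Unit]
              Requires=mysql.service
              EOF
              # Make systemd re-read the unit files
              sudo systemctl daemon-reload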
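
              To check that the override was picked up, something like the following might help. It is only a sketch: the helper name `requires_unit` is hypothetical, and it assumes `systemctl show -p Requires` prints a single `Requires=` line with space-separated unit names.

              ```shell
              # Hypothetical helper: succeed if the Requires= line on stdin lists the given unit.
              requires_unit() {
                  # Split "Requires=a.service b.service" into one name per line,
                  # then look for an exact, literal match of the unit name.
                  grep '^Requires=' | tr ' =' '\n\n' | grep -qxF "$1"
              }

              # Usage on a live system (assumes systemd):
              #   systemctl show -p Requires zabbix-server.service | requires_unit mysql.service \
              #       && echo "Requires=mysql.service is active"
              ```

              `systemctl cat zabbix-server.service` should also show the drop-in file below the packaged unit.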

              • z0nk
                Member
                • Oct 2024
                • 45

                #8
                Same problem on 24.04 LTS. It looks like this is not fixed. I did a clean install of Ubuntu with Zabbix yesterday, and it died overnight. This is the first time in my career that mysql/mariadb has died on me...

                • z0nk
                  Member
                  • Oct 2024
                  • 45

                  #9
                  Code:
                  # systemctl list-jobs
                  JOB  UNIT                      TYPE    STATE
                  7682 zabbix-server.service     restart running
                  6731 mysql.service             start   waiting
                  6150 apt-daily-upgrade.service start   running
                  
                  3 jobs listed.
                  # systemctl cancel 7682
                  # systemctl list-jobs
                  JOB  UNIT                      TYPE  STATE
                  6150 apt-daily-upgrade.service start running
                  
                  1 jobs listed.
                  Why has there been no fix for over half a year, when Zabbix 7 is an LTS release?

                  • Clontarf[X]
                    Member
                    • Jan 2017
                    • 80

                    #10
                    Big +1 for this, makes patching boxes a nightmare and super risky.

                    • pfoo
                      Junior Member
                      • Jun 2016
                      • 1

                      #11
                      I probably just experienced the same deadlock. Is there an active bug report to track this one?

                      • abhi92
                        Junior Member
                        • May 2025
                        • 1

                        #12
                        Hello guys, I am facing exactly the same problem. Is there a fix or any other solution?
                        My workaround, which has worked, is to disable apt-daily.timer, apt-daily.service, apt-daily-upgrade.timer and apt-daily-upgrade.service.
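
                        A rough sketch of that workaround as a shell helper. The function name `disable_apt_timers` is made up for illustration, and it assumes that on a stock Ubuntu install the two apt-daily services are started only by their timers, so that disabling the timers is enough. Keep in mind that this also turns off automatic package updates.

                        ```shell
                        # Hypothetical helper: disable both apt timers.
                        # With no argument it only prints the command (dry run);
                        # pass "sudo" as the first argument to actually run it.
                        disable_apt_timers() {
                            runner=${1:-echo}
                            "$runner" systemctl disable --now apt-daily.timer apt-daily-upgrade.timer
                        }

                        disable_apt_timers          # dry run: prints the systemctl command
                        # disable_apt_timers sudo   # really stop and disable the timers
                        ```

                        Afterwards, `systemctl list-timers` can be used to confirm that the apt timers are no longer scheduled.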

                        • danielec
                          Junior Member
                          • Oct 2025
                          • 3

                          #13
                          I feel like this is still an issue in 7 LTS.
                          It was reported as a bug at https://support.zabbix.com/browse/ZBX-23456 but it was dismissed. I'm not familiar with Zabbix bug tracking, but this seems like a bug to me.

                          The proper way to solve it should be what zimma reported above, or:
                          Code:
                          # systemctl add-requires zabbix-server.service mysql.service
