Zabbix server won't stop (sometimes), 7.0.18, Ubuntu 24.04.3, psql 17.2

  • Linwood
    Senior Member
    • Dec 2013
    • 398

    #1

    I have zabbix running on a number of clients and am embarrassed to say it's my own home network where I have a problem I can't find.


    When doing a reboot of ubuntu, it will hang forever. Sometimes. Not always.

    I cannot find any resource issues - plenty of disk and memory and CPU. When it comes back up it runs fine.

    This last time I did a "systemctl stop zabbix-server" and then quite literally stopped (kill -9) every pid that had "zabbix" in the name and then did the reboot and it hung as shown below.

    What I'd like to do is change that unlimited to something like 3 minutes or so. Or figure out why it is hanging, but so far I find nothing -- zabbix was running fine and responding fine, and polling when I did the shutdown.

    I'd love to find the root cause, but in the meantime anyone know how to persistently change the timeout on that stop line from unlimited?

    I can change /lib/systemd/system/zabbix-server.service but it's going to get overwritten (or it should) by updates, right?

    Should it REALLY be infinity?
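
One persistent way to change the stop timeout, without touching the packaged file under /lib/systemd/system, is a systemd drop-in under /etc/systemd/system, which package updates leave alone. A minimal sketch follows; the function name, the override.conf file name and the 3min value are illustrative assumptions, and TimeoutStopSec= is used so that only the stop timeout changes:

```shell
#!/bin/sh
# Sketch: write a drop-in that overrides only the stop timeout of a unit.
# The function name, the "override.conf" file name and the 3min value are
# illustrative choices, not something taken from the Zabbix packages.

write_stop_timeout_dropin() {
    unit="$1"                           # e.g. zabbix-server.service
    timeout="$2"                        # e.g. 3min
    base="${3:-/etc/systemd/system}"    # drop-in root; pass another dir for a dry run
    dir="$base/$unit.d"

    mkdir -p "$dir" || return 1
    # Drop-ins are read after the packaged unit file, so this assignment
    # takes precedence over the TimeoutSec= line shipped by the package.
    printf '[Service]\nTimeoutStopSec=%s\n' "$timeout" > "$dir/override.conf"
}

# Typical use as root, followed by a reload so systemd picks it up:
#   write_stop_timeout_dropin zabbix-server.service 3min
#   systemctl daemon-reload
```

Running `systemctl edit zabbix-server.service` creates the same kind of drop-in interactively.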


    [Attached image: stop.jpg]
  • troffasky
    Senior Member
    • Jul 2008
    • 567

    #2
    Yes I have also seen this.
    7.0.x, proxy and server, Ubuntu 22.04, MariaDB.
    Probably happened 2-3 times in total over past year. Only solution is to hard-reset the VM. It doesn't just happen on shutdown, also when installing updates.
    Last edited by troffasky; 15-09-2025, 09:53.

    • Linwood
      Senior Member
      • Dec 2013
      • 398

      #3
      So it doesn't sound specific to PostgreSQL then. Are any Zabbix team members following? Is there any reason the distro service file uses infinity?

      • cyber
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Dec 2006
        • 4807

        #4
        Have you looked at Zabbix's own log from that time? I have seen cases where the history sync during shutdown takes ages. You can literally see it advancing by 0.xx% per second, because it writes the progress to the log. It can take 30-40 minutes, or even more.
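
A quick way to check for this is to pull the sync messages out of the Zabbix log after a slow stop. A small sketch, assuming the message contains the text "syncing history data" (as in the proxy log quoted elsewhere in this thread) and that the log lives at /var/log/zabbix/zabbix_proxy.log; the function name is illustrative:

```shell
#!/bin/sh
# Sketch: print the most recent history-sync messages from a Zabbix log.
# The default path is the proxy log quoted elsewhere in this thread; for a
# server, pass the server log path as the first argument.

show_history_sync() {
    logfile="${1:-/var/log/zabbix/zabbix_proxy.log}"
    count="${2:-20}"
    # Match only the fixed part of the message; whatever follows it
    # (a percentage or "in progress...") is printed unchanged.
    grep -F 'syncing history data' "$logfile" | tail -n "$count"
}

# Typical use:
#   show_history_sync /var/log/zabbix/zabbix_proxy.log 50
```

If the last matching line is hours old and nothing follows it, the process may be stuck rather than slowly syncing.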

        • Aer0
          Junior Member
          • Sep 2024
          • 22

          #5
          I have the same problem on my test system, which is under quite high load. I'm not sure about the cause (I have not looked at the log file), but I'm thinking of uncommenting and setting "#DefaultTimeoutStopSec=" to force a stop.

          • Linwood
            Senior Member
            • Dec 2013
            • 398

            #6
            Originally posted by cyber
            Have you looked at Zabbix's own log from that time? I have seen cases where the history sync during shutdown takes ages. You can literally see it advancing by 0.xx% per second, because it writes the progress to the log. It can take 30-40 minutes, or even more.
            Oddly enough it looks to me like it DID stop (2nd screen shot), but look (1st screen shot) what I found in the syslog.

            I have no idea what that means. Let me do a couple of reboots.

            [Attached images: syslog.jpg, stopped.jpg]

            • Linwood
              Senior Member
              • Dec 2013
              • 398

              #7
              I just did a reboot, and there is NOTHING in the syslog showing zabbix server stopped.

              However, with Zabbix running I did a systemctl stop zabbix-server, and that "invalid argument" message is in the result (and it did stop). So I think something is wrong there, perhaps something in the distro's service file that is incompatible with this systemd version, but at least by itself it does not keep the server from stopping.

              But sometimes it won't stop. Those of you with it also failing, hunt around next time, see if you can find clues. It sounds like it's infrequent but fairly widespread given the responses (and how little traffic you get here normally).

              • cyber
                Senior Member
                Zabbix Certified Specialist, Zabbix Certified Professional
                • Dec 2006
                • 4807

                #8
                Some syntax error in service file? Bad encoding?
                Code:
                systemd-analyze verify yourname.service


                • Linwood commented
                  Learned a new command, thank you. No errors shown. If you do a systemctl stop zabbix-server and check syslog, is there an error there?

                  To my knowledge this is the file from the distro. At this instant I've changed it from infinity to 5m, but the original error appeared before I changed it. I would expect it to give the error on any system at similar versions.
              • troffasky
                Senior Member
                • Jul 2008
                • 567

                #9
                Bitten by this again today. Tried to shut down VM, stuck waiting per @Linwood's first screenshot.

                In the journal I can see lots of services being stopped, then nothing for 30 minutes, then it times out.

                Code:
                Nov 17 15:05:44 proxy systemd[1]: Stopped target Network is Online.
                Nov 17 15:05:44 proxy systemd[1]: Stopped target Host and Network Name Lookups.
                Nov 17 15:05:44 proxy systemd[1]: NetworkManager-wait-online.service: Deactivated successfully.
                Nov 17 15:05:44 proxy systemd[1]: Stopped Network Manager Wait Online.
                Nov 17 15:35:11 proxy systemd[1]: reboot.target: Job reboot.target/start timed out.
                Nov 17 15:35:11 proxy systemd[1]: Timed out starting System Reboot.
                Nov 17 15:35:11 proxy systemd[1]: reboot.target: Job reboot.target/start failed with result 'timeout'.
                Nov 17 15:35:11 proxy systemd[1]: Forcibly rebooting: job timed out
                Nov 17 15:35:11 proxy systemd[1]: Shutting down.

                It's weird, but you would not know that zabbix-proxy is the culprit here as it doesn't seem to log anything about it in the journal!

                Service file has

                TimeoutSec=infinity

                configured in it. Changed to 300.

                There are references to this issue in this ancient bug:

                • troffasky
                  Senior Member
                  • Jul 2008
                  • 567

                  #10
                  Upgraded a proxy from Ubuntu 20.04 to 22.04 to 24.04 today. Hit by this at every shutdown! On the 22.04 shutdown step, even though TimeoutSec=infinity was in the service file, the shutdown message said it would only wait 5 minutes.
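
To see which stop timeout systemd has actually loaded for the unit, rather than what the unit file on disk says, the value can be queried from systemd itself. A small sketch, assuming the unit is named zabbix-proxy.service as in this thread; the helper name is hypothetical:

```shell
#!/bin/sh
# Sketch: extract the stop timeout from "systemctl show" output.
# Reads "Key=Value" lines on stdin and prints the value of TimeoutStopUSec.

stop_timeout_from_show() {
    sed -n 's/^TimeoutStopUSec=//p'
}

# Typical use:
#   systemctl show -p TimeoutStopUSec zabbix-proxy.service | stop_timeout_from_show
#   systemctl cat zabbix-proxy.service    # prints the unit file plus any drop-ins
```

If the printed value differs from the unit file, a drop-in or a pending daemon-reload is a likely place to look.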

                  • 0711it
                    Junior Member
                    • Jan 2026
                    • 1

                    #11
                    I faced this problem on multiple proxies that we manage, so I decided to dig into it a little deeper. The problem can occur after automatic updates or on reboot.

                    The proxy log looks like this when the problem occurs (08:37 is about the time I hard-reset the VM):

                    Code:
                    From /var/log/zabbix/zabbix_proxy.log

                    2698381:20260113:065055.634 syncing history data in progress...
                    2698378:20260113:065055.770 thread stopped [discovery worker #12]
                    2698593:20260113:065055.875 thread stopped
                    2698378:20260113:065055.980 thread stopped [discovery worker #1]
                    2698378:20260113:065056.083 thread stopped [discovery worker #2]
                    2698378:20260113:065056.187 thread stopped [discovery worker #3]
                    2698378:20260113:065056.290 thread stopped [discovery worker #5]
                    2698378:20260113:065056.393 thread stopped [discovery worker #6]
                    2698378:20260113:065056.497 thread stopped [discovery worker #7]
                    2698378:20260113:065056.599 thread stopped [discovery worker #8]
                    2698378:20260113:065056.703 thread stopped [discovery worker #9]
                    2698378:20260113:065056.806 thread stopped [discovery worker #10]
                    2698378:20260113:065056.911 thread stopped [discovery worker #11]
                    2698378:20260113:065057.017 thread stopped [discovery worker #13]
                    2698378:20260113:065057.121 thread stopped [discovery worker #14]
                    2698378:20260113:065057.225 thread stopped [discovery worker #15]
                    2698378:20260113:065057.330 thread stopped [discovery worker #4]
                    2698369:20260113:065057.436 [1] thread stopped [preprocessing worker #1]
                    2698369:20260113:065057.541 [6] thread stopped [preprocessing worker #6]
                    2698369:20260113:065057.644 [9] thread stopped [preprocessing worker #9]
                    2698369:20260113:065057.749 [14] thread stopped [preprocessing worker #14]
                    2698369:20260113:065057.862 [2] thread stopped [preprocessing worker #2]
                    2698369:20260113:065057.966 [7] thread stopped [preprocessing worker #7]
                    2698369:20260113:065058.070 [4] thread stopped [preprocessing worker #4]
                    2698369:20260113:065058.174 [5] thread stopped [preprocessing worker #5]
                    2698369:20260113:065058.278 [11] thread stopped [preprocessing worker #11]
                    2698369:20260113:065058.381 [15] thread stopped [preprocessing worker #15]
                    2698369:20260113:065058.484 [3] thread stopped [preprocessing worker #3]
                    2698369:20260113:065058.588 [13] thread stopped [preprocessing worker #13]
                    2698369:20260113:065058.691 [16] thread stopped [preprocessing worker #16]
                    2698369:20260113:065058.794 [8] thread stopped [preprocessing worker #8]
                    2698369:20260113:065058.896 [10] thread stopped [preprocessing worker #10]
                    2698369:20260113:065059.004 [12] thread stopped [preprocessing worker #12]
                    1114:20260113:083724.349 Starting Zabbix Proxy (active) [removed]. Zabbix 7.0.20 (revision e9302d4d6fc).
                    1114:20260113:083724.351 **** Enabled features ****
                    1114:20260113:083724.351 SNMP monitoring: YES
                    1114:20260113:083724.351 IPMI monitoring: YES
                    1114:20260113:083724.351 Web monitoring: YES

                    /var/log/syslog looks like this:

                    Code:
                    2026-01-13T06:50:53.508043+01:00 et02 apt.systemd.daily[185789]: pid = os.fork()
                    2026-01-13T06:50:55.197264+01:00 et02 systemd[1]: Stopping packagekit.service - PackageKit Daemon...
                    2026-01-13T06:50:55.205511+01:00 et02 systemd[1]: Stopping zabbix-agent.service - Zabbix Agent...
                    2026-01-13T06:50:55.205985+01:00 et02 systemd[1]: packagekit.service: Deactivated successfully.
                    2026-01-13T06:50:55.206266+01:00 et02 systemd[1]: Stopped packagekit.service - PackageKit Daemon.
                    2026-01-13T06:50:55.211490+01:00 et02 systemd[1]: Starting packagekit.service - PackageKit Daemon...
                    2026-01-13T06:50:55.213322+01:00 et02 systemd[1]: Stopping zabbix-proxy.service - Zabbix Proxy...
                    2026-01-13T06:50:55.219648+01:00 et02 systemd[1]: zabbix-agent.service: Deactivated successfully.
                    2026-01-13T06:50:55.219908+01:00 et02 systemd[1]: Stopped zabbix-agent.service - Zabbix Agent.
                    2026-01-13T06:50:55.219976+01:00 et02 systemd[1]: zabbix-agent.service: Consumed 1h 20min 49.139s CPU time, 17.4M memory peak, 0B memory swap peak.
                    2026-01-13T06:50:55.230305+01:00 et02 systemd[1]: Starting zabbix-agent.service - Zabbix Agent...
                    2026-01-13T06:50:55.235893+01:00 et02 PackageKit: daemon start
                    2026-01-13T06:50:55.255928+01:00 et02 systemd[1]: Started packagekit.service - PackageKit Daemon.
                    2026-01-13T06:50:55.412901+01:00 et02 zabbix_agentd[186716]: /usr/sbin/zabbix_agentd: /usr/local/lib/libcurl.so.4: no version information available (required by /usr/sbin/zabbix_agentd)
                    2026-01-13T06:50:55.529141+01:00 et02 systemd[1]: Started zabbix-agent.service - Zabbix Agent.
                    2026-01-13T06:50:59.019774+01:00 et02 mariadbd[918]: 2026-01-13 6:50:59 1161 [Warning] Aborted connection 1161 to db: 'zabbix_proxy' user: 'zabbix' host: 'localhost' (Got an error reading communication packets)
                    2026-01-13T06:50:59.020057+01:00 et02 mariadbd[918]: 2026-01-13 6:50:59 1150 [Warning] Aborted connection 1150 to db: 'zabbix_proxy' user: 'zabbix' host: 'localhost' (Got an error reading communication packets)
                    2026-01-13T06:50:59.020087+01:00 et02 mariadbd[918]: 2026-01-13 6:50:59 1166 [Warning] Aborted connection 1166 to db: 'zabbix_proxy' user: 'zabbix' host: 'localhost' (Got an error reading communication packets)
                    2026-01-13T06:50:59.023119+01:00 et02 mariadbd[918]: 2026-01-13 6:50:59 1172 [Warning] Aborted connection 1172 to db: 'zabbix_proxy' user: 'zabbix' host: 'localhost' (Got an error reading communication packets)
                    2026-01-13T06:50:59.029586+01:00 et02 mariadbd[918]: 2026-01-13 6:50:59 1164 [Warning] Aborted connection 1164 to db: 'zabbix_proxy' user: 'zabbix' host: 'localhost' (Got an error reading communication packets)
                    2026-01-13T06:50:59.056195+01:00 et02 mariadbd[918]: 2026-01-13 6:50:59 1163 [Warning] Aborted connection 1163 to db: 'zabbix_proxy' user: 'zabbix' host: 'localhost' (Got an error reading communication packets)
                    2026-01-13T06:50:59.076936+01:00 et02 mariadbd[918]: 2026-01-13 6:50:59 1165 [Warning] Aborted connection 1165 to db: 'zabbix_proxy' user: 'zabbix' host: 'localhost' (Got an error reading communication packets)
                    2026-01-13T06:50:59.095404+01:00 et02 mariadbd[918]: 2026-01-13 6:50:59 1162 [Warning] Aborted connection 1162 to db: 'zabbix_proxy' user: 'zabbix' host: 'localhost' (Got an error reading communication packets)

                    After some testing I found that the issue can be reproduced by running systemctl daemon-reload, so this has to be some dependency/timing issue.

                    I made the following changes, which at least solve it when running systemctl daemon-reload. I can't say whether this will fix the actual problem, as it occurs only sporadically.


                    Code:
                    # 1. Edit the unit file and add this line to the [Unit] section:
                    nano /usr/lib/systemd/system/zabbix-proxy.service
                    Requires=mariadb.service

                    # 2. In the same file, change this from infinity (in the [Service] section):
                    TimeoutSec=300

                    # 3. Reload systemd and restart the proxy:
                    systemctl daemon-reload
                    systemctl restart zabbix-proxy
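
The same two changes could also go into a drop-in under /etc/systemd/system rather than into the packaged file under /usr/lib/systemd/system, which a package update may replace. A sketch, assuming the database runs on the same host as mariadb.service; the function name and the override.conf file name are illustrative, and the After= line is an addition that is not in the post above, included because Requires= on its own does not order the two units:

```shell
#!/bin/sh
# Sketch: write the dependency and timeout changes as a drop-in.
# Function name and "override.conf" are illustrative; After= is an
# assumption added here, not part of the original post.

write_proxy_dropin() {
    base="${1:-/etc/systemd/system}"    # drop-in root; pass another dir for a dry run
    dir="$base/zabbix-proxy.service.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<'EOF'
[Unit]
# Requires= alone adds no ordering; After= makes the proxy start after
# MariaDB and, because stop order is reversed, stop before it.
Requires=mariadb.service
After=mariadb.service

[Service]
TimeoutSec=300
EOF
}

# Typical use as root:
#   write_proxy_dropin
#   systemctl daemon-reload
#   systemctl restart zabbix-proxy
```

Note that Requires= also means the proxy is stopped whenever mariadb.service is explicitly stopped or restarted, so this is a trade-off rather than a pure fix.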
                    ​
