Zabbix server does not start

This topic has been answered.
  • galvcom
    Junior Member
    • May 2024
    • 13

    #1

    Zabbix server does not start

    I upgraded from 6.0.27 to 6.0.32 with sudo apt full-upgrade and now the service does not start. Is there a command to see when this process ends?
    Code:
      2463:20240715:152014.402 Starting Zabbix Server. Zabbix 6.0.32 (revision e0ebc610bbe).
      2463:20240715:152014.415 ****** Enabled features ******
      2463:20240715:152014.415 SNMP monitoring:           YES
      2463:20240715:152014.415 IPMI monitoring:           YES
      2463:20240715:152014.415 Web monitoring:            YES
      2463:20240715:152014.415 VMware monitoring:         YES
      2463:20240715:152014.415 SMTP authentication:       YES
      2463:20240715:152014.415 ODBC:                      YES
      2463:20240715:152014.415 SSH support:               YES
      2463:20240715:152014.415 IPv6 support:              YES
      2463:20240715:152014.415 TLS support:               YES
      2463:20240715:152014.416 ******************************
      2463:20240715:152014.416 using configuration file: /etc/zabbix/zabbix_server.conf
      2463:20240715:152014.592 current database version (mandatory/optional): 06000000/06000044
      2463:20240715:152014.592 required mandatory version: 06000000
      2463:20240715:152014.592 optional patches were found
      2463:20240715:152014.592 starting automatic database upgrade
    systemctl says that the server is up, but the GUI says it isn't.
    Code:
    ● zabbix-server.service - Zabbix Server
         Loaded: loaded (/lib/systemd/system/zabbix-server.service; disabled; vendor preset: enabled)
         Active: active (running) since Mon 2024-07-15 16:13:00 CST; 6min ago
        Process: 2677 ExecStart=/usr/sbin/zabbix_server -c $CONFFILE (code=exited, status=0/SUCCESS)
       Main PID: 2679 (zabbix_server)
          Tasks: 1 (limit: 14211)
         Memory: 10.8M
            CPU: 95ms
         CGroup: /system.slice/zabbix-server.service
                 └─2679 /usr/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
    
    Jul 15 16:12:59 pruebas-grafana systemd[1]: Starting Zabbix Server...
    Jul 15 16:13:00 pruebas-grafana systemd[1]: Started Zabbix Server.
    [Image: imagen.png]

    This is already set in MySQL:
    Code:
    set global log_bin_trust_function_creators = 1;
    If I try to restart or stop the Zabbix server, it gets stuck, and I can't even restart the virtual machine because the zabbix process never stops. It shows something like:
    Code:
    A stop job is running for zabbix server ( time / no limit)
    Last edited by galvcom; 16-07-2024, 01:01.
  • Answer selected by galvcom at 16-07-2024, 19:01: post #4 by Markku.

    • Markku
      Senior Member
      Zabbix Certified SpecialistZabbix Certified ProfessionalZabbix Certified Expert
      • Sep 2018
      • 1781

      #2
      The Zabbix server log should indicate the reason for the process to stop, but as you showed, the process has not stopped yet. Are you sure the log file doesn't have any further output? It is not common at all to see the zabbix_server process just stuck with the last log line of "starting automatic database upgrade".

      You can try force-killing (sudo kill -9 <the-pid-of-zabbix_server>) it and then start the service (or reboot) again.

      Markku

      Comment

      • galvcom
        Junior Member
        • May 2024
        • 13

        #3
        It took about 1 hour to finish this process for an 11.75 GB database, with no feedback during the execution. Everything seems to work fine now, so next time I will just have to wait.
        Code:
          2700:20240715:170514.227 ODBC:                      YES
          2700:20240715:170514.227 SSH support:               YES
          2700:20240715:170514.227 IPv6 support:              YES
          2700:20240715:170514.227 TLS support:               YES
          2700:20240715:170514.227 ******************************
          2700:20240715:170514.227 using configuration file: /etc/zabbix/zabbix_server.conf
          2700:20240715:170514.373 current database version (mandatory/optional): 06000000/06000044
          2700:20240715:170514.373 required mandatory version: 06000000
          2700:20240715:170514.373 optional patches were found
          2700:20240715:170514.374 starting automatic database upgrade
          2700:20240715:180103.366 slow query: 3348.991231 sec, "create index auditlog_4 on auditlog (recordsetid)"
          2700:20240715:180103.456 completed 100% of database upgrade
          2700:20240715:180103.456 database upgrade fully completed
          5488:20240715:180103.687 starting HA manager
          5488:20240715:180103.760 HA manager started in active mode
          2700:20240715:180103.775 server #0 started [main process]
          5490:20240715:180103.780 server #2 started [configuration syncer #1]
          5489:20240715:180103.792 server #1 started [service manager #1]
          5491:20240715:180107.295 server #3 started [alert manager #1]
          5492:20240715:180107.298 server #4 started [alerter #1]
          5493:20240715:180107.299 server #5 started [alerter #2]
          5494:20240715:180107.301 server #6 started [alerter #3]
          5495:20240715:180107.303 server #7 started [preprocessing manager #1]
          5496:20240715:180107.305 server #8 started [preprocessing worker #1]
          5497:20240715:180107.307 server #9 started [preprocessing worker #2]
          5498:20240715:180107.308 server #10 started [preprocessing worker #3]
          5505:20240715:180107.316 server #17 started [discoverer #1]
          5506:20240715:180107.317 server #18 started [history syncer #1]
          5501:20240715:180107.319 server #13 started [lld worker #2]
          5499:20240715:180107.320 server #11 started [lld manager #1]
          5500:20240715:180107.320 server #12 started [lld worker #1]
          5504:20240715:180107.322 server #16 started [http poller #1]
          5511:20240715:180107.323 server #23 started [proxy poller #1]
          5503:20240715:180107.324 server #15 started [timer #1]
          5502:20240715:180107.326 server #14 started [housekeeper #1]
          5507:20240715:180107.329 server #19 started [history syncer #2]
          5509:20240715:180107.334 server #21 started [history syncer #4]
          5510:20240715:180107.335 server #22 started [escalator #1]
          5522:20240715:180107.336 server #31 started [unreachable poller #1]
          5518:20240715:180107.340 server #29 started [poller #4]
          5515:20240715:180107.341 server #26 started [poller #1]
          5525:20240715:180107.346 server #32 started [trapper #1]
          5533:20240715:180107.351 server #39 started [history poller #1]
          5519:20240715:180107.352 server #30 started [poller #5]
          5513:20240715:180107.353 server #25 started [task manager #1]
          5512:20240715:180107.354 server #24 started [self-monitoring #1]
          5516:20240715:180107.354 server #27 started [poller #2]
          5534:20240715:180107.361 server #40 started [history poller #2]
          5517:20240715:180107.363 server #28 started [poller #3]
          5541:20240715:180107.368 server #46 started [odbc poller #1]
          5526:20240715:180107.369 server #33 started [trapper #2]
          5529:20240715:180107.371 server #36 started [trapper #5]
          5527:20240715:180107.371 server #34 started [trapper #3]
          5531:20240715:180107.372 server #37 started [icmp pinger #1]
          5532:20240715:180107.373 server #38 started [alert syncer #1]
          5508:20240715:180107.374 server #20 started [history syncer #3]
          5528:20240715:180107.375 server #35 started [trapper #4]
          5535:20240715:180107.386 server #41 started [history poller #3]
          5536:20240715:180107.388 server #42 started [history poller #4]
          5537:20240715:180107.391 server #43 started [history poller #5]
          5539:20240715:180107.393 server #44 started [availability manager #1]
          5540:20240715:180107.393 server #45 started [trigger housekeeper #1]
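
        The server log stays silent while a long statement such as the index creation above is running, but MySQL itself can show that work is still going on. A minimal sketch, assuming a MySQL session whose user has the PROCESS privilege (needed to see other users' threads):

        ```sql
        -- List all active connections with their full statement text.
        -- During the upgrade shown above, one row would carry the
        -- "create index auditlog_4 on auditlog (recordsetid)" statement.
        -- The Time column is the number of seconds the thread has spent
        -- in its current state.
        SHOW FULL PROCESSLIST;
        ```

        This only shows elapsed time, not a completion percentage, so it confirms that the upgrade is alive rather than predicting when it ends.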

        Comment

        • Markku
          Senior Member
          Zabbix Certified SpecialistZabbix Certified ProfessionalZabbix Certified Expert
          • Sep 2018
          • 1781

          #4
          Right, that's mentioned in the upgrade notes, they are worth reading.



          Markku

          Comment

          • galvcom
            Junior Member
            • May 2024
            • 13

            #5
            Yes, it was a mistake on my part to ignore that, thanks for the help.

            Comment

            • Markku
              Senior Member
              Zabbix Certified SpecialistZabbix Certified ProfessionalZabbix Certified Expert
              • Sep 2018
              • 1781

              #6
              No problem, but one more question: how many auditlog records do you have, and what kind of database server (performance-wise) do you have? For future reference, to get an idea of how long the indexing could take.

              select count(*) from auditlog;

              Markku

              Comment

              • galvcom
                Junior Member
                • May 2024
                • 13

                #7
                It is a default MySQL installation. My Zabbix installation was done following a Udemy course, so I think everything was installed with the default options. Is there any way to improve this? I'm trying to work out how to clean up this audit log, since it is larger than the rest of the stored data, but I need to make a backup first in case something goes wrong.

                Only Zabbix is installed on this VM:
                Code:
                +--------------------+
                | Database           |
                +--------------------+
                | information_schema |
                | mysql              |
                | performance_schema |
                | sys                |
                | zabbix             |
                +--------------------+
                
                66 GB      /mysql
                27.2 GiB   /mysql/zabbix
                18.1 GiB   /mysql/zabbix/auditlog.ibd
                4.7 GiB    /mysql/zabbix/history.ibd
                1.1 GiB    /mysql/zabbix/history_uint.ibd
                192.0 MiB  /mysql/zabbix/trends.ibd
                136.0 MiB  /mysql/zabbix/trends_uint.ibd
                Code:
                mysql> select count(*) from auditlog;
                +----------+
                | count(*) |
                +----------+
                | 36421752 |
                +----------+
                1 row in set (2 min 26.98 sec)
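
                InnoDB does not store an exact row count, so count(*) has to walk through an index, which is why it took over two minutes here. A quicker, approximate view is available from information_schema. This is a sketch, assuming the schema is named zabbix as in the database listing above; for InnoDB, table_rows is only an estimate:

                ```sql
                -- Approximate row counts and on-disk sizes of the five largest tables.
                SELECT table_name,
                       table_rows,
                       ROUND((data_length + index_length) / 1024 / 1024 / 1024, 1) AS size_gib
                FROM information_schema.tables
                WHERE table_schema = 'zabbix'
                ORDER BY data_length + index_length DESC
                LIMIT 5;
                ```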

                Comment

                • Markku
                  Senior Member
                  Zabbix Certified SpecialistZabbix Certified ProfessionalZabbix Certified Expert
                  • Sep 2018
                  • 1781

                  #8
                  innodb_buffer_pool_size (https://dev.mysql.com/doc/refman/9.0...ol-resize.html) is the most important setting when tuning MySQL, AFAIK; the default value is a small 128 MB. For example, on one of my DB servers with 2 GB of RAM I have set innodb_buffer_pool_size to 1G (I'm using MariaDB, but the same applies there), and on a larger 32 GB server I have set it to something like 28 GB.
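
                  For reference, the setting can be inspected and resized from a MySQL session. This is a minimal sketch, not a tuning recommendation; the 1 GiB value is only an illustration matching the 2 GB example, and the session needs the privilege to set global variables:

                  ```sql
                  -- Current buffer pool size in MiB.
                  SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mib;

                  -- Resize online to 1 GiB (illustrative value).
                  SET GLOBAL innodb_buffer_pool_size = 1073741824;

                  -- The resize runs in the background; this status variable reports its progress.
                  SHOW STATUS LIKE 'Innodb_buffer_pool_resize_status';
                  ```

                  A value set with SET GLOBAL is lost at the next restart. To keep it, set innodb_buffer_pool_size in the [mysqld] section of the server configuration file, or use SET PERSIST on MySQL 8.0 and later.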

                  36 million is a large number of auditlog records, yes. I remember some Zabbix versions being overly active in logging some LLD events (or something like that), causing the auditlog table to explode in size. It was fixed later, but of course the fix didn't address the already existing auditlog entries.

                  You can try something like this:

                  select count(*) from auditlog where clock < unix_timestamp()-86400*365;
                  = count auditlog entries older than 365 days

                  and then (at your discretion) delete the old records:

                  delete from auditlog where clock < unix_timestamp()-86400*365;
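
                  On a table with tens of millions of rows, a single delete can run for a long time as one large transaction. A possible variation, not something tested in this thread, is to check the cutoff first and then delete in batches. The batch size of 100000 is an arbitrary assumption:

                  ```sql
                  -- Show the cutoff as a readable date before deleting anything.
                  SELECT FROM_UNIXTIME(UNIX_TIMESTAMP() - 86400 * 365) AS cutoff;

                  -- Delete at most 100000 old rows per statement.
                  DELETE FROM auditlog
                  WHERE clock < UNIX_TIMESTAMP() - 86400 * 365
                  LIMIT 100000;

                  -- Rows removed by the previous statement; repeat the DELETE until this is 0.
                  SELECT ROW_COUNT();
                  ```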

                  Markku
                  Last edited by Markku; 17-07-2024, 18:27. Reason: Corrected the innodb_buffer_pool_size to 1G (on a 2G server)

                  Comment

                  • Markku
                    Senior Member
                    Zabbix Certified SpecialistZabbix Certified ProfessionalZabbix Certified Expert
                    • Sep 2018
                    • 1781

                    #9
                    I see the auditlog explosion happened in 6.0.3 and was fixed in 6.0.6 (ZBX-20792; released in June 2022).

                    Markku

                    Comment

                    • Markku
                      Senior Member
                      Zabbix Certified SpecialistZabbix Certified ProfessionalZabbix Certified Expert
                      • Sep 2018
                      • 1781

                      #10
                      FWIW, I checked one of my Zabbix 6.0 databases and saw this:

                      $ sudo ls -l /var/lib/mysql/zabbix/auditlog.ibd
                      -rw-rw---- 1 mysql mysql 486539264 Jul 18 14:20 /var/lib/mysql/zabbix/auditlog.ibd


                      = almost 500 MB of auditlog, even though count(*) returned under 2000 rows, as I had deleted the unnecessary auditlog entries already in 2022.

                      I then ran "optimize table auditlog;" on the database and the result was:

                      $ sudo ls -l /var/lib/mysql/zabbix/auditlog.ibd
                      -rw-rw---- 1 mysql mysql 7340032 Jul 19 10:47 /var/lib/mysql/zabbix/auditlog.ibd


                      = now it's under 10 MB.

                      So if you end up deleting the old auditlog entries, "optimize table" would be a good idea.
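
                      To estimate beforehand how much space a rebuild might give back, information_schema reports the free space inside the tablespace. A sketch, assuming the schema is named zabbix and a file-per-table setup (which the separate auditlog.ibd file suggests); the figures are approximations:

                      ```sql
                      -- Data, index and unused space of the auditlog table, in MiB.
                      SELECT ROUND(data_length / 1024 / 1024)  AS data_mib,
                             ROUND(index_length / 1024 / 1024) AS index_mib,
                             ROUND(data_free / 1024 / 1024)    AS free_mib
                      FROM information_schema.tables
                      WHERE table_schema = 'zabbix'
                        AND table_name = 'auditlog';

                      -- Rebuild the table so the unused space is returned to the filesystem.
                      OPTIMIZE TABLE auditlog;
                      ```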

                      Markku

                      Comment

                      • galvcom
                        Junior Member
                        • May 2024
                        • 13

                        #11
                        I've made some changes.

                        Increased innodb_buffer_pool_instances from 1 to 4:
                        Code:
                        mysql> SELECT @@innodb_buffer_pool_instances;
                        +--------------------------------+
                        | @@innodb_buffer_pool_instances |
                        +--------------------------------+
                        |                              4 |
                        +--------------------------------+
                        1 row in set (0.00 sec)
                        Left innodb_buffer_pool_chunk_size at the default:
                        Code:
                        mysql> SELECT @@innodb_buffer_pool_chunk_size;
                        +---------------------------------+
                        | @@innodb_buffer_pool_chunk_size |
                        +---------------------------------+
                        |                       134217728 |
                        +---------------------------------+
                        1 row in set (0.00 sec)
                        Changed innodb_buffer_pool_size from the default to 8 GB (I have 12 GB of RAM):
                        Code:
                        mysql> SELECT @@innodb_buffer_pool_size;
                        +---------------------------+
                        | @@innodb_buffer_pool_size |
                        +---------------------------+
                        |                8589934592 |
                        +---------------------------+
                        1 row in set (0.00 sec)
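
                        MySQL keeps the buffer pool size at a multiple of innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances and adjusts the value automatically otherwise. A quick way to check that the configured values line up:

                        ```sql
                        -- chunks_per_instance should be a whole number and remainder should be 0.
                        SELECT @@innodb_buffer_pool_size
                               / (@@innodb_buffer_pool_chunk_size * @@innodb_buffer_pool_instances) AS chunks_per_instance,
                               @@innodb_buffer_pool_size
                               % (@@innodb_buffer_pool_chunk_size * @@innodb_buffer_pool_instances) AS remainder;
                        ```

                        With a pool size of 8589934592, a chunk size of 134217728 and 4 instances, this works out to 16 chunks per instance with a remainder of 0, so no adjustment takes place.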
                        I keep only the last 3 months of data and delete the rest, but the older data amounted to only 7911 records:
                        Code:
                        mysql>  select count(*) from auditlog where clock < unix_timestamp()-86400*90;
                        +----------+
                        | count(*) |
                        +----------+
                        |     7911 |
                        +----------+
                        1 row in set (0.00 sec)
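
                        When so few records fall outside the retention window, it can help to see how the rows are spread over time. A sketch only; it reads the whole table, so it may take minutes on a table of this size:

                        ```sql
                        -- Number of auditlog records per calendar month.
                        SELECT FROM_UNIXTIME(clock, '%Y-%m') AS ym,
                               COUNT(*)                      AS records
                        FROM auditlog
                        GROUP BY ym
                        ORDER BY ym;
                        ```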
                        Output of optimize
                        Code:
                        mysql> optimize table auditlog;
                        +-----------------+----------+----------+-------------------------------------------------------------------+
                        | Table           | Op       | Msg_type | Msg_text                                                          |
                        +-----------------+----------+----------+-------------------------------------------------------------------+
                        | zabbix.auditlog | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
                        | zabbix.auditlog | optimize | status   | OK                                                                |
                        +-----------------+----------+----------+-------------------------------------------------------------------+
                        2 rows in set (2 hours 56 min 31.25 sec)

                        The size went from 19 GB to about 16 GB. Since I have disabled the audit log, the size should shrink further when I repeat this process in 3 months.

                        Comment

                        • Markku
                          Senior Member
                          Zabbix Certified SpecialistZabbix Certified ProfessionalZabbix Certified Expert
                          • Sep 2018
                          • 1781

                          #12
                          Good point: in MariaDB (which I tend to use) the instances are not used anymore and the default chunk size is set automatically, while in MySQL tuning them as well is useful.

                          Markku

                          Comment
