Issue with Upgrading Zabbix from 6.0.2 to 6.4: Database Upgrade Failing in HA Mode

This topic has been answered.
  • sasanaalem
    Junior Member
    • Sep 2023
    • 28

    #1

    I am currently running Zabbix 6.0.2 in a Docker environment and attempting to upgrade to Zabbix 6.4. My setup includes Zabbix server and agent containers, with the Zabbix server using MySQL (Percona XtraDB Cluster). I have followed the upgrade instructions, but the upgrade process fails because the Zabbix server still attempts to operate in HA mode during the database upgrade.

    Current Setup:
    • Database: Percona XtraDB Cluster
    My configuration:
    Code:
    cat docker-compose.yml
    version: "3.9"
    
    services:
      zabbix-server-1:
        image: 'zabbix/zabbix-server-mysql:ubuntu-6.4-latest'
    #    ports:
    #      - "10051:10051"
        volumes:
          - /etc/localtime:/etc/localtime:ro
          - /var/lib/snmp/mibs:/var/lib/zabbix/mibs:ro
          - './usr/lib/zabbix/alertscripts:/usr/lib/zabbix/alertscripts:ro'
          - './usr/lib/zabbix/externalscripts:/usr/lib/zabbix/externalscripts:ro'
        environment:
          - DB_SERVER_HOST=172.31.63.1
          - DB_SERVER_PORT=3306
          - ZBX_HANODENAME=zabbix-server-1
          - ZBX_NODEADDRESS=172.31.63.1
        env_file:
          - ./envs/.env_mysql
          - ./envs/.env_server
        container_name: zabbix-server-1
        hostname: zbxSRV-1
        restart: unless-stopped
        network_mode: host
        ulimits:
          nproc: 65535
          nofile:
            soft: 20000
            hard: 40000
        stop_grace_period: 30s
      zabbix-agent-1:
        image: 'zabbix/zabbix-agent:ubuntu-latest'
        volumes:
          - /etc/localtime:/etc/localtime:ro
          - './zabbix-server/zabbix_agentd.d:/etc/zabbix/zabbix_agentd.d:ro'
          - './zabbix-server/var/lib/zabbix:/var/lib/zabbix:ro'
        env_file:
         - ./envs/.env_agent
        environment:
          - ZBX_SERVER_HOST=127.0.0.1,172.31.63.1
          - ZBX_ACTIVE_ALLOW=false
        privileged: true
        container_name: zabbix-agent-1
        hostname: zbxAgent-1
        restart: unless-stopped
        pid: "host"
        network_mode: host
        stop_grace_period: 5s
    Here are my Docker logs for zabbix-server:

    Code:
    docker logs -f --tail=30 zabbix-server-1
    ** Updating '/etc/zabbix/zabbix_server.conf' parameter "TLSCipherPSK13": ''...removed
    ** Updating '/etc/zabbix/zabbix_server.conf' parameter "TLSKeyFile": ''...removed
    ** Updating '/etc/zabbix/zabbix_server.conf' parameter "TLSPSKIdentity": ''...removed
    ** Updating '/etc/zabbix/zabbix_server.conf' parameter "TLSPSKFile": ''...removed
    ** Updating '/etc/zabbix/zabbix_server.conf' parameter "ServiceManagerSyncFrequency": ''...removed
    ** Updating '/etc/zabbix/zabbix_server.conf' parameter "HANodeName": 'zabbix-server-1'...updated
    ** Updating '/etc/zabbix/zabbix_server.conf' parameter "NodeAddress": '172.31.63.1'...updated
    ** Updating '/etc/zabbix/zabbix_server.conf' parameter "User": 'zabbix'...updated
    Starting Zabbix Server. Zabbix 6.4.14 (revision 0a50e61).
    Press Ctrl+C to exit.
    
         7:20240516:232828.012 Starting Zabbix Server. Zabbix 6.4.14 (revision 0a50e61).
         7:20240516:232828.012 ****** Enabled features ******
         7:20240516:232828.012 SNMP monitoring:           YES
         7:20240516:232828.012 IPMI monitoring:           YES
         7:20240516:232828.012 Web monitoring:            YES
         7:20240516:232828.012 VMware monitoring:         YES
         7:20240516:232828.012 SMTP authentication:       YES
         7:20240516:232828.012 ODBC:                      YES
         7:20240516:232828.012 SSH support:               YES
         7:20240516:232828.012 IPv6 support:              YES
         7:20240516:232828.012 TLS support:               YES
         7:20240516:232828.012 ******************************
         7:20240516:232828.012 using configuration file: /etc/zabbix/zabbix_server.conf
         7:20240516:232828.044 Zabbix supports only "utf8_bin,utf8mb3_bin,utf8mb4_bin" collation(s). Database "zabbix" has default collation "utf8mb4_0900_ai_ci"
         7:20240516:232828.080 current database version (mandatory/optional): 06000000/06000020
         7:20240516:232828.080 required mandatory version: 06040000
         7:20240516:232828.080 mandatory patches were found
         7:20240516:232828.082 cannot perform database upgrade in HA mode: all nodes need to be stopped and Zabbix server started in standalone mode for the time of upgrade.
         7:20240516:232828.082 Zabbix Server stopped. Zabbix 6.4.14 (revision 0a50e61).
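
The entrypoint lines at the top of this log already hint at whether HA is configured: parameters with an empty value are reported as "removed", while HANodeName is reported as "updated". Below is a minimal sketch for checking that from captured log output. The function name `hanodename_state` is hypothetical, and the sketch assumes the image entrypoint keeps the message format shown in the log above.

```shell
#!/bin/sh
# Hypothetical helper: reads container log text on stdin and reports how the
# image entrypoint handled the HANodeName parameter.
# Assumption: the entrypoint prints lines in the format seen in the log above,
# ending in "...updated" when a value is set and "...removed" when it is empty.
hanodename_state() {
    # Keep only the most recent HANodeName line in case of several restarts.
    line=$(grep 'parameter "HANodeName"' | tail -n 1)
    case "$line" in
        *...updated) echo "set" ;;
        *...removed) echo "unset" ;;
        "")          echo "not-reported" ;;
        *)           echo "unknown" ;;
    esac
}
```

It could be used as `docker logs zabbix-server-1 2>&1 | hanodename_state`. A result of "set" would mean the container still receives a node name from somewhere, which matches the HA-mode error at the end of the log.
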
    Steps Taken:
    1. Dynamically in MySQL: SET GLOBAL log_bin_trust_function_creators = 1;
    2. Updated Docker image in docker-compose.yml:
    3. Cleared the ha_node table:
    4. Set environment variable to disable HA mode:

    Request for Help: What additional steps or configurations are necessary to ensure the Zabbix server starts in standalone mode to complete the database upgrade? Are there any known issues or additional configurations required when upgrading from 6.0.2 to 6.4 that I might have overlooked?
  • Answer selected by sasanaalem at 31-05-2024, 14:04.
    cyber
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Dec 2006
    • 4807

    You are doing a major version upgrade. Shut all servers down. Start one with the new version without HA; it will perform the upgrade. After that, you can restart the new-version servers in HA mode.
    I have no experience with containers, usually they are just for annoying the hell out of people. If leaving the specific keywords out of the config does not start it in standalone mode, then I cannot really help you.


    • cyber
      Senior Member
      Zabbix Certified Specialist, Zabbix Certified Professional
      • Dec 2006
      • 4807

      #2
      So stop all your servers, take one of them out of the cluster (comment out the "HANodeName" line), start it up, let it perform the upgrade, stop it, put that host back into the cluster, and start all nodes. Removing a node from the cluster can also be done with a runtime command (zabbix_server -R ha_remove_node=target).
      Last edited by cyber; 24-05-2024, 10:56.
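
For the Docker Compose setup from post #1, the sequence described above might translate roughly as follows. This is only a sketch: the function name `print_upgrade_plan` is hypothetical, it prints the commands instead of running them, and it assumes the Compose v2 CLI (`docker compose`) and that unsetting ZBX_HANODENAME is what puts the container into standalone mode.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a command sequence for the node
# that performs the upgrade. $1 is that node's compose service name.
# Assumption: Compose v2 CLI; with the older v1 tool the command would be
# "docker-compose" instead of "docker compose".
print_upgrade_plan() {
    svc=$1
    cat <<EOF
# 1. Stop the Zabbix server (repeat on every HA node with its own service name)
docker compose stop $svc
# 2. On this node only: remove or comment out ZBX_HANODENAME and ZBX_NODEADDRESS
# 3. Recreate the container so the changed environment is applied
docker compose up -d --force-recreate $svc
# 4. Follow the log while the database upgrade runs
docker compose logs -f $svc
# 5. Stop this node, restore the HA variables, then start all nodes
docker compose stop $svc
EOF
}
```

Calling `print_upgrade_plan zabbix-server-1` prints the plan for the service defined in the compose file from post #1. Printing instead of executing keeps the sketch safe to try, since step 2 is a manual edit anyway.
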


      • sasanaalem
        Junior Member
        • Sep 2023
        • 28

        #3
        Thank you for your explanation. Do you mean that I should stop all nodes before taking a backup? Should this stop be done at the Galera level or at the MySQL server level? I took a backup of a database that runs on Galera, and I want to use my old database on a new machine. Unfortunately, the Zabbix documentation is not precise about this and did not give me good guidance.


        • sasanaalem
          Junior Member
          • Sep 2023
          • 28

          #4
          I increased the debug level and got this:

          Code:
               7:20240517:080542.748 Starting Zabbix Server. Zabbix 6.4.14 (revision 0a50e61).
               7:20240517:080542.749 ****** Enabled features ******
               7:20240517:080542.749 SNMP monitoring:           YES
               7:20240517:080542.749 IPMI monitoring:           YES
               7:20240517:080542.749 Web monitoring:            YES
               7:20240517:080542.749 VMware monitoring:         YES
               7:20240517:080542.749 SMTP authentication:       YES
               7:20240517:080542.749 ODBC:                      YES
               7:20240517:080542.750 SSH support:               YES
               7:20240517:080542.750 IPv6 support:              YES
               7:20240517:080542.750 TLS support:               YES
               7:20240517:080542.750 ******************************
               7:20240517:080542.750 using configuration file: /etc/zabbix/zabbix_server.conf
               7:20240517:080542.750 In zbx_load_modules()
               7:20240517:080542.750 End of zbx_load_modules():SUCCEED
               7:20240517:080542.750 In zbx_ipc_service_start() service:rtc
               7:20240517:080542.751 In zbx_ipc_socket_open()
               7:20240517:080542.751 End of zbx_ipc_socket_open():FAIL
               7:20240517:080542.751 End of zbx_ipc_service_start():SUCCEED
               7:20240517:080542.751 In zbx_db_get_database_type()
               7:20240517:080542.751 In zbx_db_connect() flag:0
               7:20240517:080542.792 End of zbx_db_connect():0
               7:20240517:080542.792 query [txnlev:0] [select userid from users limit 1]
               7:20240517:080542.793 there is at least 1 record in "users" table
               7:20240517:080542.793 End of zbx_db_get_database_type():ZBX_DB_SERVER
               7:20240517:080542.793 In init_database_cache()
               7:20240517:080542.793 In zbx_shmem_create() param:'HistoryCacheSize' size:2147483648
               7:20240517:080542.794 valid user addresses: [0x7efe354b9170, 0x7efeb54b8ff0] total size: 2147483264
               7:20240517:080542.794 End of zbx_shmem_create()
               7:20240517:080542.794 In zbx_shmem_create() param:'HistoryIndexCacheSize' size:2147483648
               7:20240517:080542.794 valid user addresses: [0x7efdb54b9180, 0x7efe354b8ff0] total size: 2147483248
               7:20240517:080542.794 End of zbx_shmem_create()
               7:20240517:080542.794 In init_trend_cache()
               7:20240517:080542.794 In zbx_shmem_required_size() size:0 chunks_num:1 descr:'trend cache' param:'TrendCacheSize'
               7:20240517:080542.795 End of zbx_shmem_required_size() size:422
               7:20240517:080542.795 In zbx_shmem_create() param:'TrendCacheSize' size:2147483648
               7:20240517:080542.795 valid user addresses: [0x7efd354b9170, 0x7efdb54b8ff0] total size: 2147483264
               7:20240517:080542.795 End of zbx_shmem_create()
               7:20240517:080542.795 End of init_trend_cache()
               7:20240517:080542.795 End of init_database_cache()
               7:20240517:080542.795 In zbx_db_connect() flag:0
               7:20240517:080542.834 End of zbx_db_connect():0
               7:20240517:080542.835 query [txnlev:0] [select default_character_set_name,default_collation_name from information_schema.SCHEMATA where schema_name='zabbix']
               7:20240517:080542.835 query [txnlev:0] [select count(*) from information_schema.`COLUMNS` where table_schema='zabbix' and data_type in ('text','varchar','longtext') and (character_set_name not in ('utf8','utf8mb3','utf8mb4') or collation_name not in ('utf8_bin','utf8mb3_bin','utf8mb4_bin'))]
               7:20240517:080542.842 In zbx_db_connect() flag:0
               7:20240517:080542.869 End of zbx_db_connect():0
               7:20240517:080542.870 In zbx_dbms_version_info_extract()
               7:20240517:080542.870 End of zbx_dbms_version_info_extract() version:80036
               7:20240517:080542.870 In DBcheck_version()
               7:20240517:080542.870 In zbx_db_connect() flag:0
               7:20240517:080547.908 End of zbx_db_connect():0
               7:20240517:080547.909 query [txnlev:0] [show tables like 'dbversion']
               7:20240517:080547.912 query [txnlev:0] [select mandatory,optional from dbversion]
               7:20240517:080547.912 current database version (mandatory/optional): 06000000/00000000
               7:20240517:080547.913 required mandatory version: 06040000
               7:20240517:080547.913 mandatory patches were found
               7:20240517:080547.914 query [txnlev:0] [show tables like 'ha_node']
               7:20240517:080547.916 query [txnlev:1] [begin;]
               7:20240517:080547.916 query [txnlev:1] [select unix_timestamp(),ha_failover_delay from config]
               7:20240517:080547.917 cannot retrieve database time
               7:20240517:080547.918 query [txnlev:1] [select lastaccess,name from ha_node where status not in (1,2) order by ha_nodeid for update]
               7:20240517:080547.919 query [txnlev:1] [commit;]
               7:20240517:080547.919 cannot perform database upgrade in HA mode: all nodes need to be stopped and Zabbix server started in standalone mode for the time of upgrade.
               7:20240517:080547.920 End of DBcheck_version():FAIL
               7:20240517:080547.920 Zabbix Server stopped. Zabbix 6.4.14 (revision 0a50e61).
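
The two version lines in this log are what decide whether an upgrade is attempted: the current mandatory version (06000000) is lower than the required one (06040000). A small sketch for pulling those values out of a captured log follows. The function name `db_upgrade_status` is hypothetical, and the sketch relies only on the two message formats visible in the log above.

```shell
#!/bin/sh
# Hypothetical helper: reads Zabbix server log text on stdin and prints
# "<current mandatory> <required mandatory> <pending|ok>".
# Assumption: the log contains the two version lines in the format shown above.
db_upgrade_status() {
    awk '
        /current database version \(mandatory\/optional\):/ {
            # Last field looks like 06000000/00000000; keep the mandatory part.
            split($NF, v, "/"); cur = v[1]
        }
        /required mandatory version:/ { req = $NF }
        END {
            if (cur == "" || req == "") { print "unknown"; exit 1 }
            # Adding 0 forces a numeric comparison (06000000 becomes 6000000).
            state = ((cur + 0) < (req + 0)) ? "pending" : "ok"
            print cur, req, state
        }
    '
}
```

For example, `docker logs zabbix-server-1 2>&1 | db_upgrade_status` would print the versions with "pending" for the log above, and "ok" once the database schema matches the server version.
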


          • cyber
            Senior Member
            Zabbix Certified Specialist, Zabbix Certified Professional
            • Dec 2006
            • 4807

            #5
            This is not about the DB. To perform an upgrade, first start one Zabbix server without HA, let it do its upgrade tasks, and then restore Zabbix HA.


            • sasanaalem
              Junior Member
              • Sep 2023
              • 28

              #6
              Originally posted by cyber
              This is not about the DB. To perform an upgrade, first start one Zabbix server without HA, let it do its upgrade tasks, and then restore Zabbix HA.
              I did this: I disabled everything related to HA in the Zabbix server, but I still get this error when I bring up my compose stack.
              Code:
              - ZBX_HANODENAME=zabbix-server-1
              - ZBX_NODEADDRESS=172.31.63.1
              I removed these from my compose file and brought it up again, but I still get the same error.
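
One thing that may be worth checking here: the compose file in post #1 also loads ./envs/.env_mysql and ./envs/.env_server through env_file, so an HA variable could still reach the container after the inline entries are removed. Whether those files actually contain one is not known from this thread. A minimal sketch for searching them follows; the function name `find_ha_vars` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: lists uncommented HA-related assignments in the given
# env files. The two variable names are the ones used in the compose file
# from post #1.
find_ha_vars() {
    # -H prints the file name, -n the line number.
    # Exit status 1 means no match was found in any file.
    grep -H -n -E '^[[:space:]]*(ZBX_HANODENAME|ZBX_NODEADDRESS)=' "$@"
}
```

It could be run as `find_ha_vars ./envs/.env_server ./envs/.env_mysql`. In addition, `docker compose config` should print the merged configuration, including values coming from env_file, and `docker compose up -d --force-recreate zabbix-server-1` recreates the container so that a changed environment is actually applied.
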


              • sasanaalem
                Junior Member
                • Sep 2023
                • 28

                #7
                No one can help!?


                • cyber
                  Senior Member
                  Zabbix Certified Specialist, Zabbix Certified Professional
                  • Dec 2006
                  • 4807

                  #8
                  You are doing a major version upgrade. Shut all servers down. Start one with the new version without HA; it will perform the upgrade. After that, you can restart the new-version servers in HA mode.
                  I have no experience with containers, usually they are just for annoying the hell out of people. If leaving the specific keywords out of the config does not start it in standalone mode, then I cannot really help you.

