Configuring Zabbix 4.0LTS active/passive cluster with pacemaker/corosync in MS Azure

  • ITOMDave
    Member
    • Nov 2018
    • 53

    #1

    Hi experts,

    I'm working to put together an experimental Zabbix build on Azure, something that I'm sure has been done a million times before. I think I'm very close to completing it, but I can't get the Zabbix UI to recognise that the Zabbix server is running when I access the UI via the cluster VIP.
    [Image: Zabbix Cluster piv1.png]


    The configuration I have is this :
    • Database as an Azure MySQL service
    Zabbix Server
    • 2 VMs for the Zabbix server (one will be an active node, the other a standby node).
    • VMs in a pacemaker / corosync cluster.
    Zabbix Front End
    • 2 VMs
    • Load-balanced (or at least will be!)

    Cluster Configuration Details so far
    • Cluster VIP of 10.1.1.100
    • 2 nodes in the cluster : pcs status shows :
      Cluster name: N-PRD-CL1
      Stack: corosync
      Current DC: N-PRD-L-Cor1 (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
      Last updated: Sun Jul 28 22:21:12 2019
      Last change: Sun Jul 28 19:22:50 2019 by hacluster via cibadmin on N-PRD-L-Cor1

      2 nodes configured
      2 resources configured

      Online: [ N-PRD-L-Cor1 N-PRD-L-Cor2 ]

      Full list of resources:

      cluster_vip (ocf::heartbeat:IPaddr2): Started N-PRD-L-Cor1
      zabbix_server (systemd:zabbix-server): Started N-PRD-L-Cor1

      Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    As a separate entity, the cluster itself appears to be working fine: if I place the active node in standby, Zabbix starts up on the other node. Failback works OK as well.

    I've configured the Zabbix front end server (/etc/zabbix/web/zabbix.conf.php) to contain the following :
    $ZBX_SERVER = '10.1.1.100';
    $ZBX_SERVER_PORT = '10051';
    $ZBX_SERVER_NAME = '';


    i.e. $ZBX_SERVER is set to the IP of the CLUSTER.
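The frontend can only report the server as running if the web host can open a TCP connection to $ZBX_SERVER on $ZBX_SERVER_PORT, so a quick reachability test from a frontend VM helps separate a Zabbix problem from a network problem. The following is a minimal Python sketch, not part of Zabbix: the function name `can_reach` and the 3-second timeout are illustrative assumptions, while the address and port are the values from the config above.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == "__main__":
    # Values taken from zabbix.conf.php above.
    vip, port = "10.1.1.100", 10051
    state = "reachable" if can_reach(vip, port) else "NOT reachable"
    print(f"{vip}:{port} is {state} from this host")
```

If this reports the VIP as not reachable from the frontend while the same test against the active node's own IP succeeds, the problem is likely in how the VIP is routed rather than in the Zabbix configuration.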

    The only change in the /etc/httpd/conf.d/zabbix.conf is to the timezone as usual.

    On each Zabbix server within the cluster, the zabbix_server.conf file is the same (showing basic output only) :
    LogFile=/var/log/zabbix/zabbix_server.log
    LogFileSize=0
    PidFile=/var/run/zabbix/zabbix_server.pid
    SocketDir=/var/run/zabbix
    DBHost=xxxxxxxxxxxxxxxxxxxxx
    DBName=xxxxxxxxxxxxxxxxxxxx
    DBUser=xxxxxxxxxxxxxxxx
    DBPassword=xxxxxxxxxxxx
    SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
    Timeout=4
    AlertScriptsPath=/usr/lib/zabbix/alertscripts
    ExternalScripts=/usr/lib/zabbix/externalscripts
    LogSlowQueries=3000


    I have read in various places, such as https://ericsysmin.com/2016/02/18/configuring-high-availability-ha-zabbix-server-on-centos-7/, that I should set the SourceIP and ListenIP parameters in zabbix_server.conf to the cluster VIP address. I've done this and restarted the Zabbix server, but it makes no difference: the Zabbix GUI still insists that the server isn't running.

    I've probably made a simple "newbie" mistake and omitted or misunderstood something.

    All suggestions as to a solution are welcomed. So far I've thought of :
    1. Do I need to do anything in Azure, given that the VIP only 'exists' in the context of pcs?
    2. I see that there are some pcs resource agents for azure :

      azure-lb
      azure-repo-svc
      azure-repo-svc
      azure-repo-svc.path
      azure-repo-svc.path

      Do I need to do something with these ?
    I'll continue to research this, but I'm hoping someone can educate me.

    Thanks in advance.

    ITOMDave

    Last edited by ITOMDave; 30-07-2019, 20:46. Reason: Added some tags to make it easier to find
  • ITOMDave
    Member
    • Nov 2018
    • 53

    #2
    The research took longer than I'd have liked, but I'm pretty sure that I've fixed it. I'll try to put together a "how-to" on this when I get the opportunity, but for now:
    1. ocf:heartbeat:IPaddr2, the resource that sets up the VIP on the cluster, isn't appropriate for an Azure environment.
    2. Using this as the heartbeat resource doesn't tell Azure anything.
    3. Use an alternative resource agent that can interface with Azure.
    More to come soon, when I've had a chance to finish testing and write proper documentation.
    D
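To make point 3 more concrete: an Azure load balancer decides where to send traffic by sending a health probe to each backend node, so the active node has to answer on a probe port and the standby must not. The Python sketch below only imitates that idea for illustration and is not the resource agent itself. `ProbeResponder` is a hypothetical name, and port 61000 is an assumption that would have to match the probe port configured on the load balancer.

```python
import socket
import threading


class ProbeResponder:
    """Accept TCP connections on a port so a health probe sees this node as up."""

    def __init__(self, host: str = "0.0.0.0", port: int = 61000):
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self._sock.bind((host, port))
        self._sock.listen(5)
        self._sock.settimeout(0.2)  # lets the loop notice stop() promptly
        self.port = self._sock.getsockname()[1]
        self._stopped = threading.Event()
        self._thread = threading.Thread(target=self._serve, daemon=True)

    def start(self) -> None:
        """Begin answering probes (what happens on the active node)."""
        self._thread.start()

    def _serve(self) -> None:
        while not self._stopped.is_set():
            try:
                conn, _ = self._sock.accept()
            except socket.timeout:
                continue
            except OSError:
                break
            conn.close()  # a completed handshake is enough for a TCP probe

    def stop(self) -> None:
        """Stop answering probes (what happens when the node goes to standby)."""
        self._stopped.set()
        if self._thread.is_alive():
            self._thread.join(timeout=2)
        self._sock.close()
```

While the responder is started, a TCP probe to the port succeeds; after `stop()` the connection is refused, which is the signal the load balancer would use to send traffic to the other node.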


    • Markku
      Senior Member
      Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
      • Sep 2018
      • 1781

      #3
      Good to hear, it will be interesting to read what you have found out implementing that in Azure!

      Markku


      • Markku
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
        • Sep 2018
        • 1781

        #4
        ITOMDave Hi, did you figure out and document a way to implement the Zabbix cluster in Azure?

        Markku


        • ITOMDave
          Member
          • Nov 2018
          • 53

          #5
          Hi Markku, indeed I did get it working. The documentation I produced is very specific to my environment, but the following might help:

          The vast majority of information I got came from here:
          https://ericsysmin.com/2016/02/18/configuring-high-availability-ha-zabbix-server-on-centos-7/

          The main thing to do is to make sure that your Azure load balancer is defined as a resource in pcs:


          Once I'd understood how all that works I was able to configure a PCS resource for the Azure Load Balancer
          Code:
          pcs resource create p_azure-lb ocf:heartbeat:azure-lb op monitor timeout="20s" interval="10s"
          Although the command doesn't return an error, the pacemaker GUI (https://<insert pacemaker IP here>:2224/managec/N-PRD-CL1/main#/resources/p_azure-lb ) shows an error about not being able to find "nc" (netcat).
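Because the failure does not show up on the command line, it can help to check for the agent's external dependency on each cluster node before creating the resource. A small Python sketch of such a check follows. `missing_tools` is a hypothetical helper, and checking only for `nc` is an assumption based on the error described above.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the entries of `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["nc"])
    if missing:
        print("Missing on this node:", ", ".join(missing))
    else:
        print("nc found on PATH")
```

On CentOS/RHEL 7, `nc` is typically provided by the nmap-ncat package, so installing that on both nodes is the likely fix if the check reports it missing.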

          You should also make sure that azure-lb and zabbix-server are co-located and constrained.
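As a rough illustration of "co-located and constrained", the sketch below builds two pcs commands: one that keeps zabbix_server on the node running p_azure-lb, and one that starts it only after the load-balancer resource. The resource names come from this thread. The INFINITY colocation score and this particular ordering are assumptions, and `build_constraint_commands` and `apply_constraints` are hypothetical helpers. By default the commands are only printed, not executed.

```python
import subprocess
from typing import List


def build_constraint_commands(service: str, lb: str) -> List[List[str]]:
    """Return pcs argv lists that colocate `service` with `lb` and start `lb` first."""
    return [
        ["pcs", "constraint", "colocation", "add", service, "with", lb, "INFINITY"],
        ["pcs", "constraint", "order", lb, "then", service],
    ]


def apply_constraints(service: str, lb: str, execute: bool = False) -> None:
    """Print each command; run it too when execute=True."""
    for argv in build_constraint_commands(service, lb):
        print(" ".join(argv))
        if execute:
            # Needs root on a cluster node where pcs is installed.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    apply_constraints("zabbix_server", "p_azure-lb")
```

Reviewing the printed commands first, and checking the result afterwards with `pcs constraint`, is safer than running them blindly, since the right constraints depend on how the rest of the cluster is laid out.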

          It's important that you understand the theory and concepts of pcs/pacemaker as well as Azure load balancing. Environments tend to vary quite a lot, so just listing a whole bunch of commands probably wouldn't be doing you a favour.

          I hope the above pointers help.

          Dave.





          • Markku
            Senior Member
            Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
            • Sep 2018
            • 1781

            #6
            Hi, thanks, interesting, good to know! Some other clustering configuration resources for the readers (not Azure-related though):

            https://blog.zabbix.com/zabbix-ha-cluster-setups/
            Later in this document: Setting up the database servers, Setting up the Zabbix servers, Setting up the frontend (web) servers. Edmunds Vesmanis had a presentation in Zabbix Summit 2019 about Zabbix HA s…


            Markku
