How to Monitor an AIX Cluster

  • rcremasco
    Junior Member
    • Sep 2013
    • 2

    #1

    We need to monitor an AIX cluster.

    We installed the Zabbix agent on both nodes, and all checks on the individual nodes are working fine.

    Then we configured a new host for the virtual IP of the clustered resource. Now the item created on that host is giving the error:

    Get value from agent failed: ZBX_TCP_READ() failed: [4] Interrupted system call



    How can we monitor the clustered resources?

    thanks
  • dave_t
    Junior Member
    • Apr 2007
    • 28

    #2
    On my AIX clusters, I run an agent on each node, which listens for the server on port 10050.

    This is defined in /etc/zabbix/zabbix_agentd.conf

    You need to make sure you update the following:
    LogFile=/tmp/NODE_logfile.log
    PidFile=/tmp/NODE_logfile.pid
    Hostname=NODE
    ListenPort=10050

    This agent is started at system boot on both nodes.

    I then tell the zabbix server about each node and apply the Template_AIX

    I then run *another* zabbix agent, started by hacmp, which runs the same zabbix_agentd daemon but uses a different config file.

    -rw-r--r-- 1 root system 8738 Jul 12 15:01 zabbix_agentd_cluster.conf
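
    As a rough sketch of how hacmp could start and stop that second agent (this is not the actual script from my cluster): the binary path below is an assumption, and the function names are made up for illustration. The only agent option used is -c, which selects the config file.

```shell
#!/bin/sh
# Hypothetical start/stop wrapper for the cluster-level agent.
# AGENT_BIN is an ASSUMED install path; adjust it to your system.
AGENT_BIN="${AGENT_BIN:-/usr/local/sbin/zabbix_agentd}"
CLUSTER_CONF="${CLUSTER_CONF:-/etc/zabbix/zabbix_agentd_cluster.conf}"

start_cluster_agent() {
    # Same daemon as the node agent, but pointed at the cluster config.
    "$AGENT_BIN" -c "$CLUSTER_CONF"
}

stop_cluster_agent() {
    # Look up the PID file named in the cluster config and signal that process.
    pid_file=$(sed -n 's/^PidFile=//p' "$CLUSTER_CONF")
    if [ -n "$pid_file" ] && [ -f "$pid_file" ]; then
        kill "$(cat "$pid_file")"
    fi
}

# Does nothing unless called with "start" or "stop".
case "${1:-}" in
    start) start_cluster_agent ;;
    stop)  stop_cluster_agent ;;
esac
```

    Because the stop path reads PidFile from the config, the node agent and the cluster agent never touch each other's process. Everything cluster-specific lives in zabbix_agentd_cluster.conf.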

    In here, I change the following:
    PidFile=/tmp/zabbix_agentd_cluster.pid
    LogFile=/tmp/zabbix_agentd_cluster.log
    HostnameItem=system.hostname

    NOTE: I use "HostnameItem=system.hostname" rather than "Hostname" because, when a cluster failover occurs, the hacmp failover scripts change the hostname of the active node (by running the "hostname MY_CLUSTER" and "uname -S MY_CLUSTER" commands).
    FWIW - assuming your service label is "MY_CLUSTER", this also means that I can run scripts independent of hacmp which start off with:

    if [ "`hostname`" != "MY_CLUSTER" ]
    then
        exit 0
    fi


    anyhoo.....back to the "zabbix_agentd_cluster.conf"...
    In here, I also change the port which the MY_CLUSTER version of the agent will listen on, to a different port than the one being used by the NODE.

    ListenPort=11050
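
    If you would rather derive the cluster config from the node config than edit it by hand, something like the following might do. It is only a sketch: make_cluster_conf is a made-up helper name, and it assumes the four settings appear as uncommented lines in the node config.

```shell
#!/bin/sh
# Hypothetical helper: read a node agent config on stdin and write the
# cluster variant on stdout, applying the four changes described above.
make_cluster_conf() {
    sed -e 's|^PidFile=.*|PidFile=/tmp/zabbix_agentd_cluster.pid|' \
        -e 's|^LogFile=.*|LogFile=/tmp/zabbix_agentd_cluster.log|' \
        -e 's|^Hostname=.*|HostnameItem=system.hostname|' \
        -e 's|^ListenPort=.*|ListenPort=11050|'
}
```

    Usage would be along the lines of: make_cluster_conf < /etc/zabbix/zabbix_agentd.conf > /etc/zabbix/zabbix_agentd_cluster.conf. Lines that are commented out or missing in the node config are left as they are, so check the result before starting the agent.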

    I then create a *new* host (i.e. "MY_CLUSTER") on the zabbix server, and point it at port 11050
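
    Before relying on the new host, it may be worth checking from the zabbix server that both agents answer on their own ports. A minimal sketch using zabbix_get (check_agent is a made-up wrapper name, and NODE / MY_CLUSTER are the placeholder names used in this post):

```shell
#!/bin/sh
# Hypothetical wrapper around zabbix_get; ZABBIX_GET can be overridden
# if the binary is not on the PATH.
ZABBIX_GET="${ZABBIX_GET:-zabbix_get}"

check_agent() {
    # $1 = host name or IP, $2 = agent port
    "$ZABBIX_GET" -s "$1" -p "$2" -k agent.ping
}
```

    For example, "check_agent NODE 10050" queries the per-node agent and "check_agent MY_CLUSTER 11050" queries the cluster agent on whichever node currently holds the service label. A healthy agent answers agent.ping with 1.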

    I then create a template called "Template_MY_CLUSTER" and apply it to the host called "MY_CLUSTER".
    (Typically, it covers the things that are in the cluster resource group, such as application availability and the storage that's been created for the application.)

    Finally, with reference to: https://www.zabbix.com/forum/showthread.php?t=41224
    I then create a Zabbix agent item in a template called "Template_PowerHA" and apply it to each node in the cluster.

    You need to add the following *single line* to /etc/zabbix/zabbix_agentd*.conf (or wherever you put your zabbix agent config file(s))

    # START #

    UserParameter=Cluster_State,CLUSTER_STATE=`/usr/bin/lssrc -ls clstrmgrES | grep "^Current state" | awk ' { print $NF } '` ; if [ "$CLUSTER_STATE" = "" ] ; then echo 0 ; elif [ "$CLUSTER_STATE" = "ST_NOT_CONFIGURED" ] ; then echo 1 ; elif [ "$CLUSTER_STATE" = "ST_INIT" ] ; then echo 2 ; elif [ "$CLUSTER_STATE" = "ST_STABLE" ] ; then echo 3 ; elif [ "$CLUSTER_STATE" = "ST_JOINING" ] ; then echo 4 ; elif [ "$CLUSTER_STATE" = "ST_VOTING" ] ; then echo 5 ; elif [ "$CLUSTER_STATE" = "ST_BARRIER" ] ; then echo 6 ; elif [ "$CLUSTER_STATE" = "ST_CBARRIER" ] ; then echo 7 ; elif [ "$CLUSTER_STATE" = "ST_RP_RUNNING" ] ; then echo 8 ; elif [ "$CLUSTER_STATE" = "ST_RP_FAILED" ] ; then echo 9 ; fi

    # END #
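
    If the one-liner is hard to maintain, the same mapping could live in a small script instead. This is only a sketch of an alternative, not what I actually run: the function names and the script path in the note below are made up.

```shell
#!/bin/sh
# Hypothetical script version of the Cluster_State one-liner above.

cluster_state_code() {
    # Map a clstrmgrES state name to the same numbers as the one-liner.
    case "$1" in
        "")                echo 0 ;;
        ST_NOT_CONFIGURED) echo 1 ;;
        ST_INIT)           echo 2 ;;
        ST_STABLE)         echo 3 ;;
        ST_JOINING)        echo 4 ;;
        ST_VOTING)         echo 5 ;;
        ST_BARRIER)        echo 6 ;;
        ST_CBARRIER)       echo 7 ;;
        ST_RP_RUNNING)     echo 8 ;;
        ST_RP_FAILED)      echo 9 ;;
        # Any other state prints nothing, as in the one-liner.
    esac
}

current_cluster_state() {
    # Last field of the "Current state" line reported by the cluster manager.
    /usr/bin/lssrc -ls clstrmgrES | awk '/^Current state/ { print $NF }'
}

# Only query the cluster manager where lssrc exists (i.e. on an AIX node);
# unlike the one-liner, nothing is printed on a machine without lssrc.
if [ -x /usr/bin/lssrc ]; then
    cluster_state_code "$(current_cluster_state)"
fi
```

    Saved as, say, /etc/zabbix/scripts/cluster_state.sh (an assumed path), the config line would shrink to "UserParameter=Cluster_State,/etc/zabbix/scripts/cluster_state.sh".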


    Hope this helps, but drop me a line if you need any more info...

    Thanks,

    Dave

    • rcremasco
      Junior Member
      • Sep 2013
      • 2

      #3
      thanks..

      ... i will try.

      • dave_t
        Junior Member
        • Apr 2007
        • 28

        #4
        Template_PowerHA

        Here's the template I was talking about...
        Attached Files

        • anageradline888
          Junior Member
          • Sep 2022
          • 3

          #5
          Hi guys,
          I am new to Zabbix monitoring.
          Right now I already use Zabbix to monitor my Windows Hyper-V cluster.
          I followed dafyre's idea: I installed the Zabbix agent on Server 1 and Server 2, and I also added the cluster name as a host in Zabbix.
          But I have an issue with the quorum disk: when the quorum is on Server 1, the triggers on Server 2 show the alert "The trigger is not discovered anymore and will be deleted in xxDay xxHour xxMinutes"

          Is there a best practice for monitoring a Hyper-V cluster in Zabbix?
          Thanks in advance.
