Is there a way to alert when the Zabbix DB server is down

  • jroberson
    Senior Member
    • May 2008
    • 124

    #1

    Is there a way to alert when the Zabbix DB server is down

    I've got a possibly silly question. My Zabbix DB server recently went down, but I didn't get any alert warning me about it. Is there a way for Zabbix to alert me if its DB server goes down? I suspect I already know the answer: if the DB is down, how is Zabbix supposed to know to alert me, eh? If not, any suggestions for a secondary monitoring script or application to do so?

    Thanks in advance.
  • walterheck
    Senior Member
    • Jul 2009
    • 153

    #2
    We have a Zabbix server and two database servers, and each of those servers keeps track of all the others through a fairly simple script that runs as a cron job. If any of the servers goes down, we get spammed by email, Jabber and SMS.

    Here it is (straight from Puppet, so it still has ERB tags in it):

    Code:
    #!/bin/bash
    # Script to health check zabbix and its database
    # Author: [email protected]
    # License: Tribily Internal
    
    # Define required variables
    #
    LOG_TIME=`date +%d-%b-%Y-%H:%M:%S`
    LOG="/var/log/zabbix-server/zabbix_alive.log"
    ZBX_SERVER="<%= scope.lookupvar("zabbix_server_fqdn") %>"
    DB_SERVER1="<%= scope.lookupvar("localdbserverhostname") %>"
    DB_SERVER2="<%= scope.lookupvar("localdbslaveserverhostname") %>"
    ZBX_PROCESS=`ps aux | grep "sbin\/zabbix_server" | grep -v grep | awk '{print $NF}' | sort | uniq | cut -f4 -d"/"`
    DB_PROCESS=`ps aux | grep "sbin\/mysqld" | awk '{print $11}' | cut -f4 -d"/"`
    ZBX_LOG="/var/log/zabbix-server/zabbix_server.log"
    DB_LOG="/var/log/mysql/mariadb-error.log"
    JABBER="/etc/zabbix/alertscripts/jabberes_alert.pl"
    SMS="/etc/zabbix/alertscripts/cli_sms_clickatel.php"
    PERL=`which perl`
    PHP=`which php`
    
    # Checks run from Zabbix Server Box
    #
    if [ `hostname` == "$ZBX_SERVER" ]
    then
    
      # Check process table on server for running Zabbix server
      #
      if [ "$ZBX_PROCESS" == "zabbix_server" ]
      then
              echo "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is Alive on port 10051 from `hostname -f`" >> $LOG
      else
              echo "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is unreachable on port 10051 from `hostname -f`" >> $LOG
              tail -n15 $ZBX_LOG  | mail -s "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is unreachable on port 10051 from `hostname -f`" [email protected] [email protected]
        $PERL $JABBER [email protected] "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is unreachable on port 10051 from `hostname -f`"
        $PERL $JABBER [email protected] "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is unreachable on port 10051 from `hostname -f`"
    #    $PHP $SMS +1234567890 "$LOG_TIME : Zabbix Server has Crashed"
      fi
      
      # Check if Zabbix Database server03 is listening to connections
      #
      if netcat -z "${DB_SERVER1}" 3306
      then
        echo "$LOG_TIME : Database Server $DB_SERVER1 is accepting connections" >> $LOG
      else
        echo "$LOG_TIME : Database Server $DB_SERVER1 has stopped accepting connections"  >> $LOG
        echo "For Errors please check $DB_LOG on $DB_SERVER1" | mail -s "$LOG_TIME : Database Server has stopped accepting connections" [email protected] [email protected]
        $PERL $JABBER [email protected] "$LOG_TIME : Database Server $DB_SERVER1 has Crashed"
        $PERL $JABBER [email protected] "$LOG_TIME : Database Server $DB_SERVER1 has Crashed"
    #    $PHP $SMS +1234567890 "$LOG_TIME : Database Server $DB_SERVER1 has Crashed"
      fi 
    
      # Check if Zabbix Database server02 is listening to connections
      #
      if netcat -z "${DB_SERVER2}" 3306
      then
        echo "$LOG_TIME : Database Server $DB_SERVER2 is accepting connections" >> $LOG
      else
        echo "$LOG_TIME : Database Server $DB_SERVER2 has stopped accepting connections"  >> $LOG
        echo "For Errors please check $DB_LOG on $DB_SERVER2" | mail -s "$LOG_TIME : Database Server has stopped accepting connections" [email protected] [email protected]
        $PERL $JABBER [email protected] "$LOG_TIME : Database Server $DB_SERVER2 has Crashed"
        $PERL $JABBER [email protected] "$LOG_TIME : Database Server $DB_SERVER2 has Crashed"
    #    $PHP $SMS +1234567890 "$LOG_TIME : Database Server $DB_SERVER2 has Crashed"
      fi 
    
    elif [ `hostname -s` == "$DB_SERVER1" ]
    then
    
      # Check process table on server for running Database server
            #       
            if [ "$DB_PROCESS" == "mysqld" ]
            then    
                    echo "$LOG_TIME : Database Server $DB_SERVER1 is Alive" >> $LOG 
            else    
                    echo "$LOG_TIME : Database Server $DB_SERVER1 has Crashed" >> $LOG 
                    tail -n15 $DB_LOG | mail -s "$LOG_TIME : Database Server $DB_SERVER1 has Crashed" [email protected] [email protected]
        $PERL $JABBER [email protected] "$LOG_TIME : Database Server $DB_SERVER1 has Crashed"
        $PERL $JABBER [email protected] "$LOG_TIME : Database Server $DB_SERVER1 has Crashed"
    #    $PHP $SMS +1234567890 "$LOG_TIME : Database Server $DB_SERVER1 has Crashed"
            fi      
    
      # Check if Zabbix Server is listening to connections
            #
            if netcat -z "${ZBX_SERVER}" 10051
            then
                    echo "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is reachable on port 10051 from `hostname -f`" >> $LOG
            else
                    echo "$LOG_TIME : Zabbix Server ${ZBX_SERVER} has stopped accepting connections (checked from `hostname -f`)"  >> $LOG
                    echo "For Errors please check $ZBX_LOG on $ZBX_SERVER" | mail -s "$LOG_TIME : Zabbix Server has stopped accepting connections" [email protected] [email protected]
        $PERL $JABBER [email protected] "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is unreachable on port 10051 from `hostname -f`"
        $PERL $JABBER [email protected] "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is unreachable on port 10051 from `hostname -f`"
    #    $PHP $SMS +1234567890 "$LOG_TIME : Zabbix Server has Crashed"
            fi
    
    
    elif [ `hostname -s` == "$DB_SERVER2" ]
    then
    
      # Check process table on server for running Database server
            #       
            if [ "$DB_PROCESS" == "mysqld" ]
            then    
                    echo "$LOG_TIME : Database Server $DB_SERVER2 is Alive" >> $LOG 
            else    
                    echo "$LOG_TIME : Database Server $DB_SERVER2 has Crashed" >> $LOG 
                    tail -n15 $DB_LOG | mail -s "$LOG_TIME : Database Server $DB_SERVER2 has Crashed" [email protected] [email protected]
        $PERL $JABBER [email protected] "$LOG_TIME : Database Server $DB_SERVER2 has Crashed"
        $PERL $JABBER [email protected] "$LOG_TIME : Database Server $DB_SERVER2 has Crashed"
    #    $PHP $SMS +1234567890 "$LOG_TIME : Database Server $DB_SERVER2 has Crashed"
            fi      
    
      # Check if Zabbix Server is listening to connections
            #
            if netcat -z "${ZBX_SERVER}" 10051
            then
                    echo "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is reachable on port 10051 from `hostname -f`" >> $LOG
            else
                    echo "$LOG_TIME : Zabbix Server ${ZBX_SERVER} has stopped accepting connections (checked from `hostname -f`)"  >> $LOG
                    echo "For Errors please check $ZBX_LOG on $ZBX_SERVER" | mail -s "$LOG_TIME : Zabbix Server has stopped accepting connections" [email protected] [email protected]
        $PERL $JABBER [email protected] "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is unreachable on port 10051 from `hostname -f`"
        $PERL $JABBER [email protected] "$LOG_TIME : Zabbix Server ${ZBX_SERVER} is unreachable on port 10051 from `hostname -f`"
    #    $PHP $SMS +1234567890 "$LOG_TIME : Zabbix Server has Crashed"
            fi
    
    else
      echo "$LOG_TIME : Script not meant to run on machine `hostname`" > $LOG
      cat $LOG | mail -s "$LOG_TIME : Script not meant to run on machine `hostname`" [email protected] [email protected]
    fi
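
The script repeats the same "probe a port, log the result, alert on failure" step for each server, so it could probably be folded into one helper. What follows is only a rough sketch, not the script we actually run: it assumes an `nc` binary that supports `-z` and `-w` (both traditional and OpenBSD netcat do), and `port_open`, `check_port`, `ALERT_CMD`, the default log path and the cron path are made-up names for illustration.

```shell
#!/bin/bash
# Sketch only: one helper for the repeated port-check-and-alert step.
# LOG default, ALERT_CMD, port_open and check_port are hypothetical names.

LOG="${LOG:-/tmp/zabbix_alive.log}"   # assumed log location
ALERT_CMD="${ALERT_CMD:-echo}"        # stand-in for mail / Jabber / SMS senders

# Succeeds if host:port accepts a TCP connection within 5 seconds.
port_open() {
  nc -z -w 5 "$1" "$2" >/dev/null 2>&1
}

# check_port LABEL HOST PORT
# Logs the result; on failure also calls ALERT_CMD and returns 1.
check_port() {
  local label="$1" host="$2" port="$3"
  local now msg
  now=$(date +%d-%b-%Y-%H:%M:%S)
  if port_open "$host" "$port"; then
    echo "$now : $label $host is accepting connections on port $port" >> "$LOG"
  else
    msg="$now : $label $host is unreachable on port $port from $HOSTNAME"
    echo "$msg" >> "$LOG"
    $ALERT_CMD "$msg"
    return 1
  fi
}

# Example calls (hostnames are placeholders):
#   check_port "Zabbix Server"   zabbix.example.com 10051
#   check_port "Database Server" db1.example.com    3306
#
# Example crontab entry, every 5 minutes (path is hypothetical):
#   */5 * * * * /usr/local/bin/zabbix_alive.sh
```

Each box would then just list the peers it should watch, and the alert transport can be swapped by setting `ALERT_CMD` to whatever sender script is installed. A port probe only shows that something is listening, so the local process-table check is still worth keeping alongside it.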
    Free and Open Source Zabbix Templates Repository | Hosted Zabbix @ Tribily (http://tribily.com)
