Agent alerts howto

  • bluelinq
    Member
    • Feb 2008
    • 44

    #1

    Agent alerts howto

    I have several servers, each with an agent reporting to the Zabbix server. The servers are behind private LANs, so the Zabbix server cannot contact them directly. This works fine and I get all the stats we want. However, how do you enable alerts in this case?

    My issue is how to define in the Zabbix server that if a server's agent has not reported in for a given number of minutes, the server should be assumed down and an alert sent. Or perhaps there is a different way to do it?

    Is anyone doing something like this?

    Just as a thought, it would be incredible if the agent could create something like an ssh "tunnel" through an agent-initiated conversation with the Zabbix server. That way you could fake bidirectional communication.

    Regards,

    Paul
  • globifrosch
    Member
    • Sep 2005
    • 74

    #2
    Originally posted by bluelinq
    I have several servers, each with an agent reporting to the Zabbix server. The servers are behind private LANs, so the Zabbix server cannot contact them directly. This works fine and I get all the stats we want. However, how do you enable alerts in this case?
    I use only active checks and created an item for agent.ping with the trigger {hostname:agent.ping.nodata(<sec>)}=1

    All other triggers depend on the agent.ping trigger.
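
To make the idea behind the nodata trigger concrete, here is a minimal Python sketch of the check it performs. This is only a model for illustration, not Zabbix code: the function name `nodata_fired` and its parameters are hypothetical, and the exact boundary behaviour (strictly greater versus greater-or-equal) is an assumption.

```python
from typing import Optional


def nodata_fired(last_value_ts: Optional[float], now_ts: float, window_sec: float) -> int:
    """Hypothetical model of a nodata-style check.

    Returns 1 if no value has been received within the last `window_sec`
    seconds (the trigger would be ON), otherwise 0.
    """
    if last_value_ts is None:
        # No value has ever been received from the agent.
        return 1
    # Assumption: the check fires only once the silence exceeds the window.
    return 1 if (now_ts - last_value_ts) > window_sec else 0


if __name__ == "__main__":
    # Last value arrived 30 s ago, window is 65 s: the agent counts as alive.
    print(nodata_fired(last_value_ts=100.0, now_ts=130.0, window_sec=65))
    # Last value arrived 100 s ago, window is 65 s: the agent counts as silent.
    print(nodata_fired(last_value_ts=100.0, now_ts=200.0, window_sec=65))
```

The point of the sketch is that the server never has to reach the agent: it only compares the timestamp of the last value the agent pushed against the current time.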

    • bluelinq
      Member
      • Feb 2008
      • 44

      #3
      globifrosch

      Thanks for the suggestion. Does using {hostname:agent.ping.nodata(<sec>)}=1 mean that you are pinging the server every second? Also, what effect does this have on overall DB size? I have been using Zabbix for less than a month and the DB is growing very fast, so I need to make sure any changes we make don't make the beast mad :-)

      Regards,

      Paul

      • globifrosch
        Member
        • Sep 2005
        • 74

        #4
        Originally posted by bluelinq
        Does this mean that by using {hostname:agent.ping.nodata(<sec>)}=1 you are pinging the server every second?
        No. In Zabbix you have hosts, items, triggers, and then actions. Items gather data from the host, and triggers evaluate the data from the items. So you define the interval in the item (the one with key agent.ping); the <sec> in nodata() is only the time window the trigger checks. In the "Actions" tab of the Zabbix web interface you define which trigger events you want to be notified about.
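
As a rough illustration of how these pieces relate, the following Python sketch models an item, a trigger and an action as plain data classes. All class and field names here are hypothetical and chosen for readability; they are not the Zabbix schema or API. The 30-second interval is an assumed example value.

```python
from dataclasses import dataclass


@dataclass
class Item:
    key: str           # e.g. "agent.ping"
    interval_sec: int  # how often the agent reports a value


@dataclass
class Trigger:
    name: str
    item: Item
    nodata_sec: int    # silence window checked by the trigger, not a polling rate
    severity: str


@dataclass
class Action:
    name: str
    severity: str      # condition: only triggers with this severity match

    def matches(self, trigger: Trigger) -> bool:
        # Assumption: the action has a single "Trigger Severity = X" condition.
        return trigger.severity == self.severity


ping = Item(key="agent.ping", interval_sec=30)
ping_trigger = Trigger(name="Agent Ping", item=ping, nodata_sec=65, severity="High")
notify = Action(name="Host Connection Lost", severity="High")

if __name__ == "__main__":
    print(notify.matches(ping_trigger))
```

The reporting frequency belongs to the item, while the number inside nodata() belongs to the trigger, so changing the trigger window does not change how much data is stored.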

        - Thomas

        • bluelinq
          Member
          • Feb 2008
          • 44

          #5
          globifrosch,

          OK, I set it up like this. I created a trigger per server (can it be done in a template?) with the expression {MYSERVERNAME:agent.ping.nodata(45)}=1 and set the severity to High.

          For the action I did this:

          Name: Host Connection Lost
          Event Source: Triggers
          Condition: Trigger Severity = High
          Sent email to user:

          Subject:
          {HOSTNAME}/{TRIGGER.NAME}: {STATUS}

          Body:
          {HOSTNAME}/{IPADDRESS}:
          Date occurred at {DATE}
          Time occurred: {TIME}
          Severety: {TRIGGER.SEVERITY}
          Executed by: {TRIGGER.NAME}
          Status: {TRIGGER.STATUS}

          Here is the issue: I randomly get alerts about different servers. I have adjusted the nodata value from 10 seconds up to 60 now, and it still happens. It happens less often than when I had it at 30, but I still see the problem. Now, this is the kicker:
          see the emails below. The two events are only one second apart, and it is the same with other values. The gap can be 1 or 30 seconds, etc., but those should not be triggering the action. Any ideas?

          Regards,

          Paul


          Messages from the same trigger/action on one server are below.

          From: zabbix
          Sent: Monday, March 24, 2008 8:25 PM
          To: Paul Aviles
          Subject: SERVER5/Agent Ping SERVER5: ON

          SERVER5/10.0.21.52:
          Date occurred at 2008.03.24
          Time occurred: 20:24:24
          Severety: High
          Executed by: Agent Ping SERVER5
          Status: ON

          From: zabbix
          Sent: Monday, March 24, 2008 8:25 PM
          To: Paul Aviles
          Subject: SERVER5/Agent Ping SERVER5: OFF

          SERVER5/10.0.21.52:
          Date occurred at 2008.03.24
          Time occurred: 20:24:25
          Severety: High
          Executed by: Agent Ping SERVER5
          Status: OFF

          • globifrosch
            Member
            • Sep 2005
            • 74

            #6
            Originally posted by bluelinq

            OK, I set it up like this. I created a trigger per server (can it be done in a template?) with the expression {MYSERVERNAME:agent.ping.nodata(45)}=1 and set the severity to High.
            Hmm, I use it like this:

            - Item: every 30 seconds
            - Trigger: 65 seconds

            That way one lost update or some real network delay is tolerated. I've never had problems with this configuration.

            And yes, this can be done in a template. All items/triggers can be in a template and linked to a host.
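
The arithmetic behind picking a window a bit larger than two intervals can be sketched in Python. This is a simplified model under stated assumptions: values arrive exactly every `interval_sec` seconds unless lost, the trigger fires when the silence exceeds the window, and the function names are hypothetical. The 30-second interval used for the 45-second and 10-second cases is an assumption, since the thread does not state the item interval in that setup.

```python
def tolerated_losses(interval_sec: int, window_sec: int) -> int:
    """Consecutive lost updates the window absorbs before the trigger fires.

    A result of -1 means the window is shorter than a single interval, so the
    trigger can fire even when no update is lost.
    """
    return window_sec // interval_sec - 1


def trigger_fires(interval_sec: int, window_sec: int, lost_consecutive: int = 0) -> bool:
    """True if the gap left by `lost_consecutive` missing updates exceeds the window."""
    longest_gap = interval_sec * (lost_consecutive + 1)
    return longest_gap > window_sec


if __name__ == "__main__":
    # Item every 30 s, trigger window 65 s: one lost update is absorbed.
    print(tolerated_losses(30, 65), trigger_fires(30, 65, lost_consecutive=1))
    # Assumed 30 s interval with a 45 s window: a single lost update fires it.
    print(tolerated_losses(30, 45), trigger_fires(30, 45, lost_consecutive=1))
    # Assumed 30 s interval with a 10 s window: fires between normal updates.
    print(tolerated_losses(30, 10), trigger_fires(30, 10, lost_consecutive=0))
```

Under these assumptions, a window shorter than or close to the item interval would switch ON and then OFF again as soon as the next value arrives, which is one possible reading of ON/OFF pairs that are only seconds apart.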
