Emails being sent for unknown reason

  • GeneBean
    Junior Member
    • Jan 2012
    • 21

    #1

    Something is firing an action when it should not. The triggers that are supposedly going off match an action that sends email, so email is going out that should not be. The thing is, these triggers are not actually tripping. For example, I got a message saying a host had just been rebooted when it had not. The content of that email is below:
    Code:
    This message was triggered by "Boomhauer has just been restarted"
    The trigger currently has a status of OK
    It's severity is listed as Information
    If defined, this link will let you see more info about this alert:
    
    The last value Zabbix recorded related to this trigger is: 27 days, 00:19:45
    
    This is the expression used to create this trigger: {Boomhauer:system.uptime.last(0)}<10m
    I am also getting messages like this when the host is in maintenance. The action is set as follows:
    Code:
    Type of calculation
     (A) and (B) and (C)
    Conditions
    Label	Name	Action
    (A)	Maintenance status not in "maintenance" 	
    (B)	Trigger value = "PROBLEM" 	
    (C)	Host group = "Backups"
    I have also verified that this is the action that is sending the messages. Monitoring -> Events shows the following related to this:
    Code:
    "Time","Host","Description","Status","Severity","Duration","Ack","Actions"
    "Feb 15th, 2014 10:59:38 AM","Boomhauer","Boomhauer has just been restarted","OK","Information","2h 53m 34s","No"," - "
    "Feb 15th, 2014 10:49:38 AM","Boomhauer","Boomhauer has just been restarted","UNKNOWN","Information","10m","No"," - "
    Any idea how to troubleshoot or fix this? The only thing more disconcerting than no monitoring is false alarms, so any help would be greatly appreciated.
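To make the expectation concrete, here is a minimal Python sketch of the action's conditions, (A) and (B) and (C), applied to the two events listed above. It is only a model of the logic as written in the action, not how Zabbix evaluates actions internally. The `Event` class and `action_matches` function are hypothetical names, and the sketch assumes Boomhauer is in the "Backups" group and was not in maintenance at the time.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: str
    host_group: str       # assumed: Boomhauer is in "Backups"
    trigger_value: str    # "PROBLEM", "OK" or "UNKNOWN"
    in_maintenance: bool  # assumed False for both events


def action_matches(event: Event) -> bool:
    """Model of the action: (A) and (B) and (C)."""
    cond_a = not event.in_maintenance            # (A) Maintenance status not in "maintenance"
    cond_b = event.trigger_value == "PROBLEM"    # (B) Trigger value = "PROBLEM"
    cond_c = event.host_group == "Backups"       # (C) Host group = "Backups"
    return cond_a and cond_b and cond_c


# The two events from Monitoring -> Events, oldest first.
events = [
    Event("10:49:38", "Backups", "UNKNOWN", False),
    Event("10:59:38", "Backups", "OK", False),
]

results = [(e.time, e.trigger_value, action_matches(e)) for e in events]
for row in results:
    print(row)
```

Under these assumptions neither event satisfies condition (B), so read literally the conditions would not match either of them.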
  • aib
    Senior Member
    • Jan 2014
    • 1615

    #2
    I cannot see anything wrong in this trigger/action sequence.

    It went like this:
    - Your host was restarted:
    Code:
    "Feb 15th, 2014 10:49:38 AM","Boomhauer","Boomhauer has just been restarted","UNKNOWN","Information","10m","No"," - "
    but because no action is associated with the UNKNOWN state, you did not get any email or message.

    - Then your host ran for more than 10 minutes and the trigger switched back to the OK state from the PROBLEM state:
    Code:
    "Feb 15th, 2014 10:59:38 AM","Boomhauer","Boomhauer has just been restarted","OK","Information","2h 53m 34s","No"," - "
    Because of that you got this message:
    Code:
    This message was triggered by "Boomhauer has just been restarted"
    [B]The trigger currently has a status of OK[/B]
    It's severity is listed as Information
    If defined, this link will let you see more info about this alert:
    
    The last value Zabbix recorded related to this trigger is: 27 days, 00:19:45
    
    This is the expression used to create this trigger: {Boomhauer:system.uptime.last(0)}<10m
    Notice the line "The trigger currently has a status of OK": it is informing you that your host is OK now and the problem has gone.
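As a quick check of the status line, the trigger expression `{Boomhauer:system.uptime.last(0)}<10m` can be evaluated by hand against the value quoted in the email. The Python sketch below is illustrative only: `parse_uptime` and `expression_is_true` are hypothetical helpers, and it assumes `10m` means 600 seconds and that the uptime string always has the form shown in the message.

```python
import re

THRESHOLD_SECONDS = 10 * 60  # "10m" in the trigger expression


def parse_uptime(text: str) -> int:
    """Convert a string such as '27 days, 00:19:45' into seconds."""
    match = re.fullmatch(r"(?:(\d+) days?, )?(\d+):(\d{2}):(\d{2})", text.strip())
    if match is None:
        raise ValueError(f"unrecognised uptime format: {text!r}")
    days = int(match.group(1) or 0)
    hours, minutes, seconds = (int(match.group(i)) for i in (2, 3, 4))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def expression_is_true(last_uptime_seconds: int) -> bool:
    """Model of system.uptime.last(0) < 10m; True corresponds to PROBLEM."""
    return last_uptime_seconds < THRESHOLD_SECONDS


uptime = parse_uptime("27 days, 00:19:45")
print(uptime)                      # 2333985
print(expression_is_true(uptime))  # False
```

With the value from the email the expression is false, which matches the OK status reported in the message.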

    Is that correct, or am I wrong?
    Sincerely yours,
    Aleksey


    • GeneBean
      Junior Member
      • Jan 2012
      • 21

      #3
      I think there is more to it than that, or something is messing with the states. This is not my only problem host; it is just the worst. I am linking to some exports from it in case they help. I have replaced my IP and FQDN in them, but I can assure you they were real at the time of export.
      Export of templates attached to this host: http://pastebin.com/Mbk6eDLs
      Export of this host: http://pastebin.com/MjRXTgQr

      Thank you very much for the help!
