emails are being delayed

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • MikesH
    Junior Member
    • Jul 2016
    • 2

    #1

    Hello Forum,

    I have one problem with our Zabbix at work (very small: ~70 hosts; required server performance, new values per second: 254.71). I will do my best to describe it. :]

    When a trigger fires, Zabbix starts sending emails. We have 5 admins, each configured to receive emails on both a private and a company address, so that's 10 emails. I think the correct behaviour is that all emails are sent "in one moment" (within 1 sec). But our Zabbix sends approximately one email per 10 seconds, so the last admin receives the email after about 100 sec. Then the trigger's OK status is sent as well, so that's another 100 sec. And if more than one device is unreachable, it's twice that much. I think you know where this goes. :]

    Can you please give me a hint where I should look to change this behaviour? Zabbix is working correctly otherwise. Is there any variable I can change for this? Or is Zabbix too busy to handle a few mails?

    Thanks for the help
    Best Regards Martin
  • Linwood
    Senior Member
    • Dec 2013
    • 398

    #2
    The emails should all have headers (depending on your client you may need to work to look at them).

    The first thing I would do is look at one of the first, and one of the last to receive the mail, and look at all the "received" headers, and make sure it is really zabbix that is the delay. Look in particular at the very first (bottom most) received-from, and find the zabbix server's handoff to the first relay, and see if the times really are different (as opposed to people's mail being delayed in transit). I think you can also look in the gui for zabbix's time it thinks it sent the mail.
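
    A rough sketch of that header comparison, in case it helps. It assumes you have saved the raw message source to a string; the host names, addresses and timestamps in SAMPLE are invented for illustration, and real "Received" headers can be messier than this.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def received_times(raw_message):
    """Return the timestamps of all Received headers, oldest hop first."""
    msg = message_from_string(raw_message)
    times = []
    for header in msg.get_all("Received", []):
        # The timestamp is whatever follows the last ';' in the header.
        _, _, stamp = header.rpartition(";")
        times.append(parsedate_to_datetime(stamp.strip()))
    # Each relay prepends its header, so reverse to get the oldest hop first.
    return list(reversed(times))


def hop_delays(raw_message):
    """Seconds elapsed between each consecutive pair of hops."""
    times = received_times(raw_message)
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Hypothetical message: hosts and times are made up for this example.
SAMPLE = """\
Received: from relay.example.com (relay.example.com [192.0.2.10])
\tby mx.example.org with ESMTP id 2;
\tTue, 12 Jul 2016 10:00:07 +0200
Received: from zabbix.example.com (zabbix.example.com [192.0.2.5])
\tby relay.example.com with ESMTP id 1;
\tTue, 12 Jul 2016 10:00:02 +0200
Subject: PROBLEM: host unreachable

body
"""

print(hop_delays(SAMPLE))
```

    Run it on the first and the last mail of one alert: if the bottom-most timestamps already differ by ~10 sec per recipient, the delay happens before or at the handoff from the Zabbix server, not in transit.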

    As to whether zabbix is too busy, how busy is your server itself, e.g. how much idle time on average? 254 vps is a fair amount for 70 hosts (I guess it depends on the hosts of course).

    Not sure which process type sends mail -- escalators? If you can confirm that you might check how many are started in zabbix-server.conf and start a few more and see if it helps.

    • bdbell
      Junior Member
      • Jul 2016
      • 2

      #3
      You don't say what platform you are running on, but the first place I would look is the mail logs, if the system has them (for example, /var/log/maillog on a Linux server). Sending one email precisely every 10 seconds seems more like an issue with the mail handler than Zabbix.

      The only other possibility is that the action is configured to act on a 10 second cycle, and that is what is causing the unusual behaviour. But I would consider that very unlikely.
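
      For the mail log check, something like the sketch below could pull out the per-recipient "delay=" value that Postfix normally writes on its delivery lines. The log lines in SAMPLE_LOG are invented and only follow the usual Postfix syslog layout as I understand it; the regular expression is an assumption and may need adjusting for your setup.

```python
import re

# Assumed layout of a Postfix delivery line:
#   ... postfix/smtp[PID]: QUEUEID: to=<addr>, relay=..., delay=SECONDS, ...
DELIVERY = re.compile(
    r"postfix/\w+\[\d+\]: (?P<qid>[0-9A-Za-z]+): to=<(?P<to>[^>]*)>.*?"
    r"\bdelay=(?P<delay>[\d.]+)"
)


def delivery_delays(lines):
    """Return (recipient, seconds spent inside Postfix) for each delivery line."""
    results = []
    for line in lines:
        match = DELIVERY.search(line)
        if match:
            results.append((match.group("to"), float(match.group("delay"))))
    return results


# Hypothetical log excerpt, invented for illustration only.
SAMPLE_LOG = [
    "Jul 12 10:00:02 mon postfix/smtp[2101]: 3F2A1C0: to=<admin1@example.com>, "
    "relay=relay.example.com[192.0.2.10]:25, delay=0.31, "
    "delays=0.05/0.01/0.1/0.15, dsn=2.0.0, status=sent (250 ok)",
    "Jul 12 10:00:12 mon postfix/smtp[2101]: 3F2A1D4: to=<admin2@example.com>, "
    "relay=relay.example.com[192.0.2.10]:25, delay=0.28, "
    "delays=0.04/0.01/0.1/0.13, dsn=2.0.0, status=sent (250 ok)",
    "Jul 12 10:00:12 mon postfix/qmgr[900]: 3F2A1D4: removed",
]

for recipient, delay in delivery_delays(SAMPLE_LOG):
    print(recipient, delay)
```

      If the delays are well under a second while the log timestamps are ~10 sec apart, Postfix is passing the mail on quickly and the gap is between submissions from the sender.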

      • MikesH
        Junior Member
        • Jul 2016
        • 2

        #4
        Hello,

        Sorry for the late response, I'm very busy this week.

        Our Zabbix is running in VMware on Debian, and we are using Postfix for mail. 8 GB RAM, 4 processors (E5-2420), on an HP 360 Gen 8, so it's quite new.

        I checked the headers in the mail and compared them with the trigger time, and the delay is already there. Let me describe what I observed in more detail. When a trigger fires, I can see on the dashboard, under "Last issues" in the "Actions" column, which address the mail was sent to, the time, and the status. The delay is already visible there. After one mail is sent, Zabbix waits about 10 sec before the next mail's status changes to sent, and the mails are received at those intervals. On the other hand, when I checked the events for the specific device, the time in the action is the same for each mail.

        Sometimes on the dashboard I can see that the housekeeper and alerter processes are more than 75% busy (every time it's 100%). Could this have anything to do with it?

        I checked mail.log and the Zabbix config and didn't find anything unusual.

        Thanks for the help and answers, and sorry for my English :]

        • Linwood
          Senior Member
          • Dec 2013
          • 398

          #5
          Hopefully someone with more of a clue will chime in but...

          Housekeeper being 100% busy is not all that unusual, and may or may not be a real issue. You can search around here and find lots of info on how to address, notably in postgresql using triggers to partition data so you can drop whole storage areas and not have it delete individual rows. I also found (though this will vary a lot depending on system and load) that changing so it does not limit the number of items deleted actually helped me, it was more efficient doing (say) 30,000 at a time than 3000. What it's doing is cleaning up either expired data or orphaned references; if you (as I did) have a lot of change in your items as you are getting zabbix stable, this can lead to even more cleanup needed.

          It is not good, but it does not seem to normally cause issues on modest systems because it is single threaded -- there is only one housekeeper running as best I can tell. So while it may never seem to finish, it also does not (usually) swamp the system. You might look at the unix side and see if it is swamping access to the disk or CPUs.

          As to email...I would be concerned if your alerter process is 100% busy. That seems rather unusual. You might consider getting either a tcp trace (e.g. tcpdump or wireshark) or turn on logging to get a transcript of the sessions with postfix. I do not know, but maybe sending email is single threaded and there is a delay in that area. For example, if postfix is trying to do rDNS lookups and timing out (not even sure if that is a possibility, just saying check if it is the delay cause). If (and I do not know) Zabbix can only do one session with postfix at a time, postfix could be the issue? You can also turn on debugging for the zabbix alerter specifically (look at zabbix_server's runtime control parameter to increase logging, that way you can get details on just one process without being flooded with debug data as you are if you increase it in the config file).
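
          To rule the rDNS idea in or out, a quick timing check along these lines might do. This is only a sketch: it times a reverse lookup from wherever you run it, which is not necessarily what Postfix does internally, and the address you pass in (192.0.2.5 below is a placeholder) should be the one Postfix sees the Zabbix connection coming from.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, seconds taken) for a reverse DNS lookup.

    `resolver` is injectable so the function can be exercised without a
    network; by default it uses the system resolver.
    """
    start = time.monotonic()
    try:
        name = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        name = None
    return name, time.monotonic() - start
```

          Run on the Postfix host, `time_reverse_lookup("192.0.2.5")` taking several seconds (or returning None after a long wait) would make a resolver timeout a plausible suspect; a near-instant answer would point elsewhere.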
