Daily Wakeup Call - 1:07am

  • aethos
    Junior Member
    • Apr 2012
    • 4

    #1

    I have a strange issue that I have been unable to isolate. I have the standard "system unreachable" trigger running for agent.ping, though I've bumped it up to 4 minutes. Every morning at 1:07am I get a notification from Zabbix that every one of my systems has been unreachable for 4 minutes, followed immediately by an "ok" message. There are no errors in the server log, and when I graph the latest data for agent.ping, the systems are not actually recorded as offline. The data shows they are up, but the triggers all fire anyway.

    The trigger in question is:
    {Template_Zabbix_Agent:agent.ping.nodata(4m)}=1
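
For anyone following along, here is a rough sketch of what a nodata-style check boils down to. It is a simplified model, not Zabbix's actual implementation: the function name, the `last_seen` argument, and the timeline values are all made up for illustration, and it assumes the check only looks at when the server last saw a value for the item.

```python
from typing import Optional


def nodata(last_seen: Optional[float], now: float, window_s: float) -> int:
    """Simplified stand-in for a nodata(window) check, not Zabbix's own code.

    Returns 1 when no value has been seen within the last window_s seconds.
    """
    if last_seen is None:
        return 1  # nothing has ever been seen for this item
    return 1 if now - last_seen > window_s else 0


WINDOW_S = 4 * 60  # the 4m from the trigger expression

# Hypothetical timeline in seconds: the last value was seen at t=0,
# then nothing is seen until t=255.
quiet = nodata(last_seen=0, now=250, window_s=WINDOW_S)
recovered = nodata(last_seen=255, now=260, window_s=WINDOW_S)
print(quiet, recovered)
```

Under that assumption, a short silence on the server side is enough to flip the check to 1 and then straight back to 0, which would look like a "problem" followed by an immediate "ok".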

    Anyone have any suggestions? Much appreciated.

    James
  • jan.garaj
    Senior Member
    Zabbix Certified Specialist
    • Jan 2010
    • 506

    #2
    Is Zabbix housekeeper running at 1:07am?
    Check which Zabbix processes are busy at 1:07am in your Zabbix performance graph.
    Devops Monitoring Expert advice: Dockerize/automate/monitor all the things.
    My DevOps stack: Docker / Kubernetes / Mesos / ECS / Terraform / Elasticsearch / Zabbix / Grafana / Puppet / Ansible / Vagrant


    • aethos
      Junior Member
      • Apr 2012
      • 4

      #3
      It is. I looked into it a bit, wondering if that could be part of it. It runs at 1:03 and finishes at approximately 1:07. However, it also runs every hour, yet I only get these alerts at 1:07.

      On the performance graph I don't see anything too special. There was a small spike at 1am today, but there are larger spikes during the day. In fact, it looks like every time the housekeeper runs there's a small jump in the Zabbix queue.

      Perhaps I should just try bumping it up to 5 minutes? I'm at a loss for other ideas.


      • jan.garaj
        Senior Member
        Zabbix Certified Specialist
        • Jan 2010
        • 506

        #4
        You can change it to 5 minutes, but that will only be a quick workaround. If your database gets larger, you will hit this issue again.

        It's a common issue when the Zabbix DB is huge. You need to optimize the performance of your DB (partitioning, caches, ...). Search the forum for the keyword "housekeeper" for advice.
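
To put rough numbers on the 4 vs 5 minute question, here is a back-of-the-envelope sketch. It assumes that value processing stalls for as long as the housekeeper run lasts (about 1:03 to 1:07 in this thread) and that agent.ping is collected every 60 seconds; both are assumptions for illustration, and the helper names are hypothetical.

```python
def worst_case_gap(stall_s: int, interval_s: int) -> int:
    """Longest silence between two processed values, in seconds.

    Worst case: the last value landed one full interval before the stall
    began, and the next one is only processed when the stall ends.
    """
    return stall_s + interval_s


def trigger_fires(gap_s: int, window_s: int) -> bool:
    """True if a nodata-style window is exceeded by the silence."""
    return gap_s > window_s


INTERVAL_S = 60  # assumed agent.ping collection interval

# Stall of about 4 minutes, as reported for the 1:03 to 1:07 run.
gap = worst_case_gap(stall_s=4 * 60, interval_s=INTERVAL_S)
print(trigger_fires(gap, window_s=4 * 60))  # 4 minute window
print(trigger_fires(gap, window_s=5 * 60))  # 5 minute window

# If the stall grows to 5 minutes as the database gets larger:
longer_gap = worst_case_gap(stall_s=5 * 60, interval_s=INTERVAL_S)
print(trigger_fires(longer_gap, window_s=5 * 60))
```

With these assumed numbers the 5 minute window has no margin left, which is why raising the threshold only postpones the problem until the housekeeper run gets a little longer.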
