Monitored host maintenance mode patch (zabbix-1.6)

  • bcheese
    Junior Member
    • Jun 2006
    • 26

    #16
    Originally posted by Aly
    I'm afraid you should agree. Your patch has no maintenance plans yet, but once they exist it will be hard to tell whether a host is in maintenance mode or not. While we can select hosts in maintenance by timeline, switching a boolean field (as someone suggested) will be a problem: at what moment will it be switched, and by which process?
    Aly,

    I admit that there are no maintenance plans yet. This patch was first developed to give me a crucial capability that our monitoring at work needed. I am now consulting with the community to take it to the level where it would suit 99% of Zabbix installations. I already have a design concept in my head, based on discussions in this forum, which I am slowly building on my dev copy of Zabbix-1.6.

    I feel it is quite simple to determine whether a host is in maint mode or not using some simple boolean logic. This concept is a lot simpler than the systems I design in my employ. If this was the hardest thing I needed to do at work I would be a much happier person. I have included a crude, text-based, high-level flow chart to assist in understanding the process used to determine a host's maintenance status.

    1. Retrieve the host's maintenance flag from the DB (hosts.maintenance). Is it a 1? If yes, the host is in maint mode; go to point 4.
    2. Retrieve the host's maintenance plan(s) from the database and check whether now() falls within any of them. Are any of them active? If yes, the host is in maint mode; go to point 4.
    3. We got to here, so the host is not in maint mode.
    4. We know the state of this host.
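
The four steps above could be sketched roughly as follows. This is only an illustration of the decision flow, not code from the patch: the `Host` class, the `plans` list of (start, end) windows and the `in_maintenance` function are hypothetical names, and the real check would read `hosts.maintenance` and the plans from the database.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Host:
    # Stands in for the manual flag stored in hosts.maintenance (1 = on).
    maintenance: int = 0
    # Hypothetical representation of maintenance plans as (start, end) windows.
    plans: List[Tuple[datetime, datetime]] = field(default_factory=list)


def in_maintenance(host: Host, now: Optional[datetime] = None) -> bool:
    """Return True if the host is in maint mode at the given moment."""
    if now is None:
        now = datetime.now()
    # Step 1: the manual flag decides immediately when it is set.
    if host.maintenance == 1:
        return True
    # Step 2: otherwise the host is in maint mode if any plan covers `now`.
    # Step 3: if no plan is active, the host is not in maint mode.
    return any(start <= now <= end for start, end in host.plans)


# Usage: a host with one planned window from 10:00 to 14:00.
window = (datetime(2009, 1, 1, 10, 0), datetime(2009, 1, 1, 14, 0))
planned = Host(plans=[window])
inside = in_maintenance(planned, datetime(2009, 1, 1, 12, 0))
outside = in_maintenance(planned, datetime(2009, 1, 1, 15, 0))
```

Because the plans are evaluated against the current time on every check, nothing has to "switch" a stored field when a window opens or closes; only the manual flag is persisted.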

    The boolean field (stored in the DB as an int) and the UI changes are already there in my patch for the manual operation. I need to move the point in the zabbix_server code where it is actioned, but that is not a big problem and I have already determined the SQL I need to use to get the info I need.

    The only remaining situation I see at this time that needs resolving is how to deal with alerts which contain triggers from multiple hosts, where only some of the hosts on the alert are in maintenance mode. Perhaps you have some thoughts on this one?

    Cheers,
    Brian.


    • Aly
      ZABBIX developer
      • May 2007
      • 1126

      #17
      Originally posted by bcheese
      • This menu has an option to cancel the maint mode immediately for hosts which are in maint mode.
      In your scheme this option won't be supported?!
      Zabbix | ex GUI developer


      • Alexei
        Founder, CEO
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Sep 2004
        • 5654

        #18
        Just a quick note.

        I think that simple boolean logic does not work well here. Remember that host maintenance mode (on, off) affects many parts of ZABBIX. I am pretty sure that some users will be happy to have data collected during maintenance mode, others will possibly want to suppress notifications, etc. Maps, graphs, reports, etc. will obviously be affected as well.

        There are many things to take care of! It is not that simple.

        Anyway we have good news for you. There is a great chance that support of maintenance mode will be implemented in 1.8.

        We appreciate your effort very much!
        Alexei Vladishev
        Creator of Zabbix, Product manager
        New York | Tokyo | Riga
        My Twitter


        • ashuji
          Member
          • Dec 2008
          • 35

          #19
          Does not look good

          Hi

          This way you can suppress the Email Alerts from Zabbix for hosts under maintenance but it will bring up another issue. Check the scenario below:

          Suppose you are monitoring the SLA (IT Services) of a service running on a particular host. If that host goes under maintenance for 2 hours, we move it to the Maintenance group and Zabbix stops sending mail alerts for that host, but its triggers still turn ON. A trigger turning ON means the SLA is impacted: those 2 hours will be reflected as downtime, even though a maintenance window should not appear as downtime in SLA reports.

          What is the workaround for this?

          Regards

          Ashwani Jain


          • Andreas Bollhalder
            Senior Member
            Zabbix Certified Specialist
            • Apr 2007
            • 144

            #20
            I think it depends on the type of SLA: whether maintenance time counts toward the percentage or not.

            Therefore, I think you need to be able to set a flag for the SLA calculation that says whether maintenance time should be included or not.
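
To make the effect of such a flag concrete, here is a small sketch. It is not how Zabbix computes SLA: the `sla_percent` function and its `exclude_maintenance` parameter are hypothetical, intervals are plain (start, end) pairs in seconds, and it assumes that downtime intervals do not overlap each other, that maintenance windows do not overlap each other, and that all of them lie inside the reporting period.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def total(intervals: List[Interval]) -> float:
    """Sum of interval lengths."""
    return sum(end - start for start, end in intervals)


def overlap(a: List[Interval], b: List[Interval]) -> float:
    """Total time covered by both an interval in a and an interval in b."""
    return sum(
        max(0.0, min(e1, e2) - max(s1, s2))
        for s1, e1 in a
        for s2, e2 in b
    )


def sla_percent(period: Interval, down: List[Interval],
                maint: List[Interval], exclude_maintenance: bool) -> float:
    """Availability in percent over the period.

    With exclude_maintenance=True, maintenance windows are removed from the
    measured period and downtime inside them is not counted.
    """
    length = period[1] - period[0]
    downtime = total(down)
    if exclude_maintenance:
        length -= total(maint)
        downtime -= overlap(down, maint)
    if length <= 0:
        return 100.0  # nothing left to measure
    return 100.0 * (length - downtime) / length


# One day, with a 2-hour maintenance window during which the trigger is ON.
day = (0.0, 86400.0)
window = [(3600.0, 10800.0)]
counted = sla_percent(day, window, window, exclude_maintenance=False)
ignored = sla_percent(day, window, window, exclude_maintenance=True)
```

In the 2-hour scenario described earlier in the thread, counting maintenance as downtime gives roughly 91.67% for the day, while excluding it gives 100%.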

            Andreas
            Zabbix statistics
            Total hosts: 380 - Total items: 12190 - Total triggers: 4530 - Required server performance: 224.2
