SLA Explanation / Example

  • navtek007
    Senior Member
    • May 2005
    • 100

    #1

    SLA Explanation / Example

    Hi All,

    This question has probably been asked a lot, but I still don't understand how the SLA configuration works. For example, say we have an SLA that specifies the following (dummy numbers):

    Web Services 75% 8:30am - 5:30pm
    Database Services 100% 6:30am - 6:30pm
    Mail Services 100% 6:00am - 6:30pm

    How would I implement this using the SLA config screen?

    Thanks in advance.
  • Kepeke1967
    Junior Member
    • May 2005
    • 1

    #2
    same issue here..

    Hello,

    We have the same issue here: applying timeframes to certain SLA-monitoring-items. It doesn't seem to work using triggers from a template host.
    Must be something we're doing wrong??

    Any help would be much appreciated!

    • navtek007
      Senior Member
      • May 2005
      • 100

      #3
      OK, it's been a while since the last response, so I will just stick with the first part of my question:

      I still don't understand how to configure the IT Services side of things?

      1. Do I just create a service called "Database Services" and then hard/soft link all my database-related triggers to it, OR is each trigger created as a sub-service (e.g. "Database is Unreachable" is created as another service) and then linked to the Database Services service?

      2. Would I only add triggers that affect the uptime of the service, not ones like disk space, OR are certain trigger severity levels (e.g. Warning, Disaster) weighted differently when calculating the SLA level?

      3. What is the difference between a soft and hard link?

      4. Could someone give me an example on how I could use this?

      Sorry for the long post, but I feel I am missing out on this functionality in Zabbix.

      Thanks in advance.

      • illumin8
        Member
        • Jun 2005
        • 36

        #4
        Originally posted by navtek007
        I still don't understand how to configure the IT Services side of things?
        I have the same issue. There is no documentation on how IT Services work that explains the difference between the MAX and MIN calculations or between hard and soft links, and there isn't even a quick tutorial on how to set it up.

        My problem is I've tried creating IT Services that are linked to a trigger. I make the trigger go True, then the SLA stays at 100%, and isn't adjusted like it should be. I must be configuring it incorrectly and I'm just not sure how. Any help you guys could give would be greatly appreciated.

        • illumin8
          Member
          • Jun 2005
          • 36

          #5
          I have hammered on it for around 8 hours, creating IT Services, linking them to Triggers, everything, and no matter what I do, I can't get the SLA it's reporting to drop below 100%....

          Edit: OK, I finally got it to drop below 100% by using soft links instead of hard. I still don't know what the difference is (perhaps someone could explain it to me). It seems to be accurately reporting SLA now; however, it would be great to have a short explanation of it.

          I also found a bug (I think) in IT Service setup (affecting 1.0 and alpha 1.1): If you set an IT Service to MIN algorithm, you can't edit it again. When you try to edit it, the Algorithm field shows up blank (doesn't populate from the database), and you can no longer update the entry. You have to delete it and start over.
          Last edited by illumin8; 15-06-2005, 17:03.

          • navtek007
            Senior Member
            • May 2005
            • 100

            #6
            Alexei, would you be able to help us all out here? There have been so many threads about this being a grey area in Zabbix. Everything else works great, and I'm sure this works well too, but I don't think a lot of people know how to use it properly. Some definitions and examples would be a great start, I think.

            Thanks in advance.

            • illumin8
              Member
              • Jun 2005
              • 36

              #7
              SLA Setup Walkthrough

              OK, I think I've got it figured out, and I'm hoping this will help others who need to set up SLAs. This is how I did it:

              First of all, a little background. I have about 100 hosts that I'm monitoring with simplecheck icmpping. They don't have any Zabbix agents installed, and I'm only monitoring them with pings to track server uptime. Each host has only 1 item and 1 trigger called "Cannot ping {HOSTNAME}".

              Now, how to set up IT Services:
              1. Log in to Zabbix
              2. Click on Configuration
              3. Click on IT Services
              4. Click on the drop-down next to Trigger, and choose your host group, then host, then the trigger that is the basis for forming the SLA. In my case, the trigger I am monitoring is "Cannot ping {HOSTNAME}". I'm using linked templates, so it actually expands the trigger to show the real hostname.
              5. Check the two boxes that say "Show SLA", and "Link to trigger".
              6. Type in the acceptable SLA percentage. Leave the service name blank. It will automatically use the name of the trigger anyway.
              7. Click the Add button.
              8. Repeat the above procedure until you have added all the triggers that might calculate into an SLA. For me, this is only 1 trigger per server, but you might have many more you want to track.
              9. Now, you're going to create a Parent service that will hold all of the child triggers you just added SLAs for.
              10. Type in a name for the parent service. This might be something like "Oracle Database Server".
              11. Choose a status calculation algorithm. Here is an explanation of what the two options, MAX or MIN will do: Use the MAX algorithm for things like a farm of web servers, where if one or two are down, the service is still fine. Things that are load-balanced or clustered like web servers are prime candidates for MAX. Use the MIN algorithm for services that are not clustered or load-balanced. For example, if you're tracking a number of conditions on a single server, like "Server load is below a certain level" and "Response time is good", you want to use MIN, because if any one trigger gets set to True, you want it to mark the service as down.
              12. Set the SLA percentage to what you want.
              13. Do not check the "Link to trigger" checkbox.
              14. Click the Add button.
              15. Now that you've created the parent service, you need to link it to the child services (triggers) that we created above.
              16. Click on the Parent service in the list of services. Now you will be taken to a similar screen, but you are editing just that service, instead of the master list of all services.
              17. In the "Link to" section, choose the child trigger to link the parent service to. I chose to use a soft link, since that's the default, but I still don't know what the difference is. If somebody wants to clarify that, I would greatly appreciate it.
              18. Click the add link button.
              19. Continue until you have added all of the child triggers you want to that service.
              20. Parent services can also be child services of other parent services. This allows you to create hierarchies.
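
To make step 11 more concrete, here is a minimal Python sketch of how a parent's status could be derived from its children. It is not Zabbix code: the function name `parent_status` and the convention that status 0 means OK and larger numbers mean a worse problem are assumptions for illustration only. Under that convention, taking the maximum marks the parent as failing as soon as any child fails, while taking the minimum marks it as failing only when every child fails, so it is worth checking on a test service which way MAX and MIN behave in your Zabbix version before relying on the descriptions in step 11.

```python
from typing import Iterable

OK = 0  # assumed convention: 0 is OK, larger values are worse problems


def parent_status(child_statuses: Iterable[int], algorithm: str) -> int:
    """Combine child statuses into one parent status.

    'MAX' returns the worst child status, 'MIN' the best one.
    A parent with no children is treated as OK.
    """
    statuses = list(child_statuses)
    if not statuses:
        return OK
    if algorithm == "MAX":
        return max(statuses)
    if algorithm == "MIN":
        return min(statuses)
    raise ValueError(f"unknown algorithm: {algorithm!r}")


# Three children, one of them in a problem state (status 5).
children = [0, 5, 0]
print(parent_status(children, "MAX"))  # 5: one failing child is enough
print(parent_status(children, "MIN"))  # 0: at least one child is still OK
```

Flipping a single test trigger and watching the parent in the IT Services view should show quickly which of the two behaviours your installation follows.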


              Ok, so now that I've explained all of this, I thought I would give you a more real-world example of how I used it.
              • We have two data centers. We'll call them "East Coast" and "West Coast".
              • Each data center has three web servers. We'll call them "eastweb1", "eastweb2", "eastweb3", "westweb1", "westweb2", and "westweb3".
              • Each web server has a trigger defined that will be set if it goes offline or crashes.
              • The first thing we do, is create 6 child services. One child service for each web server. Each child service is created by linking it to the trigger for that web server.
              • Next, we create two parent services, one called "Web Servers - East Coast", and one called "Web Servers - West Coast". Each one of these is created using MAX algorithm, since if any one server is up, the service is considered operational.
              • Now we link "eastweb1", "eastweb2", and "eastweb3" to the parent "Web Servers - East Coast", and link "westweb1", "westweb2", and "westweb3" to "Web Servers - West Coast".
              • Now, we create another parent service called simply "Web Servers", again using MAX algorithm, since even if we lose an entire data center, as long as the other data center is operational, the web service is still considered up.
              • We link "Web Servers - East Coast" and "Web Servers - West Coast" to the parent service "Web Servers".
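
The hierarchy from this example can also be written down as a small tree. The sketch below is plain Python, not anything Zabbix provides; the `Service` class and the `drill_down` function are hypothetical names used only to show how the parent and child links fit together.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Service:
    """Hypothetical stand-in for an IT Service and its linked children."""
    name: str
    algorithm: str = ""  # "MAX", "MIN", or "" for a trigger-linked leaf
    children: List["Service"] = field(default_factory=list)


def drill_down(service: Service, depth: int = 0) -> List[str]:
    """Return one indented line per service, parents before children."""
    lines = ["  " * depth + service.name]
    for child in service.children:
        lines.extend(drill_down(child, depth + 1))
    return lines


# One leaf per web server, grouped by data center, then by overall service.
east = Service("Web Servers - East Coast", "MAX",
               [Service(f"eastweb{i}") for i in (1, 2, 3)])
west = Service("Web Servers - West Coast", "MAX",
               [Service(f"westweb{i}") for i in (1, 2, 3)])
top = Service("Web Servers", "MAX", [east, west])

print("\n".join(drill_down(top)))
```

The printed outline mirrors the drill-down in the IT Services view: one top-level service, two data centers under it, and three servers under each.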


              You should be all set at this point. When you go into the IT Services View screen, you'll see just "Web Servers" listed as a service, along with the SLA. If you click on Web Servers, you can drill down and see availability for both East Coast and West Coast. If you click on either of those, you can see availability on an individual service basis.

              So you can see how easy it is to create hierarchies of services with this. It seems to be working for me. I hope this tutorial helps someone else.

              • navtek007
                Senior Member
                • May 2005
                • 100

                #8
                Great! That is exactly what I wanted to know. Good example of the MAX/MIN functionality as well.
                Last edited by navtek007; 18-06-2005, 13:22.

                • illumin8
                  Member
                  • Jun 2005
                  • 36

                  #9
                  Please sticky this thread

                  Will somebody please sticky this thread? Even though I've had to figure all of this out by trial and error, it is still the BEST and ONLY documentation on the web that actually explains how to set up IT Services in Zabbix.

                  IT Services is probably the most requested, and yet least usable and documented feature in Zabbix.

                  It would be nice to see Alexei give some type of official explanation of how IT Services is supposed to work.
                  Last edited by illumin8; 15-12-2005, 18:18.

                  • Wolfgang
                    Senior Member
                    Zabbix Certified Trainer
                    Zabbix Certified Specialist
                    • Apr 2005
                    • 116

                    #10
                    @illumin8

                    Thank you very much for putting together how SLAs are to be set up :-)
                    http://www.intellitrend.de
                    Specialised in monitoring large environments and Zabbix API programming.

                    • crs9
                      Member
                      • Feb 2006
                      • 35

                      #11
                      I'm still having a few problems with SLA and I'm not sure where I'm going wrong. I'm using beta7.
                      1) I create my triggers to calculate my SLA.
                      2) I then create my parent
                      3) I then soft link my triggers to the parent
                      4) I can go back into the config and see the parent as service 1 and the triggers as service 2
                      5) The strange thing is that when I click on a trigger in the config, the trigger selection within that trigger's config is set to the default. Meaning it says "all" and "select host...". Is this correct?
                      6) Second strange thing is when I go to monitor IT Services, I click on the parent device and it drills down, but nothing is on the next page, which tells me I'm certainly missing a step.

                      Can anyone shed some light on the step I'm missing?

                      Thanks

                      • herr_bpl
                        Junior Member
                        • Jan 2006
                        • 15

                        #12
                        I can only confirm this: no matter how I try to set up service(s) and play with triggers, the SLA stands proudly at 100%. Impressive for management but absolutely not reliable :'(

                        I hope the next version and the feature freeze will shed some light on it...

                        • Alexei
                          Founder, CEO
                          Zabbix Certified Trainer
                          Zabbix Certified Specialist
                          Zabbix Certified Professional
                          • Sep 2004
                          • 5654

                          #13
                          Originally posted by herr_bpl
                          I can only confirm this: no matter how I try to set up service(s) and play with triggers, the SLA stands proudly at 100%. Impressive for management but absolutely not reliable :'(
                          This is because the SLA updates its first status on a trigger change. When adding a new service, or linking a service to a trigger, the SLA is OK by default.

                          This is to be changed.
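
A small sketch of why a new service can sit at 100%: if the service is treated as OK until its first recorded trigger change, any downtime before that change is never counted. The function `availability` and the event format below are assumptions for illustration, not how Zabbix stores or calculates its data.

```python
from typing import List, Tuple


def availability(events: List[Tuple[float, bool]], start: float, end: float) -> float:
    """Percentage of the window [start, end] spent in the OK state.

    events: (timestamp, is_ok) status changes sorted by time.
    The service is assumed OK until the first event, mirroring the
    "OK by default" behaviour described above.
    """
    if end <= start:
        raise ValueError("end must be after start")
    ok_time = 0.0
    current_ok = True  # default state before any trigger change
    last = start
    for ts, is_ok in events:
        if ts <= start:
            current_ok = is_ok  # state already in effect when the window opens
            continue
        if ts >= end:
            break
        if current_ok:
            ok_time += ts - last
        last = ts
        current_ok = is_ok
    if current_ok:
        ok_time += end - last
    return 100.0 * ok_time / (end - start)


# No trigger change recorded yet: reported as fully available.
print(availability([], 0, 3600))  # 100.0
# Down from t=900 to t=1800 within a one-hour window.
print(availability([(900, False), (1800, True)], 0, 3600))  # 75.0
```

With no events at all the result is 100%, even if the monitored host was already down when the service was created, which matches the symptom reported earlier in the thread.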
                          Alexei Vladishev
                          Creator of Zabbix, Product manager
                          New York | Tokyo | Riga
                          My Twitter

                          • ghislain
                            Senior Member
                            • Jun 2005
                            • 160

                            #14
                            Is that article OK with you, Alexei? Perhaps if the info is good it can be added to the wiki?

                            regards,
                            Ghislain.

                            • axel
                              Member
                              • Aug 2005
                              • 36

                              #15
                              Hi, I have problems with IT Services too.

                              I set it up like this:

                              - Routers MIN - SLA 100%
                              - Trigger (Ping Router1) MIN - SLA 100%
                              - Trigger (Ping Router2) MIN - SLA 100%
                              - Trigger (Ping Router3) MIN - SLA 100%

                              But if I check the availability report of Router2, I get 98.9177%.

                              I use Zabbix 1.0 too, and there IT Services works OK.

                              For example, if Trigger (Ping Router1) is ON, nothing happens on the IT Services side. The services tell me that everything is OK.

                              Maybe I don't know what IT Services should do.

                              Please help, thanks.



                              SFM-Router OK - - Show
                              [TRIGGER] PING VPN Router1 OK - 99.05%/100.00% Show
                              [TRIGGER] PING VPN Router2 OK - 99.05%/100.00% Show
