
**splitek** · 25-07-2019, 09:12

7. IT services

https://www.zabbix.com/documentation/3.0/manual/it_services

11 IT services

https://www.zabbix.com/documentation/3.0/manual/web_interface/frontend_sections/monitoring/it_services

SLA: it is quite simple, but first read the documentation, in particular this section:

Status calculation algorithm

Method of calculating service status:

While an IT service is in a problem state, its SLA is reduced. When an IT service has children, the calculation (reduction) depends on the configuration (the chosen algorithm).

Downtime: the service state within this period does not affect the SLA. You define downtime periods in the configuration of the IT service.

Period: periods are predefined in the frontend. For now, the only way to get the SLA for a period that is not predefined is to query the Zabbix API:

service.getsla

https://www.zabbix.com/documentation/3.0/manual/api/reference/service/getsla
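As a rough illustration of such a query, the sketch below only builds the JSON-RPC request body for `service.getsla` with a custom interval; it does not send anything. The helper name `build_getsla_request`, the service ID `5`, and the placeholder token are made up for the example. The `serviceids` and `intervals` (`from`/`to` as Unix timestamps) parameters follow the linked 3.0 API reference, so check them against your own version.

```python
import json
import time


def build_getsla_request(auth_token, service_ids, time_from, time_till, request_id=1):
    """Build a JSON-RPC 2.0 body for service.getsla (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "method": "service.getsla",
        "params": {
            # IDs of the IT services to report on
            "serviceids": [str(s) for s in service_ids],
            # One or more custom periods, as Unix timestamps
            "intervals": [{"from": int(time_from), "to": int(time_till)}],
        },
        # Token obtained earlier from user.login (placeholder here)
        "auth": auth_token,
        "id": request_id,
    }


now = int(time.time())
# Example: SLA of service 5 (made-up ID) over the last 3 days
payload = build_getsla_request("<auth token>", [5], now - 3 * 24 * 3600, now)
print(json.dumps(payload, indent=2))
```

The resulting body would then be POSTed to the frontend's `api_jsonrpc.php` endpoint with the `Content-Type: application/json-rpc` header.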

**Raido** · 25-07-2019, 09:28

The type of SLA calculation is "Problem, if at least one child has a problem".
E.g. customer 4 has 14 machines, and Machine 1 has 2 active problems. When the SLA for Machine 1 is calculated, are the 2 active problems summed, averaged, or handled in some other way? How is the Machine 1 SLA calculated if it has multiple active problems?

**splitek** · 25-07-2019, 11:02

Originally posted by Raido

Machine 1 has 2 active problems.

What do you mean by "problem" here? Are you thinking of the problems shown in the problems view for some host (Machine 1)? It does not work like that.
Think of it this way:
An IT service in Zabbix is linked to a trigger (only one trigger). The trigger can be OK or PROBLEM, and from that the IT service is OK or PROBLEM. As you can see, the number of problems on the host does not matter; only the state of the linked trigger matters. The IT service shows the UP/DOWN ratio for the one chosen trigger. You can say something like "that service has an SLA of 50%, so it worked 50% of the time it should have worked".

Now, you have a service "customer" with child services, and the calculation is "Problem, if at least one child has a problem". If one of these children goes into PROBLEM, it propagates PROBLEM up to the parent service "customer", and the parent will be in PROBLEM too. If you change the calculation to "Problem, if all children have problems", then all children need to be in PROBLEM for it to propagate to the parent.
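The two algorithms can be pictured as a plain "any" versus "all" check over the children at one moment in time. This is only a sketch of the idea, not Zabbix code; the function name and the `"any"`/`"all"` labels are made up for the example.

```python
def parent_has_problem(children_in_problem, algorithm):
    """Return True if the parent is in PROBLEM at this moment.

    children_in_problem: list of booleans, True = child is in PROBLEM.
    algorithm: "any" ~ "Problem, if at least one child has a problem",
               "all" ~ "Problem, if all children have problems".
    """
    if algorithm == "any":
        return any(children_in_problem)
    if algorithm == "all":
        # Assumption for the sketch: a parent with no children is never in PROBLEM
        return bool(children_in_problem) and all(children_in_problem)
    raise ValueError("unknown algorithm: " + algorithm)


# One of three children is in PROBLEM
print(parent_has_problem([True, False, False], "any"))
print(parent_has_problem([True, False, False], "all"))
```

With one child in PROBLEM out of three, the "any" variant puts the parent in PROBLEM and the "all" variant does not.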

**Raido** · 25-07-2019, 11:57

How is the parent (yellow) SLA calculated if the children's uptime is much higher?

Attached Files

**splitek** · 25-07-2019, 14:41

Hard to explain, but I will try.
Let's say the parent configuration is "Problem, if at least one child has a problem". So our parent goes into PROBLEM when child 1 or child 2 or child 3 (and so on) is in PROBLEM.
Let's draw a timeline with one mark per minute. For every minute we put 0 if the child is OK, or 1 if the child is in PROBLEM.

Child 1: 00001111
Child 2: 11001100
Child 3: 11110000
-------------------------
Parent: 11111111

As you can see, every single child has a higher uptime, yet the Parent has no uptime at all.
An "OR" operation is performed every second on the statuses of all children, and the result is propagated to the parent. When the result is 1, the Parent is in the PROBLEM state.
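The same timeline can be reproduced with a few lines of Python. This is just an illustration of the OR logic above, not how Zabbix computes it internally; the helper names are made up, and uptime is taken simply as the share of 0 marks.

```python
# Timelines from the example: 0 = OK, 1 = PROBLEM, one mark per time slot
children = {
    "Child 1": "00001111",
    "Child 2": "11001100",
    "Child 3": "11110000",
}


def combine(timelines, op):
    """Combine child timelines slot by slot; op is any (OR) or all (AND)."""
    return "".join(
        "1" if op(mark == "1" for mark in column) else "0"
        for column in zip(*timelines)
    )


def uptime_percent(timeline):
    """Share of slots in which the service was OK."""
    return 100.0 * timeline.count("0") / len(timeline)


parent_any = combine(list(children.values()), any)  # "at least one child"
parent_all = combine(list(children.values()), all)  # "all children"

for name, timeline in children.items():
    print(name, timeline, uptime_percent(timeline))
print("Parent (any)", parent_any, uptime_percent(parent_any))
print("Parent (all)", parent_all, uptime_percent(parent_all))
```

Each child is up for half of the slots, but with "at least one child" the parent timeline is `11111111`, so its uptime is 0%. With "all children" the same data gives `00000000`, because the three children are never in PROBLEM at the same moment.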


Zabbix 3.2.11 IT Services SLA, downtime and period
