**uid0** · 29-11-2013, 01:47

push, sorry!

**natalia** · 02-12-2013, 08:39

Originally posted by uid0

Can somebody provide the correct trigger expression to monitor the real CPU utilization?

I am using:

Trigger name: CPU utilization > 95% for 4 mins on {HOST.NAME}

({TRIGGER.VALUE}=0 & {Template OS Linux:system.cpu.util[,user].min(4m)}>95) | ({TRIGGER.VALUE}=1 & {Template OS Linux:system.cpu.util[,user].min(4m)}>80)
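The expression above is a hysteresis trigger: it fires when the 4-minute minimum goes above 95 and only clears once that minimum is no longer above 80. A minimal Python sketch of the same logic, for illustration only (the function and parameter names are made up here and are not part of Zabbix):

```python
def next_trigger_state(in_problem, min_util_4m,
                       raise_above=95.0, stay_above=80.0):
    """Return the new trigger state (True = PROBLEM, False = OK).

    Mirrors: (VALUE=0 & min(4m)>95) | (VALUE=1 & min(4m)>80)
    """
    if in_problem:
        # Already in PROBLEM: stay there while the minimum is still above 80.
        return min_util_4m > stay_above
    # Currently OK: only fire once the minimum exceeds 95.
    return min_util_4m > raise_above


def simulate(samples):
    """Feed successive min(4m) values through the trigger, starting in OK."""
    state = False
    states = []
    for value in samples:
        state = next_trigger_state(state, value)
        states.append(state)
    return states
```

For example, `simulate([90, 96, 85, 81, 80, 90])` fires at 96, stays in PROBLEM at 85 and 81, and recovers at 80. The gap between the two thresholds is what keeps the trigger from flapping.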

**kloczek** · 02-12-2013, 09:39

Originally posted by uid0

But a high load does not always mean that the server is overloaded, and as far as I can see Zabbix does not have a standard trigger for CPU utilization. Or am I missing something?

CPU utilization has nothing to do with CPU load.
CPU load is the length of the system run queue.
The current CPU load is shown in the "r" column of the vmstat command output, and it is an integer value.
loadavg{1,5,15} are the average lengths of this queue over the last {1,5,15} minutes; because they are averages, these numbers are floats.

**karmukis** · 14-08-2014, 22:48

I know this is an old post, but I'm having the following problem. Please bear with me, I'm sort of a newbie on Zabbix...

I want to get the CPU LOAD as a PERCENTAGE... If you use "system.cpu.load", it gives you the average data; what I want is that, but as a percentage.

What can I use to get this data as a percentage? Like when you run "htop" on the server.

thank you so much

karmukis

**jan.garaj** · 14-08-2014, 23:13

You want CPU utilization, not CPU load, if you want a percentage.
Check the item key system.cpu.util in the manual:

1 Zabbix agent

https://www.zabbix.com/documentation/2.2/manual/config/items/itemtypes/zabbix_agent#supported_item_keys

**bbrendon** · 15-08-2014, 08:46

Telling someone what they want? That's not nice. You can get CPU load as a percentage. The basic idea is to take the load average divided by the # of CPUs.

There are a few ways to do that in zabbix. Probably the easiest is to combine two items (one with load avg and one with # of CPUs) to calculate a 3rd item which would be the percentage.
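The arithmetic behind "load average divided by the # of CPUs" can be sketched in Python. This is only an illustration of the calculation, not a Zabbix item definition; the function names are hypothetical, and `os.getloadavg()` is available on Unix-like systems only:

```python
import os


def load_percent(load_avg, cpu_count):
    """Express a load average as a percentage of the CPU count.

    The result can exceed 100 when more tasks are queued than there are CPUs.
    """
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return 100.0 * load_avg / cpu_count


def current_load_percent():
    """Read the local 1-minute load average and normalize it (Unix only)."""
    one_minute, _five, _fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    return load_percent(one_minute, cpus)
```

With a load of 5 on an 8-CPU machine, `load_percent(5, 8)` gives 62.5.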

I can think of a few better ways (IMHO), but they would be much more involved.

**jan.garaj** · 15-08-2014, 09:23

User first

Let's fight :-P. I've made a custom implementation for CPU usage monitoring, so I feel very confident. But I'm open to new knowledge.

Also, the Zabbix manual refers to Wikipedia:

Load (computing) - Wikipedia

http://en.wikipedia.org/wiki/Load_%28computing%29

In UNIX computing, the system load is a measure of the amount of computational work that a computer system performs.

How can you express it as a percentage? What is 100%? I have never seen a monitoring system that shows load as a percentage.

What can I use to get this data as a percentage? Like when you run "htop" on the server.

What is shown as a percentage in htop? man htop -> processor usage

=> the user wants to see CPU usage (utilization).
He doesn't understand the difference between load and utilization, but he gave htop as the example. That's the main reason why I recommended system.cpu.util - I listen to the user ;-)

BTW1: Actually, the Zabbix agent cannot provide the same value as htop; only avg1/avg5/avg15 values are available.
BTW2:

The basic idea is to take the load average divided by the # of CPUs.

Zabbix has the item key system.cpu.load; instead of the parameter all, use percpu (total load divided by the online CPU count).

**karmukis** · 15-08-2014, 21:27

hahaha
OK, I'm really happy about all the help I'm getting, but going back to the "monitoring side":

Please give me some time to check all the data you send me and try to figure this out.

The idea is that on our servers, because of the number of requests and the consequent transactions, CPU utilization escalates to dangerous levels....

So far I have configured system.cpu.load to warn us when the check on "all" CPUs is above 5, because when using "percpu" I'm not sure which value it compares against the warning value.... I checked it, and when my CPU load was at 5, Zabbix was telling me that it was within normal values. So then I ran htop and saw that, of the 8 CPUs, some were at 100% usage, some at 80%, and maybe 3 of them were at 20%... so that's why I'm having such a headache!

This is how the trigger looks right now.

{requester-a1:system.cpu.load[all,avg1].last(0)}>5

Again, thank you so much.

**jan.garaj** · 15-08-2014, 22:24

IMHO the best trigger for you is
{requester-a1:system.cpu.load[percpu,avg1].last(0)}>1

The critical value for cpu.load[all,avg1] depends on the number of online CPUs. I have some servers with load >20, but they have 24 CPUs, so the "normalized" CPU load (percpu) is 20/24, about 0.83, and I don't have a problem. It would be a problem if I had load 20 on a 1-CPU device.

You are safe if you have load 5 and 8 CPUs. Also, your CPU utilization is not 100% on all your CPUs. I don't see any problem with your CPU (load or utilization metric), so what is your problem?

**bbrendon** · 16-08-2014, 06:56

Wow. I've been using CPU load as a percentage for ages. I never knew zabbix added percpu on the load item! Wow.

**karmukis** · 19-08-2014, 15:56

Ok,
first of all... I'm not a he, I'm a she...
Then, I left the CPU load alarm running all weekend, and I'm happy to say it seems to be working fine, but the values I have set seem to be quite low, because it kept sending alarms about CPU load problems on the server I'm testing....
Checking the server itself, its normal values for the "load average" are around 1.95, 2, 3....
and very often the load escalates to 5 or 6 ...
So I'll keep testing.
I need to check how to apply different values for this alarm, because the request load on the servers is not very well distributed (thanks to Amazon ELB), and sometimes we have 5 servers doing nothing and 7 crying for help....

Ok, again,thanks for everything, I'll keep you posted.
karmukis >> karina

**jan.garaj** · 19-08-2014, 22:44

OK Karina.

AWS is out of scope, but here are my notes:
- ELB - do you use sticky sessions?
- AWS - also check CloudWatch CPU metrics; Zabbix CPU metrics usually differ from CloudWatch CPU metrics (hypervisor vs OS)
- check steal CPU time - for example, your VM can appear to consume 100% CPU time when 90% of it is actually "only" steal time
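To make the steal-time note concrete, here is a rough Python sketch that computes the steal share between two samples of the aggregate "cpu" line from /proc/stat on Linux. It assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, steal) and ignores the guest fields, which the kernel already counts inside user and nice. The helper names and sample values are invented for illustration:

```python
from typing import List, Sequence

STEAL_INDEX = 7  # position of "steal" after the leading "cpu" label


def parse_cpu_line(line: str) -> List[int]:
    """Parse the aggregate 'cpu' line of /proc/stat into its first 8 counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu" or len(fields) < 9:
        raise ValueError("expected an aggregate 'cpu' line with a steal field")
    return [int(value) for value in fields[1:9]]


def steal_percent(before: Sequence[int], after: Sequence[int]) -> float:
    """Share of CPU time between two samples that the hypervisor took away."""
    deltas = [new - old for old, new in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0  # no time elapsed between the samples
    return 100.0 * deltas[STEAL_INDEX] / total
```

Given the two hypothetical samples `"cpu  100 0 50 800 10 0 0 40 0 0"` and `"cpu  150 0 60 810 10 0 0 970 0 0"`, the result is 93.0: the guest looks busy, but most of the interval was steal time.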

system.cpu.load != cpu utilization?
