ZABBIX Forums  
  #1  
13-04-2012, 08:18
grippi
Junior Member
 
Join Date: Apr 2012
Posts: 2
Default cpu.load too high on Windows

I'm running Windows Server 2008 R2 as a VM on Citrix XenServer 5.6 SP2. To monitor the server I use
Zabbix 1.8.8 and the 64-bit Zabbix Windows agent 1.8.8.

The problem is that system.cpu.load[,avg1] is too high. The template is configured to trigger an alarm when the load goes
over 5. Well, the MS Exchange Server 2010 goes over 5 several times a day, sometimes up to 25. That's not normal, is it?


Could it be that Windows sends wrong or multiplied data?
  #2  
13-04-2012, 16:44
HullZabbix
Senior Member
 
Join Date: Feb 2011
Posts: 104
Default

Are you supposed to be using system.cpu.load?

For my Windows servers I use

system.cpu.util[,,avg1]
  #3  
16-04-2012, 08:36
grippi
Junior Member
 
Join Date: Apr 2012
Posts: 2
Default

Well, the original template uses the CPU load, so I thought it would be
best to use it as Zabbix released it.

I will try system.cpu.util over the next few days.

I am still interested to know why the CPU load is so high on Windows systems.

Thanks for your hint.
  #4  
16-04-2012, 14:36
ghoz
Senior Member
 
Join Date: May 2011
Posts: 194
Default

Hi.
CPU load is the number of threads in a runnable state on the system.
On the Windows agent, it's measured using an averaged "Processor Queue Length" perfmon counter.

Normally on a uniprocessor it's supposed to be below 1, with occasional higher spikes, but I've seen very high numbers on some systems with very low real CPU usage, specifically on VMs...

On Linux, it includes processes waiting for disk, and I guess it's the same for Windows.

So all in all it's not a good CPU usage indicator, more an overall system health indicator...
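
To make the "averaged queue length" idea concrete, here is a minimal Python sketch. It is not the agent's actual code: the function name, the sample values, and the four-sample window are all made up for illustration. It only shows how a short burst of queued threads can push an averaged value well above 5 even when the system is idle most of the time.

```python
from collections import deque


def rolling_average(samples, window):
    """Yield the mean of the last `window` samples after each new sample."""
    recent = deque(maxlen=window)  # old samples fall out automatically
    for value in samples:
        recent.append(value)
        yield sum(recent) / len(recent)


# Hypothetical queue-length readings: mostly idle, with one short burst.
queue_samples = [0, 1, 0, 24, 20, 0, 0, 1]

# Assumed window of 4 samples standing in for the "avg1" period.
load_avg = list(rolling_average(queue_samples, window=4))
print(load_avg)
```

With these invented numbers the average stays above 5 for several samples after the burst has already ended, which is the kind of behaviour that keeps a "load > 5" trigger firing.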

Ghoz
  #5  
18-06-2012, 12:57
pmsousa
Junior Member
 
Join Date: Jan 2012
Location: Porto, Portugal
Posts: 21
Default

Hi,

I have exactly the same issue on several Windows servers, especially those running SQL Server (which is very CPU/memory intensive). The strange part is that those triggers only started firing after I upgraded Zabbix from 1.8.13 to 2.0.0!!!

Is there any solution for this issue, or have you just changed the item "Processor load" to "Processor util"?
Last edited by pmsousa; 20-06-2012 at 15:30. Reason: Added chart image
  #6  
20-06-2012, 16:47
pmsousa
Junior Member
 
Join Date: Jan 2012
Location: Porto, Portugal
Posts: 21
Default

The answer to my question...

First, system.cpu.load is the CPU queue length, not the % Processor Time (whoever named that item should revise it...).

On Wikipedia, and in another post on this forum, I found the comparison "CPU Load vs CPU Utilization" (http://en.wikipedia.org/wiki/Load_(computing)).

From Microsoft I found this document about "Observing Processor Queue Length".

With this information it was easy to reproduce my Zabbix graph in Windows Performance Monitor, understand the data, and see why that damn trigger was firing like crazy!!!

Problem solved: Zabbix template updated with a new item for CPU utilization, and a new graph added...



If you need some help with this issue, send me a message...
  #7  
08-08-2012, 18:04
vincecmic
Junior Member
 
Join Date: Aug 2012
Posts: 1
Default me too!

I am also getting a lot of emails with the "processor load is too high" message on Windows VM guests.

Please help me out with the solution.

Thanks for your help
  #8  
09-08-2012, 11:42
pmsousa
Junior Member
 
Join Date: Jan 2012
Location: Porto, Portugal
Posts: 21
Default

Quote:
Originally Posted by vincecmic
I am also getting a lot of emails with the "processor load is too high" message on Windows VM guests.

Please help me out with the solution.

Thanks for your help

Hi Vince,

Have you read the Wikipedia article I mentioned in my previous post?

A first step, when talking about performance monitoring, is to understand that "% Processor Time" and "Processor Queue Length" are two different things. People often mistake "Processor Queue Length" for "% Processor Time", as I think the developer who made the Zabbix Windows template did when naming the item "system.cpu.load"... Also, those metrics were developed 20 years ago for Windows NT and for physical machines, not virtual ones (VMware or Hyper-V).

The "% Processor Time" counters in Windows are measurements derived using a sampling technique. The OS Scheduler samples the state of the CPU once per system clock tick, driven by a high priority timer-based interrupt.
The "System\Processor Queue Length" counter in Perfmon is an instantaneous counter that reflects the current number of Ready threads waiting in the OS Scheduler queue.

From Microsoft's Technet (http://technet.microsoft.com/en-us/l.../cc940375.aspx) you can see the usage of "% Processor Time" and "System\Processor Queue Length" combined in order to find saturated processors. There is also a note at the end with reference values for multiprocessor systems (also physical servers).
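
A rough Python sketch of that "combine both counters" idea follows. The function name and both cut-off values (80% utilization, 2 queued threads per processor) are assumptions chosen for illustration, not figures quoted from the article or from Zabbix; adjust them to whatever reference values you settle on.

```python
def classify_cpu(util_pct, queue_length, n_cpus,
                 busy_pct=80.0, queue_per_cpu=2.0):
    """Combine % Processor Time and Processor Queue Length into one verdict.

    busy_pct and queue_per_cpu are hypothetical thresholds.
    """
    per_cpu_queue = queue_length / n_cpus  # normalise by processor count
    busy = util_pct >= busy_pct
    queued = per_cpu_queue > queue_per_cpu
    if busy and queued:
        return "saturated"
    if queued:
        return "queueing without CPU pressure"
    if busy:
        return "busy but keeping up"
    return "ok"


print(classify_cpu(95, 12, n_cpus=2))   # high utilization and a long queue
print(classify_cpu(10, 25, n_cpus=4))   # long queue, almost idle CPU
```

The second call is the situation discussed in this thread: a long queue with low utilization, which a queue-length-only trigger reports as "processor load is too high".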

In a virtual environment you can't use the same trigger values, because virtual processors and queuing are handled differently from physical ones. The physical processor queues are shared by several virtual processors, and that skews the values you use as a reference for Zabbix triggers.
If I remember right, the trigger value was set to 5, and if you read the note I mentioned, the expected range of processor queue length on a system with high CPU activity is 4 to 12, so the trigger must be set higher.
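
To see what raising the threshold does, here is a small Python sketch that counts how often a "value above threshold" trigger would go from OK to PROBLEM. The load values are invented and the function is only an illustration of the idea, not how Zabbix evaluates trigger expressions internally.

```python
def count_alarms(load_values, threshold):
    """Count OK -> PROBLEM transitions for a 'load > threshold' trigger."""
    alarms = 0
    in_problem = False
    for value in load_values:
        problem = value > threshold
        if problem and not in_problem:
            alarms += 1  # a new alarm fires only on the transition
        in_problem = problem
    return alarms


# Hypothetical avg1 readings from a busy VM.
loads = [1, 6, 3, 8, 9, 2, 13, 4, 7, 1]

print(count_alarms(loads, threshold=5))
print(count_alarms(loads, threshold=12))
```

With this made-up series the threshold of 5 fires four separate alarms, while a threshold at the top of the 4 to 12 range fires once.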

This post is getting veeerrrryyyy biiiigggg and I'm expecting a remote assist from a software provider at any minute, so I must stop for now but...

As a first experiment, open perfmon on two Windows boxes (one physical, one virtual) and add the counters "% Processor Time" and "System\Processor Queue Length". Let it run for a while and compare the graphs with the ones in the TechNet article.
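
If you would rather compare numbers than graphs, the following Python sketch averages each counter from a CSV export. It assumes the counters were logged to a PDH-style CSV file (for example by a perfmon data collector set or typeperf) with a timestamp in the first column; the host name VM01 and the sample rows are invented.

```python
import csv
import io


def column_means(csv_text):
    """Return {counter path: mean value} for a perfmon-style CSV export."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    sums = [0.0] * (len(header) - 1)  # first column is the timestamp
    rows = 0
    for row in reader:
        if len(row) != len(header):
            continue  # skip blank or truncated lines
        try:
            values = [float(cell) for cell in row[1:]]
        except ValueError:
            continue  # skip rows where a sample is missing
        sums = [s + v for s, v in zip(sums, values)]
        rows += 1
    if rows == 0:
        return {}
    return {name: total / rows for name, total in zip(header[1:], sums)}


# Invented sample in the assumed export layout.
sample = r'''"(PDH-CSV 4.0)","\\VM01\Processor(_Total)\% Processor Time","\\VM01\System\Processor Queue Length"
"08/09/2012 11:42:00.000","10.0","8.0"
"08/09/2012 11:42:05.000","20.0","12.0"
'''

means = column_means(sample)
for counter, mean in means.items():
    print(counter, mean)
```

Running it on an export from the physical box and one from the virtual box gives you two pairs of averages to set side by side.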

I'll be back later...
  #9  
09-08-2012, 16:26
pmsousa
Junior Member
 
Join Date: Jan 2012
Location: Porto, Portugal
Posts: 21
Angry Grrrrrrrrr

I spent the last hour writing (editing) my previous post, only to lose all the text I wrote when I pressed "Submit Reply"...

I'm going for a cigarette to think about whether I can manage to write all of that again!!!
  #10  
27-03-2013, 19:50
emmanux
Junior Member
Zabbix Certified Specialist
 
Join Date: Mar 2013
Location: Argentine
Posts: 19
Default

I finally moved to perf_counter["\238(_Total)\6"]; the results are easier for Microsoft admins to understand.

Last edited by emmanux; 08-11-2013 at 20:40.
All times are GMT +2.