**richlv** · 12-08-2010, 20:58

5 to 20 queries per second should not be an issue.
How did you measure the disk usage rate?
How much of that was reads, and how much was writes?
Is that a virtual machine, by any chance?

**makini** · 16-08-2010, 19:00

Similar issues with 1.8.3 after upgrade from 1.8.2

Hi,

After the upgrade to 1.8.3 from 1.8.2 we started experiencing similar spikes in CPU and IO load...

Our setup is larger though:
Number of hosts (monitored/not monitored/templates): 162 (134 / 21 / 7)
Number of items (monitored/disabled/not supported): 4642 (4345 / 292 / 5)
Number of triggers (enabled/disabled) [problem/unknown/ok]: 3187 (2940 / 247) [11 / 19 / 2910]
Number of users (online): 38 (2)
Required server performance, new values per second: 35.81

Database queries per second are around 180 (it's on MySQL); most of those are in the "Sleep" command state. The spikes in IO and CPU usage can be seen in this iostat output:

avg-cpu: %user %nice %system %iowait %steal %idle
38.00 0.50 15.00 28.50 0.00 18.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 10.89 0.00 480.20 0.00 4356.44 9.07 117.69 246.82 2.07 99.31

avg-cpu: %user %nice %system %iowait %steal %idle
4.48 0.00 5.47 64.18 0.00 25.87
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 12.00 0.00 441.00 0.00 3800.00 8.62 131.26 287.71 2.27 100.20

avg-cpu: %user %nice %system %iowait %steal %idle
3.00 0.00 5.00 20.50 0.00 71.50
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 7.00 0.00 190.00 0.00 1552.00 8.17 58.88 172.49 2.33 44.30

avg-cpu: %user %nice %system %iowait %steal %idle
6.00 0.50 5.50 67.00 0.00 21.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 15.00 0.00 478.00 0.00 4456.00 9.32 124.70 246.28 2.10 100.20

avg-cpu: %user %nice %system %iowait %steal %idle
7.46 1.49 6.47 61.69 0.00 22.89
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 31.68 0.99 336.63 7.92 3057.43 9.08 94.00 351.18 2.65 89.60

The 1.8.2 (release) version did not cause such load spikes on the database...
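To put the iostat numbers above in perspective, here is a rough sketch that works out the average write size and write throughput for each sample. The values are copied from the sda lines above; the 512-byte sector size is how Linux iostat reports rsec/s and wsec/s, and the 95% threshold for calling a sample saturated is an arbitrary assumption, not something from the post.

```python
# Samples copied from the iostat output above (device sda).
# Each tuple: (w/s, wsec/s, avgqu-sz, await in ms, %util)
samples = [
    (480.20, 4356.44, 117.69, 246.82, 99.31),
    (441.00, 3800.00, 131.26, 287.71, 100.20),
    (190.00, 1552.00, 58.88, 172.49, 44.30),
    (478.00, 4456.00, 124.70, 246.28, 100.20),
    (336.63, 3057.43, 94.00, 351.18, 89.60),
]

SECTOR_BYTES = 512          # iostat sectors are 512 bytes
SATURATED_UTIL = 95.0       # assumed threshold, chosen only for illustration


def summarize(sample):
    """Derive write size and throughput from one iostat sample."""
    writes_per_s, wsec_per_s, queue_len, await_ms, util = sample
    return {
        "avg_write_kib": wsec_per_s / writes_per_s * SECTOR_BYTES / 1024,
        "write_mib_per_s": wsec_per_s * SECTOR_BYTES / (1024 * 1024),
        "await_ms": await_ms,
        "saturated": util >= SATURATED_UTIL,
    }


results = [summarize(s) for s in samples]

for r in results:
    print(
        f"{r['avg_write_kib']:.2f} KiB/write, "
        f"{r['write_mib_per_s']:.2f} MiB/s, "
        f"await {r['await_ms']:.0f} ms, "
        f"saturated={r['saturated']}"
    )
```

Every sample comes out at roughly 4 to 5 KiB per write and a little over 2 MiB/s at most, so the disk is being kept busy by many small writes rather than by bulk throughput.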

**magawake** · 26-08-2010, 15:10

Exact same problem:

LOG: statement: select distinct t.triggerid,t.expression,t.description,t.url,t.comments,t.status,t.value,t.priority,t.type,t.error,f.itemid from triggers t,functions f,items i where i.status not in (3) and i.itemid=f.itemid and t.status=0 and f.triggerid=t.triggerid and f.itemid in (22563,22461,22453,22454,22455,22516,22517,22518,22549,22519,22550,22520,22521,22431,22522)
LOG: statement: select distinct i.itemid,i.key_,h.host,h.port,i.delay,i.description,i.type,h.useip,h.ip,i.history,i.lastvalue,i.prevvalue,i.hostid,i.value_type,i.delta,i.prevorgvalue,i.lastclock,i.units,i.multiplier,i.formula,i.status,i.valuemapid,h.dns,i.trends,i.lastlogsize,i.data_type,i.mtime from hosts h,items i, functions f where h.hostid=i.hostid and h.status=0 and i.status=0 and f.function in ('nodata','date','dayofweek','time','now') and i.itemid=f.itemid and (h.maintenance_status=0 or h.maintenance_type=0) and h.hostid between 000000000000000 and 099999999999999
LOG: statement: begin;
LOG: statement: select dh.dhostid,dh.status,dh.lastup,dh.lastdown from dhosts dh,dservices ds where ds.dhostid=dh.dhostid and dh.druleid=2 and ds.ip='10.11.1.170' order by dh.dhostid
LOG: statement: commit;
LOG: statement: select t.httptestid,t.name,t.applicationid,t.nextcheck,t.status,t.delay,t.macros,t.agent,t.authentication,t.http_user,t.http_password from httptest t,applications a,hosts h where t.applicationid=a.applicationid and a.hostid=h.hostid and t.nextcheck<=1280769151 and mod(t.httptestid,1)=0 and t.status=0 and h.status=0 and (h.maintenance_status=0 or h.maintenance_type=0) and t.httptestid between 000000000000000 and 099999999999999
LOG: statement: select escalationid,actionid,triggerid,eventid,r_eventid,esc_step,status from escalations where status in (0,1) and nextcheck<=1280769151 and escalationid between 000000000000000 and 099999999999999
LOG: statement: select count(*),min(t.nextcheck) from httptest t,applications a,hosts h where t.applicationid=a.applicationid and a.hostid=h.hostid and mod(t.httptestid,1)=0 and t.status=0 and h.status=0 and (h.maintenance_status=0 or h.maintenance_type=0) and t.httptestid between 000000000000000 and 099999999999999

This query is knocking my DB server down.

The DB server has 32G of memory and 8 cores. I have about 4 of these selects running.
Everything is monitored via SNMP, 400 hosts.

I think we can optimize this query by indexing the proper fields...
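Before adding indexes, it helps to check whether the planner would actually use one. The sketch below only illustrates the idea and makes several assumptions: it uses an in-memory SQLite database as a stand-in (the server in this post is PostgreSQL, where `EXPLAIN` plays the same role), the `functions` table is a simplified guess at the schema, and the index name is made up. It does not show that Zabbix is missing this index.

```python
import sqlite3

# In-memory SQLite stand-in; the real server in this thread is PostgreSQL.
conn = sqlite3.connect(":memory:")

# Simplified, assumed version of the "functions" table.
conn.execute(
    'CREATE TABLE functions ('
    'functionid INTEGER PRIMARY KEY, itemid INTEGER, '
    'triggerid INTEGER, "function" TEXT)'
)
conn.executemany(
    'INSERT INTO functions (itemid, triggerid, "function") VALUES (?, ?, ?)',
    [(i, i, "nodata" if i % 50 == 0 else "last") for i in range(5000)],
)

# Same filter as in the logged statement, reduced to a single table.
QUERY = (
    'SELECT itemid FROM functions WHERE "function" IN '
    "('nodata','date','dayofweek','time','now')"
)


def plan(connection, sql):
    """Return the planner's description of how it would run the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


plan_before = plan(conn, QUERY)

# Hypothetical index name, for illustration only.
conn.execute('CREATE INDEX functions_function_idx ON functions ("function")')

plan_after = plan(conn, QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

On PostgreSQL the equivalent check would be to run `EXPLAIN` on the logged statement and look for sequential scans on the large tables before deciding which columns to index.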

**magawake** · 26-08-2010, 16:08

I fixed the problem by tuning PostgreSQL.
I followed this webpage, http://wiki.postgresql.org/wiki/Tuni...tgreSQL_Server

Things are much faster now...
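The post does not say which settings were changed. As a rough sketch only, the snippet below applies two commonly quoted rules of thumb from that kind of tuning guide (shared_buffers at about 1/4 of RAM, effective_cache_size at about 1/2 of RAM as a conservative value) to the 32G server mentioned earlier. The function name is made up, and the results are starting points to test, not recommended final values.

```python
def suggest_starting_values(total_ram_gb):
    """Rule-of-thumb starting points; validate against the real workload."""
    return {
        # About a quarter of RAM for PostgreSQL's own buffer cache.
        "shared_buffers": total_ram_gb // 4,
        # Conservative planner hint: about half of RAM available for caching.
        "effective_cache_size": total_ram_gb // 2,
    }


suggested = suggest_starting_values(32)  # the DB server here has 32G

for name, gigabytes in suggested.items():
    # Printed in postgresql.conf style for convenience.
    print(f"{name} = {gigabytes}GB")
```

Other settings may matter as much or more depending on the PostgreSQL version and workload, so change one thing at a time and compare before and after.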

Zabbix generating high CPU/database load
