Hi,
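our Zabbix server ran into database deadlocks several times today. After each deadlock, zabbix_server appears to keep sending statements in the already aborted transaction instead of rolling it back, which looks like a bug in the transaction handling.
The messages from the PostgreSQL log: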
Nov 12 15:46:09 zs postgres[16928]: [2-1] ERROR: deadlock detected
Nov 12 15:46:09 zs postgres[16928]: [2-2] DETAIL: Process 16928 waits for ShareLock on transaction 278998374; blocked by process 31052.
Nov 12 15:46:09 zs postgres[16928]: [2-3] Process 31052 waits for ShareLock on transaction 278998391; blocked by process 16928.
Nov 12 15:46:09 zs postgres[16928]: [2-4] Process 16928: update triggers set value=0,lastchange=1289570142,error='' where triggerid=14881
Nov 12 15:46:09 zs postgres[16928]: [2-5] Process 31052: update triggers set value=1,lastchange=1289573165,error='' where triggerid=14596
Nov 12 15:46:09 zs postgres[16928]: [2-6] HINT: See server log for query details.
Nov 12 15:46:09 zs postgres[16928]: [2-7] STATEMENT: update triggers set value=0,lastchange=1289570142,error='' where triggerid=14881
Nov 12 15:46:09 zs postgres[16928]: [3-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [3-2] STATEMENT: select description,priority,comments,url,type from triggers where triggerid=14881
Nov 12 15:46:09 zs postgres[16928]: [4-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [4-2] STATEMENT: select eventid,value from events where source=0 and object=0 and objectid=14881 and value in (0,1) order by object desc,objectid desc,eventid desc limit 1
Nov 12 15:46:09 zs postgres[16928]: [5-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [5-2] STATEMENT: insert into events (eventid,source,object,objectid,clock,value) values (609568,0,0,14881,1289570142,0)
Nov 12 15:46:09 zs postgres[16928]: [6-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [6-2] STATEMENT: select serviceid from services where triggerid=14881
Nov 12 15:46:09 zs postgres[16928]: [7-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [7-2] STATEMENT: select distinct i.itemid,i.key_,h.host,h.port,i.delay,i.description,i.type,h.useip,h.ip,i.history,i.lastvalue,i.prevvalue,i.hostid,i.value_type,i.delta,i.prevorgvalue,i.lastclock,i.units,i.multiplier,i.formula,i.status,i.valuemapid,h.dns,i.trends,i.lastlogsize,i.data_type,i.mtime,f.function,f.parameter from hosts h,items i,functions f where i.hostid=h.hostid and i.itemid=f.itemid and f.functionid=18625
Nov 12 15:46:09 zs postgres[16928]: [8-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [8-2] STATEMENT: select t.triggerid, t.value from trigger_depends d,triggers t where d.triggerid_down=14882 and d.triggerid_up=t.triggerid
Nov 12 15:46:09 zs postgres[16928]: [9-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [9-2] STATEMENT: update triggers set value=2,lastchange=1289570113,error='Could not obtain function and item for functionid: 18625' where triggerid=14882
Nov 12 15:46:09 zs postgres[16928]: [10-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [10-2] STATEMENT: select description,priority,comments,url,type from triggers where triggerid=14882
Nov 12 15:46:09 zs postgres[16928]: [11-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [11-2] STATEMENT: insert into events (eventid,source,object,objectid,clock,value) values (609569,0,0,14882,1289570113,2)
Nov 12 15:46:09 zs postgres[16928]: [12-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [12-2] STATEMENT: select serviceid from services where triggerid=14882
Nov 12 15:46:09 zs postgres[16928]: [13-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [13-2] STATEMENT: select distinct i.itemid,i.key_,h.host,h.port,i.delay,i.description,i.type,h.useip,h.ip,i.history,i.lastvalue,i.prevvalue,i.hostid,i.value_type,i.delta,i.prevorgvalue,i.lastclock,i.units,i.multiplier,i.formula,i.status,i.valuemapid,h.dns,i.trends,i.lastlogsize,i.data_type,i.mtime,f.function,f.parameter from hosts h,items i,functions f where i.hostid=h.hostid and i.itemid=f.itemid and f.functionid=18586
Nov 12 15:46:09 zs postgres[16928]: [14-1] ERROR: current transaction is aborted, commands ignored until end of transaction block
Nov 12 15:46:09 zs postgres[16928]: [14-2] STATEMENT: update triggers set error='Could not obtain function and item for functionid: 18586' where triggerid=14884
[ several hundred of these messages per second ]
I "fixed" it by selectively killing the database connection processes from zabbix_server to Postgres until everything worked again. Meanwhile the load on the system rose above 50.
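As far as I understand PostgreSQL, once a statement inside a transaction block fails (here with the deadlock error), every further statement is rejected with "current transaction is aborted" until the client sends a ROLLBACK. So I would have expected the server to roll back and retry after the deadlock, roughly like the sketch below. This is only an illustration of that pattern and not actual Zabbix code: `run_with_retry`, `DeadlockDetected` and `FakeConnection` are names I made up, and the fake connection merely simulates the aborted state so the example runs without a database.

```python
class DeadlockDetected(Exception):
    """Stand-in for the driver error a client sees on SQLSTATE 40P01."""


def run_with_retry(conn, statements, max_attempts=3):
    """Run all statements in one transaction; roll back and retry on deadlock.

    `conn` is any object offering execute(sql), commit() and rollback()
    (a hypothetical interface, not the Zabbix DB layer).
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            for sql in statements:
                conn.execute(sql)
            conn.commit()
            return attempt
        except DeadlockDetected:
            # The transaction is aborted now. Without this rollback every
            # following statement fails, as seen in the log above.
            conn.rollback()
    raise RuntimeError("giving up after %d deadlocks" % max_attempts)


class FakeConnection:
    """In-memory simulation of PostgreSQL's aborted-transaction state."""

    def __init__(self, deadlocks_to_inject):
        self.deadlocks_left = deadlocks_to_inject
        self.aborted = False
        self.pending = []    # statements of the open transaction
        self.committed = []  # statements that were actually committed

    def execute(self, sql):
        if self.aborted:
            raise RuntimeError("current transaction is aborted")
        if self.deadlocks_left > 0 and sql.startswith("update"):
            # Simulate being chosen as the deadlock victim.
            self.deadlocks_left -= 1
            self.aborted = True
            raise DeadlockDetected(sql)
        self.pending.append(sql)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []

    def rollback(self):
        self.pending = []
        self.aborted = False
```

With one injected deadlock the first attempt is rolled back and the second one commits both statements. If the rollback is left out, the second attempt fails immediately with the "current transaction is aborted" error, which matches what the log shows.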
Maybe the following error (also several times today) has something to do with it:
Nov 12 16:07:36 zs postgres[4817]: [2-1] ERROR: duplicate key value violates unique constraint "trends_pkey"
Nov 12 16:07:36 zs postgres[4817]: [2-2] STATEMENT: insert into trends (itemid,clock,num,value_min,value_avg,value_max) values (23283,1289574000,1,1.183105,1.183105,1.183105);
Nov 12 16:07:36 zs postgres[4817]: [2-3]
Nov 12 16:07:36 zs postgres[4817]: [3-1] WARNING: there is no transaction in progress
Is this enough information to debug the issue, or do you need more?