Hey
First of all I need to say that Zabbix is an awesome product!
We are having some huge problems with Zabbix 1.8.10 and PostgreSQL 9.1. Zabbix stops processing values, which means that Zabbix stops working several times a week! The queue is all red. The only fix so far is to restart PostgreSQL.
Here are some logs and output:
select * from pg_locks; (253 of those entries)
relation | 320181778 | 643956949 | | | | | | | | 49/22354 | 23245 | AccessShareLock | t
relation | 320181778 | 643957121 | | | | | | | | 56/23655 | 23257 | AccessShareLock | t
relation | 320181778 | 320181787 | | | | | | | | 36/22046 | 23228 | AccessShareLock | t
select * from pg_stat_activity; (78 of those entries)
320181778 | zabbix | 23173 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.466681+02 | | 2012-04-06 10:01:14.629902+02 | f | <IDLE>
320181778 | zabbix | 23154 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.461894+02 | | 2012-04-06 10:00:49.324219+02 | f | <IDLE>
320181778 | zabbix | 23156 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.462143+02 | | 2012-04-06 10:01:08.175567+02 | f | <IDLE>
320181778 | zabbix | 23157 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.462345+02 | | 2012-04-06 10:00:40.338098+02 | f | <IDLE>
320181778 | zabbix | 23159 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.462584+02 | | 2012-04-06 10:00:23.777723+02 | f | <IDLE>
320181778 | zabbix | 23161 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.46294+02 | | 2012-04-06 10:00:54.805044+02 | f | <IDLE>
320181778 | zabbix | 23163 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.463347+02 | | 2012-04-06 10:01:15.068805+02 | f | <IDLE>
320181778 | zabbix | 23165 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.46519+02 | | 2012-04-06 10:00:56.470469+02 | f | <IDLE>
320181778 | zabbix | 23171 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.466296+02 | | 2012-04-06 10:00:52.792051+02 | f | <IDLE>
320181778 | zabbix | 23178 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.467716+02 | | 2012-04-06 10:01:15.11217+02 | f | <IDLE>
320181778 | zabbix | 23175 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.467005+02 | | 2012-04-06 10:00:34.017369+02 | f | <IDLE>
320181778 | zabbix | 23177 | 29888 | zabbix | | | | -1 | 2012-04-04 11:49:49.467951+02 | | 2012-04-06 10:01:17.872265+02 | f | <IDLE>
zabbix_server.conf
grep -v ^# /etc/zabbix/zabbix_server.conf | sed '/^$/d'
LogFile=/var/log/zabbix/zabbix_server.log
LogFileSize=50
DebugLevel=3
DBName=xxxxxxx
DBUser=xxxxxxx
DBPassword=xxxxxxx
StartPollers=20
StartPollersUnreachable=3
StartTrappers=30
StartPingers=4
HousekeepingFrequency=24
MaxHousekeeperDelete=0
DisableHousekeeping=0
CacheSize=512M
StartDBSyncers=12
HistoryCacheSize=32M
TrendCacheSize=16M
Timeout=30
AlertScriptsPath=/etc/zabbix/
ExternalScripts=/etc/zabbix/externalscripts
/var/log/zabbix/zabbix.log says "query failed"
postgresql.conf
max_connections = 350
shared_buffers = 2GB
work_mem = 32MB
maintenance_work_mem = 384MB
synchronous_commit = off
wal_buffers = 1MB
checkpoint_segments = 25
effective_cache_size = 16GB
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_min_duration_statement = 2000ms
postgres-debug.logs: Full of locks!
LOG: process 8702 still waiting for ShareLock on transaction 9239968 after 1000.094 ms
STATEMENT: update ids set nextid=nextid+2 where nodeid=0 and table_name='events' and field_name='eventid'
LOG: process 8705 still waiting for ExclusiveLock on tuple (1,52) of relation 320182344 of database 320181778 after 1000.034 ms
STATEMENT: update ids set nextid=nextid+2 where nodeid=0 and table_name='events' and field_name='eventid'
LOG: process 8703 still waiting for ExclusiveLock on tuple (1,52) of relation 320182344 of database 320181778 after 1000.119 ms
STATEMENT: update ids set nextid=nextid+1 where nodeid=0 and table_name='events' and field_name='eventid'
LOG: process 8701 still waiting for ExclusiveLock on tuple (1,52) of relation 320182344 of database 320181778 after 1000.047 ms
STATEMENT: update ids set nextid=nextid+2 where nodeid=0 and table_name='events' and field_name='eventid'
LOG: process 8710 still waiting for ExclusiveLock on tuple (1,52) of relation 320182344 of database 320181778 after 1000.066 ms
STATEMENT: update ids set nextid=nextid+2 where nodeid=0 and table_name='events' and field_name='eventid'
LOG: process 8708 still waiting for ExclusiveLock on tuple (1,52) of relation 320182344 of database 320181778 after 1000.072 ms
STATEMENT: update ids set nextid=nextid+4 where nodeid=0 and table_name='events' and field_name='eventid'
LOG: process 8704 still waiting for ExclusiveLock on tuple (1,52) of relation 320182344 of database 320181778 after 1000.068 ms
STATEMENT: update ids set nextid=nextid+2 where nodeid=0 and table_name='events' and field_name='eventid'
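For anyone who wants to check their own logs for the same pattern, here is a rough sketch of how the waits could be tallied per lock. It assumes the log lines look exactly like the ones above; the function name summarize_lock_waits and the sample list are made up for illustration and are not part of Zabbix or PostgreSQL.

```python
import re
from collections import Counter

# Matches PostgreSQL "still waiting" lines in the format shown in the excerpt above.
# Assumption: other log_line_prefix settings may need a different pattern.
WAIT_RE = re.compile(
    r"LOG:\s+process (?P<pid>\d+) still waiting for (?P<mode>\w+) on "
    r"(?P<target>.+?) after (?P<ms>[\d.]+) ms"
)


def summarize_lock_waits(lines):
    """Count waiting processes per (lock mode, lock target)."""
    waits = Counter()
    for line in lines:
        match = WAIT_RE.search(line)
        if match:  # STATEMENT lines and anything else are skipped
            waits[(match.group("mode"), match.group("target"))] += 1
    return waits


# A few lines copied from the log excerpt above.
sample = [
    "LOG: process 8702 still waiting for ShareLock on transaction 9239968 after 1000.094 ms",
    "STATEMENT: update ids set nextid=nextid+2 where nodeid=0 and table_name='events' and field_name='eventid'",
    "LOG: process 8705 still waiting for ExclusiveLock on tuple (1,52) of relation 320182344 of database 320181778 after 1000.034 ms",
    "LOG: process 8703 still waiting for ExclusiveLock on tuple (1,52) of relation 320182344 of database 320181778 after 1000.119 ms",
]

for (mode, target), count in summarize_lock_waits(sample).most_common():
    print(f"{count} waiting for {mode} on {target}")
```

Run over the full excerpt above, this would report six processes waiting for an ExclusiveLock on tuple (1,52) and one waiting for a ShareLock on transaction 9239968, so they all appear to be queued behind the same row in the ids table.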
Any suggestions? I can provide the rest of the logs and more info if needed.
Thanks for all help!
Best regards
Hahnuim
CASE SOLVED!