Hi all,
We use Zabbix to monitor all of our servers, and the number of items has reached 300,000 (30w). Our back-end database is Oracle. Last week we ran into a TX lock problem: as soon as Oracle held more than one TX lock (counted with 'ora active | grep TX | wc -l'), the whole Zabbix installation stopped working. We tried lengthening the item intervals, which reduced the load from 1400 values per second to 600 values per second, but it made no difference. After analyzing Oracle performance, we found that the TX locks were caused by updates to the 'ids' table in the database. More precisely, the root cause is the way 'eventid' values are allocated under concurrency (in src/libs/zbxdbhigh/db.c). We modified the source code to use an Oracle 'sequence' instead of the traditional method. After a week of observation, Zabbix is working well.
I'd like to share this experience with other Zabbix users.
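
To illustrate why a sequence helps, here is a minimal Python sketch. It is not the actual patch to db.c and it does not talk to Oracle; it only simulates the two allocation styles with threads. The class names, the run() helper, and the assumption that the row lock on the counter stays held while the rest of the transaction runs are all hypothetical, chosen for illustration.

```python
import threading
import time


class RowCounter:
    """Simulates a counter stored in a table row (like the 'ids' table).

    Assumption: the row lock taken by the update is held until the
    surrounding transaction finishes, so allocators queue up behind it.
    """

    def __init__(self):
        self._row_lock = threading.Lock()
        self._nextid = 0

    def allocate(self, txn_work_s):
        with self._row_lock:          # stands in for the TX row lock
            self._nextid += 1
            new_id = self._nextid
            time.sleep(txn_work_s)    # rest of the transaction, lock still held
        return new_id


class SequenceCounter:
    """Simulates a sequence: the value is handed out immediately and
    nothing stays locked while the transaction does its other work."""

    def __init__(self):
        self._lock = threading.Lock()  # only protects the increment itself
        self._nextid = 0

    def allocate(self, txn_work_s):
        with self._lock:
            self._nextid += 1
            new_id = self._nextid
        time.sleep(txn_work_s)         # transaction work runs without the lock
        return new_id


def run(counter, workers=8, txn_work_s=0.01):
    """Start `workers` threads that each allocate one id.

    Returns the sorted ids and the elapsed wall-clock time in seconds.
    """
    ids = []
    ids_lock = threading.Lock()

    def worker():
        new_id = counter.allocate(txn_work_s)
        with ids_lock:
            ids.append(new_id)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(ids), time.perf_counter() - start


if __name__ == "__main__":
    row_ids, row_elapsed = run(RowCounter())
    seq_ids, seq_elapsed = run(SequenceCounter())
    print("row counter:", row_ids, f"{row_elapsed:.3f}s")
    print("sequence   :", seq_ids, f"{seq_elapsed:.3f}s")
```

Both variants hand out unique ids. In the row-counter variant the workers are serialized behind one lock for the whole simulated transaction, while in the sequence variant only the increment itself is serialized, which is the effect we were after when moving 'eventid' allocation to a sequence.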