
**kloczek** · 14-05-2018, 17:30

Originally posted by mellis

It is going to be very hard to sell a full-scale rollout of a nationwide Zabbix if we have this issue with 3.4 / 4.0.

No, it uses the DB backend even less than 3.2.
Do you have any basic DB backend monitoring, like stats for select, insert, begin, end, and delete queries? IO stats?
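
As a rough illustration of the kind of basic query monitoring asked about here: MySQL exposes cumulative counters such as Com_select, Com_insert, Com_delete, Com_begin and Com_commit through SHOW GLOBAL STATUS, and the difference between two snapshots gives a per-second rate. The sketch below assumes the snapshots have already been collected into dictionaries; the function name `query_rates` and the sample numbers are made up for illustration and are not taken from this server.

```python
def query_rates(before, after, seconds):
    """Per-second rate for every counter present in both snapshots.

    `before` and `after` map status-variable names (for example
    "Com_select") to their cumulative values; `seconds` is the time
    between the two snapshots.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return {
        name: (after[name] - before[name]) / seconds
        for name in after
        if name in before
    }


# Hypothetical snapshots taken 60 seconds apart (invented values).
first = {"Com_select": 100, "Com_insert": 50, "Com_delete": 10}
second = {"Com_select": 400, "Com_insert": 650, "Com_delete": 70}
rates = query_rates(first, second, 60)
```

Comparing these rates under 3.2 and under 3.4 would show whether the upgrade really changed the query mix or only the IO cost per query.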

**vso** · 14-05-2018, 19:19

Please provide more information. Is the housekeeper busy?

**mellis** · 14-05-2018, 22:07

Sorry for the slow response. I had already reverted to 3.2, so I grabbed some information under 3.2, then did the upgrade again and grabbed it again. I also have some screenshots of the housekeeping. All the raw information is in the attached Word doc. I will try to give you a good summary here.

1) The hardware: these are VMs with 4 CPUs, 24GB RAM, ~220GB disk. One server has the database; the other has the web GUI and the Zabbix server.
2) The OS: CentOS 7.3, fully updated.
3) The database: MySQL 5.7.22, a little over 30GB.

4) my.cnf
show_compatibility_56 = ON
performance_schema
innodb_buffer_pool_size = 2G
innodb_data_home_dir=/home/mysql
innodb_file_per_table = 1
innodb_buffer_pool_instances = 14
innodb_buffer_pool_size = 18G
innodb_page_cleaners=14
tmp_table_size = 36M
max_heap_table_size = 36M
join_buffer_size = 1G
sort_buffer_size = 4M
read_rnd_buffer_size = 4M
query_cache_size = 0
query_cache_type = 0
query_cache_limit = 2M
max_connections = 376
wait_timeout = 14400
interactive_timeout = 14400
datadir=/home/mysql
socket=/var/lib/mysql/mysql.sock
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
#logging stuff
log-error=/var/log/mysqld.log
slow_query_log = 1
slow_query_log_file = /var/log/slow_quiery.log
pid-file=/var/run/mysqld/mysqld.
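
One thing visible in the listing above is that innodb_buffer_pool_size is set twice (2G, then 18G). When an option is repeated in a MySQL option file, the last occurrence takes precedence, so 18G should be the effective value, but repeated settings are easy to miss. A minimal sketch for spotting them follows; the helper name `find_duplicate_options` is hypothetical, and it deliberately ignores section headers and trailing comments, so it is only a rough check.

```python
def find_duplicate_options(text):
    """Return {option: [values, ...]} for options that appear more than once."""
    seen = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and [section] headers.
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, _, value = line.partition("=")
        # MySQL treats "-" and "_" in option names as interchangeable.
        key = key.strip().replace("-", "_")
        seen.setdefault(key, []).append(value.strip())
    return {name: values for name, values in seen.items() if len(values) > 1}


sample = """\
innodb_buffer_pool_size = 2G
innodb_file_per_table = 1
innodb_buffer_pool_size = 18G
"""
duplicates = find_duplicate_options(sample)
```

On the three sample lines, `duplicates` contains only innodb_buffer_pool_size with both values in file order, the last one being the value the server would use.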

When I am running 3.2, the disk IO is in the 20 to 30% range; after the 3.4 upgrade it is over 80%.

The housekeeping looks the same: it goes to 100% for 30 minutes every hour. The graph is in the doc.

In the mysql.log file I see lots of [Note] Aborted connection 30399 to db: 'zabbix' user: 'zabbix' host: 'localhost' (Got an error reading communication packets) and [Note] InnoDB: page_cleaner: 1000ms intended loop took 6170ms. The settings might not be optimal. (flushed=437 and evicted=0, during the time.)
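
To see how these page_cleaner warnings change over time, the numbers can be pulled out of each line. This is a minimal sketch that assumes the message format is exactly as quoted above; `parse_page_cleaner` is a hypothetical helper, not part of any MySQL tooling.

```python
import re

# Matches the InnoDB page_cleaner warning as quoted in this post.
PAGE_CLEANER = re.compile(
    r"page_cleaner: (\d+)ms intended loop took (\d+)ms"
    r".*?flushed=(\d+) and evicted=(\d+)"
)


def parse_page_cleaner(line):
    """Return the timings and page counts from one warning, or None."""
    match = PAGE_CLEANER.search(line)
    if match is None:
        return None
    intended, took, flushed, evicted = map(int, match.groups())
    return {
        "intended_ms": intended,
        "took_ms": took,
        "overrun_ms": took - intended,
        "flushed": flushed,
        "evicted": evicted,
    }


line = (
    "[Note] InnoDB: page_cleaner: 1000ms intended loop took 6170ms. "
    "The settings might not be optimal. "
    "(flushed=437 and evicted=0, during the time.)"
)
info = parse_page_cleaner(line)
```

Collecting `overrun_ms` for every warning in the log would show whether the slow flush loops line up with the hourly housekeeper runs.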

I have chased these errors for months but can never seem to get them to clear out.

I have also used the mysqltuner.pl script and have seen some improvement, but can never seem to completely solve the problem.

Any help would be great.

Attached Files

Datbase Detail.docx (692.7 KB, 35 views)

**vso** · 14-05-2018, 23:00

Which exact versions of 3.2 and 3.4 do you use? It looks like your events table might be too big.

**mellis** · 14-05-2018, 23:05

It was 3.2.11 to 3.4.8. I did the data purge described at http://www.michaelfoster82.co.uk/zab...lete-old-data/ and trimmed the tables back to 7 days of history and 90 days of trends. That helped for about 18 hours.

**mellis** · 16-05-2018, 14:35

I am now seeing disk IO on the database server over 100%, and the housekeeping goes to 100% for 30 minutes once every hour. I would not have thought there was much housekeeping to do, since I did a data purge and removed any history over 7 days old. I am using atop to get the disk usage.
In the mysql.log I am seeing a lot of messages like "InnoDB page_cleaner: 1000ms intended loop took 5000ms" (this number bounces around between 4000 and 6000).

I have increased innodb_buffer_pool_instances and innodb_page_cleaners to 14 each.

Again, this is a 4 vCPU box with 24GB RAM; it is using 21GB at this time.

**vso** · 16-05-2018, 14:37

Can you please attach the log? It will show what the housekeeper has cleaned.

**mellis** · 16-05-2018, 14:50

It would only let me upload a small one. What should I look for? I will scan all of them and send the parts needed.

Attached Files

zabbix_server_log.zip (65.7 KB, 34 views)

**vso** · 16-05-2018, 14:55

Something like:
3314:20180516:155439.059 housekeeper [deleted 0 hist/trends, 3 items/triggers, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 0.456314 sec, idle for 1 hour(s)]
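
For scanning several large log files for lines like this one, a small parser may help. This is only a sketch that assumes the housekeeper summary has exactly the shape shown above; `parse_housekeeper` is a hypothetical name.

```python
import re

# Captures the comma-separated "deleted" list and the elapsed seconds.
HOUSEKEEPER = re.compile(r"housekeeper \[deleted (.+?) in ([\d.]+) sec")


def parse_housekeeper(line):
    """Return (counts, seconds) for a housekeeper summary line, or None."""
    match = HOUSEKEEPER.search(line)
    if match is None:
        return None
    counts = {}
    for part in match.group(1).split(", "):
        # Each part looks like "3 items/triggers" or "0 audit items".
        number, name = part.split(" ", 1)
        counts[name] = int(number)
    return counts, float(match.group(2))


line = (
    "3314:20180516:155439.059 housekeeper [deleted 0 hist/trends, "
    "3 items/triggers, 0 events, 0 problems, 0 sessions, 0 alarms, "
    "0 audit items in 0.456314 sec, idle for 1 hour(s)]"
)
counts, seconds = parse_housekeeper(line)
```

Large counts for events or hist/trends, or a run time in minutes rather than fractions of a second, would point at the housekeeper as the source of the IO.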

**mellis** · 16-05-2018, 15:15

Did not see it. I set the debug level up to 5 and the log size to 8. I will get back as soon as I find something.

**mellis** · 16-05-2018, 16:04

Attached please find additional information. I never found a line in the log like the one above. What I did find is detailed in the doc.

Attached Files

Notes from 5-16.docx (289.7 KB, 31 views)

**mellis** · 16-05-2018, 18:11

Should I look at doing some partitioning? 100% of the hosts are now reporting that they are unreachable, but I know they are reachable; otherwise I would be having bigger problems than this.


Disk IO usage with 3.4 over 80%
