Postgres For large zabbix

  • pietro54
    Senior Member
    • Feb 2011
    • 112

    #1

    Postgres For large zabbix

    Hello,
    I have a big Zabbix database: over 170 GB, over 300 NVPS. It's working "not badly", but it could work better.
    I believe the bottleneck of the system is the database.

    I'm looking for some tuning tips.
    Of course I have read a lot about how to tune Zabbix and Postgres, and about what the hardware should be.

    So, my configuration:
    Hardware:
    2x X5690 @ 3.47 GHz
    24 GB RAM
    4x 240 GB SSDs in RAID 10
    But the RAID controller is only 1.5 Gbps. <- I plan to change this.

    Software:
    Postgres 9.1
    Zabbix 2.0.4
    Redhat 6.3

    zabbix_server.conf
    NodeID=1
    LogFile=/tmp/zabbix_server.log
    LogFileSize=1000
    DebugLevel=3
    DBHost=localhost
    DBName=zabbix
    DBUser=zabbix
    DBPassword=5r4e3w2q
    SenderFrequency=5
    CacheSize=32M
    HistoryCacheSize=256M
    TrendCacheSize=16M
    HistoryTextCacheSize=64M
    Timeout=15
    FpingLocation=/usr/local/sbin/fping
    LogSlowQueries=10000
    StartPollers=30
    StartPingers=15
    StartPollersUnreachable=30
    StartIPMIPollers=10
    StartTrappers=15
    StartDBSyncers=10
    StartHTTPPollers=15
    postgresql.conf
    max_connections = 300 # (change requires restart)
    shared_buffers = 12288MB # min 128kB
    work_mem = 64MB # min 64kB
    maintenance_work_mem = 216MB # min 1MB
    wal_level = hot_standby # minimal, archive, or hot_standby
    wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers
    checkpoint_segments = 64 # in logfile segments, min 1, 16MB each
    checkpoint_timeout = 30min # range 30s-1h
    checkpoint_completion_target = 0.8 # checkpoint target duration, 0.0 - 1.0
    max_wal_senders = 5 # max number of walsender processes
    cpu_tuple_cost = 0.0030 # same scale as above
    cpu_index_tuple_cost = 0.0010 # same scale as above
    cpu_operator_cost = 0.0005 # same scale as above
    effective_cache_size = 12384MB
    log_destination = 'stderr' # Valid values are combinations of
    logging_collector = on # Enable capturing of stderr and csvlog
    log_directory = 'pg_log' # directory where log files are written,
    log_filename = 'postgresql-%a.log' # log file name pattern,
    log_truncate_on_rotation = on # If on, an existing log file with the
    log_rotation_age = 1d # Automatic rotation of logfiles will
    log_rotation_size = 0 # Automatic rotation of logfiles will
    log_min_error_statement = warning # values in order of decreasing detail:
    log_min_duration_statement = 1000 # -1 is disabled, 0 logs all statements
    log_error_verbosity = default # terse, default, or verbose messages
    log_line_prefix = '%m %u %e' # special values:
    autovacuum = on # Enable autovacuum subprocess? 'on'
    autovacuum_max_workers = 5 # max number of autovacuum subprocesses
    autovacuum_naptime = 1min # time between autovacuum runs
    datestyle = 'iso, mdy'
    lc_messages = 'en_US.UTF-8' # locale for system error message
    lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
    lc_numeric = 'en_US.UTF-8' # locale for number formatting
    lc_time = 'en_US.UTF-8' # locale for time formatting
    default_text_search_config = 'pg_catalog.english'

    1. I'm afraid autovacuum isn't working at all.
    2. I'm running VACUUM FULL every Sunday. This helps for a while. The database gets lighter by up to 15 GB (175 GB before and 160 GB after the vacuum).
    3. I want to keep a long history. Right now I'm keeping almost every item for 90 days. I plan to keep history for 190 days and trends for 365 days.
    4. I know I should set up partitioning, so we can skip this tip.
    5. 300 NVPS is a big value, but the plan is ~500 NVPS.
    6. When PostgreSQL 9.3 becomes a stable release, I want to migrate my database to it.
    7. Is there something other than VACUUM ANALYZE in Postgres, like ANALYZE in Oracle?

    Please advise me: is this the correct path to tuning my Zabbix, and is there something I can do now?

    Regards,
    Pietro
    Last edited by pietro54; 01-09-2013, 13:08.
  • f.koch
    Member
    Zabbix Certified Specialist
    • Feb 2010
    • 85

    #2
    Hi,

    1. I'm afraid autovacuum isn't working at all.
    hm, why do you think it's not working?
    Make sure the statistics collector is enabled (stats_start_collector and stats_row_level on older versions; since PostgreSQL 8.3 these were replaced by track_counts, which autovacuum requires).
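
    To verify autovacuum activity you can query the statistics views. A sketch (assumes a psql session on the zabbix database; the table names are the standard Zabbix ones):

    Code:
    SELECT relname, last_autovacuum, autovacuum_count
    FROM pg_stat_user_tables
    WHERE relname IN ('history', 'history_uint', 'trends', 'trends_uint')
    ORDER BY last_autovacuum DESC NULLS LAST;

    If last_autovacuum stays NULL on the busy history tables for days, autovacuum is most likely not reaching them.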


    5. 300 NVPS is a big value, but the plan is ~500 NVPS.
    do you have problems at 300 NVPS?

    In general, which problems are you trying to solve?

    regards


    • BDiE8VNy
      Senior Member
      • Apr 2010
      • 680

      #3
      Hi!

      Just want to write down some things that I did when I was facing serious performance (mostly I/O-related) issues.
      Everything is written from memory and is certainly not everything that I did.

      First things that I would suggest:
      • Get familiar with Zabbix internals (processes, caches, value processing, etc.)
      • Don't touch Zabbix history caches unless you have to - lots does not help lots
      • Don't increase StartDBSyncers unless you have to and know how history synchronization works
      • If you use Zabbix proxies, and there are only a few reasons not to, set StartTrappers on the server to more than the number of proxies
      • Be sure you know how to capture bad performing database queries and how to analyze them
      • Disable housekeeper and partition history and trends tables
      • If using Zabbix prior to 2.1.3, make sure every foreign key child column is indexed
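
      For the partitioning point: on PostgreSQL 9.x this is usually done with table inheritance. A minimal sketch for one monthly child of the history table (the table and column names are the standard Zabbix ones; a real setup also needs a trigger or cron job to create partitions and route inserts, and constraint_exclusion enabled):

      Code:
      CREATE TABLE history_p2013_09 (
          CHECK (clock >= extract(epoch FROM timestamp '2013-09-01')::integer
             AND clock <  extract(epoch FROM timestamp '2013-10-01')::integer)
      ) INHERITS (history);
      CREATE INDEX history_p2013_09_1 ON history_p2013_09 (itemid, clock);

      Dropping an old month is then a cheap DROP TABLE instead of a huge DELETE by the housekeeper.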


      The following settings are for informational purposes only and are not intended to be reasonable or suitable for any other installation!

      Code:
      shared_buffers = 2GB
      temp_buffers = 32MB
      work_mem = 64MB
      maintenance_work_mem = 512MB
      max_stack_depth = 9MB
      shared_buffers is probably too high. I definitely noticed better frontend responsiveness up to 1 GB.
      temp_buffers, work_mem and maintenance_work_mem are set by feel.
      max_stack_depth is set to the size limit enforced by the kernel.

      Code:
      synchronous_commit = off
      checkpoint_segments = 128
      checkpoint_timeout = 60min
      checkpoint_completion_target = 0.9
      synchronous_commit is disabled since Zabbix data is (at this time) not critical in terms of currentness in case of a DB crash.
      checkpoint_segments is sized to ensure that under normal load checkpoints are triggered by timeout only.
      checkpoint_timeout is set to its maximum. That should minimize the number of dirty buffer writes for frequently updated data.
      checkpoint_completion_target is set to spread dirty buffer writes out so that they complete just before the next checkpoint starts.

      Code:
      seq_page_cost = 4.2
      effective_cache_size = 20GB
      autovacuum_naptime = 15s
      autovacuum_vacuum_cost_delay = 0
      seq_page_cost had to be adjusted to avoid sequential scans for a particular query where an index scan is the better choice.
      effective_cache_size gives the database an idea of how many pages might be located in the filesystem buffer cache.
      autovacuum_naptime is balanced according to total vacuum time.
      autovacuum_vacuum_cost_delay is set to 0 to be as aggressive as possible. The history tables have a per-table autovacuum_vacuum_cost_delay of 20.
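
      The per-table autovacuum_vacuum_cost_delay mentioned above is set with a storage parameter, e.g. (a sketch, using the history table as an example):

      Code:
      ALTER TABLE history SET (autovacuum_vacuum_cost_delay = 20);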

      Currently the most annoying issue for me: frozen XIDs.
      Anti-wraparound freezing can be a pain on large history tables (in my case up to 100 GB per month) when autovacuum kicks in.
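
      To see which tables are approaching an anti-wraparound vacuum, you can compare relfrozenxid age against autovacuum_freeze_max_age. A sketch:

      Code:
      SELECT relname, age(relfrozenxid)
      FROM pg_class
      WHERE relkind = 'r'
      ORDER BY age(relfrozenxid) DESC
      LIMIT 10;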

      Here some information about the corresponding Zabbix environment.
      All Linux hosts are virtualized via OpenVZ

      Zabbix-Server
      - CentOS 6.4 (i386)
      - PostgreSQL 9.1.9
      - Zabbix-Server 2.0.7

      Zabbix-Proxies (SQLite)
      - CentOS 5.9 and 6.4 (i386 and x86_64)
      - 40 proxies in use
      - All proxies equipped with Zabbix-Java-Gateway, snmptrapd, snmptt, ODBC and DBforBix

      System x3650 M3
      2 x Intel Xeon Processor X5670
      64GB RAM (Total)
      8GB RAM (Zabbix OpenVZ container including DB/Server/Frontend)
      8x 2.5" SAS 300GB 10K RAID6 (1x Spare)

      Some numbers
      Monitored hosts: ~1.2K
      Monitored items: ~150K
      Active triggers: ~80K
      Host groups: ~200
      User groups: ~30
      Users: ~100
      NVPS: ~1K

      The following numbers are in average over one day:

      Zabbix busy internal timer processes ~25%
      Zabbix busy configuration syncer ~5%
      Zabbix busy history syncer ~13%

      CPU idle time 85%
      CPU user time 14%
      CPU system time 0.9%
      CPU iowait time 0.1%

      Read Ops ~12
      Write Ops ~780

      Read Bps ~190 KB/s
      Write Bps ~ 6.9 MB/s

      Edit:
      Recent modifications:
      Code:
      shared_buffers = 1GB
      bgwriter_lru_maxpages = 0
      checkpoint_segments = 250
      #autovacuum_naptime = 1min
      deadlock_timeout = 5s
      shared_buffers (as already mentioned) was set a bit too high.
      bgwriter_lru_maxpages is set to 0 to disable background writing and avoid heavily updated data being written too often. I tried this before without significant performance gain. This time it makes a difference - I was probably too impatient. Anyhow, ZBXNEXT-1689 is a better approach and will make this unnecessary.
      checkpoint_segments increased to avoid checkpoints triggered by xlog.
      autovacuum_naptime is reset to its default (1min).
      deadlock_timeout has been set according to the maximum execution time of most queries under normal load.

      Edit:
      Code:
      shared_buffers = 2GB
      #temp_buffers = 8MB
      work_mem = 80MB
      It turns out that reducing shared_buffers from 2GB to 1GB (in combination with disabled background writing) caused checkpointing to finish within 20 minutes, and thus higher I/O rates.
      After reverting the setting back to 2GB, checkpointing is spread over ~45 minutes.
      temp_buffers reverted to default.
      work_mem increased to avoid creating temporary files on larger sorts.
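
      Whether sorts actually spill to temporary files can be confirmed before raising work_mem further, e.g. with this postgresql.conf setting (0 logs every temporary file created, together with its size):

      Code:
      log_temp_files = 0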
      Last edited by BDiE8VNy; 26-09-2013, 18:17. Reason: some modifications


      • pietro54
        Senior Member
        • Feb 2011
        • 112

        #4
        Hi,
        sorry for the very late response.
        Thank you for your reply, and thank you for your patience with my text.

        It didn't help a lot with my performance issue, but it was very informative.


        • kloczek
          Senior Member
          • Jun 2006
          • 1771

          #5
          Originally posted by pietro54
          Hello,
          I have a big Zabbix database: over 170 GB, over 300 NVPS. It's working "not badly", but it could work better.
          TBH, IMO it is a very small Zabbix installation; mid-size Zabbix installations begin somewhere at a few k to a few tens of k NVPS.

          I believe the bottleneck of the system is the database.
          The bottleneck is in the storage .. in the latency of I/Os (not even in the max bandwidth)

          [..]
          Software:
          Postgres 9.1
          Zabbix 2.0.4
          Redhat 6.3

          [..]

          1. I'm afraid autovacuum isn't working at all.
          2. I'm running VACUUM FULL every Sunday. This helps for a while. The database gets lighter by up to 15 GB (175 GB before and 160 GB after the vacuum).
          3. I want to keep a long history. Right now I'm keeping almost every item for 90 days. I plan to keep history for 190 days and trends for 365 days.
          4. I know I should set up partitioning, so we can skip this tip.
          5. 300 NVPS is a big value, but the plan is ~500 NVPS.
          6. When PostgreSQL 9.3 becomes a stable release, I want to migrate my database to it.
          7. Is there something other than VACUUM ANALYZE in Postgres, like ANALYZE in Oracle?

          Please advise me: is this the correct path to tuning my Zabbix, and is there something I can do now?
          First, switch to partitioning. The database type (Postgres or MySQL) does not really matter. Also switch to Zabbix 2.2 to take advantage of Zabbix server caching, which decreases the number of SELECTs by up to 3 times.

          I'm running 1.2k NVPS on four-year-old hardware (an HP blade Gen6).
          The DB backend is using 2x 200 GB SSDs (behind a RAID controller).
          Zabbix server 2.2 is running on an HP blade Gen1 and is not using even 10% of its CPU power.
          I should switch the DB backend to Solaris to use ZFS compression (my current Zabbix database is 170 GB). With a 1 MB ZFS record size the compression ratio is about 3.6, so my 200 GB SSDs will effectively behave like storage more than 3 times larger and faster (the cost of buying enterprise SSDs of that size for the Gen6 is almost twice the Solaris license cost .. if you need support).

          Zabbix uses a typical warehouse-style database. Using PostgreSQL for such a DB is IMO overkill. MySQL, being simpler and easier to maintain, is IMO the better choice (but that is only my private opinion and I'm aware that some people may not agree). However, what may matter more here, IMO, are the skills of whoever will be looking after the Zabbix DB backend.
          http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
          https://kloczek.wordpress.com/
          zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
          My zabbix templates https://github.com/kloczek/zabbix-templates
