After some perusal of the code (disclaimer: I'm no coder, but I can follow code after a fashion), it becomes apparent that Zabbix 1.8 uses a single DB syncer process. That's probably no problem when MySQL is set up as a local database, but what happens when the database is remote, over a network?
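As a back-of-the-envelope illustration of why remoteness matters (hypothetical latency figures, not measured on our setup): a single synchronous writer can complete at most one statement per round trip, so its throughput ceiling is roughly rows-per-statement divided by round-trip time.

```python
# Rough ceiling for a single synchronous DB writer: one statement per
# round trip, so throughput = rows_per_statement / round_trip_time.
# The latency values below are illustrative, not measured.
def max_rows_per_sec(rtt_ms, rows_per_statement=1):
    return rows_per_statement * 1000.0 / rtt_ms

local   = max_rows_per_sec(0.1)                          # ~0.1 ms local socket
remote  = max_rows_per_sec(1.0)                          # ~1 ms across a LAN
batched = max_rows_per_sec(1.0, rows_per_statement=100)  # batching over a LAN

print(f"local, row-at-a-time:  {local:.0f} rows/s")
print(f"remote, row-at-a-time: {remote:.0f} rows/s")
print(f"remote, 100-row batch: {batched:.0f} rows/s")
```

The point is only that a 10x increase in round-trip time cuts a single row-at-a-time writer's ceiling by 10x, and batching buys it back.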
At any rate, we've finally added more hosts than Zabbix can handle. Our database is capable of much more, but apparently Zabbix coupled with the SPARC architecture (running in 32 bits, no less) and an Oracle backend is not going to happen.
Number of items (monitored / disabled / not supported): 15178 (15084 / 1 / 93)
In other words, we're underwater. The queue starts backing up within minutes of start-up, the history write cache fills quickly, and the DB syncer seems unable to push data to our database fast enough to keep us afloat.
Questions from my team include: "Why is this a single thread / single process? Why isn't it parallelized?" I understand the problems with data concurrency, but they have a valid point. Is there any way we can speed this along, or spawn multiple parallel processes? What can be done?
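For what it's worth, the concurrency problem doesn't seem insurmountable: one common approach is to shard the pending rows by item ID, so all values for a given item always go through the same writer and per-item ordering is preserved. A minimal sketch of the idea (this is not Zabbix's actual code; `n_syncers` is a hypothetical knob):

```python
# Hypothetical sketch: shard pending history rows across N writer
# processes by itemid, so no two writers ever handle the same item
# and per-item value ordering is preserved. Not actual Zabbix code.
def partition(rows, n_syncers):
    """rows: iterable of (itemid, clock, value) tuples."""
    buckets = [[] for _ in range(n_syncers)]
    for row in rows:
        itemid = row[0]
        buckets[itemid % n_syncers].append(row)
    return buckets

rows = [(101, 0, 1.0), (205, 0, 2.0), (101, 1, 1.5), (304, 0, 9.0)]
for i, bucket in enumerate(partition(rows, 4)):
    print(f"syncer {i}: {bucket}")
```

Each bucket could then be flushed by its own process with one bulk INSERT per batch, with no lock contention between writers on the same item's rows.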
I understand that MySQL is the preferred backend, but why wouldn't it have just as hard a time? We're about to attempt a move to a Linux/Intel front end, as we suspect the slower per-thread (but heavily multi-threaded) design of the SPARC boxes has something to do with this.
Our problem really does seem to be that Zabbix can't write to our DB fast enough to keep up with the stream of data we are trying to monitor, at least with Oracle on the backend and a Sun T5120 on the server side (8 cores, 16 GB of RAM).
Any suggestions?
Code:
29188:20100408:155639.349 DB syncer spent 0.000124 second while processing 0 items. Next sync after 5 sec.
29188:20100408:155644.353 DB syncer spent 0.000087 second while processing 0 items. Next sync after 5 sec.
29188:20100408:155649.354 DB syncer spent 0.000078 second while processing 0 items. Next sync after 5 sec.
29188:20100408:155658.826 DB syncer spent 4.472079 second while processing 186 items. Next sync after 5 sec.
29188:20100408:155729.525 DB syncer spent 25.698532 second while processing 1000 items. Next sync after 5 sec.
29188:20100408:155827.152 DB syncer spent 52.626974 second while processing 2000 items. Next sync after 5 sec.
29188:20100408:160041.239 DB syncer spent 129.086294 second while processing 7000 items. Next sync after 5 sec.
29188:20100408:160427.822 DB syncer spent 221.582604 second while processing 11000 items. Next sync after 4 sec.
29188:20100408:161048.093 DB syncer spent 376.270834 second while processing 21000 items. Next sync after 4 sec.
29188:20100408:161820.495 DB syncer spent 448.400870 second while processing 29000 items. Next sync after 4 sec.
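Working the numbers from the log above (a quick parse, nothing Zabbix-specific): throughput rises somewhat with batch size but tops out around 65 items/second, which is nowhere near our incoming data rate.

```python
import re

# Throughput per sync cycle, computed from three lines of the log excerpt.
log = """\
29188:20100408:155729.525 DB syncer spent 25.698532 second while processing 1000 items.
29188:20100408:160427.822 DB syncer spent 221.582604 second while processing 11000 items.
29188:20100408:161820.495 DB syncer spent 448.400870 second while processing 29000 items."""

for line in log.splitlines():
    m = re.search(r"spent ([\d.]+) second while processing (\d+) items", line)
    secs, items = float(m.group(1)), int(m.group(2))
    print(f"{items:>6} items / {secs:8.1f} s = {items / secs:5.1f} items/s")
```

So the syncer sustains roughly 39, 50, and 65 items/second as the batches grow, while the backlog keeps climbing.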
An incomplete list of planned performance-related improvements can be found here: