**mschlegel** · 08-01-2011, 00:00

In the latest case I've experienced, the same errors showed up shortly after restarting the zabbix_server process. However, after restarting one of the proxies, I have not seen any more duplicate key messages.

What additional information would be helpful in identifying the cause of this problem?

Thank you

**untergeek** · 12-01-2011, 18:19

I haven't seen errors like that. Can you elaborate on your db setup?

For instance:

Is it MySQL (presumed)? Postgres? Oracle on Linux?
How many dbsyncers do you have configured in the zabbix_server.conf?

Your setup is curious. You have an extremely high number of servers, items and triggers and a seemingly low number of values per second. Do you have a long delay between checks? That many triggers must result in a rather long queue of DB reads. That low number of required writes per second just seems so incongruous with the rest of the numbers. I wonder what the backlog is from 70+ proxies all trying to contact your zabbix server.

Our setup is considerably different:

446 active hosts
24,145 active items
7,013 active triggers
408.6 required values per second (actual measured value from the Zabbix Internal of WriteCache is closer to 180 values per second).

We currently have NO proxies, but we heavily monitor the hosts we have in a variety of ways. We're RHEL5 on HP 380DL servers with 6G of RAM and a monster Oracle backend (since we use it for our customers, we might as well use it for ourselves too).

**mschlegel** · 12-01-2011, 19:31

Howdy,

We are running MySQL 5.1.41. The Zabbix server is running on a quad-core AMD Opteron with 6G of RAM. Disk is set up with DRBD partitions, one for data and one for InnoDB logs, mirrored to an identical host for HA purposes. In normal operation, the MySQL server runs on one host and the zabbix_server process runs on the other host.

Possibly relevant zabbix_server config items:
HousekeepingFrequency=4
CacheSize=128M
HistoryCacheSize=8M
TrendCacheSize=32M

The number of DB syncers is not defined in our config at this point, so it would be at the default of 4.

The vast majority of the hosts we have in Zabbix are only monitoring 7 parameters, and most of those were recently changed from a 5-minute to a 10-minute update cycle to reduce database load. This change reduced the values per second from about 183/sec to the current 110/sec.
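For anyone following the arithmetic, here is a rough sketch of how a "required values per second" figure relates to item counts and update intervals. It assumes the figure is simply the sum of item count divided by update interval; the function names and the 30,000-item fleet are made up for illustration, and only the 24,145 items / 408.6 values per second pair comes from this thread.

```python
def required_values_per_second(items_by_interval):
    """Sum item_count / interval_seconds over groups of items.

    items_by_interval maps an update interval in seconds to the number
    of items polled at that interval (hypothetical input format).
    """
    return sum(count / interval for interval, count in items_by_interval.items())


def average_interval_seconds(active_items, values_per_second):
    """Average update interval implied by an item count and a rate."""
    return active_items / values_per_second


# Hypothetical fleet of 30,000 items, all on one interval.
at_5_minutes = required_values_per_second({300: 30_000})   # 100.0
at_10_minutes = required_values_per_second({600: 30_000})  # 50.0

# Figures quoted earlier in the thread for the Oracle-backed setup.
implied_interval = average_interval_seconds(24_145, 408.6)  # roughly 59 seconds

print(at_5_minutes, at_10_minutes, round(implied_interval, 1))
```

Doubling the interval on every item would halve the rate. A drop from 183 to 110 rather than to about 92 is consistent with only part of the items having moved to the longer cycle.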

Does it seem likely that the large number of proxies might be causing the zabbix server to have a harder time keeping up with item updates from these hosts than it might have with a smaller number of proxies?

Is there a practical limit to how many hosts can be carried directly off a single server without proxies?

Thank you,

**untergeek** · 13-01-2011, 16:57

Since we're running Oracle, and a monster one at that, this should be taken with the necessary salt…

That said, I think you should experiment with increasing the number of dbsyncers. This may require tweaks to max connections on your db, but increasing syncers may help. Each proxy is trying to get a connection to write to the db and you're trying to get 70+ proxies with only 4 syncers. Bump that to 8, 12, 16, 24, 32 or something and see if the problem changes any.

**mschlegel** · 14-01-2011, 17:29

I'm running the server with DB Syncers set to 8 now. It doesn't look like it's made an impact on the system load of the database server, so at least it can be left for observation for a while.

Is there any way to see what is in the zabbix server's database queue? It seems like that would be the best indicator overall of how far behind the database is running.

Thank you for your assistance.

**untergeek** · 14-01-2011, 18:19

You can see a graphic representation of which items are behind in the UI:

Administration->Queue

You can also see how far behind in columns, 5 seconds, 10 seconds, 30 seconds, etc.

If you select the dropdown on the right side, you can do an Overview by Proxy, or Details to see individual items and the delay.

In your case, the Overview by Proxy could be invaluable.

**mschlegel** · 14-01-2011, 18:57

I know the queue shows how far behind the system is as a whole, but I didn't think that it could distinguish between items that are in the server but not in the database and items that have not yet been sent to the server from the agents/proxies. While both can be useful, it seems like knowing what data has arrived at the server but not yet been pushed into the database would be the more important item to know in this particular case.

**untergeek** · 14-01-2011, 19:29

Ah, I see. I do not think that Zabbix has any way of informing you of what is in the write caches, but it does let you know how many values are in it. That's about it.

To see what hasn't been written yet, you'd probably have to look in your database. In our case, we found that Oracle was writing VERY fast in most cases, but that's also why we went for the full 64 DBSyncers. Oracle worked best that way as it gets the best bang-for-the-zabbix-buck (zabbix being single-threaded, oracle totally parallel minded). MySQL may behave differently.

What's the DataSenderFrequency from your proxies?
The heartbeat frequency?
Are they in active or passive mode?

We're going to be deploying lots of proxies soon too, I think, so these become very important for me to understand as well.

**mschlegel** · 14-01-2011, 22:36

We have the heartbeat frequency set to 60 and SenderFrequency set to 30. All proxies are currently active proxies, though I'm wondering if switching them over to passive might help with the issues we are seeing. After all, the server can't be swamped by incoming connections if it has to go ask for the data itself.

Thank you

**untergeek** · 14-01-2011, 22:39

Hmmm. A sender frequency of 30 seconds. How frequent were your items polling?

I'm just trying to figure out how many messages are coming all at once from how many servers. It may just be that "batching" them like that sends more than Zabbix knows what to do with. How many Trappers do you have?

**mschlegel** · 14-01-2011, 23:17

Most items behind the proxies are 600s cycle now. 70 hosts behind each proxy, 7 items each host.

Proxy has the following start options:
Pollers 2
IPMIPollers 0
PollersUnreachable 1
Trappers 1
Pingers 1
Discoverers 0
HTTPPollers 0

Proxies are using SQLite and doing hourly housekeeping.

Server has the following start options:
Pollers 5
IPMIPollers 0
PollersUnreachable 1
Trappers 5
Pingers 3
Discoverers 1
HTTPPollers 1

Currently running housekeeping every 4 hours.

**untergeek** · 14-01-2011, 23:20

5 trappers may be too few. Consider that if you have 70 proxies, that's how they communicate back to the server (at least, that's how I understand it).

The items come back from the proxies to the trappers. If 5 come back simultaneously, your trappers are all busy, causing the others to wait, fail, or have to reschedule. It might be beneficial to try to boost that number a bunch. I have our setup (which you've seen the numbers for) with 100 trappers.
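A back-of-the-envelope sketch of that load, using the numbers posted above (70 hosts per proxy, 7 items per host, 600-second cycle, 30-second sender frequency). It assumes exactly 70 proxies, one connection per proxy per sender interval, and sends spread evenly over time; none of that is confirmed, and the function name is made up for illustration.

```python
def proxy_traffic(proxies, hosts_per_proxy, items_per_host,
                  item_interval_s, sender_interval_s):
    """Estimate steady-state traffic from active proxies to the server."""
    items_per_proxy = hosts_per_proxy * items_per_host
    return {
        # Average rate of new incoming connections, if sends are evenly spread.
        "connections_per_second": proxies / sender_interval_s,
        # Values a single proxy accumulates between two sends.
        "values_per_batch": items_per_proxy * sender_interval_s / item_interval_s,
        # Total values arriving through the proxies.
        "total_values_per_second": proxies * items_per_proxy / item_interval_s,
    }


load = proxy_traffic(proxies=70, hosts_per_proxy=70, items_per_host=7,
                     item_interval_s=600, sender_interval_s=30)
for name, value in load.items():
    print(f"{name}: {value:.2f}")
```

On these assumptions the average is only a couple of connections per second with about 24.5 values each, which 5 trappers could plausibly keep up with. The concern is the worst case: if many proxies happen to send in the same second, up to 70 connections compete for 5 trappers, and this sketch does not model how long each connection is held.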

**zaicnupagadi** · 06-09-2012, 02:30

I had a similar issue while adding a new user: SQL was telling me that it could not add another record with the ID field "8".

Deleting the database and recreating it didn't help. When I looked at that table, I saw that the last user added had an ID equal to "8". So I deleted that user and added him again, and after that there was no problem adding other users. I don't know what the problem was in the end, but deleting the last entry worked for me.
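One way a symptom like this can arise (not confirmed as the cause here) is an application-side ID counter that has fallen behind the rows actually in the table. The sketch below reproduces that with SQLite; the `users` and `id_counter` tables and the `add_user` helper are hypothetical and are not the real Zabbix schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid INTEGER PRIMARY KEY, alias TEXT)")
conn.execute("CREATE TABLE id_counter (table_name TEXT PRIMARY KEY, nextid INTEGER)")

# Rows 1..8 exist, but the counter believes the last issued ID was 7.
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 9)])
conn.execute("INSERT INTO id_counter VALUES ('users', 7)")


def add_user(conn, alias):
    """Take the next ID from the counter table, insert, then advance the counter."""
    (last_id,) = conn.execute(
        "SELECT nextid FROM id_counter WHERE table_name = 'users'").fetchone()
    new_id = last_id + 1
    conn.execute("INSERT INTO users VALUES (?, ?)", (new_id, alias))
    conn.execute("UPDATE id_counter SET nextid = ? WHERE table_name = 'users'",
                 (new_id,))
    return new_id


# The stale counter hands out ID 8, which is already taken.
try:
    add_user(conn, "newuser")
    first_attempt = "inserted"
except sqlite3.IntegrityError as exc:
    first_attempt = f"duplicate key: {exc}"
print(first_attempt)

# Deleting the row that holds ID 8 and re-adding it lets the counter catch up.
conn.execute("DELETE FROM users WHERE userid = 8")
readded_id = add_user(conn, "user8")
following_id = add_user(conn, "newuser")
print(readded_id, following_id)
```

In this toy setup, removing the newest row and re-adding it brings the counter back in line with the table, which matches the workaround described above. Whether the same mechanism was at work in the real database is only a guess.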

Duplicate entry 'xx-xx' for key 'PRIMARY' - trends & history tables
