**tim.mooney** · 07-09-2023, 21:20

Originally posted by chaos.entalpico

Sometimes, but not always, a false event is triggered at the same time other traps reach the system

That sounds like some kind of parsing problem with the intermediate script or program you're using with snmptrapd. Is it possible it's getting one TRAP but mishandling it and writing it out as two separate entries in the intermediate file that Zabbix reads? Or is it possible there's some text (such as embedded line feeds or other newline characters) in part of the received trap that is causing the intermediate script or program to incorrectly split one incoming TRAP into two separate entries in the file?

My environment doesn't have a high volume of TRAPs, as we only use TRAP reception for a small number of dumb devices that we can't monitor any other way. But you bring up a point I had never considered with the particular intermediate script we're using: I don't remember whether the script my site uses does any locking, or anything else, to ensure that two nearly simultaneous traps can't end up interleaved in the log file. I would have to do some reading to see if that's even a possibility (snmptrapd may force the scripts to run sequentially, without any need for locking in the script itself), but it's not really a problem I would encounter in my environment.
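For illustration only, here is a minimal sketch of the kind of locking meant above, written in Python rather than in the Perl of any real receiver script. The function name `append_trap_record` and the record format are hypothetical, and it assumes a Unix-like system where `fcntl.flock` is available. Whether snmptrapd ever runs two handlers at once is still an open question, so this may be unnecessary in practice.

```python
import fcntl


def append_trap_record(path: str, record: str) -> None:
    """Append one complete trap record while holding an exclusive file lock.

    Hypothetical helper: it only shows the locking pattern, not the format
    that any particular trap receiver writes.
    """
    with open(path, "a", encoding="utf-8") as fh:
        # Block until no other process holds the lock on this file.
        fcntl.flock(fh.fileno(), fcntl.LOCK_EX)
        try:
            # Write the record as a single newline-terminated entry.
            fh.write(record.rstrip("\n") + "\n")
            # Flush before unlocking so the next writer sees a whole record.
            fh.flush()
        finally:
            fcntl.flock(fh.fileno(), fcntl.LOCK_UN)
```

With a lock like this, two handlers started at almost the same moment would write their records one after the other instead of interleaving them.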

You may want to add some additional logging to the script or program you're using, to see if there are cases where the script is being run more than once at the same time.

**chaos.entalpico** · 11-09-2023, 14:15

Originally posted by tim.mooney

You may want to add some additional logging to the script or program you're using, to see if there are cases where the script is being run more than once at the same time.

Hi Tim,
thanks for the feedback. I've checked all the logs I could find, but I didn't find anything useful. You mentioned adding some additional logging. We are using the Perl script suggested in the official installation guide... do you know how I can enable more logging?

**tim.mooney** · 11-09-2023, 21:37

If you're talking about the script 'zabbix_trap_receiver.pl', then my apologies: now that I look at it, it doesn't currently have any built-in logging. If you don't already know how to program in Perl, adding logging wouldn't be very straightforward for you. It had been a while since I had looked at that script (we use a locally-modified version of it for our minimal trap reception), and I had forgotten that it didn't have any built-in logging.

Knowing that, I'm not sure what to suggest. When I'm trying to debug a weird problem like the one you're running into, I usually try to make sure that logging is enabled and perhaps temporarily increase the log level, so I get more verbose messages about what's happening. Depending upon the developers of the software, that may or may not be useful -- they may not be doing any logging in the parts of the code that I need more information about. Since there's currently no logging in the Perl receiver script, about all you could do for logging is to try enabling logging for snmptrapd (see the LOGGING section of the snmpcmd(1) man page).

Sorry I don't have a better suggestion, but for a problem that happens infrequently, trying to be prepared with logging for "information capture" and hoping you get enough info the next time it happens is often the easiest approach to tracking down the problem.

**chaos.entalpico** · 12-09-2023, 09:42

Hi Tim,
no problem, thanks for your feedback anyway.

**chaos.entalpico** · 21-09-2023, 14:20

Hello,

I couldn't find any hints in any log, so I tried to play around with things.
I've managed to improve the situation a bit by removing the recovery expression.

Before I had something like this:

Problem: last(/whatever/snmptrap.alarm.code)=123 and last(/whatever/snmptrap.alarm.resolution)=1
Recovery: last(/whatever/snmptrap.alarm.code)=123 and last(/whatever/snmptrap.alarm.resolution)=2

Now:

last(/whatever/snmptrap.alarm.code)=123 and (last(/whatever/snmptrap.alarm.resolution)=1 or last(/whatever/snmptrap.alarm.resolution)>2)

This works almost fine, except that Zabbix sometimes marks problems as cleared without the appropriate trap arriving. I believe the reason is that when Zabbix receives a trap with an alarm code other than 123 and a resolution value of 2 right after the trigger fires, the expression evaluates to FALSE and the alarm is cleared. So if a node generates different traps with different alarm codes, there is a chance that an alarm gets marked as cleared. Now I have to find a workaround for another workaround.
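To make that reasoning concrete, here is a toy model in Python of the single expression above. It is only a sketch: `combined_expr` is a hypothetical stand-in for the trigger expression, the trap sequence is made up, and it treats each trap as updating both values together, which may not match how Zabbix actually processes the items.

```python
def combined_expr(code: int, resolution: int) -> bool:
    """Toy mirror of the single trigger expression (no recovery expression)."""
    return code == 123 and (resolution == 1 or resolution > 2)


# Hypothetical trap sequence from one node: (alarm code, resolution value).
traps = [
    (123, 1),  # alarm 123 is raised
    (456, 2),  # an unrelated alarm 456 is resolved
]

# Trigger state after each trap, as this toy model sees it.
states = ["PROBLEM" if combined_expr(code, res) else "OK" for code, res in traps]
print(states)
```

In this model the second trap flips the state to OK even though no trap ever resolved alarm 123, because last() on the code item no longer returns 123.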

I think I'm going insane

**cyber** · 22-09-2023, 08:25

Each trigger is recalculated whenever any of the items it uses receives a new value. The values can arrive microseconds apart, but the trigger will still be recalculated for each new value.
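A rough Python sketch of what per-value recalculation could mean for the original Problem expression. Everything here is a simplified assumption rather than Zabbix internals: `problem_expr` and `replay` are hypothetical names, and the order in which the two items receive their values is assumed, not known.

```python
from typing import List, Optional, Tuple


def problem_expr(code: Optional[int], resolution: Optional[int]) -> bool:
    """Toy mirror of the original Problem expression from the thread."""
    return code == 123 and resolution == 1


def replay(
    updates: List[Tuple[str, int]],
    code: Optional[int] = None,
    resolution: Optional[int] = None,
) -> List[Tuple[str, int, bool]]:
    """Apply item updates one at a time, re-evaluating after each one."""
    results = []
    for item, value in updates:
        if item == "code":
            code = value
        else:
            resolution = value
        results.append((item, value, problem_expr(code, resolution)))
    return results


# Alarm 123 was raised and resolved earlier, so last() holds code=123 and
# resolution=2. A new, unrelated trap (code 456, resolution 1) then arrives.
# Assumption: the resolution item gets its value before the code item does.
steps = replay([("resolution", 1), ("code", 456)], code=123, resolution=2)
for item, value, fired in steps:
    print(f"{item}={value}: problem expression is {fired}")
```

Under that assumed ordering, the expression is briefly true after the first update, because the new resolution value is combined with the stale alarm code, and false again after the second. That would be one possible way to get a problem without a matching trap.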

**chaos.entalpico** · 25-09-2023, 11:35

Originally posted by cyber

Each trigger is recalculated whenever any of the items it uses receives a new value. The values can arrive microseconds apart, but the trigger will still be recalculated for each new value.

Yes, I thought so. Any tips for reaching the light?
Thanks!

Fake problems/alarms without a corresponding snmp trap
