**richlv** · 21-11-2008, 17:17

we're seeing those as well, but only on single machine so far - and in agent log instead of server log.
the machine actually is zabbix server itself.
we could try finding out common factors - do you have user parameters defined for that host ?
do you see those messages in serverlog only ? what about agent logs ?

**Crazy Marty** · 21-11-2008, 22:02

SIGPIPE is raised on a write operation to a closed (or otherwise broken) pipe -- well, it used to be only on pipes, but now that pipes & sockets share a lot of code in many Unix-/Linux-/Posix-like systems, it applies to sockets, too. So it means that a socket has been closed (by the reader) by the time data is written to it (by the writer).
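The behaviour described above can be reproduced in a few lines. This is a minimal sketch, not Zabbix code: it uses a local socket pair and relies on the fact that the Python interpreter ignores SIGPIPE at startup, so the failed write surfaces as an `EPIPE` error instead of killing the process. The function name is made up for the illustration.

```python
import errno
import socket


def write_after_peer_close() -> int:
    """Return the errno seen when writing to a socket whose peer has closed."""
    writer, reader = socket.socketpair()
    reader.close()  # the reading side goes away first
    try:
        writer.sendall(b"1\n")  # the writer does not know yet and writes anyway
    except OSError as exc:
        return exc.errno
    finally:
        writer.close()
    return 0


if __name__ == "__main__":
    code = write_after_peer_close()
    print(errno.errorcode.get(code, "no error"))
```

A C program that has not changed the default SIGPIPE disposition would be terminated by the signal at the same point; one that ignores or catches it sees the write fail with "Broken pipe".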

It would probably be wise for the code to arrange to catch SIGPIPE, and at least try to report on just which socket (unexpectedly) went away.
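One common way to get that kind of report is to handle the error at the write site, where the program still knows which connection it was using. The sketch below assumes the caller recorded a human-readable label for the peer when the connection was set up; `send_with_report` and `peer_label` are hypothetical names, and none of this is taken from the Zabbix sources.

```python
import socket
from typing import Optional


def send_with_report(sock: socket.socket, payload: bytes, peer_label: str) -> Optional[str]:
    """Send payload; if the peer is gone, return a message naming it."""
    try:
        sock.sendall(payload)
    except (BrokenPipeError, ConnectionResetError) as exc:
        # peer_label was captured earlier, so it is still available
        # even though the connection itself is no longer usable.
        return f"write to {peer_label} failed: {exc.strerror}"
    return None


if __name__ == "__main__":
    writer, reader = socket.socketpair()
    reader.close()
    print(send_with_report(writer, b"1\n", "peer-a (illustrative label)"))
    writer.close()
```

Recording the label up front matters because asking the socket for its peer address after the connection has broken is not guaranteed to succeed.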

**richlv** · 22-11-2008, 09:57

thanks for the info on sockets, didn't know that. yes, as in many cases, improved error message would help a lot in debugging

**Emir Imamagic** · 11-12-2008, 15:42

Originally posted by richlv

we're seeing those as well, but only on single machine so far - and in agent log instead of server log.
the machine actually is zabbix server itself.
we could try finding out common factors - do you have user parameters defined for that host ?
do you see those messages in serverlog only ? what about agent logs ?

We see these only in server logs. I read about the SIGPIPE signal but I don't understand which part of zabbix server is using pipes. Especially since we're using PostgreSQL database on a different machine via TCP connection.

**Emir Imamagic** · 07-01-2009, 19:19

Now I see that there is an open bug for this issue:

https://support.zabbix.com/browse/ZBX-518

A probably interesting point here is that both of us are using a PostgreSQL database.

I left an additional comment there as well, since there haven't been any comments from the developers and this is causing our infrastructure a lot of problems.

**Emir Imamagic** · 03-02-2009, 22:59

One obvious thing we forgot to try is changing RefreshActiveChecks from its default value. Based on the debug logs, it seems that SIGPIPEs strike when the DB is under heavy load and the server doesn't manage to answer the agent's request for active checks in a timely manner.

In our setup we used the default value (60s), which creates significant load on the server with our number of machines and items. We raised the value to the maximum, 3600, and are hoping for the best. Too bad this value can't be configured on the server side.
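A rough back-of-the-envelope sketch of why the interval matters: if every agent asks for its list once per interval, the average request rate at the server is the number of agents divided by the interval. The fleet size below is a made-up figure, not one from this thread, and the model ignores retries and bursts.

```python
def list_requests_per_second(agents: int, refresh_seconds: int) -> float:
    """Average rate of active-check list requests arriving at the server."""
    if refresh_seconds <= 0:
        raise ValueError("refresh_seconds must be positive")
    return agents / refresh_seconds


AGENTS = 600  # hypothetical fleet size, not taken from the thread

at_60s = list_requests_per_second(AGENTS, 60)
at_3600s = list_requests_per_second(AGENTS, 3600)

if __name__ == "__main__":
    print(f"60s interval:   {at_60s:.2f} requests/s")
    print(f"3600s interval: {at_3600s:.2f} requests/s")
```

Whatever the real number of agents is, going from 60 to 3600 cuts this average rate by a factor of 60.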

One thing scares me a bit now - in case when agent didn't get the list it tried again in 60s. Does 60s come from RefreshActiveChecks? Does that mean in our setup agent will wait for 3600s before requesting the list again?

Thanks,
emir

**richlv** · 04-02-2009, 10:32

...which couldn't be our problem, as the agent does not have any active checks assigned (they are explicitly disabled).
could it be that agent and server problems with this error are different ?

**Emir Imamagic** · 04-02-2009, 11:07

Originally posted by richlv

...which couldn't be our problem, as the agent does not have any active checks assigned (they are explicitly disabled).
could it be that agent and server problems with this error are different ?

I would say yes. If you increase the debug level can you at least figure out in which part does the SIGPIPE occur?

**richlv** · 04-02-2009, 12:05

with debug level 4 it shows :

Code:

```
11157:20090204:115145 Before
11157:20090204:115145 Run remote command [/home/zabbix/bin/hpacucliwrapper controller cache 0 ] Result [1] [1]...
11157:20090204:115145 Sending back [1]
11157:20090204:115145 Got SIGPIPE. Where it came from???
11157:20090204:115145 Process listener error: ZBX_TCP_WRITE() failed [Broken pipe]
11158:20090204:115147 Before
11158:20090204:115147 Run remote command [/home/zabbix/bin/hpacucliwrapper array 0 B ] Result [1] [1]...
11158:20090204:115147 Sending back [1]
11158:20090204:115147 Got SIGPIPE. Where it came from???
11158:20090204:115147 Process listener error: ZBX_TCP_WRITE() failed [Broken pipe]
```

(both server and client are 1.4.6).
so it seems the agent succeeded in its operations, but sending the data to the server somehow errored out (though the data itself is delivered ok).
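In general, "data arrived but the write still failed" can happen when a reply goes out in more than one write and the reader closes as soon as it has what it needs. The sketch below only illustrates that general pattern on a local socket pair; whether the 1.4.6 agent actually behaves this way is an assumption that nothing in this thread confirms.

```python
import socket
from typing import Optional, Tuple


def deliver_then_fail() -> Tuple[bytes, Optional[str]]:
    """First write is received; a second write after the reader closed fails."""
    writer, reader = socket.socketpair()
    try:
        writer.sendall(b"1")        # the value itself
        received = reader.recv(16)  # the reader gets it...
        reader.close()              # ...and hangs up immediately
        try:
            writer.sendall(b"\n")   # a trailing write now hits a closed peer
            error = None
        except BrokenPipeError as exc:
            error = exc.strerror
        return received, error
    finally:
        writer.close()
        reader.close()  # harmless if already closed


if __name__ == "__main__":
    print(deliver_then_fail())
```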

**Emir Imamagic** · 06-02-2009, 15:03

Originally posted by Emir Imamagic

One thing scares me a bit now - in case when agent didn't get the list it tried again in 60s. Does 60s come from RefreshActiveChecks? Does that mean in our setup agent will wait for 3600s before requesting the list again?

To answer my own question, since nobody else has: no. In case of failure, the agent will query for active checks again after 60s:
Getting list of active checks failed. Will retry after 60 seconds
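Put as a small sketch, the behaviour observed here is: a successful request is followed by a wait of RefreshActiveChecks seconds, a failed one by a fixed 60-second retry. The function and constant names are invented for the illustration, and the 60-second figure comes only from the log message above, not from reading the agent source.

```python
RETRY_AFTER_FAILURE = 60  # seconds, as stated in the agent log message


def seconds_until_next_request(last_request_ok: bool, refresh_active_checks: int) -> int:
    """How long the agent waits before asking for the active-check list again."""
    if last_request_ok:
        return refresh_active_checks  # normal refresh interval
    return RETRY_AFTER_FAILURE        # failure: short, fixed retry


if __name__ == "__main__":
    print(seconds_until_next_request(False, 3600))
    print(seconds_until_next_request(True, 3600))
```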

cheers,
emir

**Emir Imamagic** · 06-02-2009, 15:04

Originally posted by richlv

(both server and client are 1.4.6).
so it seems the agent succeeded in its operations, but sending the data to the server somehow errored out (though the data itself is delivered ok).

sorry, don't have a clue. I'm quite sure that your problem is quite different from what we're facing. I suggest you open a bug, but unfortunately it seems Zabbix crew is not keen on solving these problems lately

Cheers,
emir

**Emir Imamagic** · 12-02-2009, 01:46

I see that the ZBX 311 regarding problems with PostgreSQL database has been solved. Could we expect that this issue (https://support.zabbix.com/browse/ZBX-518) will be resolved in 1.6.3?

Thanks,
emir

**bennett.lain** · 11-11-2010, 23:50

was this ever fixed?!

im using 1.4 with no real ability to upgrade right now.

this just started happening to us in the last couple of weeks

this IS causing issues, we are losing some of the data our system is trying to collect.

is there something that i can do on my end to fix this WITHOUT upgrading?
i'd like to know what might have caused this, because im at a loss.

Zabbix server fails with "Got SIGPIPE. Where it came from???"
