Zabbix Server Stops After A While

  • BDiE8VNy
    Senior Member
    • Apr 2010
    • 680

    #16
    Originally posted by Alexei
    I would track number of established connections to Zabbix server TCP/10051 (trapper).
    That's good advice. I'll do that.
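
    For anyone who wants to track this without extra tooling, here is a minimal sketch that counts established connections on the trapper port by parsing the Linux /proc/net/tcp tables. It assumes a Linux server and the default port 10051; the function names are my own and not part of Zabbix.

```python
import pathlib

TRAPPER_PORT = 10051      # default Zabbix trapper port (assumption: not changed via ListenPort)
TCP_ESTABLISHED = "01"    # state code for ESTABLISHED in /proc/net/tcp


def count_established(proc_net_tcp_text, port=TRAPPER_PORT):
    """Count ESTABLISHED sockets whose local port equals `port`.

    `proc_net_tcp_text` is the raw content of /proc/net/tcp or /proc/net/tcp6.
    """
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] is "HEXADDR:HEXPORT" (local side), fields[3] is the state
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port and fields[3] == TCP_ESTABLISHED:
            count += 1
    return count


def count_trapper_connections(port=TRAPPER_PORT):
    """Sum established connections over the IPv4 and IPv6 tables, if present."""
    total = 0
    for name in ("/proc/net/tcp", "/proc/net/tcp6"):
        path = pathlib.Path(name)
        if path.exists():
            total += count_established(path.read_text(), port)
    return total
```

    Calling count_trapper_connections() periodically (for example from a user parameter or a cron job) and comparing the result with the configured number of trappers should show when all of them are occupied.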

    Originally posted by Alexei
    [...] proxies are trying to push information to the server and the write cache gets full.[...]
    I see, but all write caches seemed to be fine until the server restart and the catch-up that followed (see attached image).

    Originally posted by Alexei
    [...] network issues between Zabbix Server and one of your proxies.[...]
    I can't rule this out. However, all Zabbix proxies continued collecting data without any issues. Even the agent on the server is monitored via a proxy.
    After the server was restarted, all proxies delivered the missing data completely.

    Originally posted by Alexei
    Connect to your proxies to see how much unsent data they have in their databases [...]
    As mentioned before, I'm confident that they were caching properly. Anyway, I'll take a look inside the proxy database in case this issue happens again, which hopefully it never will :-)
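
    A rough sketch of how such a check could look is below. It assumes the proxy schema keeps collected values in a proxy_history table and stores the last sent id in the ids table under field_name 'history_lastid'. That is how I understand it, but please verify it against your proxy version before relying on it. The function name is hypothetical, and conn can be any Python DB-API connection whose driver accepts plain SQL (sqlite3 is used here only because some proxies run on SQLite).

```python
import sqlite3

# Assumed schema: proxy_history(id, ...) and ids(table_name, field_name, nextid).
# Table and column names are an assumption; check them against your proxy version.
UNSENT_QUERY = """
SELECT COUNT(*) FROM proxy_history
WHERE id > COALESCE(
    (SELECT nextid FROM ids
     WHERE table_name = 'proxy_history' AND field_name = 'history_lastid'),
    0)
"""


def count_unsent(conn):
    """Return the number of proxy_history rows not yet sent to the server."""
    cur = conn.cursor()
    cur.execute(UNSENT_QUERY)
    return cur.fetchone()[0]
```

    A count that keeps growing while the server is reachable would indicate that the proxy is buffering data it cannot deliver.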

    Originally posted by Alexei
    Quick solution (data loss, not recommended): connect to proxies and drop unsent data.
    Better solution: wait when Zabbix recovers. Note that currently it is not in a good shape, data collection is likely significantly delayed.
    In my case the solution was easy. Restarting the Zabbix server did the trick without any loss of historical data. Immediately after the server was up again, it received the full backlog from all proxies.

    I wonder what kind of issue causes all trapper sockets on the server to be occupied without anything being processed. And why aren't the faulty/stalled connections closed once TrapperTimeout is reached?

    Is it conceivable that the server stopped processing for some reason, and that this in turn caused the trappers to be exhausted?
    That would explain why no intervention on the proxies was necessary and why even a working TrapperTimeout did not help.