Zabbix stopped monitoring SNMP this morning

  • evand
    Junior Member
    • Apr 2009
    • 6

    #1


    I originally posted this in the "Help" forum last night but am moving it here:

    I upgraded my Zabbix installation to 1.6.4 about 4 weeks ago and everything seemed to be running fine until this morning. I logged in and saw that nothing except for ICMP had been updated since 1:11 AM (America/New_York TZ). The ICMP pinger seems to be working fine and the data is going into the database, but none of the SNMP data is going in, nor are any of the TCP port checks.

    The log file is set to DebugLevel=3 and reveals nothing. The entire log file is shown below; nothing is ever added past this. With DebugLevel=4 it shows a lot more, but nothing that appears relevant.

    Code:
      5922:20090520:221058 Starting zabbix_server. ZABBIX 1.6.4.
      5922:20090520:221058 **** Enabled features ****
      5922:20090520:221058 SNMP monitoring:       YES
      5922:20090520:221058 WEB monitoring:        YES
      5922:20090520:221058 Jabber notifications:  YES
      5922:20090520:221058 ODBC:                   NO
      5922:20090520:221058 IPv6 support:           NO
      5922:20090520:221058 **************************
      5927:20090520:221058 server #1 started [Poller. SNMP:YES]
      5928:20090520:221058 server #2 started [Poller. SNMP:YES]
      5929:20090520:221059 server #3 started [Poller. SNMP:YES]
      5930:20090520:221059 server #4 started [Poller. SNMP:YES]
      5931:20090520:221059 server #5 started [Poller. SNMP:YES]
      5932:20090520:221059 server #6 started [Trapper]
      5938:20090520:221059 server #7 started [Trapper]
      5940:20090520:221059 server #8 started [Trapper]
      5942:20090520:221059 server #9 started [Trapper]
      5944:20090520:221059 server #10 started [Trapper]
      5946:20090520:221059 server #11 started [ICMP pinger]
      5949:20090520:221059 server #12 started [ICMP pinger]
      5951:20090520:221059 server #13 started [ICMP pinger]
      5955:20090520:221059 server #14 started [ICMP pinger]
      5959:20090520:221059 server #15 started [ICMP pinger]
      5963:20090520:221059 server #16 started [Alerter]
      5965:20090520:221059 server #17 started [Housekeeper]
      5965:20090520:221059 Executing housekeeper
      5967:20090520:221059 server #18 started [Timer]
      5974:20090520:221059 server #20 started [Node watcher. Node ID:0]
      5976:20090520:221059 server #21 started [HTTP Poller]
      5978:20090520:221059 server #22 started [HTTP Poller]
      5980:20090520:221059 server #23 started [HTTP Poller]
      5982:20090520:221059 server #24 started [HTTP Poller]
      5968:20090520:221059 server #19 started [Poller for unreachable hosts. SNMP:YES]
      5984:20090520:221059 server #25 started [HTTP Poller]
      5986:20090520:221059 server #26 started [Discoverer. SNMP:YES]
      5988:20090520:221059 server #27 started [Escalator]
      5922:20090520:221059 server #0 started [Watchdog]
      5922:20090520:221059 In main_watchdog_loop()
      5965:20090520:221111 Deleted 0 records from history and trends
    That server was started about 10 minutes ago and that was the last log entry. I've tried restarting Postgres, running vacuum/analyze on the database, and even rebooting the machine. As I said, this was up and running fine for about a month and then just conked out early this morning with nobody doing anything to it. Any ideas?
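For anyone comparing their own startup log against the one above, here is a minimal Python sketch that counts the started processes by type and reports the timestamp of the last entry. The function name `summarize_started` is made up for illustration, and the line format is assumed only from the log excerpt in this post.

```python
import re
from collections import Counter

# Assumed line format, taken from the log excerpt above:
#   <pid>:<YYYYMMDD>:<HHMMSS> <message>
PREFIX = re.compile(r"^(\d+):(\d{8}):(\d{6}) (.*)$")
# Message of the form: server #<n> started [<process description>]
STARTED = re.compile(r"^server #(\d+) started \[([^\]]+)\]$")


def summarize_started(lines):
    """Return (counts of started process types, timestamp of the last log line)."""
    counts = Counter()
    last_stamp = None
    for raw in lines:
        prefix = PREFIX.match(raw.strip())
        if not prefix:
            continue  # skip anything that does not look like a log line
        last_stamp = prefix.group(2) + ":" + prefix.group(3)
        started = STARTED.match(prefix.group(4))
        if started:
            # Keep only the process name, dropping suffixes like ". SNMP:YES".
            counts[started.group(2).split(".")[0]] += 1
    return counts, last_stamp


if __name__ == "__main__":
    sample = [
        "5927:20090520:221058 server #1 started [Poller. SNMP:YES]",
        "5946:20090520:221059 server #11 started [ICMP pinger]",
        "5965:20090520:221111 Deleted 0 records from history and trends",
    ]
    counts, last = summarize_started(sample)
    print(dict(counts), last)
```

If the last timestamp stays put while the pollers are listed as started, that matches what I am seeing: the processes come up and then log nothing further.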

    BTW - I am able to snmpwalk all remote servers from the Zabbix machine so it's not a connectivity/firewall issue. It just doesn't seem to be attempting to gather the info. As I said, even TCP port checks (I check 80, 443, and 22) don't work - only ICMP.
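The same kind of manual check can be done for the TCP ports. Below is a minimal Python sketch, run from the Zabbix machine, that tries a plain TCP connect to ports 80, 443, and 22. The function names and the 3-second timeout are my own choices for illustration, and this is not how Zabbix itself performs its checks.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports=(80, 443, 22), timeout=3.0):
    """Map each port to True (connect succeeded) or False (connect failed)."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    import sys

    # Usage: python check_ports.py <hostname>
    target = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    for port, is_open in check_ports(target).items():
        print(f"{target}:{port} {'open' if is_open else 'closed or unreachable'}")
```

If this reports the ports as open while Zabbix records nothing for them, the problem is on the polling side rather than the network.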
  • evand
    Junior Member
    • Apr 2009
    • 6

    #2
    Update

    I have since increased the number of pollers, etc., and disabled housekeeping. That increased the number of items Zabbix checks before it stops altogether, but the problem remains: a few seconds after startup it stops checking everything except ICMP.


    • evand
      Junior Member
      • Apr 2009
      • 6

      #3
      If I execute this SQL query:

      Code:
      update items set nextcheck=0;
      And then stop & start zabbix_server, it will check most of the SNMP items at least once, and then die. The date on the box is correct:

      Thu May 21 10:24:06 EDT 2009
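To see which item types stay overdue after the reset, a grouped count on `nextcheck` helps. The sketch below is Python against an in-memory SQLite stand-in, not the real Zabbix database. Only the `items` table and its `nextcheck` column come from the query above; the `type` column, the helper names, and the sample rows are assumptions for illustration. With Postgres the driver and the placeholder style would differ.

```python
import sqlite3
import time


def overdue_by_type(conn, now=None):
    """Count items whose nextcheck is in the past, grouped by item type."""
    now = int(time.time()) if now is None else now
    cur = conn.execute(
        "SELECT type, COUNT(*) FROM items WHERE nextcheck < ? "
        "GROUP BY type ORDER BY type",
        (now,),
    )
    return dict(cur.fetchall())


def make_demo_db():
    """Build a tiny stand-in items table (assumed columns, made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (itemid INTEGER, type INTEGER, nextcheck INTEGER)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [(1, 1, 100), (2, 1, 5000), (3, 3, 50), (4, 3, 60)],
    )
    return conn


if __name__ == "__main__":
    # With now=1000, three of the four demo rows have nextcheck in the past.
    print(overdue_by_type(make_demo_db(), now=1000))
```

Running the count a few minutes apart would show whether any type ever gets picked up again after the first pass.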


      • evand
        Junior Member
        • Apr 2009
        • 6

        #4
        I am at my wit's end with this now. I have gone as far as taking a fresh CentOS 5.3 install, installing Postgres on it, restoring a dump of the Zabbix DB onto it, and compiling Zabbix 1.6.4 from source, and the problem still exists on the brand new server. This leads me to believe that it's either some strange configuration problem (and the config certainly didn't change at 1 AM) or a bug in the code. Unfortunately, at this point it looks like my options are to attempt a revert to a previous version of Zabbix (1.4.x) or to move to another platform altogether.
