One client out of 221 refuses to connect

  • slinx
    Junior Member
    • Feb 2010
    • 14

    #1

    Hello,

    We have a ZABBIX Server (daemon) v1.6.5 (revision 7442) (3 April 2009) monitoring 221 clients.

    All of the clients have the same configuration, running ZABBIX Agent (daemon) v1.4.5 (25 March 2008).

    All of them report into the server EXCEPT ONE.

    I can telnet to port 10050 on the client from the server, send a zabbix key name, and get a response.

    I can telnet to port 10051 on the server from the client.
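The two telnet tests above can also be scripted. The following is a minimal sketch, not part of the original post: the helper name `can_connect` is hypothetical, and it only confirms that a TCP connection can be opened, not that the Zabbix agent or server answers correctly.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Hypothetical helper for illustration; it checks reachability only and
    does not speak the Zabbix protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from the server, `can_connect("ta365", 10050)` would mirror the first telnet test; run from the client, `can_connect("10.98.2.35", 10051)` would mirror the second.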

    Here's the client's log:

    Code:
      9686:20100216:142349 In refresh_metrics('10.98.2.35',10051)
      9686:20100216:142349 get_active_checks('10.98.2.35',10051)
      9686:20100216:142349 Sending [ZBX_GET_ACTIVE_CHECKS ta365]
      9686:20100216:142349 Before read
      9686:20100216:142349 In parse_list_of_checks() [ZBX_EOF]
      9686:20100216:142349 In disable_all_metrics()
      9686:20100216:142349 Parsed [ZBX_EOF]
      9686:20100216:142349 In process_active_checks('10.98.2.35',10051)
      9686:20100216:142349 In get_min_nextcheck()
      9686:20100216:142349 Sleeping for 60 seconds
      9686:20100216:142449 In process_active_checks('10.98.2.35',10051)
      9686:20100216:142449 In get_min_nextcheck()
      9686:20100216:142449 Sleeping for 60 seconds
      9686:20100216:142549 In refresh_metrics('10.98.2.35',10051)
      9686:20100216:142549 get_active_checks('10.98.2.35',10051)
      9686:20100216:142549 Sending [ZBX_GET_ACTIVE_CHECKS ta365]
      9686:20100216:142549 Before read
      9686:20100216:142549 In parse_list_of_checks() [ZBX_EOF]
      9686:20100216:142549 In disable_all_metrics()
      9686:20100216:142549 Parsed [ZBX_EOF]
      9686:20100216:142549 In process_active_checks('10.98.2.35',10051)
      9686:20100216:142549 In get_min_nextcheck()
      9686:20100216:142549 Sleeping for 60 seconds
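In the log above, every active-check refresh is answered with a bare `[ZBX_EOF]`, which suggests the agent did reach the server on port 10051 and was simply handed an empty list of active checks. The sketch below counts such refreshes in a log excerpt. It is an illustration only: the function name `empty_refreshes` is hypothetical, and it assumes the whole reply appears on a single `parse_list_of_checks()` debug line, as it does in this log.

```python
import re

# Matches the agent debug line that echoes the reply being parsed.
PARSE_RE = re.compile(r"In parse_list_of_checks\(\) \[(.*)\]")


def empty_refreshes(log_text):
    """Return (total_refreshes, refreshes_whose_reply_was_only_ZBX_EOF)."""
    total = 0
    empty = 0
    for line in log_text.splitlines():
        match = PARSE_RE.search(line)
        if match is None:
            continue
        total += 1
        if match.group(1).strip() == "ZBX_EOF":
            empty += 1
    return total, empty


# Lines copied from the agent log in the post.
SAMPLE = """\
9686:20100216:142349 Sending [ZBX_GET_ACTIVE_CHECKS ta365]
9686:20100216:142349 In parse_list_of_checks() [ZBX_EOF]
9686:20100216:142549 Sending [ZBX_GET_ACTIVE_CHECKS ta365]
9686:20100216:142549 In parse_list_of_checks() [ZBX_EOF]
"""

print(empty_refreshes(SAMPLE))  # (2, 2)
```

If both numbers are equal, the agent side of the active-check exchange is completing normally, so the problem is more likely to be found on the server side.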
    Here's the agentd.conf file (blank lines and comments stripped):

    Code:
    Server=10.98.2.35
    Hostname=ta365
    StartAgents=5
    DebugLevel=4
    PidFile=/var/tmp/zabbix_agentd.pid
    LogFile=/tmp/zabbix_agentd.log
    Timeout=3
    Yet this is what I see in the server log for this client:

    Code:
     30013:20100216:143956 Item [ta365:controllerRollupStatus.1] error: Timeout while connecting to [ta365.ta.com:161]

    Processes look normal:

    Code:
    [root@ta365 ~]# ps -ef | grep zabbix
    zabbix    9680     1  0 Feb15 ?        00:00:00 /usr/local/sbin/zabbix_agentd
    zabbix    9681  9680  0 Feb15 ?        00:00:00 /usr/local/sbin/zabbix_agentd
    zabbix    9683  9680  0 Feb15 ?        00:00:00 /usr/local/sbin/zabbix_agentd
    zabbix    9684  9680  0 Feb15 ?        00:00:00 /usr/local/sbin/zabbix_agentd
    zabbix    9685  9680  0 Feb15 ?        00:00:00 /usr/local/sbin/zabbix_agentd
    zabbix    9686  9680  0 Feb15 ?        00:00:00 /usr/local/sbin/zabbix_agentd
    root     23099 14409  0 14:39 pts/3    00:00:00 grep zabbix

    All the other clients are working fine. What could I be missing?

    Thanks!
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    The Zabbix server is trying to connect to port 161, which is the port used by SNMP agents! You may want to check which port number is configured for this host in the Zabbix GUI.
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga
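The port number in the server log line is the clue in the reply above. As a rough sketch, a line of that shape can be pulled apart and the port compared against the standard SNMP port (161) and the agent port used in this thread (10050). The regular expression and the name `classify_timeout` are assumptions made for illustration, based only on the single log line quoted in the first post; other server versions may format this message differently.

```python
import re

# Pattern inferred from the one server log line quoted in this thread.
ERROR_RE = re.compile(
    r"Item \[(?P<host>[^:\]]+):(?P<key>.+?)\] error: "
    r"Timeout while connecting to \[(?P<target>[^\]:]+):(?P<port>\d+)\]"
)

SNMP_PORT = 161     # standard port for SNMP agents
AGENT_PORT = 10050  # Zabbix agent port as configured in this thread


def classify_timeout(line):
    """Return a dict describing a timeout line, or None if it does not match."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    port = int(match.group("port"))
    if port == SNMP_PORT:
        kind = "snmp"
    elif port == AGENT_PORT:
        kind = "agent"
    else:
        kind = "other"
    return {
        "host": match.group("host"),
        "key": match.group("key"),
        "target": match.group("target"),
        "port": port,
        "kind": kind,
    }


line = (
    " 30013:20100216:143956 Item [ta365:controllerRollupStatus.1] error: "
    "Timeout while connecting to [ta365.ta.com:161]"
)
print(classify_timeout(line)["kind"])  # snmp
```

A result of `snmp` for a host whose agent port is set to 10050 points at an individual SNMP item rather than at the agent connection.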

    • slinx
      Junior Member
      • Feb 2010
      • 14

      #3
      In the GUI, it is configured to use port 10050.

      • Alexei
        Founder, CEO
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Sep 2004
        • 5654

        #4
        What about the item with the key 'controllerRollupStatus.1'? It seems to be SNMP-based, right?

        In version 1.6.x, Zabbix considers a host unavailable if at least one check constantly times out. So either make this item (and any other broken items) work, or disable it.
        Alexei Vladishev
        Creator of Zabbix, Product manager
        New York | Tokyo | Riga
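The 1.6.x behaviour described above can be pictured with a toy model. This is not Zabbix's actual implementation: the function `host_available`, the `window` threshold of three consecutive timeouts, and the second item key are all hypothetical, chosen only to show how one persistently failing item can drag down a host whose other checks are healthy.

```python
def host_available(item_results, window=3):
    """Toy model: a host is unavailable if any item's last `window` checks all timed out.

    `item_results` maps an item key to a list of recent outcomes
    (True = check succeeded, False = check timed out).
    """
    for results in item_results.values():
        recent = results[-window:]
        if len(recent) == window and not any(recent):
            return False
    return True


checks = {
    "system.uptime": [True, True, True],               # illustrative healthy agent item
    "controllerRollupStatus.1": [False, False, False],  # SNMP item from the thread
}
print(host_available(checks))  # False

# Disabling (here: removing) the broken item makes the host available again.
del checks["controllerRollupStatus.1"]
print(host_available(checks))  # True
```

This mirrors the two options given in the reply: fix the failing item so it stops timing out, or disable it so it no longer counts against the host.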

        • slinx
          Junior Member
          • Feb 2010
          • 14

          #5
          Thanks, Alexei. I had to install the net-snmp packages, run /etc/init.d/dataeng enablesnmp, and restart the dataeng script. It is working now.
