Post Installation Troubles - Part 2

  • QwErTy_LoGiC
    Member
    • Feb 2010
    • 66

    #1

    Post Installation Troubles - Part 2

    Hello all,

    I have just installed Zabbix 1.8.1 on Ubuntu 9.10 server and the installation went pretty well.

    However, I have encountered a problem with both the agentd and the server: neither seems to be able to connect to the server. Here is the log from /tmp/zabbix_server.log:

    Code:
    1815:20100218:143930.170 Starting zabbix_server. Zabbix 1.8.1 (revision 9702).
      1815:20100218:143930.170 **** Enabled features ****
      1815:20100218:143930.170 SNMP monitoring:       YES
      1815:20100218:143930.170 IPMI monitoring:        NO
      1815:20100218:143930.170 WEB monitoring:        YES
      1815:20100218:143930.170 Jabber notifications:  YES
      1815:20100218:143930.170 ODBC:                   NO
      1815:20100218:143930.170 SSH2 support:           NO
      1815:20100218:143930.170 IPv6 support:           NO
      1815:20100218:143930.170 **************************
      1816:20100218:143930.215 server #1 started [DB Cache]
      1822:20100218:143930.216 server #7 started [Trapper]
      1825:20100218:143930.217 server #10 started [Trapper]
      1826:20100218:143930.217 server #11 started [Trapper]
      1829:20100218:143930.218 server #14 started [Housekeeper]
      1829:20100218:143930.218 Executing housekeeper
      1831:20100218:143930.219 server #15 started [Timer]
      1815:20100218:143930.220 server #0 started [Watchdog]
      1838:20100218:143930.220 server #21 started [Escalator]
      1835:20100218:143930.221 server #18 started [HTTP Poller]
      1834:20100218:143930.222 server #17 started [Node watcher. Node ID:0]
      1817:20100218:143930.244 server #2 started [Poller. SNMP:YES]
      1818:20100218:143930.244 server #3 started [Poller. SNMP:YES]
      1828:20100218:143930.244 server #13 started [Alerter]
      1819:20100218:143930.244 server #4 started [Poller. SNMP:YES]
      1820:20100218:143930.244 server #5 started [Poller. SNMP:YES]
      1821:20100218:143930.244 server #6 started [Poller. SNMP:YES]
      1827:20100218:143930.244 server #12 started [ICMP pinger]
      1823:20100218:143930.244 server #8 started [Trapper]
      1824:20100218:143930.245 server #9 started [Trapper]
      1837:20100218:143930.245 server #20 started [DB Syncer]
      1833:20100218:143930.274 server #16 started [Poller for unreachable hosts. SNMP:YES]
      1836:20100218:143930.279 server #19 started [Discoverer. SNMP:YES]
      1818:20100218:143931.245 Item [Zabbix Server:system.cpu.util[,nice,avg1]] error: Get value from agent failed: Cannot connect to [127.0.0.1:10050] [Connection refused]
      1818:20100218:143931.246 ZABBIX Host [Zabbix Server]: first network error, wait for 15 seconds
      1829:20100218:143932.941 Deleted 0 records from history and trends
      1833:20100218:143950.275 Item [Zabbix Server:agent.ping] error: Get value from agent failed: Cannot connect to [127.0.0.1:10050] [Connection refused]
      1833:20100218:143950.276 ZABBIX Host [Zabbix Server]: another network error, wait for 15 seconds
      1833:20100218:144005.276 Item [Zabbix Server:agent.ping] error: Get value from agent failed: Cannot connect to [127.0.0.1:10050] [Connection refused]
      1833:20100218:144005.278 ZABBIX Host [Zabbix Server]: another network error, wait for 15 seconds
      1833:20100218:144020.278 Item [Zabbix Server:agent.ping] error: Get value from agent failed: Cannot connect to [127.0.0.1:10050] [Connection refused]
      1833:20100218:144020.300 Disabling ZABBIX host [Zabbix Server]
    And here is the log from /tmp/zabbix_agentd.log :

    Code:
     1357:20100218:152205.692 zabbix_agentd started. Zabbix 1.8.1 (revision 9702).
      1360:20100218:152205.699 zabbix_agentd collector started
      1361:20100218:152205.699 zabbix_agentd listener started
      1362:20100218:152205.699 zabbix_agentd listener started
      1363:20100218:152205.699 zabbix_agentd listener started
      1364:20100218:152205.699 zabbix_agentd active check started [127.0.0.1:10051]
      1364:20100218:152305.705 No active checks on server: host [scmonitor] not found
      1364:20100218:152505.714 No active checks on server: host [scmonitor] not found
    Any ideas?

    Thanks!
  • MrKen
    Senior Member
    • Oct 2008
    • 652

    #2
    "host [scmonitor] not found"

    That looks like a problem! Zabbix Server doesn't know about that host.

    Be sure that the name you have set in your zabbix_agentd.conf is exactly the same as the name you have given your host in the Zabbix GUI. The match is case sensitive.

    # Unique hostname. Required for active checks.
    Hostname=scmonitor

    Also, it may not be absolutely necessary, but I would use the real IP address rather than 127.0.0.1

    Don't forget to restart the agent after you make changes.
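As a rough illustration of the check described above, here is a minimal Python sketch that pulls the `Hostname` value out of the text of a zabbix_agentd.conf and compares it, character for character, with the name entered in the GUI. The function names are made up for this example, and keeping the last `Hostname=` line when there are several is an assumption of the sketch, not documented agent behaviour.

```python
def read_agent_hostname(conf_text):
    """Return the value of the last uncommented Hostname= line, or None."""
    hostname = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments such as "# Unique hostname."
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Hostname":
            hostname = value.strip()  # assumption: a later line overrides
    return hostname


def names_match(agent_hostname, gui_name):
    """Exact, case-sensitive comparison of the two names."""
    return agent_hostname is not None and agent_hostname == gui_name


conf = """
# Unique hostname. Required for active checks.
Hostname=scmonitor
"""
agent_name = read_agent_hostname(conf)
print(names_match(agent_name, "scmonitor"))   # same spelling and case
print(names_match(agent_name, "SCMonitor"))   # differs only in case
```

To run it against a real file, read /etc/zabbix/zabbix_agentd.conf (or wherever your conf lives) into a string and pass that in place of `conf`.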

    HTH,
    MrKen
    Disclaimer: All of the above is pure speculation.


    • QwErTy_LoGiC
      Member
      • Feb 2010
      • 66

      #3
      Some progress...

      Hello all,

      I have managed to solve some issues. The connect failed problem went away; it seems that my entries in the /etc/services file weren't done properly.
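The post does not say what was wrong with the /etc/services entries, so the following is only a sketch of one way to sanity-check them: it scans the text of a services file and reports which names are declared for ports 10050 and 10051 (the two ports seen in the logs). The service names in the sample are illustrative assumptions; the sketch does not depend on any particular name.

```python
def services_for_ports(services_text, ports=(10050, 10051)):
    """Map each port to the (name, protocol) entries declared for it."""
    found = {port: [] for port in ports}
    for raw in services_text.splitlines():
        # Drop trailing comments, then split "name port/proto [aliases]".
        fields = raw.split("#", 1)[0].split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # blank, comment-only, or malformed entry
        port_str, _, proto = fields[1].partition("/")
        if port_str.isdigit() and int(port_str) in found:
            found[int(port_str)].append((fields[0], proto))
    return found


# Illustrative input: the second entry lacks "/tcp", so it is ignored.
sample = """
zabbix-agent    10050/tcp   # hypothetical entry name
zabbix-trapper  10051
"""
for port, entries in services_for_ports(sample).items():
    print(port, entries if entries else "no valid entry")
```

A port that comes back with no valid entry points at a missing or mistyped line in the file.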

      But I still get the host not found errors both in the agent log and the server log.

      Here is what I get now when I start the server:

      Code:
      1828:20100219:105701.917 Starting zabbix_server. Zabbix 1.8.1 (revision 9702).
        1828:20100219:105701.917 **** Enabled features ****
        1828:20100219:105701.917 SNMP monitoring:       YES
        1828:20100219:105701.917 IPMI monitoring:        NO
        1828:20100219:105701.917 WEB monitoring:        YES
        1828:20100219:105701.917 Jabber notifications:  YES
        1828:20100219:105701.917 ODBC:                   NO
        1828:20100219:105701.917 SSH2 support:           NO
        1828:20100219:105701.917 IPv6 support:           NO
        1828:20100219:105701.917 **************************
        1829:20100219:105701.938 server #1 started [DB Cache]
        1830:20100219:105701.967 server #2 started [Poller. SNMP:YES]
        1831:20100219:105701.968 server #3 started [Poller. SNMP:YES]
        1832:20100219:105701.968 server #4 started [Poller. SNMP:YES]
        1835:20100219:105701.969 server #7 started [Trapper]
        1836:20100219:105701.969 server #8 started [Trapper]
        1837:20100219:105701.969 server #9 started [Trapper]
        1839:20100219:105701.970 server #11 started [Trapper]
        1840:20100219:105701.970 server #12 started [ICMP pinger]
        1841:20100219:105701.970 server #13 started [Alerter]
        1843:20100219:105701.970 server #14 started [Housekeeper]
        1843:20100219:105701.970 Executing housekeeper
        1845:20100219:105701.971 server #15 started [Timer]
        1838:20100219:105701.974 server #10 started [Trapper]
        1834:20100219:105701.997 server #6 started [Poller. SNMP:YES]
        1847:20100219:105701.999 server #16 started [Poller for unreachable hosts. SNMP:YES]
        1833:20100219:105702.000 server #5 started [Poller. SNMP:YES]
        1853:20100219:105702.000 server #17 started [Node watcher. Node ID:0]
        1854:20100219:105702.000 server #18 started [HTTP Poller]
        1858:20100219:105702.001 server #20 started [DB Syncer]
        1859:20100219:105702.001 server #21 started [Escalator]
        1828:20100219:105702.002 server #0 started [Watchdog]
        1856:20100219:105702.029 server #19 started [Discoverer. SNMP:YES]
        1843:20100219:105704.883 Deleted 0 records from history and trends
        1836:20100219:105744.942 Sending list of active checks to [10.28.32.237] failed: host [scmonitor.xxx.xxx] not found
        1847:20100219:105747.001 Item [Zabbix Server:agent.ping] error: Got empty string from [127.0.0.1]. Assuming that agent dropped connection because of access permissions
      And here is what I get when I start the agent:

      Code:
      1927:20100219:105914.037 zabbix_agentd started. Zabbix 1.8.1 (revision 9702).
        1927:20100219:105914.037 cfg: para: [LogFile] val [/tmp/zabbix_agentd.log]
        1927:20100219:105914.037 cfg: para: [DebugLevel] val [4]
        1927:20100219:105914.037 cfg: para: [Server] val [10.28.32.237]
        1927:20100219:105914.037 cfg: para: [Hostname] val [scmonitor.xxx.xxx]
        1928:20100219:105914.037 zabbix_agentd collector started
        1929:20100219:105914.037 zabbix_agentd listener started
        1930:20100219:105914.038 zabbix_agentd listener started
        1931:20100219:105914.038 zabbix_agentd listener started
        1932:20100219:105914.038 zabbix_agentd active check started [10.28.32.237:10051]
        1932:20100219:105914.038 In init_active_metrics()
        1932:20100219:105914.038 In send_buffer('10.28.32.237','10051')
        1932:20100219:105914.038 Values in the buffer 0 Max 100
        1932:20100219:105914.038 refresh_active_checks('10.28.32.237',10051)
        1932:20100219:105914.038 Sending [{
              "request":"active checks",
              "host":"scmonitor.xxx.xxx"}]
        1932:20100219:105914.038 Before read
        1928:20100219:105914.039 In collector_diskdevice_get("")
        1928:20100219:105914.039 In collector_diskdevice_add("")
        1932:20100219:105914.041 Got [{
              "response":"failed",
              "info":"host [scmonitor.xxx.xxx] not found"}]
        1932:20100219:105914.041 In parse_list_of_checks()
        1932:20100219:105914.041 In disable_all_metrics()
        1932:20100219:105914.041 No active checks on server: host [scmonitor.xxx.xxx] not found
        1932:20100219:105914.041 In process_active_checks('10.28.32.237',10051)
        1932:20100219:105914.041 In get_min_nextcheck()
        1932:20100219:105914.041 Sleeping for 1 seconds
        1932:20100219:105915.041 In send_buffer('10.28.32.237','10051')
        1932:20100219:105915.041 Values in the buffer 0 Max 100
      I made sure that the hostname parameter in the /etc/zabbix/zabbix_agentd.conf file did match the value entered in the front end and that the Server parameter has the IP address and not the loopback address.

      Could it have something to do with the hosts file config? Just in case, here it is:

      Code:
      127.0.0.1 localhost
      #127.0.1.1 scmonitor.xxx.xxx scmonitor
      10.28.32.237 scmonitor.xxx.xxx scmonitor

      # The following lines are desirable for IPv6 capable hosts
      ::1 localhost ip6-localhost ip6-loopback
      fe00::0 ip6-localnet
      ff00::0 ip6-mcastprefix
      ff02::1 ip6-allnodes
      ff02::2 ip6-allrouters
      ff02::3 ip6-allhosts

      Any ideas?


      • MrKen
        Senior Member
        • Oct 2008
        • 652

        #4
        You still have "host [scmonitor.xxx.xxx] not found" in both the Server log and the Agent log.

        Also, the Server log still has a reference to 127.0.0.1:
        "Got empty string from [127.0.0.1]"

        You changed the IP address in the zabbix_agentd.conf, but did you also change the Host IP in the GUI?

        Another source of confusion for many people is the Hostname in Zabbix. On the Configuration --> Hosts GUI page, the first three columns are Name, DNS, IP. In Zabbix, the Host or Hostname refers to the first column, 'Name'. Name can be anything you want, e.g. MyServer, QwertyNMS, BilliesStuff, etc. The name has nothing to do with DNS. So a name like 'scmonitor' is okay; it doesn't need to be 'scmonitor.xxx.xxx', but it can be if you want, so long as it is exactly the same in the GUI and in the conf file.
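To make the Name-versus-DNS point concrete, here is a small Python sketch that collects every name reported in "host [...] not found" log lines and checks it against the Name column from the GUI. The log line is copied from the server log above; the GUI name list is hypothetical, since the thread never states what was actually entered there, and the "near match" rule (same name ignoring case, or same first label before a dot) is just a heuristic chosen for this example.

```python
import re

# Matches the "host [NAME] not found" fragment seen in both logs.
NOT_FOUND = re.compile(r"host \[([^\]]+)\] not found")


def rejected_hosts(log_text):
    """Return the distinct host names the server reported as unknown."""
    return sorted(set(NOT_FOUND.findall(log_text)))


def explain_mismatch(rejected, gui_names):
    """For each rejected name, say whether a similar GUI Name exists."""
    hints = {}
    for name in rejected:
        if name in gui_names:
            hints[name] = "exact match exists"
            continue
        near = [g for g in gui_names
                if g.lower() == name.lower()
                or g.split(".")[0].lower() == name.split(".")[0].lower()]
        hints[name] = ("near matches: " + ", ".join(near)) if near else "no similar name"
    return hints


log = ("1836:20100219:105744.942 Sending list of active checks to "
       "[10.28.32.237] failed: host [scmonitor.xxx.xxx] not found")
gui_names = ["scmonitor"]  # hypothetical contents of the GUI Name column
print(explain_mismatch(rejected_hosts(log), gui_names))
```

A near match that is not an exact match is the situation described above: the two names look alike to a person but are different strings to the server.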

        You're getting closer.

        MrKen
        Disclaimer: All of the above is pure speculation.


        • QwErTy_LoGiC
          Member
          • Feb 2010
          • 66

          #5
          Getting closer...

          I think you really nailed it, MrKen. It seems the differences in the Name column caused this. Everything is now uniform and working!

          Thanks MrKen!
