zabbix 1.8.4 agent.ping nodata doesn't work

  • boy01
    Junior Member
    • Dec 2007
    • 24

    #1

    zabbix 1.8.4 agent.ping nodata doesn't work

    I used to monitor zabbix agents with a trigger:

    agent.ping.nodata(180)>0

    This isn't working with 1.8.4 zabbix.

    Code:
     29238:20110217:140232.700 Item [myhost.mydom.fi:net.if.out[lo]] error: Get value from agent failed: Cannot connect to [myhost.mydom.fi:10050] [Connection refused]
     29238:20110217:140232.704 Zabbix Host [myhost.mydom.fi]: first network error, wait for 15 seconds
     29233:20110217:140234.702 Item [myhost.mydom.fi:vfs.fs.size[/,pfree]] error: Get value from agent failed: Cannot connect to [myhost.mydom.fi:10050] [Connection refused]
     29233:20110217:140234.702 Zabbix Host [myhost.mydom.fi]: another network error, wait for 15 seconds
     29245:20110217:140249.243 Item [myhost.mydom.fi:net.if.out[lo]] error: Get value from agent failed: Cannot connect to [myhost.mydom.fi:10050] [Connection refused]
     29245:20110217:140249.243 Zabbix Host [myhost.mydom.fi]: another network error, wait for 15 seconds
     29245:20110217:140304.247 Item [myhost.mydom.fi:proc.num[mysqld]] error: Get value from agent failed: Cannot connect to [myhost.mydom.fi:10050] [Connection refused]
     29245:20110217:140304.247 Zabbix Host [myhost.mydom.fi]: another network error, wait for 15 seconds
     29245:20110217:140319.251 Item [myhost.mydom.fi:vfs.fs.size[/mnt3,pfree]] error: Get value from agent failed: Cannot connect to [myhost.mydom.fi:10050] [Connection refused]
     29245:20110217:140319.301 Disabling Zabbix host [myhost.mydom.fi]
     29245:20110217:140519.326 Item [myhost.mydom.fi:vfs.fs.size[/opt,pfree]] error: Get value from agent failed: Cannot connect to [myhost.mydom.fi:10050] [Connection refused]
     29245:20110217:140719.355 Item [myhost.mydom.fi:vfs.fs.size[/opt,used]] error: Get value from agent failed: Cannot connect to [myhost.mydom.fi:10050] [Connection refused]
    Why "disabling host"? Does it explain why I don't get any alarms?

    How do people check that their zabbix_agentd processes are running
    with zabbix 1.8.4?
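For readers following the log excerpt above, here is a minimal Python sketch (not part of the original post) that parses a few of those server log lines and measures how long it took from the first network error to the "Disabling Zabbix host" message. The function names and the regular expression are illustrative assumptions; only the log lines themselves come from the post.

```python
import re
from datetime import datetime

# Lines copied from the post above. Assumed layout: PID:YYYYMMDD:HHMMSS.mmm message
LOG = """\
29238:20110217:140232.704 Zabbix Host [myhost.mydom.fi]: first network error, wait for 15 seconds
29233:20110217:140234.702 Zabbix Host [myhost.mydom.fi]: another network error, wait for 15 seconds
29245:20110217:140249.243 Zabbix Host [myhost.mydom.fi]: another network error, wait for 15 seconds
29245:20110217:140304.247 Zabbix Host [myhost.mydom.fi]: another network error, wait for 15 seconds
29245:20110217:140319.301 Disabling Zabbix host [myhost.mydom.fi]
"""

LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6}\.\d{3}) (.*)$")


def parse_line(line):
    """Split one log line into (pid, timestamp, message), or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    pid, date, time, message = match.groups()
    stamp = datetime.strptime(date + time, "%Y%m%d%H%M%S.%f")
    return int(pid), stamp, message


def seconds_until_disabled(log_text):
    """Seconds between the first network error and the host being disabled."""
    first_error = None
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, stamp, message = parsed
        if first_error is None and "first network error" in message:
            first_error = stamp
        elif first_error is not None and message.startswith("Disabling Zabbix host"):
            return (stamp - first_error).total_seconds()
    return None


if __name__ == "__main__":
    print(seconds_until_disabled(LOG))
```

On the excerpt, the host is disabled roughly 47 seconds after the first error, and the two later log lines in the post are two minutes apart.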

    Read the message #8 before making any bigger conclusions...
    Last edited by boy01; 23-02-2011, 14:53. Reason: Auts! My old trigger definition was wrong. Fixed now.
  • untergeek
    Senior Member
    Zabbix Certified Specialist
    • Jun 2009
    • 512

    #2
    Cannot connect to [myhost.mydom.fi:10050] [Connection refused]
    Connection refused ought to be your first clue. If from the Zabbix server you cannot execute

    Code:
    telnet zabbix_client.example.com 10050
    Then you have firewall or network issues.

    Likewise you need to be able to run:

    Code:
    telnet zabbix_server.example.com 10051
    from the zabbix_client box.

    If these fail to connect, Zabbix is not the problem, it's your network connectivity.
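The same connectivity test can be scripted instead of typed into telnet. Below is a minimal Python sketch, assuming only the standard `socket` module; the function name `port_status` and the example host names are illustrative. It separates a refused connection from a timeout, which can help narrow things down: a refusal usually means the host answered but nothing accepted on that port, while a timeout points more toward filtering or routing.

```python
import socket


def port_status(host, port, timeout=3.0):
    """Try one TCP connect and return 'open', 'refused', 'timeout' or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        # DNS failure, unreachable network, and similar conditions
        return "error"


# Example usage (host names are placeholders, as in the telnet commands above):
#   port_status("zabbix_client.example.com", 10050)   # run from the Zabbix server
#   port_status("zabbix_server.example.com", 10051)   # run from the client box
```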


    • boy01
      Junior Member
      • Dec 2007
      • 24

      #3
      Maybe I described my problem too hastily...

      Problem is:

      1. zabbix_agentd dies on a client machine (because of some problem condition)
      2. on zabbix_server I have a trigger: agent.ping.nodata(180)>0
      3. the above trigger doesn't go off and I don't get any alarms

      It used to work with 1.6.6.

      How do you monitor your zabbix_agentd process with 1.8.4?
      Last edited by boy01; 23-02-2011, 10:21. Reason: Trigger fixed.


      • untergeek
        Senior Member
        Zabbix Certified Specialist
        • Jun 2009
        • 512

        #4
        I use agent.ping to check and see if I'm getting data.

        I use tcp,10050 to see if the agent is on the other side and running. As a simple check, you should not depend on the agent being running to let you know it's running.


        • boy01
          Junior Member
          • Dec 2007
          • 24

          #5
          Originally posted by untergeek
          I use agent.ping to check and see if I'm getting data.

          I use tcp,10050 to see if the agent is on the other side and running. As a simple check, you should not depend on the agent being running to let you know it's running.
          So, you have an item to check the tcp/10050 port for every agent host
          on your zabbix server? And a trigger for every agent, too?
          I don't like that approach. I have several hundred agent hosts...

          If I understood you wrong, could you please provide samples of your item(s) and trigger(s)
          that raise an alarm when zabbix_agentd has died on some host?

          agent.ping.nodata was an easy way to accomplish this before.
          Too bad it isn't working anymore.


          • untergeek
            Senior Member
            Zabbix Certified Specialist
            • Jun 2009
            • 512

            #6
            We have 425 hosts as well and have no problems with using both. We'll be doubling that number before too long.

            Yes, I am advocating using both. The reason I found is that if zabbix doesn't get a response from agent.ping it can result in UNKNOWN rather than a trigger.

            If port 10050 is down, I know that the agent is offline. It's quite easy to guarantee results.

            agent.ping is also agent based. Are you using it as an active agent or passive? Based on the number of Pollers you have you could run into trouble with passive. We are nearly completely active agent based. If you have a long time period for RefreshActiveChecks then there is the potential for timeouts and other stuff with your agent. That's why we get a warning if the agent.ping is down for more than 3 minutes and a disaster when 10050 is down for 3 minutes.

            Even with 1.6.6 agent.ping was unreliable, but it was WAY more reliable in the way you are describing. We also used it that way before. Circumstances had us adapt, and this was our solution. It's way more reliable.
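The two-level alerting described in this post (a warning when agent.ping has been silent for 3 minutes, a disaster when port 10050 has been down for 3 minutes) can be sketched in Python as follows. This only models the decision logic; it is not Zabbix configuration, and the names `Severity` and `classify` as well as the use of `>=` at the threshold are assumptions made for illustration.

```python
from enum import Enum


class Severity(Enum):
    OK = 0
    WARNING = 1
    DISASTER = 2


THRESHOLD_SECONDS = 180  # the 3 minutes mentioned in the post


def classify(seconds_without_ping, seconds_port_down):
    """Combine the agent-based check and the port check into one severity.

    seconds_without_ping: time since the last agent.ping value arrived
    seconds_port_down:    time the TCP check on port 10050 has been failing
    """
    if seconds_port_down >= THRESHOLD_SECONDS:
        # The port check does not rely on the agent answering, so treat it as decisive.
        return Severity.DISASTER
    if seconds_without_ping >= THRESHOLD_SECONDS:
        return Severity.WARNING
    return Severity.OK
```

The port check is evaluated first because it does not depend on the agent reporting about itself.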


            • boy01
              Junior Member
              • Dec 2007
              • 24

              #7
              Originally posted by untergeek
              Yes, I am advocating using both. The reason I found is that if zabbix doesn't get a response from agent.ping it can result in UNKNOWN rather than a trigger.

              If port 10050 is down, I know that the agent is offline. It's quite easy to guarantee results.
              ...
              agent.ping is also agent based. Are you using it as an active agent or passive?
              Why is agent.ping UNKNOWN when zabbix_agentd isn't responding?
              I think this is a major problem here. It should raise PROBLEM, imho.
              Of course the problem could also be the network connection (in passive mode), but I don't mind that. I'd like to know about any network problems too.

              Almost all my agents are in passive mode.
              ----
              I tried zabbix_agentd monitoring your way...
              It's working fine, but I don't like it at all:
              -created a new template: zabbix_clients-checks
              -created item for _each_ host to be monitored "monitor zabbix_agentd on host X",
              ie. for _every_ host zabbix_agentd is supposed to be running
              -created new trigger for above item (again for every host to be monitored)
              -installed this template on zabbix server host

              This way I need to give permissions for host administrators to add
              new item AND trigger to above template for their new host. Way too complicated.

              Am I missing something here or is this the way you make it work?
              Above _does_ work ok.
              Last edited by boy01; 21-02-2011, 14:17.


              • boy01
                Junior Member
                • Dec 2007
                • 24

                #8
                Manual 1.6:
                "agent.ping Check the agent availability.
                Always return ‘1’.
                - Can be used as a TCP ping."

                "nodata sec any Returns:
                1 – if no data received during period of time in seconds. The
                period should not be less than 30 seconds.
                0 - otherwise
                "

                Manual 1.8:
                "agent.ping
                Check the agent availability. Returns '1' if agent is available, nothing if unavailable. - Can be used as a TCP ping. Use function nodata() to check for host unavailability
                "
                "nodata sec any
                Returns:
                1 – if no data received during period of time in seconds. The period should not be less than 30 seconds.
                0 - otherwise
                "
                I can't see any difference in the nodata description between the manuals for versions 1.6.6 and 1.8.4
                (my old and new zabbix server). @untergeek may be right in saying
                "The reason I found is that if zabbix doesn't get a response from agent.ping it can result in UNKNOWN rather than a trigger."

                Ie. ITEM: agent.ping = unknown, BUT shouldn't agent.ping.nodata still
                return 1 and fire off the trigger ?

                ----
                I'm really confused now...my trigger seems to work now...tried it with several test machines.
                Hmm...hopefully my first message is wrong and agent.ping.nodata is OK w/ 1.8.4 version also.

                This is my current trigger:
                {my_template:agent.ping.nodata(135)}>0

                The agent.ping update interval is 30s, so I'm getting an alarm when the 5th poll fails.
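The arithmetic behind "the 5th poll" can be checked with a small Python model of nodata() as the manual excerpts above describe it (1 if no data arrived during the period, 0 otherwise). This is a simplified sketch, not Zabbix code: it assumes the last good value arrived at t=0, that every later poll fails, and it ignores when the server actually re-evaluates the trigger. The function names are illustrative.

```python
def nodata(last_value_time, now, sec):
    """Return 1 if no value has arrived for more than `sec` seconds, else 0."""
    return 1 if (now - last_value_time) > sec else 0


def first_alarming_poll(interval, sec):
    """1-based index of the first failed poll at whose time nodata(sec) is 1.

    Assumes the last successful value arrived at t=0 and polls are attempted
    every `interval` seconds afterwards, all of them failing.
    """
    n = 1
    while nodata(0, n * interval, sec) == 0:
        n += 1
    return n


if __name__ == "__main__":
    # 30 s update interval and nodata(135), as in the trigger above
    print(first_alarming_poll(30, 135))
```

With a 30 s interval, failed polls fall at 30, 60, 90 and 120 seconds, all inside the 135 s window; the poll at 150 seconds is the first one past it.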

                Maybe this thread could be removed/closed if this was my mistake from the beginning...?
                At least I can't reproduce the situation I described in my first message.
                Last edited by boy01; 23-02-2011, 14:51.


                • alj
                  Senior Member
                  • Aug 2006
                  • 188

                  #9
                  I hit the same problem with 1.8.6
                  Server went down, nobody noticed.
                  agent.ping.nodata(600)}=1 trigger is in UNKNOWN state

                  Please help, I need a reliable way to alert on unreachable agents and I'm out of ideas.


                  • alj
                    Senior Member
                    • Aug 2006
                    • 188

                    #10
                    I just created a simple check to monitor the agent's TCP port. It receives data, but the trigger will not fire. I tried disabling and re-enabling the host; nothing works.
                    That's why "status" and 'agent.ping.nodata' triggers didn't work for me as well - there is something wrong with trigger evaluation logic when the agent is not responsive.


                    • frankymryao
                      Member
                      • Oct 2011
                      • 52

                      #11
                      I had the same situation.

                      I use 'agent.ping.nodata(300)}=1' and it works well in 1.8.8. When I stop an agent, the trigger fires soon afterwards.


                      • MANKY
                        Junior Member
                        • Apr 2012
                        • 3

                        #12
                        I am writing in this thread because my question is quite similar.

                        My questions are:
                        1) I need to monitor ping for one of our Linux servers without installing the zabbix agent on it. Is that possible?
                        2) Someone said yes, which is why I am trying agent.ping; however, it says "Get value from agent failed- Interrupted System Call"

                        Please let me know how this can be done.

                        Thanks


                        • drumspirit
                          Junior Member
                          • Mar 2014
                          • 13

                          #13
                          Hello all,

                          Does the function nodata() work with any numeric item (float or integer)?
