What are Zabbix Agent Active Checks

  • tekemp
    Junior Member
    • Sep 2009
    • 5

    #1

    What are Zabbix Agent Active Checks

    Can someone help me differentiate between zabbix agent and zabbix agent (active)? When would I want to use each item type and what are their differences?

    Thanks a ton!
  • simonuk1
    Member
    • Mar 2009
    • 66

    #2
    The "zabbix agent" item type is a passive check: the Zabbix server talks to the agent and asks for each metric. With "zabbix agent (active)", the agent itself sends the metrics it has been told to collect to the Zabbix server.

    So basically, using active agent items means that the load is taken away from the Zabbix server and given to each agent, which does most of the work.

    Also, with active checks the Zabbix server doesn't have to wait for each agent to answer back with a metric; it just processes the values that the agents send in.

    Hope that helps

    Simon
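
For illustration, here is a minimal Python sketch of the request an agent makes in active mode. The two JSON fields are the ones visible in the agent log quoted further down this thread; the function names are hypothetical, and the sketch deliberately leaves out the network framing, so treat it as an outline rather than the actual agent code.

```python
import json


def build_active_checks_request(hostname: str) -> str:
    """Return the JSON body an agent sends to ask for its active checks."""
    # Same two fields that appear in the zabbix_agentd.log excerpt in this thread.
    return json.dumps({"request": "active checks", "host": hostname})


def is_failed_response(body: str) -> bool:
    """Return True when the reply carries "response": "failed"."""
    return json.loads(body).get("response") == "failed"


if __name__ == "__main__":
    print(build_active_checks_request("hosta.domain.com"))
```

In passive mode the direction is reversed: the server opens the connection and asks for one item key at a time.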


    • frankcheong
      Member
      • Oct 2009
      • 73

      #3
      I see, but I have created some items (for details, refer to this post) that only work when I set the item type to zabbix_agent, not zabbix_agent (active). Any help is much appreciated.


      • tchjts1
        Senior Member
        • May 2008
        • 1605

        #4
        Originally posted by frankcheong
        I see, but I have created some items (for details, refer to this post) that only work when I set the item type to zabbix_agent, not zabbix_agent (active). Any help is much appreciated.

        On the host side, in zabbix_agentd.conf, you must have the field "Hostname=<matched hostname here>".

        That must exactly match what you have for the host name in the frontend, otherwise you will not collect the metrics. If you change that field on the host, be sure to restart the agent.
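
The check described above can be sketched in Python. This is only an illustration under assumptions: `read_hostname` and `hostname_matches` are hypothetical helpers, the config text is passed in as a string, and the sketch takes the first `Hostname=` line it finds.

```python
from typing import Optional


def read_hostname(conf_text: str) -> Optional[str]:
    """Return the first Hostname= value in an agent config, or None."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Hostname":
            return value.strip()
    return None


def hostname_matches(conf_text: str, frontend_name: str) -> bool:
    """Exact, case-sensitive comparison with the name used in the frontend."""
    return read_hostname(conf_text) == frontend_name


if __name__ == "__main__":
    conf = "Server=192.168.1.1\nHostname=hosta.domain.com\n"
    print(hostname_matches(conf, "hosta.domain.com"))
```

The comparison is intentionally strict, since the names have to match exactly.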


        • frankcheong
          Member
          • Oct 2009
          • 73

          #5
          Originally posted by tchjts1
          On the host side, in zabbix_agentd.conf, you must have the field "Hostname=<matched hostname here>".

          That must exactly match what you have for the host name in the frontend, otherwise you will not collect the metrics. If you change that field on the host, be sure to restart the agent.

          I did fill in the same hostname in zabbix_agentd.conf as well as in the frontend.


          • tchjts1
            Senior Member
            • May 2008
            • 1605

            #6
            Originally posted by frankcheong
            I did fill in the same hostname in zabbix_agentd.conf as well as in the frontend.
            And you restarted the agent if you made a change on that side? The only time I have seen my items be supported in passive mode but not active is when I had a mismatch in the name of the host.


            • frankcheong
              Member
              • Oct 2009
              • 73

              #7
              Originally posted by tchjts1
              And you restarted the agent if you made a change on that side? The only time I have seen my items be supported in passive mode but not active is when I had a mismatch in the name of the host.
              I did restart the agent, but I cannot see it run when I look at the log entries.


              • frankcheong
                Member
                • Oct 2009
                • 73

                #8

                With your confidence, I retried changing all the items to zabbix_agent (active), and it works. I guess the last problem I encountered was caused by some typos in the UserParameter entries in zabbix_agentd.conf: zabbix_agent failed to parse those parameters and just skipped them without any warning. Therefore it did not collect those items, because without a UserParameter it did not know how to.

                But the strangest thing is that I cannot find any log entries about the problem I mentioned, even though I have set DebugLevel to 4.

                Anyway, now I can run them in active mode, which takes a lot of load off the Zabbix server or the Zabbix proxy.
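
Since the typos went unreported, a quick check of the UserParameter lines before restarting the agent might help. The Python sketch below assumes the usual `UserParameter=<key>,<command>` layout; `check_user_parameters` is a hypothetical helper, and the item keys in the example are made up. It only catches structural mistakes on lines whose name is spelled exactly `UserParameter`, so a typo in the parameter name itself would still slip through.

```python
def check_user_parameters(conf_text: str) -> list:
    """Return (line_number, problem) pairs for malformed UserParameter lines."""
    problems = []
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if not sep or name.strip() != "UserParameter":
            continue  # not a UserParameter line
        # The key and the command are separated by the first comma.
        key, comma, command = value.partition(",")
        if not comma:
            problems.append((number, "missing comma between key and command"))
        elif not key.strip():
            problems.append((number, "empty key"))
        elif not command.strip():
            problems.append((number, "empty command"))
    return problems


if __name__ == "__main__":
    sample = "UserParameter=users.count,who|wc -l\nUserParameter=broken.item who|wc -l\n"
    for number, problem in check_user_parameters(sample):
        print(f"line {number}: {problem}")
```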


                • frankcheong
                  Member
                  • Oct 2009
                  • 73

                  #9
                  I just found the following messages in zabbix_agentd.log:
                  2554:20100120:113454.298 Sending [{
                  "request":"active checks",
                  "host":"hosta.domain.com"}]
                  2554:20100120:113454.298 Before read
                  2554:20100120:113454.378 Got [{
                  "response":"failed",
                  "info":"host [hosta.domain.com] not found"}]

                  Which is strange. I have set the same hostname, "hosta.domain.com", in both the frontend UI and the Hostname entry in zabbix_agentd.conf. The strangest thing is that all the active checks ran without problems, which really confuses me.

                  Then I set DebugLevel to 4 on the Zabbix server, monitored the log, and found the entries below, which relate to the active check request for hosta.
                  1673:20100120:115740.783 Trapper got [{
                  "request":"active checks",
                  "host":"hosta.domain.com"}] len 58
                  1673:20100120:115740.783 In send_list_of_active_checks_json()
                  1673:20100120:115740.783 In get_hostid_by_host(host:'hosta.domain.com')
                  1673:20100120:115740.783 Query [txnlev:0] [select hostid from hosts where host='hosta.domain.com' and proxy_hostid=0 and hostid between 000000000000000 and 099999999999999]
                  1701:20100120:115740.834 In process_escalations()
                  1701:20100120:115740.834 Query [txnlev:0] [select escalationid,actionid,triggerid,eventid,r_eventid, esc_step,status from escalations where status in (0,1) and nextcheck<=1263959860 and escalationid between 000000000000000 and 099999999999999]
                  1701:20100120:115740.835 Escalator spent 0.001037 seconds while processing escalation items. Nextcheck after 3 sec.
                  1673:20100120:115740.838 Query [txnlev:1] [begin;]
                  1673:20100120:115740.838 Query [txnlev:1] [select autoreg_hostid from autoreg_host where proxy_hostid=0 and host='hosta.domain.com' and autoreg_hostid between 000000000000000 and 099999999999999]
                  1673:20100120:115740.839 In process_event(eventid:0,object:3,objectid:1)
                  1673:20100120:115740.839 In DBget_maxid events.eventid
                  1673:20100120:115740.839 Query [txnlev:1] [select nextid from ids where nodeid=0 and table_name='events' and field_name='eventid']
                  1673:20100120:115740.840 Query [txnlev:1] [update ids set nextid=nextid+1 where nodeid=0 and table_name='events' and field_name='eventid']
                  1673:20100120:115740.840 Query [txnlev:1] [select nextid from ids where nodeid=0 and table_name='events' and field_name='eventid']
                  1673:20100120:115740.841 End of DBget_maxid "events"."eventid":31283
                  1673:20100120:115740.841 Query [txnlev:1] [insert into events (eventid,source,object,objectid,clock,value) values (31283,2,3,1,1263959860,1)]
                  1673:20100120:115740.841 In process_actions() eventid:31283
                  1673:20100120:115740.841 Query [txnlev:1] [select actionid,evaltype,status,eventsource from actions where status=0 and eventsource=2 and actionid between 000000000000000 and 099999999999999]
                  1673:20100120:115740.842 End process_actions()
                  1673:20100120:115740.842 End of process_event()
                  1673:20100120:115740.842 Query [txnlev:1] [commit;]
                  1673:20100120:115740.842 Sending list of active checks to [192.168.1.1] failed: host [hosta.domain.com] not found
                  1673:20100120:115740.842 Sending [{
                  "response":"failed",
                  "info":"host [hosta.domain.com] not found"}]

                  I then went to PostgreSQL and issued the command:
                  zabbix=# select hostid from hosts where host='hosta.domain.com';
                  hostid
                  --------
                  10053
                  (1 row)

                  and it successfully returned one row. I copied and pasted the hostname to make sure no typo could affect the result. Anyway, does anyone have an idea what the problem is? After a while, the active checks stopped working again. I guess the log entries I saw earlier were left over from zabbix_proxy or zabbix_server.

                  So, after changing the items from zabbix_agent to zabbix_agent (active), all checks fail to start because the server fails to locate the host, while the same lookup works without problems when I issue the SQL myself.
                  Last edited by frankcheong; 20-01-2010, 05:49.


                  • frankcheong
                    Member
                    • Oct 2009
                    • 73

                    #10
                    Finally, I seem to have found what causes the problem (but unfortunately not the solution). The agent hosta.domain.com is monitored by the Zabbix server servera.domain.com via the Zabbix proxy proxya.domain.com.

                    In the zabbix_server log I found the query "select hostid from hosts where host='hosta.domain.com' and proxy_hostid=0 and hostid between 000000000000000 and 099999999999999". It filters on proxy_hostid=0, which I assume means the host is not monitored by any proxy, while in fact hosta is monitored by proxya. Does that mean active checks have a problem when the zabbix_agent is monitored through a zabbix_proxy? Is that a bug in version 1.8?
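
To show why the manual query from post #9 finds the host while the server's query does not, here is a small Python sketch using an in-memory SQLite table in place of PostgreSQL. The three-column table is a cut-down stand-in for the real hosts table, and the proxy id 10060 is invented; the thread does not show the actual proxy_hostid value, so this only illustrates the suspected cause.

```python
import sqlite3


def lookup_hostid(conn, host, proxy_hostid=None):
    """Look up a hostid, optionally filtering on proxy_hostid like the server does."""
    if proxy_hostid is None:
        # The manual query from post #9: no proxy_hostid filter at all.
        row = conn.execute(
            "select hostid from hosts where host=?", (host,)
        ).fetchone()
    else:
        row = conn.execute(
            "select hostid from hosts where host=? and proxy_hostid=?",
            (host, proxy_hostid),
        ).fetchone()
    return row[0] if row else None


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table hosts (hostid integer primary key, host text, proxy_hostid integer)"
    )
    # Assumption: the host row points at a proxy (10060 is a made-up id).
    conn.execute("insert into hosts values (10053, 'hosta.domain.com', 10060)")
    manual = lookup_hostid(conn, "hosta.domain.com")
    server_style = lookup_hostid(conn, "hosta.domain.com", proxy_hostid=0)
    conn.close()
    return manual, server_style


if __name__ == "__main__":
    print(demo())
```

Under that assumption, the unfiltered lookup returns 10053 and the one with proxy_hostid=0 returns nothing, which would match the "host not found" reply.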


                    • frankcheong
                      Member
                      • Oct 2009
                      • 73

                      #11
                      To get it working, I tried to monitor the server directly without the proxy, but it still failed. In zabbix_agentd.log I found a timeout message:


                      19450:20100120:155335.647 Sending [{
                      "request":"active checks",
                      "host":"servera.domain.com"}]
                      19450:20100120:155335.647 Before read
                      19449:20100120:155335.647 Before
                      19449:20100120:155335.647 Run remote command [who|wc -l] Result [7] [ 11]...
                      19449:20100120:155335.647 Sending back [11.000000]
                      19447:20100120:155335.777 Processing request.
                      19447:20100120:155335.797 Requested [net.tcp.service[smtp]]
                      19447:20100120:155335.797 Sending back [1]
                      19448:20100120:155336.257 Processing request.
                      19448:20100120:155336.277 Requested [net.if.out[eth0,bytes]]
                      19448:20100120:155336.277 Sending back [94795138810]
                      19449:20100120:155336.377 Processing request.
                      19447:20100120:155336.387 Processing request.
                      19447:20100120:155336.407 Requested [system.cpu.load[,avg5]]
                      19447:20100120:155336.407 Sending back [0.560000]
                      19449:20100120:155336.407 Requested [system.cpu.util[,nice,avg1]]
                      19449:20100120:155336.407 Sending back [0.091014]
                      19448:20100120:155336.457 Processing request.
                      19448:20100120:155336.467 Requested [vm.memory.size[free]]
                      19448:20100120:155336.467 Sending back [15376384]
                      19447:20100120:155336.547 Processing request.
                      19447:20100120:155336.567 Requested [vm.memory.size[shared]]
                      19447:20100120:155336.567 Sending back [0]
                      19449:20100120:155336.607 Processing request.
                      19449:20100120:155336.637 Requested [proc.num[syslogd]]
                      19449:20100120:155336.667 Sending back [1]
                      19448:20100120:155336.687 Processing request.
                      19448:20100120:155336.717 Requested [net.tcp.service[ssh]]
                      19448:20100120:155336.717 Sending back [1]
                      19447:20100120:155337.407 Processing request.
                      19447:20100120:155337.427 Requested [system.cpu.load[,avg15]]
                      19447:20100120:155337.427 Sending back [0.510000]
                      19449:20100120:155337.477 Processing request.
                      19449:20100120:155337.497 Requested [system.swap.size[,pfree]]
                      19449:20100120:155337.497 Sending back [11.650241]
                      19448:20100120:155337.537 Processing request.
                      19448:20100120:155337.557 Requested [system.cpu.util[,system,avg1]]
                      19448:20100120:155337.557 Sending back [5.750455]
                      19447:20100120:155337.587 Processing request.
                      19447:20100120:155337.607 Requested [vm.memory.size[buffers]]
                      19447:20100120:155337.607 Sending back [352894976]
                      19449:20100120:155337.647 Processing request.
                      19449:20100120:155337.667 Requested [proc.num[sshd]]
                      19449:20100120:155337.697 Sending back [26]
                      19450:20100120:155338.627 Timeout while answering request
                      19450:20100120:155338.627 Get active checks error: ZBX_TCP_READ() failed [Interrupted system call]
                      19450:20100120:155338.627 In process_active_checks('servera.domain.com',10051)
                      19450:20100120:155338.627 In get_min_nextcheck()
                      19450:20100120:155338.627 In send_buffer('servera.host.com','10051')
                      19450:20100120:155338.627 Values in the buffer 0 Max 100
                      19450:20100120:155338.627 Sleeping for 1 seconds
                      19448:20100120:155338.847 Processing request.
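
As a quick worked check on the excerpt above, the time between the "Sending" line and the "Timeout while answering request" line of process 19450 can be computed from the log timestamps. This Python sketch assumes the `PID:YYYYMMDD:HHMMSS.mmm` prefix seen in these logs; the function names are hypothetical.

```python
from datetime import datetime


def parse_stamp(line: str) -> datetime:
    """Parse the timestamp from a line like 'PID:YYYYMMDD:HHMMSS.mmm message'."""
    _pid, date, rest = line.split(":", 2)
    clock = rest.split(" ", 1)[0]
    return datetime.strptime(f"{date}:{clock}", "%Y%m%d:%H%M%S.%f")


def elapsed_seconds(first_line: str, last_line: str) -> float:
    """Seconds between two log lines."""
    return (parse_stamp(last_line) - parse_stamp(first_line)).total_seconds()


if __name__ == "__main__":
    sent = "19450:20100120:155335.647 Sending [{"
    timed_out = "19450:20100120:155338.627 Timeout while answering request"
    print(elapsed_seconds(sent, timed_out))
```

The gap comes out just under 3 seconds, which I believe corresponds to the agent's Timeout setting, though the configured value is not shown in this thread.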

                      And on the Zabbix server, I found the following log entries:
                      1665:20100120:155302.401 Trapper got [{
                      "request":"active checks",
                      "host":"hosta.domain.com"}] len 58
                      1665:20100120:155302.401 In send_list_of_active_checks_json()
                      1665:20100120:155302.401 In get_hostid_by_host(host:'hosta.domain.com')
                      1665:20100120:155302.401 Query [txnlev:0] [select hostid from hosts where host='hosta.domain.com' and proxy_hostid=0 and hostid between 000000000000000 and 099999999999999]
                      1665:20100120:155302.457 Query [txnlev:0] [select i.itemid,i.key_,h.host,h.port,i.delay,i.description,i.type,h.useip,h.ip,i.history,i.lastvalue,i.prevvalue,i.hostid,i.value_type,i.delta,i.prevorgvalue,i.lastclock,i.units,i.multiplier,i.formula,i.status,i.valuemapid,h.dns,i.trends,i.lastlogsize,i.data_type,i.mtime from hosts h,items i where i.hostid=h.hostid and h.status=0 and i.type=7 and h.hostid=10053 and h.proxy_hostid=0 and (i.status=0 or (i.status=3 and i.lastclock+600<=1263973982))]


                      It seems the query takes quite a long time to run. I then executed it directly in PostgreSQL, and it took around 1 minute to return a result. At this moment, my installation has only four machines.

                      Out of interest and for troubleshooting, I then executed the SQL without the status-check portion [(i.status=3 and i.lastclock+600<=1263973982))], and the result returned almost instantly. When I checked the items table, I found that it already has the four indexes below:
                      Indexes:
                      "items_pkey" PRIMARY KEY, btree (itemid)
                      "items_1" UNIQUE, btree (hostid, key_)
                      "items_3" btree (status)
                      "items_4" btree (templateid)

                      I then suspected that PostgreSQL does not know how to combine three different indexes for this query, because the items_1 index contains the additional key_ column, so I built an index for testing purposes with the statement:

                      create index hostitemstatus on items (hostid, itemid, status);

                      And the query returned instantly. The strangest thing is that when I then dropped the hostitemstatus index, PostgreSQL still returned instantly, which is more than weird. It seems as if PostgreSQL has been trained to use the existing indexes even though I have dropped the new one. I even restarted the database, and the query still returns instantly.

                      I really don't know what to say or recommend.


                      Anyway, as far as I can tell, a zabbix_agent cannot be monitored through a Zabbix proxy if it has active checks. I wonder if this is a bug in version 1.8.
