Auto Discovery in Large Env.

  • natalia
    Senior Member
    • Apr 2013
    • 159

    #1

    Auto Discovery in Large Env.

    Hello,

    My environment consists of 5 locations with ~1000 hosts in each.
    I am going to install Zabbix server 2.2.3 (2.4 once it is released) with a proxy in each location.
    Is there any option to have auto discovery run only on hosts that have zabbix_agent installed, instead of scanning the whole IP range?

    Another option is to use auto registration for new hosts and then, via the API, add each host to the relevant host groups and link it to templates,
    but discovery Actions are very useful and I would prefer to use them.

    What are the best practices for auto discovery in a large environment?

    Thanks
  • Colttt
    Senior Member
    Zabbix Certified Specialist
    • Mar 2009
    • 878

    #2
    maybe something like this:
    Debian-User

    Sorry for my bad english


    • natalia
      Senior Member
      • Apr 2013
      • 159

      #3
      Originally posted by Colttt
      Thanks for the reply !

      Discovery & Actions allow assigning a host to host groups/templates according to the files, processes, services, etc. installed on the host, and they update this when something changes on the host. With agent auto-registration I don't have this functionality: it checks the host only once, when zabbix_agent is installed, and I have no option to check the processes, services, etc. installed on the host (HostMetadata does not allow several values separated by ",", and when I change it nothing is updated).
      Let me know if I am wrong.

      I tried configuring the discovery checks to go by Zabbix agent "agent.hostname" instead of "IP address", but it didn't work. Should it work?



      More questions :-)
      1. I defined a discovery interval of 3600 sec., but discovery actually runs much longer, ~1 day. Should I increase StartDiscoverers to 50? (In my env. there are now 231 hosts monitored by Zabbix server 2.2.2 and 323 hosts via proxy.)
      2. What is the correct way to set up discovery: one discovery rule for all checks (I will have ~50 checks), or several discovery rules?

      Thanks a lot for your help !
      Last edited by natalia; 11-06-2014, 07:43.


      • natalia
        Senior Member
        • Apr 2013
        • 159

        #4
        Can someone share their experience with discovery in a large environment?

        I see only one process on the proxy that is running discovery:

        zabbix_proxy: discoverer #12 [processed 1 rules in 70583.173542 sec, performing discovery]

        Is this normal behavior?

        Discovery with 7 agent checks is stuck (it has checked only 11 hosts out of 370), and in
        the DB (MySQL) the drules table shows nextcheck "0".

        Should discovery work via proxy?

        What is the correct way to set up discovery: one discovery rule for all checks (I will have ~50 checks), or several discovery rules?

        Zabbix 2.2.3 & 2.2.4

        Thanks !
        Last edited by natalia; 01-07-2014, 14:05.


        • ingus.vilnis
          Senior Member
          Zabbix Certified Trainer
          Zabbix Certified Specialist
          Zabbix Certified Professional
          • Mar 2014
          • 908

          #5
          Natalia,

          Many questions in these posts, but I will try to answer some of them and share my thoughts.

          Here is how I see you should configure discovery:
          https://www.zabbix.com/documentation...discovery/rule
          • Configuration -> Discovery -> Create discovery rule
          • Give the rule a name, something unique for each proxy or subnet in your case
          • Discovery by proxy - select proxy if subnet is behind one
          • Give a valid IP range of desired subnet like 192.168.1.1-254
          • Delay in sec depends on your needs. 3600 seconds (1h) is ok for a start
          • Now the part which I did not understand from your comments. You mentioned something about 50 checks. Did you perhaps mean items (metrics you want to monitor for each host)? Could you please explain or confirm that you need these 50 checks? Will you assign different templates according to these checks?
          • Moving on, add a new discovery check. Type: Zabbix agent, Port: 10050, Key: system.uname
          • You can also add ICMP ping here, can be useful, but the more checks you will add, the more time and resources the Discovery process will take.
          • Device uniqueness criteria: IP address would probably be best here
          • Check Enabled
          • Save
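For anyone who prefers scripting over clicking, the steps above can be sketched as a Zabbix API request. This is only a sketch under assumptions: the `drule.create` method and the field names (`iprange`, `delay`, `proxy_hostid`, `dchecks`, `key_`, `ports`, `uniq`) follow my reading of the API documentation and should be verified against your Zabbix version. The rule name, the proxy ID `10105`, the auth token placeholder and the helper `build_drule_request` are hypothetical. The code only builds and prints the JSON-RPC body; it does not send anything.

```python
import json

# Discovery check type codes as I understand them from the API docs
# (assumption: 9 = Zabbix agent, 12 = ICMP ping). Verify for your version.
ZABBIX_AGENT = 9
ICMP_PING = 12


def build_drule_request(name, iprange, proxy_hostid=None, delay=3600,
                        auth_token="<auth token>"):
    """Build a JSON-RPC body for drule.create (hypothetical helper)."""
    params = {
        "name": name,
        "iprange": iprange,
        "delay": delay,
        "dchecks": [
            # One agent check on port 10050, as in the steps above.
            # uniq=0 leaves the uniqueness criteria at IP address (assumption).
            {"type": ZABBIX_AGENT, "key_": "system.uname",
             "ports": "10050", "uniq": 0},
        ],
    }
    if proxy_hostid is not None:
        params["proxy_hostid"] = proxy_hostid
    return {
        "jsonrpc": "2.0",
        "method": "drule.create",
        "params": params,
        "auth": auth_token,
        "id": 1,
    }


# Hypothetical rule for one subnet behind one proxy.
request = build_drule_request("Site A 192.168.1.0/24", "192.168.1.1-254",
                              proxy_hostid="10105")
print(json.dumps(request, indent=2))
```

Building the body separately from sending it makes it easy to review the rule before posting it to the API endpoint of your frontend.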


          You told that your discovery process takes more than an hour. It could be because of too many added checks for discovery. Check your Zabbix internal processes and how busy the discoverers are. If they are 100% used all the time, please increase StartDiscoverers= parameter in zabbix_server.conf and zabbix_proxy.conf files accordingly. Don't forget to restart zabbix_server or proxy processes after editing the config. Anyways, the discovery in larger environments can take pretty much time.

          Now the next thing. Adding the discovered hosts to monitoring and assign templates to them.
          • Configuration -> Actions -> select Discovery from top right dropdown
          • In default Zabbix setup there is already an example Auto discovery. Linux servers.
          • The config there is like this:
          • Received value like Linux
          • Discovery status = Up
          • Service type = Zabbix agent
          • Add to host groups: Linux servers
          • Link to templates: Template OS Linux
          • Then add another action for Windows etc.
          • The only problem here is that you can not add the discovered hosts already to be monitored by proxy. You have to do that manually with Mass update feature later in frontend.


          I really hope I could give you some new ideas about this. By modifying these suggestions you could probably set up some pretty nice discovery of your network. Please ask if there is something more unclear to you.

          Best Regards,
          Ingus


          • natalia
            Senior Member
            • Apr 2013
            • 159

            #6
            Ingus,

            First of all, many thanks for your help and for taking the time to understand my problems!

            Originally posted by ingus.vilnis
            Now the part which I did not understand from your comments. You mentioned something about 50 checks. Did you perhaps mean items (metrics you want to monitor for each host)? Could you please explain or confirm that you need these 50 checks? Will you assign different templates according to these checks?
            In the discovery rule I added 50 agent checks, for example:

            Zabbix agent "agent.hostname"
            Zabbix agent "system.sw.os[name]"
            Zabbix agent "system.uname"
            Zabbix agent "vfs.file.exists[/etc/init.d/apache]"
            Zabbix agent "vfs.file.exists[/etc/init.d/jboss]"
            Zabbix agent "vfs.file.exists[/etc/init.d/tomcat]"
            Zabbix agent "vfs.file.exists[/etc/init.d/mysql]"
            Zabbix agent "vfs.file.exists[/etc/init.d/named]"
            ....

            with uniqueness criteria : Zabbix agent "agent.hostname"

            Then, in Actions -> Discovery, I will assign hosts to different templates and host groups according to these checks.

            Originally posted by ingus.vilnis
            You can also add ICMP ping here, can be useful, but the more checks you will add, the more time and resources the Discovery process will take. Device uniqueness criteria: IP address would probably be best here
            When I define "uniqueness criteria: IP address", it adds all VIPs as hosts, which is not good since no data comes in for these hostnames and I see all the VIP hosts in the delay queue.

            Originally posted by ingus.vilnis
            You told that your discovery process takes more than an hour. It could be because of too many added checks for discovery. Check your Zabbix internal processes and how busy the discoverers are. If they are 100% used all the time, please increase StartDiscoverers= parameter in zabbix_server.conf and zabbix_proxy.conf files accordingly. Don't forget to restart zabbix_server or proxy processes after editing the config. Anyways, the discovery in larger environments can take pretty much time.
            On the proxy I defined StartDiscoverers=30; according to the graph they are only 4.14% busy (avg.).
            On the Zabbix server I defined StartDiscoverers=50; according to the graph they are only 0.0014% busy (avg.).
            The discovery rule is defined to run by proxy, and on the proxy server I see only one process that is running discovery:

            root 16954 16763 0 Jun29 ? 00:00:01 zabbix_proxy: discoverer #8 [processed 0 rules in 0.000228 sec, idle 60 sec]
            root 16955 16763 0 Jun29 ? 00:00:01 zabbix_proxy: discoverer #9 [processed 0 rules in 0.000140 sec, idle 60 sec]
            root 16956 16763 0 Jun29 ? 00:00:01 zabbix_proxy: discoverer #10 [processed 0 rules in 0.000313 sec, idle 60 sec]
            root 16957 16763 0 Jun29 ? 00:00:01 zabbix_proxy: discoverer #11 [processed 0 rules in 0.000138 sec, idle 60 sec]
            root 16958 16763 0 Jun29 ? 00:00:09 zabbix_proxy: discoverer #12 [processed 1 rules in 51721.168174 sec, performing discovery]

            So my conclusion was that discovery uses only one process per discovery rule, and I decided to split the 50 checks into separate discovery rules, but it didn't help: some of them got stuck and nothing happened.
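The process titles in the ps output above already show which discoverer is busy and for how long. A small sketch of how they could be parsed, assuming the title format is exactly what appears in this post (the regular expression and the helper name `parse_discoverers` are my own, not part of Zabbix):

```python
import re

# Process titles copied from the ps output in this post.
PS_LINES = [
    "zabbix_proxy: discoverer #8 [processed 0 rules in 0.000228 sec, idle 60 sec]",
    "zabbix_proxy: discoverer #11 [processed 0 rules in 0.000138 sec, idle 60 sec]",
    "zabbix_proxy: discoverer #12 [processed 1 rules in 51721.168174 sec, performing discovery]",
]

# Pattern inferred from the titles above (assumption, not an official format).
TITLE_RE = re.compile(
    r"discoverer #(?P<num>\d+) \[processed (?P<rules>\d+) rules in "
    r"(?P<secs>[\d.]+) sec, (?P<state>[^\]]+)\]"
)


def parse_discoverers(lines):
    """Return one dict per discoverer process title found in the lines."""
    result = []
    for line in lines:
        match = TITLE_RE.search(line)
        if match:
            result.append({
                "num": int(match.group("num")),
                "rules": int(match.group("rules")),
                "seconds": float(match.group("secs")),
                "state": match.group("state"),
            })
    return result


discoverers = parse_discoverers(PS_LINES)
# Anything whose state does not start with "idle" is working on a rule.
busy = [d for d in discoverers if not d["state"].startswith("idle")]
for d in busy:
    hours = d["seconds"] / 3600
    print(f"discoverer #{d['num']}: {d['state']}, {hours:.1f} h on {d['rules']} rule(s)")
```

With these titles, only discoverer #12 is reported as busy, which matches the observation that a single process is handling the one rule.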

            I know that another option is to use auto registration for new hosts and then, via the API, add each host to the relevant host groups and link it to templates, but Actions -> Discovery is very useful and I would prefer to use it.

            Another question: is there any way to define a discovery rule that checks only hosts with zabbix_agent installed, instead of scanning the whole IP range?

            Originally posted by ingus.vilnis
            The only problem here is that you can not add the discovered hosts already to be monitored by proxy. You have to do that manually with Mass update feature later in frontend.
            That is not necessary, since discovery is performed by the proxy and it automatically adds the host as monitored by this proxy.

            Thanks again !


            • ingus.vilnis
              Senior Member
              Zabbix Certified Trainer
              Zabbix Certified Specialist
              Zabbix Certified Professional
              • Mar 2014
              • 908

              #7
              Natalia,

              You know it is a pleasure to help people with reasonable questions here! At the same time it is a great opportunity to get new knowledge and skills. Sure, not always are all advice helpful or 100% accurate (as sometimes expected here), but we do our best.

              Thank you for such detailed description! Post by post I start to understand more and more about your environment and now I see that there is no place for simple advice and solutions.

              After some in-depth analysis and consulting I have the following comments for you:
              You are right that each discovery rule is processed by one discoverer process. I did not pay enough attention to this detail before. So after checking the source code I can say that splitting your checks into multiple smaller rules will utilize more discoverer processes.

              So my conclusion was that discovery uses only one process per discovery rule, and I decided to split the 50 checks into separate discovery rules, but it didn't help: some of them got stuck and nothing happened.
              Not sure why that did not help in your situation. Some detailed debug information would be helpful. However, another idea in this case would be to use multiple rules, with all 50 agent checks BUT limiting the IP range for each rule, so that each rule processes something like 20 IP addresses. There is a nice Clone feature for these rules, so you can set it up pretty easily.
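A minimal sketch of how such a split could be computed, assuming a range written as `A.B.C.first-last` (the same form as the 192.168.1.1-254 example earlier in the thread). The function name `split_ip_range` and the chunk size of 20 are only illustrative.

```python
def split_ip_range(iprange, chunk_size=20):
    """Split 'A.B.C.first-last' into sub-ranges of at most chunk_size addresses."""
    prefix, _, last_octet_range = iprange.rpartition(".")
    first, last = (int(part) for part in last_octet_range.split("-"))
    chunks = []
    for start in range(first, last + 1, chunk_size):
        end = min(start + chunk_size - 1, last)
        chunks.append(f"{prefix}.{start}-{end}")
    return chunks


# One entry per discovery rule to create (for example with the Clone feature).
ranges = split_ip_range("192.168.1.1-254")
for number, ip_range in enumerate(ranges, 1):
    print(f"discovery rule {number}: {ip_range}")
```

For a full /24 this gives 13 rules, so the number of rules grows quickly with ~1000 hosts per location; the chunk size is a trade-off between rule count and how long a single discoverer stays busy.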

              Another question: is there any way to define a discovery rule that checks only hosts with zabbix_agent installed, instead of scanning the whole IP range?
              No. How can discoverer process know whether the agent exists on the host or not other than scanning each IP one by one? The good thing however is that the basic scan of IP is done very fast. Checking the agent items then takes more time.

              Best Regards,
              Ingus


              • natalia
                Senior Member
                • Apr 2013
                • 159

                #8
                Ingus,

                I am very appreciative for your help :-)

                see below my comments :-)

                Originally posted by ingus.vilnis
                Not sure why that did not help in your situation. Some detailed debug information would be helpful.
                I spent a lot of time debugging and trying to understand this, with no success (-:
                I didn't keep all the info, but something is very strange: in the "drules" table, for the rule performed by proxy, the nextcheck value is always "0" and never changes.

                After deleting all discovery rules and actions several times,
                I found that if I create 2 discovery rules:
                1. Zabbix agent "agent.hostname"
                Zabbix agent "system.sw.os[name]"

                2. the second with all my checks (I define 8 checks for now)

                it works: the first rule completed very fast, but the second has been running for 1 day.

                btw, after the upgrade to 2.2.4, if I modify a discovery rule (save it), it deletes the "discovery check" of this rule in all actions.
                I want to recheck this in another env. before I open a bug.

                Originally posted by ingus.vilnis
                However another idea in this case would be to use multiple rules, with all 50 agent checks BUT limiting IP range for each rule, so that each rule processes something like 20 IP addresses. There is a nice Clone feature for these rules, so you can set it up pretty easily.
                Good idea!!! I will check this ... do you know if there is any limit on how many rules can be created? Or does it just depend on the number of StartDiscoverers configured on the server/proxy?

                Originally posted by ingus.vilnis
                No. How can discoverer process know whether the agent exists on the host or not other than scanning each IP one by one? The good thing however is that the basic scan of IP is done very fast. Checking the agent items then takes more time.
                Maybe I will open a feature request for this.
                btw, if auto registration were rerun every time the metadata changes on the host, it would be very useful! I would no longer need to run 50 checks in network discovery (I found a feature request for this, but without a due date).


                • ingus.vilnis
                  Senior Member
                  Zabbix Certified Trainer
                  Zabbix Certified Specialist
                  Zabbix Certified Professional
                  • Mar 2014
                  • 908

                  #9
                  Natalia,

                  I checked the info you provided.

                  Originally posted by natalia
                  I didn't keep all the info, but something is very strange: in the "drules" table, for the rule performed by proxy, the nextcheck value is always "0" and never changes.
                  I did some tests. Nextcheck value in drules table sets to other than 0 the moment when discovery rule is completely processed. If you have 0 there then the rule either is not started or not finished because of huge subnet or many checks.

                  Originally posted by natalia
                  it works: the first rule completed very fast, but the second has been running for 1 day.
                  Yes, 8 checks for each IP can be very time consuming. But for a test you can create a rule with 8 checks and check only a few hosts (IP 1-5). How much time does it take then?

                  Originally posted by natalia
                  btw, after the upgrade to 2.2.4, if I modify a discovery rule (save it), it deletes the "discovery check" of this rule in all actions.
                  I want to recheck this in another env. before I open a bug.
                  I just saw your recently created bug reports and feature requests. I can confirm that this issue happens in my system as well. Probably as a workaround you could avoid using Discovery check in Actions but use some other parameters instead?

                  Best Regards,
                  Ingus


                  • natalia
                    Senior Member
                    • Apr 2013
                    • 159

                    #10
                    Thanks a lot, Ingus !

                    Originally posted by ingus.vilnis
                    Natalia,
                    I did some tests. Nextcheck value in drules table sets to other than 0 the moment when discovery rule is completely processed. If you have 0 there then the rule either is not started or not finished because of huge subnet or many checks.
                    I changed the delay interval to 36000 and will update you if the Nextcheck value changes (I think 10h should be enough to complete discovery).
                    I hope it's not a bug, because it happens only when I am running discovery via proxy.


                    Originally posted by ingus.vilnis
                    Yes, 8 checks for each IP can be very time consuming. But for a test you can create a rule with 8 checks and check only a few hosts (IP 1-5). How much time does it take then?
                    Is there any way to know exactly how long it took, without running the proxy in debug mode?
                    I have now created 2 more discovery rules for only 8 IPs with delay 600, without any actions. I will update you if the value in drules changes.

                    Originally posted by ingus.vilnis
                    Probably as a workaround you could avoid using Discovery check in Actions but use some other parameters instead?
                    What else can I use instead of Discovery check in Actions?
                    I didn't find any alternative (-:


                    • ingus.vilnis
                      Senior Member
                      Zabbix Certified Trainer
                      Zabbix Certified Specialist
                      Zabbix Certified Professional
                      • Mar 2014
                      • 908

                      #11
                      Natalia,

                      Originally posted by natalia
                      I changed the delay interval to 36000 and will update you if the Nextcheck value changes (I think 10h should be enough to complete discovery).
                      I hope it's not a bug, because it happens only when I am running discovery via proxy.
                      I don't think it is a bug. I had the same on my test DB either with or without proxy. Nextcheck was 0 for the first time.

                      Originally posted by natalia
                      Is there any way to know exactly how long it took, without running the proxy in debug mode?
                      I have now created 2 more discovery rules for only 8 IPs with delay 600, without any actions. I will update you if the value in drules changes.
                      You can simply run a rule, make a coffee and check if the discovery has finished and then check approximate time. Anyway, do some tests and then we can discuss the results.

                      Originally posted by natalia
                      What else can I use instead of Discovery check in Actions?
                      I didn't find any alternative (-:
                      Received value like .. maybe? Another thing is that we will get your discovery to work correctly some day and then you will not need to edit the rules after that.

                      Best Regards,
                      Ingus


                      • natalia
                        Senior Member
                        • Apr 2013
                        • 159

                        #12
                        Originally posted by ingus.vilnis
                        I don't think it is a bug. I had the same on my test DB either with or without proxy. Nextcheck was 0 for the first time.

                        You can simply run a rule, make a coffee and check if the discovery has finished and then check approximate time. Anyway, do some tests and then we can discuss the results.
                        I made a coffee :-) and it looks like both rules have completed, but "Nextcheck" is still 0.
                        Did the Nextcheck value change in your env.?

                        Originally posted by ingus.vilnis
                        Received value like .. maybe? Another thing is that we will get your discovery to work correctly some day and then you will not need to edit the rules after that.
                        Received value is no good, because it returns 0 or 1 for all the checks (-:
                        It would only work if I changed the return value from the number 0/1 to text such as "JBoss", ...

                        Thanks !


                        • ingus.vilnis
                          Senior Member
                          Zabbix Certified Trainer
                          Zabbix Certified Specialist
                          Zabbix Certified Professional
                          • Mar 2014
                          • 908

                          #13
                          Originally posted by natalia
                          I made a coffee :-) and it looks like both rules have completed, but "Nextcheck" is still 0.
                          Did the Nextcheck value change in your env.?


                          Received value is no good, because it returns 0 or 1 for all the checks (-:
                          It would only work if I changed the return value from the number 0/1 to text such as "JBoss", ...

                          Thanks !
                          Natalia,
                          So what I did.
                          1. Made a coffee as well
                          2. Created a new Discovery rule with 2 checks for 20 IPs with no proxy
                          3. Nextcheck was 0 for the new rule
                          4. After a few minutes I had 20 hosts discovered
                          5. My nextcheck value turned from 0 to 1404390513 and keeps changing every minute. (I had 60 seconds as interval)
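The nextcheck value in step 5 reads as a Unix timestamp (seconds since 1970-01-01 UTC), so it can be turned into a readable time to see when the rule is next due. A small sketch; the helper name `describe_nextcheck` is mine, and the wording for the 0 case simply restates what was observed earlier in this thread.

```python
from datetime import datetime, timezone


def describe_nextcheck(nextcheck):
    """Render a drules.nextcheck value as a readable UTC time."""
    if nextcheck == 0:
        # Per the tests in this thread: rule not started or not yet finished.
        return "0 (rule not started, or first pass not finished)"
    moment = datetime.fromtimestamp(nextcheck, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


print(describe_nextcheck(0))
print(describe_nextcheck(1404390513))
```

For the value above this prints 2014-07-03 12:28:33 UTC, so comparing it with the current time shows roughly when the previous pass finished.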

                          Regarding received value I have to think then. Probably there is no other option than to have discovery check.

                          Ingus


                          • natalia
                            Senior Member
                            • Apr 2013
                            • 159

                            #14
                            Originally posted by ingus.vilnis
                            Natalia,
                            So what I did.
                            1. Made a coffee as well
                            2. Created a new Discovery rule with 2 checks for 20 IPs with no proxy
                            3. Nextcheck was 0 for the new rule
                            4. After a few minutes I had 20 hosts discovered
                            5. My nextcheck value turned from 0 to 1404390513 and keeps changing every minute. (I had 60 seconds as interval)

                            Regarding received value I have to think then. Probably there is no other option than to have discovery check.

                            Ingus
                            Could you test the above steps with a proxy?
                            I think the proxy is the problem here?

                            Thanks


                            • ingus.vilnis
                              Senior Member
                              Zabbix Certified Trainer
                              Zabbix Certified Specialist
                              Zabbix Certified Professional
                              • Mar 2014
                              • 908

                              #15
                              Originally posted by natalia
                              Could you test the above steps with a proxy?
                              I think the proxy is the problem here?

                              Thanks
                              Yes, I did that and I can now confirm that nextcheck is not set if rule processed by proxy (v. 2.2.3).
                              You can create a bug report on that if you really need this. I did not find any ZBX on this. Only a feature request to show this status value in frontend.

                              Best Regards,
                              Ingus
