Zabbix milestone achieved - 1,000 hosts monitored

  • tchjts1
    Senior Member
    • May 2008
    • 1605

    #1

    Zabbix milestone achieved - 1,000 hosts monitored

    Today I enabled host number 1,000

    1 Zabbix App server (RHEL - HP DL360 G6, DQC 2.40GHz, 8GB RAM)
    1 Zabbix DB server (RHEL, MySQL - HP DL380 G6, DQC 2.53GHz, 24GB RAM, SAN storage)
    13 Zabbix proxies (RHEL, MySQL - HP DL360 G6, DQC 2.40GHz, 8GB RAM)
    1 Zabbix Administrator

    Hosts span OS's of Win, AIX, HP-UX, Linux and Solaris
    Zabbix version 1.8.2
    All items are passive

    Made possible through talented teammates and Zabbix support.
    Attached Files
    Last edited by tchjts1; 29-08-2010, 18:36.
  • richlv
    Senior Member
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Oct 2005
    • 3112

    #2
    i'd like to point out that "new values per second" is quite... high
    Zabbix 3.0 Network Monitoring book

    • tchjts1
      Senior Member
      • May 2008
      • 1605

      #3
      Always a critic

      Almost all items have update intervals of 60 seconds or more and I have the majority of un-needed items disabled. If you have a good idea of how to lower the nvps... I'm all ears!
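For readers wondering how update intervals relate to nvps, here is a minimal sketch of the arithmetic. It assumes nvps can be approximated as the sum of enabled items divided by their update interval; the function name and the item counts are hypothetical and are not taken from this installation.

```python
def estimate_nvps(item_groups):
    """Approximate new values per second.

    item_groups: iterable of (enabled_item_count, update_interval_seconds).
    Each item is assumed to deliver one value per interval.
    """
    return sum(count / interval for count, interval in item_groups)


# Hypothetical figures for illustration only.
groups = [
    (30000, 60),   # 30,000 items polled every 60 seconds
    (12000, 300),  # 12,000 items polled every 5 minutes
]
print(estimate_nvps(groups))  # 500 + 40 = 540.0
```

Under this approximation, doubling the interval of a group halves its contribution, so the largest groups with the shortest intervals are the first place to look.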

      • richlv
        Senior Member
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Oct 2005
        • 3112

        #4
        it was not a criticism - just an observation regarding the installation size
        Zabbix 3.0 Network Monitoring book

        • tchjts1
          Senior Member
          • May 2008
          • 1605

          #5
          I'm still all ears. I'll let you buy me a drink in Colorado in August.

          For what it's worth... Alexei says that 1.8.2 should be able to handle up to 1,000 nvps. 1.8.3 should be able to handle 2,000+ nvps.

          I would like to lower the current value, but I don't want to get to a point where graphing takes a hit (broken lines, dots, etc).

          Still an achievement considering about a month ago we were sitting at about 250 hosts monitored.

          • richlv
            Senior Member
            Zabbix Certified Trainer
            Zabbix Certified Specialist
            Zabbix Certified Professional
            • Oct 2005
            • 3112

            #6
            Originally posted by tchjts1
            I would like to lower the current value, but I don't want to get to a point where graphing takes a hit (broken lines, dots, etc).
            btw, increasing intervals wouldn't lead to broken graphs - it would only lead to less granularity in them (and collected data)
            Zabbix 3.0 Network Monitoring book

            • tchjts1
              Senior Member
              • May 2008
              • 1605

              #7
              That's good info. I may be able to do some tweaking on items where granularity is not so important.

              nvps trimmed:
              Attached Files
              Last edited by tchjts1; 02-07-2010, 18:32.

              • walterheck
                Senior Member
                • Jul 2009
                • 153

                #8
                Very nice! Can you tell us a bit more about the hardware specs of the Zabbix server and the database server? We hope to get to (and go far over) 1000 hosts as well within the next couple of months.
                Free and Open Source Zabbix Templates Repository | Hosted Zabbix @ Tribily (http://tribily.com)

                • tchjts1
                  Senior Member
                  • May 2008
                  • 1605

                  #9
                  Sure. This is all new hardware specifically purchased for the Zabbix project.

                  Zabbix APP: HP DL360 G6, DQC 2.40GHz, 8GB RAM
                  Zabbix DB: HP DL380 G6, DQC 2.53GHz, 24GB RAM, SAN storage
                  13 Proxies: HP DL360 G6, DQC 2.40GHz, 8GB RAM

                  Beefy machines for proxies, but our intentions are to be able to deploy additional tools to these if we desire.

                  Here's a few screenshots (1 month view) to show the loads on these servers.
                  You can see where tweaking has reduced loads in the past week on the DB and MySQL. We still have some tuning to do.

                  On the APP server graph for processed values per second, the spikes came from using the automysqlbackup script to do backups on the DB server. We stopped using that about a week ago, and we now take a Linux snapshot of the DB and do the backup from that. automysqlbackup was basically locking the DB for 45 minutes, then flooding the App server when it unlocked, causing the spikes.
                  Attached Files
                  Last edited by tchjts1; 03-07-2010, 18:45.

                  • tchjts1
                    Senior Member
                    • May 2008
                    • 1605

                    #10
                    Continuation of graphs. The last 3 graphs are of our more heavily used proxy, with about 275 enabled hosts currently reporting through it.
                    Attached Files
                    Last edited by tchjts1; 03-07-2010, 18:49.

                    • walterheck
                      Senior Member
                      • Jul 2009
                      • 153

                      #11
                      That looks quite nice! I dream of getting my hands on hardware like that as a MySQL consultant, haha!
                      Although I'm surprised that the qps on your mysql server hovers around 1000. I presume you use the value from \s and not the "real" value from the delta's of questions in the variable list of mysql?
                      Which MySQL version and 'distro' are you using?

                      Also, is there any failover on the DB side? SAN for a DB server always makes me nervous. What if the SAN fails?

                      Thanks for sharing by the way, and sorry for all the questions
                      Free and Open Source Zabbix Templates Repository | Hosted Zabbix @ Tribily (http://tribily.com)

                      • tchjts1
                        Senior Member
                        • May 2008
                        • 1605

                        #12
                        Originally posted by walterheck

                        Thanks for sharing by the way, and sorry for all the questions
                        I don't mind at all. That's what these forums are for.

                        Originally posted by walterheck
                        That looks quite nice! I dream of getting my hands on hardware like that as a MySQL consultant, haha!
                        Although I'm surprised that the qps on your mysql server hovers around 1000. I presume you use the value from \s and not the "real" value from the delta's of questions in the variable list of mysql?
                        Which MySQL version and 'distro' are you using?

                        Also, is there any failover on the DB side? SAN for a DB server always makes me nervous. What if the SAN fails?
                        Queries per second is gathered from the default UserParameter that is included in the zabbix_agentd.conf file:
                        UserParameter=mysql.questions,mysqladmin -uxxx -pxxx status|cut -f4 -d":"|cut -f1 -d"S"

                        And the "Store value" for that item is "Delta (speed per second)"
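As a rough illustration of what that combination does, the sketch below pulls the Questions counter out of a `mysqladmin status` line and turns two samples into a per-second rate. The sample line and the function names are made up for the example, and the rate formula is an assumption about how "Delta (speed per second)" behaves (difference in value divided by difference in time), not a copy of Zabbix's code.

```python
import re


def parse_questions(status_line):
    """Extract the cumulative Questions counter from `mysqladmin status` output.

    This is what the cut pipeline in the UserParameter is aiming for:
    the number that follows "Questions:".
    """
    match = re.search(r"Questions:\s*(\d+)", status_line)
    if match is None:
        raise ValueError("no Questions field found in status output")
    return int(match.group(1))


def speed_per_second(prev_value, prev_time, value, time):
    """Assumed delta calculation: (value - prev_value) / (time - prev_time)."""
    elapsed = time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    return (value - prev_value) / elapsed


# Illustrative status line, not real output from this installation.
sample = ("Uptime: 12345  Threads: 2  Questions: 67890  Slow queries: 0  "
          "Opens: 100  Flush tables: 1  Open tables: 64  "
          "Queries per second avg: 5.499")

first = parse_questions(sample)
print(first)                                          # 67890
print(speed_per_second(first, 0, first + 60000, 60))  # 1000.0
```

The point is that the stored value is a rate between two consecutive samples of a cumulative counter, not the "Queries per second avg" figure printed at the end of the status line.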

                        mysql Ver 14.14 Distrib 5.1.45, for unknown-linux-gnu (x86_64) using readline 5.1

                        And this is all on RHEL 5.4

                        Regarding DB failover. Yeah, that's a sticking point right now. We are still deciding which way to go with that. It's more in the hands of our Linux SME than it is me. But if it were to fail today, I assume we would throw the latest copy of the backup we have onto one of our lab Zabbix DB machines and simply point to it. There'd be some data loss, but it would only be a max of the time between last backup and restore time.

                        Certainly not the best solution, and not where we intend to be for long.

                        • walterheck
                          Senior Member
                          • Jul 2009
                          • 153

                          #13
                          Originally posted by tchjts1
                          I don't mind at all. That's what these forums are for.
                          True that

                          Originally posted by tchjts1
                          Queries per second is gathered from the default UserParameter that is included in the zabbix_agentd.conf file:
                          UserParameter=mysql.questions,mysqladmin -uxxx -pxxx status|cut -f4 -d":"|cut -f1 -d"S"
                          That is not really a very accurate value. It would be better to use the numbers from the show variables output, they represent a much more accurate value. You might want to look into an extended zabbix mysql monitoring script that one of my coworkers started here: https://launchpad.net/ourdelta-zabbix-scripts

                          Originally posted by tchjts1
                          mysql Ver 14.14 Distrib 5.1.45, for unknown-linux-gnu (x86_64) using readline 5.1
                          I'd recommend you take a look at MariaDB (http://montyprogram.com). This is a MySQL fork that was started by the original MySQL founder after he left when Oracle took over. It's great and it offers much more perspective towards the future. Monty Program is currently employing most of the core devs that used to work on MySQL. MariaDB 5.1 is already much better than stock MySQL, but the upcoming 5.2 is where the benefits really start to come out: since MP employs the whole MySQL optimiser team, they have managed to make impressive improvements for the upcoming version. Definitely worth investigating!

                          Originally posted by tchjts1
                          Regarding DB failover. Yeah, that's a sticking point right now. We are still deciding which way to go with that. It's more in the hands of our Linux SME than it is me. But if it were to fail today, I assume we would throw the latest copy of the backup we have onto one of our lab Zabbix DB machines and simply point to it. There'd be some data loss, but it would only be a max of the time between last backup and restore time.
                          Wow, scary! With the amount of data you're gathering, that could easily cost you days of downtime! I suggest looking at MMM for MySQL (http://mysql-mmm.org). It allows for very easy HA MySQL setups with little hassle and automatic failover. We're using it in many production sites and it really is the close-to-perfect solution for an app like zabbix. You could even have a slave that you can then take the backups/reports off that would have no impact on performance of the master db's. Many possibilities there!

                          My .02$
                          Free and Open Source Zabbix Templates Repository | Hosted Zabbix @ Tribily (http://tribily.com)

                          • tchjts1
                            Senior Member
                            • May 2008
                            • 1605

                            #14
                            Excellent input. I will pass the info on to my MySql DBA and Linux Engineer.

                            • richlv
                              Senior Member
                              Zabbix Certified Trainer
                              Zabbix Certified Specialist
                              Zabbix Certified Professional
                              • Oct 2005
                              • 3112

                              #15
                              Originally posted by walterheck
                              That is not really a very accurate value. It would be better to use the numbers from the show variables output, they represent a much more accurate value.
                              what's inaccurate about graphing speed per second from questions ?
                              mysql qps, yes, that's not really useful for long running mysql installations.
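To make the distinction concrete, here is a small sketch comparing the two figures. It assumes the average shown by `\s` is simply total Questions divided by Uptime; the numbers and function names are invented for illustration.

```python
def lifetime_average(questions, uptime_seconds):
    """Average since server start, assumed to be Questions / Uptime."""
    return questions / uptime_seconds


def interval_rate(questions_prev, questions_now, seconds):
    """Rate over one sampling interval, i.e. the delta of the counter."""
    return (questions_now - questions_prev) / seconds


# Hypothetical server: up for 100 days at a steady 100 queries per second.
uptime = 100 * 24 * 3600        # 8,640,000 seconds
questions = 100 * uptime        # 864,000,000 questions so far
print(lifetime_average(questions, uptime))  # 100.0

# A sudden burst: 60,000 queries arrive in the next 60 seconds.
burst = 60000
print(interval_rate(questions, questions + burst, 60))  # 1000.0

# The lifetime average barely moves, so the burst is invisible in it.
after = lifetime_average(questions + burst, uptime + 60)
print(after < 101)  # True
```

On a long-running server the lifetime average is dominated by history, while the delta of the Questions counter tracks what the server is doing right now.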
                              Zabbix 3.0 Network Monitoring book
