Zabbix Proxy Performance

Originally posted by MarkusL:
Our goal is monitoring small customer networks. We are working with one Zabbix proxy per customer.

At the moment we are migrating, via a completely new installation, from our old 1.6.5 Zabbix server/proxies to a fresh, redesigned 1.8.4 Zabbix server/proxy environment.

We really love the server <-> proxy model in Zabbix. By the way, all proxies are configured to store their configuration for 48 hours in case of longer internet downtime on our side or the customer's. In that case the proxies store all collected data in the local DB and deliver it on reconnection. With a few tricks in trigger dependencies we only get one alert; no "nodata" or other things popping up. GREAT!

P.S.: Very interesting thread! I'm really interested to hear from other people working with Zabbix proxies.
MarkusL, I'm still learning about the behaviour and limits of Zabbix proxies.

We're using virtual machines (VMware and Xen) to provide 10 proxies, each with 4 CPU cores and 4 GB RAM. But none of them uses more than 1 GB of RAM, and CPU usage never hits 20%.

The proxies are separated by function:
2x trappers (for integration with other internal systems without agent or SNMP)
2x SNMP checks (e.g. Cisco switches)
6x agent (mixed agent + ICMP + SNMP)

So I configure the number of pollers on each proxy to match its main function (see the sketch below).

I'm using SQLite, and collected data is still stored locally for 24 hours.
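For illustration, the agent/SNMP proxies get more pollers while the trapper proxies get more trappers; a zabbix_proxy.conf sketch with invented numbers (not copied from a production file):

    Server=zabbix.example.com          # hypothetical server address
    DBName=/var/lib/zabbix/proxy.db    # with SQLite this is the path to the database file
    StartPollers=30                    # high on the agent/SNMP proxies
    StartTrappers=20                   # high on the trapper proxies
    StartPingers=3                     # for the ICMP checks
    ProxyLocalBuffer=24                # keep collected data locally for 24 hours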

The stats show me the limit for my proxies is around 500 vps. Above this value we see many late checks (queue).
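For anyone who wants to watch these numbers on their own proxies, the vps and queue figures can be tracked with Zabbix internal items, for example (exact support depends on the version):

    zabbix[requiredperformance]    # expected number of new values per second
    zabbix[queue,10m]              # monitored items delayed by more than 10 minutes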

#2
We have a small environment:

Hosts: 28
Items: 3338
Triggers: 303
vps: 52

OS: Debian 6, 32-bit
DB: PostgreSQL (nobody knows the future of MySQL, because of Oracle)

Server: HP ProLiant DL380 G4, 3.4 GHz Xeon, 1 GB RAM
Debian-User

Sorry for my bad English



#3
Originally posted by xsbr:
The stats show me the limit for my proxies is around 500 vps. Above this value we see many late checks (queue).

Wow! 500 vps per proxy is impressive!
Very interesting for me that the proxy <-> server link scales like this. Surely it's a question of proxy performance and internet connection.

As we are dealing with separate customer networks, each of our proxies collects data from around 5 to 40 hosts with 1,000 to 10,000 items/triggers. It's really not that much; that's why we can use Intel Atom hardware.

At the moment all Zabbix proxies together use about 130 kB/s of our firewall's bandwidth, permanently.

Our proxies are "all-in-one" proxies; no separation of services. BUT we designed all our templates NOT to collect much log and text data, simply to reduce bandwidth and, at the end of the line, to keep our central DB "small" (about 100 GB at the moment, storing between 2 days and 1 year of data).
For example, event log monitoring on Windows: we do not collect event logs through the Zabbix agent. Instead we have a lot of scheduled tasks (2008 R2) with an "on event" trigger that push a single value via zabbix_sender to a corresponding item (e.g. Windows Server Backup finished successfully -> zabbix_sender sends value 0 to winevent[winsrvbackup]).
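As an illustration, such a task runs little more than a single zabbix_sender call (proxy and host names invented for the example):

    zabbix_sender.exe -z proxy.customer.lan -s "WINSRV01" -k winevent[winsrvbackup] -o 0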
For SNMP traps we use our own routine to translate the traps and strip out all unnecessary data; only "the message" gets transferred to our server. This is implemented directly on our proxies, so their most important job is to keep the WAN-transferred data as small as possible. Especially with log file monitoring we had some problems in the past (on our old system): too much log data slowed down the whole transfer to our server.
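Roughly, our routine follows the usual snmptrapd traphandle pattern; a simplified sketch (paths and the item key are placeholders, and the real routine does more filtering):

    # /etc/snmp/snmptrapd.conf
    traphandle default /usr/local/bin/trap2zabbix.sh

    #!/bin/sh
    # /usr/local/bin/trap2zabbix.sh: snmptrapd pipes in the host name,
    # the transport address, and then one line per varbind
    read host
    read ip
    msg=$(cat)    # keep only "the message" part of the trap
    zabbix_sender -z 127.0.0.1 -s "$host" -k snmptrap.message -o "$msg"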

Another important thing in our server <-> proxy constellation is time.
Our Zabbix server is configured as an NTP server; all proxies pull time from the Zabbix server; all devices (UPS, firewall, etc.), Linux hosts and Windows DCs pull NTP data from the proxies. A simple way to keep the whole time scenario consistent.
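On the proxies themselves the NTP side is a single line in /etc/ntp.conf (server name invented for the example):

    server zabbix.example.lan iburst    # proxies take time only from the Zabbix server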


Kind regards,

Markus.



#4
Actually, performance of 500 vps can be achieved even on embedded fanless hardware like Soekris or similar products, provided SQLite is used for data storage.
Alexei Vladishev
Creator of Zabbix, Product manager
New York | Tokyo | Riga
My Twitter



#5
Originally posted by Alexei:
Actually, performance of 500 vps can be achieved even on embedded fanless hardware like Soekris or similar products, provided SQLite is used for data storage.

Really, the proxy processes use very few resources. Because of this, we're using only virtual machines.

The limiting factors for a proxy are network latency and the time it takes to retrieve the information from the host.



#6
Hello,
I am trying to figure out how to optimize my Zabbix proxies.
I was thinking of putting the whole proxy database on a tmpfs. It would then just need a boot-up script to recreate the database after a restart.
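For illustration, something like this in /etc/fstab (mount point and size invented, assuming the proxy's SQLite file lives there):

    tmpfs  /var/lib/zabbix  tmpfs  size=512m,mode=0750  0  0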

Do you think it's a relevant idea?
Last edited by tof233; 10-05-2011, 16:44.



#7
Originally posted by tof233:
I was thinking of putting the whole proxy database on a tmpfs.

Make sure that your bottleneck really is disk I/O.

If the proxy needs 100 ms to get a value, the poller will be "sleeping" until it gets that value. In that case, increasing the number of pollers is the solution.
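As a rough back-of-the-envelope calculation (an illustration, not a measurement): at 100 ms per value, one poller collects at most about 10 values per second, so sustaining 500 vps at that latency would already take on the order of 50 pollers (StartPollers=50), and the disk would barely be involved.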

Originally posted by tof233:
It would then just need a boot-up script to recreate the database after a restart.

If you're using SQLite, the Zabbix proxy creates the database automatically.



#8
Thank you for your answer.
Right now our proxy isn't overloaded; it's more a question of what to do if it becomes overloaded in the future (in a year or more).
In fact, I saw that MySQL has the MEMORY (HEAP) storage engine. It keeps the table structure on disk and the data in memory.
http://dev.mysql.com/doc/refman/5.0/...ge-engine.html
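A table could be switched over like this (table name purely illustrative; note that MEMORY tables cannot hold TEXT/BLOB columns and lose their rows on every restart, which is exactly why a repopulation script would be needed):

    ALTER TABLE some_history_table ENGINE=MEMORY;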



#9
We changed from "one master server and five child servers" to "one master server and two proxy servers", and it can support 300,000 items and 500 vps (a month ago it was 400,000+ items and 1300+ vps). I think the server-proxy model is more suitable for large-scale deployments.

With the master-child model we once faced a strange issue: the master refreshes the child's configuration. Because of the complex network environment in China, our master server could not connect to the child servers reliably, so the configuration on the master and the children drifted apart. The child will 'obey' the master; for example, if a host exists on one child server but not on the master, it will vanish from that child server just because of the sync between master and child!

