EC2 zabbix architecture


    Hello,
    I work for a small company which is fully EC2 hosted.
    Currently we monitor:
    - 360 hosts
    - 52936 items
    - 11387 triggers
    - 500 vps

    For that, we use:
    - 1 zabbix-server instance (Zabbix 1.8.3, 8GB RAM)
    - 2 MySQL 5.1.41 in master/slave replication (70 GB RAM)
    - 6 zabbix-proxy instances

    From time to time, we get a flood of false-positive alerts about zabbix-agent not answering on various hosts.

    First investigations showed that the false positives occur when too many SQL queries get locked.
    The fact is that the 'history' table holds around 200M rows and 'history_uint' about 300M :-/

    I already changed the history setting for all items (most were at 90 days, now 7 days).
    I also decreased the housekeeper history for actions and events from 365 days (the default value) to 60.

    But I still have too many rows in my tables (and 60 days of history).
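
As a rough sanity check on the figures above, the expected size of the history tables can be estimated from the collection rate. The sketch below assumes that "500 vps" means 500 new values per second, that the rate is constant, and that every value lands in one of the two history tables mentioned; none of this is confirmed in the post, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def expected_history_rows(values_per_second: float, retention_days: int) -> int:
    """Total rows if exactly `retention_days` of values are kept (constant rate assumed)."""
    return int(values_per_second * SECONDS_PER_DAY * retention_days)


def days_covered(total_rows: int, values_per_second: float) -> float:
    """How many days of data `total_rows` represents at a constant rate."""
    return total_rows / (values_per_second * SECONDS_PER_DAY)


if __name__ == "__main__":
    vps = 500  # assumption: "vps" = values per second
    print(expected_history_rows(vps, 7))
    print(round(days_covered(200_000_000 + 300_000_000, vps), 1))
```

Under these assumptions, 7 days of retention alone would already amount to a few hundred million rows, so shortening retention may not be enough by itself to make the tables small.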

    First question: what is the difference between the item history setting and the housekeeper one? If item history is 7 days, should I decrease the housekeeper's to the same value?

    Some ideas I have to get better performance:
    - Disable the housekeeper and use SQL partitions. What would be the impact on query performance, indexes, ...?
    - Use master/master replication, with zabbix-server using one SQL instance and the Zabbix frontend the other. What could the impact be? Does Zabbix support odd and even ID increments?
    - If the previous improvements do not solve my problem, use distributed monitoring, replacing the zabbix-proxy instances with other zabbix-servers.

    The main bottleneck here is disk I/O performance. Because of EC2, the SQL instance has rather poor disk performance (compared to physical hosting). We try to compensate with memory, but MySQL InnoDB seems to be saturated.

    Any ideas welcome,
    Regards,
    JB

    #2
    Originally posted by jbfavre
    First question: what is the difference between the item history setting and the housekeeper one? If item history is 7 days, should I decrease the housekeeper's to the same value?
    I suggest you read my post:
    http://zabbixzone.com/zabbix/history-and-trends/

    All collected values are stored in the history tables. Every hour, Zabbix Server calculates a summary (min, avg, max) for each item from the history tables and stores it in the trends table.
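
To make the history/trends relationship concrete, here is a minimal sketch of that hourly summary. It is not Zabbix's actual code; the row layout `(itemid, clock, value)` and the `Trend` field names are assumptions chosen to resemble the trends table.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Trend(NamedTuple):
    num: int          # number of history values in the hour
    value_min: float
    value_avg: float
    value_max: float


def hourly_trends(
    history: Iterable[Tuple[int, int, float]],
) -> Dict[Tuple[int, int], Trend]:
    """Group (itemid, clock, value) rows by item and hour, then summarise each group."""
    buckets: Dict[Tuple[int, int], List[float]] = defaultdict(list)
    for itemid, clock, value in history:
        hour_start = clock - clock % 3600  # truncate the Unix timestamp to the hour
        buckets[(itemid, hour_start)].append(value)
    return {
        key: Trend(len(vals), min(vals), sum(vals) / len(vals), max(vals))
        for key, vals in buckets.items()
    }
```

Because each trend row replaces up to an hour of raw values, trends can be kept much longer than history at a fraction of the storage cost.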

    Originally posted by jbfavre
    Some ideas I have to get better performance:
    - Disable the housekeeper and use SQL partitions. What would be the impact on query performance, indexes, ...?
    Great, with partitioning you get smaller tables and performance goes up. The problem is that MySQL struggles with very large tables unless they are partitioned. My history table has 500M rows and history_uint has 320M rows, and I keep only 3 days of data.
    http://zabbixzone.com/zabbix/partitioning-tables/
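
As an illustration of the idea (not the script from the linked post), the sketch below generates MySQL range-partition DDL with one partition per day on the `clock` column. The partition naming scheme, the UTC day boundaries, and the helper names are assumptions made for this example; check the statements against your own schema before running anything.

```python
from datetime import date, datetime, timedelta, timezone
from typing import List


def partition_clauses(first_day: date, days: int) -> List[str]:
    """One 'PARTITION ... VALUES LESS THAN (ts)' clause per day (UTC boundaries assumed)."""
    clauses = []
    for offset in range(days):
        day = first_day + timedelta(days=offset)
        nxt = day + timedelta(days=1)
        # Upper bound: midnight at the start of the following day, as a Unix timestamp.
        upper = int(datetime(nxt.year, nxt.month, nxt.day, tzinfo=timezone.utc).timestamp())
        clauses.append(f"PARTITION p{day:%Y%m%d} VALUES LESS THAN ({upper})")
    return clauses


def partition_ddl(table: str, first_day: date, days: int) -> str:
    """ALTER TABLE statement that partitions `table` by day on its clock column."""
    body = ",\n  ".join(partition_clauses(first_day, days))
    return f"ALTER TABLE {table} PARTITION BY RANGE (clock) (\n  {body}\n);"


def drop_partition_ddl(table: str, day: date) -> str:
    """Statement that discards a whole day of data at once."""
    return f"ALTER TABLE {table} DROP PARTITION p{day:%Y%m%d};"


if __name__ == "__main__":
    print(partition_ddl("history_uint", date(2011, 1, 1), 3))
    print(drop_partition_ddl("history_uint", date(2011, 1, 1)))
```

The point of the design is that expiring old data becomes a `DROP PARTITION` on a whole day instead of the housekeeper deleting rows one by one, which is why the housekeeper is usually disabled for partitioned tables.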

    Originally posted by jbfavre
    - Use master/master replication, with zabbix-server using one SQL instance and the Zabbix frontend the other. What could the impact be? Does Zabbix support odd and even ID increments?
    I'm sure that you will have problems with the "ids" table, which handles all ID increments.

    I suggest you use a master/slave setup with mysqlnd_ms, a PHP module that does load balancing (http://blog.ulf-wendel.de/?p=299), because the frontend runs a lot of SELECT queries.
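
The read/write splitting idea behind that suggestion can be sketched as follows. This is a conceptual illustration only, not how mysqlnd_ms is implemented or configured; the class name and server labels are hypothetical.

```python
import itertools
from typing import Iterator, Sequence


class ReadWriteSplitter:
    """Send SELECT statements to slaves in round-robin order, everything else to the master."""

    def __init__(self, master: str, slaves: Sequence[str]) -> None:
        self.master = master
        # With no slaves configured, reads simply fall back to the master.
        self._slaves: Iterator[str] = (
            itertools.cycle(slaves) if slaves else itertools.repeat(master)
        )

    def route(self, sql: str) -> str:
        """Return the label of the server that should run `sql`."""
        words = sql.split(None, 1)
        # Naive check on the first keyword; statements such as
        # SELECT ... FOR UPDATE would need to go to the master in practice.
        if words and words[0].lower() == "select":
            return next(self._slaves)
        return self.master


if __name__ == "__main__":
    splitter = ReadWriteSplitter("master", ["slave1", "slave2"])
    print(splitter.route("SELECT hostid FROM hosts"))
    print(splitter.route("UPDATE ids SET nextid = nextid + 1"))
```

Keeping all writes on a single master sidesteps the "ids" table problem, at the cost that reads served by a slave may lag slightly behind because replication is asynchronous.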

    Originally posted by jbfavre
    The main bottleneck here is disk I/O performance. Because of EC2, the SQL instance has rather poor disk performance (compared to physical hosting). We try to compensate with memory, but MySQL InnoDB seems to be saturated.
    http://www.mysqlperformanceblog.com/...fer_pool_size/

    Have you tried Percona 5.5 instead of Community 5.1?



      #3
      Originally posted by xsbr
      I suggest you read my post:
      http://zabbixzone.com/zabbix/history-and-trends/

      All collected values are stored in the history tables. Every hour, Zabbix Server calculates a summary (min, avg, max) for each item from the history tables and stores it in the trends table.
      I already read it.
      But since I set item history to 7 days, why do I still have 60 days of history in the history table?

      And what do "actions" and "events" stand for in the Housekeeper tab under "Administration > General"?
      I don't think "actions" acts on the actions table, but I can't find what it is used for.



      Originally posted by xsbr
      Great, with partitioning you get smaller tables and performance goes up. The problem is that MySQL struggles with very large tables unless they are partitioned. My history table has 500M rows and history_uint has 320M rows, and I keep only 3 days of data.
      http://zabbixzone.com/zabbix/partitioning-tables/



      I'm sure that you will have problems with the "ids" table, which handles all ID increments.

      I suggest you use a master/slave setup with mysqlnd_ms, a PHP module that does load balancing (http://blog.ulf-wendel.de/?p=299), because the frontend runs a lot of SELECT queries.
      Great, I'll have a look at it!



      Originally posted by xsbr
      http://www.mysqlperformanceblog.com/2007/11/03/choosing-innodb_buffer_pool_size/

      Have you tried Percona 5.5 instead of Community 5.1?
      Not yet, but we may look into it if the other points do not give us a suitable solution. It is our backup plan.



        #4
        Besides "in-band" monitoring, you can also use a monitoring plugin specially geared towards monitoring your EC2 resources.

        Have a look at http://code.google.com/p/mikoomi

        The plugin queries the AWS EC2 API to keep track of all your servers, snapshots, images, etc.

        As a bonus, it also keeps an eye on spot prices if it matters ;-)

        Jayesh Thakrar



          #5
          Originally posted by jthakrar
          Besides "in-band" monitoring, you can also use a monitoring plugin specially geared towards monitoring your EC2 resources.

          Have a look at http://code.google.com/p/mikoomi
          Well, it looks great.
          But I'll wait until performance improves before enabling this plugin.

          Originally posted by jthakrar
          The plugin queries the AWS EC2 API to keep track of all your servers, snapshots, images, etc.

          As a bonus, it also keeps an eye on spot prices if it matters ;-)

          Jayesh Thakrar
          Had a quick look, and it's definitely something we need.
          By the way, I noticed an error in the doc:
          http://code.google.com/p/mikoomi/wiki/04
          In the "Setup and Configuration" section, MongoDB is mentioned. I'm not sure we need that to run the EC2 plugin.

          Regards,
          JB



            #6
            Originally posted by jbfavre
            I already read it.
            But since I set item history to 7 days, why do I still have 60 days of history in the history table?

            And what do "actions" and "events" stand for in the Housekeeper tab under "Administration > General"?
            I don't think "actions" acts on the actions table, but I can't find what it is used for.
            Answer to myself:
            - "actions history" for housekeeper stands for "alerts"
            - "events history" stands for "events"



              #7
              Hi jthakrar,

              Are you able to help me diagnose an issue that I am having with the EC2_Plugin?

              I'm now able to see spot instance prices etc. in Zabbix, but the data for my account, e.g. the number of Elastic IPs, always returns 0.

              Any ideas why this may be? I have checked that I am using the correct keys :-)

              Matt
