Ad Widget

Collapse

Massive performance improvement possible for logfile triggers

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • thtux
    Junior Member
    • Apr 2011
    • 14

    #1

    Massive performance improvement possible for logfile triggers

    Update: See post #3 for complete patch!

    Hi there,

    I just found out that most of the database load on our Zabbix server is from queries like this one:
    Code:
    select value from history_log where itemid=31155 order by id desc limit 1;
    There are quite some log lines in these table, about one million lines for this specific itemid. The query needs about 15 seconds to return one value. It looks like the query is initiated by the function DBget_history_log_value(..) in zbxserver/expression.c .

    This method issues two sql queries. One to get the itemid:
    Code:
            result = DBselect("select i.itemid,i.value_type from items i,functions f"
                            " where i.itemid=f.itemid and f.functionid=" ZBX_FS_UI64,
                            functionid);
    and a second one to get the complete value (not limited to 255 characters) from the history_log table:
    Code:
                            zbx_snprintf(sql, sizeof(sql), "select %s from history_log"
                                            " where itemid=%s order by id desc",
                                            fieldname,
                                            row[0]);
    For me it looks like it's easily possible to add the timestamp to this second query as we already received it from the first query (column lastclock). When we add the timestamp there is no need for the database to sort these one million lines and it should be a very fast operation. Basically just an index lookup.

    What dou you think about this? Is there something I missed?
    Last edited by thtux; 16-05-2011, 09:00.
  • thtux
    Junior Member
    • Apr 2011
    • 14

    #2
    Proposed patch against 1.8.5:
    Code:
    --- ./libs/zbxserver/expression.c.orig  2011-05-03 08:16:46.000000000 +0200
    +++ ./libs/zbxserver/expression.c       2011-05-03 09:16:59.000000000 +0200
    @@ -958,7 +958,7 @@
            if (FAIL == trigger_get_N_functionid(trigger, N_functionid, &functionid))
                    goto fail;
    
    -       result = DBselect("select i.itemid,i.value_type from items i,functions f"
    +       result = DBselect("select i.itemid,i.value_type,lastclock from items i,functions f"
                            " where i.itemid=f.itemid and f.functionid=" ZBX_FS_UI64,
                            functionid);
    
    @@ -969,9 +969,9 @@
                    if (value_type == ITEM_VALUE_TYPE_LOG)
                    {
                            zbx_snprintf(sql, sizeof(sql), "select %s from history_log"
    -                                       " where itemid=%s order by id desc",
    +                                       " where itemid=%s and clock=%s order by id desc",
                                            fieldname,
    -                                       row[0]);
    +                                       row[0], row[2]);
    
                            h_result = DBselectN(sql, 1);

    Comment

    • thtux
      Junior Member
      • Apr 2011
      • 14

      #3
      Found some more selects like this one. It just uses the additional knowledge from the table 'items' so the database is able to use the index. Works perfectly on mysql.

      Code:
      --- src/libs/zbxserver/expression.c.orig        2011-04-15 21:17:29.000000000 +0200
      +++ src/libs/zbxserver/expression.c     2011-05-03 12:34:13.000000000 +0200
      @@ -958,7 +958,7 @@
              if (FAIL == trigger_get_N_functionid(trigger, N_functionid, &functionid))
                      goto fail;
      
      -       result = DBselect("select i.itemid,i.value_type from items i,functions f"
      +       result = DBselect("select i.itemid,i.value_type,lastclock from items i,functions f"
                              " where i.itemid=f.itemid and f.functionid=" ZBX_FS_UI64,
                              functionid);
      
      @@ -969,9 +969,9 @@
                      if (value_type == ITEM_VALUE_TYPE_LOG)
                      {
                              zbx_snprintf(sql, sizeof(sql), "select %s from history_log"
      -                                       " where itemid=%s order by id desc",
      +                                       " where itemid=%s and clock=%s order by id desc",
                                              fieldname,
      -                                       row[0]);
      +                                       row[0], row[2]);
      
                              h_result = DBselectN(sql, 1);
      
      @@ -1025,7 +1025,7 @@
                      goto fail;
      
              result = DBselect(
      -                       "select i.itemid,i.value_type,i.valuemapid,i.units,i.lastvalue"
      +                       "select i.itemid,i.value_type,i.valuemapid,i.units,i.lastvalue,i.lastclock"
                              " from items i,functions f"
                              " where i.itemid=f.itemid"
                                      " and f.functionid=" ZBX_FS_UI64,
      @@ -1040,9 +1040,9 @@
                      {
                              case ITEM_VALUE_TYPE_LOG:
                              case ITEM_VALUE_TYPE_TEXT:
      -                               zbx_snprintf(tmp, sizeof(tmp), "select value from %s where itemid=%s order by id desc",
      +                               zbx_snprintf(tmp, sizeof(tmp), "select value from %s where itemid=%s and clock=%s order by id desc",
                                                      value_type == ITEM_VALUE_TYPE_LOG ? "history_log" : "history_text",
      -                                               row[0]);
      +                                               row[0], row[5]);
      
                                      h_result = DBselectN(tmp, 1);
      
      --- src/libs/zbxserver/evalfunc.c.orig  2011-04-15 21:17:29.000000000 +0200
      +++ src/libs/zbxserver/evalfunc.c       2011-05-03 13:46:18.000000000 +0200
      @@ -1768,13 +1768,12 @@
              else
              {
                      zbx_snprintf(sql, sizeof(sql),
      -                               "select value"
      -                               " from %s"
      -                               " where itemid=" ZBX_FS_UI64
      -                               " order by %s desc",
      +                               "select h.value"
      +                               " from %s as h, items as i"
      +                               " where i.itemid=" ZBX_FS_UI64
      +                               " and h.itemid=i.itemid and h.clock=i.lastclock",
                                      get_table_by_value_type(item->value_type),
      -                               item->itemid,
      -                               get_key_by_value_type(item->value_type));
      +                               item->itemid);
                      result = DBselectN(sql, arg2);
              }
      WorksForMe(tm) - It reduced the mysqld cpu usage by about 95%!

      Here the time needed to query the old select-statement compared to the new version:
      Code:
      mysql> SELECT h.value FROM history_text as h, items as i WHERE i.itemid=33839 and h.itemid=i.itemid and h.clock=i.lastclock ORDER BY id DESC LIMIT 1;
      +---------------------------------------------------+
      | value                                             |
      +---------------------------------------------------+
      |                 ....(removed)....                 |
      +---------------------------------------------------+
      1 row in set (0.01 sec)
      
      mysql> SELECT value FROM history_text WHERE itemid =33839 ORDER BY id DESC LIMIT 1;
      +---------------------------------------------------+
      | value                                             |
      +---------------------------------------------------+
      |                 ....(removed)....                 |
      +---------------------------------------------------+
      1 row in set (0.96 sec)
      And how mysql is executing the queries:
      Code:
      mysql> explain SELECT value FROM history_text WHERE itemid =33839 ORDER BY id DESC LIMIT 1;
      +----+-------------+--------------+------+-------------------------------+----------------+---------+-------+-------+-----------------------------+
      | id | select_type | table        | type | possible_keys                 | key            | key_len | ref   | rows  | Extra                       |
      +----+-------------+--------------+------+-------------------------------+----------------+---------+-------+-------+-----------------------------+
      |  1 | SIMPLE      | history_text | ref  | history_text_2,history_text_1 | history_text_1 | 8       | const | 24552 | Using where; Using filesort |
      +----+-------------+--------------+------+-------------------------------+----------------+---------+-------+-------+-----------------------------+
      1 row in set (0.00 sec)
      
      mysql> explain SELECT h.value FROM history_text as h, items as i WHERE i.itemid=33839 and h.itemid=i.itemid and h.clock=i.lastclock ORDER BY id DESC LIMIT 1;
      +----+-------------+-------+-------+-------------------------------+----------------+---------+-------------+------+----------------+
      | id | select_type | table | type  | possible_keys                 | key            | key_len | ref         | rows | Extra          |
      +----+-------------+-------+-------+-------------------------------+----------------+---------+-------------+------+----------------+
      |  1 | SIMPLE      | i     | const | PRIMARY                       | PRIMARY        | 8       | const       |    1 | Using filesort |
      |  1 | SIMPLE      | h     | ref   | history_text_2,history_text_1 | history_text_1 | 12      | const,const |    1 | Using where    |
      +----+-------------+-------+-------+-------------------------------+----------------+---------+-------------+------+----------------+
      2 rows in set (0.00 sec)

      Comment

      • 0siris
        Member
        Zabbix Certified Specialist
        • Nov 2010
        • 76

        #4
        keep up the good work

        Comment

        • nelsonab
          Senior Member
          Zabbix Certified SpecialistZabbix Certified Professional
          • Sep 2006
          • 1233

          #5
          Have you opened up a ticket for this to track this?
          RHCE, author of zbxapi
          Ansible, the missing piece (Zabconf 2017): https://www.youtube.com/watch?v=R5T9NidjjDE
          Zabbix and SNMP on Linux (Zabconf 2015): https://www.youtube.com/watch?v=98PEHpLFVHM

          Comment

          • thtux
            Junior Member
            • Apr 2011
            • 14

            #6
            Thanks for your feedback.

            Yes, I opened a Bugreport:


            Looks like this patch solved all the Zabbix problems I had. I often had death poller processes and broken connections to the passive proxies after some days of usage…

            Comment

            • untergeek
              Senior Member
              Zabbix Certified Specialist
              • Jun 2009
              • 512

              #7
              Where did you find this optimization? Are there others like it?

              Comment

              • thtux
                Junior Member
                • Apr 2011
                • 14

                #8
                Originally posted by untergeek
                Where did you find this optimization? Are there others like it?
                The MySQL slow-query log showed me these queries all the time. So I started to dig into the sourcecode. I'm quite sure I eliminated all of these index issues. It's always possible to further improve performance but the database indexes are now used everywhere quite efficiently (as far as I can tell).

                The performance is now very good at my site. So there is no need for me to dive deeper into Zabbix optimization.

                Comment

                • untergeek
                  Senior Member
                  Zabbix Certified Specialist
                  • Jun 2009
                  • 512

                  #9
                  So, you patched it yourself then? And this is the only addition or edit you made to the source code? We aren't using text checks yet, but may in the future, so I plan on adding this if we do. I just was curious if you added any other optimizations.

                  Comment

                  • thtux
                    Junior Member
                    • Apr 2011
                    • 14

                    #10
                    That's all. Only two files involved, no changes to the database. There is also no need to deploy new proxies or agents, everything is in the server binary.

                    I have applied this patch two weeks ago and it still works very well. No problems so far. I think it's save to use it.

                    The attached graph shows the usage of the Zabbix processes. I applied the patch in the middle of the timespan shown.
                    Attached Files

                    Comment

                    Working...