Awfully slow performance when accessing the Web overview pages

  • Doomshammer
    Junior Member
    • Dec 2007
    • 22

    #1

    Awfully slow performance when accessing the Web overview pages

    Hi all,

    I have Zabbix 1.4.3 running with 3 web servers, each with one web page monitored. I don't believe this is a large amount of data, but when I try to open the "Web" overview section, the web GUI hangs.

    A look at the database shows that there are tons of queries running, each taking between 4000 ms and 10000 ms. I tried killing the database processes running these slow queries, which resulted in a page showing the massive number of queries the GUI wanted to execute (see the attached screenshot).

    Is it normal for Zabbix to run so many queries with only 3 web page content checks configured? Any advice would be highly appreciated.


    Thanks!
    Attached Files
  • hoyt
    Member
    • Aug 2007
    • 31

    #2
    Originally posted by Doomshammer
    A look at the database shows that there are tons of queries running, each taking between 4000 ms and 10000 ms. I tried killing the database processes running these slow queries, which resulted in a page showing the massive number of queries the GUI wanted to execute (see the attached screenshot).

    Thanks!

    Assuming you are using a MySQL DB, you could try setting the MySQL "tmpdir" to a memory-based filesystem. This made a pretty big difference for our setup.

    For Linux you would add something like this to /etc/fstab:

    none /mytmp tmpfs size=512M,mode=1777 0 0

    And then set "tmpdir = /mytmp" in the my.cnf file. This assumes your kernel supports the tmpfs filesystem type.

    --John


    • Doomshammer
      Junior Member
      • Dec 2007
      • 22

      #3
      Hi John,

      Originally posted by hoyt
      Assuming you are using a MySQL DB, you could try setting the MySQL "tmpdir" to a memory-based filesystem. This made a pretty big difference for our setup.

      For Linux you would add something like this to /etc/fstab:

      none /mytmp tmpfs size=512M,mode=1777 0 0

      And then set "tmpdir = /mytmp" in the my.cnf file. This assumes your kernel supports the tmpfs filesystem type.

      I am using PostgreSQL, not MySQL. Nonetheless, your suggestion is just a workaround to make database access faster. It doesn't get rid of the main problem: Zabbix is running thousands of queries for just 3 web checks.


      • Doomshammer
        Junior Member
        • Dec 2007
        • 22

        #4
        OK, as I didn't get a reply here and the issue got even worse, I did some investigating myself. I found that the following query was the slowest:

        Code:
        select wt.*,a.name as application,h.host,h.hostid from httptest wt  left join applications a on wt.applicationid=a.applicationid  left join hosts h on h.hostid=a.hostid where a.applicationid=170 and wt.status <> 1 order by h.host,wt.name;
        I analyzed the query and found that it takes up to 40 seconds. Since many of these queries are executed when clicking on the "Web" section, the page runs into an HTTP timeout because the queries never all finish in time.

        A closer look at the query showed that an index on applicationid in the httptest table could speed it up, so I created one:

        Code:
        CREATE INDEX httptestapp ON httptest (applicationid);
        This really did the job. The query now takes only about 0.348 ms, and the "Web" section loads within 1-3 seconds again. Maybe this will be helpful to other people as well.
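
        For readers who want to see the effect of such an index on a query plan without touching a live Zabbix database, here is a minimal sketch. It is not the original setup: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for PostgreSQL, and the httptest table is a simplified, hypothetical version with made-up rows. Only the index name and indexed column are taken from the post above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL, and this is a
# simplified, hypothetical httptest table filled with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE httptest ("
    "httptestid INTEGER PRIMARY KEY, name TEXT, "
    "applicationid INTEGER, status INTEGER)"
)
conn.executemany(
    "INSERT INTO httptest (name, applicationid, status) VALUES (?, ?, ?)",
    [(f"check{i}", i % 500, 0) for i in range(5000)],
)

# Single-table version of the filter used by the slow query.
QUERY = (
    "SELECT * FROM httptest "
    "WHERE applicationid = 170 AND status <> 1 ORDER BY name"
)


def query_plan(connection, sql):
    """Return the planner's description of how it would run `sql`."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, QUERY)  # no index yet: full table scan
conn.execute("CREATE INDEX httptestapp ON httptest (applicationid)")
plan_after = query_plan(conn, QUERY)   # the plan now names the new index

print("before:", plan_before)
print("after: ", plan_after)
```

        On PostgreSQL itself, the comparable check would be to run EXPLAIN ANALYZE on the original query before and after creating the index.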


        • filabrazilska
          Junior Member
          • Aug 2009
          • 1

          #5
          Slow Overview and Latest data pages using PGSQL and Zabbix 1.6.5

          Hi, we had the same trouble with Zabbix 1.6.5 on a PGSQL DB. Even with a modest setup (fewer than 10 hosts monitored), the database took up some three processors, and the Overview and Latest data pages were painfully slow. Creating indices on hosts.status and items.status (CREATE INDEX hosts_status_index ON hosts (status); CREATE INDEX items_status_index ON items (status);) solved the problem.
          Hope this helps
          f.


          • frankcheong
            Member
            • Oct 2009
            • 73

            #6
            I have exactly the same problem with version 1.8 running on a fast dedicated server with dual Xeon 5440 CPUs and 8 GB RAM, running the PAE kernel.

            When I click on the Overview tab, the browser appears to hang. I then looked in the pgsql console and found that some of the SQL queries take forever to complete.

            The following SQL queries take an exceptionally long time to complete: the first needs 5 minutes, the second 10 minutes, and the last around 20 minutes.

            The queries are:

            zabbix=# select * from pg_stat_activity;
            datid | datname | procpid | usesysid | usename | current_query | waiting | query_start | backend_start | client_addr | client_port
            -------+---------+---------+----------+---------+---------------+---------+-------------+---------------+-------------+-------------
            43233 | zabbix | 5422 | 16384 | zabbix | SELECT COUNT(DISTINCT t.triggerid) as cnt FROM triggers t, functions f, items i, hosts h WHERE t.triggerid=f.triggerid AND f.itemid=i.itemid AND i.status=0 AND i.hostid=h.hostid AND h.status=0 | f | 2010-01-08 17:30:59.961939+08 | 2010-01-08 17:30:59.918316+08 | 127.0.0.1 | 53194

            43233 | zabbix | 5421 | 16384 | zabbix | SELECT g.* FROM groups g,hosts_groups hg,hosts h WHERE (g.groupid/100000000000000) in (0) AND hg.groupid=g.groupid AND h.hostid=hg.hostid AND h.status=0 AND EXISTS( SELECT t.triggerid FROM items i, functions f, triggers t WHERE i.hostid=hg.hostid AND i.status=0 AND i.itemid=f.itemid AND f.triggerid=t.triggerid AND t.status=0) | f | 2010-01-08 17:30:59.972692+08 | 2010-01-08 17:30:59.911485+08 | 127.0.0.1 | 53193

            43233 | zabbix | 5426 | 16384 | zabbix | SELECT DISTINCT g.groupid,g.name FROM groups g,hosts_groups hg,hosts h WHERE (g.groupid IN (5,6,2,1,3,4)) AND h.status=0 AND hg.groupid=g.groupid AND h.hostid=hg.hostid AND EXISTS (SELECT i.hostid FROM items i WHERE hg.hostid=i.hostid AND i.status=0) AND EXISTS( SELECT t.triggerid FROM items i, functions f, triggers t WHERE i.hostid=hg.hostid AND i.status=0 AND i.itemid=f.itemid AND f.triggerid=t.triggerid AND t.status=0) ORDER BY g.name | f | 2010-01-08 17:31:08.852987+08 | 2010-01-08 17:31:08.390396+08 | 127.0.0.1 | 53199


            I have already tuned some kernel settings to allow more shared memory, as below:
            server# cat /etc/sysctl.conf
            # Controls the maximum shared segment size, in bytes
            kernel.shmmax = 4294967295
            # Controls the maximum number of shared memory segments, in pages
            kernel.shmall = 268435456

            I have done some minor performance tuning on my pgsql, as below:
            max_connections = 50
            shared_buffers = 256MB
            temp_buffers = 16MB

            Has anyone encountered a similar problem and found a fix or workaround? For now I have to stay away from the Overview page, because whenever I open it the browser almost hangs. Luckily the CPU is still around 70-80% idle and the load stays steady at around 1-3, although it jumps to 4, and sometimes to 6 or even 7, when I click the Overview tab again and again before it returns anything and I am about to lose my temper...


            Anyway, I have since done some further minor performance tuning on my pgsql, as below:
            max_connections = 50
            shared_buffers = 512MB
            temp_buffers = 16MB
            work_mem = 8MB
            max_stack_depth = 4MB
            checkpoint_segments = 20
            effective_cache_size = 512MB

            After that, the first two queries return almost instantly (I don't know whether they are invoked by the Overview tab, because I can't see them pending in the queue), while the last one still takes forever (around the same time as before) to complete.

            This is a fresh build of version 1.8. Are there any additional indexes or other ways to speed up this query?


            • frankcheong
              Member
              • Oct 2009
              • 73

              #7
              Referring to this post (What are Zabbix Agent Active Checks): the Overview page now returns almost instantly. It seems we hit a problem/bug in PostgreSQL where it failed to use any index for no apparent reason, and then it suddenly knew which index to use, again without any clue as to why.
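
              One possible explanation, which is an assumption and not something confirmed in this thread, is that the planner had no statistics for the freshly built tables until they were analyzed. Query planners choose indexes based on collected statistics, and in PostgreSQL those are refreshed by the ANALYZE command (or by autovacuum). The sketch below only illustrates the idea of collecting planner statistics; it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in, and a simplified, hypothetical items table with made-up rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL, and this is a
# simplified, hypothetical items table filled with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "itemid INTEGER PRIMARY KEY, hostid INTEGER, status INTEGER)"
)
conn.executemany(
    "INSERT INTO items (hostid, status) VALUES (?, ?)",
    [(i % 10, i % 2) for i in range(1000)],
)
conn.execute("CREATE INDEX items_status_index ON items (status)")


def stats_tables(connection):
    """List SQLite's planner statistics table, if it exists yet."""
    return connection.execute(
        "SELECT name FROM sqlite_master WHERE name = 'sqlite_stat1'"
    ).fetchall()


stats_before = stats_tables(conn)  # empty: nothing has been analyzed yet

# Collect statistics for all tables and indexes in the database.
conn.execute("ANALYZE")

# After ANALYZE, the planner has a statistics row for the index.
index_stats = conn.execute(
    "SELECT tbl, idx, stat FROM sqlite_stat1 "
    "WHERE idx = 'items_status_index'"
).fetchall()

print("before:", stats_before)
print("after: ", index_stats)
```

              On PostgreSQL, the comparable step would be running ANALYZE on the affected tables and then rechecking the slow query with EXPLAIN ANALYZE.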
