Upgrade from 1.8.2 -> 1.8.3: history updates way too slow

  • shaunr
    Junior Member
    • Apr 2010
    • 13

    #1

    Upgrade from 1.8.2 -> 1.8.3: history updates way too slow

    Hi all

    I recently upgraded to 1.8.3 in order to make use of the passive proxy feature to comply with company firewall policies.


    Problem:
    History updates take a long time to show up on any graph. On the processor load graph, the line is already a couple of minutes behind as soon as Zabbix is started. If I leave it running for 24 hours and then view the graph, it is 4 hours behind!
    When I stop the Zabbix server, the log file shows history being synced. It seems as though the history is held in some buffer and not written to the DB server.
    I don't think the DB server itself has a performance issue, as I have tuned Postgres to make use of the memory. The DB server load is around 4.

    I constantly have to provide graphs as part of a performance tuning exercise on the monitored servers, and having no data for the last 4 hours is becoming hard to explain to the business.

    My setup
    Zabbix Server : 2x Dual core Xeon, 4GB Ram
    Zabbix Proxy : 2x Dual core Xeon, 4GB Ram, still running in Active mode.
    Postgres Server 8.4: 2x Quad core with hyperthreading Xeon, 16GB Ram
    OS: All running Centos 5.5

    Zabbix setup:
    212 monitored hosts
    6349 items
    2898 Triggers
    240 Required server performance
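
To put the reported lag in numbers, here is a rough back-of-the-envelope sketch. It assumes that "Required server performance" means new values per second and that the lag grows linearly over the 24 hours; neither is confirmed in the post, so treat the result as an order-of-magnitude estimate only.

```python
# Figures taken from the post. Reading "Required server performance" as
# new values per second is an assumption, as is linear growth of the lag.
REQUIRED_VPS = 240    # values per second the server must store
RUNTIME_HOURS = 24    # how long the server had been running
LAG_HOURS = 4         # how far the graphs were behind after that time


def backlog_estimate(required_vps, runtime_hours, lag_hours):
    """Return (values not yet written, effective write rate in values/s)."""
    backlog = required_vps * lag_hours * 3600
    effective_vps = required_vps * (runtime_hours - lag_hours) / runtime_hours
    return backlog, effective_vps


backlog, effective = backlog_estimate(REQUIRED_VPS, RUNTIME_HOURS, LAG_HOURS)
print(f"Unwritten values after {RUNTIME_HOURS} h: {backlog:,}")
print(f"Effective write rate: {effective:.0f} values/s of {REQUIRED_VPS} required")
```

Under those assumptions the database writes are keeping up with only about five sixths of the incoming data, which would explain why the gap keeps growing the longer the server runs.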

    Server Config:
    Code:
    CacheSize=128M
    # Connected via a crossover cable to the Postgres server
    DBHost=192.168.0.2
    DBName=zabbix
    DBPassword=******
    DBPort=5432
    DBUser=zabbix
    HistoryCacheSize=128M
    HistoryTextCacheSize=128M
    HousekeepingFrequency=24
    ListenIP=172.19.0.196
    LogFile=/home/zabbix/zabbix_server.log
    ProxyConfigFrequency=60
    ProxyDataFrequency=5
    SourceIP=172.19.0.196
    StartDBSyncers=2
    StartPollers=10
    StartTrappers=2
    Timeout=5
    TrendCacheSize=4M
    UnreachablePeriod=60
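
As a quick sanity check on the memory settings above, the following sketch adds up the cache sizes from the server config. The parser is a hypothetical helper written for this illustration, not Zabbix's own config reader, and it only understands plain `KEY=VALUE` lines with an optional K/M/G suffix.

```python
import re

# Cache-related settings copied from the server config in the post.
SERVER_CONF = """\
CacheSize=128M
HistoryCacheSize=128M
HistoryTextCacheSize=128M
TrendCacheSize=4M
StartDBSyncers=2
"""

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_conf(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def size_in_bytes(value):
    """Convert a size such as '128M' to bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", value)
    if match is None:
        raise ValueError(f"not a size: {value!r}")
    number, unit = match.groups()
    return int(number) * UNITS.get(unit, 1)


conf = parse_conf(SERVER_CONF)
cache_keys = [key for key in conf if key.endswith("CacheSize")]
total = sum(size_in_bytes(conf[key]) for key in cache_keys)
print(f"Total cache allocation: {total // 1024 ** 2} MB")
```

The four caches together come to 388 MB on a 4 GB server, so the caches themselves do not look undersized; that fits the suspicion that the bottleneck is in writing to the database rather than in buffering.
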
    Proxy Config:
    Code:
    CacheSize=128M
    ConfigFrequency=60
    DataSenderFrequency=1
    DBHost=localhost
    DBName=/home/zabbix/DB/zabbix_proxy.db
    HeartbeatFrequency=60
    Hostname=rmb-ppr-palantir-ap1
    HousekeepingFrequency=12
    ListenIP=172.28.138.133
    LogFile=/tmp/zabbix_proxy.log
    ProxyLocalBuffer=0
    ProxyMode=0
    ProxyOfflineBuffer=96
    Server=palantir
    SourceIP=172.28.138.133
    Timeout=5
    Any help appreciated!

    Thanks
    Shaun
  • walterheck
    Senior Member
    • Jul 2009
    • 153

    #2
    Do you monitor the zabbix queue by any chance? Does that stay low enough?
    Free and Open Source Zabbix Templates Repository | Hosted Zabbix @ Tribily (http://tribily.com)


    • shaunr
      Junior Member
      • Apr 2010
      • 13

      #3
      Hi walterheck

      Code:
      Do you monitor the zabbix queue by any chance? Does that stay low enough?
      The queue is currently sitting with:

      5s - 107
      10s - 1
      30s - 2
      1min - 5
      5min - 2
      More than 10 - 625

      I raised StartDBSyncers to 32. The lag is now down to about 20 minutes. I will bump it up to the maximum of 64 and see whether that improves things.
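
For a sense of scale, the queue figures above can be set against the 6349 items from the first post. This is just a small arithmetic sketch; the bucket labels are copied as written.

```python
# Queue counts copied from the post; bucket labels are kept as written.
QUEUE = {
    "5s": 107,
    "10s": 1,
    "30s": 2,
    "1min": 5,
    "5min": 2,
    "More than 10": 625,
}
TOTAL_ITEMS = 6349  # item count from the first post


def queue_summary(queue, total_items, worst_bucket):
    """Return (delayed items, share delayed, share in the worst bucket)."""
    delayed = sum(queue.values())
    return delayed, delayed / total_items, queue[worst_bucket] / total_items


delayed, share, worst_share = queue_summary(QUEUE, TOTAL_ITEMS, "More than 10")
print(f"{delayed} of {TOTAL_ITEMS} items delayed ({share:.1%})")
print(f"{worst_share:.1%} of all items are in the slowest bucket")
```

Most of the delayed items sit in the slowest bucket, so the queue is dominated by items that are badly late rather than by many slightly late ones.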


      • shaunr
        Junior Member
        • Apr 2010
        • 13

        #4
        I shut down the Zabbix server, which took approximately 17-18 minutes.
        Here is the output of the log:

        Code:
        15428:20100902:071050.921 Got signal [signal:15(SIGTERM),sender_pid:17747,sender_uid:0,reason:0]. Exiting ...
         15369:20100902:071050.922 One child process died (PID:15428,exitcode/signal:255). Exiting ...
         15369:20100902:071052.928 Syncing history data...
         15369:20100902:071104.643 Syncing history data... 0.990378%
         15369:20100902:071114.957 Syncing history data... 1.967447%
         15369:20100902:071125.316 Syncing history data... 2.944515%
         15369:20100902:071135.578 Syncing history data... 3.921584%
         15369:20100902:071145.929 Syncing history data... 4.897870%
        .
        .
        .
        .
        15369:20100902:072751.191 Syncing history data... 96.857409%
         15369:20100902:072801.014 Syncing history data... 97.839175%
         15369:20100902:072811.121 Syncing history data... 98.850692%
         15369:20100902:072821.125 Syncing history data... 99.852813%
         15369:20100902:072822.711 Syncing history data... done.
         15369:20100902:072822.711 Syncing trends data...
         15369:20100902:072823.145 Syncing trends data... done.
         15369:20100902:072823.145 Zabbix Server stopped. Zabbix 1.8.3 (revision 13928).
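
The duration of the history sync can be read straight off the first and last "Syncing history data" lines in the log above. A minimal sketch, assuming the log prefix is always `PID:YYYYMMDD:HHMMSS.mmm` as it appears here:

```python
from datetime import datetime

# First and last history-sync lines from the shutdown log in the post.
LOG_LINES = [
    "15369:20100902:071052.928 Syncing history data...",
    "15369:20100902:072822.711 Syncing history data... done.",
]


def parse_timestamp(line):
    """Extract the timestamp from a 'PID:YYYYMMDD:HHMMSS.mmm message' line."""
    prefix = line.strip().split(" ", 1)[0]
    _pid, date, time = prefix.split(":")
    return datetime.strptime(date + time, "%Y%m%d%H%M%S.%f")


start = parse_timestamp(LOG_LINES[0])
end = parse_timestamp(LOG_LINES[1])
elapsed = end - start
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"History sync took {int(minutes)} min {seconds:.1f} s")
```

That works out to roughly seventeen and a half minutes spent flushing buffered history, which matches the shutdown time reported in this post.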


        • shaunr
          Junior Member
          • Apr 2010
          • 13

          #5
          I changed StartDBSyncers to 64. This has not made much difference. I ran the server for about an hour and the lag still persists. Shutting down the Zabbix server again took a long time.

          Code:
          18042:20100902:084832.246 Got signal [signal:15(SIGTERM),sender_pid:18203,sender_uid:0,reason:0]. Exiting ...
           17954:20100902:084832.248 One child process died (PID:18042,exitcode/signal:255). Exiting ...
           17954:20100902:084834.253 Syncing history data...
           17954:20100902:084848.360 Syncing history data... 1.307671%
           17954:20100902:084858.715 Syncing history data... 2.354130%
           17954:20100902:084908.821 Syncing history data... 3.374712%
           17954:20100902:084918.915 Syncing history data... 4.395293%
           17954:20100902:084929.018 Syncing history data... 5.415875%
           17954:20100902:084939.151 Syncing history data... 6.436456%
           17954:20100902:084949.139 Syncing history data... 7.457038%
          .
          .
          .
           17954:20100902:090359.486 Syncing history data... 97.547208%
           17954:20100902:090409.267 Syncing history data... 98.552424%
           17954:20100902:090419.198 Syncing history data... 99.618293%
           17954:20100902:090422.750 Syncing history data... done.
           17954:20100902:090422.750 Syncing trends data...
           17954:20100902:090423.290 Syncing trends data... done.
           17954:20100902:090423.290 Zabbix Server stopped. Zabbix 1.8.3 (revision 13928).
          I am doing a reindex on the history table. I suspect that it's an issue with slow inserts/updates on the history table.
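
The progress lines in the log above also give a rough flush rate. The sketch below takes two consecutive lines and extrapolates to a full sync, assuming the percentage advances linearly, which the log suggests but does not guarantee.

```python
from datetime import datetime

# Two consecutive progress lines from the second shutdown log.
SAMPLES = [
    "17954:20100902:084848.360 Syncing history data... 1.307671%",
    "17954:20100902:084858.715 Syncing history data... 2.354130%",
]


def parse_progress(line):
    """Return (timestamp, percent complete) for one progress line."""
    prefix, _, rest = line.strip().partition(" ")
    _pid, date, time = prefix.split(":")
    stamp = datetime.strptime(date + time, "%Y%m%d%H%M%S.%f")
    percent = float(rest.rsplit(" ", 1)[1].rstrip("%"))
    return stamp, percent


(t0, p0), (t1, p1) = (parse_progress(line) for line in SAMPLES)
rate = (p1 - p0) / (t1 - t0).total_seconds()   # percent per second
estimated_total = 100.0 / rate                 # seconds for a full flush
print(f"{rate:.3f} %/s, full sync in roughly {estimated_total / 60:.1f} min")
```

At about 0.1 % per second the estimate lands in the same range as the roughly 16 minutes the log actually shows (08:48:34 to 09:04:22), so the flush speed is steady and slow throughout rather than stalling at one point.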


          • shaunr
            Junior Member
            • Apr 2010
            • 13

            #6
            Solved!

            Problem solved. I did a reindex on items and history. It took a while, and now the graph has no lag!

            The queue is down to almost nothing.


            • rkaniyala
              Junior Member
              • Jul 2011
              • 1

              #7
              Reindex?

              Hey Shaunr,

              What did you reindex in items and history?

              I moved the Zabbix database onto a different database server and changed the Zabbix conf file to point to the new DB server. It seems the queue for items delayed by more than 10 minutes keeps getting bigger.

              Your help is appreciated


              • shaunr
                Junior Member
                • Apr 2010
                • 13

                #8
                Hi rkaniyala

                I did a reindex using the phpPgAdmin tool. The commands it executed are:

                History table:
                CREATE INDEX history_1 ON history USING btree (itemid, clock)

                Items table:
                CREATE INDEX items_3 ON items USING btree (status)
                CREATE INDEX items_4 ON items USING btree (templateid)

                HTH
                Shaun
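
For anyone without phpPgAdmin, a comparable result can probably be reached with PostgreSQL's own `REINDEX TABLE` statement. The sketch below only builds the statements for the two tables named in this thread; the helper name is made up for illustration, and note that `REINDEX TABLE` rebuilds every index on a table, which is broader than the three indexes listed above.

```python
# Tables named in the thread.
TABLES = ["history", "items"]


def reindex_statements(tables):
    """Build one PostgreSQL REINDEX TABLE statement per table name."""
    statements = []
    for table in tables:
        # Guard against anything that is not a plain identifier.
        if not table.isidentifier():
            raise ValueError(f"unexpected table name: {table!r}")
        statements.append(f"REINDEX TABLE {table};")
    return statements


for statement in reindex_statements(TABLES):
    print(statement)
```

The output can be pasted into psql. REINDEX blocks writes to the table while it runs, so it is probably safest to stop the Zabbix server first.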
