suckerd dying while trying to delete history

  • mucknet
    Member
    • Dec 2004
    • 59

    #1

    Howdy --

    I have a large Zabbix DB (~50GB). Whenever I try to delete an item's history, I get entries similar to the following in my zabbix_suckerd.log file, and then zabbix_suckerd dies:

    006020:20050104:231610 Query::insert into history (clock,itemid,value) values (1104909319,18614,72.912990)
    006020:20050104:231610 Query failed:Lock wait timeout exceeded; Try restarting transaction [1205]
    006020:20050104:231717 Query::insert into history (clock,itemid,value) values (1104909386,18614,72.802160)
    006020:20050104:231717 Query failed:Lock wait timeout exceeded; Try restarting transaction [1205]
    006020:20050104:231817 Query::insert into history (clock,itemid,value) values (1104909446,18614,72.897020)
    006020:20050104:231817 Query failed:Lock wait timeout exceeded; Try restarting transaction [1205]
    006018:20050104:231823 Query::insert into history (clock,itemid,value) values (1104909503,18588,2.665790)
    006018:20050104:231823 Query failed:Lost connection to MySQL server during query [2013]
    006018:20050104:231823 Query::select num,value_min,value_avg,value_max from trends where itemid=18588 and clock=1104908400
    006020:20050104:231823 Query::select function,parameter,itemid from functions where itemid=18386 group by 1,2,3 order by 1,2,3
    006018:20050104:231823 Query failed:Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111) [2002]
    006020:20050104:231823 Query failed:Lost connection to MySQL server during query [2013]
    006021:20050104:231823 Query::insert into history (clock,itemid,value) values (1104909503,18567,19.882370)
    006021:20050104:231823 Query failed:Lost connection to MySQL server during query [2013]
    006021:20050104:231823 Query::select num,value_min,value_avg,value_max from trends where itemid=18567 and clock=1104908400
    006021:20050104:231823 Query failed:Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111) [2002]
    006009:20050104:231823 One child process died. Exiting ...
    006017:20050104:231823 Got QUIT or INT or TERM or PIPE signal. Exiting...
    006015:20050104:231823 Got QUIT or INT or TERM or PIPE signal. Exiting...
    006019:20050104:231823 Got QUIT or INT or TERM or PIPE signal. Exiting...
    006023:20050104:231823 Got QUIT or INT or TERM or PIPE signal. Exiting...
    006022:20050104:231823 Got QUIT or INT or TERM or PIPE signal. Exiting...
    006016:20050104:231823 Got QUIT or INT or TERM or PIPE signal. Exiting...


    This happens when I try to delete an individual item for a host, or when I try to delete a whole host.

    I have been storing item values every 10 seconds for about the last 6 months.

    Any ideas or suggestions?

    Thanks!
  • Alexei
    Founder, CEO
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Sep 2004
    • 5654

    #2
    Interesting problem! I'm pretty sure you're using a MySQL InnoDB database.

    006020:20050104:231610 Query::insert into history (clock,itemid,value) values (1104909319,18614,72.912990)
    006020:20050104:231610 Query failed:Lock wait timeout exceeded; Try restarting transaction [1205]

    The ZABBIX suckerd process (PID 6020) tries to insert a record into history, but the insert fails with a lock wait timeout (the table is locked, obviously). ZABBIX ignores the error; I'm not sure whether that is OK.

    006020:20050104:231823 Query::select function,parameter,itemid from functions where itemid=18386 group by 1,2,3 order by 1,2,3
    006020:20050104:231823 Query failed:Lost connection to MySQL server during query [2013]

    Then we lose the MySQL connection. It seems that MySQL dropped the connection. Why? I have no idea. A bug on the MySQL side?

    Anyway, when deleting a host, the ZABBIX v1.0 housekeeper tries to delete all the data from the history table at once. This is not the most efficient way. I plan to improve it in 1.1; however, I still have no clear vision of how to do it most efficiently.

    Several workarounds exist:

    1. Convert 50GB database to MyISAM
    2. Purge the data manually from the history
    3. Don't delete hosts. Let them be in Unreachable status
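
Workaround 2 (purging the history manually) could be sketched roughly as follows. This is only an illustration, not the ZABBIX housekeeper's method: it uses Python's built-in sqlite3 as a stand-in for MySQL, the function name `purge_history` and the batch size are hypothetical, and only the `history` columns visible in the log above (clock, itemid, value) are assumed. The idea is to delete old rows in small batches and commit after each one, so that no single statement holds locks long enough to make concurrent inserts time out.

```python
import sqlite3


def purge_history(conn, cutoff_clock, batch_size=1000):
    """Delete history rows older than cutoff_clock in small batches.

    Returns the total number of rows deleted.
    """
    total = 0
    while True:
        # SQLite needs the rowid subselect; on MySQL the equivalent would be
        # "DELETE FROM history WHERE clock < ? LIMIT ?" (assumption: plain
        # single-table delete, no foreign keys involved).
        cur = conn.execute(
            "DELETE FROM history WHERE rowid IN ("
            "SELECT rowid FROM history WHERE clock < ? LIMIT ?)",
            (cutoff_clock, batch_size),
        )
        conn.commit()  # commit each batch so locks are held only briefly
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


# Demo on an in-memory table: 5000 samples at 10-second intervals.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (clock INTEGER, itemid INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [(1104900000 + 10 * i, 18614, 72.9) for i in range(5000)],
)
conn.commit()

cutoff = 1104900000 + 10 * 3000  # keep only the newest 2000 samples
deleted = purge_history(conn, cutoff)
remaining = conn.execute("SELECT COUNT(*) FROM history").fetchone()[0]
print(deleted, remaining)
```

On a real 50GB table the cutoff would be a Unix timestamp (for example, six months ago), and a short pause between batches may help the suckerd inserts get through; both choices are assumptions to be tuned, and a backup beforehand is advisable.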
    Alexei Vladishev
    Creator of Zabbix, Product manager
    New York | Tokyo | Riga

    • festivalman
      Junior Member
      • Mar 2005
      • 10

      #3
      Hi, I'm running into the same problem and would like to go the "purge the data manually" route. I have a DB with many hosts that I had set to keep 2 years of history. It has grown too large and now gives me the lock timeout problem. I've set the history period on all of the hosts to 6 months, but I need to manually erase the history older than that so the cleanup starts working again. Can you list here exactly what I should be deleting, or what query/function would accomplish this? Any help would be appreciated. Thanks.


      • festivalman
        Junior Member
        • Mar 2005
        • 10

        #4
        *bump* Anyone have any idea on this one? Thanks.
