Actions Hanging on "in progress" in 1.8.8rc1

  • jroberson
    Senior Member
    • May 2008
    • 124

    #1

    I was hit by the 1.8.7 "crash", so I upgraded to 1.8.8rc1, but now my actions (emails and Jabber IMs) hang at "in progress". Once I restart zabbix_server, the floodgates open and all the hung notifications burst through. Is anybody else seeing the same thing?

    CentOS 5.6 with MySQL 5.1
  • richlv
    Senior Member
    Zabbix Certified Trainer
    Zabbix Certified Specialist
    Zabbix Certified Professional
    • Oct 2005
    • 3112

    #2
    did you see any errors in the zabbix server log, particularly regarding database access?

    if so, it might be https://support.zabbix.com/browse/ZBX-4141 and you should try 1.8.8rc2

    • jroberson
      Senior Member
      • May 2008
      • 124

      #3
      I noticed a
      Code:
      [Z3005] query failed: [2006] MySQL server has gone away [begin;]
      around 4am on the day this last happened, and then a
      Code:
      [Z3005] query failed: [2006] MySQL server has gone away [update alerts set retries=1,error='stream error' where alertid=712]
      last night, when I had some notification delays.

      I see a batch of this:
      Code:
       11134:20110919:203433.229 Zabbix Host [cnt501ss02]: first network error, wait for 15 seconds
       11135:20110919:203437.197 [Z3005] query failed: [2006] MySQL server has gone away [begin;]
       11135:20110919:203437.217 Zabbix Host [cnt501ws02]: first network error, wait for 15 seconds
       11136:20110919:203553.966 [Z3005] query failed: [2006] MySQL server has gone away [begin;]
       11136:20110919:203553.986 Zabbix Host [cnt501ms02]: first network error, wait for 15 seconds
       11133:20110919:203601.186 [Z3005] query failed: [2006] MySQL server has gone away [begin;]
       11133:20110919:203601.232 Zabbix Host [cntarmdc01]: first network error, wait for 15 seconds
       11133:20110919:203744.030 Zabbix Host [cnt501aa01]: first network error, wait for 15 seconds
       11138:20110919:203746.602 [Z3005] query failed: [2006] MySQL server has gone away [begin;]
       11138:20110919:203746.617 Zabbix Host [cntarmfs02]: first network error, wait for 15 seconds
       11134:20110919:203755.624 Zabbix Host [cnt501eb02]: first network error, wait for 15 seconds
       11135:20110919:203805.878 Zabbix Host [cnt501vs02]: first network error, wait for 15 seconds
       11133:20110919:203815.772 Zabbix Host [cnt501ad01]: first network error, wait for 15 seconds
       11134:20110919:203817.189 Zabbix Host [cnt501ad01]: another network error, wait for 15 seconds
       11139:20110919:203841.838 [Z3005] query failed: [2006] MySQL server has gone away [begin;]
       11139:20110919:203841.856 Zabbix Host [cnt501ts01]: first network error, wait for 15 seconds
       11139:20110919:203914.601 Zabbix Host [cnt501ha01]: first network error, wait for 15 seconds
       11135:20110919:203921.264 Zabbix Host [cnt501ts02]: first network error, wait for 15 seconds
      And then MANY more of those "first network error, wait for 15 seconds" messages.

      Oddly enough, when I got the
      Code:
       11157:20110920:033603.248 [Z3005] query failed: [2006] MySQL server has gone away [update alerts set retries=1,error='stream error' where alertid=712]
      error (from above), I received a set of notifications at the same time (or a little afterwards). I'm not doing any DB maintenance at that time, or anything else on the server that I can think of. Data is still being collected during this period, as there are no gaps in my graphs.
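
For anyone who wants to tally these messages rather than eyeball them, here is a minimal sketch. It assumes the log line layout seen in the excerpts above (PID:YYYYMMDD:HHMMSS.mmm followed by the message). The category names and the `summarize` helper are made up for illustration and are not part of Zabbix.

```python
import re
from collections import Counter
from datetime import datetime

# Layout assumed from the excerpts in this thread: PID:YYYYMMDD:HHMMSS.mmm message
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6})\.(\d{3})\s+(.*)$")


def parse_line(line):
    """Return (pid, timestamp, message), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    pid, date, clock, millis, message = m.groups()
    stamp = datetime.strptime(date + clock, "%Y%m%d%H%M%S")
    return int(pid), stamp.replace(microsecond=int(millis) * 1000), message


def classify(message):
    """Bucket a message into the (hypothetical) categories discussed here."""
    if "[Z3005] query failed" in message and "[2006]" in message:
        return "db_gone_away"
    if "network error" in message:
        return "network_error"
    if message.startswith("JABBER:"):
        return "jabber"
    return "other"


def summarize(lines):
    """Count messages per (hour, category)."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip anything that is not a regular log line
        _pid, stamp, message = parsed
        counts[(stamp.strftime("%Y-%m-%d %H:00"), classify(message))] += 1
    return counts


# A few lines copied from the excerpts above.
SAMPLE = """\
 11134:20110919:203433.229 Zabbix Host [cnt501ss02]: first network error, wait for 15 seconds
 11135:20110919:203437.197 [Z3005] query failed: [2006] MySQL server has gone away [begin;]
 11135:20110919:203437.217 Zabbix Host [cnt501ws02]: first network error, wait for 15 seconds
 11157:20110920:033603.248 [Z3005] query failed: [2006] MySQL server has gone away [update alerts set retries=1,error='stream error' where alertid=712]
""".splitlines()

if __name__ == "__main__":
    for (hour, kind), n in sorted(summarize(SAMPLE).items()):
        print(hour, kind, n)
```

Feeding the whole server log through `summarize` (for example via `open(path)`) would show whether the "gone away" errors cluster around particular hours or are spread through the day.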

      EDIT: I'm pulling down 1.8.8rc2 now to see if it helps.
      Last edited by jroberson; 20-09-2011, 16:25.

      • jroberson
        Senior Member
        • May 2008
        • 124

        #4
        I thought it was fixed with the release of 1.8.8, but the problem persists. Is anybody else seeing the same issue? The event list shows "in progress", but when I restart zabbix_server it pushes out all the messages that had backed up.

        I do restart my Jabber server at night, and I get a bunch of messages like "13552:20110927:033634.615 JABBER: [[email protected]] connection failed: [111] Connection refused", but they stop once it comes back up. I also see a lot of messages like "13529:20110927:191100.996 [Z3005] query failed: [2006] MySQL server has gone away [begin;]". When the "event-jam" started this time, I saw the "MySQL server has gone away [update alerts set ... " error below, which is different from the "...[begin;]" one above. (Note: the timestamp is when I restart my Jabber server [OpenFire on Win2003 {cnt501eb02}].)

        Code:
        26582:20111003:033603.362 Zabbix Host [cnt501eb02]: another network error, wait for 15 seconds
         26603:20111003:033603.478 JABBER: [[email protected]] stream error
         26603:20111003:033603.479 [Z3005] query failed: [2006] MySQL server has gone away [update alerts set retries=1,error='stream error' where alertid=1322]
         26603:20111003:033603.511 JABBER: [[email protected]] server disconnected
         26603:20111003:033603.529 JABBER: [[email protected]] received error [7]: [0] Success
         26603:20111003:033603.532 JABBER: [[email protected]] connection failed: [111] Connection refused
        I do run a MySQL backup on the 1st of every month (log files in between), BUT if you look at the date of that message, it's not the 1st. Plus, I see many of the "MySQL server has gone away [begin;]" messages throughout the day. I don't think my MySQL server is the cause of this "event-jam"; it may just be another underlying, unrelated issue. The "MySQL server has gone away [update alerts set retries=1,error='stream error' where alertid=1322]" error happens when I restart my Jabber server, and I have found another instance of it further up in the log, but that one didn't cause an "event-jam" like this one. VERY STRANGE!

        So, I guess the real question is, "Is JABBER screwing up my server when it restarts?" If so, I'll get rid of that media type, or, I guess, just be sure to restart zabbix_server afterwards. It's more convenient having IM there during the day, but I needs me emails and SMS for nights & weekends! If anyone wants the logs or wants me to run some other debug stuff I will.
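
One rough way to check the Jabber suspicion against a full log is to look for failed "update alerts" queries that the same process logged right after a JABBER message. The sketch below assumes the log layout shown above. The function name and the 5-second window are arbitrary choices for illustration, and a match only shows that the two messages sit close together, not that one caused the other.

```python
import re
from datetime import datetime, timedelta

# Layout assumed from the excerpts in this thread: PID:YYYYMMDD:HHMMSS.mmm message
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6})\.(\d{3})\s+(.*)$")


def parse(lines):
    """Turn raw log lines into (pid, timestamp, message) tuples."""
    events = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        pid, date, clock, millis, msg = m.groups()
        stamp = datetime.strptime(date + clock, "%Y%m%d%H%M%S")
        stamp += timedelta(milliseconds=int(millis))
        events.append((int(pid), stamp, msg))
    return events


def jabber_then_db_failure(lines, window_seconds=5):
    """Find failed 'update alerts' queries that the same PID logged
    within window_seconds after a JABBER message."""
    events = parse(lines)
    window = timedelta(seconds=window_seconds)
    hits = []
    for pid, stamp, msg in events:
        if "query failed" not in msg or "update alerts" not in msg:
            continue
        for other_pid, other_stamp, other_msg in events:
            if (other_pid == pid and other_msg.startswith("JABBER:")
                    and timedelta(0) <= stamp - other_stamp <= window):
                hits.append((pid, stamp, other_msg))
                break  # one preceding JABBER line is enough
    return hits


# Lines copied from the excerpt above.
SAMPLE = """\
 26582:20111003:033603.362 Zabbix Host [cnt501eb02]: another network error, wait for 15 seconds
 26603:20111003:033603.478 JABBER: [[email protected]] stream error
 26603:20111003:033603.479 [Z3005] query failed: [2006] MySQL server has gone away [update alerts set retries=1,error='stream error' where alertid=1322]
 26603:20111003:033603.511 JABBER: [[email protected]] server disconnected
""".splitlines()

if __name__ == "__main__":
    for pid, stamp, jabber_msg in jabber_then_db_failure(SAMPLE):
        print(pid, stamp.isoformat(), "after:", jabber_msg)
```

Running it over the days with and without an "event-jam" would show whether the Jabber/DB pairing appears every time notifications hang or only some of the time.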
        Last edited by jroberson; 03-10-2011, 19:36.

        • richlv
          Senior Member
          Zabbix Certified Trainer
          Zabbix Certified Specialist
          Zabbix Certified Professional
          • Oct 2005
          • 3112

          #5
          a very simple testcase would be to disable jabber for a while and see whether the problem appears or not. if it does, it's most likely something else, if it does not, we have some idea in which direction to look.

          • jroberson
            Senior Member
            • May 2008
            • 124

            #6
            Originally posted by richlv
            a very simple testcase would be to disable jabber for a while and see whether the problem appears or not. if it does, it's most likely something else, if it does not, we have some idea in which direction to look.
            Awww ... Ok, I've deleted that media type. I'll let it sit for a week and see if it has any problems. It usually happens within a week or less, so I should know by next week. Thanks for your help.

            • jroberson
              Senior Member
              • May 2008
              • 124

              #7
              It seems that without Jabber, Zabbix doesn't hang. I would like to have the Jabber messages, but I need the emails (and SMS) more! Any suggestions, or do you think this might qualify as a "BUG"?

              • jroberson
                Senior Member
                • May 2008
                • 124

                #8
                Nobody else has seen this? I don't think I've got a unique setup, but then maybe most people don't use Jabber. I'll go ahead and check whether there is already a BUG report on this before I put one in.

                -- https://support.zabbix.com/browse/ZBX-4350
                Last edited by jroberson; 11-11-2011, 22:04.
