[1.3.6 r4081] Strange message on the server when an agent is restarted

  • Farzad FARID
    Member
    • Apr 2007
    • 79

    #1

    [1.3.6 r4081] Strange message on the server when an agent is restarted

    Hi,

    I'm running Zabbix 1.3.6 r4081. The server is on a 64-bit Linux FC5 machine, and one of the agents is on a 32-bit FC5 machine.
    After upgrading the agent, while the server was running, I got these messages in the server log:

    Code:
    21545:20070502:173215 Timeout while answering request
    21545:20070502:173215 Get value from agent failed. Error: ZBX_TCP_READ() failed [Interrupted system call]
    21545:20070502:173215 Host [srv09bc1]: first network error, wait for 15 seconds
    21577:20070502:173233 Get value from agent failed. Error: Cannot resolve []
    21577:20070502:173233 Host [^_^K<AE><AA>*]: another network error, wait for 15 seconds
    21553:20070502:173613 Get value from agent failed. Error: Cannot connect to [srv09bc1.mydomain.com:10050] [Connection refused]
    21553:20070502:173613 Host [srv09bc1]: first network error, wait for 15 seconds
    What worries me are the lines:
    • 21577:20070502:173233 Get value from agent failed. Error: Cannot resolve []
    • 21577:20070502:173233 Host [^_^K<AE><AA>*]: another network error, wait for 15 seconds



    This looks like memory corruption or a bad pointer. As I said above, the Zabbix server is running on 64-bit Linux.
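    To spot such lines in a long server log, one option is to flag any bracketed field that contains non-printable bytes. The helper below is a minimal sketch under that assumption; `bracket_field_is_suspicious` is a hypothetical name and is not part of Zabbix. It only inspects the first `[...]` field of a line, so it would not flag the empty `Cannot resolve []` case, and a corrupted field that happens to contain `]` or a NUL byte would be cut short.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Zabbix.
 * Returns 1 if the first [...] field of a log line contains a byte that
 * is not printable ASCII, 0 if the field is clean, empty, or missing. */
int bracket_field_is_suspicious(const char *line)
{
    assert(line != NULL);

    const char *open = strchr(line, '[');
    if (open == NULL) {
        return 0;               /* no bracketed field at all */
    }

    const char *close = strchr(open + 1, ']');
    if (close == NULL) {
        return 0;               /* unterminated field, nothing to check */
    }

    for (const char *p = open + 1; p < close; p++) {
        unsigned char c = (unsigned char)*p;
        /* Anything outside 7-bit printable ASCII is treated as garbage. */
        if (c > 127 || !isprint(c)) {
            return 1;
        }
    }
    return 0;
}
```

    Called on the `Host [srv09bc1]` line it should return 0, and on a line whose host field holds raw control bytes it should return 1.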

    Regards.
  • Farzad FARID
    Member
    • Apr 2007
    • 79

    #2
    Another corrupted message on 64-bit Linux

    Hi,

    Last night my server, running Zabbix Server 1.3.6 r4081 on a 64-bit Fedora Core 5, again showed a strange log message that seems to result from corrupted memory or a stray pointer. This time it happened in a different string:

    Code:
     21545:20070502:200315 Timeout while answering request
     21545:20070502:200315 Get value from agent failed. Error: ZBX_TCP_READ() failed [Interrupted system call]
     21545:20070502:200315 Host [srv09bc1]: first network error, wait for 15 seconds
     21577:20070502:200332 Delay period format is wrong [9^Wc]
     21574:20070502:200938 Executing housekeeper
     21574:20070502:200939 Deleted 1683 records from history and trends
    Regards.


    • Farzad FARID
      Member
      • Apr 2007
      • 79

      #3
      Hi,

      Another corrupted message, still on the server, after I restarted an agent that didn't work anymore:

      Code:
       21577:20070503:101709 Get value from agent failed. Error: ZBX_TCP_READ() failed [Interrupted system call]
       21577:20070503:101709 Host [srv05bc2] will be checked after 60 seconds
       21567:20070503:101729 Query::select i.key_,i.delay,i.lastlogsize from items i,hosts h where i.hostid=h.hostid and h.status=0 and i.status=0 and i.type=7 and h.host='srv05bc2' and h.hostid>=100000000000000*0 and h.hostid<=(100000000000000*0+99999999999999)
       21567:20070503:101729 Query failed:MySQL server has gone away [2006]
       21577:20070503:101809 Delay period format is wrong [^Wc]
       21577:20070503:101809 Enabling host [^Vc]
      By the way, note the worrying message "Query failed:MySQL server has gone away", even though the MySQL server is actually working correctly. I don't know why restarting a non-functional agent triggered this bug.

      The agent bug is one I first hit with Zabbix 1.3.5, and it is still not fixed in Zabbix 1.3.6 r4081: http://www.zabbix.com/forum/showthread.php?t=5829

      After this the freshly restarted agent and the server seem to work correctly again.

      Regards.


      • Alexei
        Founder, CEO
        Zabbix Certified Trainer
        Zabbix Certified Specialist
        Zabbix Certified Professional
        • Sep 2004
        • 5654

        #4
        Thanks for reporting this. It looks like some data corruption inside the ZABBIX server. To be fixed.
        Alexei Vladishev
        Creator of Zabbix, Product manager
        New York | Tokyo | Riga


        • Alexei
          Founder, CEO
          Zabbix Certified Trainer
          Zabbix Certified Specialist
          Zabbix Certified Professional
          • Sep 2004
          • 5654

          #5
          Indeed, these messages were caused by the use of a freed structure. Fixed!
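          For readers unfamiliar with this class of bug: if a structure is freed and a log message is later built from a pointer into it, the message prints whatever now occupies that memory, which matches the garbage host names and delay strings reported in this thread. The actual Zabbix fix is not shown here. The sketch below only illustrates one common way to avoid the problem, copying the needed field into caller-owned storage before freeing. All names (`demo_item_t`, `demo_save_host`, and so on) are hypothetical and are not taken from the Zabbix source.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure for illustration; not Zabbix's real item type. */
typedef struct {
    char *host;   /* heap-allocated host name */
    int   delay;  /* polling delay in seconds */
} demo_item_t;

/* Allocates an item with its own copy of the host name. */
demo_item_t *demo_item_new(const char *host, int delay)
{
    demo_item_t *item = malloc(sizeof(*item));
    if (item == NULL) {
        return NULL;
    }
    item->host = malloc(strlen(host) + 1);
    if (item->host == NULL) {
        free(item);
        return NULL;
    }
    strcpy(item->host, host);
    item->delay = delay;
    return item;
}

/* Releases the item; item->host must not be read after this call. */
void demo_item_free(demo_item_t *item)
{
    if (item == NULL) {
        return;
    }
    free(item->host);
    free(item);
}

/* Copies the host name into caller-owned storage so it stays valid
 * after the item is freed. Truncates if dst is too small. */
void demo_save_host(const demo_item_t *item, char *dst, size_t size)
{
    assert(item != NULL && dst != NULL && size > 0);
    snprintf(dst, size, "%s", item->host);
}

/* Builds the log line from the saved copy, never from the freed item. */
int demo_format_error(char *out, size_t size, const char *host, int wait)
{
    return snprintf(out, size,
                    "Host [%s]: first network error, wait for %d seconds",
                    host, wait);
}
```

          The point of the design is ordering: save the host name with `demo_save_host`, call `demo_item_free`, and only then format the message from the saved buffer. Reading `item->host` after the free would be undefined behavior and could print arbitrary bytes.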
