Win32Agent crash

  • qix
    Senior Member
    Zabbix Certified Specialist, Zabbix Certified Professional
    • Oct 2006
    • 423

    #1

    Win32Agent crash

    Hello all,

    I'm having some trouble with the win32 agent from the 1.1.3 build.

    Today the agent was using a lot of CPU (between 50% and 80%); it had been doing this since Saturday morning.

    A look in the log file showed the following:

    Code:
    [20-Nov-2006 15:33:34] Unable to create temporary file: The process cannot access the file because it is being used
    So we tried to restart the service. It came up, but less than a minute later it crashed. We tried this a couple of times without any luck.

    When looking in the temp directory on this machine, I saw that there were 62000+ zbxXXXX.tmp files in it. That is probably about the limit of files one can put in a directory on an NTFS filesystem.

    Removing the files did indeed stop the agent from crashing.
    However, the agent is generating a bizarre number of tmp files again. I'm at 509 files already at this moment.
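
    For anyone who wants to check their own temp directory, here is a minimal Python sketch (not part of the agent) that counts leftover files and optionally removes them. The `zbx*.tmp` pattern is inferred from the file names above, and the function name `clean_stale_tmp` is made up for illustration.

```python
from pathlib import Path


def clean_stale_tmp(directory, pattern="zbx*.tmp", dry_run=True):
    """Return (matched, removed, locked) counts for leftover temp files.

    The zbx*.tmp pattern is an assumption based on the names seen in
    this thread, not a documented agent naming scheme.
    """
    matched = removed = locked = 0
    for path in Path(directory).glob(pattern):
        if not path.is_file():
            continue
        matched += 1
        if dry_run:
            # Only count; leave the files in place.
            continue
        try:
            path.unlink()
            removed += 1
        except OSError:
            # On Windows, a file still held open by another process
            # typically cannot be deleted; count it as locked.
            locked += 1
    return matched, removed, locked
```

    Calling `clean_stale_tmp(some_dir)` only counts; pass `dry_run=False` to actually delete. Which temp directory the agent uses depends on the account the service runs under, so the path has to be supplied by hand.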

    To me it seems that these files are used to store values that were returned by a user parameter script.

    Does anybody have an idea why the agent doesn't remove these files automatically?

    Thanks in advance,
    With kind regards,

    Raymond
  • dantheman
    Senior Member
    • May 2006
    • 209

    #2
    Well, 1.1.3 is a beta... Try the 1.1.4 agent release; maybe it is fixed there. You might also check whether that agent is still sending updates to the Zabbix server. I believe Alexei was adding functionality for clients to queue up their data in the event that they are unable to contact the Zabbix server.


    • Alexei
      Founder, CEO
      Zabbix Certified Trainer
      Zabbix Certified Specialist, Zabbix Certified Professional
      • Sep 2004
      • 5654

      #3
      1.1.3 is not a beta, 1.3 is!
      Alexei Vladishev
      Creator of Zabbix, Product manager
      New York | Tokyo | Riga

      • qix
        Senior Member
        Zabbix Certified Specialist, Zabbix Certified Professional
        • Oct 2006
        • 423

        #4
        This morning I removed about 5000 files from the temp directory again.

        I have upgraded the agent version to 1.1.4 but the problem remains.
        I have been watching the directory and pressing the refresh button frequently.

        It seems that not all the files stay in place and that some of the files do in fact get deleted. It is a mystery to me why some of the files stay in the directory and don't get deleted.

        The script which is used by the userparameter ends normally and doesn't stay in memory (i.e. it doesn't hang).

        Any ideas?
        With kind regards,

        Raymond


        • dantheman
          Senior Member
          • May 2006
          • 209

          #5
          Originally posted by Alexei
          1.1.3 is not a beta, 1.3 is!

          Whoops, my mistake.


          • qix
            Senior Member
            Zabbix Certified Specialist, Zabbix Certified Professional
            • Oct 2006
            • 423

            #6
            Today I checked this again and there were 15,000 files in the directory, which I removed.

            Also, something really weird is happening: I have zbxXXX.tmp files which are locked by a process (probably the agent) and have a modification date in the future.
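
            A quick way to list such files is sketched below in Python. The helper name `future_dated`, the `zbx*.tmp` pattern, and the 60-second slack for clock jitter are all assumptions chosen for illustration.

```python
import time
from pathlib import Path


def future_dated(directory, pattern="zbx*.tmp", now=None, slack_seconds=60):
    """Return sorted names of files whose modification time is in the future.

    slack_seconds is a hypothetical tolerance so that small clock
    differences are not reported.
    """
    now = time.time() if now is None else now
    return sorted(
        path.name
        for path in Path(directory).glob(pattern)
        if path.stat().st_mtime > now + slack_seconds
    )
```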


            Alexei, do you have an idea what might cause this?
            Last edited by qix; 23-11-2006, 11:56.
            With kind regards,

            Raymond


            • netod
              Member
              • Nov 2006
              • 36

              #7
              Originally posted by qix
              Today I checked this again and there were 15,000 files in the directory, which I removed.

              Also, something really weird is happening: I have zbxXXX.tmp files which are locked by a process (probably the agent) and have a modification date in the future.


              Alexei, do you have an idea what might cause this?
              Do you think there might be a stale process causing this? How many threads are running under the zabbix process? I guess the question is: do you see the same behaviour after a reboot?


              • qix
                Senior Member
                Zabbix Certified Specialist, Zabbix Certified Professional
                • Oct 2006
                • 423

                #8
                I'm not sure what the problem is, but it does seem process-related.
                I grabbed the Sysinternals Process Monitor and saw the following:



                I see a lot of BUFFER OVERFLOWS and SHARING VIOLATIONS.

                Furthermore, I checked the differences between a file that got deleted and a file that did not get deleted.
                The file that got deleted had the following succeeded operation:



                The file that did not get deleted did not have an operation like that, not even in a failed state. It did, however, get a lot of requests from explorer.exe and the AV scanner. I added these files to the AV scanner's exclusion list, but it doesn't seem to have much effect. If needed, I can supply .PML files for further insight (although they can be rather large).

                I have not tried a reboot of the system yet, as this is a live production environment.
                With kind regards,

                Raymond


                • qix
                  Senior Member
                  Zabbix Certified Specialist, Zabbix Certified Professional
                  • Oct 2006
                  • 423

                  #9
                  It seems that setting the agent parameter MaxCollectorProcessingTime=10000 has significantly reduced the 'production' of .tmp files that are not removed.

                  There are, however, still files that don't get deleted; 99% of these files contain no data whatsoever.
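
                  To make the general pattern concrete: when a command's output is captured through a temp file, the file only goes away reliably if cleanup runs on every exit path, including a timeout. The Python sketch below illustrates that idea only. It is not the agent's code (the cause of the leftover files is not confirmed in this thread), and `run_with_tempfile`, its parameters, and the `zbx`/`.tmp` naming are hypothetical.

```python
import os
import subprocess
import tempfile


def run_with_tempfile(command, timeout=10.0, tmp_dir=None):
    """Run command, capture stdout via a temp file, and always delete it.

    Returns the stripped output, or None if the command timed out.
    """
    fd, path = tempfile.mkstemp(prefix="zbx", suffix=".tmp", dir=tmp_dir)
    try:
        with os.fdopen(fd, "w+b") as out:
            try:
                subprocess.run(
                    command,
                    stdout=out,
                    stderr=subprocess.DEVNULL,
                    timeout=timeout,
                    check=False,
                )
            except subprocess.TimeoutExpired:
                # The child was killed; the file may be empty.
                return None
            out.seek(0)
            return out.read().decode(errors="replace").strip()
    finally:
        # Runs after the file object is closed, on success and on timeout.
        os.remove(path)
```

                  If the removal lived only on the success path, a timed-out command would leave an empty file behind, which would look much like the empty leftovers described above.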

                  If anybody has any more ideas how to solve this, please let me know!
                  With kind regards,

                  Raymond


                  • qix
                    Senior Member
                    Zabbix Certified Specialist, Zabbix Certified Professional
                    • Oct 2006
                    • 423

                    #10
                    The server which is running this agent is now very often unreachable (not available) according to Zabbix:

                    Got empty string from [DB000443] IP [xxx.xxx.xxx.xxxx] Parameter [xwall[InboundExchByteCount]]

                    When I run the scripts by hand, I have no problems. They return data (float and integer) normally.
                    Restarting the agent brings everything back to normal, and data collection works fine (for a while).

                    This machine has 24 UserParameters, which get polled every 5 minutes.
                    About 23 of these have a processing time of more than 3 seconds.

                    Could this be the problem?
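
                    For reference, this is roughly how the scripts can be timed by hand. It is a minimal Python sketch: `time_command` is a made-up helper, and the 3-second default is just the figure mentioned above, not a confirmed agent setting.

```python
import subprocess
import time


def time_command(command, limit_seconds=3.0):
    """Run command once and report its output and wall-clock duration."""
    start = time.monotonic()
    result = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.monotonic() - start
    output = result.stdout.strip()
    return {
        "output": output,
        "seconds": elapsed,
        # True when the run took longer than the chosen limit.
        "over_limit": elapsed > limit_seconds,
        # True when nothing was printed, as in the "empty string" error.
        "empty": output == "",
    }
```

                    Running each UserParameter command through this a few times shows which ones exceed the limit and whether any of them ever return nothing.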
                    With kind regards,

                    Raymond


                    • Alexei
                      Founder, CEO
                      Zabbix Certified Trainer
                      Zabbix Certified Specialist, Zabbix Certified Professional
                      • Sep 2004
                      • 5654

                      #11
                      Thanks for reporting the issue related to the temporary files. Fixed.
                      Alexei Vladishev
                      Creator of Zabbix, Product manager
                      New York | Tokyo | Riga

                      • qix
                        Senior Member
                        Zabbix Certified Specialist, Zabbix Certified Professional
                        • Oct 2006
                        • 423

                        #12
                        Thanks Alexei!

                        In which version is this bug fixed?

                        I forgot to mention that a reboot of the server fixed the problem for a week or so. I rebooted it again and it has now been running stably for a week or so.
                        Last edited by qix; 04-01-2007, 11:11.
                        With kind regards,

                        Raymond


                        • Alexei
                          Founder, CEO
                          Zabbix Certified Trainer
                          Zabbix Certified Specialist, Zabbix Certified Professional
                          • Sep 2004
                          • 5654

                          #13
                          It is fixed in 1.1.5, which is not released yet.
                          Alexei Vladishev
                          Creator of Zabbix, Product manager
                          New York | Tokyo | Riga

                          • qix
                            Senior Member
                            Zabbix Certified Specialist, Zabbix Certified Professional
                            • Oct 2006
                            • 423

                            #14
                            Thanks again Alexei.
                            Any idea when 1.1.5 will be released? (A rough estimate is fine.)
                            With kind regards,

                            Raymond


                            • Alexei
                              Founder, CEO
                              Zabbix Certified Trainer
                              Zabbix Certified Specialist, Zabbix Certified Professional
                              • Sep 2004
                              • 5654

                              #15
                              The rough estimate is January. I'd like to have it released in 1-2 weeks' time.
                              Alexei Vladishev
                              Creator of Zabbix, Product manager
                              New York | Tokyo | Riga