Frequent "no data on C: free disk space" trigger for select Windows hosts

  • snmpguru8
    Junior Member
    • Apr 2020
    • 19

    #1



    I have a couple of Windows VMs that, every few minutes, raise a "Drive: C: - No Data from item for 20 min" alert, which lasts several minutes and eventually clears. We are monitoring about 150 other Windows VMs that don't appear to exhibit this behavior.

    This item is polled every 1m, and running zabbix_get from the server (no proxy involved) returns the value almost instantly. There is nothing useful in the Zabbix server logs, either.
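To put a number on "almost instantly", the zabbix_get call could be wrapped in a small timing helper. This is only a sketch: `timed_run` is a hypothetical helper, and the host name and item key in the usage comment are placeholders that do not come from the thread. Note also that zabbix_get exercises the passive check path, so a fast answer here does not by itself show that active checks are delivered on time.

```python
import subprocess
import time


def timed_run(cmd, timeout=30.0):
    """Run a command and return (stripped stdout, elapsed seconds)."""
    start = time.monotonic()
    result = subprocess.run(
        cmd, capture_output=True, text=True, timeout=timeout, check=True
    )
    elapsed = time.monotonic() - start
    return result.stdout.strip(), elapsed


# Hypothetical usage (host and key are placeholders, not from the thread):
# value, seconds = timed_run(
#     ["zabbix_get", "-s", "winvm01.example.com", "-k", "vfs.fs.size[C:,pfree]"]
# )
# print(f"{value} returned in {seconds:.2f}s")
```

Running this repeatedly while the alert is active would show whether the agent itself ever answers slowly.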

    I've tried reinstalling, upgrading and downgrading the agents. This occurred when the environment was running Zabbix 4, and it still occurs on the current version 5.




    Has anyone encountered this before? I have rebooted the VMs multiple times over the last year. It seems like some days are worse than others in terms of how often the trigger pops.


  • snmpguru8
    Junior Member
    • Apr 2020
    • 19

    #2
    Another thing I wanted to mention about these two hosts is that they have a large number of items created via discovery (PowerShell query). One problem server has 700 items, and the other has 800.

    Neither server is taxed on memory or CPU. Both have an agent timeout setting of 30s, but running zabbix_get returns the data almost immediately anyway.


    • Markku
      Senior Member
      Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
      • Sep 2018
      • 1781

      #3
      Some ideas (not in any particular order):
      • Change to active items (you talked about polling so I take it you mean passive agent items)
      • Show us your specific item and trigger configurations
      • Check the actual timestamps and data values in Latest data (maybe you get some hint why at specific times the item values were not received)
      • Disable all the other items on the servers while troubleshooting this problem
      • Check agent logs
      • In general, consider if 1 minute interval is really useful for free disk space measurement (usually like 5 or 15 minutes is enough, tune the triggers accordingly)
      • Check that your hosts are properly configured (no duplicate IPs or anything that would lead Zabbix server to query incorrect servers occasionally)
      Markku


      • snmpguru8
        Junior Member
        • Apr 2020
        • 19

        #4
        I have updated the free disk space measurement to every 5m. The items were already set to active.

        I have noticed that even when it is working, the interval between the timestamps reported in "Latest data" exceeds 5m. For instance:
        2021-06-29 14:55:36 46.3698
        2021-06-29 14:47:18 46.3756
        Despite being configured for 5m, it took over 8m for the data to report in.

        What are some reasons this may be happening?
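The delay between the two samples quoted above can be computed directly. A minimal sketch, using only the two timestamps and the 5m interval from this post; `gaps` is an illustrative helper name:

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# The two "Latest data" timestamps quoted in the post.
samples = ["2021-06-29 14:47:18", "2021-06-29 14:55:36"]
configured = timedelta(minutes=5)  # configured update interval


def gaps(timestamps):
    """Return the time differences between consecutive samples."""
    parsed = sorted(datetime.strptime(t, FMT) for t in timestamps)
    return [later - earlier for earlier, later in zip(parsed, parsed[1:])]


for gap in gaps(samples):
    print(gap, "late by", gap - configured)  # 0:08:18 late by 0:03:18
```

The value arrived 8m18s after the previous one, which is 3m18s later than the configured interval.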


        • Markku
          Senior Member
          Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
          • Sep 2018
          • 1781

          #5
          With active checks the agent sends the values to the server; with passive checks the server polls the agent for the values. If the values are not arriving as planned, something is preventing the data from being generated or transferred.

          Did your results change when you disabled all other items on the hosts?

          How is your Zabbix server performing according to your Zabbix metrics?

          Markku


          • snmpguru8
            Junior Member
            • Apr 2020
            • 19

            #6
            Yes, the remaining items arrive in a more timely manner when the other items are disabled. Most of the items (700) come from a discovery template, which is driven by a PowerShell script.
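One possible reading of this result is that the checks on the host are collected one after another, so a large item count stretches every collection cycle. The sketch below illustrates that arithmetic only. It assumes sequential collection, which the thread does not confirm, and the 0.5s average cost per item is a made-up figure; only the item count (700) and the 5m interval come from the posts.

```python
def cycle_seconds(item_count, avg_seconds_per_item):
    """Time for one pass over all items if they are collected sequentially."""
    return item_count * avg_seconds_per_item


def is_delayed(item_count, avg_seconds_per_item, interval_seconds):
    """True when a full pass takes longer than the configured interval."""
    return cycle_seconds(item_count, avg_seconds_per_item) > interval_seconds


# 700 items is from the thread; 0.5s per item is a hypothetical average.
print(cycle_seconds(700, 0.5))    # 350.0
print(is_delayed(700, 0.5, 300))  # True
```

Under these assumed numbers a full pass would take 350s, longer than the 300s interval, so values would arrive late even though each individual check is fast.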

            From what I can tell, the Zabbix server is performing OK. It was recently built. Are there specific metrics I could check and report on?

            Thank you


            • Markku
              Senior Member
              Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
              • Sep 2018
              • 1781

              #7
              Checking the Zabbix queues could be useful; there are also graphs in your Zabbix server template. What kind of items are those 700 discovered items, and how often are they polled? What limitations are there on connectivity between the server and the agent?

              Markku



              • snmpguru8
                snmpguru8 commented
                The queues, indeed. The Zabbix queues are large and often have hundreds or thousands of items listed. The Zabbix server doesn't appear to be under water in terms of resources. What can I look at to help reduce that queue? Should I build a proxy and point the clients to it?
            • Markku
              Senior Member
              Zabbix Certified Specialist, Zabbix Certified Professional, Zabbix Certified Expert
              • Sep 2018
              • 1781

              #8
              I have never had queue problems, so perhaps I cannot give you very specific advice. But you already know what the massive number of items on one agent causes in your case. You haven't yet told us what kind of items they are, or what the connectivity is like.

              Markku


              • snmpguru8
                Junior Member
                • Apr 2020
                • 19

                #9
                Thanks, Markku. The queue typically contains a mix of items, which can be PowerShell-related scripts or simple perf_counters. Almost all of our items are active.
