Many Servers are unreachable for more than 5 minutes

  • moneynut
    Member
    • Mar 2014
    • 37

    #16
    Originally posted by tchjts1
    The next thing to look at then is how your Zabbix internal processes are allocated and being used.

    Look at this post, the final paragraph and the graphs that follow. Then take a look at what your setup is doing. This will tell a lot about whether you need to tune other settings besides the Timeout= value. It is possible you simply do not have enough pollers/trappers/cache configured to handle the workload.

    https://www.zabbix.com/forum/showthread.php?t=41219
    Okay, I'll show you my graphs soon.

    Btw, do we really need to open port 10051 from the Zabbix agent host to the Zabbix server?
    I was just curious whether both of these rules need to be active:
    Zabbix server to Agent on port 10050
    Agent to Zabbix server on port 10051

    Comment

    • steveboyson
      Senior Member
      • Jul 2013
      • 582

      #17
      Originally posted by moneynut
      Okay, I'll show you my graphs soon.

      Btw, do we really need to open port 10051 from the Zabbix agent host to the Zabbix server?
      I was just curious whether both of these rules need to be active:
      Zabbix server to Agent on port 10050
      Agent to Zabbix server on port 10051
      Zabbix server -> Zabbix agent: for passive checks
      Zabbix agent -> Zabbix server: for active checks and trapper items

      If you have both types of checks, then you need both rules.

      Also, when using active proxies and child nodes, incoming connections to the Zabbix server are required.
      All of that is described in the manual and in the agent's and server's config files, by the way.
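
One way to confirm that both firewall rules are actually in place is to attempt a plain TCP connection in each direction. The following is a minimal Python sketch, not an official Zabbix tool. It assumes the default ports 10050 (agent) and 10051 (server); the function name and the host names in the comments are placeholders. A successful connect only shows that the port is reachable, not that the Zabbix daemon on the other side accepts the peer.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (placeholder host names, default ports assumed):
#   On the Zabbix server, for passive checks:
#       port_open("agent-host.example", 10050)
#   On the agent host, for active checks and trapper items:
#       port_open("zabbix-server.example", 10051)
```

Run each check from the machine that initiates the connection, since a firewall rule in one direction says nothing about the other.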

      Comment

      • moneynut
        Member
        • Mar 2014
        • 37

        #18
        Originally posted by steveboyson
        Zabbix server -> Zabbix agent: for passive checks
        Zabbix agent -> Zabbix server: for active checks and trapper items

        If you have both types of checks, then you need both rules.

        Also, when using active proxies and child nodes, incoming connections to the Zabbix server are required.
        All of that is described in the manual and in the agent's and server's config files, by the way.
        How do I check what kind of checks I have right now? I want to know whether they are passive or active.

        Comment

        • steveboyson
          Senior Member
          • Jul 2013
          • 582

          #19
          Originally posted by moneynut
          How do I check what kind of checks I have right now? I want to know whether they are passive or active.
          You should really spend some time on the documentation.
          Check the item type in the host's or template's item configuration. It shows "Zabbix agent" (passive), "Zabbix agent (active)", "SNMP v1/2/3", "Trapper", or one of many other types.
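
For an overview across many items, the same information can be summarized from item records, for example ones fetched through the Zabbix API `item.get` method with the `type` field included. The sketch below is only an illustration: the function name and the sample input are hypothetical, and the type code table is a subset of the codes documented for the API item object, so verify it against the documentation for your version.

```python
from collections import Counter

# Numeric item type codes as documented for the Zabbix API item object.
# Subset only; treat this table as an assumption and verify it for your version.
ITEM_TYPES = {
    0: "Zabbix agent",            # passive check: server connects to the agent
    1: "SNMPv1 agent",
    2: "Zabbix trapper",
    3: "Simple check",
    4: "SNMPv2 agent",
    5: "Zabbix internal",
    6: "SNMPv3 agent",
    7: "Zabbix agent (active)",   # active check: agent connects to the server
}


def summarize_item_types(items):
    """Count items per type name; each item is a dict with a 'type' field."""
    counts = Counter()
    for item in items:
        code = int(item["type"])  # the API returns numbers as strings
        counts[ITEM_TYPES.get(code, f"other ({code})")] += 1
    return dict(counts)
```

If the summary shows only "Zabbix agent" items, the checks are passive and only the server-to-agent rule on port 10050 is in use.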

          Comment

          • moneynut
            Member
            • Mar 2014
            • 37

            #20
            Originally posted by steveboyson
            You should really spend some time on the documentation.
            Check the item type in the host's or template's item configuration. It shows "Zabbix agent" (passive), "Zabbix agent (active)", "SNMP v1/2/3", "Trapper", or one of many other types.
            I'm already reading the documentation, but I have a long way to go. And Zabbix keeps spamming our inbox. Anyway, I've stopped the Zabbix services until I fix it.

            Comment

            • steveboyson
              Senior Member
              • Jul 2013
              • 582

              #21
              Just disable your alerting rule ("Configuration" - "Actions"). Or remove the "media type" for specific users ("Administration" - "Media types").

              Then the "spamming" would stop immediately.

              Comment

              • tchjts1
                Senior Member
                • May 2008
                • 1605

                #22
                I don't know if that would work, as all those messages are already queued up to be sent.

                I've had it to where I was getting thousands of e-mails and did all kinds of similar actions to try and stop them, but to no avail.

                I think the only real way to do it is to run a few statements to the DB directly to delete all pending alerts...

                Code:
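                -- Drop all pending escalations so no further alerts are generated from them
                delete from escalations;
                -- Mark every not-yet-sent alert (status 0) as sent (status 1) so it is never delivered
                update alerts set status=1 where status=0;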

                Comment

                • moneynut
                  Member
                  • Mar 2014
                  • 37

                  #23
                  Guys, is my issue (unreachable for 5 minutes) somehow related to the queue?
                  I see that I have 500 items in the queue, and the top one is delayed by more than 55 minutes.

                  Also, the dashboard says that many hosts are unreachable for 5 minutes and sends me an email, but if I search for the same host in Configuration -> Hosts and open it, it shows as monitored and the Zabbix icon is green.

                  Also, the log sometimes reports a first network error and the connection is restored in the next second, though the dashboard does not change its status.

                  17018:20140321:122303.688 Zabbix agent item "net.if.in[WAN Miniport (SSTP)]" on host "QA-Server-001" failed: first network error, wait for 15 seconds
                  17014:20140321:122344.038 Zabbix agent item "net.if.in[RAS Async Adapter]" on host "Prod-Server-007" failed: first network error, wait for 15 seconds
                  17022:20140321:122417.267 resuming Zabbix agent checks on host "QA-Server-001": connection restored
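
To see which hosts flap most often, log lines like the ones above can be tallied per host. This is a rough Python sketch that assumes the server log uses exactly the message wording shown in this post; the function name is made up for illustration, and other Zabbix versions may phrase these messages differently.

```python
import re
from collections import Counter

# Patterns based on the log lines quoted in this thread (wording is an assumption
# for any other Zabbix version).
ERROR_RE = re.compile(r'on host "([^"]+)" failed: first network error')
RESTORED_RE = re.compile(
    r'resuming Zabbix agent checks on host "([^"]+)": connection restored'
)


def summarize_network_errors(lines):
    """Return (errors, restored): per-host counts of each kind of message."""
    errors, restored = Counter(), Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            errors[match.group(1)] += 1
            continue
        match = RESTORED_RE.search(line)
        if match:
            restored[match.group(1)] += 1
    return errors, restored


# Example usage (the log path is a placeholder):
#   with open("/var/log/zabbix/zabbix_server.log") as log:
#       errors, restored = summarize_network_errors(log)
#   print(errors.most_common(10))
```

Hosts with many errors but few "connection restored" lines are the ones worth checking first.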

                  Comment

                  • moneynut
                    Member
                    • Mar 2014
                    • 37

                    #24
                    One of the hosts in the queue is delayed by 1 hour 5 minutes, and the dashboard says it has been unreachable for 5 minutes (duration and age of 7 hours 54 minutes). But when I look up the same host in Configuration -> Hosts, its status is Monitored and the Zabbix icon at the top is green. I'm still getting alerts about it.

                    This is the same case for all the hosts that are sending me false alerts.

                    Comment

                    • moneynut
                      Member
                      • Mar 2014
                      • 37

                      #25
                      FYI,
                      Number of hosts (Monitored) - 450+
                      Number of items (Monitored) - 19000+
                      Number of triggers (Enabled) - 4300+

                      Disk space, server performance, and DB performance are all normal.

                      Comment

                      • steveboyson
                        Senior Member
                        • Jul 2013
                        • 582

                        #26
                        A filling queue is THE RESULT of unreachable/unknown items. There is no actual queue as such. What the frontend calls the "queue" is a representation of overdue items.

                        Comment

                        • aib
                          Senior Member
                          • Jan 2014
                          • 1615

                          #27
                          Also, for complicated configurations that include a PROXY, be prepared for your queue to contain some data that is waiting to be sent until DataSenderFrequency (from zabbix_proxy.conf) has elapsed.
                          Sincerely yours,
                          Aleksey

                          Comment

                          • moneynut
                            Member
                            • Mar 2014
                            • 37

                            #28
                            Originally posted by steveboyson
                            A filling queue is THE RESULT of unreachable/unknown items. There is no actual queue as such. What the frontend calls the "queue" is a representation of overdue items.
                            So what's next? I've been trying to fix this for the last 3 days.
                            Last edited by moneynut; 21-03-2014, 16:49.

                            Comment

                            • moneynut
                              Member
                              • Mar 2014
                              • 37

                              #29
                              Originally posted by aib
                              Also, for complicated configurations that include a PROXY, be prepared for your queue to contain some data that is waiting to be sent until DataSenderFrequency (from zabbix_proxy.conf) has elapsed.
                              Okay, but my Zabbix setup does not go through a proxy. Thanks for the info, though.

                              Comment

                              • tchjts1
                                Senior Member
                                • May 2008
                                • 1605

                                #30
                                Originally posted by moneynut
                                So what's next? I've been trying to fix this for the last 3 days.
                                We have asked you for information (that you have not provided) to try and help you out. As I have mentioned in this thread, these graphs are going to tell a lot about your setup and what may need tweaking.

                                Provide a 24-hour view of the Zabbix internal process busy graph, the Zabbix data gathering process busy graph, and the Zabbix cache usage graph.

                                I don't mean to be curt, but you ask "what's next" when you are disregarding troubleshooting steps we asked for way back in this thread.

                                Comment
