Zabbix-proxy fuzzytime not giving proper response

  • A_01
    Junior Member
    • Jun 2014
    • 7

    #1

    Zabbix-proxy fuzzytime not giving proper response

    Hi,
    We have a proxy-agent setup, and we get an "agent not reachable" alert (from a nodata expression) whenever the proxy goes down.
    So I have changed the trigger logic to:

    ({ip-10-4-1-17.ec2.internal:zabbix[proxy,zabbixproxy.dev-test.com,lastaccess].fuzzytime(240)} = 1
    &
    {ip-10-4-1-17.ec2.internal:agent.ping.nodata(5m)} = 1)

    It should be true only if the proxy is up and the node is down, but it is not working as expected: I still get a notification that the agent is not reachable. I am very confused. Can somebody please tell me what is wrong with it, or point me in a direction so that I can figure it out?
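For reference, the intended logic can be modelled outside Zabbix. The sketch below is a simplified Python model, not Zabbix code: `fuzzytime` and `nodata` are stand-in functions that mimic the documented return values (1 when the item timestamp is within N seconds of server time, and 1 when no data arrived for N seconds), and `agent_down_alert` is a hypothetical helper name.

```python
def fuzzytime(item_timestamp, now, sec):
    """Model of fuzzytime(sec): 1 if the value is within `sec` seconds of `now`."""
    return 1 if abs(now - item_timestamp) <= sec else 0


def nodata(last_value_time, now, sec):
    """Model of nodata(sec): 1 if nothing was received during the last `sec` seconds."""
    return 1 if now - last_value_time >= sec else 0


def agent_down_alert(now, proxy_lastaccess, last_ping_time):
    """Combined condition: proxy seen within 240 s AND no agent.ping for 5 min."""
    proxy_up = fuzzytime(proxy_lastaccess, now, 240) == 1
    agent_silent = nodata(last_ping_time, now, 300) == 1
    return proxy_up and agent_silent
```

In this model the alert is true only when the proxy checked in recently while the agent has been silent for at least 5 minutes, which is the behaviour the expression is meant to have.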

    It's very urgent. Any suggestions would be very helpful!
    Thank you.
  • Pada
    Senior Member
    • Apr 2012
    • 236

    #2
    Hi,

    It seems like you've seen my post and the post that I've referenced:



    I'm not sure why your trigger isn't working correctly, but I would suggest using a trigger dependency rather than including an additional variable in the expression.

    At the moment, our Zabbix 1.8 setup with a bunch of Zabbix proxies is set up as follows:
    • Template-Host-via-Proxy:
      • Macro: {$PROXY-HOSTNAME} = <Proxy hostname placeholder>
      • Item:
        • key: zabbix[proxy,{$PROXY-HOSTNAME},lastaccess]
        • type: Zabbix internal
        • units: unixtime
        • update interval: 30s
        • keep history: 1day
        • keep trends: 7days
        • description: {$PROXY-HOSTNAME}: last access time

      • Trigger:
        • name: {$PROXY-HOSTNAME} not connected for 30s
        • comments: this trigger only recovers after 10min
        • expression:
          Code:
          ({Template-Host-via-Proxy:zabbix[proxy,{$PROXY-HOSTNAME},lastaccess].fuzzytime(30)}=0)|(({TRIGGER.VALUE}=1)&(    (({Template-Host-via-Proxy:zabbix[proxy,{$PROXY-HOSTNAME},lastaccess].avg(#11)}-{Template-Host-via-Proxy:zabbix[proxy,{$PROXY-HOSTNAME},lastaccess].last(#11)}-150)>2)|(({Template-Host-via-Proxy:zabbix[proxy,{$PROXY-HOSTNAME},lastaccess].avg(#11)}-{Template-Host-via-Proxy:zabbix[proxy,{$PROXY-HOSTNAME},lastaccess].last(#11)}-150)<-2)))
        • severity (you don't want this to cause warnings for every host!): informational



    • Template-RHEL:
      • Item #1:
        • Description: tcp ping to zabbix server/proxy
        • type: Zabbix agent (active)
        • key: agent.ping
        • update interval: 30s
        • show value: service state

      • Trigger #1:
        • name: Missing Zabbix Agent data for 5min
        • expression:
          Code:
          {Template-RHEL:agent.ping.nodata(301)}=1
        • severity: warning

      • Item #2:
        • key: system.localtime
        • description: Host local time
        • type: Zabbix agent (active)
        • units: unixtime
        • update interval: 30s

      • Trigger #2:
        • name: Time out of sync
        • comments: host's time is out of sync by more than 75s (has to be longer than the proxy unavailable trigger)
        • expression:
          Code:
          {Template-RHEL:system.localtime.fuzzytime(75)}=0
        • severity: warning

      • ... and lots of other items and triggers that you want to use


    • Template-RHEL-via-<proxy's hostname here>: eg. Template-RHEL-via-proxyzonea.example.com:
      • link with template: Template-Host-via-Proxy, Template-RHEL
      • Macro: {$PROXY-HOSTNAME} = proxyzonea.example.com
      • Trigger #1 (just add the dependency):
        • name: Missing Zabbix Agent data for 5min
        • trigger depends on: proxyzonea.example.com not connected for 30s

      • Trigger #2 (just add the dependency again):
        • name: Time out of sync
        • trigger depends on: proxyzonea.example.com not connected for 30s
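The recovery part of the proxy trigger expression above (the `avg(#11) - last(#11) - 150` comparison) can be checked with a little arithmetic. This is only a sketch under assumptions: it assumes `last(#11)` refers to the 11th most recent value (the oldest of the 11 samples) and that lastaccess is sampled every 30 seconds as configured above. `recovery_blocked` is a hypothetical name.

```python
def recovery_blocked(samples, interval=30):
    """samples: the last 11 lastaccess values, oldest first.

    With evenly spaced samples the average sits exactly in the middle,
    i.e. interval * 10 / 2 = 150 s after the oldest sample.
    """
    expected_offset = interval * (len(samples) - 1) / 2
    avg = sum(samples) / len(samples)
    oldest = samples[0]  # assumed to correspond to last(#11)
    deviation = avg - oldest - expected_offset
    # Same test as the expression: outside +/- 2 s keeps the trigger in PROBLEM.
    return deviation > 2 or deviation < -2


# Proxy connected the whole time: lastaccess advances 30 s per sample.
steady = [1000 + 30 * i for i in range(11)]

# Proxy was away for a while: lastaccess stayed frozen for 5 samples.
with_gap = [1000] * 5 + [1150 + 30 * i for i in range(6)]
```

With the `steady` samples the deviation is 0, so recovery is allowed. With `with_gap` the average falls well short of the expected midpoint, so the trigger would stay in PROBLEM until the frozen samples age out of the 11-sample window.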



    So we basically have 1 template (Template-RHEL) for all our RHEL instances across the multiple zones/regions (which does not include a "Missing Zabbix agent" trigger, but it does include the agent.ping item though!), but then we have multiple (1 per proxy) Template-RHEL-via-* templates.
    This crappy way of doing it results in an extra item per host that gets added to the database every 30 seconds, which is why I set the history & trend time to a week or shorter. On the proxy itself I have that history & trend values set much larger though.

    We never link Template-RHEL directly to a host unless the host is not monitored via a proxy. If the host is monitored via a proxy, we link the Template-RHEL-via-<its proxy's hostname> template to it instead.

    This way, we don't have to go and add trigger dependencies to every single host for every single zone/region at EC2. I have to admit that it is still an effort to duplicate the templates for every zone that we expand to, but it's not that bad.

    The agent.ping and system.localtime (and other time-sensitive) triggers are the only ones that we have configured to depend on the proxy's connection status so far in our setup. For example, if you're monitoring your MySQL or another server's time too, just add the proxy connection status trigger as a dependency for it as well.

    Let me know if you don't understand what I've tried to explain above, then I'll try to do better

    Could you also tell us more details about your setup, such as your proxy refreshrate/update interval, zabbix versions and your update interval for the agent.ping value?
    Perhaps your trigger isn't working because the proxy connection is intermittent and not completely offline?
    Last edited by Pada; 06-06-2014, 05:36.


    • A_01
      Junior Member
      • Jun 2014
      • 7

      #3
      Queries on the reply of Pada

      Thank you very much for your quick response.


      Answer for your question:
      "Could you also tell us more details about your setup, such as your proxy refreshrate/update interval, zabbix versions and your update interval for the agent.ping value?"
      -sudo service zabbix-proxy -V
      -service ver. 0.91
      -Zabbix version : Zabbix 2.0.6
      -agent.ping - Interval 60s
      -proxy refreshrate/update : I am not sure how to check it.

      I have the queries below; can you please clarify them:
      Query 1)
      Can you please clarify why all these templates are needed? Can we keep it simple? The expression which I posted initially:
      ({ip-10-4-1-17.ec2.internal:zabbix[proxy,zabbixproxy.dev-test.com,lastaccess].fuzzytime(240)} = 1
      &
      {ip-10-4-1-17.ec2.internal:agent.ping.nodata(5m)} = 1)

      When I run these two expressions separately as individual triggers, they work as expected, but I do not understand why it does not work when I run them together as above. How can I debug this to figure out why it is not behaving properly?

      Query 2) I have also tested the scenario below.
      We have an "agent not reachable" trigger on the proxy node itself. I made the agents' "agent not found" triggers dependent on the proxy node's "agent not found" trigger. In other words, suppose "devproxy-node" is the proxy node and "agent-node1" and "agent-node2" are the subordinate nodes monitored by devproxy-node.
      devproxy-node
      trigger : agent not found
      agent-node1
      trigger : agent not found
      Depend on(devproxy-node-trigger agent not found)
      agent-node2
      trigger : agent not found
      Depend on(devproxy-node-trigger agent not found)

      So in this setup I also get false alarms: whenever devproxy-node goes down, Zabbix triggers the notification for all the agent nodes. I do not understand what is wrong with it, or how I can debug this (by checking the database or something).
      Note: the zabbix-proxy log looks perfectly fine.

      Query 3)
      If you recommend having all three templates, then let me explain what I understood; please correct me if my understanding is wrong.
      Note: We have a single proxy monitoring multiple agents.
      - Template-Host-via-Proxy
      devproxy-node: This trigger will only be there for proxy nodes and not for non-proxy agents.
      - Template-RHEL :
      I am a little confused about whether I should use this or not. If I should, then I think it will be at agent level (non-proxy) only.
      - Template-RHEL-via-<proxy's hostname here>
      All agent nodes (except the proxy node) will have this template.
      It will have two triggers with no expression but with a dependency on the Template-RHEL triggers above.
      If I have the above setup, then how will it link up with the proxy? I mean, if the proxy goes down, then all the agents will be reported as not reachable, because there is no link with the proxy trigger.


      • Pada
        Senior Member
        • Apr 2012
        • 236

        #4
        Hi,

        The proxy refreshrate/updaterate setting that I was referring to was the "DataSenderFrequency" setting in /etc/zabbix/zabbix_proxy.conf, which is every 1 second by default.

        I cannot find anything wrong with your expression.
        Either make sure that the lastaccess item is updated about every 30 seconds, or increase the difference between the proxy offline and agent offline checks from 1 minute to 2 minutes.
        eg.
        Code:
        ({ip-10-4-1-17.ec2.internal:zabbix[proxy,zabbixproxy.dev-test.com,lastaccess].fuzzytime(3m)} = 1
        &
        {ip-10-4-1-17.ec2.internal:agent.ping.nodata(5m)} = 1)
        If that change doesn't fix the issue, then perhaps you should create separate triggers too to see in which order they trigger:
        eg. one for fuzzytime(3m)=0
        one for fuzzytime(3m)=1
        one for nodata(5m)=0
        and one for nodata(5m)=1
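The reasoning behind widening the gap can be sketched with a rough timing model. This is a simplification and not a description of how Zabbix schedules evaluations: it assumes both conditions are evaluated continuously, that the proxy's lastaccess freezes at the moment it goes down, and that the last agent.ping the server received may already be up to 90 seconds old at that moment (a 60 s item interval plus an assumed 30 s of delivery delay). `false_alarm_seconds` is a hypothetical name.

```python
def false_alarm_seconds(fuzzy_window, nodata_period, last_ping_age):
    """Seconds after a proxy outage during which both conditions hold at once.

    fuzzy_window   -- N in fuzzytime(N) on the proxy lastaccess item
    nodata_period  -- N in nodata(N) on agent.ping
    last_ping_age  -- age of the newest agent.ping when the proxy went down
    """
    # fuzzytime(...)=1 keeps saying "proxy is up" until the window has elapsed.
    proxy_looks_up_until = fuzzy_window
    # nodata(...)=1 starts once the newest ping is older than the period.
    agent_silent_from = nodata_period - last_ping_age
    return max(0, proxy_looks_up_until - agent_silent_from)


# Assumed worst case: 60 s agent.ping interval + 30 s delivery delay.
WORST_CASE_PING_AGE = 60 + 30

one_minute_gap = false_alarm_seconds(240, 300, WORST_CASE_PING_AGE)
two_minute_gap = false_alarm_seconds(180, 300, WORST_CASE_PING_AGE)
```

Under these assumptions a 1-minute gap (240 s vs 5 min) leaves a window in which the proxy still looks up while the agent already looks silent, whereas a 2-minute gap (3 min vs 5 min) closes it.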

        I would also recommend that you upgrade both your Zabbix proxy and server to the latest version of 2.0, because there have been quite a few fixes relating to proxies.

        I'll try to do some more explaining in terms of the questions that you asked when I get more time the coming week.


        • A_01
          Junior Member
          • Jun 2014
          • 7

          #5
          Originally posted by Pada
          Hi,

          The proxy refreshrate/updaterate setting that I was referring to was the "DataSenderFrequency" setting in /etc/zabbix/zabbix_proxy.conf, which is every 1 second by default.
          I cannot find anything wrong with your expression.
          Either make sure that the lastaccess item is updated about every 30 seconds, or increase the difference between the proxy offline and agent offline checks from 1 minute to 2 minutes.
          eg.
          Code:
          ({ip-10-4-1-17.ec2.internal:zabbix[proxy,zabbixproxy.dev-test.com,lastaccess].fuzzytime(3m)} = 1
          &
          {ip-10-4-1-17.ec2.internal:agent.ping.nodata(5m)} = 1)
          I tested that by increasing the values, but still no luck.
          When I run the expressions individually, they work as expected. I mean, suppose I have two separate triggers at host level, on a host which is monitored by a proxy:
          #1: It gets triggered when the proxy has not responded for the past 3 minutes.
          Code:
          {ip-10-4-1-17.ec2.internal:zabbix[proxy,zabbixproxy.dev-test.com,lastaccess].fuzzytime(3m)} = 1
          #2: It gets triggered when this agent has not responded for the past 5 minutes. (It also depends on the proxy, so if I shut down the proxy it gets triggered after 5 minutes.)
          Code:
          {ip-10-4-1-17.ec2.internal:agent.ping.nodata(5m)} = 1
          This runs as expected: if the proxy goes down, then #1 gets triggered after 3 minutes and #2 gets triggered after 5 minutes (because the proxy is down, the server considers the agent down as well).

          But I expect the following behavior for notification mails:
          1. When the proxy is down: send mail only for the proxy node and not for all the nodes which are monitored by the proxy. (Currently I have 48 nodes under only 1 proxy node, so when I shut down the proxy I get 48 notification mails, one for each node.)
          2. When a node (monitored by the proxy) is down: send mail only for the node which is down and not for any other node.

          I have no clue what is wrong with it. Please let me know how I can debug this.


          • Pada
            Senior Member
            • Apr 2012
            • 236

            #6
            Originally posted by A_01
            This runs as expected: if the proxy goes down, then #1 gets triggered after 3 minutes and #2 gets triggered after 5 minutes (because the proxy is down, the server considers the agent down as well).
            Are you getting those emails 2 minutes apart, with the proxy one arriving first?

            What is the "Update interval" for those 2 host level items:
            ip-10-4-1-17.ec2.internal:zabbix[proxy,zabbixproxy.dev-test.com,lastaccess]
            ip-10-4-1-17.ec2.internal:agent.ping

            And did you configure "DataSenderFrequency" for the Zabbix proxy in /etc/zabbix/zabbix_proxy.conf, or did you leave it at its default value?
            Last edited by Pada; 11-06-2014, 14:24.


            • A_01
              Junior Member
              • Jun 2014
              • 7

              #7
              Originally posted by Pada
              Are you getting those emails 2 minutes apart, with the proxy one arriving first?

              What is the "Update interval" for those 2 host level items:
              ip-10-4-1-17.ec2.internal:zabbix[proxy,zabbixproxy.dev-test.com,lastaccess]
              ip-10-4-1-17.ec2.internal:agent.ping
              Yes. After 3 minutes I first get the mail for the proxy,
              and then after 5 minutes I get the mail for the node.
              The update interval is 60 seconds for both items (lastaccess, agent.ping).
              Originally posted by Pada
              And did you configure "DataSenderFrequency" for the Zabbix proxy in /etc/zabbix/zabbix_proxy.conf, or did you leave it at its default value?
              Yes, I have changed it to 30 seconds.

              Also, I wanted to know how I can access the database. I have the values below in zabbix_proxy.conf:
              DBName=/var/lib/sqlite/zabbix.db
              DBUser=zabbix
              DBSocket=/var/lib/mysql/mysql.sock
              But I don't know how to reach the DB. When I run the mysql command on the console, it says command not found.


              • Pada
                Senior Member
                • Apr 2012
                • 236

                #8
                Typically with the Zabbix server & proxy (with mysql) installations, they only install the MySQL server and not the MySQL client too.

                So you'll need to run like
                Code:
                sudo yum install mysql
                to install the MySQL client.
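One caveat worth checking first: the `DBName=/var/lib/sqlite/zabbix.db` value quoted above is a file path, which may mean the proxy is actually using SQLite rather than MySQL. That is only a guess from the config, not something confirmed in this thread. If it is the case, the file can be inspected read-only with Python's built-in `sqlite3` module. The sketch below just lists the table names and assumes nothing about the schema; `list_tables` is a hypothetical helper.

```python
import sqlite3


def list_tables(db_path):
    """Return the table names in an SQLite file, opened read-only."""
    # mode=ro avoids creating or modifying the file the proxy is using.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Example call (path taken from the DBName setting quoted above):
# print(list_tables("/var/lib/sqlite/zabbix.db"))
```

If the call fails because the file is not an SQLite database, the proxy is probably using MySQL after all and the MySQL client is the way to go.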

                Seeing that your update interval is at 60s and you are getting the Proxy offline email after 3 minutes and then 2 minutes later the Host offline notification, I really can't see why combining the two does not work!

                You can always just do a workaround, by keeping the triggers separate, but then make the following changes:
                1) Change the Proxy trigger to generate an "Informational" alert (or another severity level that you're not sending notifications on) once the Proxy is unreachable for 3 minutes
                2) Change the Host offline trigger to have the Proxy trigger as a dependency
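The effect of this workaround can be illustrated with a small model. It is a simplification of how trigger dependencies and severity-filtered actions behave, not Zabbix code: a dependent trigger is treated as silent while its parent is in PROBLEM, and only severities in an assumed `NOTIFIED_SEVERITIES` set send mail. All names here (`triggers_that_notify`, the trigger labels, the 48 example hosts) are made up for the illustration.

```python
# Assumed action filter: "informational" triggers do not send mail.
NOTIFIED_SEVERITIES = {"warning", "average", "high", "disaster"}


def triggers_that_notify(in_problem, depends_on, severity):
    """Return the triggers in PROBLEM state that would send a notification."""
    notified = []
    for trigger in sorted(in_problem):
        parent = depends_on.get(trigger)
        if parent is not None and parent in in_problem:
            continue  # suppressed: the trigger it depends on is in PROBLEM
        if severity[trigger] not in NOTIFIED_SEVERITIES:
            continue  # severity is not covered by the notification action
        notified.append(trigger)
    return notified


PROXY = "proxy unreachable for 3 min"
hosts = [f"host-{n:02d} agent offline" for n in range(1, 49)]  # 48 hosts
depends_on = {h: PROXY for h in hosts}
severity = {h: "warning" for h in hosts}
severity[PROXY] = "informational"

# Proxy outage: the proxy trigger and all 48 host triggers are in PROBLEM.
proxy_outage = triggers_that_notify(set(hosts) | {PROXY}, depends_on, severity)

# A single host goes down while the proxy is fine.
single_host = triggers_that_notify({hosts[6]}, depends_on, severity)
```

In this model a proxy outage produces no host mails at all, while a single host failure still produces exactly one. If a mail for the proxy itself is wanted, that would need its own trigger or a severity that the action does notify on.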


                • A_01
                  Junior Member
                  • Jun 2014
                  • 7

                  #9
                  Originally posted by Pada
                  Typically with the Zabbix server & proxy (with mysql) installations, they only install the MySQL server and not the MySQL client too.

                  So you'll need to run like
                  Code:
                  sudo yum install mysql
                  to install the MySQL client.
                  Thanks a lot I will install it.
                  Originally posted by Pada
                  Seeing that your update interval is at 60s and you are getting the Proxy offline email after 3 minutes and then 2 minutes later the Host offline notification, I really can't see why combining the two does not work!

                  You can always just do a workaround, by keeping the triggers separate, but then make the following changes:
                  1) Change the Proxy trigger to generate an "Informational" alert (or another severity level that you're not sending notifications on) once the Proxy is unreachable for 3 minutes
                  2) Change the Host offline trigger to have the Proxy trigger as a dependency
                  Sure, I will give it a try and will let you know soon.
                  Thanks a lot again.
