Suggestions for monitoring load balanced services

  • mcxian
    Junior Member
    • Mar 2013
    • 10

    #1

    Suggestions for monitoring load balanced services

    Hey all,

    In our web environment we have an F5 load balancer that sends traffic to active/passive varnish servers. Behind varnish is an array of apache nodes. I would like to detect which varnish node is active, and which apache nodes varnish thinks are alive.

    I have a check for our varnish host which tells me the total number of configured apache backends, the number of alive ones, and the list of hostnames for those that are alive. The total number of configured hosts and the number currently alive give me something to trigger on, but I also want an item on each individual apache node reporting its status for a zabbix map.

    I was thinking I could set up a calculated item on my apache hosts with a function something like:
    dynocache2v1:varnish.backendhealth.la.str("{HOST.NAME}")

    That is the varnish server and the item key that holds the text list of hostnames that are active. I was thinking I could search that item for the hostname of an apache node to determine if it is currently active, but the calculated item does not seem to work.

    Additionally, as I mentioned earlier, we have two varnish servers. All traffic gets served to the first by default, but there is a second that takes over should the first fail. Thus, in my apache calculated item above, I would need to somehow save which varnish node is currently active from the F5 in a macro or something, then adjust the calculated item key to be something like:
    $ACTIVE_VARNISH_NODE:varnish.backend.la.str("{HOST.NAME}")

    Is that even possible? Does anyone have any suggestions on how to correct my function syntax or how I could do this better?

    Thanks!
  • Bernd Hohmann
    Member
    • Mar 2013
    • 46

    #2
    Originally posted by mcxian
    Is that even possible? Does anyone have any suggestions on how to correct my function syntax or how I could do this better?
    Unfortunately I have never dealt with such a structure. But from my experience so far, your problem should not be solved in zabbix itself but by an external source, such as a shell script, a perl application or something else, which consolidates the data and reports it to Zabbix.
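
A minimal sketch of that external approach, assuming the script already knows the configured and alive backend lists, and assuming each apache host has a Zabbix trapper item with the hypothetical key `varnish.backend.alive`. It only builds the `<host> <key> <value>` lines that `zabbix_sender` reads from an input file; the function name and item key are made up for illustration.

```python
def sender_lines(configured, alive, item_key="varnish.backend.alive"):
    """Build zabbix_sender input lines, one '<host> <key> <value>' per backend.

    configured: hostnames of every apache backend varnish knows about
    alive:      hostnames varnish currently reports as healthy
    item_key:   hypothetical trapper item key defined on each apache host
    """
    alive_set = set(alive)
    return [
        f"{host} {item_key} {1 if host in alive_set else 0}"
        for host in configured
    ]


if __name__ == "__main__":
    # Example data only; real lists would come from the varnish check.
    for line in sender_lines(["web1", "web2", "web3"], ["web1", "web3"]):
        print(line)
```

The printed lines could be written to a file and passed to `zabbix_sender` with `-z <server>` and `-i <file>`. Hostnames containing spaces would need quoting, which this sketch does not handle.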

    Just my 5ct

    Bernd


    • mcxian
      Junior Member
      • Mar 2013
      • 10

      #3
      So I worked around the issue a bit. On the varnish servers, I have a text item that records the active apache servers. On each apache server, I have a userparameter that runs a script which makes a zabbix api call for the contents of that item on the varnish server, then greps it for the hostname.

      I feel like it is a bit of a hack to need a userparameter to fetch data from zabbix just to send it back to zabbix, but it works.
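
For anyone wanting to try the same thing, here is a rough Python sketch of such a script. It is not the actual script: the function names are made up, the alive list is assumed to be separated by whitespace or commas, and the `auth` field in the request body is how older API versions take the token (newer releases expect an `Authorization` header instead).

```python
import json
import re
import urllib.request


def host_is_alive(alive_text, hostname):
    """Return 1 if hostname is a whole entry in the alive list, else 0."""
    # Split into entries so that "web1" does not match inside "web10",
    # which a plain substring grep would do.
    entries = [e for e in re.split(r"[\s,]+", alive_text.strip()) if e]
    return 1 if hostname in entries else 0


def fetch_last_value(api_url, token, varnish_host, item_key):
    """Ask the Zabbix API (item.get) for the last value of one item."""
    payload = {
        "jsonrpc": "2.0",
        "method": "item.get",
        "params": {
            "host": varnish_host,            # host that owns the item
            "filter": {"key_": item_key},    # exact item key match
            "output": ["lastvalue"],
        },
        "auth": token,  # assumption: older-style token in the request body
        "id": 1,
    }
    request = urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json-rpc"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        result = json.load(response)["result"]
    # An empty result means the item was not found on that host.
    return result[0]["lastvalue"] if result else ""
```

The userparameter would print `host_is_alive(fetch_last_value(...), hostname)` for the local apache hostname, giving a 1/0 value that is easy to trigger on and to show on a map.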

      Another note: after reading the documentation more closely, it appears my original idea is not designed to work:
      "User macros in the formula will be expanded if used to reference a parameter or a constant. User macros will NOT be expanded if used to reference a function, host name, item key or operator."

      Sigh. Maybe the eventual Lua integration might help: https://support.zabbix.com/browse/ZBXNEXT-1443

      So I have this working for my active varnish server. Should that one ever go offline, I don't see a graceful way of failing over these checks to the second/passive varnish node. For example, maps built displaying apache nodes linked to the varnish server won't dynamically rebuild links to the passive host once it is active. Maybe I can work out some trickery with links colored the same as the background and only color them in once that varnish server goes hot.
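
For the item values at least (this does nothing for the map links), the apache-side script could choose which varnish node to query before fetching the alive list. A small sketch, assuming some check exists that says whether a given varnish node is currently taking traffic; `is_active` here is a hypothetical callable, and the node names are placeholders.

```python
def pick_varnish_node(nodes, is_active):
    """Return the first node, in priority order, that is_active accepts.

    nodes:     varnish hostnames, primary first
    is_active: hypothetical callable returning True for the node that is
               currently serving traffic (for example, derived from the F5)
    Returns None when no node is reported active.
    """
    for node in nodes:
        if is_active(node):
            return node
    return None


if __name__ == "__main__":
    # Placeholder names; pretend the primary has failed over.
    print(pick_varnish_node(["varnish-a", "varnish-b"],
                            lambda node: node == "varnish-b"))
```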
