Salus: Ruby DSL for writing metric collectors

  • divanikus
    Junior Member
    • Sep 2018
    • 4

    #1

    Salus: Ruby DSL for writing metric collectors

    Hi folks!

    I got tired of rewriting the same collector primitives for each new piece of software I want to monitor. So I've made a tiny Ruby library to solve that problem. I hope you'll like it, or at least suggest how to improve it (or a better way of solving my problems haha).

    So, here it is: Salus - https://github.com/divanikus/salus


    Why Salus?

    OK, if your check is just a gauge value that doesn't need to be transformed in any way, I guess you'll be fine with plain old bash scripts. Or even a one-liner right in the agent's conf. But what if you need transformations? What if you need multiple values, but querying each parameter separately means A LOT of unnecessary polling of the service? As far as I know, lots of people do some caching in their scripts, split scripts into pieces for simultaneous queries, etc.

    So I tried to make a little framework with enough primitives that you don't have to bother with those things.

    Why Ruby?

    Because it's a general-purpose language with enough power out of the box and even more in its gem repository. If you have Puppet or Chef, you must be familiar with it. I've tried to keep dependencies to a minimum so you shouldn't have to worry about too much unneeded crap on your system. Just the Ruby interpreter from the system repo and 2 gems (salus and thor) without any C extensions. You might also build an omnibus package, so everything, including Ruby, lives in one directory without spreading across your system.

    How does it work?

    I won't try to rewrite the introductory README here, so better check out the GitHub page. In short, you create metric groups, which are the unit of work for this library and are run in parallel. You define your code inside the groups and feed metrics with the raw values you get from your source. Salus takes care of doing the calculations and rendering into the chosen format. You may write your own output renderer too.

    Even though Salus was made as a generic framework for collectors, I personally use it with Zabbix, so it has primitives and modes for that too.
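
To make the "groups run in parallel, metrics get raw values, a renderer formats them" idea more concrete, here is a rough plain-Ruby sketch. It is not how Salus is implemented; `collect_groups` and `render_plain` are made-up names used only for illustration, and the values are hard-coded.

```ruby
# Illustrative only: NOT Salus internals. All names here are hypothetical.

# Run every group's block in its own thread and gather the raw values.
# `groups` maps a group name to a callable returning { metric_name => value }.
def collect_groups(groups)
  threads = groups.map do |name, collector|
    Thread.new { [name, collector.call] }
  end
  threads.map(&:value).to_h
end

# Render collected values as "group.metric - value" lines, sorted by name.
def render_plain(results)
  results.flat_map do |group, metrics|
    metrics.map { |metric, value| format("%s.%s - %.2f", group, metric, value) }
  end.sort
end

groups = {
  "md[md0]" => -> { { "disks" => 2, "degraded" => 0 } },
  "md[md1]" => -> { { "disks" => 2, "degraded" => 0 } },
}

puts render_plain(collect_groups(groups))
```

Collection and rendering are kept apart on purpose: swapping `render_plain` for another formatter would not touch the collecting code.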

    For example, a script which does simple MD RAID status querying:

    Code:
    require "salus/zabbix"
    default ttl: 600
    
    var state_file: nil
    var zabbix_cache_file: "/run/zabbix/mdraid.cache.yml"
    let(:arrs) { Dir.glob("/sys/block/md*").map { |x| x.sub(/^\/sys\/block\//, '') } }
    
    discover "mds" do |data|
      var(:arrs, []).each do |dev|
        data << {"{#MDNAME}" => dev}
      end
    end
    
    var(:arrs, []).each do |dev|
      group "md[#{dev}]" do
        gauge "disks" do
          File.read("/sys/block/#{dev}/md/raid_disks").chomp.to_i
        end
        gauge "degraded" do
          File.read("/sys/block/#{dev}/md/degraded").chomp.to_i
        end
        gauge "sync" do
          action = File.read("/sys/block/#{dev}/md/sync_action").chomp
          action == "idle" ? 0 : (action == "recover" ? 2 : 1)
        end
      end
    end
    Save it as mdraid.salus and call it like this
    Code:
    # salus -f mdraid.salus
    [2018-09-13 00:51:41 +0300] md[md0].degraded - 0.00
    [2018-09-13 00:51:41 +0300] md[md0].disks - 2.00
    [2018-09-13 00:51:41 +0300] md[md0].sync - 0.00
    [2018-09-13 00:51:41 +0300] md[md1].degraded - 0.00
    [2018-09-13 00:51:41 +0300] md[md1].disks - 2.00
    [2018-09-13 00:51:41 +0300] md[md1].sync - 0.00
    Or like this
    Code:
    # salus zabbix bulk md[md0] -f mdraid.salus
    {"disks":2,"degraded":0,"sync":0}
    Or this
    Code:
    # salus -f mdraid.salus -r zabbixsender
    "md[md0]" degraded 1536789249 0
    "md[md0]" disks 1536789249 2
    "md[md0]" sync 1536789249 0
    "md[md1]" degraded 1536789249 0
    "md[md1]" disks 1536789249 2
    "md[md1]" sync 1536789249 0
    How about a little discovery?
    Code:
    # salus zabbix discover mds -f /usr/local/share/salus/mdraid.salus
    {"data":[{"{#MDNAME}":"md0"},{"{#MDNAME}":"md1"}]}
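
For reference, the discovery output above is the usual Zabbix low-level discovery shape: a "data" array with one macro-to-value object per device. A minimal plain-Ruby sketch that builds the same JSON, assuming you already have the device names (`lld_payload` is a hypothetical helper, not part of Salus):

```ruby
require "json"

# Build a Zabbix low-level discovery payload from a list of device names.
# `lld_payload` is a hypothetical helper name, not part of Salus.
def lld_payload(names, macro: "{#MDNAME}")
  JSON.generate({ "data" => names.map { |name| { macro => name } } })
end

puts lld_payload(%w[md0 md1])
```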
    Or how about a script that queries dmesg for alerts etc.:

    Code:
    require "salus/zabbix"
    default ttl: 600
    
    var state_file: "/run/zabbix/dmesg.state.yml"
    var zabbix_cache_file: nil
    
    group "dmesg" do
      res  = {}
      levels     = %w{emerg alert crit err warn notice}
      facilities = %w{kern user mail daemon auth syslog lpr news}
      facilities.each { |f| res[f] = {}; levels.each { |x| res[f][x] = 0 } }
    
      prev = value("last")
      prev = 0 if prev.nil?
      prev = 0 if prev > MonotonicTime.get
    
      %x{sudo dmesg -x --color=never}.split(/\n/).each do |line|
        line.match(/^(\w+)\s*:(\w+)\s*:\s*\[\s*([0-9.]+)\] (.*)$/) do |m|
          next if m[3].to_f <= prev
          prev = m[3].to_f
          next unless facilities.include?(m[1])
          next unless levels.include?(m[2])
          res[m[1]][m[2]] += 1
        end
      end
    
      res.each do |facility, message|
        group facility do
          message.each { |k, v| gauge k, value: v }
        end
      end
      gauge "last", value: prev, mute: true, ttl: 86400
    end
    Code:
    # salus zabbix bulk -f dmesg.salus
    {"dmesg":{"kern":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"user":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"mail":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"daemon":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"auth":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"syslog":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"lpr":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"news":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0}}}
    So I hope you got the point.
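
If the parsing part of the dmesg script is hard to follow, here is a standalone sketch of the same idea without any Salus calls: match each line with the same regex, skip anything at or before the previously seen timestamp, and count the rest per facility and level. The sample text only imitates `dmesg -x` output, and `count_new_messages` is a made-up name. Unlike the script above, it does not filter by a facility/level whitelist.

```ruby
# Standalone sketch of the counting logic on made-up sample lines.
# The sample text imitates `dmesg -x` output; it is not a real log.
LINE_RE = /^(\w+)\s*:(\w+)\s*:\s*\[\s*([0-9.]+)\] (.*)$/

# Returns [counts, last_timestamp]; counts is keyed by [facility, level].
def count_new_messages(text, since)
  counts = Hash.new(0)
  last = since
  text.each_line do |line|
    m = LINE_RE.match(line.chomp)
    next unless m
    stamp = m[3].to_f
    next if stamp <= since        # already counted in a previous run
    last = stamp if stamp > last  # remember the newest timestamp seen
    counts[[m[1], m[2]]] += 1
  end
  [counts, last]
end

sample = <<~LOG
  kern  :info  : [    1.000000] booting
  kern  :warn  : [   12.500000] something looks off
  kern  :err   : [   40.250000] something failed
  daemon:warn  : [   41.000000] service complained
LOG

counts, last = count_new_messages(sample, 10.0)
p counts
p last
```

The returned `last` plays the role of the muted "last" gauge in the script: it is what you would persist and pass back in as `since` on the next run.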


    Known issues?

    Of course it's alpha-quality software: the internal primitives should be stable (and are covered by tests), but the CLI isn't covered by tests and might be buggy.

    Also:
    • If you use a state file, don't forget to delete it when you change a metric's type. Otherwise the metric will be loaded with its previous type regardless of what is in your script now.
    • I highly recommend putting cache and state files on an in-memory filesystem such as the tmpfs at /run. You might want to mount an additional one.
    Bug reports, pull requests, suggestions etc are highly welcome.

    I hope you'll like it, but anyways, thank you for reading this post.
  • kloczek
    Senior Member
    • Jun 2006
    • 1771

    #2
    Writing backend monitoring scripts in Ruby is a terrible idea.
    Why?
    Because Ruby has a lot of dependencies and is very heavy. Monitoring should be as light as possible. Otherwise it causes something I call the "monitoring quantum effect", where monitoring is so heavy that it affects the state of the monitored object.
    The second thing is that you are trying to reinvent the wheel. I think that at the moment it should be possible to find at least one, if not more, MD monitoring templates on http://share.zabbix.com/
    http://uk.linkedin.com/pub/tomasz-k%...zko/6/940/430/
    https://kloczek.wordpress.com/
    zapish - Zabbix API SHell binding https://github.com/kloczek/zapish
    My zabbix templates https://github.com/kloczek/zabbix-templates

    • divanikus
      Junior Member
      • Sep 2018
      • 4

      #3
      Thanks for your reply.

      I'm not a total newbie in monitoring stuff, and I've seen lots of things monitored through Python scripts. So why not Ruby? I'm aware of that "quantum effect", but I don't see any reason why Ruby is worse than Python. OK, Go may be a game changer here, but for the many probes I run once a minute or even once every 10 minutes it's really not a big deal.

      As for MD, it was just an example of how to use that library.

      • kloczek
        Senior Member
        • Jun 2006
        • 1771

        #4
        Originally posted by divanikus
        Thanks for your reply.

        I'm not a total newbie in monitoring stuff, and I've seen lots of things monitored through Python scripts. So why not Ruby?
        Because you can organize MD monitoring without even a single backend script, using only built-in agent functionality.
        KISS principle.

        • divanikus
          Junior Member
          • Sep 2018
          • 4

          #5
          Well, yes, for MD. But if it is something more sophisticated, you'll have to use sed / jq / xmlstarlet etc. Also, bulk value consumption only appeared in 3.4, so for monitoring things like Elasticsearch you had to cache its response somewhere, manage its TTL etc. Otherwise you end up with lots of queries to ES, which is even worse. OK, these days it might not be a big deal anyway, unless you need to transform its output.
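
As a rough illustration of the "cache the response somewhere and manage its TTL" chore mentioned here, this is roughly what a hand-rolled file cache looks like in plain Ruby. `cached_fetch` is a hypothetical helper, not something from Salus or Zabbix, and the block stands in for the real (expensive) query.

```ruby
require "json"
require "tmpdir"
require "fileutils"

# Return cached JSON from `path` if it is younger than `ttl` seconds;
# otherwise call the block, store its result, and return it.
# `cached_fetch` is a hypothetical helper, not part of Salus or Zabbix.
def cached_fetch(path, ttl)
  if File.exist?(path) && Time.now - File.mtime(path) < ttl
    return JSON.parse(File.read(path))
  end
  fresh = yield
  tmp = "#{path}.#{Process.pid}.tmp"
  File.write(tmp, JSON.generate(fresh))
  File.rename(tmp, path) # atomic replace, readers never see a half-written file
  fresh
end

dir = Dir.mktmpdir
path = File.join(dir, "es.cache.json")
calls = 0

# Two lookups within the TTL: only the first one runs the block.
results = Array.new(2) do
  cached_fetch(path, 60) { calls += 1; { "status" => "green" } }
end

puts results.map { |r| r["status"] }.join(", ")
puts "backend queried #{calls} time(s)"
FileUtils.remove_entry(dir)
```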

          • kloczek
            Senior Member
            • Jun 2006
            • 1771

            #6
            Originally posted by divanikus
            Well, yes, for MD. But if it is something more sophisticated, you'll have to use sed / jq / xmlstarlet etc. Also, bulk value consumption only appeared in 3.4, so for monitoring things like Elasticsearch you had to cache its response somewhere, manage its TTL etc. Otherwise you end up with lots of queries to ES, which is even worse. OK, these days it might not be a big deal anyway, unless you need to transform its output.
            As long as it is about parsing the output of those commands, the Zabbix agent is now powerful enough to handle such needs using only plain Zabbix functionality.
            This is the main difference compared to other monitoring software.
            Just have a look at one of my templates, like those for php-fpm, nginx or apache.
            None of those templates needs any backend scripts.

            • divanikus
              Junior Member
              • Sep 2018
              • 4

              #7
              I got your point and I appreciate your work. But you didn't get mine.

              This library is not just a script to pull some numbers into Zabbix, but rather a way to write an agent that can feed anything you want. It supports running in a loop mode, so with systemd you can spin up your collector agent in seconds. All you need is to add an appropriate renderer. My first script was posting data to Graphite. With very little effort it can now export to Zabbix. You may ask it to output to stdout, Graphite format, Zabbix sender format, JSON (add your variant) without rewriting the collector's logic. I know that Ruby isn't the fastest language, but some people write collectors using Python and Node.js. So why not? Maybe I'll rewrite it in something like Crystal, idk.
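
A small plain-Ruby sketch of what "add a renderer without rewriting the collector's logic" can look like. This is not the Salus renderer API; `Sample`, `RENDERERS` and `render` are invented for the example. The Zabbix sender line mirrors the `-r zabbixsender` output shown earlier in the thread, and the Graphite line follows the plaintext "path value timestamp" convention.

```ruby
require "json"

# One collected sample; the field names are made up for this sketch.
Sample = Struct.new(:host, :key, :time, :value)

# Each renderer turns the same sample into a different output format.
RENDERERS = {
  # Graphite plaintext: "<path> <value> <timestamp>"
  graphite: ->(s) { "#{s.host}.#{s.key} #{s.value} #{s.time}" },
  # Zabbix sender style: "<host> <key> <timestamp> <value>"
  zabbix_sender: ->(s) { "\"#{s.host}\" #{s.key} #{s.time} #{s.value}" },
  json: ->(s) { JSON.generate(s.to_h) },
}.freeze

# Render all samples with the renderer registered under `format`.
def render(samples, format)
  renderer = RENDERERS.fetch(format)
  samples.map { |s| renderer.call(s) }
end

samples = [Sample.new("web1", "md.degraded", 1536789249, 0)]
RENDERERS.each_key { |fmt| puts render(samples, fmt) }
```

Adding another output here means adding one more entry to the hash; the code that produced `samples` stays the same.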

              • kloczek
                Senior Member
                • Jun 2006
                • 1771

                #8
                Originally posted by divanikus
                I got your point and I appreciate your work. But you didn't get mine.

                This library is not just a script to pull some numbers into Zabbix, but rather a way to write an agent that can feed anything you want. It supports running in a loop mode, so with systemd you can spin up your collector agent in seconds. All you need is to add an appropriate renderer. My first script was posting data to Graphite. With very little effort it can now export to Zabbix. You may ask it to output to stdout, Graphite format, Zabbix sender format, JSON (add your variant) without rewriting the collector's logic.
                You can still pull whatever you want using the plain agent. Zabbix doesn't need any external collectors.
                https://en.wikipedia.org/wiki/Occam%27s_razor

                I know that Ruby isn't the fastest language, but some people write collectors using Python and Node.js. So why not? Maybe I'll rewrite it in something like Crystal, idk.
                It was a bit different before Zabbix 3.4, which is why it's still possible to find many template examples with backend scripts. Item input filters are now really enough in 99.9% of cases.