Hi folks!
I got tired of rewriting the same collector primitives for every new piece of software I want to monitor, so I've made a tiny Ruby library to solve that problem. I hope you'll like it, or at least suggest how to improve it (or a better way of solving my problems, haha).
So, here it is: Salus - https://github.com/divanikus/salus
Why Salus?
OK, if your check is just a gauge value that doesn't need any transformation, I guess you'll be fine with plain old bash scripts, or even a one-liner right in the agent's conf. But what if you need transformations? What if you need multiple values, but querying each parameter separately means A LOT of unnecessary polling of the service? As far as I know, lots of people do some caching in their scripts, split scripts into pieces for simultaneous queries, etc.
So I tried to make a little framework with enough primitives that you don't have to bother with those things.
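To make the caching point concrete, here is a minimal plain-Ruby sketch of what people tend to hand-roll in their scripts. It is not Salus code: `cached_poll`, `poll_service`, the key names and the numbers are all made up for illustration.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical helper (not part of Salus): returns the cached data while it
# is fresh, otherwise runs the block once and stores the result.
def cached_poll(cache_file, ttl)
  if File.exist?(cache_file) && Time.now - File.mtime(cache_file) < ttl
    return YAML.safe_load(File.read(cache_file))
  end
  data = yield
  File.write(cache_file, YAML.dump(data))
  data
end

# Stand-in for an expensive query that returns many values at once.
POLLS = []
def poll_service
  POLLS << Time.now
  { "connections" => 42, "queue_length" => 7 }
end

CACHE_FILE = File.join(Dir.mktmpdir, "service.cache.yml")

# Two checks ask for different keys, but the service is polled only once.
CONNECTIONS = cached_poll(CACHE_FILE, 600) { poll_service }["connections"]
QUEUE_LENGTH = cached_poll(CACHE_FILE, 600) { poll_service }["queue_length"]
puts "connections=#{CONNECTIONS} queue_length=#{QUEUE_LENGTH} polls=#{POLLS.size}"
```

Every script that needs this ends up carrying its own copy of such a helper, which is exactly the kind of boilerplate I wanted to stop rewriting.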
Why Ruby?
Because it's a general-purpose language with enough power out of the box and even more in its gem repository. If you have Puppet or Chef, you're probably familiar with it already. I've tried to keep dependencies to a minimum so you shouldn't have to worry about too much unneeded crap on your system: just the Ruby interpreter from the system repo and 2 gems (salus and thor) without any C extensions. You might also build an omnibus package, so everything, including Ruby, lives in one directory without spreading across your system.
How does it work?
I won't try to rewrite the introductory README here, so better check out the GitHub page. In short, you create metric groups, which are the unit of work for this library and are run in parallel. You define your code inside the groups and feed the metrics with the raw values you got from your source. Salus takes care of the calculations and of rendering into the chosen format. You may write your own output renderer too.
Even though Salus was made as a generic framework for collectors, I personally use it with Zabbix, so it has primitives and modes for that too.
For example, a script which does simple MD RAID status querying:
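As a rough illustration of the two ideas in that paragraph (independent groups running in parallel, and calculations done on raw values), here is a plain-Ruby sketch. It is not how Salus is implemented and uses none of its API: `GROUPS`, `collect` and `rate` are hypothetical names, and the numbers are invented.

```ruby
# Hypothetical stand-ins for metric groups: each lambda returns the raw
# values for one source. Names and numbers are made up.
GROUPS = {
  "disk" => -> { { "reads" => 1200 } },
  "net"  => -> { { "rx_bytes" => 5_000_000 } },
}

# Run every group in its own thread and collect the raw values.
def collect(groups)
  threads = groups.map { |name, body| [name, Thread.new { body.call }] }
  threads.to_h { |name, thread| [name, thread.value] }
end

# Turn two raw counter samples into a per-second rate.
def rate(prev_value, prev_time, value, time)
  elapsed = time - prev_time
  return nil if elapsed <= 0 || value < prev_value # clock jump or counter reset
  (value - prev_value) / elapsed.to_f
end

RAW = collect(GROUPS)
puts RAW.inspect
# 1000 reads at t=0 and 1200 reads at t=10 give 20.0 reads per second
puts rate(1000, 0, RAW["disk"]["reads"], 10)
```

The point of a framework is that the threading and the rate bookkeeping live in one place, and the script only supplies the raw numbers.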
Code:
require "salus/zabbix"

default ttl: 600
var state_file: nil
var zabbix_cache_file: "/run/zabbix/mdraid.cache.yml"

# Names of all MD arrays, e.g. ["md0", "md1"]
let(:arrs) { Dir.glob("/sys/block/md*").map { |x| x.sub(/^\/sys\/block\//, '') } }

# Zabbix low-level discovery rule "mds"
discover "mds" do |data|
  var(:arrs, []).each do |dev|
    data << {"{#MDNAME}" => dev}
  end
end

# One metric group per array
var(:arrs, []).each do |dev|
  group "md[#{dev}]" do
    gauge "disks" do
      File.read("/sys/block/#{dev}/md/raid_disks").chomp.to_i
    end
    gauge "degraded" do
      File.read("/sys/block/#{dev}/md/degraded").chomp.to_i
    end
    # 0 = idle, 2 = recover, 1 = any other sync action
    gauge "sync" do
      action = File.read("/sys/block/#{dev}/md/sync_action").chomp
      action == "idle" ? 0 : (action == "recover" ? 2 : 1)
    end
  end
end
Save it as mdraid.salus and call it like this:
Code:
# salus -f mdraid.salus
[2018-09-13 00:51:41 +0300] md[md0].degraded - 0.00
[2018-09-13 00:51:41 +0300] md[md0].disks - 2.00
[2018-09-13 00:51:41 +0300] md[md0].sync - 0.00
[2018-09-13 00:51:41 +0300] md[md1].degraded - 0.00
[2018-09-13 00:51:41 +0300] md[md1].disks - 2.00
[2018-09-13 00:51:41 +0300] md[md1].sync - 0.00
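The sync value in that output is a numeric code. Spelled out as a standalone function, the nested ternary from the script maps the contents of sync_action like this (a sketch; the constant and method names are mine, not something Salus defines):

```ruby
# Same mapping as the "sync" gauge in the mdraid script.
SYNC_IDLE = 0
SYNC_BUSY = 1      # any action other than idle or recover
SYNC_RECOVER = 2

def sync_code(action)
  case action
  when "idle" then SYNC_IDLE
  when "recover" then SYNC_RECOVER
  else SYNC_BUSY
  end
end

%w[idle resync check recover].each do |action|
  puts "#{action} -> #{sync_code(action)}"
end
```

A numeric code like this is easy to use in a trigger, for example to alert on 2 while treating 1 as routine maintenance.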
Or like this:
Code:
# salus zabbix bulk md[md0] -f mdraid.salus
{"disks":2,"degraded":0,"sync":0}
Or this:
Code:
# salus -f mdraid.salus -r zabbixsender
"md[md0]" degraded 1536789249 0
"md[md0]" disks 1536789249 2
"md[md0]" sync 1536789249 0
"md[md1]" degraded 1536789249 0
"md[md1]" disks 1536789249 2
"md[md1]" sync 1536789249 0
How about a little discovery?
Code:
# salus zabbix discover mds -f /usr/local/share/salus/mdraid.salus
{"data":[{"{#MDNAME}":"md0"},{"{#MDNAME}":"md1"}]}
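For comparison, this is roughly what producing that discovery payload by hand looks like with nothing but Ruby's standard json library. `discovery_json` is a hypothetical helper name, not part of Salus.

```ruby
require "json"

# Build a Zabbix low-level discovery payload from a list of device names.
def discovery_json(devices, macro = "{#MDNAME}")
  JSON.generate({ "data" => devices.map { |dev| { macro => dev } } })
end

puts discovery_json(%w[md0 md1])
# prints {"data":[{"{#MDNAME}":"md0"},{"{#MDNAME}":"md1"}]}
```

Not hard, but it's one more thing every discovery script has to repeat.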
Or how about a script which queries dmesg for alerts, etc.:
Code:
require "salus/zabbix"

default ttl: 600
var state_file: "/run/zabbix/dmesg.state.yml"
var zabbix_cache_file: nil

group "dmesg" do
  res = {}
  levels = %w{emerg alert crit err warn notice}
  facilities = %w{kern user mail daemon auth syslog lpr news}
  # Zero counters for every facility/level pair
  facilities.each { |f| res[f] = {}; levels.each { |x| res[f][x] = 0 } }

  # Timestamp of the newest message seen on the previous run
  prev = value("last")
  prev = 0 if prev.nil?
  # Stored timestamp is ahead of the current clock (presumably a reboot): start over
  prev = 0 if prev > MonotonicTime.get

  %x{sudo dmesg -x --color=never}.split(/\n/).each do |line|
    # facility, level, timestamp, message
    line.match(/^(\w+)\s*:(\w+)\s*:\s*\[\s*([0-9.]+)\] (.*)$/) do |m|
      next if m[3].to_f <= prev # already counted on an earlier run
      prev = m[3].to_f
      next unless facilities.include?(m[1])
      next unless levels.include?(m[2])
      res[m[1]][m[2]] += 1
    end
  end

  res.each do |facility, message|
    group facility do
      message.each { |k, v| gauge k, value: v }
    end
  end

  # Remember the newest timestamp for the next run
  gauge "last", value: prev, mute: true, ttl: 86400
end
Code:
# salus zabbix bulk -f dmesg.salus
{"dmesg":{"kern":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"user":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"mail":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"daemon":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"auth":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"syslog":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"lpr":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0},"news":{"emerg":0,"alert":0,"crit":0,"err":0,"warn":0,"notice":0}}}
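If you want to check the parsing part without root or a real kernel log, here is a standalone sketch that applies the same regular expression to a few made-up lines in the shape of `dmesg -x` output. It's simplified compared to the script: `count_messages` is a hypothetical helper, and it counts every facility and level instead of filtering against the two lists.

```ruby
# Same pattern as in the dmesg script: facility, level, timestamp, message.
DMESG_LINE = /^(\w+)\s*:(\w+)\s*:\s*\[\s*([0-9.]+)\] (.*)$/

# Made-up sample in the shape of `dmesg -x` output.
SAMPLE = <<~LOG
  kern  :info  : [    0.000000] Linux version (sample line)
  kern  :warn  : [   12.345678] sample warning
  kern  :err   : [   20.500000] sample error
  daemon:warn  : [   21.000000] another sample warning
LOG

# Count messages per facility and level that are newer than `since`.
def count_messages(text, since)
  counts = Hash.new { |hash, facility| hash[facility] = Hash.new(0) }
  text.each_line do |line|
    m = DMESG_LINE.match(line.chomp)
    next unless m && m[3].to_f > since
    counts[m[1]][m[2]] += 1
  end
  counts
end

# Pretend the previous run stopped at the warning logged at 12.345678.
COUNTS = count_messages(SAMPLE, 12.345678)
puts COUNTS.inspect
```

Only the two lines after that timestamp are counted, which is the same idea as the "last" value kept in the state file.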
So I hope you got the point.
Known issues?
Of course, this is alpha-quality software: the internal primitives should be stable (and are covered by tests), but the CLI is not covered by tests and might be buggy.
Also:
- If you use a state file, don't forget to delete it when you change a metric's type. Otherwise the metric will be loaded with its previous type, regardless of what is in your script now.
- I highly recommend putting cache and state files on an in-memory filesystem like tmpfs (for example /run). You might want to mount an additional one.
I hope you'll like it, but either way, thank you for reading this post.
