Update interval for disk latency (userparameter check) on Linux hosts

  • nilldot
    Junior Member
    • Aug 2011
    • 21

    #1

    Update interval for disk latency (userparameter check) on Linux hosts

    This is clear enough when using the Zabbix agent, as it uses buffers to store the information provided by a check.

    What is not clear is how to address the following for Linux hosts:
    I need to monitor read/write latency for disks. There are several tools that can help:

    1. /proc/diskstats
    2.
    3. pt-diskstats

    All of them are used through UserParameter, in most cases in passive mode.

    Now, what update interval should I set here? If I set 30 seconds, for example, it is possible that during seconds 1-29 my storage subsystem is experiencing latency problems, but at second 30 it is OK.

    Keeping in mind that a latency problem is in most cases a random issue, there is a possibility that I will be working with "moment in time" information, which, as you may imagine, is not what I'm trying to monitor. This is not a reliable way to get stats.

    As an option, I see accumulating the results from a bash script for 30 minutes and then sending them to Zabbix. But these will probably be averaged results, and I need spikes.


    How do I do this properly, guys?
    Thanks in advance
  • jan.garaj
    Senior Member
    Zabbix Certified Specialist
    • Jan 2010
    • 506

    #2
    Note that /proc/diskstats contains only counters.

    1.) Playing with /proc/diskstats
    Your script should:
    - read previous state of /proc/diskstats
    - read current state of /proc/diskstats
    - save current state of /proc/diskstats
    - calculate required metrics from previous and current state of /proc/diskstats
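
The four steps above could be sketched in Python roughly as follows. This is only a minimal illustration, not the implementation described in this post: the state directory, the file naming and the function names are assumptions made for the example. It keeps one state file per device and metric, so that the read and write items do not overwrite each other's previous snapshot, and it computes latency the way iostat does for r_await/w_await (milliseconds spent divided by I/Os completed since the last run).

```python
#!/usr/bin/env python3
"""Sketch: average read or write latency per I/O since the previous run."""
import json
import os
import sys

DISKSTATS = "/proc/diskstats"
STATE_DIR = "/tmp"  # assumption: any directory writable by the agent user


def parse_diskstats(text):
    """Return {device: counters} with the four counters needed for latency."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 14:
            continue  # skip short lines (e.g. old-style partition entries)
        stats[parts[2]] = {
            "reads": int(parts[3]),      # reads completed
            "read_ms": int(parts[6]),    # milliseconds spent reading
            "writes": int(parts[7]),     # writes completed
            "write_ms": int(parts[10]),  # milliseconds spent writing
        }
    return stats


def await_ms(prev, curr, kind):
    """Average ms per completed I/O between two snapshots; kind is 'read' or 'write'."""
    ops = curr[kind + "s"] - prev[kind + "s"]
    if ops <= 0:
        return 0.0  # no I/O completed, or counters were reset (reboot)
    return (curr[kind + "_ms"] - prev[kind + "_ms"]) / ops


def main(device, kind):
    # Read the current state.
    with open(DISKSTATS) as f:
        curr = parse_diskstats(f.read()).get(device)
    if curr is None:
        sys.exit("unknown device: " + device)
    name = "zbx_diskstats_%s_%s.json" % (device.replace("/", "_"), kind)
    state_file = os.path.join(STATE_DIR, name)
    # Read the previous state; on the first run there is none, so report 0.
    try:
        with open(state_file) as f:
            prev = json.load(f)
    except (OSError, ValueError):
        prev = curr
    # Save the current state for the next check, then print the metric.
    with open(state_file, "w") as f:
        json.dump(curr, f)
    print(await_ms(prev, curr, kind))


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

It could be wired to the agent with a line such as UserParameter=custom.disk.await[*],/usr/local/bin/disk_await.py $1 $2 (the key name and script path are hypothetical), and then queried as custom.disk.await[sda,read].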

    BTW: if you implement this concept in C, you end up with sysstat/iostat, which is great reference code: https://github.com/sysstat/sysstat/blob/master/iostat.c

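    You don't need to care about the update interval with this concept. You can use whatever you need (1s, 1m, 1h, ...). It will always calculate values since the last check.
    Disadvantage: you can use only one Zabbix server (a second server polling the same item would overwrite the saved state and shorten the interval the first one measures)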

    2.) Playing with iostat (pt-diskstats)
    You can run and parse the output of iostat, for example "iostat -xd <CYCLE_TIME> 2", but <CYCLE_TIME> must match the Zabbix update interval.
    Disadvantage: iostat must be running for the whole monitored time
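
A rough Python sketch of the iostat variant is shown here as an illustration only. The first report printed by iostat covers the time since boot, so only the second (last) report is kept. Columns are looked up by the header names that iostat prints, because they differ between sysstat versions (for example a single await column versus r_await/w_await). The function names are made up for this example.

```python
import subprocess


def parse_last_report(output):
    """Return {device: {column: value}} for the last report in `iostat -xd` output."""
    report = {}
    columns = None
    for line in output.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0].rstrip(":") == "Device":
            columns = parts[1:]
            report = {}  # every header line starts a new report
        elif columns and len(parts) == len(columns) + 1:
            try:
                # some locales print a decimal comma
                values = [float(p.replace(",", ".")) for p in parts[1:]]
            except ValueError:
                continue
            report[parts[0]] = dict(zip(columns, values))
    return report


def run_iostat(cycle_time):
    """Run iostat for cycle_time seconds and return the interval report."""
    result = subprocess.run(
        ["iostat", "-xd", str(cycle_time), "2"],
        capture_output=True, text=True, check=True,
    )
    return parse_last_report(result.stdout)
```

Note that run_iostat(300) blocks for the whole cycle, so in practice it would have to run in the background (for example from cron) and hand its result to Zabbix some other way, which is the disadvantage mentioned above.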

    I've spent a lot of time on implementation 1, because we can live with the disadvantages of this solution (it's a heavily customized Python iostat implementation). We check the data every 5 minutes (= we get the same values as the output of the command "iostat -xd 300 2") -> output http://postimg.org/image/sjkqloj2h/ -> I can tell when we have had spikes, at 5-minute resolution -> it depends on the update interval (collection cycle) and on what you need.
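
To illustrate why the resolution matters, here is a small Python sketch with made-up numbers (not real measurements). It assumes one latency sample per second and an equal number of I/Os in every second, which is a simplification. A 10-second burst of 200 ms latency is strongly diluted in a 5-minute average, but fully visible at a 10-second collection cycle.

```python
from statistics import mean


def summarize(samples_ms, window):
    """Split per-second samples into windows; return (average, peak) per window."""
    result = []
    for start in range(0, len(samples_ms), window):
        chunk = samples_ms[start:start + window]
        result.append((mean(chunk), max(chunk)))
    return result


# Hypothetical data: 300 seconds at 2 ms, with a 10-second burst at 200 ms.
samples = [2.0] * 300
samples[100:110] = [200.0] * 10

five_min = summarize(samples, 300)[0]   # the single 5-minute window
ten_sec = summarize(samples, 10)[10]    # the 10-second window holding the burst
print("5 min window: avg=%.1f ms, peak=%.1f ms" % five_min)
print("10 s window:  avg=%.1f ms, peak=%.1f ms" % ten_sec)
```

A value averaged over the collection cycle, such as the one iostat reports, corresponds to the "avg" number here, so a shorter cycle is the only way to see shorter spikes with this approach.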
    Devops Monitoring Expert advice: Dockerize/automate/monitor all the things.
    My DevOps stack: Docker / Kubernetes / Mesos / ECS / Terraform / Elasticsearch / Zabbix / Grafana / Puppet / Ansible / Vagrant
