Anyone monitoring Unix "top"? (topas)

  • tchjts1
    Senior Member
    • May 2008
    • 1605

    #1

    Anyone monitoring Unix "top"? (topas)

    This is my second implementation of Zabbix in a large IT shop. The question that always comes from the Unix IT support folks is whether Zabbix can monitor and keep history of the top output on Unix servers. This would be a huge selling point for Zabbix.

    Has anyone successfully implemented a way to monitor top?
  • nelsonab
    Senior Member
    Zabbix Certified SpecialistZabbix Certified Professional
    • Sep 2006
    • 1233

    #2
    A few things come to mind, though my first thought is that this is an inappropriate use for Zabbix and monitoring systems in general. I understand the desire for having such data, but in my experience it most often comes from a lack of understanding and/or a lack of willingness to let go of what they know.

    However, with that said, if they *really* need something like this, top can be run in a non-interactive (batch) mode where it will display its output once and quit, via:

    Code:
    top -b -n1
    I would then put that into an active-mode, log-type item, where the output of top is periodically sent to Zabbix and stored as a "log" data type. The Character type may be a better idea, but it may also have size limitations.

    Alternatively, if all you are looking for is the top 5 processes taking up CPU and memory (sorted in that order), grouped by name, the following should work for you:

    Code:
    ps -A -o comm,%cpu,%mem | sed 's/\(\w*\)\/\(\w*\)/\1/g' | awk 'NR>1 {cpu[$1]+=$2; mem[$1]+=$3} END {for (i in cpu) print i, cpu[i], mem[i]}' | sort -k2,2nr -k3,3nr | head -5
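
For anyone who finds the one-liner hard to maintain, the same grouping can be sketched in Python. This is only an illustration of the idea, not what was posted: the function names `collect` and `top_consumers` are made up here, and the name cleanup simply drops everything after the first "/" (so "migration/0" becomes "migration"), which is slightly broader than the sed expression.

```python
import subprocess
from collections import defaultdict


def collect():
    """Return the raw text of `ps -A -o comm,%cpu,%mem` (the command from the post)."""
    result = subprocess.run(
        ["ps", "-A", "-o", "comm,%cpu,%mem"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def top_consumers(ps_output, n=5):
    """Sum %CPU and %MEM per command name and return the n largest
    (name, cpu, mem) tuples, ordered by CPU and then by memory."""
    cpu = defaultdict(float)
    mem = defaultdict(float)
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.rsplit(None, 2)  # the command name may contain spaces
        if len(parts) != 3:
            continue
        try:
            cpu_value, mem_value = float(parts[1]), float(parts[2])
        except ValueError:
            continue  # not a data row
        # Assumption: group "name/suffix" entries under "name".
        name = parts[0].strip().split("/")[0]
        cpu[name] += cpu_value
        mem[name] += mem_value
    rows = [(name, round(cpu[name], 1), round(mem[name], 1)) for name in cpu]
    rows.sort(key=lambda row: (row[1], row[2]), reverse=True)
    return rows[:n]
```

Calling `top_consumers(collect())` on the monitored host would give the five heaviest process names; the result still has to be turned into something Zabbix can store, which this sketch does not attempt.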
    Last edited by nelsonab; 22-11-2012, 00:10.
    RHCE, author of zbxapi
    Ansible, the missing piece (Zabconf 2017): https://www.youtube.com/watch?v=R5T9NidjjDE
    Zabbix and SNMP on Linux (Zabconf 2015): https://www.youtube.com/watch?v=98PEHpLFVHM


    • tchjts1
      Senior Member
      • May 2008
      • 1605

      #3
      Originally posted by nelsonab
      A few things come to mind, though my first thought is that this is an inappropriate use for Zabbix and monitoring systems in general. I understand the desire for having such data, but in my experience it most often comes from a lack of understanding and/or a lack of willingness to let go of what they know.
      I am surprised at your above statement, Mr nelsonab. Why do you feel that it is an inappropriate use of a monitoring tool to see what processes are leading the list in use of memory/CPU usage? Or, are you specifically referring to monitoring top, and feel that your second option is more appropriate?

      Inquiring minds want to know.
      BTW - Have a great Thanksgiving holiday!


      • nelsonab
        Senior Member
        Zabbix Certified SpecialistZabbix Certified Professional
        • Sep 2006
        • 1233

        #4
        Like I said "my first thought." And yes I am more specifically referring to people wanting Zabbix to look just like top, or have data that looks like that. If it's just a matter of the top N processes, have a look at the second code snippet I put up. It will also group all processes with the same name giving a cumulative total, including the grouping of processes with a name similar to process/1 (that's what the crazy sed statement takes care of).

        I think it comes down more to the nature of how data is stored in Zabbix. Zabbix is not very good at storing this type of free-form data: it has limited or no meta-information capability and static naming for the data sets it retrieves, which makes useful presentation of data like this quite challenging. Using such data for triggers is equally challenging. Also, as I said, I wonder whether the desire for this is more a desire to continue with familiar tools than to think about the same problem in different ways.

        In my experience, information such as this is most often only useful when a system is pegged or not responding, in which case it can help to figure out what's taking up the memory or CPU. One way I came up with for handling this was to set up a trigger which would connect to the system, pull a current process list, and store it in Zabbix as a log item. I think I may have even posted it. If I did post it and you do decide to use it, you may need to take the time to mitigate any security ramifications; at the time I was more interested in solving a problem/challenge than in security.
        Last edited by nelsonab; 21-11-2012, 23:54.


        • tchjts1
          Senior Member
          • May 2008
          • 1605

          #5
          Ok, now I understand where you are coming from.

          In the instances I have encountered folks asking for this, it has always been just to seek out what is consuming the CPU/memory during performance degradation. I don't think there was ever an expectation to make it look like a top output, but simply to see what was chewing up the resources.

          I did run your second option of code and got expected output. I'll look further into using this... and possibly exploring the first option as well.

          Thanks for your input!


          • Colttt
            Senior Member
            Zabbix Certified Specialist
            • Mar 2009
            • 878

            #6
            Why top?! What kind of information do you (or Unix IT) need?!
            Debian-User

            Sorry for my bad english


            • tchjts1
              Senior Member
              • May 2008
              • 1605

              #7
              Originally posted by Shad0w
              why top?! what kind of information do you(or Unix IT) need?!
              Simply because top reports the highest consumers of CPU and memory usage at any given time.

              As with any server that experiences performance degradation, the first thing you want to know is what processes are chewing up those resources.


              • yippydawg
                Junior Member
                • Nov 2012
                • 1

                #8
                Baffled

                A few things come to mind, though my first thought is that this is an inappropriate use for Zabbix and monitoring systems in general. I understand the desire for having such data, but in my experience it most often comes from a lack of understanding and/or a lack of willingness to let go of what they know.
                Frankly I'm a bit baffled about your comment that this is inappropriate.

                It seems entirely rational that if you are collecting CPU graphs over time and you see a spike, the first thing people are going to ask is "What caused that?"

                I'm also a bit baffled as to why Zabbix has no way to do that.

                I don't like the top solution much, or polling ps either. Isn't there a more elegant way to achieve this?


                • alledm
                  Member
                  • May 2012
                  • 84

                  #9
                  I agree that Zabbix should have a way to collect top like data. Ideally it should collect the whole top view and then allow you to look only at data you care about.

                  Other solutions like GENEOS do it and it's just a pity that zabbix is not complete in that way.

                  ...and I am sorry, but repeating many times that you don't need it is not going to change the fact that Zabbix SHOULD BE able to do it natively, without ugly top or ps hacks.

                  Also, ps is not reliable for this, because the %CPU it reports is averaged over the whole lifetime of the process since it was created, so it is not good for detecting trends and spikes.

                  The top solution is just plain ugly, but it seems to be the only one that makes some sense.

                  Hopefully zabbix will soon gain the ability to perform such a simple check.
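
To make the point about ps concrete: what a spike detector needs is CPU time consumed between two samples, not the lifetime average. A minimal sketch of that calculation follows, assuming you already have two snapshots of cumulative CPU seconds per PID (how they are gathered is left open here; the function name `interval_cpu_percent` is made up for this example).

```python
def interval_cpu_percent(before, after, elapsed_seconds):
    """Percent of one CPU used by each process between two samples.

    before / after: dict mapping pid -> cumulative CPU seconds at each sample.
    elapsed_seconds: wall-clock time between the two samples.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    usage = {}
    for pid, cpu_after in after.items():
        cpu_before = before.get(pid)
        if cpu_before is None:
            continue  # started during the interval; no baseline to compare against
        delta = cpu_after - cpu_before
        if delta < 0:
            continue  # counter went backwards, most likely a reused pid
        usage[pid] = 100.0 * delta / elapsed_seconds
    return usage
```

A process that burned 2.5 CPU seconds during a 5 second window comes out at 50%, no matter how long it has been running, which is the behaviour top shows and the lifetime-averaged ps figure hides.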


                  • alledm
                    Member
                    • May 2012
                    • 84

                    #10
                    A better approach, until Zabbix learns to do this itself, would be to have a small, lightweight Python daemon running and sending data back to Zabbix.

                    Not the best solution, but using psutil and working on this implementation of top (http://code.google.com/p/psutil/sour...xamples/top.py) I think it should be pretty quick to create something that provides all the metrics you might need.
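
As a rough idea of the "sending data back" half, here is a sketch that formats per-process totals as `<host> <key> <value>` lines, the whitespace-separated layout that `zabbix_sender` reads from an input file. Everything specific in it is an assumption for illustration: the function name `sender_lines`, the `proc.top` key prefix, and the existence of matching trapper items on the Zabbix side. Process names containing spaces or brackets would need quoting that this sketch does not handle.

```python
def sender_lines(host, rows, key_prefix="proc.top"):
    """Format (name, cpu, mem) tuples as zabbix_sender input-file lines.

    host: the host name as configured in Zabbix.
    rows: iterable of (process_name, cpu_percent, mem_percent) tuples.
    key_prefix: hypothetical item key prefix; trapper items with these
                keys would have to exist for the values to be accepted.
    """
    lines = []
    for name, cpu, mem in rows:
        lines.append(f"{host} {key_prefix}.cpu[{name}] {cpu}")
        lines.append(f"{host} {key_prefix}.mem[{name}] {mem}")
    return lines
```

The daemon would write these lines to a file and hand it to `zabbix_sender` with the `-z` (server) and `-i` (input file) options on each cycle; the collection step itself could come from psutil as suggested above.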


                    • cscott
                      Junior Member
                      • Jul 2012
                      • 16

                      #11
                      Surely the way to go on this would be to monitor the CPU and memory usage of the application(s) you are running on the box, as well as other targeted system processes and resources. That way you have discrete numeric values. Grabbing the output of top will just give you a load of text, and processing it won't be straightforward at all. By all means dump it to a file so you have the data, or better still, don't, and just enable system accounting.


                      • tchjts1
                        Senior Member
                        • May 2008
                        • 1605

                        #12
                        That's all fine and good if you have only a handful of servers you are monitoring. But in a large installation with thousands of servers running various processes... not gonna happen.
