Dell Poweredge Openmanage

  • AudiAddict
    Junior Member
    • Jun 2008
    • 29

    #1

    Dell Poweredge Openmanage

    I'm running two brand-new PowerEdge 2950 servers with quad-core CPUs on Win2k3 64-bit.

    The agent on these machines works fine and connects to my unix/zabbix machine.

    Three things which I cannot get working are :

    1) OpenManage SNMP info: how do I get this to communicate with my zabbix server? Where and how do I configure this? I've googled and searched this forum. Everyone seems to get it to work, but nobody explains how.

    Openmanage works on the windows machine.
    I've set the SNMP service to allow snmp packets from the zabbix ip.
    SNMP is enabled as a service.

    What else do I need to configure?

    2) CPU load: I've tried all the possible cpu items/triggers, but none of them give the correct value. They are usually around 0.1 or 0.10. Windows shows load values of 22-40%. There are 8 CPUs though; it would be nice if I could get an average value across all 8 cores with zabbix.

    Maybe the openmanage software will be able to send me the correct data?
    Perfmon in windows can show performance data per cpu; how do I send this same info to zabbix?

    3) Is it possible to monitor the network usage on these poweredge servers? The server has two NICs (dual connections, so 4 NICs in total).
  • swaterhouse
    Senior Member
    • Apr 2006
    • 268

    #2
    There is an OpenManage template on the wiki, and I think it is also included in new 1.4.x installs. Link that template to your server and it should give you general status info. It was not designed to return every value imaginable. It does include status for the different hardware (fans, disk, memory, power supply, etc.) within the server, as well as temperatures and voltages.

    There are two items to monitor for cpu "utilization":
    * cpu.load (item key system.cpu.load), which is the equivalent of the load figure shown by the top command in *nix. In a nutshell, it's basically a measurement of the cpu queue: a number less than 1 * the number of CPUs means instructions are acted on immediately, so that's good; anything over 1 * the number of CPUs means there are things sitting in the queue waiting for computation, so that is "bad".

    * cpu.util (item key system.cpu.util) shows the equivalent of Task Manager, BUT you cannot get an overall utilization. What I do is monitor cpu.util idle% and switch my brain to think that high numbers are good and low numbers are bad. If items could do math, this could easily be solved by calculating total used% as 100 - idle%, but you can't do that in 1.4, nor will you be able to in 1.6.
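To make the two rules of thumb above concrete, here is a minimal Python sketch. It is not Zabbix code: the function names and the sample readings are made up for illustration, and it assumes you already have the load value and the idle percentage from the agent.

```python
def load_is_saturated(load: float, cpu_count: int) -> bool:
    """Return True when the run queue exceeds what the CPUs can service at once."""
    # Rule of thumb from the post: a load below 1 * cpu_count is fine.
    return load > 1.0 * cpu_count


def used_percent(idle_percent: float) -> float:
    """Convert an idle percentage into an overall used percentage."""
    return 100.0 - idle_percent


if __name__ == "__main__":
    # Hypothetical readings for an 8-core box.
    print(load_is_saturated(0.1, 8))  # False
    print(load_is_saturated(9.5, 8))  # True
    print(used_percent(78.0))         # 22.0
```

Read this way, a load of 0.1 on 8 cores and a Task Manager figure of 22% do not contradict each other: one measures queue length, the other measures busy time.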

    To monitor network info you need to use performance counters. Search the forum and you will find plenty of examples. Personally I can't be bothered; I monitor every port on my switches and get the same info in a much easier to manage fashion and with less overhead. This of course assumes that your switches are managed switches. If not, you have to use performance counters on your servers.


    • AudiAddict
      Junior Member
      • Jun 2008
      • 29

      #3
      Thanks for the quick reply.

      I found the wiki templates, but did not need to install them because my zabbix server came with these templates pre-installed.

      It had poweredge and openmanage templates.

      My question was really, how do I get the zabbix server to get this info from my server? Do I not need to configure the snmp settings or community name?

      It's too bad they have no FAQ or howto for this, because I cannot get it to work. All current values are 3 (unknown), I believe.

      I just tried the temperature template and that gives me 21C on the mainboard. So that seems to work. This is the only thing which seems to work; all other values are 3 (unknown): fan speed, etc.
      Last edited by AudiAddict; 13-06-2008, 15:14.


      • swaterhouse
        Senior Member
        • Apr 2006
        • 268

        #4
        What model server(s) do you have and what version of OpenManage are you running?

        I have a mix of PE1850's, PE1950's (gen 1 and 3), PE2850's and PE2950's and an older 2600 all running OpenManage 5.2 or 5.4 (but started ZABBIX using OpenManage 5.0). All these run a mix of Windows 2000 and 2003, CentOS 4 and 5 and VMware ESX 3.5.

        If you're getting the temp then at least some of it is working (i.e. snmp is set up etc.), but it sounds like the snmp indexes may be different on your computers for the other values.


        • AudiAddict
          Junior Member
          • Jun 2008
          • 29

          #5
          Thanks for the reply.

          I'm running four Poweredge 2950 servers (very recent model/version)

          And last but not least, a poweredge 2850 and a 2650.


          • xs-
            Senior Member
            Zabbix Certified Specialist
            • Dec 2007
            • 393

            #6
            I used the standard template from the zabbix install for monitoring the openmanage software (one of few useful templates imho). It worked out of the box.

            A few things to keep in mind tho:
            - You must have omsa installed on the box itself (win/linux)
            - You must have a snmp server daemon installed (win/linux)
            - You must have snmp query access: acls, communities, etc (win/linux)
            - You must enable snmp integration for omsa: /etc/init.d/dataeng enablesnmp (run once to permanently enable) (linux)
            - For debian derivatives (debian/ubuntu), you need to enable smux for snmpd (remove '-I -smux' from the options line in /etc/default/snmpd)
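Once the checklist above is done, a quick way to confirm that OMSA answers over SNMP is to query one status OID from the zabbix machine. The Python sketch below wraps the Net-SNMP `snmpget` command line tool. It makes several assumptions that are not from this thread: that `snmpget` is installed, that the host speaks SNMP v2c, and that the OID shown is the OMSA global system status for chassis 1 (please verify it against the Dell MIB reference). The host and community in the usage note are placeholders.

```python
import re
import subprocess

# Assumption: OMSA global system status, chassis index 1. Verify against the Dell MIB.
GLOBAL_STATUS_OID = "1.3.6.1.4.1.674.10892.1.200.10.1.2.1"


def build_snmpget_command(host, community, oid=GLOBAL_STATUS_OID):
    """Build the argument list for a Net-SNMP snmpget call over SNMP v2c."""
    return ["snmpget", "-v2c", "-c", community, host, oid]


def parse_integer_reply(line):
    """Extract the integer from a reply such as '... = INTEGER: 3' or '... = INTEGER: ok(3)'."""
    match = re.search(r"=\s*INTEGER:\s*(?:\w+\()?(-?\d+)", line)
    return int(match.group(1)) if match else None


def query_status(host, community):
    """Run snmpget and return the status as an int, or None if no integer came back."""
    result = subprocess.run(
        build_snmpget_command(host, community),
        capture_output=True,
        text=True,
        timeout=10,
    )
    return parse_integer_reply(result.stdout)
```

For example, `query_status("192.0.2.10", "public")` would return the raw status number if the agent answers, and `None` if the OID is not served, which usually points at a missing OMSA SNMP integration or an access restriction.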


            I attached my own dell openmanage template with a few small changes:
            Disabled triggers for (useless if you are already monitoring specific items, no need to get an alert twice):
            - Global system status
            - Chassis status
            - Chassis intrusion detection
            - eventlog status

            I added extra triggers for several items (to use the different error levels), and will do the same for all other items in the future:
            - disks
            - memory
            - fans
            - power supply



            PS
            All other items you mentioned like cpu and network utilization, can only be monitored as OS related items, they aren't hardware bound.

            • AudiAddict
              Junior Member
              • Jun 2008
              • 29

              #7
              Thanks for the info and the template!

              The dells are all running windows 2003. Right now I started testing with two poweredge 2950 servers ( our latest / recent model servers)

              Both of them give temp readings (mainboard) through the poweredge template that came with zabbix.

              All other snmp functions give a value of 3 (which is unknown?).

              If it gets the temp settings, it must mean that it can access the windows servers.

              So what am I missing? Or what do I need to configure?


              • xs-
                Senior Member
                Zabbix Certified Specialist
                • Dec 2007
                • 393

                #8
                What do you mean by 'temp reading'? Do you get the actual temp in C or F? Afaik you only get a temp 'status', meaning a the-server-thinks-its-too-hot status y/n.

                The value you receive (3) is status OK. 2 is for unknown.
                Maybe the valuemap is borked in the current default template. Check if you have the value map (config->general->value mapping) and that it's correct:
                1 ⇒ Other
                2 ⇒ Unknown
                3 ⇒ Ok
                4 ⇒ nonCritical
                5 ⇒ Critical
                6 ⇒ nonRecoverable
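As a small illustration of what the value map does, the Python sketch below reproduces the table above as a dictionary and mimics how a mapped value might be displayed. The names `DELL_STATUS_MAP` and `show_value_as` are made up for this example, and the "label (number)" output format is an assumption about presentation, not something taken from the template.

```python
# Status codes as listed in the value map above.
DELL_STATUS_MAP = {
    1: "Other",
    2: "Unknown",
    3: "Ok",
    4: "nonCritical",
    5: "Critical",
    6: "nonRecoverable",
}


def show_value_as(raw: int) -> str:
    """Return the mapped label with the raw number, or the bare number if unmapped."""
    label = DELL_STATUS_MAP.get(raw)
    return f"{label} ({raw})" if label else str(raw)


if __name__ == "__main__":
    print(show_value_as(3))  # Ok (3)
    print(show_value_as(9))  # 9
```

The raw number stored by the item does not change; only the text shown next to it does.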

                Also check that in your items, you have 'Show value as' configured to use the correct value map (for all Dell OpenManage items).


                Last note.
                If you look at 'Latest data' and it shows you values at the items, everything is working correctly! The value mapping will only replace the status numbers with actual text.


                • AudiAddict
                  Junior Member
                  • Jun 2008
                  • 29

                  #9
                  I get a temp reading of 20C.

                  The valuemaps were not listed at all for this template for some reason.

                  So I added the valuemap with the list u gave me. They are all set to 3 which now is OK.

                  To be safe I took out one of the power cables for the PSU, which gave me 3 sms messages (disaster triggers) for OM Chassis, OM Global System and OM : Power Supply.

                  Works perfectly I see!!

                  Can u tell me if I can add ur template? will it overwrite any of the data? Are there any changes except for some triggers that have been disabled?

                  Also, do I need to set these temperature thresholds or settings in the openmanage software? Or is this done by default?

                  Regarding the older server (poweredge 2650). The following readings do not work with the template :

                  Disk controller and enclosure status
                  Fan unit status
                  Mainboard temp in C
                  Proc status
                  Temp max threshold
                  Temp Failure threshold
                  Last edited by AudiAddict; 17-06-2008, 09:06.


                  • xs-
                    Senior Member
                    Zabbix Certified Specialist
                    • Dec 2007
                    • 393

                    #10
                    Hmm

                    I just took a look at the current dell open manage template (the one with the statuses, not the one with temp; it's a separate one), and it seems it has been updated since I started using it.

                    Looking at that template:

                    1)
                    Disable the following triggers, as they are aggregated by openmanage itself (otherwise you will get two active triggers for the same fault)
                    - OM: Event Log Status (all)
                    - OM: Global System Status
                    - OM: Chassis Status
                    Personally i also disabled 'OM: Chassis Intrusion' as this is highly irritating when maintenance is performed and someone forgets to clear the status logs in omsa.

                    2)
                    In my current setup i have the following trigger severities per status
                    =4 Average
                    =5 High
                    >5 Disaster

                    The current setting of '>4 Disaster' is not really useful, as a machine will keep working most of the time (think failed power supply, failed mem but OS keeps running, failed fan, etc)
                    (we use the severity level 'Disaster' for sending an SMS to standby personnel, no need to wake them at night for a redundancy warning )
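The severity scheme described above can be sketched in a few lines of Python. This is only an illustration of the mapping, not a Zabbix trigger expression; the function name is made up, and treating statuses below 4 as "no trigger" is an assumption based on the post only listing 4, 5 and above 5.

```python
from typing import Optional


def severity_for_status(status: int) -> Optional[str]:
    """Map an OpenManage status code to the trigger severity used in the post."""
    if status == 4:
        return "Average"
    if status == 5:
        return "High"
    if status > 5:
        return "Disaster"
    # Assumption: statuses 1-3 (Other, Unknown, Ok) raise no trigger.
    return None


if __name__ == "__main__":
    for code in (3, 4, 5, 6):
        print(code, severity_for_status(code))
```

With this split, a single failed power supply in a redundant pair would typically stay below "Disaster", so nobody gets an SMS at night for a redundancy warning.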

                    Just my 2 cents


                    • AudiAddict
                      Junior Member
                      • Jun 2008
                      • 29

                      #11
                      Thanks again for the reply.

                      Everything seems to be working fine now except for some missing snmp data from the older poweredge 2650 (see above post)

                      Now I've got to figure out how to get the correct cpu data from these machines. The four poweredge servers we are using are running vmware server and I would very much like to see the cpu usage in a graph.

                      How have u set this up with ur poweredge servers? Is there a howto on how to do this?


                      • xs-
                        Senior Member
                        Zabbix Certified Specialist
                        • Dec 2007
                        • 393

                        #12
                        The missing values aren't a bad thing.
                        We are using everything from 1650's to 2950-3 and up. Most servers don't support the full list of items from the omsa template, but these are just snmp misses, not a bug. If you get too many of these misses (say 25%), you could create a set of templates for omsa per poweredge series.
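The "say 25%" rule of thumb can be checked with a short Python sketch. Everything here is hypothetical: the function names, the idea of representing an unsupported item as `None`, and the sample readings. The 0.25 threshold is just the example figure from the post.

```python
def miss_ratio(readings: dict) -> float:
    """Fraction of template items that returned no value (represented as None)."""
    if not readings:
        return 0.0
    missing = sum(1 for value in readings.values() if value is None)
    return missing / len(readings)


def needs_own_template(readings: dict, threshold: float = 0.25) -> bool:
    """True when enough items are unsupported to justify a per-series template."""
    return miss_ratio(readings) >= threshold


if __name__ == "__main__":
    # Hypothetical readings from an older server: one of four items is unsupported.
    sample = {"fan": None, "psu": 3, "memory": 3, "disk": 3}
    print(miss_ratio(sample))          # 0.25
    print(needs_own_template(sample))  # True
```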

                        cpu utilization? its an OS related thing, not hardware.
                        I use runqueue length (unix calls this load, windows calls it queuelength). Zabbix agent can do runqueue with a function, under windows you need to do a perfmon thing for this.


                        • benito103e
                          Junior Member
                          • Jun 2010
                          • 24

                          #13
                          Originally posted by xs-
                          I attached my own dell openmanage template with a few small changes:
                          Disabled triggers for (useless if you are already monitoring specific items, no need to get an alert twice):
                          - Global system status
                          - Chassis status
                          - Chassis intrusion detection
                          - eventlog status

                          I added extra triggers for several items (to use the different error levels), and will do the same for all other items in the future:
                          - disks
                          - memory
                          - fans
                          - power supply
                          Hi !
                          Could you tell me where you got the information about the MIB?
                          I can't find the manual, or anything like it, where the OIDs are described.


                          • Speedfight
                            Member
                            • May 2007
                            • 67

                            #14
                            Originally posted by benito103e
                            Hi !
                            Could you tell me where you got the information about the MIB?
                            I can't find the manual, or anything like it, where the OIDs are described.


                            I found this old topic.
                            Maybe this can help somebody: the OM MIBs reference guide:

                            http://support.dell.com/support/edoc...P/PDF/SNMP.pdf
