ZABBIX Forums  

  #11  
13-07-2013, 00:12
linuxsquad
Junior Member
Join Date: Jul 2013
Location: chicago il
Posts: 12

One of Zabbix's virtues, as with many other IT tools, is centralized management and maintenance.

First, Zabbix notifications allow escalation and other actions for a specific trigger event. You can easily change who is alerted and when.

Second, relying on email servers for communication is not always prudent. While the Zabbix dashboard shows all ongoing issues, you can have other notification mechanisms as well: SMS, sound alarms, flashing lights in the building, etc.

Third, I don't think managing LSI RAID alerts locally on each server scales well...

OB

  #12
13-07-2013, 00:45
vic
Member
Join Date: Jul 2013
Posts: 48

Quote:
Originally Posted by linuxsquad
One of Zabbix's virtues, as with many other IT tools, is centralized management and maintenance.

First, Zabbix notifications allow escalation and other actions for a specific trigger event. You can easily change who is alerted and when.

Second, relying on email servers for communication is not always prudent. While the Zabbix dashboard shows all ongoing issues, you can have other notification mechanisms as well: SMS, sound alarms, flashing lights in the building, etc.

Third, I don't think managing LSI RAID alerts locally on each server scales well...

OB

Depends on your situation. For me a RAID failure requires no escalation. It goes straight to the highest priority. It's not like it's a common occurrence. If it is, you have bigger problems.

I will probably try to adapt my email method to Zabbix.

Last edited by vic; 13-07-2013 at 01:49.

  #13
13-07-2013, 01:37
linuxsquad
Junior Member
Join Date: Jul 2013
Location: chicago il
Posts: 12

1) Zabbix really shines when the IT dept consists of more than a couple of people and/or there are other entities with a vested interest in IT resources. They might want to see historical data to address bottlenecks.

2) Zabbix lets you prioritize and classify events. From this perspective, I like having the ability to select which event triggers which action:

- yellow alert on Z dashboard
- email to a single IT person
- email blast to the whole IT dept
- SMS to the IT dept head and whoever is on call
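
The list above amounts to a routing table from event severity to notification steps. A minimal sketch of that idea follows. The severity names are the ones Zabbix uses for triggers, but which step goes with which severity is an assumption made up for illustration, and Zabbix itself configures this through actions rather than code.

```python
# Hypothetical routing table: Zabbix trigger severity -> notification steps.
# The pairing of severities and steps is an assumption, not taken from the post.
ROUTES = {
    "Warning": ["dashboard"],
    "Average": ["dashboard", "email:one-admin"],
    "High": ["dashboard", "email:it-dept"],
    "Disaster": ["dashboard", "email:it-dept", "sms:dept-head", "sms:on-call"],
}


def actions_for(severity):
    """Return the notification steps for a severity (dashboard only if unlisted)."""
    return ROUTES.get(severity, ["dashboard"])


if __name__ == "__main__":
    for level in ("Warning", "Disaster", "Information"):
        print(level, "->", ", ".join(actions_for(level)))
```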

3) Take hysteresis, for instance. For LSI RAID it is not an issue. However, if storage capacity or network bandwidth values fluctuate around the trigger line (say, 80%), alerts will spam your inbox until someone screams "... get the f### Z### off my email !!!". A single change in the trigger configuration lets you avoid that annoyance.
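
To illustrate the hysteresis point, here is a small sketch of the logic only: the alert fires at one threshold and clears at a lower one, so a value hovering around 80% produces two notifications instead of one per crossing. The 75% clear level and the function names are assumptions for the example. In Zabbix this lives in the trigger expression, not in a script.

```python
def update_alert(active, value, fire_at=80.0, clear_at=75.0):
    """Return the new alert state given the previous state and a new sample."""
    if not active and value >= fire_at:
        return True       # problem starts once the upper threshold is reached
    if active and value < clear_at:
        return False      # problem ends only below the lower threshold
    return active         # otherwise keep the previous state


def count_notifications(samples, **thresholds):
    """Count state changes (each one would send a notification)."""
    active = False
    changes = 0
    for value in samples:
        new_state = update_alert(active, value, **thresholds)
        if new_state != active:
            changes += 1
        active = new_state
    return changes


if __name__ == "__main__":
    samples = [79, 81, 79, 81, 79, 81, 74]
    print("with hysteresis:   ", count_notifications(samples))
    print("single threshold:  ", count_notifications(samples, fire_at=80.0, clear_at=80.0))
```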

So there are plenty of reasons to spend the time and bring Z up on your network, even if local alerts can do all you need ... for now.

OB

  #14
11-08-2013, 15:24
Jason
Senior Member
Join Date: Nov 2007
Posts: 344

Where possible we like to use SNMP monitoring... It's not complicated to use SNMP to discover all drives, both physical and logical, in a server and then report on them. This also gets round the problem of someone adding a drive and then forgetting to update the template.
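
For readers new to low-level discovery, the sketch below shows the general shape of a discovery result in Zabbix 2.x: a JSON object with a "data" array whose entries carry {#MACRO} names. With a built-in SNMP discovery rule Zabbix produces this itself, using {#SNMPINDEX} and {#SNMPVALUE}. The function name and the drive list are made up for the example.

```python
import json


def lld_payload(drives):
    """Build a low-level discovery document from (snmp_index, name) pairs."""
    entries = [
        {"{#SNMPINDEX}": str(index), "{#SNMPVALUE}": name}
        for index, name in drives
    ]
    return json.dumps({"data": entries}, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical drives, as they might come back from an SNMP walk.
    print(lld_payload([(1, "Physical Disk 0:0"), (2, "Physical Disk 0:1")]))
```

Item and trigger prototypes then refer to the macros, so a newly added drive is picked up without anyone editing the template.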

This won't work for ESXi systems, but for those you can just use Python WBEM to read the health status (assuming you have loaded the LSI providers) and then parse that straight into Zabbix.
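
A rough sketch of the "parse it into Zabbix" half of that approach is below. It assumes the health readings have already been fetched over WBEM (that part is left out on purpose, since the provider classes and namespace depend on the installed LSI provider) and formats them as zabbix_sender input lines of the form "hostname key value". The item key raid.health[...] is hypothetical, and names are assumed to contain no whitespace. The numeric codes are the standard CIM HealthState values.

```python
# CIM HealthState values as defined in the DMTF CIM schema.
HEALTH = {
    0: "Unknown",
    5: "OK",
    10: "Degraded/Warning",
    15: "Minor failure",
    20: "Major failure",
    25: "Critical failure",
    30: "Non-recoverable error",
}


def sender_lines(host, readings):
    """Format (element_name, health_state) pairs as zabbix_sender input lines."""
    lines = []
    for name, state in readings:
        key = "raid.health[%s]" % name  # hypothetical item key
        lines.append("%s %s %d" % (host, key, state))
    return lines


if __name__ == "__main__":
    # Hypothetical readings; in practice these would come from the WBEM query.
    for line in sender_lines("esx01", [("disk0", 5), ("disk1", 25)]):
        print(line)
```

The output can be written to a file and passed to zabbix_sender with its input-file option, with the items defined as Zabbix trapper items on the host.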

For remote sites where there isn't direct SNMP access, we just shove in a proxy, either as a small virtual machine or on a Raspberry Pi.

  #15
20-09-2013, 02:56
kevind
Member
Join Date: Sep 2011
Posts: 38

We use LSI Logic MegaRAID controllers in AberNAS units. I made a low-level discovery template which automatically monitors what's there. You should be able to use it with Dell also. This template uses low-level discovery to detect and monitor virtual devices (volumes), physical devices (drives), adapters, enclosures and batteries.

Monitoring MegaRAID via SNMP requires the sas_snmp rpm, available for your controller from the LSI Logic website (buried in "megaRAID_SNMP_Installers"). I also suggest updating net-snmp to version 5.5-44 or higher, so you get the correct size for large volumes.

Substitute your own community string for {{ your_snmp_community }}, and your Zabbix Server IP address for {{ zabbix_server_ip_address }} in the example below. Install/update net-snmp before installing sas_snmp.

Code:
# copy the following files:
#     net-snmp-5.5-44.el6.x86_64.rpm
#     net-snmp-libs-5.5-44.el6.x86_64.rpm
#     net-snmp-utils-5.5-44.el6.x86_64.rpm
#     sas_snmp-13.04-0301.x86_64.rpm
#
# as root:

rpm -Uvh net-snmp-5.5-44.el6.x86_64.rpm net-snmp-libs-5.5-44.el6.x86_64.rpm net-snmp-utils-5.5-44.el6.x86_64.rpm
rpm -ivh sas_snmp-13.04-0301.x86_64.rpm
vi /etc/snmp/snmpd.conf

# At the top of /etc/snmp/snmpd.conf, add the following directives (these are
# snmpd.conf lines, not shell commands). Use the IP address of the Zabbix server
# in place of "{{ zabbix_server_ip_address }}" and your own community string
# in place of "{{ your_snmp_community }}".

rocommunity public 127.0.0.1
rocommunity {{ your_snmp_community }} 0.0.0.0
trapcommunity public
trap2sink {{ zabbix_server_ip_address }}
# report fake allocation unit size for large volumes so size calc is right
realStorageUnits 0

# Save the edited /etc/snmp/snmpd.conf, then restart snmpd (back in the shell):
/sbin/service snmpd restart
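
If the same snippet has to go onto many servers, the placeholder substitution can be scripted. The sketch below is one way to do it, not part of the original instructions. It reuses the two placeholder names from the post. The community string and IP address in the example are made-up values.

```python
TEMPLATE = """rocommunity public 127.0.0.1
rocommunity {{ your_snmp_community }} 0.0.0.0
trapcommunity public
trap2sink {{ zabbix_server_ip_address }}
realStorageUnits 0
"""


def render(template, values):
    """Replace each '{{ name }}' placeholder and refuse to return a partial result."""
    out = template
    for name, value in values.items():
        out = out.replace("{{ %s }}" % name, value)
    if "{{" in out:
        raise ValueError("unfilled placeholder left in config")
    return out


if __name__ == "__main__":
    # Example values only; substitute your own.
    print(render(TEMPLATE, {
        "your_snmp_community": "examplecommunity",
        "zabbix_server_ip_address": "192.0.2.10",
    }))
```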
Attached Files
File Type: xml Template_LSI_MegaRAID.xml (69.6 KB, 884 views)

Last edited by kevind; 20-09-2013 at 10:39.

  #16
02-01-2014, 03:10
UWH-David
Junior Member
Join Date: Jan 2014
Posts: 4

Unfortunately I am not able to import this in version 2.2.1. I receive the following:

ERROR: Import failed

Created: Application "HW_CPU" on "Template SNMP_Dell".
Created: Application "HW_Disk" on "Template SNMP_Dell".
Created: Application "HW_Disk Controller" on "Template SNMP_Dell".
Created: Application "HW_Fan" on "Template SNMP_Dell".
Created: Application "HW_Memory" on "Template SNMP_Dell".
Created: Application "HW_Power Supply" on "Template SNMP_Dell".
Created: Application "HW_System" on "Template SNMP_Dell".
Created: Application "HW_Temperature" on "Template SNMP_Dell".
Cannot find value map "DellStatus" used for item "ChassisStatus" on "Template SNMP_Dell".

Any help would be appreciated!
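
The last line of the import log suggests the template refers to a value map that does not yet exist on the server. One way to see which value maps a template expects before importing it is to scan the XML. The sketch below assumes the export wraps each reference as <valuemap><name>...</name></valuemap> inside an item, which should be checked against the actual file. The sample XML is a made-up fragment, not the attached template.

```python
import xml.etree.ElementTree as ET


def referenced_value_maps(xml_text):
    """Return the sorted, de-duplicated value map names found in a template export."""
    root = ET.fromstring(xml_text)
    names = set()
    for valuemap in root.iter("valuemap"):
        name = valuemap.findtext("name")
        if name:  # empty <valuemap/> elements mean "no value map"
            names.add(name.strip())
    return sorted(names)


# Made-up fragment in the assumed layout, for illustration only.
SAMPLE = """<zabbix_export><templates><template><items>
<item><name>ChassisStatus</name><valuemap><name>DellStatus</name></valuemap></item>
<item><name>Uptime</name><valuemap/></item>
</items></template></templates></zabbix_export>"""

if __name__ == "__main__":
    print(referenced_value_maps(SAMPLE))
```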



Quote:
Originally Posted by geek74
Hi,

Once you have OMSA populating SNMP, you can use the attached template.
To make it work under Ubuntu 12.04 LTS, install OMSA from the Dell repository and apply the following fix: http://administratosphere.wordpress....-ubuntu-a-fix/

It needs a lot of value mapping to be human readable.

DellArrayDiskState
1 ⇒ ready
2 ⇒ failed
3 ⇒ online
4 ⇒ offline
6 ⇒ degraded
7 ⇒ recovering
11 ⇒ removed
13 ⇒ non-raid
15 ⇒ resynching
24 ⇒ rebuild
25 ⇒ noMedia
26 ⇒ formatting
28 ⇒ diagnostics
34 ⇒ predictiveFailure
35 ⇒ initializing
39 ⇒ foreign
40 ⇒ clear
41 ⇒ unsupported
53 ⇒ incompatible


DellBatteryState
1 ⇒ ready
2 ⇒ failed
6 ⇒ degraded
7 ⇒ reconditioning
9 ⇒ high
10 ⇒ low
12 ⇒ charging
21 ⇒ missing
36 ⇒ learning

DellLogDriveState
1 ⇒ ready
2 ⇒ failed
3 ⇒ online
4 ⇒ offline
6 ⇒ degraded
7 ⇒ verifying
15 ⇒ resynching
16 ⇒ regenerating
18 ⇒ failedRedundancy
24 ⇒ rebuilding
26 ⇒ formatting
32 ⇒ reconstructing
35 ⇒ initializing
36 ⇒ backgroundInit
52 ⇒ permanentlyDegraded

DellLogDriveType
1 ⇒ concatenated
2 ⇒ raid-0
3 ⇒ raid-1
4 ⇒ raid-2
5 ⇒ raid-3
6 ⇒ raid-4
7 ⇒ raid-5
8 ⇒ raid-6
9 ⇒ raid-7
10 ⇒ raid-10
11 ⇒ raid-30
12 ⇒ raid-50
13 ⇒ addSpares
14 ⇒ deleteLogical
15 ⇒ transformLogical
18 ⇒ raid-0-plus-1
19 ⇒ concatRaid-1
20 ⇒ concatRaid-5
21 ⇒ noRaid
22 ⇒ volume
23 ⇒ raidMorph
24 ⇒ raid-60
25 ⇒ cacheCade

Dell Open Manage System Status
1 ⇒ Other
2 ⇒ Unknown
3 ⇒ OK
4 ⇒ NonCritical
5 ⇒ Critical
6 ⇒ NonRecoverable

DellsDiskControllerState
1 ⇒ ready
2 ⇒ failed
3 ⇒ online
4 ⇒ offline
6 ⇒ degraded

DellStatus
1 ⇒ other
2 ⇒ unknown
3 ⇒ ok
4 ⇒ nonCritical
5 ⇒ critical
6 ⇒ nonRecoverable

DellStatusProbe
1 ⇒ other
2 ⇒ unknown
3 ⇒ ok
4 ⇒ nonCriticalUpper
5 ⇒ criticalUpper
6 ⇒ nonRecoverableUpper
7 ⇒ nonCriticalLower
8 ⇒ criticalLower
9 ⇒ nonRecoverableLower
10 ⇒ failed

DellStatusRedundancy
1 ⇒ other
2 ⇒ unknown
3 ⇒ full
4 ⇒ degraded
5 ⇒ lost
6 ⇒ notRedundant
7 ⇒ redundancyOffline

DellStorageGlobalStatus
1 ⇒ critical
2 ⇒ warning
3 ⇒ normal
4 ⇒ unknown
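
Since these mappings have to be entered as value maps by hand, it can help to turn the listing into a data structure first, for review or for feeding whatever bulk-entry method is available. The sketch below parses text in the "name, then N ⇒ label lines" layout used above. The function name is made up, and it does not talk to Zabbix at all.

```python
def parse_value_maps(text):
    """Parse 'MapName' headers followed by 'N ⇒ label' lines into nested dicts."""
    maps = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "⇒" in line:
            if current is None:
                raise ValueError("mapping line before any map name: %r" % line)
            value, label = (part.strip() for part in line.split("⇒", 1))
            maps[current][int(value)] = label
        else:
            current = line          # a line without an arrow starts a new map
            maps[current] = {}
    return maps


if __name__ == "__main__":
    listing = """DellStorageGlobalStatus
    1 ⇒ critical
    2 ⇒ warning
    3 ⇒ normal
    4 ⇒ unknown
    """
    for name, mapping in parse_value_maps(listing).items():
        print(name, mapping)
```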


Please comment and update if you find anything wrong.

Cheers

All times are GMT +2.