Hardware Raid Setup using MegaCli
MegaCli
MegaCli is a command-line tool introduced by LSI for administering LSI MegaRAID controllers.
With MegaCli we can create hardware RAID devices, gather information about them and monitor them.
Download and Install MegaCli and other supportive tools
Centos
yum install MegaCli
Command location: /opt/MegaRAID/MegaCli/MegaCli64
Make an alias for easier use:
alias megacli='/opt/MegaRAID/MegaCli/MegaCli64'
Ubuntu
apt-get install megacli
Command location: /usr/sbin/megacli, so no alias is needed.
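Because the binary lives in a different place on each distribution, a script can probe both locations instead of relying on the alias (aliases are normally not expanded in non-interactive scripts). The following is only a sketch; the function name first_executable is our own and not part of MegaCli.

```shell
# Print the first argument that is an executable file; return 1 if none is.
first_executable() {
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

Example use, with the two paths mentioned above: MEGACLI=$(first_executable /opt/MegaRAID/MegaCli/MegaCli64 /usr/sbin/megacli)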
Extra tools
MegaCli does not provide all the information we need in a readable form, such as the mapping to Linux devices and the RAID level, so we are going to use some extra tools.
Centos
yum install sg3_utils
Ubuntu
apt-get install sg3-utils megactl
Megacli Concepts
Adapters, Physical Drives and Virtual Drives
Before we go through the MegaCli commands we need to understand a few MegaCli concepts.
Adapter – The physical controller we are going to use, represented by an id (usually 0).
Enclosure – The physical chassis the physical drives are attached to, represented by an id such as 254 or 252.
Physical Drives – The physical hard disks attached to the controller, represented by ids 0, 1, 2, 3 and so on.
Virtual Drives – Groups of one or more physical drives that are equivalent to RAID devices, represented by ids 0, 1, 2, 3 and so on.
For example, if we have RAID 0 over 3 physical drives, we get:
Physical Drives ids: 0,1,2
Virtual Drive id: 0
The virtual drive contains physical drives 0, 1 and 2 and holds the settings of the RAID device, such as the RAID level and strip size.
MegaCli often shows virtual drives that are RAID 0 over a single physical drive. This happens because a drive attached to the RAID controller is only presented to the operating system once it belongs to a virtual drive, so a standalone disk has to be configured in MegaCli as a one-disk RAID 0.
By default, we will see every physical drive as part of a RAID 0 virtual drive.
Gather the information we need
Gather info about raid controller
Make sure a RAID controller is attached to your server, using lspci:
lspci | grep -i raid
Output example:
81:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] (rev 05)
Now that we know we have a MegaRAID controller, we can go on and work with the MegaCli CLI.
Gather info about raid adapters:
megacli -AdpGetPciInfo -aAll
Output example:
PCI information for Controller 0
--------------------------------
Bus Number : 5
Device Number : 0
Function Number : 0
Exit Code: 0x00
This shows the adapter information. The adapter id here is 0 (Controller 0), and we will use it in the later commands.
Gather info about the enclosure
megacli -EncInfo -a0
Output example:
Number of enclosures on adapter 0 -- 1
Enclosure 0:
Device ID : 245
Number of Slots : 24
Number of Power Supplies : 0
Number of Fans : 1
Number of Temperature Sensors : 3
Number of Alarms : 0
Number of SIM Modules : 0
Number of Physical Drives : 10
Status : Normal
Position : 1
Connector Name : Unavailable
Enclosure type : SGPIO
VendorId is LSI CORP and Product Id is Bobcat
VendorID and Product ID didnt match
FRU Part Number : N/A
Enclosure Serial Number : N/A
ESM Serial Number : N/A
Enclosure Zoning Mode : N/A
Partner Device Id : 65535
Inquiry data :
Vendor Identification : LSI CORP
Product Identification : Bobcat
Product Revision Level : 0504
Vendor Specific : x36-25.5.4.0
Exit Code: 0x00
Let's look at some of the values:
Device ID : 245
This id represents the enclosure and is used in other commands.
Number of Slots : 24
The maximum number of physical drives we can connect to this enclosure.
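To pull just these two values out of the -EncInfo output, a small awk filter is enough. This is a sketch that assumes the field labels look exactly like the sample output above; enclosure_summary is a name we made up.

```shell
# Read "megacli -EncInfo" output on stdin and print "<enclosure id> <slots>"
# for each enclosure. "Partner Device Id" is not matched because the
# pattern is anchored and case-sensitive.
enclosure_summary() {
    awk -F':' '
        /^[[:space:]]*Device ID/       { gsub(/[[:space:]]/, "", $2); id = $2 }
        /^[[:space:]]*Number of Slots/ { gsub(/[[:space:]]/, "", $2); print id, $2 }
    '
}
```

Example use: megacli -EncInfo -a0 | enclosure_summary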
Gather info about physical drives
Adapter id is 0.
megacli -LdPdInfo -a0
Output example:
PD: 0 Information
Enclosure Device ID: 245
Slot Number: 0
Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 20
WWN: 5000C5006B1ECCA8
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 1.090 TB [0x8bba0cb0 Sectors]
Non Coerced Size: 1.090 TB [0x8baa0cb0 Sectors]
Coerced Size: 1.090 TB [0x8baa0000 Sectors]
Firmware state: Online, Spun Up
Is Commissioned Spare : NO
Device Firmware Level: 0002
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x5000c5006b1ecca9
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: SEAGATE ST1200MM0017 0002S3L02PGK
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :34C (93.20 F)
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: Unknown
Drive has flagged a S.M.A.R.T alert : No
Let's look at some of the properties:
Enclosure Device ID: 245
The id of the enclosure that the physical drive is connected to.
Slot Number: 0
This is the slot number the physical drive is connected to.
Together with the enclosure id, this slot number is what identifies the physical drive in other commands.
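Since later commands address a drive as enclosure_id:slot, it can be handy to reduce the long output to just those pairs. The filter below is a sketch that relies on the "Enclosure Device ID" and "Slot Number" labels shown in the sample output; pd_addresses is a hypothetical helper name.

```shell
# Read "megacli -LdPdInfo" output on stdin and print one
# "<enclosure id>:<slot>" line per physical drive.
pd_addresses() {
    awk -F':' '
        /^[[:space:]]*Enclosure Device ID/ { gsub(/[[:space:]]/, "", $2); enc = $2 }
        /^[[:space:]]*Slot Number/         { gsub(/[[:space:]]/, "", $2); print enc ":" $2 }
    '
}
```

Example use: megacli -LdPdInfo -a0 | pd_addresses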
Gather info about Virtual drives
megacli -LDInfo -Lall -a0
Output example:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-0, Secondary-0, RAID Level Qualifier-0
Size : 1.090 TB
Parity Size : 0
State : Optimal
Strip Size : 64 KB
Number Of Drives : 1
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Is VD Cached: No
Target Id: 0
This id represents the virtual drive. It is assigned when the RAID device is created.
Size : 1.090 TB
The size of the RAID device. This is the actual usable size.
For two 1TB drives in RAID 1, the virtual drive size will be 1TB, because the drives mirror each other.
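The same arithmetic can be written down for the RAID levels used in this guide. This is a simplified sketch that assumes equal-sized drives and ignores the small amount of space lost to size coercion; usable_size is our own helper name.

```shell
# usable_size LEVEL DRIVES SIZE_PER_DRIVE
# Prints the usable capacity in the same unit as SIZE_PER_DRIVE.
usable_size() {
    level=$1 drives=$2 size=$3
    case "$level" in
        0)  echo $(( drives * size )) ;;      # striping: all drives count
        1)  echo "$size" ;;                   # two-drive mirror: one drive counts
        10) echo $(( drives / 2 * size )) ;;  # striped mirrors: half the drives count
        *)  echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}
```

For example, usable_size 10 8 1 prints 4, which matches eight 1TiB drives forming a 4TiB RAID 10.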
About physical drives inside virtual drives
It is hard to get a simple view of the virtual drive structure (which physical drives belong to which virtual drive) out of MegaCli.
One way is to post-process the output of the -LdPdInfo command used above:
megacli -LdPdInfo -a0 | grep -E "Virtual Drive:|Slot Number:" | xargs | sed -r 's/(Slot Number:)(\s[0-9]+)/\2,/g' | sed 's/(Target Id: .)/Physical Drives ids:/g' | sed 's/Virtual Drive:/\nVirtual Drive:/g'
Output example:
Virtual Drive: 0 Physical Drives ids: 0,
Virtual Drive: 1 Physical Drives ids: 1,
Virtual Drive: 2 Physical Drives ids: 2, 3, 4, 5, 6, 7, 8, 9,
This manipulation is ugly and unnecessary, since we can simply use megasasctl:
megasasctl
Output example:
a0 LSI MegaRAID SAS 9260-16i encl:1 ldrv:3 batt:FAULT, unknown charge state
a0d0 1TiB RAID 0 1x1 optimal
a0d1 1TiB RAID 0 1x1 optimal
a0d2 4TiB RAID 10 4x2 optimal
a0e245s0 1TiB a0d0 online
a0e245s1 1TiB a0d1 online
a0e245s2 1TiB a0d2 online
a0e245s3 1TiB a0d2 online
a0e245s4 1TiB a0d2 online
a0e245s5 1TiB a0d2 online
a0e245s6 1TiB a0d2 online
a0e245s7 1TiB a0d2 online
a0e245s8 1TiB a0d2 online
a0e245s9 1TiB a0d2 online
Let's look at one of the lines representing a virtual drive:
a0d0 1TiB RAID 0 1x1 optimal
Explanation:
Adapter: 0
Virtual Drive: 0
RAID size: 1TiB
RAID level: 0
A line that represents a physical drive:
a0e245s0 1TiB a0d0 online
Explanation:
Adapter: 0
Enclosure id: 245
Slot: 0
Size: 1TiB
This physical drive is part of virtual drive 0 on adapter 0 (a0d0).
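Because every physical drive line names its virtual drive in the third column, the slots can be grouped per virtual drive with a short awk script. This sketch assumes the column layout of the sample output above; slots_by_vd is a hypothetical helper name.

```shell
# Read megasasctl output on stdin and print "<virtual drive> <slot,slot,...>".
slots_by_vd() {
    awk '
        # Physical drive lines start with a<adapter>e<enclosure>s<slot>.
        $1 ~ /^a[0-9]+e[0-9]+s[0-9]+$/ {
            slot = $1
            sub(/^.*s/, "", slot)              # keep only the slot number
            if ($3 in slots) {
                slots[$3] = slots[$3] "," slot
            } else {
                order[++n] = $3                # remember first-seen order
                slots[$3] = slot
            }
        }
        END { for (i = 1; i <= n; i++) print order[i], slots[order[i]] }
    '
}
```

Example use: megasasctl | slots_by_vd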
Gather info about raid level
Now that we are familiar with the megasasctl command, we can get the RAID level easily.
MegaCli also provides information about the RAID level:
megacli -LDInfo -L2 -a0 | grep -i raid
Output example:
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
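Primary RAID level 1 means the drives are mirrored. Here the mirrored pairs are striped together (RAID 0 over RAID 1), which is RAID 10. Note that a plain RAID 1 virtual drive typically reports the same Primary-1, Secondary-0 line, so it is worth checking the Span Depth field of the full -LDInfo output as well: a span depth greater than 1 indicates RAID 10.
We find this output hard to read, which is why we prefer megasasctl. As shown above: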
a0d2 4TiB RAID 10 4x2 optimal
Mapping virtual drives to linux devices
Finally, we want the mapping between Linux devices and virtual drives (RAID devices).
We already know that virtual drives are represented by ids.
We will use the sg_map command:
sg_map -x
Output example:
/dev/sg0 0 1 117 0 13
/dev/sg1 0 2 0 0 0 /dev/sda
/dev/sg2 0 2 1 0 0 /dev/sdb
/dev/sg3 0 2 2 0 0 /dev/sdc
Explanation:
sg_device_name host_number bus scsi_id lun scsi_type linux_device_name
/dev/sg1 0 2 0 0 0 /dev/sda
scsi_id – Equal to the virtual drive id. This is all we need in order to map virtual drives to Linux devices: in the output above, virtual drive 0 is /dev/sda, virtual drive 1 is /dev/sdb and virtual drive 2 is /dev/sdc.
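The mapping can be extracted with a one-line awk filter. This is a sketch that assumes every disk listed by sg_map hangs off the RAID controller; on a machine with other SCSI or USB disks you would also need to filter on the host and bus columns. vd_to_device is our own name.

```shell
# Read "sg_map -x" output on stdin and print "<virtual drive id> <device>"
# for every entry of scsi_type 0 (disk) that has a Linux device name.
vd_to_device() {
    awk 'NF >= 7 && $6 == 0 { print $4, $7 }'
}
```

Example use: sg_map -x | vd_to_device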
Create RAID Device (Virtual Drive)
Before we create a RAID device we need to gather some information, as explained above.
Let's use this information for the example:
Adapter id: 0
Enclosure id: 245
Physical Drive ids: 3,4
RAID level: 0
The command syntax is:
megacli -CfgLdAdd -rX[enclosure_id:physical_id,enclosure_id:physical_id] -aN
X= Raid level
N= adapter id
Example:
megacli -CfgLdAdd -r0[245:3,245:4] -a0
Output example:
Adapter 0: Created VD 2
Adapter 0: Configured the Adapter!!
Exit Code: 0x00
We can see that virtual drive 2 was created.
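Typing the enclosure:slot list by hand is error-prone, so the command line can be assembled from the gathered values. The sketch below only prints the command so it can be reviewed before running; cfgldadd_command is a hypothetical helper, not a MegaCli feature.

```shell
# cfgldadd_command LEVEL ADAPTER ENCLOSURE SLOT [SLOT...]
# Prints the -CfgLdAdd command for the given drives without running it.
cfgldadd_command() {
    level=$1 adapter=$2 enclosure=$3
    shift 3
    drives=""
    for slot in "$@"; do
        # Append "enclosure:slot", comma-separated after the first entry.
        drives="${drives:+${drives},}${enclosure}:${slot}"
    done
    printf 'megacli -CfgLdAdd -r%s[%s] -a%s\n' "$level" "$drives" "$adapter"
}
```

For the example above, cfgldadd_command 0 0 245 3 4 prints megacli -CfgLdAdd -r0[245:3,245:4] -a0.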
Create RAID 10 Device (Virtual Drive)
Creating a RAID 10 device is different because we have to specify the exact pairs that form each RAID 1 mirror. RAID 0 is then striped over those mirrors.
We also use a different MegaCli flag, -CfgSpanAdd.
Example:
megacli -CfgSpanAdd -r10 -Array0[245:2,245:3] -Array1[245:4,245:5] -a0
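The -ArrayN arguments follow a regular pattern, one per mirrored pair, so they can be generated from a flat list of slots. As before, this sketch only prints the command for review; cfgspanadd_command is a made-up helper name, and it assumes all drives sit in the same enclosure.

```shell
# cfgspanadd_command ADAPTER ENCLOSURE SLOT_A SLOT_B [SLOT_A SLOT_B ...]
# Each consecutive pair of slots becomes one RAID 1 mirror (-ArrayN).
cfgspanadd_command() {
    adapter=$1 enclosure=$2
    shift 2
    if [ $# -lt 4 ] || [ $(( $# % 2 )) -ne 0 ]; then
        echo "need an even number of slots, at least four" >&2
        return 1
    fi
    arrays="" index=0
    while [ $# -gt 0 ]; do
        arrays="${arrays} -Array${index}[${enclosure}:$1,${enclosure}:$2]"
        index=$(( index + 1 ))
        shift 2
    done
    printf 'megacli -CfgSpanAdd -r10%s -a%s\n' "$arrays" "$adapter"
}
```

For the example above, cfgspanadd_command 0 245 2 3 4 5 prints the same command line as shown.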
Delete Raid Device (Virtual Drive)
To delete a RAID device we need to know its virtual drive id and the adapter id.
Make sure there is no data you still need on the disk before deleting it. The device also needs to be unmounted and removed from /etc/fstab.
Let's delete virtual drive 2.
Example:
megacli -CfgLdDel -L2 -a0
Output example:
Adapter 0: Deleted Virtual Drive-2(target id-2)
Exit Code: 0x00
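The unmounted and out-of-fstab precondition can be checked before running the delete. The following is a rough sketch with a hypothetical helper, device_in_file. It does a simple prefix match on the first column, so /dev/sdc also matches /dev/sdc1, and it will not notice fstab entries that refer to the disk by UUID or label.

```shell
# device_in_file DEVICE FILE
# Succeeds if a non-comment line of FILE starts with DEVICE
# (the device itself or one of its partitions).
device_in_file() {
    awk -v dev="$1" '
        $1 !~ /^#/ && index($1, dev) == 1 { found = 1 }
        END { exit !found }
    ' "$2"
}
```

Example use, assuming virtual drive 2 maps to /dev/sdc: run device_in_file /dev/sdc /etc/fstab and device_in_file /dev/sdc /proc/mounts, and delete only if both fail.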
Cache Policy
Cache policies control how the RAID card uses its on-board RAM to collect data before writing it out to disk, or to read data before the system asks for it.
Write cache is used when we have a lot of data to write and it is faster to write the data sequentially to disk than to write it in small chunks.
Read cache is used when the system has asked for some data and the RAID card keeps the data in cache in case the system asks for the same data again.
It is always faster to read and write to cache than to access spinning disks. You should only use caching if you have good UPS power to the system.
If the system loses power and does not flush the cache, it is possible to lose data. No one wants that. Let's look at each cache policy LSI RAID cards use.
- WriteBack uses the card's cache to collect enough data to make a series of long sequential writes out to disk. This is the fastest write method.
- WriteThrough tells the card to write all data directly to disk without cache. This method is quite slow, about 1/10 the speed of WriteBack, but is safer because no cached data can be lost when the machine's power fails.
- ReadAdaptive uses an algorithm to decide, when the OS asks for a run of sequential data blocks, whether to read a few more sequential blocks because the OS _might_ ask for those too. This method can lead to good speed increases.
- ReadAheadNone tells the raid card to only read the data off the raid disk if it was actually asked for. No more, no less.
- Cached allows the general use of the card's cache for any data which is read or written. Very efficient if the same data is accessed over and over again.
- Direct is straight access to the disk without ever storing data in the cache. This can be slow as any I/O has to touch the disk platters.
- Write Cache OK if Bad BBU tells the card to use write caching even if the Battery Backup Unit (BBU) is bad, disabled or missing. This is a good setting if your RAID card's BBU charger is bad, if you do not want to or cannot replace the BBU, or if you do not want WriteThrough enabled during a BBU relearn test.
- No Write Cache if Bad BBU disables WriteBack and turns on WriteThrough if the BBU is not available for any reason. This option is safer for your data, but the RAID card will switch to WriteThrough during a battery relearn cycle.
- Disk Cache Policy: Enabled uses the hard drive's own cache. For example, when data is written out to the drives, this option lets the drives themselves cache data internally before writing it to their platters.
- Disk Cache Policy: Disabled does not allow the drive to use any of its own internal cache.
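The riskiest combination from the list above is WriteBack without the "No Write Cache if Bad BBU" safeguard. A quick check for it can be built on the "Current Cache Policy" line of the -LDInfo output. This is a sketch that assumes the policy names are printed exactly as in the sample output earlier; writeback_without_bbu_guard is our own name.

```shell
# Read "megacli -LDInfo" output on stdin. Succeeds (exit 0) if any virtual
# drive currently uses WriteBack without "No Write Cache if Bad BBU".
writeback_without_bbu_guard() {
    awk -F': *' '
        /^[[:space:]]*Current Cache Policy/ {
            if ($2 ~ /WriteBack/ && $2 !~ /No Write Cache if Bad BBU/) risky = 1
        }
        END { exit !risky }
    '
}
```

Example use: megacli -LDInfo -Lall -a0 | writeback_without_bbu_guard && echo "write cache stays on even with a bad BBU"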