
Dell PowerEdge T140: Dell EMC PowerEdge Servers Troubleshooting Guide - Page 99

How to fix a RAID puncture





A Check Consistency performed after a RAID puncture is induced will not resolve the issue. This is why it is very important to perform a Check Consistency on a regular basis and, when possible, before replacing drives. The array must be in an optimal state to perform the Check Consistency.

A RAID puncture occurs when an array that already contains a data error suffers an additional error event, such as a hard drive failure, and the failed or replacement drive is then rebuilt into the array. As an example, an optimal RAID 5 array includes three members: drive 0, drive 1, and drive 2. If drive 0 fails and is replaced, the data and parity remaining on drives 1 and 2 are used to rebuild the missing information onto the replacement drive 0. However, if a data error exists on drive 1, then when the rebuild operation reaches that error there is insufficient information within the stripe to rebuild the missing data. While the rebuild is in progress, drive 0 has no data, drive 1 has bad data, and drive 2 has good data. Because drive 0 and drive 1 both lack valid data, the stripe contains multiple errors, so any data in that stripe cannot be recovered and is therefore lost. The result, as shown in Figure 24, is that RAID punctures (in stripes 1 and 2) are created during the rebuild, and the errors are propagated to drive 0.

Figure 24. RAID punctures

Puncturing the array restores the redundancy and returns the array to an optimal state. This protects the array from additional data loss in the event of additional errors or drive failures.
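The three-drive RAID 5 example can be illustrated with a small sketch. It assumes simple XOR parity and treats an unreadable block as `None`; the names `xor_blocks` and `rebuild_block` are hypothetical and do not correspond to any PERC firmware interface. Real controllers also rotate parity across the drives, which is left out here for brevity.

```python
from functools import reduce
from typing import List, Optional

# None marks a block that is missing or unreadable (an assumption of this sketch).
Block = Optional[bytes]


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(surviving: List[Block]) -> Block:
    """Rebuild the missing member of one stripe from the surviving members.

    Returns None if any surviving block is unreadable: the stripe then has
    two invalid members and cannot be reconstructed (a punctured stripe).
    """
    if any(block is None for block in surviving):
        return None
    return xor_blocks(surviving)


# Healthy stripe: drive 0 held 01 02, drive 1 holds 10 20, drive 2 holds their parity.
parity = xor_blocks([b"\x01\x02", b"\x10\x20"])
recovered = rebuild_block([b"\x10\x20", parity])  # the original contents of drive 0

# Stripe with a data error on drive 1: nothing valid can be written to drive 0.
punctured = rebuild_block([None, b"\x33\x44"])  # None
```

In the healthy stripe the replacement drive 0 receives its original contents. In the second stripe the rebuild has only one valid member out of three, which is the situation the text describes as a puncture.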

How to fix a RAID puncture

Issue: How to fix RAID arrays that have been subjected to a puncture?

Solution: Complete the following steps to resolve the issue:

WARNING: Following these steps will result in the loss of all data on the array. Ensure that you are prepared to restore from backup or other means prior to following these steps. Use caution so that following these steps does not impact any other arrays.

1. Discard Preserved Cache, if it exists.

2. Clear foreign configurations, if any.

3. Delete the array.

4. Shift the position of the drives by one. Move Disk 0 to slot 1, Disk 1 to slot 2, and Disk 2 to slot 0.

5. Recreate the array as desired.

6. Perform a Full Initialization of the array (not a Fast Initialization).

7. Perform a Check Consistency on the array.

If the Check Consistency completes without errors, you can safely assume that the array is now healthy and the puncture is removed. Data can now be restored to the healthy array.
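For arrays with more than three members, the drive shift in the procedure above can be planned on paper before any drive is pulled. The following is a minimal sketch, assuming the shift is a rotation by one position within the slots the array already occupies; `shift_slots` is a hypothetical helper for planning only and does not talk to the controller.

```python
from typing import Dict


def shift_slots(disk_to_slot: Dict[int, int]) -> Dict[int, int]:
    """Return a new disk -> slot mapping with every disk moved up one slot.

    The disk in the highest occupied slot wraps around to the lowest one,
    so the set of occupied slots stays the same.
    """
    slots = sorted(disk_to_slot.values())
    next_slot = {slot: slots[(i + 1) % len(slots)] for i, slot in enumerate(slots)}
    return {disk: next_slot[slot] for disk, slot in disk_to_slot.items()}


before = {0: 0, 1: 1, 2: 2}   # Disk n currently sits in slot n
after = shift_slots(before)   # {0: 1, 1: 2, 2: 0}
```

For the three-drive example this reproduces the move described in the steps: Disk 0 to slot 1, Disk 1 to slot 2, and Disk 2 to slot 0.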
