HP StorageWorks MSA1510i HP StorageWorks 1510i Modular Smart Array installatio - Page 93



Recognizing and recovering from hard drive failures and faulted LUNs
The purpose of fault-tolerant array configurations is to protect against data loss due to hard drive failure.
Each RAID configuration has inherent limitations on the number of hard drive failures that it can tolerate.
If the fault-tolerance level of a particular LUN or array configuration is exceeded, the array will be locked
from any further I/O. This protection is designed to preserve the integrity of the local drive, but does
require manual intervention to recover or re-enable the LUN.
Although controller firmware is designed to protect against normal hard drive failure, it is imperative that
you perform the correct actions to recover from a hard drive failure without inadvertently introducing any
additional hard drive failures.
Included sections:
• Recognizing hard drive failure
• Compromised fault tolerance
• Recovering from compromised fault tolerance (enabling failed LUNs)
• Automatic data recovery (rebuild)
• Replacing a hard drive
Recognizing hard drive failure
LEDs on the front of each hard drive are visible from the front of the external storage unit. When a hard
drive is configured as a part of an array and attached to a powered-on controller, the status of the hard
drive can be determined from the illumination pattern of these LEDs.
For detailed descriptions of the various LED combinations, see Hard drive LEDs.
Other ways to determine that a hard drive has failed include the following:
• LEDs on the storage system chassis illuminate amber if failed hard drives are inside. (However, this
  LED also illuminates when other problems occur, such as when a fan or a redundant power supply
  fails, or when the system overheats.)
• LEDs on the hard drives illuminate amber if a hard drive has failed or is a member of a faulted LUN.
• Front-panel LCD display messages list faulted LUNs and failed hard drives whenever the system is
  restarted, as long as the controller detects one or more good hard drives.
• The ACU and SMU represent faulted LUNs and failed drives with distinctive icons.
• HP-SIM can detect failed hard drives.
• ADU lists all failed hard drives.
For more information on troubleshooting hard drive problems, see the HP ProLiant servers troubleshooting
guide.
Effects of hard drive failure
When a hard drive fails, all logical drives that are in the same array are affected. Each logical drive in an
array may be using a different fault-tolerance method, so each logical drive can be affected differently.
• RAID 0 configurations cannot tolerate hard drive failure. If any physical hard drive in the array
  fails, all non-fault-tolerant (RAID 0) LUNs in the same array also are failed.
• RAID 1 and RAID 1+0 configurations can tolerate multiple hard drive failures, as long as none of
  the failed hard drives are mirrored to one another.
• RAID 5 configurations can tolerate one hard drive failure.
• RAID 6 configurations can tolerate simultaneous failure of two hard drives in the array.
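
The tolerance rules above can be sketched as a small Python check. This is an illustrative model only, not the controller firmware's logic: the names RaidLevel and lun_survives, the mirror_pairs argument, and the drive labels are all hypothetical, and the sketch assumes you already know which drives have failed and how the mirrored drives are paired.

```python
from enum import Enum


class RaidLevel(Enum):
    """Fault-tolerance methods named in this section (hypothetical enum)."""
    RAID0 = "RAID 0"
    RAID1 = "RAID 1"
    RAID10 = "RAID 1+0"
    RAID5 = "RAID 5"
    RAID6 = "RAID 6"


def lun_survives(level, failed, mirror_pairs=()):
    """Return True if a LUN using `level` is still within its fault tolerance.

    failed       -- identifiers of the failed drives in the array
    mirror_pairs -- for RAID 1 / RAID 1+0 only: (drive_a, drive_b) tuples
                    naming drives that are mirrored to one another
                    (assumed to be known to the caller)
    """
    failed = set(failed)
    if level is RaidLevel.RAID0:
        # No fault tolerance: any failed drive fails the LUN.
        return len(failed) == 0
    if level in (RaidLevel.RAID1, RaidLevel.RAID10):
        # Survives unless both drives of some mirrored pair have failed.
        return not any(a in failed and b in failed for a, b in mirror_pairs)
    if level is RaidLevel.RAID5:
        # Tolerates one failed drive.
        return len(failed) <= 1
    if level is RaidLevel.RAID6:
        # Tolerates two simultaneously failed drives.
        return len(failed) <= 2
    raise ValueError(f"unsupported RAID level: {level!r}")


# Example: a four-drive array with made-up drive labels.
pairs = [("d1", "d2"), ("d3", "d4")]
same_array_failures = {"d1", "d3"}

for level in RaidLevel:
    ok = lun_survives(level, same_array_failures, pairs)
    print(f"{level.value}: {'within tolerance' if ok else 'faulted'}")
```

Because each logical drive in an array can use a different method, the same set of failed drives can leave one LUN within tolerance and another faulted, which is why the check is made per LUN rather than per array.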
1510i Modular Smart Array installation and user guide