HP AD510A HP StorageWorks 1500 Modular Smart Array maintenance and service gui - Page 97


9 Hard drive failures and faulted LUNs
The purpose of fault-tolerant array configurations is to protect against data loss due to hard drive failure.
Each RAID configuration has inherent limitations on the number of hard drive failures that it can tolerate.
If the fault-tolerance level of a particular LUN or array configuration is exceeded, the array will be locked
from any further I/O. This protection is designed to preserve the integrity of the local drive, but does
require manual intervention to recover or re-enable the LUN.
Although controller firmware is designed to protect against normal hard drive failure, it is imperative that
you perform the correct actions to recover from a hard drive failure without inadvertently introducing any
additional hard drive failures.
Included sections:
• Recognizing hard drive failure
• Compromised fault tolerance
• Enabling failed LUNs
• Best practices when replacing hard drives
• Automatic data recovery
Recognizing hard drive failure
LEDs on the front of each hard drive are visible from the front of the external storage unit. When a hard
drive is configured as a part of an array and attached to a powered-on controller, the status of the hard
drive can be determined from the illumination pattern of these LEDs.
For detailed descriptions of the various LED combinations, see Hard drive LEDs.
Other ways to determine that a hard drive has failed include the following:
• LEDs on the storage system chassis illuminate amber if failed hard drives are inside. (However,
  this LED also illuminates when other problems occur, such as when a fan fails, a redundant power
  supply fails, or the system overheats.)
• LEDs on the hard drives illuminate amber if a hard drive has failed or is a member of a faulted LUN.
• Front-panel LCD display messages list faulted LUNs and failed hard drives whenever the system is
  restarted, as long as the controller detects one or more good hard drives.
• ACU represents faulted LUNs and failed drives with distinctive icons.
• HP-SIM can detect failed hard drives.
• ADU lists all failed hard drives.
For more information on troubleshooting hard drive problems, see the HP ProLiant Servers Troubleshooting
Guide.
Effects of hard drive failure
When a hard drive fails, all logical drives that are in the same array are affected. Each logical drive in
an array may be using a different fault-tolerance method, so each logical drive can be affected differently.
• RAID 0 configurations cannot tolerate hard drive failure. If any physical hard drive in the array
  fails, all non-fault-tolerant (RAID 0) LUNs in the same array also are failed.
• RAID 1 and RAID 1+0 configurations can tolerate multiple hard drive failures, as long as none of
  the failed hard drives are mirrored to one another.
• RAID 5 configurations can tolerate one hard drive failure.
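
The tolerance rules above can be written out as a small check. The following Python sketch is illustrative only and is not part of any HP tool or the controller firmware. The function name, the drive identifiers, and the mirror_pairs argument are hypothetical, and only the RAID levels listed on this page are handled.

```python
from typing import Iterable, Set, Tuple


def logical_drive_survives(
    raid_level: str,
    failed_drives: Iterable[str],
    mirror_pairs: Iterable[Tuple[str, str]] = (),
) -> bool:
    """Return True if a logical drive stays available after the given failures.

    raid_level    -- "0", "1", "1+0", or "5" (the levels described on this page)
    failed_drives -- identifiers of failed physical drives in the array
    mirror_pairs  -- for RAID 1 / RAID 1+0, pairs of drives mirrored to one
                     another (hypothetical representation for this sketch)
    """
    failed: Set[str] = set(failed_drives)
    if not failed:
        return True  # no failures, nothing to tolerate
    if raid_level == "0":
        # RAID 0 cannot tolerate any hard drive failure.
        return False
    if raid_level in ("1", "1+0"):
        # Survives unless both drives of some mirrored pair have failed.
        return not any(a in failed and b in failed for a, b in mirror_pairs)
    if raid_level == "5":
        # RAID 5 tolerates exactly one failed drive.
        return len(failed) <= 1
    raise ValueError(f"RAID level not covered by this sketch: {raid_level!r}")


# Example: one array of four drives; d1/d2 and d3/d4 are assumed mirror pairs.
pairs = [("d1", "d2"), ("d3", "d4")]
one_failure = {"d1"}
raid0_ok = logical_drive_survives("0", one_failure)
raid10_ok = logical_drive_survives("1+0", one_failure, pairs)
raid5_ok = logical_drive_survives("5", one_failure)
```

With the single failure of d1, the RAID 0 logical drive fails while the RAID 1+0 and RAID 5 logical drives remain available, which is why logical drives in the same array can be affected differently.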