HP AE326A HP StorageWorks 1500 Modular Smart Array maintenance and service guide - Page 100

Automatic data recovery (rebuild), Time required for a rebuild


• When RAID 6 (ADG) is used, two drives can fail simultaneously (and be replaced simultaneously)
  without data loss.
• If the offline drive is a spare, the degraded drive can be replaced.
• Do not remove a failed second hard drive from an array until the first failed or missing hard drive has
  been replaced and the rebuild process is complete. (When the rebuild is complete, the online LED
  on the front of the hard drive stops blinking.)
  Exceptions:
  • In RAID 6 (ADG) configurations, any two drives in the array can be replaced simultaneously.
  • In RAID 1+0 configurations, any drives that are not mirrored to other removed or failed drives can be
    simultaneously replaced offline without data loss.
Automatic data recovery (rebuild)
When you replace a hard drive in an array, the controller uses the fault-tolerance information on the
remaining drives in the array to reconstruct the missing data (the data that was originally on the replaced
drive) and write it to the replacement drive. This process is called automatic data recovery, or rebuild. If
fault tolerance is compromised, this data cannot be reconstructed and is likely to be permanently lost.
If another hard drive in the array fails while fault tolerance is unavailable during rebuild, a fatal system
error may occur, and all data on the array is then lost. In exceptional cases, however, failure of another
drive need not lead to a fatal system error. These exceptions include:
• Failure after activation of a spare drive.
• Failure of a drive that is not mirrored to any other failed drives (in a RAID 1+0 configuration).
• Failure of a second drive in a RAID ADG configuration.
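The fault-tolerance rules above can be illustrated with a short Python sketch. It is not taken from the guide and does not reflect controller firmware. The function name, the drive numbering, and the `mirror_pairs` layout are hypothetical. The RAID 5 limit of one failed drive is general RAID behavior, not something this page states. The spare-drive exception is modeled only by leaving out of `failed` any drive whose data has already been rebuilt onto a spare.

```python
from typing import Iterable, Sequence, Tuple


def array_survives(raid_level: str,
                   failed: Iterable[int],
                   mirror_pairs: Sequence[Tuple[int, int]] = ()) -> bool:
    """Return True if the data is still reconstructible (illustrative only).

    raid_level   -- "ADG" (RAID 6), "1+0", or "5"
    failed       -- drive numbers currently failed or removed
    mirror_pairs -- for RAID 1+0 only: (drive, mirror) pairs; hypothetical layout
    """
    failed_set = set(failed)
    if raid_level == "ADG":
        # RAID 6 (ADG): any two drives may fail without data loss.
        return len(failed_set) <= 2
    if raid_level == "1+0":
        # Data is lost only if both members of some mirror pair have failed.
        return not any(a in failed_set and b in failed_set
                       for a, b in mirror_pairs)
    if raid_level == "5":
        # General RAID 5 behavior: a single drive failure is tolerated.
        return len(failed_set) <= 1
    raise ValueError(f"unsupported RAID level: {raid_level!r}")
```

For example, with the hypothetical pairs `[(0, 1), (2, 3)]`, losing drives 0 and 2 leaves a RAID 1+0 array intact, while losing drives 0 and 1 does not.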
Time required for a rebuild
The time required for a rebuild varies considerably, depending on several factors:
• The priority that the rebuild is given over normal I/O operations (you can change the priority setting
  through the Array Configuration Utility (ACU) or MSA Command Line Interface (MSA-CLI))
• The amount of I/O activity during the rebuild operation
• The rotational speed of the hard drives
• The availability of drive cache
• The brand, model, and age of the drives
• The amount of unused capacity on the drives
• The number of drives in the array (for RAID 5 and RAID ADG)
Allow approximately 15 minutes per gigabyte for the rebuild process to be completed. This figure is
conservative, and newer drive models usually require less time to rebuild.
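As a planning aid, the 15-minutes-per-gigabyte figure can be turned into a rough upper-bound estimate. The Python sketch below is only arithmetic on that figure. The function name and the example capacities are assumptions chosen for illustration, and actual rebuild times depend on the factors listed above.

```python
MINUTES_PER_GB = 15.0  # conservative planning figure quoted in the text


def estimate_rebuild_hours(capacity_gb: float,
                           minutes_per_gb: float = MINUTES_PER_GB) -> float:
    """Return a conservative rebuild-time estimate in hours."""
    if capacity_gb < 0 or minutes_per_gb < 0:
        raise ValueError("capacity and rate must be non-negative")
    return capacity_gb * minutes_per_gb / 60.0


# Hypothetical drive sizes, used only to show the scale of the estimate.
for size_gb in (72, 146):
    print(f"{size_gb} GB: about {estimate_rebuild_hours(size_gb):.1f} hours")
```

At this rate a 72 GB drive works out to about 18 hours and a 146 GB drive to about 36.5 hours, which is why scheduling the replacement for a low-activity period matters.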
System performance is affected during the rebuild, and the system is unprotected against further drive
failure until the rebuild has finished. Therefore, replace drives during periods of low activity when possible.
CAUTION:
If the Online LED of the replacement drive stops blinking and the amber Fault LED glows, or if other drive
LEDs in the array go out, the replacement drive has failed and is producing unrecoverable disk errors.
Remove and replace the failed replacement drive.
When automatic data recovery has finished, the Online LED of the replacement drive stops blinking
and begins to glow steadily.
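The LED indications described above can be summarized as a small decision function. This Python sketch is a reading aid for an operator checklist, not an interface to the storage system. The function name, the state strings, and the "indeterminate" fallback for combinations the text does not cover are all assumptions.

```python
def classify_replacement_drive(online_led: str,
                               fault_led_amber: bool,
                               other_leds_out: bool = False) -> str:
    """Interpret the LEDs of a replacement drive (illustrative only).

    online_led      -- "blinking", "steady", or "off"
    fault_led_amber -- True if the amber Fault LED glows
    other_leds_out  -- True if other drive LEDs in the array have gone out
    """
    if online_led not in ("blinking", "steady", "off"):
        raise ValueError(f"unknown Online LED state: {online_led!r}")
    # Failure case from the CAUTION: Online LED no longer blinking with the
    # amber Fault LED lit, or other drive LEDs in the array going out.
    if other_leds_out or (online_led != "blinking" and fault_led_amber):
        return "replacement failed"
    if online_led == "blinking" and not fault_led_amber:
        return "rebuild in progress"
    if online_led == "steady" and not fault_led_amber:
        return "rebuild complete"
    # Combinations the text does not describe are left undecided.
    return "indeterminate"
```

A result of "replacement failed" corresponds to the instruction to remove and replace the failed replacement drive.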
If the automatic data recovery (ADR) process aborts, restart the storage system and allow ADR to begin
again. If ADR fails again, back up all data on the system, run a surface analysis (using your diagnostics
utility), and restore the data from backup.
100
Hard drive failures and faulted LUNs