HP 381513-B21 HP Smart Array P800 Controller for ProLiant Servers User Guide - Page 29

Automatic data recovery (rebuild), Time required for a rebuild, Abnormal termination of a rebuild

Replacing, moving, or adding hard drives
Automatic data recovery (rebuild)
When you replace a hard drive in an array, the controller uses the fault-tolerance information on the
remaining drives in the array to reconstruct the missing data (the data that was originally on the replaced
drive) and write it to the replacement drive. This process is called automatic data recovery, or rebuild. If
fault tolerance is compromised, this data cannot be reconstructed and is likely to be permanently lost.
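
To make the reconstruction step concrete, the following minimal sketch shows how a missing block can be recomputed from XOR parity, the general principle behind single-parity levels such as RAID 5. It is an illustration only and does not represent the controller's firmware; the function name `xor_blocks` and the sample stripe are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column (one byte from each).
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Hypothetical stripe: three data blocks plus one parity block.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data)

# Simulate replacing the drive that held data[1]:
# the lost block is the XOR of every surviving block in the stripe.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

If a second block in the same stripe were also missing, the XOR of the survivors would no longer identify either lost block, which is why a single-parity array cannot be rebuilt once fault tolerance is compromised.
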
If another drive in the array fails while fault tolerance is unavailable during rebuild, a fatal system error
can occur, and all data on the array is then lost. In exceptional cases, however, failure of another drive
need not lead to a fatal system error. These exceptions include:
• Failure after activation of a spare drive
• Failure of a drive that is not mirrored to any other failed drives (in a RAID 1+0 configuration)
• Failure of a second drive in a RAID 6 (ADG) configuration
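
The RAID 1+0 exception above can be sketched as a simple check: the array survives as long as no mirrored pair has lost both of its drives. This is a hedged illustration that assumes two-drive mirrors; the function name `raid10_survives` and the bay labels are hypothetical and are not part of any HP tool.

```python
def raid10_survives(mirror_pairs, failed):
    """Return True if every mirrored pair still has at least one working drive."""
    failed = set(failed)
    # A pair is lost only when all of its members are in the failed set.
    return all(not set(pair) <= failed for pair in mirror_pairs)


# Hypothetical four-drive RAID 1+0 array: bay1 mirrors bay2, bay3 mirrors bay4.
pairs = [("bay1", "bay2"), ("bay3", "bay4")]

different_pairs = raid10_survives(pairs, {"bay1", "bay3"})  # failures in separate mirrors
same_pair = raid10_survives(pairs, {"bay1", "bay2"})        # both halves of one mirror
```

Here `different_pairs` is `True` and `same_pair` is `False`: two failures are tolerable only when the failed drives are not mirrored to each other.
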
Time required for a rebuild
The time required for a rebuild varies considerably, depending on several factors:
• The priority that the rebuild is given over normal I/O operations (you can change the priority setting
  by using ACU)
• The amount of I/O activity during the rebuild operation
• The rotational speed of the hard drives
• The availability of drive cache
• The brand, model, and age of the drives
• The amount of unused capacity on the drives
• For RAID 5 and RAID 6 (ADG), the number of drives in the array
Allow approximately 15 minutes per gigabyte for the rebuild process to be completed. This figure is
conservative; the actual time required is usually less than this.
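
As a worked example of this guideline, the sketch below converts a capacity into an upper-bound rebuild time. The 15 minutes per gigabyte figure comes from the text above; the function name `rebuild_estimate_hours` and the sample capacities are assumptions chosen for illustration, and the guide does not state exactly which capacity the figure applies to.

```python
MINUTES_PER_GB = 15  # conservative planning figure quoted in the guide


def rebuild_estimate_hours(capacity_gb, minutes_per_gb=MINUTES_PER_GB):
    """Return a conservative rebuild-time estimate in hours."""
    if capacity_gb < 0:
        raise ValueError("capacity must not be negative")
    return capacity_gb * minutes_per_gb / 60


# Hypothetical capacities to be rebuilt, in gigabytes.
small = rebuild_estimate_hours(72)    # 72 * 15 / 60 = 18.0 hours
large = rebuild_estimate_hours(146)   # 146 * 15 / 60 = 36.5 hours
```

Because the figure is conservative, treat the result as a planning ceiling rather than a prediction.
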
System performance is affected during the rebuild, and the system is unprotected against further drive
failure until the rebuild has finished. Therefore, replace drives during periods of low activity when
possible.
When automatic data recovery has finished, the Online/Activity LED of the replacement drive stops
blinking steadily at 1 Hz and begins to either glow steadily (if the drive is inactive) or flash irregularly (if
the drive is active).
CAUTION: If the Online/Activity LED on the replacement drive does not light up while the
corresponding LEDs on other drives in the array are active, the rebuild process has abnormally
terminated. The amber Fault LED of one or more drives might also be illuminated. Refer to
"Abnormal termination of a rebuild" (on page 29) to determine what action you must take.
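
The LED behavior described above can be summarized as a small decision function. This is only a sketch of the rules stated in the text, not an HP utility; the `Led` enum, the function name `rebuild_status`, and the returned strings are hypothetical.

```python
from enum import Enum


class Led(Enum):
    """Observed state of the replacement drive's Online/Activity LED."""
    OFF = "off"
    BLINK_1HZ = "blinking steadily at 1 Hz"
    STEADY = "glowing steadily"
    IRREGULAR = "flashing irregularly"


def rebuild_status(replacement_led, other_drives_active):
    """Map LED observations to the rebuild state described in the guide."""
    if replacement_led is Led.BLINK_1HZ:
        return "rebuilding"
    if replacement_led in (Led.STEADY, Led.IRREGULAR):
        # Steady means the drive is inactive; irregular means it is active.
        return "rebuild finished"
    # The LED is off. The guide only covers the case where other drives are active.
    if other_drives_active:
        return "rebuild abnormally terminated"
    return "undetermined"


status = rebuild_status(Led.OFF, other_drives_active=True)
```

The "undetermined" result covers an unlit LED while the other drives are idle, a combination the text does not address.
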
Abnormal termination of a rebuild
If the Online/Activity LED on the replacement drive permanently ceases to be illuminated even while other
drives in the array are active, the rebuild process has abnormally terminated. The following table
indicates the three possible causes of abnormal termination of a rebuild.