HP 273914-B21 Smart Array 6400 Series Controllers for Integrity Servers User G - Page 25

Factors to consider before replacing hard drives, Automatic data recovery (rebuild)



Replacing, moving, or adding hard drives
automatically (as indicated by the blinking Online LED on the replacement drive) if the array is in a
fault-tolerant configuration.
If you replace a drive belonging to a fault-tolerant configuration while the system power is off, a POST
message is displayed when the system is next powered up. This message prompts you to press the F1 key
to start automatic data recovery. If you do not enable automatic data recovery, the logical volume
remains in a ready-to-recover condition and the same POST message is displayed whenever the system is
restarted.
Factors to consider before replacing hard drives
• In systems that use external data storage, be sure that the server is the first unit to be powered down
  and the last to be powered back up. Taking this precaution ensures that the system does not
  erroneously mark the drives as failed when the server is powered up.
• If you set the SCSI ID jumpers manually:
  • Check the ID value of the removed drive to be sure that it corresponds to the ID of the drive
    marked as failed.
  • Set the same ID value on the replacement drive to prevent SCSI ID conflicts.
Before replacing a degraded drive:
• Open Systems Insight Manager and inspect the Error Counter window for each physical drive in the
  same array to confirm that no other drives have any errors. (For details, refer to the Systems Insight
  Manager documentation on the Management CD.)
• Be sure that the array has a current, valid backup.
• Use replacement drives that have a capacity at least as great as that of the smallest drive in the
  array. The controller immediately fails drives that have insufficient capacity.
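The checks above can be sketched as a small pre-replacement checklist. This is only an illustration: the function name and its parameters are hypothetical, and the values are assumed to be read by hand from Systems Insight Manager and your backup records, not fetched through any HP API.

```python
def pre_replacement_problems(other_drive_error_counts, has_current_backup,
                             array_drive_sizes_gb, replacement_size_gb):
    """Return a list of reasons not to replace the degraded drive yet.

    All inputs are hypothetical, manually collected values:
    - other_drive_error_counts: error counts of the other drives in the array
    - has_current_backup: True if a current, valid backup exists
    - array_drive_sizes_gb: capacities of the drives already in the array
    - replacement_size_gb: capacity of the replacement drive
    """
    if not array_drive_sizes_gb:
        raise ValueError("the array must contain at least one drive")

    problems = []
    if any(count > 0 for count in other_drive_error_counts):
        problems.append("another drive in the array reports errors")
    if not has_current_backup:
        problems.append("no current, valid backup")
    # The controller fails drives smaller than the smallest drive in the array.
    if replacement_size_gb < min(array_drive_sizes_gb):
        problems.append("replacement drive is smaller than the smallest drive in the array")
    return problems


# Example: a 36.4 GB replacement for an array of 72.8 GB drives.
for problem in pre_replacement_problems([0, 0], True, [72.8, 72.8, 72.8], 36.4):
    print(problem)
```

An empty list means that none of the three checks found a reason to delay the replacement.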
To minimize the likelihood of fatal system errors, take these precautions when removing failed drives:
• Do not remove a degraded drive if any other drive in the array is offline (the Online LED is off). In
  this situation, no other drive in the array can be removed without data loss.
  Exceptions:
  • When RAID 1+0 is used, drives are mirrored in pairs. Several drives can be in a failed
    condition simultaneously (and they can all be replaced simultaneously) without data loss, as
    long as no two failed drives belong to the same mirrored pair.
  • When RAID ADG is used, two drives can fail simultaneously (and be replaced simultaneously)
    without data loss.
  • If the offline drive is a spare, the degraded drive can be replaced.
• Do not remove a second drive from an array until the first failed or missing drive has been replaced
  and the rebuild process is complete. (The rebuild is complete when the Online LED on the front of
  the drive stops blinking.)
  These cases are the exceptions:
  • In RAID ADG configurations, any two drives in the array can be replaced simultaneously.
  • In RAID 1+0 configurations, any drives that are not mirrored to other removed or failed drives
    can be simultaneously replaced offline without data loss.
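The removal rules above can be expressed as a short decision function. This is a minimal sketch, not controller logic: the function name, the drive identifiers, and the mirror-pair list are hypothetical, spares are assumed to be excluded from the input, and any fault-tolerant level other than RAID 1+0 or RAID ADG is assumed to tolerate only one unavailable drive.

```python
def removal_is_safe(raid_level, unavailable_drives, mirror_pairs=()):
    """Return True if the array can survive with these drives unavailable.

    unavailable_drives: ids of drives that are failed, offline, or removed,
                        including the one you plan to remove (spares excluded).
    mirror_pairs:       (drive_a, drive_b) tuples; required for RAID 1+0.
    """
    unavailable = set(unavailable_drives)
    if raid_level == "1+0":
        if not mirror_pairs:
            raise ValueError("RAID 1+0 needs the list of mirrored pairs")
        # Safe as long as no mirrored pair has lost both of its members.
        return not any(a in unavailable and b in unavailable
                       for a, b in mirror_pairs)
    if raid_level == "ADG":
        # RAID ADG tolerates two simultaneous drive failures.
        return len(unavailable) <= 2
    # Assumption: every other fault-tolerant level tolerates one drive only.
    return len(unavailable) <= 1


pairs = [(0, 1), (2, 3)]
print(removal_is_safe("1+0", {0, 2}, pairs))  # drives from different pairs
print(removal_is_safe("1+0", {0, 1}, pairs))  # both drives of one pair
print(removal_is_safe("ADG", {0, 1}))
```

A result of False corresponds to the situation the manual warns about: removing the drive would cause data loss.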
Automatic data recovery (rebuild)
When you replace a hard drive in an array, the controller uses the fault-tolerance information on the
remaining drives in the array to reconstruct the missing data (the data that was originally on the replaced
drive) and write it to the replacement drive. This process is called automatic data recovery, or rebuild. If
fault tolerance is compromised, this data cannot be reconstructed and is likely to be permanently lost.
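To illustrate how missing data can be reconstructed from the remaining drives, here is a minimal sketch that assumes a single-parity scheme in the style of RAID 5, where the parity block is the XOR of the data blocks in a stripe. The manual does not describe the controller's actual algorithm, so the function name and the sample data are purely illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread across three data drives (hypothetical contents).
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_blocks(data)  # stored on a fourth drive

# The drive holding data[1] is replaced; rebuild its block from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

If a second block of the same stripe were also missing, the XOR of the survivors would no longer yield either lost block, which is what compromised fault tolerance means for a single-parity layout.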