HP P4000 HP Smart Array SAS controllers for Integrity servers support guide - Page 113




Factors to consider before replacing physical disks
Before replacing a degraded disk:
• Confirm that the array has a current, valid backup.
• Confirm that the replacement disk is of the same type (SAS or SATA) as the degraded disk.
• Use replacement disks that have a capacity at least as great as that of the smallest disk in
  the array. The controller immediately fails disks that have insufficient capacity.
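
The type and capacity checks above can be expressed as a small pre-flight test. This is a minimal sketch, not an HP tool: the `Disk` record, its field names, and the `replacement_problems` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    """Hypothetical description of a physical disk (not an HP data structure)."""
    interface: str    # "SAS" or "SATA"
    capacity_gb: int


def replacement_problems(replacement, degraded, array_disks):
    """Return the reasons a replacement disk is unsuitable (empty list if none)."""
    problems = []
    # The replacement must be the same type (SAS or SATA) as the degraded disk.
    if replacement.interface != degraded.interface:
        problems.append("interface type differs from the degraded disk")
    # The replacement must be at least as large as the smallest disk in the array.
    smallest = min(d.capacity_gb for d in array_disks)
    if replacement.capacity_gb < smallest:
        problems.append("capacity is below the smallest disk in the array")
    return problems


array = [Disk("SAS", 300), Disk("SAS", 146), Disk("SAS", 300)]
ok = replacement_problems(Disk("SAS", 146), array[0], array)
bad = replacement_problems(Disk("SATA", 100), array[0], array)
```

Here `ok` is empty because the candidate matches the interface type and is no smaller than the smallest member of the array, while `bad` lists both problems.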
CAUTION: A disk that was previously failed by the controller can seem to be operational after
the system is power cycled, or (for a hot-pluggable disk) if a disk is removed and reinserted.
However, continued use of the disk can result in data loss. Replace the disk as soon as possible.
IMPORTANT: In systems that use external data storage, be sure that the server is the first unit
to be powered off and the last to be powered on. Taking this precaution ensures that the system
does not erroneously mark the drives as failed when the server is powered on.
To minimize the likelihood of fatal system errors, take these precautions when removing failed
disks:
• Do not remove a degraded disk if another disk in the array is offline (the Online/Activity
  LED is off). In this situation, no other disk in the array can be removed without data loss.
  The following cases are exceptions:
  • When RAID 1+0 is used, disks are mirrored in pairs. Several disks can be in a failed
    condition simultaneously (and they can all be replaced simultaneously) without data
    loss, as long as no two failed disks belong to the same mirrored pair.
  • When RAID 6 (ADG) is used, two disks can fail simultaneously (and be replaced
    simultaneously) without data loss.
  • If the offline disk is a spare, the degraded disk can be replaced.
• Do not remove a second disk from an array until the first failed or missing disk is replaced
  and the rebuild process is complete. (The rebuild is complete when the Online/Activity LED
  on the front of the drive stops flashing.) The following cases are exceptions:
  • In RAID 1+0 configurations, any disks that are not mirrored to other removed or failed
    disks can be simultaneously replaced offline without data loss.
  • In RAID 50 configurations, disks are arranged in parity groups. You can replace several
    disks simultaneously, if the disks belong to different parity groups. Do not replace more
    than one disk at a time from the same parity group.
  • In RAID 6 (ADG) configurations, any two disks in the array can be replaced
    simultaneously.
  • In RAID 60 configurations, disks are arranged in parity groups. You can replace several
    disks simultaneously, if no more than two of the disks being replaced belong to the
    same parity group. Do not replace more than two disks at a time from the same parity
    group.
• Replacement disks must have a capacity no less than that of the smallest disk in the array.
  Disks with insufficient capacity are failed immediately by the controller, before data recovery
  begins.
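
The simultaneous-replacement rules above reduce to a limit on how many disks may be out of one redundancy group at the same time. The following sketch encodes only the limits stated in this section. The level labels, the group identifiers, and the `can_replace_together` function are assumptions made for illustration and do not correspond to any controller interface.

```python
from collections import Counter

# Maximum number of disks that may be missing from one redundancy group at once,
# taken from the rules in this section. The string labels are chosen for this sketch.
MAX_PER_GROUP = {"RAID 1+0": 1, "RAID 50": 1, "RAID 6": 2, "RAID 60": 2}


def can_replace_together(raid_level, group_ids):
    """Check whether a set of disks can be out of the array at the same time.

    group_ids holds one entry per failed, removed, or to-be-removed disk:
    the mirrored pair (RAID 1+0) or parity group (RAID 50/60) it belongs to.
    """
    group_ids = list(group_ids)
    limit = MAX_PER_GROUP.get(raid_level)
    if limit is None:
        # General rule: replace one disk and wait for the rebuild to finish.
        return len(group_ids) <= 1
    if raid_level == "RAID 6":
        # The whole array is a single group, so only the total count matters.
        return len(group_ids) <= limit
    counts = Counter(group_ids)
    return all(n <= limit for n in counts.values())


mirrors_ok = can_replace_together("RAID 1+0", ["pair0", "pair1", "pair2"])
mirrors_bad = can_replace_together("RAID 1+0", ["pair0", "pair0"])
raid60_ok = can_replace_together("RAID 60", ["pg0", "pg0", "pg1", "pg1"])
```

Any level not listed in the table falls back to the general rule of one disk at a time, which is the conservative reading of the text.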
Automatic data recovery (rebuild)
When a physical disk is replaced, the controller gathers fault tolerance data from the remaining
disks in the array. This data is then used to rebuild the missing data from the failed disk onto
the replacement disk.
The rebuild operation takes several hours, even if the system is not busy while the rebuild is in
progress. System performance and fault tolerance are affected until the rebuild finishes. Therefore,
replace disks during low activity periods when possible. In addition, be sure that all logical
drives on the same array as the disk being replaced have a current, valid backup.
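
This section does not describe how the controller computes the rebuilt data. As an illustration only, the sketch below shows the classic single-parity (XOR) scheme used by RAID 5 style layouts, where a lost block equals the XOR of the surviving data blocks and the parity block. The block contents and the `xor_blocks` helper are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block derived from them.
data = [b"disk0---", b"disk1---", b"disk2---"]
parity = xor_blocks(data)

# Simulate losing disk 1 and rebuilding its block from the survivors and parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Every surviving disk must be read in full to reconstruct the missing one, which is consistent with the rebuild taking hours and affecting performance while it runs.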