HP 226593-B21 Smart Array 5i Plus Controller and Battery Backed Write Cache En - Page 97

Compromised Fault Tolerance, Procedure to Attempt Recovery

Compaq Smart Array 5i Plus Controller and Battery Backed Write Cache User Guide D-5


Hard Drive Installation and Replacement
Compromised Fault Tolerance

Compromised fault tolerance commonly occurs when more physical drives have
failed than the fault-tolerance method can endure. In this case, the logical volume
fails and unrecoverable disk error messages are returned to the host. Data loss is
likely to occur.

For example, one drive in an array may fail while another drive in the same array
is still being rebuilt. If the array has no online spare, any logical drives on the
array that are configured with RAID 5 fault tolerance will fail.
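
The condition described above can be sketched as a simple count. This is a simplified illustration, not the controller's firmware logic: the function name, the table of tolerated failures, and the choice to count a rebuilding drive as unavailable are assumptions made for the example, and online spares are not modeled.

```python
# Simplified model (assumption): guaranteed number of simultaneous drive
# failures that each RAID level survives. Not taken from controller firmware.
TOLERATED_FAILURES = {
    "RAID 0": 0,    # no fault tolerance
    "RAID 1": 1,    # two-drive mirror
    "RAID 1+0": 1,  # guaranteed minimum; more if failures hit different mirrored pairs
    "RAID 5": 1,    # single distributed parity
}


def fault_tolerance_compromised(raid_level, failed, rebuilding=0):
    """Return True if more drives are unavailable than the RAID level can endure.

    A drive that is still being rebuilt is counted as unavailable, because it
    does not yet hold a complete copy of its data.
    """
    unavailable = failed + rebuilding
    return unavailable > TOLERATED_FAILURES[raid_level]


# The example from the text: one drive fails while another is still rebuilding.
print(fault_tolerance_compromised("RAID 5", failed=1, rebuilding=1))
```

Under this model, a RAID 5 array with a single failed drive and no rebuild in progress is degraded but not compromised.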

Compromised fault tolerance may also be caused by non-drive problems, such as
temporary power loss to a storage system or a faulty cable. In such cases, the physical
drives do not need to be replaced. However, data may still have been lost, especially
if the system was busy at the time that the problem occurred.

Procedure to Attempt Recovery

When fault tolerance has been compromised, inserting replacement drives does not
improve the condition of the logical volume. Instead, if your screen displays
unrecoverable disk error messages, try the following procedure to recover data.

1. Turn the entire system off and then back on. In some cases, a marginal drive will
   work again for long enough to allow you to make copies of important files.
2. If a 1779 POST message is displayed, press the F2 key to re-enable the logical
   volumes. Remember that data loss has probably occurred and any data on the
   logical volume is suspect.
3. Make copies of important data, if possible.
4. Replace any failed drives.
5. After the failed drives have been replaced, the fault tolerance may again be
   compromised. If so, cycle the power again. If the 1779 POST message is
   displayed, press the F2 key to re-enable the logical drives, recreate your
   partitions, and restore all data from backup.
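
The order of the steps above can be summarized as a small checklist generator. This is only a sketch for illustration: the function name and its two parameters are hypothetical, and it does nothing more than list the manual actions for the current pass through the procedure. It does not interact with the controller.

```python
def recovery_actions(post_1779_shown, drives_replaced):
    """Return the manual recovery steps for the current pass, in order.

    post_1779_shown: True if the 1779 POST message appears after the power cycle.
    drives_replaced: True if the failed drives have already been replaced
                     and fault tolerance is compromised again.
    """
    # Both passes start with a power cycle of the entire system.
    actions = ["Turn the entire system off and then back on"]
    if not drives_replaced:
        # First pass: try to rescue data before touching the hardware.
        if post_1779_shown:
            actions.append("Press F2 to re-enable the logical volumes")
        actions.append("Make copies of important data, if possible")
        actions.append("Replace any failed drives")
    elif post_1779_shown:
        # Second pass: the data is treated as lost and rebuilt from backup.
        actions.append("Press F2 to re-enable the logical drives")
        actions.append("Recreate partitions")
        actions.append("Restore all data from backup")
    return actions


for step in recovery_actions(post_1779_shown=True, drives_replaced=False):
    print(step)
```

Separating the two passes reflects the point made in the procedure: data is copied off before any drive is replaced, and a restore from backup is only the last resort after replacement.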

To minimize the risk of data loss due to compromised fault tolerance, make frequent
backups of all logical volumes.