IBM 86655RY Hardware Maintenance Manual - Page 194

Recovery procedures for defunct (DDD) drives, Using and understanding the ServeRAID Monitor Log


184 Hardware Maintenance Manual: Netfinity 7600 - Type 8665 Models 1RY, 2RY
1. If the ISPR code is EF10 after disconnecting the cables, follow the steps below until the error is eliminated:
   a. Identify which channel is causing the error by reconnecting the cables one at a time and rebooting until the error returns.
   b. Check the termination of the channel identified in step a.
      Note: Refer to the HMM (Hardware Maintenance Manual) specific to the system comprising the SCSI channel for termination details.
   c. Disconnect the drives attached to the channel identified in step a one at a time, rebooting each time, to determine which drive is causing the problem.
   d. Replace the SCSI cable attached to the channel identified in step a.
   e. Replace the backplane attached to the channel identified in step a.
2. If the original ISPR code is still present after disconnecting all SCSI cables and rebooting, perform the following actions until the error is no longer present:
   • Reseat the controller
   • Replace the controller

Recovery procedures for defunct (DDD) drives

This section includes information on the following:
• "Drive replacement (rebuilding a defunct drive)"
• "Software and physical replacement" on page 187
• "Using and understanding the ServeRAID Monitor Log" on page 188
• "Recovery from ServeRAID controller failure" on page 189
• "Recovery procedures" on page 189

Note: The following information applies only to drives that are part of the same array.

Drive replacement (rebuilding a defunct drive)

A hard disk drive goes defunct when there is a loss of communication between the controller and the hard disk drive. This can be caused by any of the following:
• An improperly connected cable, hard disk drive, or controller
• A loss of power to a drive
• A defective cable, backplane, hard disk drive, or controller
• A defective drive

In each case, the communication problem needs to be resolved, and then a Rebuild
operation is required to reconstruct the data for the device in its disk array. The
ServeRAID controllers can reconstruct redundant arrays, but they cannot reconstruct
data stored in non-redundant arrays. See "Reference information" on page 192 for
more information.
To prevent data-integrity problems, the ServeRAID controllers set the non-redundant
logical drives to Blocked during a Rebuild operation. After the Rebuild operation
completes, you can unblock the non-redundant logical drives and access them once
again. Remember, however, that the logical drive might contain damaged data.
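
The Blocked behavior described above can be modeled as a small state sketch. The class and field names here are hypothetical and are not part of any ServeRAID interface. The `redundant` flag stands in for whatever RAID level the logical drive uses, and refusing to unblock while a rebuild is running is an assumption of this model, not a documented rule.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LogicalDrive:
    name: str
    redundant: bool               # assumption: supplied by the caller
    blocked: bool = False
    may_be_damaged: bool = False  # set once a blocked drive is unblocked


@dataclass
class Array:
    logical_drives: List[LogicalDrive] = field(default_factory=list)
    rebuilding: bool = False

    def start_rebuild(self) -> None:
        """Non-redundant logical drives are set to Blocked during a rebuild."""
        self.rebuilding = True
        for drive in self.logical_drives:
            if not drive.redundant:
                drive.blocked = True

    def finish_rebuild(self) -> None:
        """Blocked drives stay blocked until someone unblocks them."""
        self.rebuilding = False

    def unblock(self, drive: LogicalDrive) -> None:
        """Unblock after the rebuild; the data might still be damaged."""
        if self.rebuilding:
            raise RuntimeError("rebuild still in progress")
        if drive.blocked:
            drive.blocked = False
            drive.may_be_damaged = True


if __name__ == "__main__":
    array = Array([LogicalDrive("LD1", redundant=True),
                   LogicalDrive("LD2", redundant=False)])
    array.start_rebuild()
    array.finish_rebuild()
    array.unblock(array.logical_drives[1])
    print(array.logical_drives)
```

The `may_be_damaged` flag reflects the caution above: unblocking restores access, but it does not restore data that could not be reconstructed.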