IBM 86655RY Hardware Maintenance Manual - Page 216

Device Event Table, Using the IPSSEND program, Backplanes, Hot-Swap Drive Trays



Hardware Maintenance Manual: Netfinity 7600 - Type 8665 Models 1RY, 2RY (page 206)
• Analyzes data from periodic internal measurements
• Recommends replacement when specific thresholds are exceeded
The data from periodic internal measurements is collected when data sectors are
accessed. Data scrubbing performs the following operations:
• Forces all data sectors to be read
• Provides more data to improve the accuracy of PFA
The thresholds have been determined by examining the history logs of drives
that have failed in actual customer operation. When PFA detects a
threshold-exceeded failure, the system administrator can be notified through
Netfinity Director. The design goal of PFA is to provide a minimum of 24 hours'
warning before a drive experiences "catastrophic" failure.
2. A cable breaking, a component burning out, and a solder connection failing
are all examples of "on/off" unpredictable catastrophic failures. As assembly
and component processes have improved, these types of defects have been reduced
but not eliminated. PFA cannot always provide warning for on/off unpredictable
failures.
Device Event Table: This table contains counters indicating the number of times
unexpected events were reported through the storage subsystem. These events may
be caused by several sources, including:
• ServeRAID controller
• Cables (external and internal)
• Connectors
• Hot-Swap Backplane(s)
• Hot-Swap Drive Trays
• Target Devices (Disk Drives, CD-ROMs, etc.)
• SCSI Terminators
The Device Event Table can be displayed using the IPSSEND program or the .
Using the IPSSEND program
Note: In the following command, replace <controller> with the ServeRAID
controller number.
At a command prompt, type the following:
    ipssend getevent <controller> device
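
The command above can also be issued from a script. The following is a minimal Python sketch, not part of the manual: the function names `build_getevent_command` and `read_device_event_table` are hypothetical, the sketch assumes the `ipssend` executable is on the search path, and it returns the output as raw text because this page does not describe the output layout.

```python
import subprocess


def build_getevent_command(controller: int, executable: str = "ipssend") -> list[str]:
    """Return the argument list for `ipssend getevent <controller> device`."""
    # Reject anything that is not a plain non-negative integer (bool is excluded
    # explicitly because it is a subclass of int).
    if isinstance(controller, bool) or not isinstance(controller, int) or controller < 0:
        raise ValueError("controller must be a non-negative integer")
    return [executable, "getevent", str(controller), "device"]


def read_device_event_table(controller: int) -> str:
    """Run the command and return its raw text output, unparsed.

    Assumption: `ipssend` is installed and on PATH. The output format is not
    documented on this page, so no parsing is attempted here.
    """
    result = subprocess.run(
        build_getevent_command(controller),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Passing the arguments as a list rather than as a single shell string avoids quoting problems if the controller number comes from user input.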
Frequently asked questions regarding the Device Event Table:
In the Device Event Table, what are hard events?
The hard event count entry in the Device Event Table is a count of events
detected by the SCSI I/O processor since the Device Event Table was last cleared.
These events are usually not caused by the target device. The controller
processor can detect many types of events. Usually these events are related to
SCSI cabling, backplanes, or internal problems in the ServeRAID controller.
Hard events are usually not related to the hard drives or other SCSI devices
that are on the bus.
How should hard events be handled?
If you find a hard event entered into the Event log, first check whether there
is a discernible pattern to the events in the Device Event Table. For example,
a large number of events on a particular drive or channel may indicate a problem
with the cabling or backplane for that drive or channel. Always check that
cables are properly seated and properly terminated, and look for bent pins,
pushed pins, and damaged cables. Before replacing the ServeRAID controller,
replace the SCSI cables, followed by the backplane. Replace the ServeRAID
controller only after you have exhausted all other possibilities. Remember that
the ServeRAID card is the least likely item in the subsystem to cause hard
events and the most expensive to replace.
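
The pattern check described above can be sketched in a few lines of Python. This is an illustration only: the manual gives no numeric rule, so the 50 percent cutoff (`min_share`), the function names, and the `(channel, scsi_id)` dictionary layout are all assumptions, and the sample counts are invented for the example.

```python
from collections import Counter


def events_per_channel(hard_events: dict[tuple[int, int], int]) -> Counter:
    """Sum hard-event counts per channel; keys are (channel, scsi_id) pairs."""
    totals = Counter()
    for (channel, _scsi_id), count in hard_events.items():
        totals[channel] += count
    return totals


def suspect_channels(hard_events: dict[tuple[int, int], int],
                     min_share: float = 0.5) -> list[int]:
    """Return channels holding at least `min_share` of all hard events.

    The cutoff is an arbitrary assumption for illustration, not an IBM rule.
    """
    totals = events_per_channel(hard_events)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return sorted(ch for ch, n in totals.items() if n / grand_total >= min_share)


# Replacement order taken from the text: cables first, controller last.
REPLACEMENT_ORDER = ["SCSI cables", "backplane", "ServeRAID controller"]

# Hypothetical counts, as if transcribed by hand from a Device Event Table.
sample = {(1, 0): 12, (1, 1): 9, (2, 0): 1, (2, 3): 0}
print(suspect_channels(sample))  # prints [1]
```

A channel flagged this way is only a starting point for the physical checks (seating, pins, termination) and the replacement order given in the text.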
In the Device Event Table, what is the meaning of soft events?
The soft event entry in the Device Event Table is a count of the SCSI check
conditions (other than unit attention)