Dell EqualLogic PS6210XS EqualLogic Group Manager Administrator's Guide PS Ser - Page 333

Common Hardware Issues, Monitor PS Series group



• Are users getting the response time they expect? If not, identify which area might be causing the problem:
  - Operating system problem, as it interacts with storage
  - Network problem
  - Application being run or accessed
  - Storage environment
• Use the 80/20 rule. By focusing on 20 percent of the most likely causes of a performance issue, you will solve 80 percent of the problems.
• Keep the host perspective in mind when managing the arrays. In particular:
  - Make sure time is synchronized on all monitoring areas (host, array).
  - Never assume the cause of a problem. Other issues might be causing skewed data.
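
As an illustration of the 80/20 rule, the following minimal sketch ranks problem areas by how often they caused past incidents and returns the smallest set of top causes that accounts for about 80 percent of them. The function name `top_causes` and the incident counts are hypothetical and are not taken from the guide; real counts would come from your own incident records.

```python
from collections import Counter

def top_causes(incident_causes, coverage=0.8):
    """Return the most frequent causes that together cover `coverage` of incidents."""
    counts = Counter(incident_causes)
    total = sum(counts.values())
    selected, covered = [], 0
    # most_common() yields causes from most to least frequent
    for cause, n in counts.most_common():
        if covered / total >= coverage:
            break
        selected.append(cause)
        covered += n
    return selected

# Hypothetical incident history, labeled with the four areas listed above
history = (
    ["network"] * 12
    + ["storage"] * 5
    + ["operating system"] * 2
    + ["application"] * 1
)
print(top_causes(history))
```

With this made-up history, two of the four areas account for 17 of 20 incidents, so they are the ones to investigate first.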
General best practices for solving performance problems include:
• Fix hardware problems immediately, even if the array that is down is the redundant array. Replace failed disks or failed control modules.
• Always validate the data you are analyzing to ensure it is accurate. For example, you might need to inspect the physical array to see if it is powered on or check if your switches are connected properly to handle MPIO.
• Consider the size of your installation when analyzing your storage data. For example, an enterprise-level installation with a large volume of users and data will experience a greater impact from a small degradation of IOPS than a small company with few users.
• When setting up email notification, determine what types of information are most useful. Streamline the information you receive as much as possible. If email notifications are set too broadly, an actual problem might be obscured by too much information.
• When using SAN Headquarters to monitor your groups, remember that the historical data collected degrades over time. Use more current data for your analysis.
• Look at event and audit logs for other issues.
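
The point about installation size can be made concrete with a short calculation. The sketch below compares the absolute IOPS lost for the same percentage degradation at two sites. The helper `iops_lost` and both baseline figures are hypothetical, chosen only for illustration; substitute the values reported by your own monitoring.

```python
def iops_lost(baseline_iops, degradation_pct):
    """Absolute IOPS lost for a given percentage degradation."""
    return baseline_iops * degradation_pct / 100

# Hypothetical baselines; real values would come from your own monitoring data
enterprise = iops_lost(50_000, 5)
small_site = iops_lost(2_000, 5)
print(f"enterprise: {enterprise:.0f} IOPS lost, small site: {small_site:.0f} IOPS lost")
```

Under these assumed baselines, the same 5 percent degradation costs the larger site 25 times as many I/O operations per second.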
Common Hardware Issues

Identifying hardware performance issues can eliminate additional effort elsewhere. Hardware failures can also be a source of performance problems. In addition, the combination of hardware and firmware can affect performance, as can various disk types with different performance characteristics.

The basic steps for solving any IT problem also apply to the SAN. Table 62. Hardware Issues Affecting SAN Performance lists some common problems that you should watch for and correct immediately.
Table 62. Hardware Issues Affecting SAN Performance

Damaged Hardware                 | Typical Symptom                     | Detected By                                       | Possible Corrective Actions
---------------------------------+-------------------------------------+---------------------------------------------------+------------------------------------------------------------------
Server NIC                       | Malformed packets                   | Monitor errors at switch                          | Update NIC drivers; Replace NIC
Bad cable; Wrong class of cables | Visible damage; Malformed packets   | Visual inspection; Monitor errors at switch       | Replace cable
Defective switch                 | Spontaneous restarts; Random lockup | Monitor switch with appropriate network           | Update switch firmware; Replace switch
Defective array hardware         | Alerts                              | Monitor PS Series group; Monitor SAN Headquarters | Contact Dell customer support to replace malfunctioning component
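
For scripted triage, the rows of Table 62 can be held as a small lookup structure. The sketch below is one possible encoding, not a Dell-provided tool: the names `TABLE_62` and `candidates` are hypothetical, while the symptoms and corrective actions are copied from the table.

```python
# Table 62 encoded as data: damaged hardware -> symptoms and corrective actions
TABLE_62 = {
    "Server NIC": {
        "symptoms": {"Malformed packets"},
        "actions": ["Update NIC drivers", "Replace NIC"],
    },
    "Bad cable / wrong class of cables": {
        "symptoms": {"Visible damage", "Malformed packets"},
        "actions": ["Replace cable"],
    },
    "Defective switch": {
        "symptoms": {"Spontaneous restarts", "Random lockup"},
        "actions": ["Update switch firmware", "Replace switch"],
    },
    "Defective array hardware": {
        "symptoms": {"Alerts"},
        "actions": ["Contact Dell customer support to replace malfunctioning component"],
    },
}

def candidates(symptom):
    """Return the Table 62 hardware entries whose symptoms include `symptom`."""
    return [hw for hw, row in TABLE_62.items() if symptom in row["symptoms"]]

# Malformed packets can point at either the server NIC or the cabling
for hw in candidates("Malformed packets"):
    print(hw, "->", TABLE_62[hw]["actions"])
```

A symptom that maps to more than one row, such as malformed packets, is a reminder not to assume the cause: check each candidate before replacing hardware.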