Dell EqualLogic PS6210XS EqualLogic Group Manager Administrator's Guide PS Ser - Page 334

About Analyzing SAN, Average I/O Latency



Damaged Hardware
Typical Symptom
Detected By
Possible Corrective Actions
Set up email alerts on group and SAN Headquarters
As a best practice, use the SAN Headquarters GUI to help identify hardware-related issues. SAN Headquarters tracks the array model, service tag, serial number, RAID status and policy, and firmware version. In particular, SAN Headquarters provides information about:
• Hardware alerts
  The SAN Headquarters Alerts panel shows hardware problems that might affect performance, such as a failed disk or a network connection that is not Gigabit Ethernet.
• Network retransmissions
  A sustained high TCP retransmit rate (greater than 1 percent) might indicate a network hardware failure, insufficient server resources, or insufficient network bandwidth.
• RAID status
  A degraded, reconstructing, or verifying RAID set might adversely affect performance. In some cases, performance might return to normal when an operation completes.
• Low pool capacity
  Make sure free space in each pool does not fall below the following level (whichever is smaller):
  - 5 percent of pool capacity
  - 100GB times the number of pool members
  Otherwise, load-balancing, member-removal, and replication operations do not perform optimally. Low free space also negatively affects the performance of thin-provisioned volumes.
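The free-space and retransmission thresholds above can be expressed as simple checks. The following is a minimal Python sketch, not part of Group Manager or SAN Headquarters. The function names and inputs (capacity and free space in GB, segment counters) are hypothetical and would have to come from your own monitoring data. It treats the retransmit rate as retransmitted segments divided by segments sent, which is an assumption, and it evaluates a single sample, so deciding whether a high rate is sustained is left to the caller.

```python
def min_free_space_gb(pool_capacity_gb: float, member_count: int) -> float:
    """Return the free-space floor for a pool, in GB.

    The floor is the smaller of 5 percent of pool capacity and
    100GB times the number of pool members.
    """
    five_percent = pool_capacity_gb * 5 / 100
    per_member = 100 * member_count
    return min(five_percent, per_member)


def pool_is_low(free_gb: float, pool_capacity_gb: float, member_count: int) -> bool:
    """True if free space has fallen below the floor for this pool."""
    return free_gb < min_free_space_gb(pool_capacity_gb, member_count)


def retransmit_rate_is_high(retransmitted_segments: int, sent_segments: int) -> bool:
    """True if one sample shows a TCP retransmit rate greater than 1 percent.

    Assumption: rate = retransmitted segments / segments sent.
    """
    if sent_segments == 0:
        return False  # no traffic in this sample, nothing to judge
    return retransmitted_segments / sent_segments > 0.01


if __name__ == "__main__":
    # Hypothetical pool: 10,000GB capacity, 3 members, 250GB free.
    print(min_free_space_gb(10_000, 3))
    print(pool_is_low(250, 10_000, 3))
    print(retransmit_rate_is_high(15, 1_000))
```

For example, a three-member pool with 10,000GB of capacity has a floor of 300GB, the smaller of 500GB (5 percent) and 300GB (100GB per member).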
About Analyzing SAN
If you are sure that no hardware problems exist, it is best practice to use SAN Headquarters to review performance statistics to identify other potential problems. These statistics provide a good indication of overall group performance and might help you identify areas where performance can be optimized.
The following statistics provide common indicators of performance problems: I/O latency, I/O load, IOPS, I/O size, network load, network rate, and queue depth.
Average I/O Latency
One of the leading indicators of a healthy SAN is latency. Latency is the time from the receipt of the I/O request to the time that the I/O is returned to the server.
Latency must be considered along with the average I/O size, because large I/O operations take longer to process than small I/O operations.
The following guidelines apply to I/O operations with an average size of 16KB or less:
• Less than 20 ms - In general, average latencies of less than 20 ms are acceptable.
• 20 ms to 50 ms - Sustained average latencies between 20 ms and 50 ms should be monitored closely. You might want to reduce the workload or add storage resources to handle the load.
• 51 ms to 80 ms - Sustained average latencies between 51 ms and 80 ms should be monitored closely. Applications might experience problems and noticeable delays. You might want to reduce the workload or add storage resources to handle the load.
• Greater than 80 ms - An average latency of more than 80 ms indicates a problem, especially if this value is sustained over time. Most enterprise applications will experience problems if latencies exceed 100 ms. You should reduce the workload or add storage resources to handle the load.
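The latency bands above can be sketched as a small classifier. This is an illustrative Python example, not a SAN Headquarters feature. The function name, the return labels, and the inputs are hypothetical. Because the guidelines are stated only for an average I/O size of 16KB or less, the sketch returns None for larger sizes instead of guessing. It also assumes that values between 50 ms and 51 ms belong to the 51 ms to 80 ms band, since the guidelines leave that gap unspecified.

```python
from typing import Optional

MAX_GUIDELINE_IO_SIZE_KB = 16  # guidelines are stated for 16KB or less


def classify_latency(avg_latency_ms: float, avg_io_size_kb: float) -> Optional[str]:
    """Map a sustained average latency to one of the guideline bands.

    Returns None when the average I/O size exceeds 16KB, because the
    guidelines are not stated for larger I/O operations.
    """
    if avg_io_size_kb > MAX_GUIDELINE_IO_SIZE_KB:
        return None
    if avg_latency_ms < 20:
        return "acceptable"
    if avg_latency_ms <= 50:
        return "monitor closely"
    if avg_latency_ms <= 80:
        # Assumption: anything above 50 ms and up to 80 ms lands here.
        return "monitor closely; applications might see delays"
    return "problem"


if __name__ == "__main__":
    # Hypothetical samples: (average latency in ms, average I/O size in KB).
    for latency_ms, io_kb in [(12, 8), (35, 8), (65, 16), (95, 4), (30, 64)]:
        print(latency_ms, io_kb, classify_latency(latency_ms, io_kb))
```

A None result is a prompt to interpret the latency by hand alongside the I/O size, not a sign that the value is healthy.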
If the average I/O operation size is greater than 16KB, these latency guidelines might not apply. If latency statistics indicate a performance problem, examine the total IOPS in the pools. The storage array configuration (disk drives and RAID level) determines
About Monitoring