IBM 86655RY Hardware Maintenance Manual - Page 214

Miscellaneous programs, Using ServeRAID Controllers to avoid data loss, Drive failures, How They Occur

204
Hardware Maintenance Manual: Netfinity 7600
Type 8665 Models 1RY, 2RY
Miscellaneous programs:
The IPSSEND and IPSMON programs are advanced
command-line programs that can be used to manage the ServeRAID controllers. You
can use the IPSSEND program to view the configuration of a ServeRAID controller,
rebuild a defunct drive, and perform other functions. You can use the IPSMON
program to monitor a ServeRAID controller for defunct drives, predictive failure
analysis (PFA) warnings, rebuild operations, synchronizations, and logical drive
migrations. See the README files for installation instructions.
Using ServeRAID Controllers to avoid data loss:
RAID-5 and RAID-1 technology
provides the ability to continue operation after the failure of a hard drive and the
ability to rebuild the lost data onto a replacement drive. In conjunction with the bad
sector remapping capabilities of the hard drives, RAID-5 and RAID-1 can also help
recreate data lost due to sector media corruption.
Defective sectors on hard drives are not uncommon. Data scrubbing helps you detect
and correct these errors before they become a problem. If the ServeRAID array is not
properly set up and maintained, the risk of data loss grows significantly with the
passage of time. This manual examines how to avoid data loss wherever possible.
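The rebuild capability described above can be illustrated with a minimal sketch. It assumes the common RAID-5 scheme in which the parity block of a stripe is the byte-wise XOR of the data blocks; the manual does not describe the ServeRAID controller's internal method, and the function names and sample data here are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Compute the parity block for one stripe (assumed XOR parity)."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks):
    """Recreate the one missing block from all surviving blocks, parity included."""
    return xor_blocks(surviving_blocks)


# One stripe spread over three data drives plus one parity block.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(stripe)

# Simulate the loss of the second drive and rebuild its block.
survivors = [stripe[0], stripe[2], parity]
rebuilt = rebuild_missing(survivors)
```

In this sketch a RAID-1 mirror is the simpler case: the surviving copy is read back directly, with no parity arithmetic.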
Drive failures:
Three types of drive failures can typically occur in a RAID-5 or RAID-1
subsystem that may endanger the protection of stored data:
• Catastrophic drive failures
• Grown sector media errors
• Combination failures (see page 205)
Catastrophic drive failures:
How They Occur
Catastrophic drive failures occur when all data on a drive, including the ECC data
written on the drive to protect information, is completely inaccessible due to
mechanical or electrical problems.
Grown sector media errors:
How They Occur
Grown sector media errors occur due to the following:
• Latent imperfections on the disk
• Media damage due to mishandling of the disk
• Harsh environments
The drive itself can often repair these errors by recalculating lost data from Error
Correction Code (ECC) information stored within each data sector on the drive. The
drive then remaps this damaged sector to an unused area of the drive to prevent data
loss.
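The recover-then-remap behavior described above can be modeled with a toy example. This is only a sketch under stated assumptions: the ToyDrive class, its spare-sector pool, and the "correctable" flag are hypothetical stand-ins, and the ECC recalculation itself is not implemented, only represented by reading back the stored bytes.

```python
class UnrecoverableSectorError(Exception):
    """Raised when the drive cannot recover a sector on its own."""


class ToyDrive:
    """Hypothetical model of a drive that remaps damaged sectors to spares."""

    def __init__(self, sectors, spares):
        self.media = [b""] * (sectors + spares)
        self.spare_pool = list(range(sectors, sectors + spares))
        self.remap = {}    # logical sector -> spare physical sector
        self.damage = {}   # physical sector -> True if ECC can still correct it

    def _physical(self, logical):
        return self.remap.get(logical, logical)

    def write(self, logical, data):
        self.media[self._physical(logical)] = data

    def damage_sector(self, logical, correctable=True):
        """Simulate a grown media error at the sector's current location."""
        self.damage[self._physical(logical)] = correctable

    def read(self, logical):
        phys = self._physical(logical)
        if phys not in self.damage:
            return self.media[phys]
        if not self.damage[phys] or not self.spare_pool:
            raise UnrecoverableSectorError(logical)
        data = self.media[phys]            # stands in for ECC recalculation
        new_phys = self.spare_pool.pop(0)  # take an unused spare sector
        self.media[new_phys] = data
        self.remap[logical] = new_phys     # later reads go to the spare
        return data


drive = ToyDrive(sectors=8, spares=2)
drive.write(3, b"payload")
drive.damage_sector(3)
first_read = drive.read(3)   # error is detected, data recovered and remapped
second_read = drive.read(3)  # served from the spare sector
```

Note that in this model the error is found only when the damaged sector is actually read, which is the situation the following note describes.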
Note:
Sector media errors, which affect only a small area of the surface of the drive,
may not be detected in seldom-used files or in non-data areas of the disk.
These errors are identified and corrected only if a read or write request is made
to data stored within that location.
Data scrubbing forces all sectors in the logical drive to be accessed so that sector
media errors are detected by the drive. Once detected, the drive's error recovery
procedures are launched to repair these errors by recalculating the lost data from the
ECC information described above. If the ECC information is not sufficient to
recalculate the lost data, the information may still be recovered if the drive is part of a
RAID-5 or RAID-1 array. RAID-5 and RAID-1 arrays can provide their own
redundant information (similar to the ECC data written on the drive itself), which is
stored on other drives in the array. The ServeRAID controller can recalculate the lost
data and remap the bad sector.
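The scrubbing pass described above can be sketched for the RAID-1 case. This is an illustrative model, not the controller's actual procedure: each copy is assumed to be a Python list of sector contents, with None marking a sector that the drive's own ECC could not recover, and the function name scrub_mirror is hypothetical.

```python
def scrub_mirror(primary, mirror):
    """Touch every sector of both copies and repair unreadable ones.

    Returns (repaired, lost): repaired lists (sector, copy_name) pairs that
    were rewritten from the other copy; lost lists sectors that were
    unreadable on both copies and could not be recovered.
    """
    repaired = []
    lost = []
    for sector, (a, b) in enumerate(zip(primary, mirror)):
        if a is None and b is None:
            lost.append(sector)                 # no redundancy left
        elif a is None:
            primary[sector] = b                 # recover from the mirror
            repaired.append((sector, "primary"))
        elif b is None:
            mirror[sector] = a                  # recover from the primary
            repaired.append((sector, "mirror"))
    return repaired, lost


# Two latent errors on different sectors: both are repairable because the
# scrub finds them while the other copy is still intact.
primary = [b"a", None, b"c", b"d"]
mirror = [b"a", b"b", b"c", None]
repaired, lost = scrub_mirror(primary, mirror)
```

The design point of the sketch is timing: each latent error is repaired while its redundant copy is still readable, rather than being discovered later, after the other drive has also failed.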
Note: