IBM 86655RY Hardware Maintenance Manual - Page 214

Hardware Maintenance Manual: Netfinity 7600 - Type 8665 Models 1RY, 2RY

Miscellaneous programs:

The IPSSEND and IPSMON programs are advanced command-line programs that can be used to manage the ServeRAID controllers. You can use the IPSSEND program to view the configuration of a ServeRAID controller, rebuild a defunct drive, and perform other functions. You can use the IPSMON program to monitor a ServeRAID controller for defunct drives, predictive failure analysis (PFA) warnings, rebuild operations, synchronizations, and logical drive migration. See the README files for installation instructions.

Using ServeRAID Controllers to avoid data loss:

RAID-5 and RAID-1 technology provides the ability to continue operation after the failure of a hard drive and the ability to rebuild the lost data onto a replacement drive. In conjunction with the bad sector remapping capabilities of the hard drives, RAID-5 and RAID-1 can also help recreate data lost due to sector media corruption.

Defective sectors on hard drives are not uncommon. Data scrubbing helps you detect and correct these errors before they become a problem. If the ServeRAID array is not properly set up or maintained, the risk of data loss grows significantly with the passage of time. This manual examines how to avoid data loss wherever possible.

Drive failures:

Three types of drive failures can typically occur in a RAID-5 or RAID-1 subsystem that may endanger the protection of stored data:

• “Catastrophic drive failures”
• “Grown sector media errors”
• “Combination failures” on page 205

Catastrophic drive failures:

How They Occur

Catastrophic drive failures occur when all data on a drive, including the ECC data written on the drive to protect information, is completely inaccessible due to mechanical or electrical problems.

Grown sector media errors:

How They Occur

Grown sector media errors occur due to the following:

• Latent imperfections on the disk
• Media damage due to mishandling of the disk
• Harsh environments

The drive itself can often repair these errors by recalculating lost data from Error Correction Code (ECC) information stored within each data sector on the drive. The drive then remaps this damaged sector to an unused area of the drive to prevent data loss.

Note:

Sector media errors, which affect only a small area of the surface of the drive, may not be detected in seldom-used files or in non-data areas of the disk. These errors are identified and corrected only if a read or write request is made to data stored within that location.

Data scrubbing forces all sectors in the logical drive to be accessed so that sector media errors are detected by the drive. Once the errors are detected, the drive's error recovery procedures are launched to repair them by recalculating the lost data from the ECC information described above. If the ECC information is not sufficient to recalculate the lost data, the information may still be recovered if the drive is part of a RAID-5 or RAID-1 array. RAID-5 and RAID-1 arrays can provide their own redundant information (similar to the ECC data written on the drive itself), which is stored on other drives in the array. The ServeRAID controller can recalculate the lost data and remap the bad sector.
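The array-level recovery described above can be illustrated with a minimal Python sketch. It is an illustration only and does not represent ServeRAID firmware or any IBM program. It assumes a simplified RAID-5 layout in which each stripe holds several data blocks plus one XOR parity block stored last (real arrays rotate the parity position across drives), and it models a sector the drive could not read as None. The names xor_blocks, make_stripe, and scrub are hypothetical and were chosen for this example.

```python
from functools import reduce
from typing import List, Optional

# Assumption: None stands for a block the drive could not read or
# repair from its own ECC information.
Block = Optional[bytes]


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks: List[bytes]) -> List[Block]:
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def scrub(stripes: List[List[Block]]) -> int:
    """Access every block of every stripe and rebuild unreadable ones.

    Returns the number of blocks repaired. A stripe with more than one
    unreadable block cannot be rebuilt from a single parity block.
    """
    repaired = 0
    for index, stripe in enumerate(stripes):
        missing = [i for i, block in enumerate(stripe) if block is None]
        if not missing:
            continue
        if len(missing) > 1:
            raise ValueError(f"stripe {index}: {len(missing)} blocks lost")
        # XOR of all surviving blocks (data and parity) yields the lost one.
        survivors = [block for block in stripe if block is not None]
        stripe[missing[0]] = xor_blocks(survivors)
        repaired += 1
    return repaired


if __name__ == "__main__":
    stripes = [
        make_stripe([b"ABCD", b"EFGH", b"IJKL"]),
        make_stripe([b"MNOP", b"QRST", b"UVWX"]),
    ]
    stripes[1][2] = None  # latent media error in a seldom-read block
    print("blocks repaired:", scrub(stripes))
    print("recovered block:", stripes[1][2])
```

Because the scrub pass touches every block, the unreadable block is found and rebuilt while the remaining blocks of the stripe are still intact. If the error stayed hidden until a second block in the same stripe was lost, single parity could no longer recover the data, which is the situation regular scrubbing is meant to prevent.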

Note:

