
HP Integrity rx2800 i2 User Service Guide - Page 69






5 Troubleshooting

The purpose of this chapter is to provide a preferred methodology (strategies and procedures) and tools for troubleshooting server error and fault conditions.

Methodology

General troubleshooting methodology

There are multiple entry points to the troubleshooting process, depending on your level of troubleshooting expertise, the tools, processes, and procedures at your disposal, and the nature of the system fault or failure.

Typically, you select from a set of symptoms, ranging from very simple (a system LED is blinking) to the most difficult (a Machine Check Abort (MCA) has occurred). The following is a list of symptom examples:

• Front panel LED blinking
• System alert present on console
• System won’t power up
• System won’t boot
• Error/Event Message received
• Machine Check Abort (MCA) occurred

NOTE: Your output might differ from the output in the examples in this book depending on your server and its configuration.

Narrow down the observed problem to the specific troubleshooting procedure required. Isolate the failure to a specific part of the server so that you can perform more detailed troubleshooting. For example:

• Problem: Front panel LED blinking

  NOTE: The front panel health LEDs flash amber with a warning indication, or flash red with a fault indication.

  ◦ System alert on console?

    Analyze the alert by using the system event log (SEL) to identify the last error logged by the server. Use the iLO 3 MP commands to view the SEL, either through the iLO 3 MP serial text interface or, over the iLO 3 MP LAN, through telnet, Secure Shell, or the web GUI.
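The LED example above can be sketched in a few lines of Python. This is only an illustration: the amber/red meanings come from the note above, but the `SelEntry` record, its fields, and both function names are hypothetical and do not reflect the actual SEL format or any iLO 3 MP interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Severity(Enum):
    WARNING = "warning"
    FAULT = "fault"


# From the note above: flashing amber is a warning, flashing red is a fault.
HEALTH_LED_MEANING = {
    "amber": Severity.WARNING,
    "red": Severity.FAULT,
}


@dataclass(frozen=True)
class SelEntry:
    """Hypothetical, simplified SEL record (assumed fields, not the real format)."""
    sequence: int   # assumed to increase with each logged entry
    is_error: bool  # assumed flag: True when the entry records an error
    text: str


def classify_health_led(flash_color: str) -> Optional[Severity]:
    """Map a flashing health LED color to its indication, or None if unknown."""
    return HEALTH_LED_MEANING.get(flash_color.strip().lower())


def last_error(entries: Sequence[SelEntry]) -> Optional[SelEntry]:
    """Return the most recently logged error entry, or None if there is none."""
    errors = [entry for entry in entries if entry.is_error]
    return max(errors, key=lambda entry: entry.sequence) if errors else None
```

Selecting the entry with the highest sequence number, rather than the last list element, keeps the sketch correct even if the entries were collected out of order.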

You should now have a good idea of which area of the system requires further analysis. For example, if the symptom was “system won’t power up”, the initial troubleshooting procedure may indicate that the DC power rail did not come up after the power switch was turned on.

After that further analysis, you reach the point where the failed customer replaceable unit (CRU) has been identified and needs to be replaced. Perform the specific removal and replacement procedure and the verification steps.

NOTE: If multiple CRUs are identified as part of the solution, a fix cannot be guaranteed unless all identified failed CRUs are replaced.
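The condition in this note amounts to a simple set comparison. The sketch below is a hypothetical bookkeeping aid, not part of any HP tool; the function names and the CRU labels in the usage example are invented for illustration.

```python
from typing import Iterable, Set


def unreplaced_crus(identified: Iterable[str], replaced: Iterable[str]) -> Set[str]:
    """Return the identified failed CRUs that have not yet been replaced."""
    return set(identified) - set(replaced)


def repair_complete(identified: Iterable[str], replaced: Iterable[str]) -> bool:
    """True only when every identified failed CRU has been replaced."""
    return not unreplaced_crus(identified, replaced)


# Usage with made-up CRU labels:
pending = unreplaced_crus({"power supply", "fan"}, {"fan"})
print(sorted(pending))
```

A `True` result from `repair_complete` only means the precondition in the note is met; the verification steps still decide whether the fault is actually fixed.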

There may be specific recovery procedures you need to perform to finish the repair. For example, if the system board is replaced, you need to restore customer-specific information.
