IBM 436854u Service Guide - Page 155



Checkout procedure
The checkout procedure is the sequence of tasks that you should follow to
diagnose a problem in the server.
About the checkout procedure
Before you perform the checkout procedure for diagnosing hardware problems,
review the following information:
• Read the safety information that begins on page vii.
• The DSA Preboot diagnostic programs provide the primary methods of testing the
  major components of the server, such as the system board, Ethernet controller,
  serial ports, and hard disk drives. You can also use them to test some external
  devices. If you are not sure whether a problem is caused by the hardware or by
  the software, you can use the diagnostic programs to confirm that the hardware
  is working correctly.
• When you run the diagnostic programs, a single problem might cause more than
  one error message. When this happens, correct the cause of the first error
  message. The other error messages usually will not occur the next time you run
  the diagnostic programs.
  Exception: If multiple error codes or LEDs indicate a microprocessor error, the
  error might be in a microprocessor or in a microprocessor socket. See
  “Microprocessor problems” on page 145 for information about diagnosing
  microprocessor problems.
• Before you run the DSA diagnostic programs, you must determine whether the
  failing server is part of a shared hard disk drive cluster (two or more servers
  sharing external storage devices). If it is part of a cluster, you can run all
  diagnostic programs except the ones that test the storage unit (that is, a hard
  disk drive in the storage unit) or the storage adapter that is attached to the
  storage unit. The failing server might be part of a cluster if any of the following
  conditions is true:
  - You have identified the failing server as part of a cluster (two or more servers
    sharing external storage devices).
  - One or more external storage units are attached to the failing server and at
    least one of the attached storage units is also attached to another server or
    unidentifiable device.
  - One or more servers are located near the failing server.
  Important: If the server is part of a shared hard disk drive cluster, run one test
  at a time.
• If the server is halted and a POST error code is displayed, see “Error logs” on
  page 125. If the server is halted and no error message is displayed, see
  “Troubleshooting tables” on page 139 and “Solving undetermined problems” on
  page 229.
• For information about power-supply problems, see “Solving power problems” on
  page 227.
• For intermittent problems, check the error log; see “Error logs” on page 125 and
  “Diagnostic programs and messages” on page 155.
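
The decisions in the list above can be summarized as a small decision aid. The following Python sketch is an illustration only and is not part of the DSA tooling. The names ClusterClues, might_be_in_cluster, runnable_tests, and where_to_look, and the test names in the usage lines, are hypothetical. The sketch also assumes that "no error message is displayed" can be treated as "no POST error code is displayed".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ClusterClues:
    """Hypothetical record of the three cluster conditions listed above."""
    identified_as_cluster_member: bool = False
    storage_shared_with_other_device: bool = False
    other_servers_nearby: bool = False


def might_be_in_cluster(clues: ClusterClues) -> bool:
    # Any single condition is enough to treat the server as a possible
    # member of a shared hard disk drive cluster.
    return (clues.identified_as_cluster_member
            or clues.storage_shared_with_other_device
            or clues.other_servers_nearby)


def runnable_tests(tests: dict[str, bool], in_cluster: bool) -> list[str]:
    """Return the tests that may be run.

    `tests` maps a (hypothetical) test name to True when the test exercises
    the storage unit or the storage adapter attached to it.
    """
    if not in_cluster:
        return list(tests)
    # In a cluster, skip storage-unit and storage-adapter tests. The guide
    # also says to run the remaining tests one at a time.
    return [name for name, touches_storage in tests.items()
            if not touches_storage]


def where_to_look(halted: bool = False, post_error_code: bool = False,
                  power_problem: bool = False,
                  intermittent: bool = False) -> list[tuple[str, int]]:
    """Map observed symptoms to the (section title, page) references above."""
    refs: list[tuple[str, int]] = []
    if halted and post_error_code:
        refs.append(("Error logs", 125))
    elif halted:
        refs.append(("Troubleshooting tables", 139))
        refs.append(("Solving undetermined problems", 229))
    if power_problem:
        refs.append(("Solving power problems", 227))
    if intermittent:
        refs.append(("Error logs", 125))
        refs.append(("Diagnostic programs and messages", 155))
    # Remove repeated references while keeping their original order.
    return list(dict.fromkeys(refs))


# Example usage with made-up test names.
clues = ClusterClues(other_servers_nearby=True)
tests = {"memory": False, "ethernet": False, "external_storage": True}
print(runnable_tests(tests, might_be_in_cluster(clues)))
print(where_to_look(halted=True))
```

The sketch deliberately errs on the side of caution: a single matching condition marks the server as a possible cluster member, which mirrors the "any of the following conditions" wording in the list.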
Chapter 5. Diagnostics
137