IBM 8478 Hardware Maintenance Manual - Page 16

Reliability, availability, and serviceability

Page 16 highlights

Three of the most important considerations in server design are reliability, availability, and serviceability (RAS). The RAS features help to ensure the integrity of the data that is stored on the server, the availability of the server when needed, and the ease with which you can diagnose and repair problems.

The following is an abbreviated list of the RAS features that the server supports. Many of these features are explained in the following chapters of this book.

• Reliability features
  - Boot block recovery
  - Cooling fans with speed-sensing capability
  - Customer-upgradable basic input/output system (BIOS)
  - ECC front-side buses (FSBs)
  - ECC L2 cache
  - ECC memory
  - Parity checking on the small computer system interface (SCSI) and peripheral component interconnect (PCI) buses
  - Advanced configuration and power interface (ACPI)
  - Power-on self-test (POST)
  - Synchronous dynamic random access memory (SDRAM) with serial presence detect (SPD)

• Availability features
  - Advanced desktop management interface (DMI) features
  - Alarm on LAN™ capability
    - Chassis intrusion
    - Operating system (OS) hangs
  - Auto-restart initial program load (IPL) power supply
  - Automatic error retry or recovery
  - Automatic server restart
  - Automatic restart after power failure
  - Built-in, menu-driven configuration programs
  - Built-in, menu-driven SCSI configuration programs (some models)
  - Built-in, menu-driven setup programs
  - Failover Ethernet support
  - Menu-driven diagnostic programs on CD-ROM
  - Monitoring support for temperature, voltage, and fan speed
  - Server management
  - ServeRAID™ adapter support
  - Standard advanced system management (ASM) PCI adapter provides control for remote system management
  - Upgradable BIOS, diagnostics, ASM PCI adapter microcode, and POST

6 Hardware Maintenance Manual: xSeries 200
