HP NetServer AA 4000 Reference Guide - Page 118

Overview of Troubleshooting in an HP AA Environment, Diagnosing Faults



Hewlett-Packard Company
7-2
Overview of Troubleshooting in an HP AA Environment

The HP AA system is a fault-tolerant system. When a fault occurs (for example, a failed network adapter), the system continues to operate. While the system runs in this degraded state, however, any additional failure in the faulted component's redundant counterpart can affect the availability of the system.

Returning the system to a fully fault-tolerant state consists of a series of actions. These actions fall under the following basic categories:

• Diagnosing the Fault
• Isolating the Fault
• Correcting the Fault

The overall approach for this system is to focus primarily on information gathering; action should be taken only when a sufficient amount of data has been collected. It is critical to understand that the more analysis is done up front, the less likely it is that server availability will have to be sacrificed.

Diagnosing Faults

In the HP AA environment, four basic methods are used to diagnose the source of a fault:

• Marathon Manager: This tool can be used to quickly examine the status of a component. The color coding used in the Administration Window and the Device Status window quickly identifies components that are in a degraded state.
• SSDL status lights: The front of the SSDL displays power and connection status for the HP AA components. Because it is clearly visible when approaching the array, it should be one of the first components examined when performing fault diagnosis.
• Windows NT Event Viewer: The Event Viewer accumulates all events associated with the Windows NT operating system and the HP AA components. This is the primary tool used for detailed fault diagnosis.
• Marathon Event Log: The events displayed in this log are the same as those shown in the NT Event Viewer, with two differences: the Marathon Event Log is DOS based, and it displays only Marathon events, as they occur.
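
The relationship between the two event tools can be pictured as a filter: the NT Event Viewer holds every event, while the Marathon Event Log shows only the Marathon subset. The Python sketch below illustrates that idea only. The Event record, the source names, the severity labels, and the sample messages are all hypothetical and are not taken from the HP AA or Marathon software; the sketch does not read a real event log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A simplified event record (hypothetical fields, for illustration only)."""
    source: str    # component that logged the event (made-up names below)
    severity: str  # "info", "warning", or "error" (assumed labels)
    message: str


def marathon_only(events, source_prefix="Marathon"):
    """Keep only events whose source starts with the given prefix.

    This mimics the narrower view described for the Marathon Event Log.
    The prefix is an assumption made for this example.
    """
    return [e for e in events if e.source.startswith(source_prefix)]


def needs_attention(events):
    """Keep only events that would warrant a closer look during diagnosis."""
    return [e for e in events if e.severity in ("warning", "error")]


# Made-up sample data standing in for the full event view.
SAMPLE_EVENTS = [
    Event("Marathon-Example", "error", "hypothetical: network adapter failed"),
    Event("OtherService", "info", "hypothetical: service started"),
    Event("Marathon-Example", "info", "hypothetical: component status checked"),
]


if __name__ == "__main__":
    # Gather information first: narrow the full view, then list what stands out.
    for event in needs_attention(marathon_only(SAMPLE_EVENTS)):
        print(f"[{event.severity}] {event.source}: {event.message}")
```

Filtering before acting matches the information-gathering emphasis of the overview: the full event view stays untouched, and each narrowing step only selects which records to read first.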