Intel X38ML Product Specification - Page 102

Error Reporting and Handling



Intel® Server Board X38ML
Revision 1.3
Intel order number E15331-006
6. Error Reporting and Handling
This chapter defines the following error handling features:
• Fault Resilient Booting
• Error Handling and Logging
• Error Messages and Error Codes
6.1 Fault Resilient Booting
Fault Resilient Booting (FRB) is an Intel-specific feature that detects and handles errors during
the system boot process. The FRB feature guarantees the system boots, even if one or more
processors fail during POST. There are several failures that can occur during the boot process
that can be detected and handled by the BIOS and BMC:
• BSP POST Failures (FRB-2)
• Operating system load failures
6.1.1 BSP POST Failures (FRB-2)
FRB-2 uses a watchdog timer that can be configured to reset the system when POST hangs. The
BIOS sets the FRB-2 timer to 6 minutes.
The BIOS disables the watchdog timer before prompting the user for a boot password (user
password), while scanning for option ROMs, and when the user enters BIOS Setup. If the system
hangs during POST before the BIOS disables the FRB-2 timer, the BMC generates an
asynchronous system reset (ASR).
The BMC retains status bits that the BIOS can read later in POST so that the appropriate event
can be entered in the system event log (SEL) and the appropriate error message can be displayed.
When an FRB-2 timeout occurs, the BIOS does not send a “Set Fault Indication”. After an
FRB-2 failure, the system logs a POST error to the SEL and the error manager.
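
The FRB-2 sequence described above can be illustrated with a small simulation. This is only a sketch of the described behavior, not Intel firmware: the `Bmc` class, `run_post` function, phase durations, and log text are all hypothetical names chosen for illustration. Only the 6-minute timeout and the order of events come from the text.

```python
from __future__ import annotations

from dataclasses import dataclass

FRB2_TIMEOUT_S = 6 * 60  # the text states the BIOS sets the FRB-2 timer to 6 minutes


@dataclass
class Bmc:
    """Toy model of the BMC side of FRB-2 (hypothetical, for illustration only)."""
    remaining_s: int = 0
    armed: bool = False
    frb2_expired: bool = False  # status bit the BMC retains across the reset
    resets: int = 0             # number of asynchronous system resets generated

    def arm(self, timeout_s: int = FRB2_TIMEOUT_S) -> None:
        self.remaining_s = timeout_s
        self.armed = True

    def disable(self) -> None:
        self.armed = False

    def tick(self, seconds: int) -> None:
        """Advance time; on expiry, latch the status bit and count a reset."""
        if not self.armed:
            return
        self.remaining_s -= seconds
        if self.remaining_s <= 0:
            self.armed = False
            self.frb2_expired = True
            self.resets += 1


def run_post(bmc: Bmc, phase_durations_s: list[int], sel: list[str]) -> bool:
    """Run one POST pass; return True if POST got far enough to disable the timer."""
    # On the POST that follows a reset, the BIOS reads the retained status bit
    # and logs the earlier failure (the log text here is made up).
    if bmc.frb2_expired:
        sel.append("FRB-2 timeout during previous POST")
        bmc.frb2_expired = False
    bmc.arm()
    resets_before = bmc.resets
    for duration in phase_durations_s:
        bmc.tick(duration)
        if bmc.resets != resets_before:
            return False  # the BMC reset the system; this boot attempt is over
    bmc.disable()  # e.g. before the password prompt or on entering BIOS Setup
    return True
```

In this model, a first pass whose phases add up to more than 360 seconds ends in a reset, and the second pass logs the earlier timeout before continuing normally.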
6.1.2 Operating System Load Failures (OS Boot Timer)
The BIOS has an additional watchdog timer to provide fault resilient booting to the operating
system. This timer option is disabled by default. The timeout value and the option to enable the
timer are configured in BIOS Setup. When enabled, the BIOS enables the OS Boot Timer in the
BMC. It is the responsibility of the operating system or an application to disable this timer once
the operating system has successfully loaded.
Warning: If this option is enabled and there is no operating system or server management
application installed that supports it, this feature causes the system to reboot when the timer
expires. See the application or operating system documentation to make sure this feature is
supported for your operating system environment.
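
To make the timeout values concrete, the sketch below encodes a watchdog setting the way the IPMI v2.0 specification lays out the request data of its Set Watchdog Timer command. It assumes this board's BMC exposes the standard IPMI watchdog, which this section does not state. The function name `set_watchdog_payload` is hypothetical, and the 600-second OS boot timeout in the usage note is an arbitrary example, not a documented default.

```python
import struct

# Values taken from the IPMI v2.0 Set Watchdog Timer definition (an assumption
# about this board; the section above does not name the command the BIOS uses).
TIMER_USE_BIOS_FRB2 = 0x01
TIMER_USE_OS_LOAD = 0x03
ACTION_HARD_RESET = 0x01
COUNTS_PER_SECOND = 10  # the IPMI countdown field is in 100 ms units


def set_watchdog_payload(timeout_s: float,
                         timer_use: int = TIMER_USE_OS_LOAD,
                         action: int = ACTION_HARD_RESET) -> bytes:
    """Build the 6 request-data bytes for an IPMI Set Watchdog Timer command."""
    counts = round(timeout_s * COUNTS_PER_SECOND)
    if not 0 <= counts <= 0xFFFF:
        raise ValueError("timeout does not fit the 16-bit countdown field")
    return struct.pack(
        "<BBBBH",
        timer_use & 0x07,  # byte 1: timer use; the "don't stop" bit (bit 6) is left clear
        action & 0x07,     # byte 2: timeout action, no pre-timeout interrupt
        0,                 # byte 3: pre-timeout interval in seconds
        1 << timer_use,    # byte 4: clear the expiration flag for this timer use
        counts,            # bytes 5-6: initial countdown, least significant byte first
    )
```

For example, `set_watchdog_payload(360, timer_use=TIMER_USE_BIOS_FRB2).hex()` gives `0101000210 0e` without the space, because 6 minutes is 3600 counts (0x0E10). A 600-second OS load timer gives `030100087017`. The 16-bit field caps the timeout at 6553.5 seconds. Under the same IPMI assumption, an operating system agent would stop the timer after boot by sending a Set Watchdog Timer request with the "don't stop" bit clear.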