Intel X38ML Product Specification - Page 103
Error Handling and Logging
Intel® Server Board X38ML
Error Reporting and Handling

6.2 Error Handling and Logging

This section defines how errors are handled by the system BIOS, including a discussion of the role of the BIOS in error handling and the interaction between the BIOS, platform hardware, and server management firmware with regard to error handling. In addition, error-logging techniques are described and error codes for errors are defined.

6.2.1 Error Sources and Types

One of the major requirements of server management is to correctly and consistently handle system errors. System errors that can be enabled and disabled individually or as a group can be categorized as follows:

- PCI bus
- Memory single- and multi-bit errors
- Sensors
- Errors detected during POST, logged as POST errors

Sensors are managed by the BMC. The BMC is capable of receiving event messages from individual sensors and logging system events. For more information on BMC logged errors, see the BMC EPS.

6.2.2 Error Logging via SMI Handler

The SMI handler is used to handle and log system level events not visible to the server management firmware. The SMI handler pre-processes all system errors, including errors that can generate an NMI. The SMI handler sends a command to the BMC to log the event and provides the data to be logged. For example, the BIOS programs the hardware to generate an SMI on a single-bit memory error and logs the location of the failed DIMM in the system event log. System events handled by the BIOS generate an SMI. After the BIOS finishes logging the error, it asserts the NMI if needed.

6.2.2.1 PCI Bus Error

The PCI bus defines two error pins, PERR# and SERR#. These are used for reporting PCI parity errors and system errors, respectively. The BIOS can be instructed to enable or disable reporting PERR# and SERR# through the NMI. Disabling NMI for PERR# and/or SERR# also disables logging of the corresponding event.
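The PCI reporting policy in 6.2.2.1, combined with the log-first, NMI-second ordering from 6.2.2, can be sketched as a small model. This is an illustrative Python sketch, not BIOS code. The names SetupOptions, Platform, and handle_pci_error_smi are hypothetical, the Python list standing in for the system event log is an assumption, and the sketch assumes the NMI is asserted whenever NMI reporting for that pin is enabled.

```python
from dataclasses import dataclass, field
from enum import Enum


class PciError(Enum):
    PERR = "PERR#"  # PCI parity error pin
    SERR = "SERR#"  # PCI system error pin


@dataclass
class SetupOptions:
    # Hypothetical stand-ins for the enable/disable choices the BIOS is given.
    nmi_on_perr: bool = True
    nmi_on_serr: bool = True


@dataclass
class Platform:
    # Assumption: a plain list models the system event log kept by the BMC.
    event_log: list = field(default_factory=list)
    nmi_asserted: bool = False


def handle_pci_error_smi(error, options, platform):
    """Model the SMI handler's response to one PCI bus error.

    Returns True if the event was logged, False if it was suppressed.
    """
    if error is PciError.PERR:
        enabled = options.nmi_on_perr
    else:
        enabled = options.nmi_on_serr

    if not enabled:
        # Disabling NMI for a pin also disables logging of that event.
        return False

    # Log first (the real handler sends a command to the BMC) ...
    platform.event_log.append(error.value)
    # ... then assert the NMI once logging has finished.
    platform.nmi_asserted = True
    return True


if __name__ == "__main__":
    platform = Platform()
    quiet = SetupOptions(nmi_on_perr=False)
    print(handle_pci_error_smi(PciError.PERR, quiet, platform))  # suppressed
    print(handle_pci_error_smi(PciError.SERR, quiet, platform))  # logged
    print(platform.event_log, platform.nmi_asserted)
```

The model makes one consequence of the text explicit: there is no configuration in which a PERR# or SERR# event is logged but its NMI reporting is turned off.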
In the case of PERR#, the PCI bus master has the option to retry the offending transaction, or to report it using SERR#. All other PCI-related errors are reported by SERR#. All PCI-to-PCI bridges are configured so that they generate an SERR# on the primary interface whenever there is an SERR# on the secondary side, as long as SERR# is enabled in BIOS Setup. The same is true for PERR#. The format of the data bytes is described in Section 6.2.3.3.

6.2.2.2 PCI Express* Errors

The hardware is programmed to generate an SMI on PCI Express* correctable, uncorrectable non-fatal, and uncorrectable fatal errors. The correctable PCI Express* errors are reported to

Revision 1.3 91 Intel order number E15331-006