HP DL740 HP F8 Architecture Technology Brief - Page 12



PCI-X improves performance over conventional PCI as a result of two primary differences:
higher clock frequencies made possible by a register-to-register protocol, and new protocol
enhancements, such as split transactions, that make the bus more efficient. The
register-to-register protocol eases timing constraints by allowing an entire clock cycle for
the decode logic. With the timing constraints reduced, it is much easier to design adapters
and systems that operate at frequencies greater than 66 MHz.
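
To make the timing argument concrete, the short sketch below computes how long one bus clock
cycle lasts at a few frequencies. Only the 66 MHz figure comes from the text above; the 33 MHz
and 133 MHz values are assumed examples of a slower conventional PCI clock and a faster PCI-X
clock, and the helper name clock_period_ns is hypothetical.

```python
def clock_period_ns(freq_mhz: float) -> float:
    """Return the length of one bus clock cycle in nanoseconds."""
    return 1000.0 / freq_mhz


# 66 MHz is the figure quoted in the text; 33 and 133 MHz are assumed
# comparison points, not values taken from this page.
for freq_mhz in (33, 66, 133):
    period = clock_period_ns(freq_mhz)
    print(f"{freq_mhz:>3} MHz -> {period:.2f} ns per clock cycle")
```

The shorter the cycle, the less time is available for decode logic, which is why giving that
logic a full cycle of its own matters more as the clock rate rises.
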
In PCI-X mode, read operations to main memory complete as split transactions rather than
as delayed transactions. A split transaction uses the bus more efficiently because it
eliminates polling. With a delayed transaction in the conventional PCI protocol, the
requesting device must poll the target repeatedly to determine when the request has been
completed and the data is available. With a split transaction as supported in PCI-X, the
requesting device sends its request to the target, and the target informs the requester that
it has accepted the request. The requester is then free to process other information until
the target device sends the data to the requester.
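
The difference between polling and split completion can be sketched with a toy model that
counts how many times the bus is occupied for one read. This is an illustration only: the
function names, the one-bus-tenure-per-attempt accounting, and the latency and retry values
are assumptions, not figures from the PCI or PCI-X specifications.

```python
def delayed_transaction_bus_uses(target_latency: int, retry_interval: int) -> int:
    """Count bus tenures for one read under a delayed transaction.

    The requester re-issues the read every `retry_interval` cycles until the
    target has had `target_latency` cycles to fetch the data.
    """
    attempts = 0
    cycle = 0
    while True:
        attempts += 1                # requester occupies the bus to ask again
        if cycle >= target_latency:  # data is ready, so this attempt completes
            return attempts
        cycle += retry_interval      # target was not ready; poll again later


def split_transaction_bus_uses() -> int:
    """Count bus tenures for one read under a split transaction.

    One tenure carries the request that the target accepts; one carries the
    completion that the target sends when the data is ready.
    """
    return 2


if __name__ == "__main__":
    print("delayed:", delayed_transaction_bus_uses(target_latency=16, retry_interval=4))
    print("split:  ", split_transaction_bus_uses())
```

In this model the cost of a delayed transaction grows with target latency, while the split
transaction stays constant because the requester never has to ask again.
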
The F8 architecture includes two optional features from the PCI-X specification to enhance
performance even more: the “don’t-snoop” bit and relaxed ordering. When the “don’t-snoop”
bit is set during a PCI-X transaction, the I/O request does not snoop the L2 caches on the
processor bus. Instead, the I/O request goes directly to main memory, eliminating a snoop
cycle on the processor bus.
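
As a rough illustration of the effect of the bit, the sketch below lists the steps an I/O
request takes with and without it. The IoRequest class, the dont_snoop field, and the step
names are hypothetical and stand in for behavior that real hardware implements in the bridge
and chipset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IoRequest:
    address: int
    dont_snoop: bool = False  # models the optional "don't-snoop" bit


def request_path(request: IoRequest) -> List[str]:
    """Return the steps this toy model takes to service an I/O request."""
    steps = []
    if not request.dont_snoop:
        # Default case: check the processor caches before touching memory.
        steps.append("snoop processor caches")
    steps.append("access main memory")
    return steps


if __name__ == "__main__":
    print(request_path(IoRequest(address=0x1000)))
    print(request_path(IoRequest(address=0x1000, dont_snoop=True)))
```
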
With conventional PCI bridge designs, the bridge handles requests from multiple PCI devices
in the order in which they are received. The PCI-X protocol includes an optional relaxed
ordering bit. If the device driver or controlling software sets this bit, the PCI-X bridge permits
a transaction to pass previously posted transactions from other devices. The bridge can
rearrange the transactions in the most efficient manner, depending on which PCI device or
system memory port is available.
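
The passing rule described above can be sketched as a small queue-selection function. This
is a simplified model under stated assumptions: the Transaction class, the device and target
names, and the rule that a transaction never passes an earlier one from its own device are
illustrative, and the real PCI-X ordering rules are more detailed.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Transaction:
    device: str            # originating PCI device (hypothetical names)
    target: str            # destination device or memory port
    relaxed: bool = False  # models the optional relaxed ordering bit


def next_transaction(queue: List[Transaction], available: Set[str]) -> Optional[int]:
    """Return the index of the transaction to issue next, or None to stall.

    The head of the queue may always issue if its target is available. A later
    transaction may pass earlier ones only if its relaxed bit is set and every
    transaction ahead of it came from a different device.
    """
    earlier_devices: Set[str] = set()
    for index, txn in enumerate(queue):
        may_pass = index == 0 or (txn.relaxed and txn.device not in earlier_devices)
        if may_pass and txn.target in available:
            return index
        earlier_devices.add(txn.device)
    return None


if __name__ == "__main__":
    queue = [
        Transaction("nic", "mem0"),
        Transaction("scsi", "mem1", relaxed=True),
    ]
    # mem0 is busy, so only the relaxed transaction from the other device can go.
    print(next_transaction(queue, available={"mem1"}))
```

With the relaxed bit cleared on every transaction, the same queue stalls until the head's
target becomes available, which matches the in-order behavior of a conventional PCI bridge.
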
For more information on PCI-X technology, refer to the technology brief titled PCI-X: An
Evolution of the PCI Bus, document number TC990903TB.
Xeon MP Processor Subsystem
Xeon MP is the multiprocessing version of the Pentium 4 family of seventh-generation IA-32
processors. [6] The Xeon MP processor is designed for performance in high-end x86
workstations and servers. The Pentium 4 family has a significantly different architecture from
the Intel P6 family, which began with the Pentium Pro and extends through the Pentium III
Xeon processors. The following information is based on publicly available information on the
Intel website. [7]
Hyper-Threading Technology
The Xeon MP processor implements Intel’s new Hyper-Threading Technology, which improves
processor utilization to meet the needs of large, memory-intensive server applications.
Hyper-Threading Technology enables one physical processor to execute two separate threads
at the same time. To achieve this, Intel designed the Xeon processor with the usual
processor core but with two Architectural State devices (logical processors). Each
Architectural State tracks the flow of a thread being executed by core resources. Both logical
processors inside the physical processor share all the internal caches and other physical
execution resources. An application or operating system can submit threads to two different
[6] More detailed information on the Xeon MP processor is available in the Technology Brief
entitled Intel Processor Roadmap, document number TC000808TB.
[7] IA-32 Intel Architecture Software Developer’s Manual with Preliminary Willamette
Architecture Information Volume 1: Basic Architecture,
http://developer.intel.com/design/pentium4/manuals/245470.htm