HP DL360 The Intel processor roadmap for industry-standard servers technology - Page 7

Keeping the pipeline busy requires that the processor begin executing a second instruction before the first has traveled completely through the pipeline. However, suppose a program has an instruction that requires summing three numbers:

X = A + B + C

If the processor already has A and B stored in registers but needs to get C from memory, this causes a “bubble,” or stall, in the pipeline in which the processor cannot execute the instruction until it obtains the value for C from memory. This bubble must move all the way through the pipeline, forcing each stage that contains the bubble to sit idle, wasting execution resources during that clock cycle. Clearly, the longer the pipeline, the more significant this problem becomes.
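The effect of a bubble can be sketched with a toy model of an in-order pipeline that issues one instruction per cycle. This is only an illustration under stated assumptions: the instruction names, the three-cycle memory wait, and the helper names `issue_stream` and `pipeline_cost` are hypothetical and do not come from any Intel design.

```python
BUBBLE = None  # placeholder that travels through the pipeline doing no work


def issue_stream(instructions, stalled_instruction, stall_cycles):
    """Insert `stall_cycles` bubbles ahead of the instruction that waits on memory."""
    stream = []
    for instr in instructions:
        if instr == stalled_instruction:
            stream.extend([BUBBLE] * stall_cycles)
        stream.append(instr)
    return stream


def pipeline_cost(stream, depth):
    """Return (total_cycles, idle_stage_slots) for an ideal pipeline of `depth` stages.

    Each entry in the stream enters the pipeline one cycle after the previous
    entry and visits every stage once, so each bubble idles `depth` stage-slots.
    """
    total_cycles = len(stream) + depth - 1 if stream else 0
    idle_stage_slots = stream.count(BUBBLE) * depth
    return total_cycles, idle_stage_slots


program = ["load A", "load B", "X = A + B + C", "store X"]
# Assumption: fetching C from memory costs three cycles (hypothetical figure).
stalled = issue_stream(program, "X = A + B + C", stall_cycles=3)

for depth in (10, 20):
    cycles, idle = pipeline_cost(stalled, depth)
    print(f"{depth} stages: {cycles} cycles, {idle} idle stage-slots")
```

In this model the same three-cycle wait idles twice as many stage-slots in the 20-stage pipeline as in the 10-stage one, which is the sense in which a longer pipeline makes the problem more significant.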

Processor stalls often occur as a result of one instruction being dependent on another. If the program has a branch, such as an IF–THEN statement, the processor has two options. The processor either waits for the critical instruction to finish (stalling the pipeline) before deciding which program branch to take, or it predicts which branch the program will follow.

If the processor predicts the wrong code branch, it must flush the pipeline and start over again with the IF–THEN statement using the correct branch. The longer the pipeline, the higher the performance cost for branch mispredicts, because a longer pipeline holds more speculatively executed instructions that must be discarded when a mispredict occurs. Specific to the NetBurst design was an improved branch-prediction algorithm aided by a large branch target array that stored branch predictions.
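The relationship between pipeline length and mispredict cost can be approximated with a simple back-of-the-envelope formula. The sketch below assumes that a flush costs roughly one cycle per pipeline stage, and the workload figures (20 percent branches, 5 percent of them mispredicted) are hypothetical values chosen for illustration, not measurements.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once mispredict flushes are included.

    base_cpi             -- cycles per instruction with perfect prediction
    branch_fraction      -- fraction of instructions that are branches
    mispredict_rate      -- fraction of branches predicted incorrectly
    flush_penalty_cycles -- cycles lost each time the pipeline is flushed
    """
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# Assumption: the flush penalty is taken to be equal to the pipeline depth.
for depth in (10, 20):
    cpi = effective_cpi(
        base_cpi=1.0,
        branch_fraction=0.20,
        mispredict_rate=0.05,
        flush_penalty_cycles=depth,
    )
    print(f"{depth}-stage pipeline: about {cpi:.2f} cycles per instruction")
```

Under these assumptions, doubling the pipeline depth doubles the cycles lost to mispredicts, which is why better branch prediction matters more as pipelines grow longer.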

Hyper-Threading Technology

Intel Hyper-Threading (HT) Technology is a design enhancement for server environments. It takes advantage of the fact that, according to Intel estimates, the utilization rate for the execution units in a NetBurst processor is typically only about 35 percent. To improve the utilization rate, HT Technology adds Multi-Thread-Level Parallelism (MTLP) to the design. In essence, MTLP means that the core receives two instruction streams from the operating system (OS) to take advantage of idle cycles on the execution units of the processor. For one physical processor to appear as two distinct processors to the OS, the design replicates the pieces of the processor with which the OS interacts to create two logical processors in one package. These replicated components include the instruction pointer, the interrupt controller, and the general-purpose registers, all of which are collectively referred to as the Architectural State, or AS (see Figure 5).
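The idea of two logical processors sharing one set of execution resources can be illustrated with a deliberately simplified model. Everything in the sketch is an assumption made for illustration: the core has a single execution unit, each instruction stream is a hypothetical trace of "op" and "stall" cycles, and the class and function names are invented here. It does not describe how the hardware schedules threads.

```python
from dataclasses import dataclass, field


@dataclass
class ArchitecturalState:
    """State replicated per logical processor (reduced here to an instruction pointer)."""
    instruction_pointer: int = 0


@dataclass
class LogicalProcessor:
    trace: list  # hypothetical stream: "op" needs the execution unit, "stall" waits on memory
    state: ArchitecturalState = field(default_factory=ArchitecturalState)

    def done(self):
        return self.state.instruction_pointer >= len(self.trace)


def run_shared_core(logical_processors):
    """Return (cycles, busy_cycles) for one execution unit shared by all logical processors."""
    cycles = busy = 0
    while not all(lp.done() for lp in logical_processors):
        unit_free = True
        # Rotate priority every cycle so that neither stream starves.
        start = cycles % len(logical_processors)
        for lp in logical_processors[start:] + logical_processors[:start]:
            if lp.done():
                continue
            item = lp.trace[lp.state.instruction_pointer]
            if item == "stall":
                lp.state.instruction_pointer += 1  # waiting costs time but no execution unit
            elif unit_free:
                unit_free = False  # this stream uses the execution unit this cycle
                busy += 1
                lp.state.instruction_pointer += 1
            # otherwise the op waits for the unit and is retried next cycle
        cycles += 1
    return cycles, busy


trace_a = ["op", "stall", "stall", "op"]
trace_b = ["op", "stall", "op", "stall"]

# One stream at a time: the unit idles whenever that stream stalls.
alone_a = run_shared_core([LogicalProcessor(trace_a)])
alone_b = run_shared_core([LogicalProcessor(trace_b)])
# Both streams on one core: one stream's stalls can be filled by the other's ops.
shared = run_shared_core([LogicalProcessor(trace_a), LogicalProcessor(trace_b)])

print("run separately:", alone_a, alone_b)
print("run together:  ", shared)
```

In this toy case the two streams need eight cycles when run one after the other but five when they share the core, with the same four operations executed. Only the small architectural state is duplicated, while the execution unit is shared.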

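Figure 5. Hyper-Threading Technology

[Figure: a traditional dual-processor system, in which two processor cores each have their own architectural state (AS) and share a system bus, shown beside an IA-32 processor with Hyper-Threading Technology, in which one processor core holds two architectural states (AS1 and AS2) that appear as two logical processors on the system bus.]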


