HP DL360 The Intel processor roadmap for industry-standard servers technology - Page 7



Keeping the pipeline busy requires that the processor begin executing a second instruction before the
first has traveled completely through the pipeline. However, suppose a program has an instruction
that requires summing three numbers:
X = A + B + C
If the processor already has A and B stored in registers but needs to get C from memory, this causes a
“bubble,” or stall, in the pipeline in which the processor cannot execute the instruction until it obtains
the value for C from memory. This bubble must move all the way through the pipeline, forcing each
stage that contains the bubble to sit idle, wasting execution resources during that clock cycle. Clearly,
the longer the pipeline, the more significant this problem becomes.
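The effect of a bubble can be sketched with a small model. This is a minimal illustration, not a description of any real processor: it assumes a simple in-order pipeline that accepts one instruction per clock cycle, and that each stalled cycle injects one bubble at the first stage. The function names and the numbers used are hypothetical.

```python
def cycles_to_finish(num_instructions, pipeline_depth, stall_cycles=0):
    """Clock cycles needed to retire all instructions.

    Assumption: one instruction enters the pipeline per cycle, and a stall
    (for example, waiting for C to arrive from memory) delays every
    instruction behind it by `stall_cycles`.
    """
    fill = pipeline_depth          # cycles until the first instruction retires
    drain = num_instructions - 1   # one further instruction retires per cycle
    return fill + drain + stall_cycles


def wasted_stage_slots(stall_cycles, pipeline_depth):
    """Stage-cycles left idle by the bubbles.

    Assumption: each stalled cycle creates one bubble, and each bubble idles
    every stage once as it travels through the whole pipeline.
    """
    return stall_cycles * pipeline_depth


# Hypothetical numbers: a 3-cycle wait for C in a 10-stage and a 20-stage pipeline.
short_pipeline_waste = wasted_stage_slots(3, 10)
long_pipeline_waste = wasted_stage_slots(3, 20)
```

Under these assumptions the same 3-cycle wait idles twice as many stage-cycles in the 20-stage pipeline as in the 10-stage one, which is the sense in which a longer pipeline makes the problem more significant.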
Processor stalls often occur as a result of one instruction being dependent on another. If the program
has a branch, such as an IF–THEN statement, the processor has two options. The processor either waits
for the critical instruction to finish (stalling the pipeline) before deciding which program branch to
take, or it predicts which branch the program will follow.
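The trade-off between the two options can be approximated with a back-of-the-envelope cost model. The sketch below is an assumption-laden illustration: the function names, the resolve time, the prediction accuracy, and the flush penalty are all hypothetical values, not figures from Intel.

```python
def stall_cycles_per_branch(resolve_cycles):
    """Option 1: always wait until the branch condition is known."""
    return resolve_cycles


def predicted_cycles_per_branch(accuracy, flush_penalty):
    """Option 2: guess the branch; pay the flush penalty only on a wrong guess.

    `accuracy` is the fraction of branches predicted correctly (0.0 to 1.0).
    A correct guess is assumed to cost no extra cycles.
    """
    return (1.0 - accuracy) * flush_penalty


# Hypothetical numbers: 10 cycles to resolve a branch, 20 cycles to flush.
always_wait = stall_cycles_per_branch(10)
good_predictor = predicted_cycles_per_branch(0.9, 20)
coin_flip = predicted_cycles_per_branch(0.5, 20)
```

With these made-up numbers, prediction wins only when the predictor is accurate enough: a coin-flip predictor costs about as much per branch as always waiting.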
If the processor predicts the wrong code branch, it must flush the pipeline and start over again with
the IF–THEN statement using the correct branch. The longer the pipeline, the higher the performance
cost of a branch mispredict, because a longer pipeline holds more speculative instructions that must
be discarded when the mispredict occurs. The NetBurst design specifically included an improved
branch-prediction algorithm aided by a large branch target array that stored branch predictions.
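The idea of a table that stores branch predictions can be illustrated with a textbook scheme, the 2-bit saturating counter. This is not the NetBurst algorithm, whose details are not given here; the class name, table size, branch address, and pipeline depths below are assumptions chosen only for illustration.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by branch address.

    Counter values 0 and 1 predict "not taken"; 2 and 3 predict "taken".
    Textbook scheme for illustration only, not the NetBurst design.
    """

    def __init__(self, table_size=1024):
        self.table_size = table_size
        self.counters = [1] * table_size  # start at "weakly not taken"

    def predict(self, branch_address):
        return self.counters[branch_address % self.table_size] >= 2

    def update(self, branch_address, taken):
        i = branch_address % self.table_size
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def count_mispredicts(predictor, branch_address, outcomes):
    """Run one branch through the predictor and count the wrong guesses."""
    mispredicts = 0
    for taken in outcomes:
        if predictor.predict(branch_address) != taken:
            mispredicts += 1
        predictor.update(branch_address, taken)
    return mispredicts


def flush_cycles(mispredicts, pipeline_depth):
    """Assumption: each mispredict discards about one pipeline's worth of work."""
    return mispredicts * pipeline_depth


# A loop branch that is taken nine times and then falls through once.
loop_outcomes = [True] * 9 + [False]
misses = count_mispredicts(TwoBitPredictor(), 0x400, loop_outcomes)
```

For this loop pattern the predictor guesses wrong twice, on the first iteration and on the loop exit. Under the stated assumption, those two mispredicts cost twice as many cycles in a 20-stage pipeline as in a 10-stage one.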
Hyper-Threading Technology
Intel Hyper-Threading (HT) Technology is a design enhancement for server environments. It takes
advantage of the fact that, according to Intel estimates, the utilization rate for the execution units in a
NetBurst processor is typically only about 35 percent. To improve the utilization rate, HT Technology
adds Multi-Thread-Level Parallelism (MTLP) to the design. In essence, MTLP means that the core
receives two instruction streams from the operating system (OS) to take advantage of idle cycles on
the execution units of the processor. For one physical processor to appear as two distinct processors
to the OS, the design replicates the pieces of the processor with which the OS interacts to create two
logical processors in one package. These replicated components include the instruction pointer, the
interrupt controller, and the general-purpose registers, all of which are collectively referred to as
the Architectural State, or AS (see Figure 5).
Figure 5. Hyper-Threading Technology
Left: IA-32 processor with Hyper-Threading Technology. Two logical processors (AS1 and AS2) share one
processor core, which connects to the system bus.
Right: traditional dual-processor system. Two processor cores, each with its own AS, connect to the
system bus.
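How a second instruction stream can fill idle cycles can be sketched with a toy model. The simulation below is a simplification under stated assumptions, not a model of the NetBurst core: it assumes a single shared execution unit that executes at most one instruction per cycle, and it describes each instruction stream as a hypothetical list of wait times (cycles an instruction spends stalled, for example on memory, before it can execute).

```python
def run(streams):
    """Simulate one shared execution unit fed by one or more streams.

    Each stream is a list of wait times, one per instruction. Returns
    (total_cycles, busy_cycles). When several streams are ready in the
    same cycle, the lowest-numbered stream goes first (an arbitrary
    choice made for this sketch).
    """
    next_index = [0] * len(streams)                 # next instruction per stream
    ready_at = [s[0] if s else 0 for s in streams]  # cycle it becomes ready
    cycle = 0
    busy = 0
    while any(next_index[t] < len(streams[t]) for t in range(len(streams))):
        for t in range(len(streams)):
            if next_index[t] < len(streams[t]) and ready_at[t] <= cycle:
                busy += 1
                next_index[t] += 1
                if next_index[t] < len(streams[t]):
                    # the following instruction starts waiting next cycle
                    ready_at[t] = cycle + 1 + streams[t][next_index[t]]
                break  # the unit executes only one instruction per cycle
        cycle += 1
    return cycle, busy


# Hypothetical stream: every other instruction waits 2 cycles before executing.
stream = [0, 2, 0, 2]
one_thread = run([stream])           # a single instruction stream
two_threads = run([stream, stream])  # two streams sharing the same unit
```

In this toy model one stream keeps the unit busy for 4 of 8 cycles, while two streams keep it busy for 8 of 10 cycles: the second stream executes mostly in cycles that would otherwise be idle, so twice the work finishes in 10 cycles rather than 16.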