Compaq W8000 Hyper-Threading Technology, New Feature of Intel Xeon Processor - Page 4

Hyper-Threading Technology, New Feature of Intel Xeon Processor White Paper

167T-0202A-WWEN

For workstation applications dominated by memory-intensive, multi-tasked, or resource-bound tasks, there appears to be no benefit to using Hyper-Threading technology. This is evident from the results of some of the benchmarks in this paper. Running two physical processors appears to offer more benefit than running a single processor with Hyper-Threading enabled.

Overview

Hyper-Threading technology enables a single physical processor to appear to the OS as two independent logical processors. This enables the OS to execute two separate code streams (called threads) concurrently, either from two different applications or from the same application. After power-up and initialization, each logical processor can be individually halted, interrupted, or directed to execute a specified thread, independently of the other logical processor on the chip.
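
As a small illustration of how logical processors are presented to software, the following Python sketch reads the processor count reported by the operating system. On a system with Hyper-Threading enabled, this count would normally include both logical processors of each physical processor. The helper name `logical_processor_count` is hypothetical and is not part of any Compaq or Intel tool.

```python
import os


def logical_processor_count() -> int:
    """Return the number of logical processors reported by the OS.

    os.cpu_count() counts logical processors, so a single physical
    processor with Hyper-Threading enabled would typically report 2.
    Falls back to 1 if the OS cannot determine the count.
    """
    count = os.cpu_count()
    return count if count is not None else 1


if __name__ == "__main__":
    print(f"Logical processors visible to the OS: {logical_processor_count()}")
```

The count alone does not say whether two logical processors share one physical core; making that distinction requires platform-specific queries that are outside the scope of this sketch.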

Unlike a traditional dual-processor (DP) configuration (see Figure 1), which uses two separate physical IA-32 processors (such as two Intel Xeon processors), the logical processors (see Figure 2) in a processor with Hyper-Threading technology share the execution resources of the processor core, which include the rapid execution engine, the caches, the system bus interface, and the firmware.

Each logical processor has its own set of general-purpose registers, along with a separate program counter and a local Advanced Programmable Interrupt Controller (APIC). To minimize complexity, however, Hyper-Threading technology does not attempt to fetch and decode instructions from two threads simultaneously. Instead, the Central Processing Unit (CPU) alternates the fetch/decode stages between the two logical processors and attempts simultaneity only at the execution stage, where operations from both threads can run at the same time. This addresses the problem of poor execution unit utilization.
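
The effect described above can be sketched with a toy model. The Python code below is not a description of the actual Xeon pipeline: the fetch width, the number of execution units, and the limit of one operation per thread per cycle (standing in for a thread with little instruction-level parallelism) are all assumptions chosen for illustration, and the function name `simulate` is hypothetical. The model serves one logical processor per cycle in the fetch/decode stage, while the shared execution units accept operations from both.

```python
from collections import deque


def simulate(threads, fetch_width=2, exec_units=2, per_thread_issue=1):
    """Toy model: return (cycles, execution-unit utilization).

    threads          -- list of op sequences, one per logical processor
    fetch_width      -- ops fetched per cycle for the selected thread (assumed)
    exec_units       -- shared execution units (assumed)
    per_thread_issue -- max ops one thread can issue per cycle (assumed)
    """
    total_ops = sum(len(t) for t in threads)
    if total_ops == 0:
        return 0, 0.0
    pending = [deque(t) for t in threads]   # ops not yet fetched
    decoded = [deque() for _ in threads]    # ops fetched, waiting to execute
    n = len(threads)
    turn = 0                                # logical processor owning fetch
    cycles = 0
    while any(pending) or any(decoded):
        cycles += 1
        # Execute stage: shared units take ops from every logical processor.
        free_units = exec_units
        for queue in decoded:
            issue = min(per_thread_issue, len(queue), free_units)
            for _ in range(issue):
                queue.popleft()
            free_units -= issue
        # Fetch/decode stage: only one logical processor is served per cycle.
        for _ in range(n):                  # skip threads with nothing left
            if pending[turn]:
                break
            turn = (turn + 1) % n
        for _ in range(min(fetch_width, len(pending[turn]))):
            decoded[turn].append(pending[turn].popleft())
        turn = (turn + 1) % n
    return cycles, total_ops / (cycles * exec_units)


if __name__ == "__main__":
    ops = list(range(8))
    print("one thread :", simulate([ops]))
    print("two threads:", simulate([ops, ops]))
```

Under these assumptions, running two threads together takes far fewer cycles than running them one after the other, because the second thread fills execution slots that a single thread leaves idle. Real workloads that contend for the shared caches or bus would not see this idealized gain.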

Hyper-Threading is available in a Simultaneous Multi-Threaded (SMT) class processor, which has a dual Architectural State [1]. Simply stated, there are two logical processors on one die. Two threads can therefore be launched simultaneously on the same processor, which reduces thread-switch overhead. The Architectural State for the second logical processor, including its associated register set, occupies only about 5% of the total die area.
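
The thread-switch overhead mentioned above can be made concrete with simple arithmetic. Footnote 1 puts a software thread switch at hundreds of cycles, which makes switching unprofitable for short stalls. The Python sketch below assumes a switch cost of 200 cycles; this figure and the stall lengths are illustrative values, not measurements, and the function names are hypothetical.

```python
def cycles_saved_by_switching(stall_cycles: int, switch_cost_cycles: int) -> int:
    """Cycles gained (positive) or lost (negative) by switching to another
    thread instead of idling through a stall of the given length."""
    return stall_cycles - switch_cost_cycles


def switch_is_profitable(stall_cycles: int, switch_cost_cycles: int) -> bool:
    """A switch pays off only if the stall outlasts the cost of the switch."""
    return cycles_saved_by_switching(stall_cycles, switch_cost_cycles) > 0


if __name__ == "__main__":
    SWITCH_COST = 200  # assumed value within the "hundreds of cycles" range
    for stall in (80, 200, 1000):
        saved = cycles_saved_by_switching(stall, SWITCH_COST)
        verdict = "profitable" if switch_is_profitable(stall, SWITCH_COST) else "not profitable"
        print(f"stall of {stall} cycles: {saved:+d} cycles, {verdict}")
```

With two Architectural States on the die, both threads are already resident, so this save-and-restore cost does not have to be paid to run them together.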

Figure 1

Figure 2

[1] Architectural State (AS) represents the current thread context. It consists of the IA-32 registers that are visible to the programmer, such as data registers, segment registers, control registers, debug registers, and most of the model-specific registers (MSRs), as well as its own APIC. Conventional microprocessors such as the Pentium III (P3) provide only one AS. These single-threaded processors are used to run multithreaded applications today. However, before another thread can begin, the current thread's state must be saved in memory so that it can properly resume later. Depending on the number of registers involved and the cache misses incurred, a thread-switch operation that saves and restores registers can take hundreds of cycles. Consequently, it is unprofitable to switch threads on operations that take less than a hundred or so cycles.
