HP Vectra VT 6/xxx HP Vectra XU 6/xxx and VT 6/xxx PCs - Technical Reference M - Page 16

Cache Memory



Many techniques have been adopted to accelerate the throughput of the instruction pipeline of the
Pentium Pro over that of the Pentium. Firstly, it is super-pipelined: the individual operations of the
Pentium pipeline have been broken down into many sub-operations, leading to a much longer
pipeline of smaller operations. Secondly, it is super-scalar: the five execution units are completely
independent; not only can they have instructions issued to them asynchronously of each other, but
they can complete their execution asynchronously of each other, too.
Since instructions can complete asynchronously, it is possible for a simple instruction to complete
before a complex one which precedes it. This is the first of two ways in which the Pentium Pro
manifests out-of-order instruction execution. The second way follows as a direct result of the
speculative execution feature: whilst a time-consuming instruction is still awaiting completion, the
processor gets on with executing instructions that were fetched after it, on the speculation that they
will probably be needed next.
Related to this, the Pentium Pro incorporates an even more elaborate (and more accurate) 16-state
dynamic branch prediction mechanism than the one which is used on the Pentium. This
allows the processor to speculate as to which instructions will be needed following a conditional
branch, based on past behavior at the branch.
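As a rough illustration of predicting a branch from its past behavior, the following Python sketch implements a generic history-based predictor. It is not the Pentium Pro's actual circuit: the class name, the choice of a 4-bit history register (which happens to give 16 possible history states) and the 2-bit saturating counters are all assumptions made for the example.

```python
class HistoryPredictor:
    """Hypothetical history-based branch predictor (illustrative only)."""

    def __init__(self, history_bits=4):
        # Assumption: 4 history bits, giving 2**4 = 16 distinct history states.
        self.mask = (1 << history_bits) - 1
        self.history = 0
        # One 2-bit saturating counter per history state, starting at
        # "weakly not taken" (value 1 on a 0..3 scale).
        self.counters = [1] * (1 << history_bits)

    def predict(self):
        # Predict "taken" when the counter for the current history is 2 or 3.
        return self.counters[self.history] >= 2

    def update(self, taken):
        # Nudge the counter toward the real outcome, then shift the
        # outcome into the history register.
        c = self.counters[self.history]
        self.counters[self.history] = min(3, c + 1) if taken else max(0, c - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def count_mispredictions(outcomes):
    """Run a fresh predictor over a list of outcomes; return a list of booleans
    that are True wherever the prediction was wrong."""
    predictor = HistoryPredictor()
    wrong = []
    for taken in outcomes:
        wrong.append(predictor.predict() != taken)
        predictor.update(taken)
    return wrong
```

With a strictly alternating branch (taken, not taken, taken, ...), this sketch mispredicts a few times while the counters warm up and then follows the pattern, which is the kind of behavior that a history-based scheme is meant to capture.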
A module, known as the re-order buffer (ROB), handles the out-of-order completion of instructions,
and the cases where speculative execution proves to have been wrong (a misprediction by the
branch prediction unit, for example).
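The idea of out-of-order completion being tidied up by a re-order buffer can be sketched in a few lines of Python. This is a simplified model, not a description of the Pentium Pro hardware: the function name, the instruction names and the latencies are hypothetical, every instruction is assumed to start in the same cycle on its own execution unit, and misprediction recovery is not modeled.

```python
from collections import deque


def simulate(program):
    """program: list of (name, latency_in_cycles) pairs in program order.

    Returns (completion_order, retirement_order).
    """
    # Assumption: all instructions start together, so they complete in order
    # of latency. sorted() is stable, so ties keep program order.
    completion_order = [name for name, _ in sorted(program, key=lambda p: p[1])]

    rob = deque(name for name, _ in program)  # entries held in program order
    done = set()
    retired = []
    for name in completion_order:
        done.add(name)
        # Results leave the buffer only from the head, and only once the
        # oldest outstanding instruction has completed.
        while rob and rob[0] in done:
            retired.append(rob.popleft())
    return completion_order, retired
```

For a hypothetical program `[("div", 10), ("add", 1), ("mul", 4)]`, the short instructions complete first, but the sketch still retires all three in their original program order.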
System Board Switch Speed Settings
Like the Pentium and 80486 DX2 processors, the Pentium Pro uses internal clock multiplication.
For example, the Pentium Pro 150 MHz processor multiplies the 60 MHz system clock by 2.5.
Switches 4 and 5 on the system board switch bank set the frequency of the Processor-Local bus.
Switches 6, 7 and 8 set the clock multiplier ratio. The relationship of the switch settings to
Processor-Local bus and processor frequencies is summarized in the following table:
Switch 4  Switch 5  Processor-Local Bus  Switch 6  Switch 7  Switch 8  Frequency Ratio          Processor
                    Frequency                                          (Processor : Local Bus)  Frequency
Off       Off       66 MHz               Off       Off       Off       2 : 1                    133 MHz*
On        Off       60 MHz               On        Off       Off       2.5 : 1                  150 MHz
Off       Off       66 MHz               On        Off       Off       2.5 : 1                  166 MHz
On        Off       60 MHz               Off       On        Off       3 : 1                    180 MHz
Off       Off       66 MHz               Off       On        Off       3 : 1                    200 MHz
*The 133 MHz Pentium Pro processor is not supplied in any of the Vectra models. This information is
provided for completeness only.
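The switch table can be read as two lookups followed by a multiplication. The Python sketch below encodes only the switch combinations listed in the table; the dictionary and function names are hypothetical. It also assumes that the nominal "66 MHz" bus is really 66 2/3 MHz, which is why the products come out as 133, 166 and 200 MHz after truncation rather than 132, 165 and 198 MHz.

```python
from fractions import Fraction

# (switch 4, switch 5) -> Processor-Local bus frequency in MHz.
BUS_MHZ = {
    ("Off", "Off"): Fraction(200, 3),  # nominal "66 MHz" (assumed 66 2/3 MHz)
    ("On", "Off"): Fraction(60),
}

# (switch 6, switch 7, switch 8) -> processor : local bus frequency ratio.
RATIO = {
    ("Off", "Off", "Off"): Fraction(2),
    ("On", "Off", "Off"): Fraction(5, 2),
    ("Off", "On", "Off"): Fraction(3),
}


def processor_mhz(sw4, sw5, sw6, sw7, sw8):
    """Return the nominal processor frequency in MHz, truncated to a whole number.

    Raises KeyError for switch combinations that the table does not list.
    """
    bus = BUS_MHZ[(sw4, sw5)]
    ratio = RATIO[(sw6, sw7, sw8)]
    return int(bus * ratio)
```

For example, `processor_mhz("On", "Off", "On", "Off", "Off")` corresponds to the 60 MHz bus with a 2.5 : 1 ratio, that is, the 150 MHz row of the table.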
CACHE MEMORY
There are two integrated circuits sealed within a single Pentium Pro package. One of these
is the Level-2 (L2) cache memory chip; the other is the processor, which includes two
banks of Level-1 (L1) cache memory.
Each L1 cache memory has a capacity of 8 KB, and is set-associative. The L2 cache memory has
a capacity of 256 KB, and is four-way set-associative.
Data is stored in the cache memories in lines of 32 bytes (256 bits). This involves two consecutive
transfers of 128 bits with the main memory.
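The figures above fix the basic geometry of the L2 cache, which the following Python sketch works out. The function names are hypothetical, and the address split assumes the conventional scheme in which the low-order address bits select the byte within a line and the next bits select the set; the indexing actually used by the hardware is not described here.

```python
LINE_BYTES = 32          # cache line size
L2_BYTES = 256 * 1024    # L2 capacity
L2_WAYS = 4              # four-way set-associative
TRANSFER_BITS = 128      # width of one transfer with main memory


def cache_geometry(capacity, ways, line_bytes=LINE_BYTES):
    """Return (number of lines, number of sets) for a set-associative cache."""
    lines = capacity // line_bytes
    sets = lines // ways
    return lines, sets


def split_address(addr, sets, line_bytes=LINE_BYTES):
    """Split an address into (tag, set index, byte offset).

    Assumes conventional low-order-bit indexing (an illustrative assumption).
    """
    offset = addr % line_bytes
    index = (addr // line_bytes) % sets
    tag = addr // (line_bytes * sets)
    return tag, index, offset


def transfers_per_line(line_bytes=LINE_BYTES, transfer_bits=TRANSFER_BITS):
    """Number of memory transfers needed to fill one cache line."""
    return (line_bytes * 8) // transfer_bits
```

Under these assumptions the L2 cache holds 8192 lines arranged as 2048 sets of four lines each, and filling one 32-byte line takes two 128-bit transfers, matching the text.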