
AMD AMD-K6-2/500AFX Data Sheet - Page 30

Prefetching, Predecode Bits, Cache Sector Organization - architecture





AMD-K6®-2 Processor Data Sheet
21850J/0, February 2000
Preliminary Information

Chapter 2, Internal Architecture

Two forms of cache misses and associated cache fills can take place: a tag-miss cache fill and a tag-hit cache fill. In a tag-miss cache fill, the miss is due to a tag mismatch; the required cache line is filled from external memory, and the cache line within the sector that was not required is marked as invalid. In a tag-hit cache fill, the address matches the tag, but the requested cache line is marked as invalid. The required cache line is filled from external memory, and the cache line within the sector that is not required remains in the same cache state.
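The two fill cases can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the names `Sector`, `State`, and `access` are hypothetical, a single `VALID` state stands in for every non-invalid MESI state, and prefetching of the other line is deliberately left out so that only the tag-miss and tag-hit rules are visible.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class State(Enum):
    INVALID = "I"
    VALID = "V"  # assumption: stands in for any non-invalid MESI state


@dataclass
class Sector:
    """One cache sector: a single tag shared by two cache lines."""
    tag: Optional[int] = None
    lines: List[State] = field(
        default_factory=lambda: [State.INVALID, State.INVALID]
    )


def access(sector: Sector, tag: int, line: int) -> str:
    """Classify an access to `line` (0 or 1) and update the sector state."""
    other = 1 - line
    if sector.tag == tag:
        if sector.lines[line] is not State.INVALID:
            return "hit"
        # Tag-hit fill: fill the required line, leave the other line as it is.
        sector.lines[line] = State.VALID
        return "tag-hit fill"
    # Tag-miss fill: replace the tag, fill the required line,
    # and mark the line that was not required as invalid.
    sector.tag = tag
    sector.lines[line] = State.VALID
    sector.lines[other] = State.INVALID
    return "tag-miss fill"


if __name__ == "__main__":
    s = Sector()
    print(access(s, 0x12, 0), [st.value for st in s.lines])
    print(access(s, 0x12, 1), [st.value for st in s.lines])
    print(access(s, 0x34, 1), [st.value for st in s.lines])
```

The point of the model is the asymmetry: a tag mismatch invalidates the neighboring line because its old contents belong to a different tag, while a tag match leaves the neighbor untouched.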

Prefetching

The AMD-K6-2 processor conditionally performs cache prefetching, which results in the filling of the required cache line first, and a prefetch of the second cache line making up the other half of the sector. From the perspective of the external bus, the two cache-line fills typically appear as two 32-byte burst read cycles occurring back-to-back or, if allowed, as pipelined cycles.
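As a rough illustration of the ordering described above, the sketch below computes the line addresses of the burst reads for a missed address, required line first. It assumes that sectors are aligned to 64 bytes, so that the two 32-byte lines of a sector differ only in address bit 5; that alignment, the function name `fill_sequence`, and the `prefetch_allowed` flag (a stand-in for whatever conditions the processor applies) are assumptions, not details taken from this page.

```python
LINE_BYTES = 32                 # one cache line = one 32-byte burst read
SECTOR_BYTES = 2 * LINE_BYTES   # two cache lines per sector


def fill_sequence(address: int, prefetch_allowed: bool = True) -> list:
    """Return the line-aligned addresses read from memory, in bus order."""
    required = address & ~(LINE_BYTES - 1)   # align down to the required line
    reads = [required]
    if prefetch_allowed:
        # Assumption: the other half of the sector is the line whose
        # address differs only in the bit that selects line 0 or line 1.
        reads.append(required ^ LINE_BYTES)
    return reads


if __name__ == "__main__":
    for addr in (0x1234, 0x1204):
        print(hex(addr), [hex(a) for a in fill_sequence(addr)])
```

When the miss falls in the upper line of a sector, the required (upper) line is still read first and the lower line follows as the prefetch.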

The 3DNow! technology includes an instruction called PREFETCH that allows a cache line to be prefetched into the data cache. The PREFETCH instruction format is defined in Table 17, “3DNow!™ Instructions,” on page 81. For more detailed information, see the 3DNow!™ Technology Manual, order# 21928.

Predecode Bits

Decoding x86 instructions is particularly difficult because the instructions are variable-length and can be from 1 to 15 bytes long. Predecode logic supplies the five predecode bits that are associated with each instruction byte. The predecode bits indicate the number of bytes to the start of the next x86 instruction. The predecode bits are stored in an extended instruction cache alongside each x86 instruction byte as shown in Figure 2. The predecode bits are passed with the instruction bytes to the decoders where they assist with parallel x86 instruction decoding.
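The idea can be sketched in software, with the caveat that the actual hardware encoding of the five bits is not described here. The sketch assumes each byte simply stores the distance, in bytes, from itself to the start of the next instruction; the function names `predecode` and `instruction_starts` are hypothetical, and instruction lengths are taken as given rather than derived from real x86 opcodes.

```python
MAX_INSTRUCTION_BYTES = 15
PREDECODE_BITS = 5


def predecode(lengths):
    """For a run of instructions with the given byte lengths, return one
    value per instruction byte: bytes from that byte to the next instruction."""
    values = []
    for n in lengths:
        if not 1 <= n <= MAX_INSTRUCTION_BYTES:
            raise ValueError("x86 instructions are 1 to 15 bytes long")
        # First byte is n bytes from the next instruction, last byte is 1.
        values.extend(range(n, 0, -1))
    # Every value is at most 15, so it fits in five bits.
    assert all(v < 2 ** PREDECODE_BITS for v in values)
    return values


def instruction_starts(values):
    """Locate instruction boundaries using only the predecode values,
    without re-examining the instruction bytes themselves."""
    starts, i = [], 0
    while i < len(values):
        starts.append(i)
        i += values[i]
    return starts


if __name__ == "__main__":
    bits = predecode([1, 3, 2])
    print(bits)                      # one value per instruction byte
    print(instruction_starts(bits))  # byte offsets where instructions begin
```

Because the boundary information travels with every byte, a decoder that is handed any instruction start can find the following start without first decoding the instruction in between, which is what makes decoding several instructions in parallel practical.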

Figure 2. Cache Sector Organization

Tag Address
    Cache Line 0: Byte 31 + Predecode Bits, Byte 30 + Predecode Bits, ..., Byte 0 + Predecode Bits, MESI Bits
    Cache Line 1: Byte 31 + Predecode Bits, Byte 30 + Predecode Bits, ..., Byte 0 + Predecode Bits, MESI Bits