AMD AMD-K6-2/500AFX Data Sheet - Page 206


Section	Page
Contents	3
List of Figures	11
List of Tables	15
Revision History	19
1 AMD-K6®-2 Processor	21
1.1 Super7™ Platform Initiative	23
Super7™ Platform Enhancements	23
Super7™ Platform Advantages	24
2 Internal Architecture	25
2.1 Introduction	25
2.2 AMD-K6®-2 Processor Microarchitecture Overview	25
Enhanced RISC86® Microarchitecture	26
2.3 Cache, Instruction Prefetch, and Predecode Bits	29
Cache	29
Prefetching	30
Predecode Bits	30
2.4 Instruction Fetch and Decode	31
Instruction Fetch	31
Instruction Decode	32
2.5 Centralized Scheduler	34
2.6 Execution Units	35
Register X and Y Pipelines	36
2.7 Branch-Prediction Logic	37
Branch History Table	38
Branch Target Cache	38
Return Address Stack	38
Branch Execution Unit	39
3 Software Environment	41
3.1 Registers	41
General-Purpose Registers	42
Integer Data Types	43
Segment Registers	44
Segment Usage	44
Instruction Pointer	45
Floating-Point Registers	45
Floating-Point Register Data Types	48
MMX™/3DNow!™ Registers	49
MMX™ Data Types	49
3DNow!™ Data Types	50
EFLAGS Register	51
Control Registers	52
Debug Registers	54
Model-Specific Registers (MSR)	57
Memory Management Registers	60
Task State Segment	62
Paging	63
Descriptors and Gates	66
Exceptions and Interrupts	69
3.2 AMD-K6®-2 Processor Model 8/[F:8] Registers	70
Extended Feature Enable Register (EFER)–Model 8/[F:8]	70
Write Handling Control Register (WHCR)–Model 8/[F:8]	71
UC/WC Cacheability Control Register (UWCCR)	72
Processor State Observability Register (PSOR)	73
Page Flush/Invalidate Register (PFIR)	73
3.3 Instructions Supported by the AMD-K6®-2 Processor	74
4 Signal Descriptions	103
4.1 Signal Terminology	103
4.2 A20M# (Address Bit 20 Mask)	105
4.3 A[31:3] (Address Bus)	106
4.4 ADS# (Address Strobe)	107
4.5 ADSC# (Address Strobe Copy)	107
4.6 AHOLD (Address Hold)	108
4.7 AP (Address Parity)	109
4.8 APCHK# (Address Parity Check)	110
4.9 BE[7:0]# (Byte Enables)	111
4.10 BF[2:0] (Bus Frequency)	112
4.11 BOFF# (Backoff)	113
4.12 BRDY# (Burst Ready)	114
4.13 BRDYC# (Burst Ready Copy)	115
4.14 BREQ (Bus Request)	116
4.15 CACHE# (Cacheable Access)	116
4.16 CLK (Clock)	117
4.17 D/C# (Data/Code)	117
4.18 D[63:0] (Data Bus)	118
4.19 DP[7:0] (Data Parity)	119
4.20 EADS# (External Address Strobe)	120
4.21 EWBE# (External Write Buffer Empty)	121
4.22 FERR# (Floating-Point Error)	122
4.23 FLUSH# (Cache Flush)	123
4.24 HIT# (Inquire Cycle Hit)	124
4.25 HITM# (Inquire Cycle Hit To Modified Line)	124
4.26 HLDA (Hold Acknowledge)	125
4.27 HOLD (Bus Hold Request)	125
4.28 IGNNE# (Ignore Numeric Exception)	126
4.29 INIT (Initialization)	127
4.30 INTR (Maskable Interrupt)	128
4.31 INV (Invalidation Request)	128
4.32 KEN# (Cache Enable)	129
4.33 LOCK# (Bus Lock)	130
4.34 M/IO# (Memory or I/O)	131
4.35 NA# (Next Address)	132
4.36 NMI (Non-Maskable Interrupt)	132
4.37 PCD (Page Cache Disable)	133
4.38 PCHK# (Parity Check)	134
4.39 PWT (Page Writethrough)	135
4.40 RESET (Reset)	136
4.41 RSVD (Reserved)	136
4.42 SCYC (Split Cycle)	137
4.43 SMI# (System Management Interrupt)	137
4.44 SMIACT# (System Management Interrupt Active)	138
4.45 STPCLK# (Stop Clock)	139
4.46 TCK (Test Clock)	139
4.47 TDI (Test Data Input)	140
4.48 TDO (Test Data Output)	140
4.49 TMS (Test Mode Select)	140
4.50 TRST# (Test Reset)	141
4.51 VCC2DET (VCC2 Detect)	141
4.52 VCC2H/L# (VCC2 High/Low)	141
4.53 W/R# (Write/Read)	142
4.54 WB/WT# (Writeback or Writethrough)	143
5 Bus Cycles	147
5.1 Timing Diagrams	147
5.2 Bus State Machine Diagram	149
Idle	150
Address	150
Data	150
Data-NA# Requested	150
Pipeline Address	150
Pipeline Data	151
Transition	151
5.3 Memory Reads and Writes	152
Single-Transfer Memory Read and Write	152
Misaligned Single-Transfer Memory Read and Write	154
Burst Reads and Pipelined Burst Reads	156
Burst Writeback	158
5.4 I/O Read and Write	160
Basic I/O Read and Write	160
Misaligned I/O Read and Write	161
5.5 Inquire and Bus Arbitration Cycles	162
Hold and Hold Acknowledge Cycle	162
HOLD-Initiated Inquire Hit to Shared or Exclusive Line	164
HOLD-Initiated Inquire Hit to Modified Line	166
AHOLD-Initiated Inquire Miss	168
AHOLD-Initiated Inquire Hit to Shared or Exclusive Line	170
AHOLD-Initiated Inquire Hit to Modified Line	172
AHOLD Restriction	174
Bus Backoff (BOFF#)	176
Locked Cycles	178
Basic Locked Operation	178
Locked Operation with BOFF# Intervention	180
Interrupt Acknowledge	182
5.6 Special Bus Cycles	184
Basic Special Bus Cycle	184
Shutdown Cycle	186
Stop Grant and Stop Clock States	187
INIT-Initiated Transition from Protected Mode to Real Mode	190
6 Power-on Configuration and Initialization	193
6.1 Signals Sampled During the Falling Transition of RESET	193
FLUSH#	193
BF[2:0]	193
BRDYC#	193
6.2 RESET Requirements	194
6.3 State of Processor After RESET	194
Output Signals	194
Registers	194
6.4 State of Processor After INIT	197
7 Cache Organization	199
7.1 MESI States in the Data Cache	200
7.2 Predecode Bits	200
7.3 Cache Operation	201
Cache-Related Signals	203
7.4 Cache Disabling and Flushing	203
7.5 Cache-Line Fills	204
7.6 Cache-Line Replacements	205
7.7 Write Allocate	206
Write to a Cacheable Page	206
Write to a Sector	207
Write Allocate Limit	207
Write Allocate Logic Mechanisms and Conditions	209
7.8 Prefetching	212
Hardware Prefetching	212
Software Prefetching	212
7.9 Cache States	212
7.10 Cache Coherency	214
Inquire Cycles	214
Internal Snooping	214
FLUSH#	215
PFIR	215
WBINVD and INVD	216
Cache-Line Replacement	216
Cache Snooping	218
7.11 Writethrough versus Writeback Coherency States	219
7.12 A20M# Masking of Cache Accesses	219
8 Write Merge Buffer	221
8.1 EWBE Control	221
8.2 Memory Type Range Registers	223
UC/WC Cacheability Control Register (UWCCR)	223
9 Floating-Point and Multimedia Execution Units	227
9.1 Floating-Point Execution Unit	227
Handling Floating-Point Exceptions	227
External Logic Support of Floating-Point Exceptions	227
9.2 Multimedia and 3DNow!™ Execution Units	229
9.3 Floating-Point and MMX™/3DNow!™ Instruction Compatibility	229
Registers	229
Exceptions	229
FERR# and IGNNE#	229
10 System Management Mode (SMM)	231
10.1 Overview	231
10.2 SMM Operating Mode and Default Register Values	231
10.3 SMM State-Save Area	234
10.4 SMM Revision Identifier	236
10.5 SMM Base Address	237
10.6 Halt Restart Slot	237
10.7 I/O Trap Dword	238
10.8 I/O Trap Restart Slot	239
10.9 Exceptions, Interrupts, and Debug in SMM	240
11 Test and Debug	241
11.1 Built-In Self-Test (BIST)	241
11.2 Tri-State Test Mode	242
11.3 Boundary-Scan Test Access Port (TAP)	243
Test Access Port	243
TAP Signals	243
TAP Registers	244
TAP Instructions	251
TAP Controller State Machine	252
11.4 L1 Cache Inhibit	255
Purpose	255
11.5 Debug	256
Debug Registers	256
Debug Exceptions	261
12 Clock Control	263
12.1 Halt State	264
Enter Halt State	264
Exit Halt State	264
12.2 Stop Grant State	265
Enter Stop Grant State	265
Exit Stop Grant State	265
12.3 Stop Grant Inquire State	266
Enter Stop Grant Inquire State	266
Exit Stop Grant Inquire State	266
12.4 Stop Clock State	266
Enter Stop Clock State	266
Exit Stop Clock State	267
13 Power and Grounding	269
13.1 Power Connections	269
13.2 Decoupling Recommendations	270
13.3 Pin Connection Requirements	271
14 Electrical Data	273
14.1 Electrical Data for OPN Suffixes AHX, 400AFQ, and AFR	273
Operating Ranges	273
Absolute Ratings	274
DC Characteristics	274
Power Dissipation	277
14.2 Electrical Data for OPN Suffixes AGR, AFX, and 400AFR	278
Operating Ranges	278
Absolute Ratings	279
DC Characteristics	279
Power Dissipation	282
15 I/O Buffer Characteristics	283
15.1 Selectable Drive Strength	283
15.2 I/O Buffer Model	284
15.3 I/O Model Application Note	285
15.4 I/O Buffer AC and DC Characteristics	285
16 Signal Switching Characteristics	287
16.1 CLK Switching Characteristics	287
16.2 Clock Switching Characteristics for 100-MHz Bus Operation	288
16.3 Clock Switching Characteristics for 66-MHz Bus Operation	288
16.4 Valid Delay, Float, Setup, and Hold Timings	289
16.5 Output Delay Timings for 100-MHz Bus Operation	290
16.6 Input Setup and Hold Timings for 100-MHz Bus Operation	292
16.7 Output Delay Timings for 66-MHz Bus Operation	294
16.8 Input Setup and Hold Timings for 66-MHz Bus Operation	296
16.9 RESET and Test Signal Timing	298
17 Thermal Design	305
17.1 Package Thermal Specifications	305
Heat Dissipation Path	310
Measuring Case Temperature	310
17.2 Layout and Airflow Considerations	311
Voltage Regulator	311
Airflow Management in a System Design	312
18 Pin Description Diagram	315
19 Pin Designations	317
20 Package Specifications	319
20.1 321-Pin Staggered CPGA Package Specification	319
21 Ordering Information	321

AMD-K6®-2 Processor Data Sheet
21850J/0-February 2000
Preliminary Information
Chapter 7, Cache Organization, page 186

7.7 Write Allocate

Write allocate, if enabled, occurs when the processor has a pending memory write cycle to a cacheable line and the line does not currently reside in the data cache. In this case, the processor performs a 32-byte burst read cycle to fetch the data-cache line addressed by the pending write cycle. The data associated with the pending write cycle is merged with the recently-allocated data-cache line and stored in the processor’s data cache. The final MESI state of the cache line depends on the state of the WB/WT# and PWT signals during the burst read cycle and the subsequent L1 data cache write hit (see Table 36 on page 193 to determine the cache-line states and the access types following a cache read miss and cache write hit).
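The merge step described above can be pictured with a small software model. This is only an illustrative sketch, not the processor's actual logic: the names `line_base` and `merge_write` are hypothetical, and the model assumes the pending write falls entirely within one 32-byte line.

```python
LINE_SIZE = 32  # bytes per data-cache line, as stated in the text


def line_base(address: int) -> int:
    """Return the address of the first byte of the 32-byte line holding `address`."""
    return address & ~(LINE_SIZE - 1)


def merge_write(fetched_line: bytes, address: int, data: bytes) -> bytes:
    """Merge pending write data into a freshly fetched line (hypothetical model)."""
    if len(fetched_line) != LINE_SIZE:
        raise ValueError("a line fill must supply exactly 32 bytes")
    offset = address - line_base(address)
    if offset + len(data) > LINE_SIZE:
        raise ValueError("this model does not handle writes that cross a line boundary")
    merged = bytearray(fetched_line)
    # Overwrite only the bytes covered by the pending write; the rest of the
    # line keeps the values returned by the burst read.
    merged[offset:offset + len(data)] = data
    return bytes(merged)
```

For example, `merge_write(bytes(32), 0x1004, b"\xAA\xBB")` returns a line whose bytes 4 and 5 hold the written data while the other 30 bytes keep the fetched contents.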

If a data-cache line fetch from memory is attempted because the write allocate misses the data cache, and KEN# is sampled negated, the processor does not perform an allocation. In this case, the pending write cycle is executed as a single write cycle on the system bus.
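The outcome of a pending write, including the KEN# fallback, can be summarized as a small decision function. This is a sketch under stated assumptions: `WriteOutcome` and `resolve_pending_write` are hypothetical names, the three inputs are simplified booleans rather than real signal samples, and the result for a miss with no write allocate indicated follows from the text's statement that the burst read replaces a non-burst write.

```python
from enum import Enum


class WriteOutcome(Enum):
    WRITE_HIT = "line already in the data cache; no allocation needed"
    ALLOCATE = "32-byte burst read, then merge the write into the data cache"
    SINGLE_WRITE = "single write cycle on the system bus"


def resolve_pending_write(line_in_cache: bool,
                          allocate_indicated: bool,
                          ken_asserted: bool) -> WriteOutcome:
    """Hypothetical model of how a pending memory write is resolved."""
    if line_in_cache:
        return WriteOutcome.WRITE_HIT
    if not allocate_indicated:
        # No mechanism indicates a cacheable target: ordinary non-burst write.
        return WriteOutcome.SINGLE_WRITE
    if not ken_asserted:
        # KEN# sampled negated during the attempted line fetch:
        # no allocation, and the write goes out as a single write cycle.
        return WriteOutcome.SINGLE_WRITE
    return WriteOutcome.ALLOCATE
```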

During write allocates, a 32-byte burst read cycle is executed in place of a non-burst write cycle. While the burst read cycle generally takes longer to execute than the non-burst write cycle, performance gains are realized on subsequent write cycle hits to the write-allocated cache line. Because memory accesses in software tend to occur close to one another (the principle of locality), the likelihood of additional write hits to the write-allocated cache line is high.
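A rough way to see the trade-off is to count system-bus transactions for a run of writes to the same line. The sketch below is a deliberately simplified model, not a timing figure from the data sheet: `bus_transactions` is a hypothetical helper, it assumes the allocated line ends up in a writeback state so that later write hits stay inside the cache, and it ignores the eventual writeback of the modified line.

```python
def bus_transactions(num_writes: int, write_allocate: bool) -> int:
    """Count bus transactions for `num_writes` writes to one initially uncached line.

    Assumptions (not from the data sheet): the allocated line is held in a
    writeback state, and the later eviction of the line is not counted.
    """
    if num_writes <= 0:
        return 0
    if write_allocate:
        # One burst read brings the line in; the remaining writes hit the cache.
        return 1
    # Without allocation, every write is its own single write cycle.
    return num_writes
```

Under these assumptions, eight writes to one line cost a single (longer) burst read with write allocate, versus eight separate write cycles without it.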

The following is a description of three mechanisms by which the AMD-K6-2 processor performs write allocations. A write allocate is performed when any one or more of these mechanisms indicates that a pending write is to a cacheable area of memory.
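The "any one or more" rule amounts to a logical OR over the individual mechanisms. A minimal sketch, assuming each mechanism can be modeled as a predicate on the write address (the function name and the predicate interface are hypothetical, chosen only for illustration):

```python
from typing import Callable, Iterable


def should_write_allocate(address: int,
                          mechanisms: Iterable[Callable[[int], bool]]) -> bool:
    """Return True if at least one mechanism flags the write as cacheable."""
    return any(mechanism(address) for mechanism in mechanisms)
```

With no mechanisms supplied, `any` over an empty sequence is `False`, so this model performs no write allocate.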

Write to a Cacheable Page

Every time the processor performs a cache line fill, the address of the page in which the cache line resides is saved in the Cacheability Control Register (CCR). The page address of subsequent write cycles is compared with the page address stored in the CCR. If the two addresses are equal, then the processor performs a write allocate because the page has already been determined to be cacheable.
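The page comparison can be modeled in a few lines. This is a sketch only: the class and method names are hypothetical, and the 4-Kbyte page size (`PAGE_SHIFT = 12`) is an assumption, since the text above does not state the page size.

```python
PAGE_SHIFT = 12  # assumption: 4-Kbyte pages; not stated in the text


class CacheabilityControlModel:
    """Hypothetical software model of the CCR page-address comparison."""

    def __init__(self) -> None:
        self.saved_page = None  # no line fill has been recorded yet

    def record_line_fill(self, address: int) -> None:
        """Save the page address of the most recent cache line fill."""
        self.saved_page = address >> PAGE_SHIFT

    def write_hits_cacheable_page(self, address: int) -> bool:
        """True if the write falls in the page saved by the last line fill."""
        return self.saved_page is not None and (address >> PAGE_SHIFT) == self.saved_page
```

After `record_line_fill(0x5020)`, a write to `0x5FFC` matches the saved page and would trigger a write allocate in this model, while a write to `0x6000` would not.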