
AMD AMD-K6-2/400 User Guide - Page 237





AMD-K6™-2E+ Embedded Processor Data Sheet
23542A/0, September 2000, Preliminary Information
Chapter 9, Cache Organization, page 215

L1 instruction-cache lines and L2 cache lines are replaced using a Least Recently Used (LRU) algorithm. If a line replacement is required, lines are replaced when read cache misses occur.

The L1 data cache uses a slightly different approach to line replacement. If a miss occurs, and a replacement is required, lines are replaced by using a Least Recently Allocated (LRA) algorithm.
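The difference between the two policies can be illustrated with a small model. This is only a sketch: the `CacheSet` class, the two-way set, and the access trace are hypothetical, and it assumes that LRA orders lines by allocation time alone, so that a hit does not refresh a line's age. It does not model the processor's actual cache geometry.

```python
class CacheSet:
    """One cache set with a fixed number of ways (illustrative model only)."""

    def __init__(self, ways, refresh_on_hit):
        self.ways = ways
        self.refresh_on_hit = refresh_on_hit  # True models LRU, False models LRA
        self.order = []  # resident tags, oldest first

    def access(self, tag):
        """Access a tag; return the evicted tag, or None if nothing was evicted."""
        if tag in self.order:
            if self.refresh_on_hit:
                # LRU: a hit makes this line the most recently used.
                self.order.remove(tag)
                self.order.append(tag)
            # LRA (assumed behavior): a hit leaves the allocation order unchanged.
            return None
        victim = None
        if len(self.order) == self.ways:
            victim = self.order.pop(0)  # replace the oldest entry
        self.order.append(tag)
        return victim


def run(refresh_on_hit, trace, ways=2):
    """Replay a trace of tags and collect the eviction made by each access."""
    cache_set = CacheSet(ways, refresh_on_hit)
    return [cache_set.access(tag) for tag in trace]


trace = ["A", "B", "A", "C"]
lru_evictions = run(True, trace)
lra_evictions = run(False, trace)
print("LRU evicts:", lru_evictions[-1])
print("LRA evicts:", lra_evictions[-1])
```

In this trace the re-use of line A protects it under LRU, so B is replaced, while under the assumed LRA behavior A is still the oldest allocation and is replaced instead.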

9.8 Write Allocate

Write allocate, if enabled, occurs when the processor has a pending memory write cycle to a cacheable line and the line does not currently reside in the L1 data cache. If the line does not exist in the L2 cache, the processor performs a 32-byte burst read cycle on the system bus to fetch the data-cache line addressed by the pending write cycle. If the line does exist in the L2 cache, the data is supplied directly from the L2 cache, in which case a system bus cycle is not executed. The data associated with the pending write cycle is merged with the recently allocated data-cache line and stored in the processor’s L1 data cache. If the data-cache line was fetched from memory (because of an L2 cache miss), the data is stored, without modification, in the L2 cache.

The final MESI state of the cache lines depends on the state of the WB/WT# and PWT signals during the burst read cycle and the subsequent L1 data cache write hit (see Table 39 on page 221 to determine the cache-line states and the access types following a cache write miss). If the L1 data cache line is stored in the modified state, then the same cache line is stored in the L2 cache in the exclusive state. If the L1 data cache line is stored in the shared state, then the same cache line is stored in the L2 cache in the shared state.
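The merge step described above can be sketched in software terms. The function name `merge_write`, the example address, and the byte values are hypothetical; only the 32-byte line size and the rule that the L2 copy of a line fetched from memory is stored unmodified come from the text.

```python
LINE_SIZE = 32  # bytes per cache line, per the text


def merge_write(line, address, data):
    """Return a copy of the fetched line with the pending write data merged in."""
    if len(line) != LINE_SIZE:
        raise ValueError("expected a full 32-byte line")
    offset = address % LINE_SIZE  # position of the write inside the line
    if offset + len(data) > LINE_SIZE:
        raise ValueError("write crosses a cache-line boundary")
    merged = bytearray(line)
    merged[offset:offset + len(data)] = data
    return bytes(merged)


# Line contents as fetched by the burst read (placeholder values 0..31).
fetched = bytes(range(LINE_SIZE))

# The L1 data cache receives the line with the pending write merged in.
l1_line = merge_write(fetched, 0x1004, b"\xAA\xBB\xCC\xDD")

# After an L2 miss, the L2 cache receives the line exactly as fetched.
l2_line = fetched

print(l1_line.hex())
print(l2_line.hex())
```

Only the four bytes at offset 4 differ between the two copies, which is consistent with the L1 line ending up modified relative to the copy held in the L2 cache.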

If a data-cache line fetch from memory is attempted because the write allocate misses the L2 cache, and KEN# is sampled negated, the processor does not perform an allocation. In this case, the pending write cycle is executed as a single write cycle on the system bus.
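The conditions described in this section can be collected into one decision function. This is a sketch of the prose rather than of the hardware: the function name, the argument names, and the dictionary layout are hypothetical, and the case where write allocate does not apply is simply reported as `None` because the text does not describe it here.

```python
def write_allocate_outcome(enabled, cacheable, in_l1, in_l2, ken_asserted):
    """Summarize what happens to one pending memory write, per the text above."""
    if not (enabled and cacheable and not in_l1):
        # The conditions for a write allocate are not met.
        return None
    if in_l2:
        # The L2 cache supplies the line, so no system bus cycle is executed.
        return {"bus_cycle": None, "allocated": True, "source": "L2"}
    if not ken_asserted:
        # KEN# sampled negated: no allocation, the write goes out on its own.
        return {"bus_cycle": "single write", "allocated": False, "source": None}
    # L2 miss with KEN# asserted: fetch the whole line from memory.
    return {"bus_cycle": "32-byte burst read", "allocated": True, "source": "memory"}


# L2 state that accompanies each final L1 state named in the text.
L2_STATE_FOR_L1_STATE = {"modified": "exclusive", "shared": "shared"}

outcome = write_allocate_outcome(
    enabled=True, cacheable=True, in_l1=False, in_l2=False, ken_asserted=True
)
print(outcome["bus_cycle"])
```

Keeping the state mapping in a separate table reflects that the final MESI states depend on WB/WT# and PWT, which this sketch does not model.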

During write allocates that miss the L2 cache, a 32-byte burst read cycle is executed in place of a non-burst write cycle. While the burst read cycle generally takes longer to execute than the non-burst write cycle, performance gains are realized on subsequent write cycle hits to the write-allocated cache line.
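The trade-off can be made concrete with a rough bus-clock count. Everything numeric here is a placeholder: the clock counts of 5 and 2 are hypothetical values chosen for illustration, not figures from this data sheet, and the model assumes the allocated line ends in a writeback state so that later write hits stay off the bus. It also ignores the eventual writeback of the modified line.

```python
def bus_clocks(writes_to_line, burst_read_clocks, single_write_clocks, allocate):
    """Bus clocks spent on a run of writes to one line (simplified model)."""
    if allocate:
        # One burst read fills the line; the later writes hit in the L1 cache.
        return burst_read_clocks
    # Without allocation, every write is a separate bus cycle.
    return writes_to_line * single_write_clocks


def break_even_writes(burst_read_clocks, single_write_clocks):
    """Smallest number of writes for which allocating uses fewer bus clocks."""
    return burst_read_clocks // single_write_clocks + 1


BURST_READ_CLOCKS = 5    # hypothetical placeholder
SINGLE_WRITE_CLOCKS = 2  # hypothetical placeholder

for n in (1, 2, 3, 4):
    with_alloc = bus_clocks(n, BURST_READ_CLOCKS, SINGLE_WRITE_CLOCKS, True)
    without = bus_clocks(n, BURST_READ_CLOCKS, SINGLE_WRITE_CLOCKS, False)
    print(n, "writes:", with_alloc, "clocks with allocate,", without, "without")

threshold = break_even_writes(BURST_READ_CLOCKS, SINGLE_WRITE_CLOCKS)
print("allocation pays off from", threshold, "writes to the same line")
```

With these placeholder values a single write is cheaper without allocation, and allocation wins once a line receives three or more writes.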