Contents

Revision History .... 6
1 Preface .... 7
2 Microarchitecture of the Family 16h Processor .... 8
  2.1 Features .... 8
  2.2 Instruction Decomposition .... 10
  2.3 Superscalar Organization .... 10
  2.4 Processor Block Diagram .... 11
  2.5 Processor Cache Operations .... 11
    2.5.1 L1 Instruction Cache .... 12
    2.5.2 L1 Data Cache .... 12
    2.5.3 L2 Cache .... 12
  2.6 Memory Address Translation .... 13
    2.6.1 L1 Translation Lookaside Buffers .... 13
    2.6.2 L2 Translation Lookaside Buffers .... 13
    2.6.3 Hardware Page Table Walker .... 13
  2.7 Optimizing Branching .... 13
    2.7.1 Branch Prediction .... 13
    2.7.2 Loop Alignment .... 16
  2.8 Instruction Fetch and Decode .... 18
  2.9 Integer Unit .... 18
    2.9.1 Integer Schedulers .... 18
    2.9.2 Integer Execution Units .... 18
    2.9.3 Retire Control Unit .... 19
  2.10 Floating-Point Unit .... 19
    2.10.1 Denormals .... 21
  2.11 XMM Register Merge Optimization .... 22
  2.12 Load Store Unit .... 23
Appendix A Instruction Latencies .... 24
  A.1 Instruction Latency Assumptions .... 24
  A.2 Spreadsheet Column Descriptions .... 24

Software Optimization Guide for AMD Family 16h Processors
52128 Rev. 1.1 March 2013