AMD OS1354WBJ4BGHBOX Optimization Guide - Page 25


Page 25 highlights

52128 Rev. 1.1 March 2013
Software Optimization Guide for AMD Family 16h Processors

Columns B-E: Opn
Instruction operands. The following notations are used in these columns:
• imm - an immediate operand (value range left unspecified)
• imm8 - an 8-bit immediate operand
• m - an 8, 16, 32 or 64-bit memory operand (128 and 256 bit memory operands are always explicitly specified as m128 or m256)
• mm - any 64-bit MMX register
• mN - an N-bit memory operand
• r - any general purpose (integer) register
• rN - an N-bit general purpose register
• xmmN - any xmm register; the N distinguishes among multiple operands of the same type
• ymmN - any ymm register; the N distinguishes among multiple operands of the same type
A slash denotes an alternative; for example, m64/m32 is a 32-bit or 64-bit memory operand.
The notation "<xmm0>" denotes that the register xmm0 is an implicit operand of the instruction.

Column F: Cpuid flag
CPUID feature flag for the instruction.

Column G: Macro Ops
Number of macro-ops for the instruction. Any number greater than 2 implies that the instruction is microcoded, with the given number of macro-ops in the micro-program. If the entry in this column is simply 'ucode', then the instruction is microcoded, but the exact number of macro-ops either has not been determined or is variable.

Column H: Unit
Execution units. The following abbreviations are used:
• ALU - Arithmetic / logical unit.
• FPA - Floating-point add functional element within the floating-point cluster of the floating-point unit.
• FPM - Floating-point multiply functional element in the floating-point cluster of the floating-point unit.
• DIV - Integer divide functional element within the integer unit.
• MUL - Integer multiply functional element within the integer unit.
• SAGU - Store address generation unit within the integer unit.
• STC - Store/convert functional element in the store/convert cluster of the floating-point unit.
• VALU - Either of the vector ALUs (VALU0 or VALU1) within the integer cluster of the floating-point unit.
• VIMUL - Vector integer multiply functional element within the integer cluster of the floating-point unit.
• ST - Store unit.
In this column, a vertical bar indicates that the instruction can use either of two alternative resources. A comma indicates that both of the comma-separated resources are required.

A number of instructions are floating-point load-ops, which combine a transfer of data from the integer unit to the floating-point unit with a floating-point operation. This transfer is implemented by storing the data from the integer unit to a private scratch memory location, then loading it back into the floating-point unit. The Unit column indicates this with "ST,LD-fpunit", where fpunit is the floating-point unit required for the load-op.
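The conventions for the Macro Ops and Unit columns can be sketched as a few small Python helpers. This is only an illustration of how the notation reads, not something taken from the guide: the function names (parse_unit, load_op_fpunit, classify_macro_ops) are hypothetical, the sample entries are invented for demonstration rather than copied from the latency tables, and the sketch assumes that a comma separates required resources at the top level while a vertical bar separates alternatives within one resource.

```python
def parse_unit(entry):
    """Split a Unit-column entry into required resources.

    Returns a list with one item per required resource; each item is
    the list of alternatives that can satisfy that requirement.
    Assumption: ',' binds more loosely than '|'.
    """
    required = []
    for part in entry.split(","):            # comma: every part is required
        alternatives = [alt.strip() for alt in part.split("|")]  # bar: either one
        required.append(alternatives)
    return required


def load_op_fpunit(entry):
    """Return the fpunit text of an "ST,LD-fpunit" entry, else None."""
    prefix = "ST,LD-"
    compact = entry.replace(" ", "")         # tolerate spaces after the comma
    if compact.startswith(prefix) and len(compact) > len(prefix):
        return compact[len(prefix):]
    return None


def classify_macro_ops(entry):
    """Return (microcoded, count) for a Macro Ops entry.

    A bare 'ucode' entry is microcoded with an unknown or variable
    count (None); a number greater than 2 is microcoded with that count.
    """
    text = entry.strip().lower()
    if text == "ucode":
        return True, None
    count = int(text)
    return count > 2, count


# Invented sample entries, for illustration only.
print(parse_unit("FPA|FPM"))         # one resource, two alternatives
print(parse_unit("ALU,SAGU"))        # two resources, both required
print(load_op_fpunit("ST,LD-FPA"))   # floating-point unit of a load-op
print(classify_macro_ops("2"))
print(classify_macro_ops("5"))
print(classify_macro_ops("ucode"))
```

Returning a list of alternatives per required resource keeps the two separators distinct, so a caller can tell "either of two units" apart from "both units" without re-parsing the string.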

