小辉程序员之路, since 1996 http://www.xiaohui.com
乐走天涯: 工作并快乐着,职业并休闲着
 » 首页 > MMX 优化: How to optimize for the Pentium family of microprocessors

28.1 Integer instructions


http://www.XiaoHui.com 日期: 2000-04-03 13:00

28. List of instruction timings for PPlain and PMMX

28.1 Integer instructions

Explanations:

Operands:

r = register, m = memory, i = immediate data, sr = segment register

m32 = 32 bit memory operand, etc.

Clock cycles:

The numbers are minimum values. Cache misses, misalignment, and exceptions may increase the clock counts considerably.

Pairability:

u = pairable in u-pipe, v = pairable in v-pipe, uv = pairable in either pipe, np = not pairable.

Instruction Operands Clock cycles Pairability
NOP1uv
MOVr/m, r/m/i1uv
MOVr/m, sr1np
MOVsr , r/m>= 2 b)np
MOVm , accum1uv h)
XCHG(E)AX, r2np
XCHGr , r3np
XCHGr , m>15np
XLAT4np
PUSHr/i1uv
POPr1uv
PUSHm2np
POPm3np
PUSHsr1 b)np
POPsr>= 3 b)np
PUSHF3-5np
POPF4-6np
PUSHA POPA5-9 i)np
PUSHAD POPAD5np
LAHF SAHF2np
MOVSX MOVZXr , r/m3 a)np
LEAr , m1uv
LDS LES LFS LGS LSSm4 c)np
ADD SUB AND OR XORr , r/i1uv
ADD SUB AND OR XORr , m2uv
ADD SUB AND OR XORm , r/i3uv
ADC SBBr , r/i1u
ADC SBBr , m2u
ADC SBBm , r/i3u
CMPr , r/i1uv
CMPm , r/i2uv
TESTr , r1uv
TESTm , r2uv
TESTr , i1f)
TESTm , i2np
INC DECr1uv
INC DECm3uv
NEG NOTr/m1/3np
MUL IMULr8/r16/m8/m1611np
MUL IMULall other versions9 d)np
DIVr8/m817np
DIVr16/m1625np
DIVr32/m3241np
IDIVr8/m822np
IDIVr16/m1630np
IDIVr32/m3246np
CBW CWDE3np
CWD CDQ2np
SHR SHL SAR SALr , i1u
SHR SHL SAR SALm , i3u
SHR SHL SAR SALr/m, CL4/5np
ROR ROL RCR RCLr/m, 11/3u
ROR ROLr/m, i(><1)1/3np
ROR ROLr/m, CL4/5np
RCR RCLr/m, i(><1)8/10np
RCR RCLr/m, CL7/9np
SHLD SHRDr, i/CL4 a)np
SHLD SHRDm, i/CL5 a)np
BTr, r/i4 a)np
BTm, i4 a)np
BTm, i9 a)np
BTR BTS BTCr, r/i7 a)np
BTR BTS BTCm, i8 a)np
BTR BTS BTCm, r14 a)< /TD>np
BSF BSRr , r/m7-73 a)np
SETccr/m1/2 a)np
JMP CALLshort/near1 e)v
JMP CALLfar>= 3 e)np
conditional jumpshort/near1/4/5/6 e)v
CALL JMPr/m2/5 enp
RETN2/5 enp
RETNi3/6 e)np
RETF4/7 e)np
RETFi5/8 e)np
J(E)CXZshort4-11 e)np
LOOPshort5-10 e)np
BOUNDr , m8np
CLC STC CMC CLD STD2np
CLI STI6-9np
LODS2np
REP LODS7+3*n g)np
STOS3np
REP STOS10+n g)np
MOVS4np
REP MOVS12+n g)np
SCAS4np
REP(N)E SCAS9+4*n g)np
CMPS5np
REP(N)E CMPS8+4*n g)np
BSWAP1 a)np
CPUID13-16 a)np
RDTSC6-13 a) j)np

Notes:

a) this instruction has a 0FH prefix which takes one clock cycle extra to decode on a PPlain unless preceded by a multicycle instruction (see chapter 12).

b) versions with FS and GS have a 0FH prefix. see note a.

c) versions with SS, FS, and GS have a 0FH prefix. see note a.

d) versions with two operands and no immediate have a 0FH prefix, see note a.

e) see chapter 22

f) only pairable if register is accumulator. see chapter 26.14.

g) add one clock cycle for decoding the repeat prefix unless preceded by a multicycle instruction (such as CLD. see chapter 12).

h) pairs as if it were writing to the accumulator. see chapter 26.14.

i) 9 if SP divisible by 4. See 10.2

j) on PPlain: 6 in priviledged or real mode, 11 in nonpriviledged, error in virtual mode. On PMMX: 8 and 13 clocks respectively.

Tags: MMX 优化 | Integer



 文章评论

目前没有任何评论.

↓ 快抢占第1楼,发表你的评论和意见 ↓
 
发表你的评论
如果你想针对此文发表评论, 请填写下列表单:
姓名: * 必填
E-mail: 可选 (不会被公开)
反垃圾广告: 为了防止广告机器人自动发贴, 请计算下列表达式的值:
4 + 14 = * 必填
评论内容:
* 必填
你可以使用下列标签修饰文字:
[b] 文字 [/b]: 加粗文字
[quote] 文字 [/quote]: 引用文字

 

小辉程序员之路 建站于 1997 ◇ 做一名最好的开发者是我不变的理想……
Copyright(C) 1997-2008 XiaoHui.com   All rights reserved
声明:站内所有原创文字,未经许可,均可转载、复制。
转载时必须以链接形式注明作者和原始出处