# 28.2 Floating point instructions

Explanations:

Operands:

r = register, m = memory, m32 = 32 bit memory operand, etc.

Clock cycles:

The numbers are minimum values. Cache misses, misalignment, denormal operands, and exceptions may increase the clock counts considerably.

Pairability:

+ = pairable with FXCH, np = not pairable with FXCH.

i-ov:

Overlap with integer instructions. i-ov = 4 means that the last four clock cycles can overlap with subsequent integer instructions.

fp-ov:

Overlap with floating point instructions. fp-ov = 2 means that the last two clock cycles can overlap with subsequent floating point instructions. (WAIT is considered a floating point instruction here.)
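As a rough illustration of how the overlap columns can be read, the sketch below estimates the clock count of one floating point instruction followed by a block of independent work. It is a simplified model, not a description of the hardware: the function name `combined_clocks` and its parameters are made up for this example, and it ignores pairing, data dependencies, and the restriction in note o).

```c
#include <assert.h>

/*
 * Simplified estimate (an assumption of this sketch, not a hardware rule):
 * an FP instruction takes `fp_clocks`, and the last `overlap` of those
 * clocks may run in parallel with the following `next_clocks` of work.
 */
int combined_clocks(int fp_clocks, int overlap, int next_clocks)
{
    assert(fp_clocks >= 0 && next_clocks >= 0);
    assert(overlap >= 0 && overlap <= fp_clocks);

    /* The following work can hide at most `overlap` clocks. */
    int hidden = next_clocks < overlap ? next_clocks : overlap;
    return fp_clocks + next_clocks - hidden;
}
```

Under this model, a 39-clock FDIV with i-ov = 38 followed by 20 clocks of integer code costs 39 clocks in total, while the same FDIV followed by 50 clocks of integer code costs 51.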

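| Instruction | Operand | Clock cycles | Pairability | i-ov | fp-ov |
|---|---|---|---|---|---|
| FLD | r/m32/m64 | 1 | + | 0 | 0 |
| FLD | m80 | 3 | np | 0 | 0 |
| FBLD | m80 | 48-58 | np | 0 | 0 |
| FST(P) | r | 1 | np | 0 | 0 |
| FST(P) | m32/m64 | 2 m) | np | 0 | 0 |
| FST(P) | m80 | 3 m) | np | 0 | 0 |
| FBSTP | m80 | 148-154 | np | 0 | 0 |
| FILD | m | 3 | np | 2 | 2 |
| FIST(P) | m | 6 | np | 0 | 0 |
| FLDZ FLD1 | | 2 | np | 0 | 0 |
| FLDPI FLDL2E etc. | | 5 s) | np | 2 | 2 |
| FNSTSW | AX/m16 | 6 q) | np | 0 | 0 |
| FLDCW | m16 | 8 | np | 0 | 0 |
| FNSTCW | m16 | 2 | np | 0 | 0 |
| FADD(P) | r/m | 3 | + | 2 | 2 |
| FSUB(R)(P) | r/m | 3 | + | 2 | 2 |
| FMUL(P) | r/m | 3 | + | 2 | 2 n) |
| FDIV(R)(P) | r/m | 19/33/39 p) | + | 38 o) | 2 |
| FCHS FABS | | 1 | + | 0 | 0 |
| FCOM(P)(P) FUCOM | r/m | 1 | + | 0 | 0 |
| FIADD FISUB(R) | m | 6 | np | 2 | 2 |
| FIMUL | m | 6 | np | 2 | 2 |
| FIDIV(R) | m | 22/36/42 p) | np | 38 o) | 2 |
| FICOM | m | 4 | np | 0 | 0 |
| FTST | | 1 | np | 0 | 0 |
| FXAM | | 17-21 | np | 4 | 0 |
| FPREM | | 16-64 | np | 2 | 2 |
| FPREM1 | | 20-70 | np | 2 | 2 |
| FRNDINT | | 9-20 | np | 0 | 0 |
| FSCALE | | 20-32 | np | 5 | 0 |
| FXTRACT | | 12-66 | np | 0 | 0 |
| FSQRT | | 70 | np | 69 o) | 2 |
| FSIN FCOS | | 65-100 r) | np | 2 | 2 |
| FSINCOS | | 89-112 r) | np | 2 | 2 |
| F2XM1 | | 53-59 r) | np | 2 | 2 |
| FYL2X | | 103 r) | np | 2 | 2 |
| FYL2XP1 | | 105 r) | np | 2 | 2 |
| FPTAN | | 120-147 r) | np | 36 o) | 0 |
| FPATAN | | 112-134 r) | np | 2 | 2 |
| FNOP | | 1 | np | 0 | 0 |
| FXCH | r | 1 | np | 0 | 0 |
| FINCSTP FDECSTP | | 2 | np | 0 | 0 |
| FFREE | r | 2 | np | 0 | 0 |
| FNCLEX | | 6-9 | np | 0 | 0 |
| FNINIT | | 12-22 | np | 0 | 0 |
| FNSAVE | m | 124-300 | np | 0 | 0 |
| FRSTOR | m | 70-95 | np | 0 | 0 |
| WAIT | | 1 | np | 0 | 0 |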

Notes:

m) The value to store is needed one clock cycle in advance.

n) 1 if the overlapping instruction is also an FMUL.

o) Cannot overlap integer multiplication instructions.

p) FDIV takes 19, 33, or 39 clock cycles for 24, 53, and 64 bit precision respectively. FIDIV takes 3 clocks more. The precision is defined by bits 8-9 of the floating point control word.
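To make note p) concrete, the following sketch decodes the precision-control field (bits 8-9) of a control word value and looks up the corresponding clock count. The function name `fdiv_clocks` and the `is_fidiv` flag are hypothetical. The sketch assumes the usual x87 encoding of the field (00b = 24 bit, 10b = 53 bit, 11b = 64 bit, 01b reserved), and the clock counts are simply the ones listed in the note.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map an FPU control word to the FDIV clock count
 * from note p). Returns -1 for the reserved precision encoding 01b.
 */
int fdiv_clocks(uint16_t control_word, int is_fidiv)
{
    int pc = (control_word >> 8) & 0x3;  /* precision control, bits 8-9 */
    int clocks;

    switch (pc) {
    case 0:  clocks = 19; break;         /* 24 bit precision */
    case 2:  clocks = 33; break;         /* 53 bit precision */
    case 3:  clocks = 39; break;         /* 64 bit precision */
    default: return -1;                  /* 01b is reserved */
    }
    return is_fidiv ? clocks + 3 : clocks;  /* FIDIV takes 3 clocks more */
}
```

For example, the control word 037Fh loaded by FNINIT selects 64 bit precision and gives 39 clocks, whereas a control word of 027Fh selects 53 bit precision and gives 33 clocks (36 for FIDIV).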

q) The first 4 clock cycles can overlap with preceding integer instructions. See chapter 26.7.

r) Clock counts are typical. Trivial cases may be faster, extreme cases may be slower.

s) May be up to 3 clocks more when the output is needed for FST, FCHS, or FABS.

Article comments

1: #include <stdio.h>
2:
3: int main()
4: {
004106B0 55 push ebp
004106B1 8B EC mov ebp,esp
004106B3 83 EC 4C sub esp,4Ch
004106B6 53 push ebx
004106B7 56 push esi
004106B8 57 push edi
004106B9 8D 7D B4 lea edi,[ebp-4Ch]
004106BC B9 13 00 00 00 mov ecx,13h
004106C1 B8 CC CC CC CC mov eax,0CCCCCCCCh
004106C6 F3 AB rep stos dword ptr [edi]
5: int x=12345;
004106C8 C7 45 FC 39 30 00 00 mov dword ptr [ebp-4],3039h
6:
7: float a=3458764513820540927.0;
004106CF C7 45 F8 00 00 40 5E mov dword ptr [ebp-8],5E400000h
8:
9: int c=123;
004106D6 C7 45 F4 7B 00 00 00 mov dword ptr [ebp-0Ch],7Bh
10: c=a;
004106DD D9 45 F8 fld dword ptr [ebp-8]
004106E0 E8 9F FF FF FF call __ftol (00410684)
004106E5 89 45 F4 mov dword ptr [ebp-0Ch],eax
11: return 0;
004106E8 33 C0 xor eax,eax
12: }
004106EA 5F pop edi
004106EB 5E pop esi
004106EC 5B pop ebx
004106ED 8B E5 mov esp,ebp
004106EF 5D pop ebp
004106F0 C3 ret

__ftol:
00410684 55 push ebp
00410685 8B EC mov ebp,esp
00410687 83 C4 F4 add esp,0F4h
0041068A 9B wait
0041068B D9 7D FE fnstcw word ptr [ebp-2] ;FNSTCW stores the FPU control word to [ebp-2] without checking for pending unmasked floating point exceptions
0041068E 9B wait
0041068F 66 8B 45 FE mov ax,word ptr [ebp-2]
00410693 80 CC 0C or ah,0Ch ;set bits 10-11 of the control word (rounding control) to 11b: round toward zero
00410696 66 89 45 FC mov word ptr [ebp-4],ax
0041069A D9 6D FC fldcw word ptr [ebp-4]
0041069D DF 7D F4 fistp qword ptr [ebp-0Ch]
004106A0 D9 6D FE fldcw word ptr [ebp-2]
004106A3 8B 45 F4 mov eax,dword ptr [ebp-0Ch]
004106A6 8B 55 F8 mov edx,dword ptr [ebp-8]
004106A9 C9 leave
004106AA C3 ret
004106AB CC int 3
004106AC CC int 3
004106AD CC int 3
004106AE CC int 3
004106AF CC int 3
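The listing can be followed more easily with a small C model of what it does. The sketch below is only an approximation written for this purpose: the helper names `float_bits`, `cw_with_truncation`, and `ftol_low_dword` are hypothetical, and the model is valid only for values that fit in a signed 64-bit integer. It mirrors three steps: the constant 5E400000h stored for `a`, the `or ah,0Ch` that forces round-toward-zero, and the `fistp qword` whose low dword is returned in EAX.

```c
#include <stdint.h>
#include <string.h>

/* Raw IEEE 754 single precision bit pattern of f. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Control word with rounding control (bits 10-11) forced to 11b,
 * which is what `or ah,0Ch` does to the high byte. */
uint16_t cw_with_truncation(uint16_t cw)
{
    return (uint16_t)(cw | 0x0C00u);
}

/* Model of the listing's __ftol (an assumption of this sketch):
 * truncate to a 64-bit integer as FISTP qword does under truncation,
 * then keep the low 32 bits, as `mov eax,dword ptr [ebp-0Ch]` does.
 * Only meaningful when f fits in int64_t. */
uint32_t ftol_low_dword(float f)
{
    int64_t wide = (int64_t)f;  /* C conversion truncates toward zero */
    return (uint32_t)((uint64_t)wide & 0xFFFFFFFFu);
}
```

The literal 3458764513820540927.0 rounds to 1.5 × 2^61 as a float, which is the bit pattern 5E400000h seen in the listing. As a 64-bit integer that value is 3000000000000000h, so the low dword that ends up in `c` is 0. In standard C, converting a float that is out of range for `int` is undefined, so this result reflects what this particular compiler's helper does rather than anything the language guarantees.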


Copyright © XiaoHui.com. All rights reserved.