
26.7 FCOM + FSTSW AX (all processors)


Date: 2000-04-02 15:00


The FNSTSW instruction is very slow on all processors. The PPro, PII and PIII processors have FCOMI instructions to avoid the slow FNSTSW. Using FCOMI instead of the common sequence FCOM / FNSTSW AX / SAHF will save you 8 clock cycles. You should therefore use FCOMI to avoid FNSTSW wherever possible, even in cases where it costs some extra code.

On processors without FCOMI instructions, the usual way of doing floating point comparisons is:

    FLD [a]
    FCOMP [b]
    FSTSW AX
    SAHF
    JB ASmallerThanB

You may improve this code by using FNSTSW AX rather than FSTSW AX and by testing AH directly rather than using the non-pairable SAHF (TASM version 3.0 has a bug with the FNSTSW AX instruction):

    FLD [a]
    FCOMP [b]
    FNSTSW AX
    SHR AH,1
    JC ASmallerThanB

Testing for zero or equality:

    FTST
    FNSTSW AX
    AND AH,40H
    JNZ IsZero            ; (the zero flag is inverted!)

Test if greater:

    FLD [a]
    FCOMP [b]
    FNSTSW AX
    AND AH,41H
    JZ AGreaterThanB

Do not use TEST AH,41H as it is not pairable on PPlain and PMMX.
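The AH tests above rely on where the FPU condition codes land in the status word: after FNSTSW AX, C0 is bit 0 of AH, C2 is bit 2 and C3 is bit 6. The following is a minimal C sketch that models this, not real FPU code. The helper names (`ah_after_fcomp`, `a_smaller_than_b`, `a_equals_b`, `a_greater_than_b`, `self_check`) are hypothetical and introduced only for illustration, and the model assumes the invalid-operation exception is masked so that an unordered compare (a NaN operand) simply sets C3, C2 and C0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: condition-code bits of AH after
   FLD [a] / FCOMP [b] / FNSTSW AX.
   AH bit 0 = C0, bit 2 = C2, bit 6 = C3. Other bits are ignored here. */
static uint8_t ah_after_fcomp(double a, double b)
{
    if (a != a || b != b) return 0x45; /* unordered: C3 = C2 = C0 = 1 */
    if (a > b)            return 0x00; /* greater:   all clear        */
    if (a < b)            return 0x01; /* smaller:   C0 = 1           */
    return 0x40;                       /* equal:     C3 = 1           */
}

/* SHR AH,1 / JC: shifting right moves bit 0 (C0) into the carry flag. */
static int a_smaller_than_b(double a, double b)
{
    return ah_after_fcomp(a, b) & 0x01;
}

/* AND AH,40H / JNZ: a nonzero result means C3 was set. */
static int a_equals_b(double a, double b)
{
    return (ah_after_fcomp(a, b) & 0x40) != 0;
}

/* AND AH,41H / JZ: a zero result means both C3 and C0 are clear. */
static int a_greater_than_b(double a, double b)
{
    return (ah_after_fcomp(a, b) & 0x41) == 0;
}

/* Small usage example for the model. */
static void self_check(void)
{
    assert(a_smaller_than_b(1.0, 2.0));
    assert(a_equals_b(3.0, 3.0));
    assert(a_greater_than_b(2.0, 1.0));
    assert(!a_greater_than_b(1.0, 2.0));
}
```

Note that in this model an unordered result also satisfies the "smaller" and "equal" tests, because C0 and C3 are both set. Only the 41H mask rejects it.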

On the PPlain and PMMX, the FNSTSW instruction takes 2 clocks, but it is delayed for an additional 4 clocks after any floating point instruction because it is waiting for the status word to retire from the pipeline. This delay comes even after FNOP which cannot change the status word, but not after integer instructions. You can fill the latency between FCOM and FNSTSW with integer instructions taking up to four clock cycles. A paired FXCH immediately after FCOM doesn't delay the FNSTSW, not even if the pairing is imperfect:

    FCOM                   ; clock 1
    FXCH                   ; clock 1-2 (imperfect pairing)
    INC DWORD PTR [EBX]    ; clock 3-5
    FNSTSW AX              ; clock 6-7

You may want to use FCOM rather than FTST here because FTST is not pairable. Remember to include the N in FNSTSW. FSTSW (without N) has a WAIT prefix which delays it further.

It is sometimes faster to use integer instructions for comparing floating point values, as described in chapter 27.6.
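Chapter 27.6 is not reproduced here, so the following C sketch only illustrates the general idea and should not be read as the method given there. It assumes `float` is a 32-bit IEEE-754 single. The helper names (`float_bits`, `less_nonneg`, `is_zero`) are hypothetical. The ordering trick is valid only when both operands are non-negative and neither is a NaN. Under those conditions the bit patterns sort in the same order as the values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the raw bits of a float into an integer.
   memcpy avoids the undefined behavior of pointer casts. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* a < b using an integer compare.
   Assumption: a >= 0, b >= 0, and neither operand is a NaN. */
static int less_nonneg(float a, float b)
{
    return float_bits(a) < float_bits(b);
}

/* Zero test using integer instructions only: mask off the sign bit
   so that +0.0 and -0.0 are both recognized as zero. */
static int is_zero(float f)
{
    return (float_bits(f) & 0x7FFFFFFFu) == 0;
}
```

Operands that may be negative need extra handling of the sign bit, which this sketch deliberately leaves out.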
