小辉程序员之路, since 1996 http://www.xiaohui.com
乐走天涯: 工作并快乐着,职业并休闲着
 » 首页 > MMX 优化: How to optimize for the Pentium family of microprocessors

26.15 Bit scan (PPlain and PMMX)


http://www.XiaoHui.com 日期: 2000-04-02 14:00

26.15 Bit scan (PPlain and PMMX)

BSF and BSR are the poorest optimized instructions on the PPlain and PMMX, taking approximately 11 + 2*n clock cycles, where n is the number of zeros skipped.

The following code emulates BSR ECX,EAX:

TEST EAX,EAX JZ SHORT BS1 MOV DWORD PTR [TEMP],EAX MOV DWORD PTR [TEMP+4],0 FILD QWORD PTR [TEMP] FSTP QWORD PTR [TEMP] WAIT ; WAIT only needed for compatibility with old 80287 processor MOV ECX, DWORD PTR [TEMP+4] SHR ECX,20 ; isolate exponent SUB ECX,3FFH ; adjust TEST EAX,EAX ; clear zero flag BS1:

The following code emulates BSF ECX,EAX:

TEST EAX,EAX JZ SHORT BS2 XOR ECX,ECX MOV DWORD PTR [TEMP+4],ECX SUB ECX,EAX AND EAX,ECX MOV DWORD PTR [TEMP],EAX FILD QWORD PTR [TEMP] FSTP QWORD PTR [TEMP] WAIT ; WAIT only needed for compatibility with old 80287 processor MOV ECX, DWORD PTR [TEMP+4] SHR ECX,20 SUB ECX,3FFH TEST EAX,EAX ; clear zero flag BS2:

These emulation codes should not be used on the PPro, PII and PIII, where the bit scan instructions take only 1 or 2 clocks, and where the emulation codes shown above have two partial memory stalls.

Tags: MMX 优化 | Bit



 文章评论

目前没有任何评论.

↓ 快抢占第1楼,发表你的评论和意见 ↓
 
发表你的评论
如果你想针对此文发表评论, 请填写下列表单:
姓名: * 必填
E-mail: 可选 (不会被公开)
反垃圾广告: 为了防止广告机器人自动发贴, 请计算下列表达式的值:
1 + 11 = * 必填
评论内容:
* 必填
你可以使用下列标签修饰文字:
[b] 文字 [/b]: 加粗文字
[quote] 文字 [/quote]: 引用文字

 

小辉程序员之路 建站于 1997 ◇ 做一名最好的开发者是我不变的理想……
Copyright(C) 1997-2009 XiaoHui.com   All rights reserved
声明:站内所有原创文字,未经许可,均可转载、复制。
转载时必须以链接形式注明作者和原始出处