MMX optimization: How to optimize for the Pentium family of microprocessors

11. Splitting complex instructions into simpler ones (PPlain and PMMX)


http://www.XiaoHui.com Date: 2000-04-01 13:00

You may split up read/modify and read/modify/write instructions to improve pairing. Example:

ADD [mem1],EAX / ADD [mem2],EBX ; 5 clock cycles

This code may be split up into a sequence which takes only 3 clock cycles:

MOV ECX,[mem1] / MOV EDX,[mem2] / ADD ECX,EAX / ADD EDX,EBX
MOV [mem1],ECX / MOV [mem2],EDX
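
To make the trade-off concrete, here is a minimal sketch that models the two sequences in Python rather than assembly, assuming registers and memory can be treated as plain dictionaries of 32-bit values. The function names and sample values are invented for illustration. It says nothing about timing; it only shows that both versions leave the same values in memory, and that the split version also overwrites ECX and EDX, so it needs two free registers.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit wraparound


def rmw_version(regs, mem):
    """ADD [mem1],EAX / ADD [mem2],EBX"""
    mem["mem1"] = (mem["mem1"] + regs["EAX"]) & MASK32
    mem["mem2"] = (mem["mem2"] + regs["EBX"]) & MASK32


def split_version(regs, mem):
    """The same work as six simple instructions, in the order given above."""
    regs["ECX"] = mem["mem1"]                            # MOV ECX,[mem1]
    regs["EDX"] = mem["mem2"]                            # MOV EDX,[mem2]
    regs["ECX"] = (regs["ECX"] + regs["EAX"]) & MASK32   # ADD ECX,EAX
    regs["EDX"] = (regs["EDX"] + regs["EBX"]) & MASK32   # ADD EDX,EBX
    mem["mem1"] = regs["ECX"]                            # MOV [mem1],ECX
    mem["mem2"] = regs["EDX"]                            # MOV [mem2],EDX


def run(version):
    """Run one version on fresh sample state (hypothetical values)."""
    regs = {"EAX": 5, "EBX": 0xFFFFFFFF, "ECX": 111, "EDX": 222}
    mem = {"mem1": 10, "mem2": 3}
    version(regs, mem)
    return regs, mem


regs_a, mem_a = run(rmw_version)
regs_b, mem_b = run(split_version)
assert mem_a == mem_b                    # same values stored to memory
assert regs_a["ECX"] == 111              # original sequence leaves ECX alone
assert regs_b["ECX"] == mem_b["mem1"]    # split sequence clobbers ECX
print(mem_b)
```

If ECX and EDX hold live values, the cost of saving and restoring them has to be weighed against the cycles gained.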

Likewise you may split up non-pairable instructions into pairable instructions:

PUSH [mem1]
PUSH [mem2] ; non-pairable

Split up into:

MOV EAX,[mem1]
MOV EBX,[mem2]
PUSH EAX
PUSH EBX ; everything pairs

Other examples of non-pairable instructions which may be split up into simpler pairable instructions:

CDQ split into MOV EDX,EAX / SAR EDX,31

NOT EAX change to XOR EAX,-1

NEG EAX split into XOR EAX,-1 / INC EAX

MOVZX EAX,BYTE PTR [mem] split into XOR EAX,EAX / MOV AL,BYTE PTR [mem]

JECXZ split into TEST ECX,ECX / JZ

LOOP split into DEC ECX / JNZ

XLAT change to MOV AL,[EBX+EAX] (equivalent only when the upper 24 bits of EAX are zero)
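
The value-producing replacements in this list can be checked with a small register-level model. The following is a sketch in Python rather than assembly, assuming 32-bit registers modeled as masked integers. The helper names (`cdq_edx`, `mov_sar_edx`, and so on) and the sample values are invented for illustration. It compares results only and does not model flags, pairing, or timing.

```python
MASK32 = 0xFFFFFFFF  # registers are modeled as unsigned 32-bit values


def cdq_edx(eax):
    """EDX after CDQ: the sign bit of EAX copied into all 32 bits."""
    return MASK32 if eax & 0x80000000 else 0


def mov_sar_edx(eax):
    """EDX after MOV EDX,EAX / SAR EDX,31."""
    edx = eax                                    # MOV EDX,EAX
    signed = edx - (1 << 32) if edx & 0x80000000 else edx
    return (signed >> 31) & MASK32               # SAR EDX,31 (arithmetic shift)


def not_eax(eax):
    return ~eax & MASK32                         # NOT EAX


def xor_minus_one(eax):
    return eax ^ MASK32                          # XOR EAX,-1


def neg_eax(eax):
    return -eax & MASK32                         # NEG EAX


def xor_inc(eax):
    eax ^= MASK32                                # XOR EAX,-1
    return (eax + 1) & MASK32                    # INC EAX


def movzx_byte(eax, byte):
    return byte & 0xFF                           # MOVZX EAX,BYTE PTR [mem]


def xor_mov_al(eax, byte):
    eax ^= eax                                   # XOR EAX,EAX
    return (eax & 0xFFFFFF00) | (byte & 0xFF)    # MOV AL,BYTE PTR [mem]


def xlat_al(table, ebx, eax):
    return table[ebx + (eax & 0xFF)]             # XLAT: index is AL only


def mov_al_indexed(table, ebx, eax):
    return table[(ebx + eax) & MASK32]           # MOV AL,[EBX+EAX]: all of EAX


for eax in [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF, 0x12345678]:
    assert cdq_edx(eax) == mov_sar_edx(eax)
    assert not_eax(eax) == xor_minus_one(eax)
    assert neg_eax(eax) == xor_inc(eax)
    assert movzx_byte(eax, 0xAB) == xor_mov_al(eax, 0xAB)

table = list(range(256)) + [0] * 256             # hypothetical lookup table
assert xlat_al(table, 0, 0x05) == mov_al_indexed(table, 0, 0x05)
# With nonzero upper bits in EAX the two forms read different bytes.
assert xlat_al(table, 0, 0x105) != mov_al_indexed(table, 0, 0x105)
print("all checked replacements agree")
```

The replacements differ from the originals in their effect on the flags. NOT leaves the flags untouched, while XOR modifies them. NEG sets the carry flag when the operand is nonzero, whereas INC leaves the carry flag unchanged. JECXZ and LOOP leave the flags untouched, while TEST and DEC modify them. The substitutions are therefore safe only where the following code does not depend on the flags.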

If splitting instructions doesn't improve speed, you may keep the complex or non-pairable instructions in order to reduce code size.

Splitting instructions is not needed on the PPro, PII and PIII, except when the split instructions generate fewer uops.
