首页 随笔 乐走天涯 程序资料 评论中心 Tag 论坛 其他资源 搜索 消息中心 联系我 关于 RSS

22.3. Avoiding jumps (all processors)


日期: 2000-04-01 14:00 | 联系我 | 关注我: SteemIT, Twitter, Google+

22.3. Avoiding jumps (all processors)

There can be many reasons why you may want reduce the number of jumps, calls and returns:

  • jump mispredictions are very expensive,
  • there are various penalties for consecutive or chained jumps, depending on the processor,
  • jump instructions may push one another out of the branch target buffer because of the random replacement algorithm,
  • a return takes 2 clocks on PPlain and PMMX, calls and returns generate 4 uops on PPro, PII and PIII.
  • on PPro, PII and PIII, instruction fetch may be delayed after a jump (chapter 15), and retirement may be slightly less effective for taken jumps then for other uops (chapter 18).

Calls and returns can be avoided by replacing small procedures with inline macros. And in many cases it is possible to reduce the number of jumps by restructuring your code. For example, a jump to a jump should be replaced by a jump to the final target. In some cases this is even possible with conditional jumps if the condition is the same or is known. A jump to a return can be replaced by a return. If you want to eliminate a return to a return, then you should not manipulate the stack pointer because that would interfere with the prediction mechanism of the return stack buffer. Instead, you can replace the preceding call with a jump. For example CALL PRO1 / RET can be replaced by JMP PRO1 if PRO1 ends with the same kind of RET.

You may also eliminate a jump by dublicating the code jumped to. This can be useful if you have a two-way branch inside a loop or before a return. Example:

A: CMP [EAX+4*EDX],ECX JE B CALL X JMP C B: CALL Y C: INC EDX JNZ A MOV ESP, EBP POP EBP RETThe jump to C may be eliminated by dublicating the loop epilog:

A: CMP [EAX+4*EDX],ECX JE B CALL X INC EDX JNZ A JMP D B: CALL Y C: INC EDX JNZ A D: MOV ESP, EBP POP EBP RETThe most often executed branch should come first here. The jump to D is outside the loop and therefore less critical. If this jump is executed so often that it needs optimizing too, then replace it with the three instructions following D.

标签: MMX 优化

 文章评论
目前没有任何评论.

↓ 快抢占第1楼,发表你的评论和意见 ↓

发表你的评论
如果你想针对此文发表评论, 请填写下列表单:
姓名: * 必填 (Twitter 用户可输入以 @ 开头的用户名, Steemit 用户可输入 @@ 开头的用户名)
E-mail: 可选 (不会被公开。如果我回复了你的评论,你将会收到邮件通知)
网站 / Blog: 可选
反垃圾广告: 为了防止广告机器人自动发贴, 请计算下列表达式的值:
6 x 3 + 2 = * 必填
评论内容:
* 必填
你可以使用下列标签修饰文字:
[b] 文字 [/b]: 加粗文字
[quote] 文字 [/quote]: 引用文字

 
首页 随笔 乐走天涯 猎户星 Google Earth 程序资料 程序生活 评论 Tag 论坛 资源 搜索 联系 关于 隐私声明 版权声明 订阅邮件

程序员小辉 建站于 1997 ◇ 做一名最好的开发者是我不变的理想。
Copyright © XiaoHui.com; 保留所有权利。