
19.2 Partial flags stalls


Date: 2000-04-01 14:00

The flags register can also cause partial register stalls:

    CMP EAX, EBX
    INC ECX
    JBE XX          ; partial flags stall

The JBE instruction reads both the carry flag and the zero flag. Since the INC instruction changes the zero flag but not the carry flag, the JBE instruction has to wait for the two preceding instructions to retire before it can combine the carry flag from the CMP instruction with the zero flag from the INC instruction. This situation is more likely to be a bug than an intended combination of flags. To correct it, change INC ECX to ADD ECX,1, which writes the carry flag as well, so that JBE takes both flags from a single instruction.

A similar bug that causes a partial flags stall is SAHF / JL XX. The JL instruction tests the sign flag and the overflow flag, but SAHF doesn't change the overflow flag. To correct it, change JL XX to JS XX.
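The two corrections just described can be written out as follows. This is only a sketch in the same notation as the rest of the section; XX is the placeholder label used in the text.

```asm
; First case: JBE should take CF and ZF from a single instruction.
    CMP EAX, EBX
    ADD ECX, 1      ; unlike INC, ADD also writes the carry flag
    JBE XX          ; CF and ZF both come from ADD

; Second case: SAHF loads SF, ZF, AF, PF and CF from AH, but not OF.
    SAHF
    JS  XX          ; reads only the sign flag, which SAHF has written
```

Note that in the first case JBE now tests the result of the ADD alone, because ADD overwrites every flag that CMP produced.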

Unexpectedly (and contrary to what Intel manuals say) you also get a partial flags stall after an instruction that modifies some of the flag bits when reading only unmodified flag bits:

    CMP EAX, EBX
    INC ECX
    JC  XX          ; partial flags stall

but not when reading only modified bits:

    CMP EAX, EBX
    INC ECX
    JE  XX          ; no stall
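If the jump is really meant to test the result of the CMP, one possible rearrangement is to do the increment first, so that CMP is the last instruction to write the flags before the jump. This assumes something about the programmer's intent that the text does not state, and it has not been timed here. It relies on two facts: CMP writes all the arithmetic flags, and CMP EAX,EBX does not read ECX, so moving INC ahead of it leaves the comparison unchanged.

```asm
    INC ECX         ; partial flag write, overwritten by CMP below
    CMP EAX, EBX    ; writes CF, ZF, SF, OF, AF and PF
    JC  XX          ; reads CF as written by CMP
```

By the rules described in this section, the jump then reads only flags that the immediately preceding flag-writing instruction has modified.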

Partial flags stalls are likely to occur on instructions that read many or all of the flag bits, i.e. LAHF, PUSHF, PUSHFD. The following instructions cause partial flags stalls when followed by LAHF or PUSHF(D): INC, DEC, TEST, bit tests, bit scan, CLC, STC, CMC, CLD, STD, CLI, STI, MUL, IMUL, and all shifts and rotates. The following instructions do not cause partial flags stalls: AND, OR, XOR, ADD, ADC, SUB, SBB, CMP, NEG. It is strange that TEST and AND behave differently while, by definition, they do exactly the same thing to the flags. To store the value of a flag without a stall, you may use a SETcc instruction instead of LAHF or PUSHF(D).

Examples:

    INC EAX / PUSHFD                 ; stall
    ADD EAX,1 / PUSHFD               ; no stall
    SHR EAX,1 / PUSHFD               ; stall
    SHR EAX,1 / OR EAX,EAX / PUSHFD  ; no stall
    TEST EBX,EBX / LAHF              ; stall
    AND EBX,EBX / LAHF               ; no stall
    TEST EBX,EBX / SETZ AL           ; no stall
    CLC / SETZ AL                    ; stall
    CLD / SETZ AL                    ; no stall
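As a sketch of the SETcc suggestion, the sequence below saves the zero flag after a TEST without going through LAHF. AL is used as the destination only because the example list uses it; another byte register or a memory byte would presumably serve as well.

```asm
; Stalls according to the list above:
;   TEST EBX, EBX
;   LAHF            ; reads SF, ZF, AF, PF and CF

; Alternative: store only the flag that is actually needed.
    TEST EBX, EBX
    SETZ AL         ; AL = 1 if EBX is zero, 0 otherwise
```

SETcc only helps when it reads a flag that the preceding instruction has modified: CLC / SETZ AL in the list above still stalls.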

The penalty for partial flags stalls is approximately 4 clocks.
