MMX optimization: How to optimize for the Pentium family of microprocessors

19.4 Partial memory stalls


http://www.XiaoHui.com Date: 2000-04-01 13:00

A partial memory stall is somewhat analogous to a partial register stall. It occurs when you mix data sizes for the same memory address:

    MOV BYTE PTR [ESI], AL
    MOV EBX, DWORD PTR [ESI]       ; partial memory stall

Here you get a stall because the processor has to combine the byte just written from AL with the three following bytes, which were already in memory, to form the four bytes read into EBX. The penalty is approximately 7-8 clocks.
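
The condition described above can be written down as a small C model. This is only a sketch of the rule as stated in the text, not a measurement of any processor. The helper name `load_needs_bytes_outside_store` is hypothetical, and treating any partial overlap the same way as the same-address case is an assumption that goes beyond the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: returns true when a load overlaps an earlier
   pending store but also needs at least one byte the store did not
   write, so the result must be combined with bytes from memory. */
static bool load_needs_bytes_outside_store(uint32_t store_addr, uint32_t store_size,
                                           uint32_t load_addr, uint32_t load_size)
{
    assert(store_size > 0 && load_size > 0);

    /* 64-bit end addresses so that addr + size cannot wrap around. */
    uint64_t store_end = (uint64_t)store_addr + store_size;
    uint64_t load_end  = (uint64_t)load_addr + load_size;

    bool overlaps = load_addr < store_end && store_addr < load_end;
    bool covered  = load_addr >= store_addr && load_end <= store_end;

    return overlaps && !covered;
}
```

For the example above (a 1-byte store to [ESI] followed by a 4-byte load from [ESI]) the model reports a stall. For a store and a load of the same size at the same address it does not.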

Unlike partial register stalls, you also get a partial memory stall when you write a bigger operand to memory and then read part of it, if the smaller part doesn't start at the same address:

    MOV DWORD PTR [ESI], EAX
    MOV BL, BYTE PTR [ESI]         ; no stall
    MOV BH, BYTE PTR [ESI+1]       ; stall

You can avoid this stall by changing the last line to MOV BH,AH, but such a solution is not possible in a situation like this:

    FISTP QWORD PTR [EDI]
    MOV EAX, DWORD PTR [EDI]
    MOV EDX, DWORD PTR [EDI+4]     ; stall
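
The two examples above follow the same rule: a read that lies entirely inside the bytes just written stalls only if it starts at a different address than the write. A minimal C sketch of that rule, assuming nothing beyond what the text states, could look like this. The helper name `sub_read_at_offset` is hypothetical, and the function is a model of the rule, not of actual hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: returns true when a load reads only bytes that an
   earlier pending store wrote, but does not start at the store's address. */
static bool sub_read_at_offset(uint32_t store_addr, uint32_t store_size,
                               uint32_t load_addr, uint32_t load_size)
{
    assert(store_size > 0 && load_size > 0);

    /* 64-bit end addresses so that addr + size cannot wrap around. */
    uint64_t store_end = (uint64_t)store_addr + store_size;
    uint64_t load_end  = (uint64_t)load_addr + load_size;

    bool covered = load_addr >= store_addr && load_end <= store_end;

    return covered && load_addr != store_addr;
}
```

With a 4-byte store at [ESI], the byte read at [ESI] is not flagged and the byte read at [ESI+1] is. With the 8-byte FISTP store at [EDI], the dword read at [EDI] is not flagged and the dword read at [EDI+4] is.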

Interestingly, you can also get a partial memory stall when writing and reading completely different addresses if they happen to have the same set-value in different cache banks:

    MOV BYTE PTR [ESI], AL
    MOV EBX, DWORD PTR [ESI+4092]  ; no stall
    MOV ECX, DWORD PTR [ESI+4096]  ; stall
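
One way to look for this case in your own address arithmetic is to compare the low-order bits of the two addresses. The sketch below is an assumption built only on the example above, where an offset of 4096 stalls and an offset of 4092 does not. The text describes the condition in terms of set-values and cache banks, and does not state that the match is on exactly the low 12 bits. The constant `ALIAS_STRIDE` and the helper name `may_alias_pending_store` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stride taken from the +4096 example in the text. */
#define ALIAS_STRIDE 4096u

/* Hypothetical model: returns true when a load goes to a different
   address than an earlier pending store, but the two addresses are
   equal modulo ALIAS_STRIDE. */
static bool may_alias_pending_store(uint32_t store_addr, uint32_t load_addr)
{
    /* The mask below is only valid for a power-of-two stride. */
    assert((ALIAS_STRIDE & (ALIAS_STRIDE - 1u)) == 0u);

    return store_addr != load_addr &&
           ((store_addr ^ load_addr) & (ALIAS_STRIDE - 1u)) == 0u;
}
```

Under this assumption, a load at the store address plus 4096 is flagged, while a load at the store address plus 4092 is not, which matches the example.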

