首页 随笔 乐走天涯 程序资料 评论中心 Tag 论坛 其他资源 搜索 联系我 关于 RSS

16.1 Eliminating dependencies


日期: 2000-04-01 14:00 | 联系我 | 关注我: Telegram, Twitter

16. Register renaming (PPro, PII and PIII)

16.1 Eliminating dependencies

Register renaming is an advanced technique used by these microprocessors to remove dependencies between different parts of the code. Example:

MOV EAX, [MEM1] IMUL EAX, 6 MOV [MEM2], EAX MOV EAX, [MEM3] INC EAX MOV [MEM4], EAX

Here the last three instructions are independent of the first three in the sense that they don't need any result from the first three instructions. To optimize this on earlier processors you would have to use a different register instead of EAX in the last three instructions and reorder the instructions so that the last three instructions could execute in parallel with the first three instructions. The PPro, PII and PIII processors do this for you automatically. They assign a new temporary register for EAX every time you write to it. Thereby the MOV EAX,[MEM3] instruction becomes independent of the preceding instructions. With out-of-order execution it is likely to finish the move to [MEM4] before the slow IMUL instruction is finished.

Register renaming goes fully automatically. A new temporary register is assigned as an alias for the permanent register every time an instruction writes to this register. An instruction that both reads and writes a register also causes renaming. For example the INC EAX instruction above uses one temporary register for input and another temporary register for output. This does not remove any dependency, of course, but it has some significance for subsequent register reads as I will explain later.

All general purpose registers, stack pointer, flags, floating point registers, MMX registers, XMM registers and segment registers can be renamed. Control words, and the floating point status word cannot be renamed and this is the reason why the use of these registers is slow. There are 40 universal temporary registers so it is unlikely that you will run out of temporary registers.

A common way of setting a register to zero is XOR EAX,EAX or SUB EAX,EAX. These instructions are not recognized as independent of the previous value of the register. If you want to remove the dependency on slow preceding instructions then use MOV EAX,0.

Register renaming is controlled by the register alias table (RAT) and the reorder buffer (ROB). The uops from the decoders go to the RAT via a queue, and then to the ROB and the reservation station. The RAT can handle only 3 uops per clock cycle. This means that the overall throughput of the microprocessor can never exceed 3 uops per clock cycle on average.

There is no practical limit to the number of renamings. The RAT can rename three registers per clock cycle, and it can even rename the same register three times in one clock cycle.

标签: MMX 优化

 文章评论
目前没有任何评论.

↓ 快抢占第1楼,发表你的评论和意见 ↓

发表你的评论
如果你想针对此文发表评论, 请填写下列表单:
姓名: * 必填 (Twitter 用户可输入以 @ 开头的用户名, Steemit 用户可输入 @@ 开头的用户名)
E-mail: 可选 (不会被公开。如果我回复了你的评论,你将会收到邮件通知)
反垃圾广告: 为了防止广告机器人自动发贴, 请计算下列表达式的值:
5 x 3 + 5 = * 必填
评论内容:
* 必填
你可以使用下列标签修饰文字:
[b] 文字 [/b]: 加粗文字
[quote] 文字 [/quote]: 引用文字

 
首页 随笔 乐走天涯 猎户星 Google Earth 程序资料 程序生活 评论 Tag 论坛 资源 搜索 联系 关于 隐私声明 版权声明 订阅邮件

程序员小辉 建站于 1997 ◇ 做一名最好的开发者是我不变的理想。
Copyright © XiaoHui.com; 保留所有权利。