Late-binding: enabling unordered load-store queues

Publisher
Association for Computing Machinery
Copyright
The ACM Portal is published by the Association for Computing Machinery. Copyright © 2010 ACM, Inc.
Subject
Packet-switching networks
ISSN
0163-5964
DOI
10.1145/1273440.1250705
Publisher site
See Article on Publisher Site

Abstract

Conventional load/store queues (LSQs) are an impediment to both power-efficient execution in superscalar processors and scaling to large-window designs. In this paper, we propose techniques to improve the area and power efficiency of LSQs by allocating entries when instructions issue ("late binding"), rather than when they are dispatched. This approach enables lower occupancy and thus smaller LSQs. Efficient implementations of late-binding LSQs, however, require the entries in the LSQ to be unordered with respect to age. We show how to provide full LSQ functionality in an unordered design with only small additional complexity and negligible performance losses. Late-binding, unordered LSQs work well for small-window superscalar processors, but can also be scaled effectively to large, kilo-window processors by breaking the LSQs into address-interleaved banks. To handle the increased overflows, we apply classic network flow control techniques to the processor micronetworks, enabling low-overhead mechanisms for recovering from bank overflows. We evaluate three such mechanisms: instruction replay, skid buffers, and virtual-channel buffering in the on-chip memory network. We show that for an 80-instruction window, the LSQ can be reduced to 32 entries. For a 1,024-instruction window, the unordered, late-binding LSQ works well with four banks of 48 entries each. By applying a Bloom filter as well, this design achieves full hardware memory disambiguation for a 1,024-instruction window while keeping average power low, searching on average only 8 and 12 CAM entries per load and store, respectively.
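
As a rough illustration of the mechanisms the abstract describes, the sketch below models one address-interleaved, unordered LSQ bank in plain C: an entry is claimed only when a memory instruction issues with a resolved address (late binding), ages are carried explicitly per entry rather than implied by queue position, and a small Bloom filter set on store insertion lets most loads skip the full associative scan. The structure names, bank-selection bits, hash function, and filter size are assumptions made purely for illustration; entry deallocation at commit and filter clearing are omitted, and none of this should be read as the authors' actual hardware design.

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_BANKS    4      /* four address-interleaved banks, as in the abstract     */
#define BANK_ENTRIES 48     /* 48 entries per bank, as in the abstract                */
#define BLOOM_BITS   256    /* filter size is an assumption, not taken from the paper */

typedef struct {            /* one unordered LSQ entry                                */
    bool     valid;
    bool     is_store;
    uint64_t addr;
    uint32_t age;           /* explicit sequence number, since entry position no
                               longer encodes program order                           */
} LsqEntry;

typedef struct {
    LsqEntry entries[BANK_ENTRIES];
    uint8_t  bloom[BLOOM_BITS / 8];   /* bits set when stores are inserted            */
} LsqBank;

static LsqBank banks[NUM_BANKS];

static unsigned bank_of(uint64_t addr) {
    return (unsigned)((addr >> 3) & (NUM_BANKS - 1));   /* interleave on address bits */
}

static unsigned bloom_hash(uint64_t addr) {
    return (unsigned)(((addr >> 3) * 2654435761u) >> 24) % BLOOM_BITS;
}

static bool bloom_test(const LsqBank *b, uint64_t addr) {
    unsigned h = bloom_hash(addr);
    return (b->bloom[h / 8] >> (h % 8)) & 1u;
}

static void bloom_set(LsqBank *b, uint64_t addr) {
    unsigned h = bloom_hash(addr);
    b->bloom[h / 8] |= (uint8_t)(1u << (h % 8));
}

/* Late binding: an LSQ slot is claimed only when the instruction issues with a
 * resolved address, so dispatch never reserves space. Any free slot will do,
 * because the bank is unordered. Returns false on bank overflow.              */
bool lsq_issue(bool is_store, uint64_t addr, uint32_t age) {
    LsqBank *b = &banks[bank_of(addr)];
    for (int i = 0; i < BANK_ENTRIES; i++) {
        if (!b->entries[i].valid) {
            b->entries[i] = (LsqEntry){ .valid = true, .is_store = is_store,
                                        .addr = addr, .age = age };
            if (is_store)
                bloom_set(b, addr);
            return true;
        }
    }
    return false;   /* overflow: caller must recover (replay, skid buffer, ...) */
}

/* Disambiguate a load: the Bloom filter usually proves no older store to this
 * address is in flight, so the full CAM-style scan runs only on a filter hit. */
bool load_must_wait(uint64_t addr, uint32_t load_age) {
    LsqBank *b = &banks[bank_of(addr)];
    if (!bloom_test(b, addr))
        return false;                 /* definitely no conflicting store        */
    for (int i = 0; i < BANK_ENTRIES; i++) {
        const LsqEntry *e = &b->entries[i];
        if (e->valid && e->is_store && e->addr == addr && e->age < load_age)
            return true;              /* older store to the same address        */
    }
    return false;                     /* filter false positive; scan found none */
}

int main(void) {
    lsq_issue(true, 0x1000, 10);                   /* older store           */
    printf("%d\n", load_must_wait(0x1000, 20));    /* prints 1: must wait   */
    printf("%d\n", load_must_wait(0x2000, 21));    /* prints 0: no conflict */
    return 0;
}

When lsq_issue returns false, the bank has overflowed and the caller would need one of the recovery mechanisms the paper evaluates (instruction replay, skid buffers, or virtual-channel buffering in the memory network); this model simply reports the condition.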

Journal

ACM SIGARCH Computer Architecture News, Association for Computing Machinery

Published: Jun 9, 2007
