Article 6, Pub. date: September 2010
Traditional coherence protocols present a set of difficult trade-offs: the reliance of snoopy protocols on broadcast and ordered interconnects limits their scalability, while directory protocols incur a performance penalty on sharing misses due to indirection. This work introduces Patch (Predictive/Adaptive Token-Counting Hybrid), a coherence protocol that provides the scalability of directory protocols while opportunistically sending direct requests to reduce sharing latency. Patch extends a standard directory protocol to track tokens and use token-counting rules for enforcing coherence permissions. Token counting allows Patch to support direct requests on an unordered interconnect, while a mechanism called token tenure provides broadcast-free forward progress using the directory protocol's per-block point of ordering at the home along with either timeouts at requesters or explicit race notification messages. Patch makes three main contributions. First, Patch introduces token tenure, which provides broadcast-free forward progress for token-counting protocols. Second, Patch deprioritizes best-effort direct requests to match or exceed the performance of directory protocols without restricting scalability. Finally, Patch provides greater scalability than directory protocols when using inexact encodings of sharers because only processors holding tokens need to acknowledge requests. Overall, Patch is a “one-size-fits-all” coherence protocol that dynamically adapts to work well for small systems, large systems, and anywhere in between.
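The token-counting rules mentioned above can be illustrated with a minimal sketch. It assumes the usual token coherence invariant: each block has a fixed number of tokens, a holder may read with at least one token, and a holder may write only with all of them. The names `TokenHolder`, `transfer`, `collect_for_write`, and the constant `TOTAL_TOKENS` are hypothetical and chosen for illustration. The sketch omits the owner token, data transfer, token tenure, and the directory, so it is not the Patch protocol itself.

```python
from dataclasses import dataclass

# Assumed for illustration: a fixed token count per block.
TOTAL_TOKENS = 4


@dataclass
class TokenHolder:
    """A cache (or the home/memory) holding some tokens for one block."""
    tokens: int = 0

    def can_read(self) -> bool:
        # Reading requires at least one token.
        return self.tokens >= 1

    def can_write(self) -> bool:
        # Writing requires every token, so no other holder can be reading.
        return self.tokens == TOTAL_TOKENS


def transfer(src: TokenHolder, dst: TokenHolder, count: int) -> None:
    """Move tokens between holders; tokens are never created or destroyed."""
    if count < 0 or count > src.tokens:
        raise ValueError("cannot transfer more tokens than the source holds")
    src.tokens -= count
    dst.tokens += count


def collect_for_write(requester: TokenHolder, others: list) -> int:
    """Gather all tokens at the requester.

    Returns the number of holders that had to respond. Holders with
    zero tokens send nothing, which mirrors the point that only token
    holders need to acknowledge a request.
    """
    responders = 0
    for holder in others:
        if holder.tokens > 0:
            transfer(holder, requester, holder.tokens)
            responders += 1
    return responders


if __name__ == "__main__":
    home = TokenHolder(tokens=TOTAL_TOKENS)  # all tokens start at the home
    a, b, c, d = (TokenHolder() for _ in range(4))
    transfer(home, a, 1)  # a becomes a reader
    transfer(home, b, 1)  # b becomes a reader
    n = collect_for_write(c, [home, a, b, d])
    print(f"responders={n}, writer_can_write={c.can_write()}")
```

In this sketch, a holder listed as a possible sharer but holding no tokens (`d` above) contributes no response, which is the property the abstract relies on when sharer encodings are inexact.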
ACM Transactions on Architecture and Code Optimization (TACO) – Association for Computing Machinery
Published: Sep 1, 2010