
Interlock collapsing ALU for increased instruction-level parallelism

Nadeem Malik, Richard J. Eickemeyer, Stamatis Vassiliadis
IBM Corporation (Rochester, MN 55901; Endicott, NY 13760; Poughkeepsie, NY 12602)

Abstract

Simulation results are presented for machine implementations using a novel integer ALU design that allows parallel execution, in a single cycle, of two interlocked instructions which, because of a true data dependency, must normally be executed sequentially. This parallel execution is achieved by collapsing the execution interlocks between integer ALU operations as well as between address generation operations, without increasing the cycle time of the base implementation. Results demonstrate that, in integer benchmarks, this new design can provide an overall increase in instruction-level parallelism of more than 7% for out-of-order instruction issue and up to 19% for in-order instruction issue.

a. R1 := 0[R9]
b. R2 := R1+R3
c. 4[R2] := R6

Figure 1: Instruction sequence with interlocks.

1. Introduction

Instruction-level parallelism (ILP) is generally limited by true data dependencies, which arise because of the dependence of an instruction on the result of a preceding instruction [1]. This occurs because of the interlocks that must be enforced between the instructions to avoid the resulting read-after-write (RAW) data
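The read-after-write chain in the Figure 1 sequence can be made concrete with a small sketch. The code below is an illustration only, not the authors' simulator: the `Instr` record, the `raw_pairs` helper, and the `collapsed_store_address` function are hypothetical names introduced here, and treating the b-to-c interlock as a single three-operand sum is an assumption based on the abstract's mention of collapsing interlocks into address generation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical record for one instruction of the Figure 1 sequence."""
    label: str
    dest: Optional[str]      # register written (None for a store)
    srcs: Tuple[str, ...]    # registers read


# The three instructions of Figure 1.
SEQ = [
    Instr("a", "R1", ("R9",)),        # R1 := 0[R9]   (load)
    Instr("b", "R2", ("R1", "R3")),   # R2 := R1+R3   (ALU add)
    Instr("c", None, ("R2", "R6")),   # 4[R2] := R6   (store)
]


def raw_pairs(seq: List[Instr]) -> List[Tuple[str, str, str]]:
    """Return (producer, consumer, register) for each true (RAW) dependency."""
    pairs = []
    last_writer = {}  # register name -> label of the most recent writer
    for ins in seq:
        for reg in ins.srcs:
            if reg in last_writer:
                pairs.append((last_writer[reg], ins.label, reg))
        if ins.dest is not None:
            last_writer[ins.dest] = ins.label
    return pairs


def collapsed_store_address(displacement: int, r1: int, r3: int) -> int:
    """Assumed illustration of collapsing: the store address 4[R2] is formed
    directly from the operands of b, as displacement + R1 + R3, instead of
    waiting for R2 to be written first."""
    return displacement + r1 + r3


if __name__ == "__main__":
    for producer, consumer, reg in raw_pairs(SEQ):
        print(f"{consumer} depends on {producer} through {reg}")
```

Run on the Figure 1 sequence, the sketch reports that b depends on a through R1 and that c depends on b through R2, which is the chain that would normally force sequential execution.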



Publisher: Association for Computing Machinery
Copyright: © 1992 by ACM Inc.
ISSN: 1050-916X
DOI: 10.1145/144965.145794


Journal: ACM SIGMICRO Newsletter, Association for Computing Machinery

Published: Dec 10, 1992
