proceeding
LitStream Collection
doi: 10.1145/75395.75396pmid: N/A
There are several methods of executing programs written in a high level language. The most widely used is to compile the programs into machine language. Another is to translate the programs into some intermediate form and then to execute that form interpretively. A third method is to directly execute either the HLL or the intermediate form. This study was aimed at investigating the feasibility of directly executing the intermediate representation of the sequential features of Concurrent Euclid (CE) on the SEL 32/75 computer. The CE intermediate code was translated into Ecode, and a microprogrammed interpreter for Ecode was designed and implemented on the SEL, and benchmarked against the compiler. For the CPU-bound prime number algorithm Sieve of Eratosthenes, the interpreter was measured to be about twice as slow as the compiler, due primarily to poor overlap within microinstructions. Ecode was then modified, and a new translator and interpreter designed and implemented. The same benchmark then yielded comparable results for both the interpreter and compiler. We project that further changes in Ecode design and hardware support would result in substantial Ecode efficiency gains.
doi: 10.1145/75395.75397pmid: N/A
This paper discusses the advantages of using high-level languages in the development of microcode. It also describes reasons functional programming languages should be considered as the source language for microcode compilers. The emergence of parallel execution in microarchitectures dictates that parallelism must be extracted from the microcode programs. This paper shows how functional languages meet the needs of microprogrammers by allowing them to express their algorithms in natural ways while allowing the microcode compiler to extract the parallelism from the program.
Sibai, F. N.; Watson, L.; Lu, M.
doi: 10.1145/75395.75398pmid: N/A
Unification is known to be the most repeated operation in logic programming and PROLOG interpreters. To speed up the execution of logic programs, the performance of unification must be improved. We propose a parallel unification machine for speeding up the unification algorithm. The machine is simulated at the register transfer level and the simulation results as well as performance comparison with a serial unification coprocessor are presented.
doi: 10.1145/75395.75399pmid: N/A
This paper describes work in progress on the development of a direct execution Prolog processor.
doi: 10.1145/75395.75400pmid: N/A
Increasing the performance of application-specific processors by exploiting application-resident parallelism is often prohibited by costs; especially in the case of low-volume productions. The flexibility of horizontal-microcoded machines allows these costs to be reduced, but the flexibility often reduces efficiency. VLIW is a new and promising concept for the design of low-cost, high-performance parallel computer systems. We suggest that the VLIW concept can also be used as a basis for cost-effective design of application-specific processors which must exploit application-resident parallelism. The SCARCE (SCalable ARChitecture Experiment) framework, an approach for cost-effective design of application-specific processors, provides features which allow the design of retargetable VLIW architectures. However, a retargetable VLIW architecture is only effective if there is a retargetable VLIW compiler. Since a VLIW compiler is an essential part of the VLIW architecture, tradeoffs must be made between the variety of VLIW architectures and the compiler complexity. We suggest that limiting the flexibility of the retargetable VLIW architecture does not necessary reduce the application space. This paper discusses the issues related to the design of a retargetable VLIW processor architecture and compiler within the SCARCE framework.
doi: 10.1145/75395.75401pmid: N/A
Combining is a local compiler optimization technique that can enhance the performance of global compaction techniques for VLIW machines. Given two adjacent operations of a certain class that are flow (read-after-write) dependent and that cannot be placed in the same micro-instruction, the combining technique can transform the operations so that the modified operations have no dependence. The transformed operations can be executed in the same micro-instruction, thus allowing the total execution time of the program to be reduced. In this paper, combining a pair of flow-dependent operations into a wide instruction word is suggested as an important compilation technique for VLIW architectures. Combining is particularly effective with software pipelining and loop unrolling since combinable operations can come together with a higher probability when these compilation techniques are used. We implemented combining in our parallelizing compiler for the wide instruction word architecture, which is now being built at the IBM T. J. Watson Research Center. It is shown that ten percent speedup is obtained on the Stanford integer benchmarks and other sequential-matured C programs, in comparison to compaction techniques that do not use combining. For a class of inner loops, combining can remove the inter-iteration dependencies completely and can improve performance in the same ratio as the loop is unrolled.
Lenders, P. M.; Schröder, H.; Strazdins, P.
doi: 10.1145/75395.75402pmid: N/A
The instruction systolic array (ISA) is a programmable parallel architecture suitable for VLSI implementation. This paper presents a generalization of the ISA, called the microprogrammed ISA, which uses simple microprogramming techniques. Microprogrammed ISAs use dynamic microcodes whose length and contents are tailor made to the current program to be executed, and this can be efficiently implemented in VLSI. Here, microprogramming has the novel advantage of extending the range of algorithms that can be implemented on a given ISA. In particular, microprogramming can extend an ISA's effective communication abilities. Also, the reduction of the program input bandwidth (and pinout) afforded by microprogramming is even more important on large-scale MIMD architectures, such as the ISA. This paper also presents a weakest precondition semantics for the (microprogrammed) ISA model, which provides a means for verifying microprogrammed ISA programs. The semantics is modeled at the micro level, and has potential in the optimization of the microcodes of ISA programs.
Showing 1 to 10 of 32 Articles