C++ implementations of numerical methods for solving differential-algebraic equations: design and optimization considerationsKees, Christopher E.; Miller, Cass T.
doi: 10.1145/332242.334001pmid: N/A
Object-oriented programming can produce improved implementations of complex numerical methods, but it can also introduce a performance penalty. Since computational simulation often requires intricate and highly efficient codes, the performance penalty of high-level techniques must always be weighed against the improvements they enable. These issues are addressed in a general object-oriented (OO) toolkit for the numerical solution of differential-algebraic equations (DAEs). The toolkit can be configured in several different ways to solve DAE initial-value problems with an adaptive multistep method. It contains a wrapped version of the Fortran 77 code DASPK and a translation of this to C++. Two C++ constructs for assembling the tools are provided, as are two implementations an important DAE test problem. Multiple configurations of the toolkit for DAE test problems are compared in order to assess the performance penalties of C++. The mathematical methods and implementation techniques are discussed in detail in order to provide heuristics for efficient OO scientific programming and to demonstrate the effectiveness of OO techniques in managing complexity and producing better code. The codes were tested on a variety of problems using publicly available Fortran 77 and C++ compilers. Extensive effficiency comparisons are presented in order to isolate computationally inefficient OO techniques. Techniques that caused difficulty in implementation and maintenance are also highlighted. The comparisons demonstrate that the majority of C++'s built-in support for OO programming has a negligible effect on performance, when used at sufficiently high levels, and provides flexible and extensible software for numerical methods.
C++ implementations of numerical methods for solving differential-algebraic equationsKees, Christopher E.; Miller, Cass T.
doi: 10.1145/332242.334001pmid: N/A
Object-oriented programming can produce improved implementations of complex numerical methods, but it can also introduce a performance penalty. Since computational simulation often requires intricate and highly efficient codes, the performance penalty of high-level techniques must always be weighed against the improvements they enable. These issues are addressed in a general object-oriented (OO) toolkit for the numerical solution of differential-algebraic equations (DAEs). The toolkit can be configured in several different ways to solve DAE initial-value problems with an adaptive multistep method. It contains a wrapped version of the Fortran 77 code DASPK and a translation of this to C++. Two C++ constructs for assembling the tools are provided, as are two implementations an important DAE test problem. Multiple configurations of the toolkit for DAE test problems are compared in order to assess the performance penalties of C++. The mathematical methods and implementation techniques are discussed in detail in order to provide heuristics for efficient OO scientific programming and to demonstrate the effectiveness of OO techniques in managing complexity and producing better code. The codes were tested on a variety of problems using publicly available Fortran 77 and C++ compilers. Extensive effficiency comparisons are presented in order to isolate computationally inefficient OO techniques. Techniques that caused difficulty in implementation and maintenance are also highlighted. The comparisons demonstrate that the majority of C++'s built-in support for OO programming has a negligible effect on performance, when used at sufficiently high levels, and provides flexible and extensible software for numerical methods.
A frontal code for the solution of sparse positive-definite symmetric systems arising from finite-element applicationsDuff, Iain S.; Scott, Jennifer A.
doi: 10.1145/332242.332243pmid: N/A
We describe the design, implementation, and performance of a frontal code for the solution of large sparse symmetric systems of linear finite-element equations. The code is intended primarily for positive-definite systems, since numerical pivoting is not performed. The resulting software package, MA62, will be included in the Harwell Subroutine Library. We illustrate the performance of our new code on a range of problems arising from real engineering and industrial applications. The performance of the code is compared with that of the Harwell Subroutine Library general frontal solver MA42 and with other positive-definite codes from the Harwell Subroutine Library.
A frontal code for the solution of sparse positive-definite symmetric systems arising from finite-element applicationsDuff, Iain S.; Scott, Jennifer A.
doi: 10.1145/332242.332243pmid: N/A
We describe the design, implementation, and performance of a frontal code for the solution of large sparse symmetric systems of linear finite-element equations. The code is intended primarily for positive-definite systems, since numerical pivoting is not performed. The resulting software package, MA62, will be included in the Harwell Subroutine Library. We illustrate the performance of our new code on a range of problems arising from real engineering and industrial applications. The performance of the code is compared with that of the Harwell Subroutine Library general frontal solver MA42 and with other positive-definite codes from the Harwell Subroutine Library.
Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur formDackland, Krister; Kågström, Bo
doi: 10.1145/332242.332244pmid: N/A
A two-stage blocked algorithm for reduction of a regular matrix pair (A , B ) to upper Hessenberg-triangular form is presented. In stage 1 (A, B is reduced to block upper Hessenberg-triangular form using mainly level 3 (matrix-matrix) operations that permit data reuse in the higher levels of a memory hierarchy. In the second stage all but one of the r subdiagonals of the block Hessenberg A-part are set to zero using Givens rotations. The algorithm proceeds in a sequence of supersweeps, each reducing m columns. The updates with respect to row and column rotations are organized to reference consecutive columns of A and B. To further improve the data locality, all rotations produced in a supersweep are stored to enable a left-looking reference pattern, i.e., all updates are delayed until they are required for the continuation of the supersweep. Moreover, we present a blocked variant of the single-diagonal double-shift QZ method for computing the generalized Schur form of (A, B in upper Hessenberg-triangular form. The blocking for improved data locality is done similarly, now by restructuring the reference pattern of the updates associated with the bulge chasing in the QZ iteration. Timing results show that our new blocked variants outperform the current LAPACK routines, including drivers for the generalized eigenvalue problem, by a factor 2-5 for sufficiently large problems.
Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur formDackland, Krister; Kågström, Bo
doi: 10.1145/332242.332244pmid: N/A
A two-stage blocked algorithm for reduction of a regular matrix pair ( A , B ) to upper Hessenberg-triangular form is presented. In stage 1 ( A, B is reduced to block upper Hessenberg-triangular form using mainly level 3 (matrix-matrix) operations that permit data reuse in the higher levels of a memory hierarchy. In the second stage all but one of the r subdiagonals of the block Hessenberg A -part are set to zero using Givens rotations. The algorithm proceeds in a sequence of supersweeps, each reducing m columns. The updates with respect to row and column rotations are organized to reference consecutive columns of A and B . To further improve the data locality, all rotations produced in a supersweep are stored to enable a left-looking reference pattern, i.e., all updates are delayed until they are required for the continuation of the supersweep. Moreover, we present a blocked variant of the single-diagonal double-shift QZ method for computing the generalized Schur form of ( A, B in upper Hessenberg-triangular form. The blocking for improved data locality is done similarly, now by restructuring the reference pattern of the updates associated with the bulge chasing in the QZ iteration. Timing results show that our new blocked variants outperform the current LAPACK routines, including drivers for the generalized eigenvalue problem, by a factor 2-5 for sufficiently large problems.
Characteristic spectra of the curvature functionalEdwards, John A.
doi: 10.1145/332242.332245pmid: N/A
A method is described for the eignevalues of piecewise smooth C2 extremum-energy curves. Typical interpolants are investigated within the framework of their eigensystems, and conclusions are presented concerning their natural modes of vibration, stability state, and limits of existence.In the present discussion the word spline means exclusively an interpolating elastica.