The LAPACK software project, currently under development, is intended to provide a portable linear algebra library for high-performance computers. LAPACK will make use of the Level 1, 2, and 3 BLAS to carry out basic operations. A principal focus of this project is to implement blocked versions of a number of algorithms to take advantage of the greater parallelism and improved data locality of the Level 3 BLAS. In this paper, we describe our work with variants of some of these algorithms and the performance data we have collected.