ABSTRACT

Known polylog parallel algorithms for the solution of linear systems and related problems require computation of the char- acteristic polynomial or related forms, which are known to be highly unstable in practice. However, matrix factorizations of various types, bypassing computation of the characteris- tic polynomial, are used extensively in sequential numerical computations and are essential in many applications. This paper gives new parallel methods for various exact factorizations of several classes of matrices. We assume the input matrices are n x n with either integer entries of size 2no(1) or rational entries expressible as a ratio of integers of size 2n(1). We make no other assumption on the input. We assume the arithmetic PRAM model of parallel computation. Our main result is a reduction of the known parallel time bounds for these factorizations from O(log3 n) to O(log n). Our results are work efficient; we match the best known work bounds of parallel algorithms with polylog time bounds, and are within a log n factor of the work bounds for the best known sequential algorithms for the same problems. The exact factorizations we compute for symmetric posi- tive definite matrices include: 1. recursive factorization sequences and trees, 2. LU factorizations, 3. QR factorizations, and 4. reduction into upper Hessenberg form. The classes of matrices for which we can efficiently compute these factorizations include: 1. dense matrices, in time O(logn) with processor bound P(n) (the number of processors needed to multiply two nx n matrices in O(log n) time), 2. block diagonal matrices, in time O(log2 b) with P(b)n/b processors, 3. sparse matrices which are s(n)-s torizations only), in time O(log cessors where s(n) is of the for and 4. banded matrices, in parallel tir P(b)n/b processors. Our factorizations also provide us rithms for exact computation (given trices that need not be symmetric following: 1. solution of the corresponding li 2. the determinant, 3. the inverse. Thus our results provide the first k gorithms for exact solution of thes avoids computation of the characteri forms. Instead we use a constructio put matrix, which may initially ha as to have condition nearly 1, and pipelined Newton iteration, followe pipelined Hensel Lifting. Keywords: parallel algorithms torization, dense matrices, sparse m banded matrices.