NPAC Technical Report SCCS-673
Scalable BLAS 2 and 3 Matrix Multiplication for Sparse Banded Matrices on Distributed Memory MIMD Machines
Nikos Chrisochoides, Aboelaze Mokhtar, Elias Houstis, Catherine Houstis
Submitted December 11, 1994
Abstract
In this paper, we present two algorithms for sparse banded matrix-vector
and sparse banded matrix-matrix product operations on distributed-memory
multiprocessor systems that support mesh and ring interconnection
topologies.
We also study the scalability of these two algorithms.
We employ systolic-type techniques to eliminate synchronization delay
and minimize the
communication overhead among processors. The performance of the algorithms
developed for these operations depends on the bandwidth of
the matrices involved. Both algorithms are currently implemented on the
NCUBE II with 64 processors. Our preliminary experimental data
agree with the expected theoretical behavior.
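
To illustrate why the cost of these operations depends on the bandwidth, the following is a minimal serial sketch of a banded matrix-vector product. It is not the distributed systolic algorithm of this report. The function name banded_matvec and the diagonal-wise storage layout (a dictionary mapping each diagonal offset k to the entries A[i][i+k]) are assumptions made only for this example. The work is proportional to n times the number of stored diagonals, rather than n squared.

```python
def banded_matvec(n, diagonals, x):
    """Compute y = A x for an n-by-n banded matrix A.

    diagonals maps an offset k to a list of length n - |k| holding the
    entries A[i][i + k], ordered by increasing row index i
    (k = 0 is the main diagonal, k > 0 above it, k < 0 below it).
    This storage scheme is a hypothetical choice for illustration.
    """
    y = [0.0] * n
    for k, diag in diagonals.items():
        first_row = max(0, -k)      # first row where column i + k >= 0
        end_row = min(n, n - k)     # one past the last row where i + k < n
        for i in range(first_row, end_row):
            y[i] += diag[i - first_row] * x[i + k]
    return y


# Usage: the 4-by-4 tridiagonal matrix with 2 on the main diagonal
# and -1 on the first sub- and super-diagonals.
tridiagonal = {
    0: [2.0, 2.0, 2.0, 2.0],
    1: [-1.0, -1.0, -1.0],
    -1: [-1.0, -1.0, -1.0],
}
result = banded_matvec(4, tridiagonal, [1.0, 2.0, 3.0, 4.0])
```

Only the three stored diagonals are visited, so the inner loop performs 10 multiply-add operations here instead of the 16 a dense product would need.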