NPAC Technical Report SCCS-688
Parallelization of Hierarchically Structured Applications on Distributed Memory Machines
Sanjay Goil, Sanjay Ranka
Submitted January 31 1995
Abstract
(Preliminary version/ Under Revision)
In this paper we address a class of problems consisting of highly structured
computations on data sets that are described by hierarchical data structures.
These are often represented as {\em tree} structures to optimize
data storage requirements and perform efficient queries for data access.
Specifically, applications that are dynamic and perform many iterations on data
are of interest to us, since the requirements for data evolve over time and require
modifying data structures incrementally. The computational relationship between the
subdomains is known only at runtime, and may change between computation phases.
Parallelization of such applications requires efficient distributed data management
and has received some attention recently.
We study the computational structure of a few irregular applications falling
in this category. This helps us to discuss
the needs of choosing an appropriate data structure and the issues of data
partitioning, load balancing and communication requirements for these applications.
Most recent efforts have been application specific and solutions are not
portable across applications or parallel computing platforms.
We wish to characterize requirements for primitives and runtime software support
needed to parallelize irregular applications.
In this paper we present a detailed study of the computational structure of
the following eight hierarchical applications:
{\em N-body simulation, Molecular Dynamics, Hierarchical Radiosity, Volume Rendering and Ray .
This allows us to identify requirements for parallelization across a broad spectrum
of applications. Each of these applications uses a tree data structure.
We enumerate tree algorithms that are used for solving these applications
and discuss their parallelization. Next, we present an outline of common
primitives that are required for these applications. Requirements for a runtime
support system are presented for obtaining effective parallel solutions
using the above primitives on coarse-grained distributed memory machines.