NPAC Technical Report SCCS-688

Parallelization of Hierarchically Structured Applications on Distributed Memory Machines

Sanjay Goil, Sanjay Ranka

Submitted January 31 1995


Abstract

(Preliminary version/ Under Revision)


In this paper we address a class of problems consisting of highly structured computations on data sets that are described by hierarchical data structures. These are often represented as {\em tree} structures to optimize data storage requirements and perform efficient queries for data access. Specifically, applications that are dynamic and perform many iterations on data are of interest to us, since the requirements for data evolve over time and require modifying data structures incrementally. The computational relationship between the subdomains is known only at runtime, and may change between computation phases. Parallelization of such applications requires efficient distributed data management and has received some attention recently. We study the computational structure of a few irregular applications falling in this category. This helps us to discuss the needs of choosing an appropriate data structure and the issues of data partitioning, load balancing and communication requirements for these applications. Most recent efforts have been application specific and solutions are not portable across applications or parallel computing platforms. We wish to characterize requirements for primitives and runtime software support needed to parallelize irregular applications. In this paper we present a detailed study of the computational structure of the following eight hierarchical applications: {\em N-body simulation, Molecular Dynamics, Hierarchical Radiosity, Volume Rendering and Ray . This allows us to identify requirements for parallelization across a broad spectrum of applications. Each of these applications uses a tree data structure. We enumerate tree algorithms that are used for solving these applications and discuss their parallelization. Next, we present an outline of common primitives that are required for these applications. Requirements for a runtime support system are presented for obtaining effective parallel solutions using the above primitives on coarse-grained distributed memory machines.


PostScript version of the paper