Models: Understanding and Interoperability

Broadly speaking, parallel computing can be classified into three programming models: data parallel, task parallel, and object parallel. In data parallelism, concurrency is achieved with a single thread of control operating over the elements of large distributed data structures such as arrays. The implementation of a data parallel program can be either SIMD or SPMD. In addition to entirely new data parallel languages, data parallel extensions exist for various dialects of Fortran, C, and Lisp. For example, Fortran 90, with its operations over arrays and array sections, can be considered a data parallel language. For a data parallel program to run efficiently on large distributed memory SIMD and MIMD machines, data should reside in memories as close as possible to the processors that will need them.

In task parallelism, a program is partitioned into cooperating tasks. These tasks can be quite different from one another, execute asynchronously, and use a variety of techniques for synchronizing with each other. Many languages support some degree of both data and task parallelism. Locality is once again an issue in task parallelism on most architectures. Language constructs are necessary to allow the user to exploit fine-grain and coarse-grain parallelism, as well as levels in between.

The problems of locality and concurrency also affect parallel object-oriented programs. Parallelism in object-oriented languages is achieved in different ways. For example, (1) a member function of one object can call a public member function of another (remote) object, and (2) an abstract data type, such as an array, can be implemented in a distributed manner. (Illustrative sketches of these three models appear at the end of this section.) This partitioning of programming models into three categories is incomplete and fuzzy; it is intended to serve only as a rough guideline. Furthermore, a finer categorization is possible, for instance partitioning task parallelism into functional programming, communicating processes, and so on.

Understanding the choice. Our treatment of the various computing models in some ways mirrors the situation with parallel languages today: a number of good ideas have been proposed, and some implemented, but little or no effort has been made to provide an organizational framework that would help the user decide on the appropriate model for a particular application; in fact, it is likely that a combination of methods is required. Such a framework would form the foundation for a methodology for parallel programming and would begin to elucidate ways in which paradigms might profitably be integrated.

Interoperability of Models. For many applications, a combination of paradigms will be the most effective way to solve the problem. This could be achieved either in a single language (for example, by including both data layout directives and task-oriented extensions in the same language) or by making it easier for the user to use different languages and models for different parts of an application. In fact, it is very desirable to standardize parameter passing, data descriptors, and calling conventions so that different languages can communicate. The various computing models must also deal with issues such as concurrency specification, parallel I/O, and exception handling in a consistent fashion.

The current status of parallel programming paradigms

1. No single paradigm is "the answer," even within a single program. The current software provided by vendors represents different programming paradigms: data parallel, basic shared memory, and message passing.
There is no consensus among GC applications researchers on the best model to use. To some extent the choice is based on taste, but it is also based on specific application domain considerations, on software availability, and on the different types of functionality found within a given program. It is noted that there is a hierarchy in the choice of programming paradigms. Users express a definite preference for general, portable, high-level languages such as C++ for the portions of programs associated with preprocessing, functional control, and postprocessing. When it comes to the computationally intensive parts of programs, however, applications researchers are most likely to use a language for which the compilers are more efficient, such as Fortran. There is a further tendency among researchers working on heavily used algorithms (e.g., FFT or linear equation solvers) to focus even more closely on efficiency, with explicit message passing and even assembly language.

2. Currently, each vendor tends to promote only a single paradigm (matching its architecture) and lacks the resources to give substantial support to multiple paradigms. This situation is in contrast to (1) above. Furthermore, the hardware technology cycle is faster than the robust tool development cycle, leading to constant pressure on paradigm development.

3. Portability of applications is an important consideration. It is not so much the case that users expect to be moving their codes daily, but rather that the underlying hardware is still in a state of flux. Users are uncomfortable with committing substantial optimization efforts to vendor-specific language extensions, message-passing systems, or assembly languages when they expect hardware changes.

4. Rich data and operation abstractions are receiving increasing attention because of the resulting code simplification and reuse. C++ is being used by several projects, and there are reports of substantial productivity gains due to the higher level of abstraction. Paradigm studies must include realistic industrial and commercial applications.

Achieving better performance with data abstraction. The need for data abstraction is particularly pressing for researchers who solve problems with complex, hierarchically defined data structures such as structured adaptive grids or the irregular tree-type data structures found in fast multipole algorithms. An integrated framework should address the following:

1. Methods should be developed to allow the mixing of different programming paradigms and languages in a single framework; mixed programming paradigms should be integrated into a unified framework. Examples of such mixing are studied in Section 3.3 and include standard ways of calling Fortran from C (see the calling-convention sketch at the end of this section), invoking message-passing code from HPF, or coupling a heterogeneous set of data parallel programs. Researchers interested in multidisciplinary problems also need to have large-grained, process-level parallelism with subordinate data-parallel modules.

2. Widely available, modular components for software tools should be developed to speed the implementation of new programming paradigms.
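To make the models discussed in this section concrete, a few illustrative sketches follow, all in C++ for uniformity. First, data parallelism: the fragment below applies one operation uniformly across whole arrays, the analogue of a Fortran 90 array expression. This is a minimal sketch; on a real data parallel system the compiler or runtime would distribute the elements across processor memories rather than keep them in a single address space.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
    // A single logical thread of control: the same operation is
    // applied to every element of the (conceptually distributed)
    // arrays, as in the Fortran 90 array expression  c = a + 2.0*b .
    std::vector<double> a(1000, 1.0), b(1000, 2.0), c(1000);

    std::transform(a.begin(), a.end(), b.begin(), c.begin(),
                   [](double x, double y) { return x + 2.0 * y; });

    std::printf("c[0] = %g\n", c[0]);  // prints 5
    return 0;
}
```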
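Second, task parallelism: distinct tasks that run asynchronously and synchronize explicitly. The sketch below uses two C++ threads, a producer and a consumer, coordinated with a mutex and condition variable; these primitives stand in for whatever task and synchronization constructs a particular parallel language provides.

```cpp
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>

std::queue<int> work;            // shared between the two tasks
std::mutex m;
std::condition_variable ready;
bool done = false;

void producer() {                // one task: generates work items
    for (int i = 0; i < 5; ++i) {
        std::lock_guard<std::mutex> lock(m);
        work.push(i * i);
        ready.notify_one();
    }
    std::lock_guard<std::mutex> lock(m);
    done = true;                 // signal that no more work is coming
    ready.notify_one();
}

void consumer() {                // a different task: consumes them
    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        ready.wait(lock, [] { return !work.empty() || done; });
        if (work.empty()) return;        // producer finished, queue drained
        int item = work.front();
        work.pop();
        lock.unlock();
        std::printf("consumed %d\n", item);
    }
}

int main() {
    std::thread p(producer), c(consumer);
    p.join();
    c.join();
    return 0;
}
```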
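Third, the object parallel idea of a distributed abstract data type. The hypothetical class DistributedArray below partitions its storage into blocks, the way a distributed array implementation would partition elements across processors. Here the blocks simply live in one address space; in a real system each block would be owned by a different processor, and applyAll() would invoke a member function on each remote owner.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical sketch of a distributed-array abstract data type.
// Each block models the portion of the array owned by one processor.
class DistributedArray {
public:
    DistributedArray(int nblocks, int blocksize)
        : blocks_(nblocks, std::vector<double>(blocksize, 0.0)) {}

    // Apply the same operation to every owned block; this is the
    // hook through which data parallel operations are expressed.
    template <typename Op>
    void applyAll(Op op) {
        for (auto& block : blocks_)      // conceptually: one remote
            for (auto& x : block)        // call per owning processor
                x = op(x);
    }

    double sum() const {                 // conceptually: a reduction
        double s = 0.0;                  // across all processors
        for (const auto& block : blocks_)
            for (double x : block) s += x;
        return s;
    }

private:
    std::vector<std::vector<double>> blocks_;
};

int main() {
    DistributedArray a(4, 256);                          // 4 owners
    a.applyAll([](double) { return 1.0; });              // a = 1
    a.applyAll([](double x) { return 2.0 * x + 1.0; });  // a = 3
    std::printf("sum = %g\n", a.sum());                  // 4*256*3 = 3072
    return 0;
}
```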
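Finally, the calling-convention sketch referenced in point 1 above: the most common case of language mixing, calling Fortran from C or C++. The sketch assumes the widespread but compiler-specific UNIX f77 convention, in which a Fortran subroutine is visible under a lowercased name with a trailing underscore and takes every argument by reference. The routine name daxpy_like is hypothetical; a stand-in definition is included so the sketch runs on its own, where a real build would link against the Fortran object file instead.

```cpp
#include <cstdio>

// In a mixed-language application this symbol would come from a
// separately compiled Fortran file, e.g.
//
//       SUBROUTINE DAXPY_LIKE(N, A, X, Y)
//       INTEGER N
//       DOUBLE PRECISION A, X(N), Y(N)
//       ...  Y = A*X + Y  ...
//
// Under the common UNIX f77 convention (an assumption; the exact
// rule varies by compiler) the name is seen from C/C++ lowercased
// with a trailing underscore, and all arguments pass by reference.
extern "C" void daxpy_like_(const int* n, const double* a,
                            const double* x, double* y);

// Stand-in definition so this sketch links and runs by itself; in a
// real build the Fortran object file would provide this symbol.
extern "C" void daxpy_like_(const int* n, const double* a,
                            const double* x, double* y) {
    for (int i = 0; i < *n; ++i) y[i] = (*a) * x[i] + y[i];
}

int main() {
    const int n = 4;
    const double alpha = 2.0;
    double x[] = {1.0, 2.0, 3.0, 4.0};
    double y[] = {10.0, 10.0, 10.0, 10.0};

    daxpy_like_(&n, &alpha, x, y);   // y := alpha*x + y, "in Fortran"

    for (int i = 0; i < n; ++i)
        std::printf("y[%d] = %g\n", i, y[i]);  // 12 14 16 18
    return 0;
}
```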