We start with a nostalgic note. The 1984 COMPCON conference was my first opportunity to discuss our early hypercube results from Caltech [1], based on the software and science applications we built for C. Seitz's 64-node Cosmic Cube, which began ``production'' runs on Quantum Chromodynamics (QCD) in October 1983. That first MIMD machine delivered only two megaflops, ten times the performance of the VAX 11/780 we were using at the time. However, the basic parallelization issues remain similar in the 1991 six-gigaflop QCD implementation on the full-size CM-2. What have we and others learned in the succeeding eight years, while the parallel hardware has evolved impressively, with in particular a factor of 3000 improvement in performance? There is certainly plenty of information! In 1989, I surveyed some four hundred papers describing parallel applications [2], [3]; by now the total must be over one thousand. A complete new survey is too daunting for me. However, my personal experience, and I believe the lesson of the widespread international research on message passing parallel computers, carries a clear message.
The message passing computational model is very powerful and allows one to express essentially all large-scale computations and execute them efficiently on distributed memory SIMD and MIMD parallel machines.
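To make the model concrete, the following is a minimal sketch of a message passing program in C. It is written against the MPI interface, which postdates this paper and is offered purely as an illustration of the model, not as the software used in the work described here (the Caltech work used its own, earlier message passing libraries); the ring exchange pattern and all names in the example are illustrative choices.

\begin{verbatim}
#include <mpi.h>
#include <stdio.h>

/* Minimal sketch of the message passing model: each node owns its own
 * memory and cooperates only through explicit sends and receives.
 * Here every node passes one integer to its right neighbor in a ring. */
int main(int argc, char **argv)
{
    int rank, size, sendval, recvval;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this node's identity   */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of nodes  */

    sendval = rank;                         /* datum owned locally    */

    /* Combined send/receive avoids the deadlock that a careless
     * ordering of blocking MPI_Send/MPI_Recv calls could produce. */
    MPI_Sendrecv(&sendval, 1, MPI_INT, (rank + 1) % size, 0,
                 &recvval, 1, MPI_INT, (rank + size - 1) % size, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("node %d received %d from node %d\n",
           rank, recvval, (rank + size - 1) % size);

    MPI_Finalize();
    return 0;
}
\end{verbatim}

The same program text runs on every node, with behavior differentiated only by the node's rank; this single-program, multiple-data style is one common way the message passing model is expressed on distributed memory MIMD machines.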
Less formally, one can say that parallel computing works, or, more controversially but in my opinion accurately, that ``distributed memory parallel computing works''. In the rest of this paper, we will dissect this assertion and suggest that it has different implications for hardware, software, and applications. Formally, we relate these as shown in Figure 1 by viewing computation as a series of maps. Software is an expression of the map of the problem onto the machine. In Section 2, we review a classification of problems described in more detail in [3], [4], [5], [6], [7], [8]. In the following three sections, we draw lessons for applications, hardware, and software, and quantify our assertion above about message passing parallel systems.
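Schematically, and as a paraphrase of the Figure 1 picture rather than a reproduction of it, this view reads
\[
  \mbox{problem} \ \stackrel{\mbox{\scriptsize software}}{\longrightarrow}\ \mbox{machine},
\]
where the single arrow compresses whatever intermediate stages of the series of maps Figure 1 distinguishes.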