Each of the machines and architectures described above has both strong and weak points, as they have been optimized in different ways.
The shared and virtual shared memory architectures have been designed for easier software development, and in particular, for easier porting of existing Fortran codes. It has, however, proved difficult to scale them to large systems while retaining good cost-performance, although many believe that new hardware developments may change this. This MPP trend toward shared memory should be contrasted with the opposite tendency seen in metacomputing, which is probably the most rapidly growing area and is clearly based on distributed memory.
The data parallel methodology described earlier fits well with distributed memory machines, but substantial reworking of software is needed so that compilers can exploit the inherent parallelism of the problem. However, distributed memory machines are clearly scalable to very large systems, and with the appropriate software, the vast majority of large-scale problems will run on them. The trade-off between SIMD and MIMD is also reasonably well understood in terms of a problem classification introduced by Fox. Regular applications, such as the matrix operations seen in Figure 6, are suitable for SIMD machines; MIMD computers can perform well both on these and on the irregular problems typified by the particle dynamics simulation in Figure 5. We estimated in 1990 that roughly half of existing large supercomputer simulations could use SIMD efficiently, with the other half needing the extra flexibility of the MIMD architecture. The increasing interest in adaptive irregular algorithms, which require MIMD systems, is decreasing the relevance of SIMD machines.
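As a rough illustration of this classification, the C sketch below (not taken from the source; all names and data layouts are hypothetical) contrasts a regular, data-parallel matrix operation, where every element executes the same instruction and a SIMD machine performs well, with an irregular particle-force loop, where the work per particle varies and the independent instruction streams of a MIMD machine are needed.

/* Minimal sketch contrasting the two problem classes discussed above.
 * The data layouts and function names are hypothetical. */
#include <stddef.h>

/* Regular problem: dense matrix addition.  Every element performs the
 * identical operation, so a SIMD machine can apply a single instruction
 * stream across all data elements in lockstep. */
void matrix_add(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n * n; i++)
        c[i] = a[i] + b[i];
}

/* Irregular problem: particle dynamics with a per-particle neighbour
 * list.  The amount of work and the memory access pattern differ from
 * particle to particle, so each processor benefits from its own
 * instruction stream -- the flexibility a MIMD machine provides. */
typedef struct {
    size_t  n_neighbours;   /* varies from particle to particle */
    size_t *neighbours;     /* indices of interacting particles */
} particle_t;

void accumulate_forces(size_t n_particles, const particle_t *p,
                       const double *x, double *force)
{
    for (size_t i = 0; i < n_particles; i++) {
        force[i] = 0.0;
        for (size_t j = 0; j < p[i].n_neighbours; j++) {
            /* hypothetical pairwise interaction term */
            double d = x[i] - x[p[i].neighbours[j]];
            if (d != 0.0)
                force[i] += 1.0 / d;
        }
    }
}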
The hardware and software are evolving so as to integrate the various architectures. In the future, the user will hopefully be presented with a uniform interface to the different parallel machines, although it is not clear that MPPs and heterogeneous metacomputers can be supported effectively with the same software model. One will be able to make choices, as for conventional machines, based on the parallel computer's performance on one's application mix; one will not be faced, as in the past, with radically different software environments on each design. Future architectural developments will improve performance by moving critical functionality, such as the illusion of shared memory, from software to hardware; this will offer the user increased performance from an unchanged software model.