next up previous
Next: Computational Chemistry and Up: General HPCC Hardware Previous: General HPCC Hardware

Monte Carlo Methods (Applications 4, 6, 7, 9, 11)

We have already mentioned in Section gif, Quantum Chromodynamics Simulations as a classic example of large scale Monte Carlo simulations suitable for parallel machines. As described in Chapter 4 of [Fox:94a], this application is straightforward to parallelize and very suitable for HPF as the basic data structure is an array. The array represents a regular structure in space time as seen in the simplest finite different problems. The Monte Carlo occurs at each grid point and is typically local (nearest neighbor) so that the overall problem architecture is just like that of a PDE. This specific computation is from an academic field, but is typical in structure of some practical material science problems. Further, just as many PDEs have irregular data structures, the same is true of many Monte Carlos. QCD is typical of simulations of crystalline substances with a regular array of atoms. However, many substances---in particular gases and liquids---have irregular particle distributions and many of issues discussed briefly in Section gif for finite element methods. As described in Chapter 14 of [Fox:94a], there is a subtle point that distinguishes Monte Carlo and PDE algorithms as one cannot simultaneously update in Monte Carlo, sites with overlapping neighbors. This complicates the loosely synchronous structure and can make problem architecture look like that of asynchronous event driven simulations---here events are individual Monte Carlo updates. ``Detailed balance'' requires that such events be sequentially (if arbitrarily) ordered. In the example of [Johnson:86c] described in [Fox:94a], a clever implementation gave good parallel performance.

Monte Carlo methods can be implemented quite differently---above we decomposed the underlying physical data. One can also use ``data parallelism'' on the random number set used in the simulation. This is not possible for QCD for two reasons. Firstly, the physical dataset is so large it would not fit in the memory of a single node---we need to decompose the physical dataset just to get enough total memory. More importantly, one can run QCD with several different starting points. However, all Monte Carlos---using importance sampling of the Metropolis type employed by QCD---have a ``thermalization stage'' where one must get ``into equilibrium'' before the sampling is useful. Thermalization is very time consuming for QCD and makes multiple starting points of limited value. However, there are many cases where this is not true, and as shown in Chapter 7 of [Fox:94a], one can get an embarrassing parallel architecture for Monte Carlo problems. Each instance of the problem has the full physical dataset, but can be run independently with different random number streams. Like many such embarrassingly parallel cases, the different instances do need to accumulate their data---in this case, Monte Carlo averages. One important examples of this class of application is Quantum Monte Carlo used in many ab initio chemistry problems [Kalos:85a].

Yet, a different set of issues comes with a class of Monte Carlo problems which are termed ``clustered.'' In most physical system Monte Carlos, one updates a single ``entity'' (grid point or particle) at a time. This is very ineffective when there is substantial correlation between neighboring points. A simple example comes from ferromagnetic materials where domains form where spins are locked in the same direction over large regions., Clustering algorithms are quite hard to find for sequential systems, and their parallelization is challenging and very different from the earlier examples. As discussed in Section 12.6 of [Fox:94a], the algorithm is similar to that used in region finding in image processing [Copty:93a;94a;95a]. Parallelism requires consideration (as in domain decomposition for PDEs) of inter and intra region issues.

Makivic [Makivic:95a] has described the implementation of sophisticated path integral Monte Carlos for economic modeling, in particular for option pricing. He describes an HPF implementation, but one would probably use simple parallel C++ (or even Java) in a production implementation, as parallelization is rather straightforward with an array holding the different Monte Carlo paths linked by global reduction operations. This field shows a generalization to metaproblems when pricing a full portfolio consisting of many separate financial instruments.



next up previous
Next: Computational Chemistry and Up: General HPCC Hardware Previous: General HPCC Hardware



Geoffrey Fox, Northeast Parallel Architectures Center at Syracuse University, gcf@npac.syr.edu