Monte Carlo Methods

Monte Carlo methods are central to the numerical approaches of many fields (especially physics and chemistry) and by their nature can consume substantial computing resources. Note that the error in the computation decreases only as the inverse square root of the computer time used, compared to the faster power-law convergence of most methods based on differential equations and particle dynamics. One finds Monte Carlo methods when problems are posed as integral equations; the often high-dimensional integrals are evaluated by Monte Carlo using a randomly distributed set of integration points.

Quantum Chromodynamics (QCD) simulations are a classic example of large-scale Monte Carlo simulations suitable for parallel machines. As described in Chapter 4 of PCW, this application is straightforward to parallelize and very suitable for HPF, as the basic data structure is an array. The array represents a regular structure in space-time, as seen in the simplest finite difference problems. The Monte Carlo update occurs at each grid point and is typically local (nearest neighbor), so that the overall problem architecture is just like that of a partial differential equation (PDE). This specific computation is from an academic field, but it is typical in structure of some practical materials science problems.

Further, just as many PDEs have irregular data structures, the same is true of many Monte Carlos. QCD is typical of simulations of crystalline substances with a regular array of atoms. However, many substances---in particular gases and liquids---have irregular particle distributions and raise many of the issues discussed briefly in Chapter IIIA for finite element methods. As described in Chapter 14 of PCW, there is a subtle point that distinguishes Monte Carlo and PDE algorithms: in a Monte Carlo simulation one cannot simultaneously update sites with overlapping neighbors.
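One common way to satisfy this constraint is a checkerboard (red-black) ordering: sites of one checkerboard colour have no neighbours of the same colour, so each colour class can be updated in parallel. The following is a minimal sketch (not from PCW) using a 2D Ising model with Metropolis updates; all names and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep of a 2D Ising model done in two half-sweeps.

    Sites of one checkerboard colour have no neighbours of the same
    colour, so each half-sweep may update its whole colour class at
    once without updating sites with overlapping neighbours.
    """
    ii, jj = np.indices(spins.shape)
    for colour in (0, 1):                      # red sites, then black sites
        mask = (ii + jj) % 2 == colour
        # sum of the four nearest neighbours (periodic boundaries)
        nbr = (np.roll(spins, 1, 0) + np.roll(spins, -1, 0) +
               np.roll(spins, 1, 1) + np.roll(spins, -1, 1))
        dE = 2.0 * spins * nbr                 # energy cost of flipping each site
        accept = rng.random(spins.shape) < np.exp(-beta * dE)
        spins = np.where(mask & accept, -spins, spins)
    return spins

spins = rng.choice([-1, 1], size=(16, 16))
for _ in range(100):
    spins = checkerboard_sweep(spins, beta=0.6, rng=rng)
print(spins.sum())
```

On a parallel machine each half-sweep is itself data parallel, which is why the problem architecture resembles that of a PDE solver.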
This update constraint complicates the loosely synchronous structure and can make the problem architecture look like that of asynchronous event-driven simulations---here the events are individual Monte Carlo updates. ``Detailed balance'' requires that such events be sequentially (if arbitrarily) ordered. In the example of (Johnson:86c) described in PCW, a clever implementation gave good parallel performance in spite of the asynchronous structure.

Many special-purpose machines have been built for Monte Carlo based problems---especially the regular QCD-like cases, which have special features that make ``conventional parallel machines overkill.'' The intense computational load of QCD implies that a different trade-off among computational power, memory, and communication bandwidth is appropriate. Often the codes are quite modest (say, 10,000 lines of code or less), so a rich software environment is not necessary and one can optimize the inner loops in hardware or software. Recent specialized machines of note include those in Japan and from Norman Christ's group at Columbia University.

Monte Carlo methods can be implemented quite differently---above we decomposed the underlying physical data, but one can also use ``data parallelism'' on the set of random number streams used in the simulation. This is not practical for QCD for two reasons. Firstly, the physical dataset is so large that it would not fit in the memory of a single node---we need to decompose the physical dataset just to get enough total memory. More importantly, although one could run QCD from several different starting points, all Monte Carlos using importance sampling of the Metropolis type employed by QCD have a ``thermalization stage'' where one must get ``into equilibrium'' before the sampling is useful. Thermalization is very time consuming for QCD and makes multiple starting points of limited value.
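The cost of thermalization can be seen even in a toy Metropolis chain. The sketch below (illustrative, not from PCW) samples a standard normal distribution starting far from equilibrium; the early, biased samples must be discarded before averaging, which is exactly the overhead that each independent starting point would pay again.

```python
import math
import random

random.seed(1)

def metropolis_chain(n_steps, start, step=1.0):
    """Metropolis chain targeting a standard normal density."""
    x, chain = start, []
    for _ in range(n_steps):
        prop = x + random.uniform(-step, step)
        # accept with probability min(1, p(prop)/p(x)) for p(x) ~ exp(-x^2/2)
        if random.random() < math.exp(0.5 * (x * x - prop * prop)):
            x = prop
        chain.append(x)
    return chain

# start far from equilibrium: the early samples are biased until the
# chain thermalizes, so they are discarded before averaging
chain = metropolis_chain(20000, start=10.0)
burn_in = 1000
mean_eq = sum(chain[burn_in:]) / len(chain[burn_in:])
print(mean_eq)   # close to the true mean of 0 once thermalized
```

For QCD the analogue of this burn-in phase dominates, which is why replicating the run over many random starting points buys little.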
However, there are many cases where this is not true, and as shown in Chapter 7 of PCW, one can then obtain an embarrassingly parallel architecture for Monte Carlo problems. Each instance of the problem holds the full physical dataset but can be run independently with a different random number stream. Like many such pleasingly parallel cases, the different instances do need to accumulate their data---in this case, Monte Carlo averages. One important example of this class of application is Quantum Monte Carlo, used in many ab initio chemistry problems (Kalos:85a).

Yet a different set of issues comes with the class of Monte Carlo problems termed ``clustered.'' In most physical-system Monte Carlos, one updates a single ``entity'' (grid point or particle) at a time. This is very inefficient when there is substantial correlation between neighboring points. A simple example comes from ferromagnetic materials, where domains form in which spins are locked in the same direction over large regions. Clustering algorithms are quite hard to find even for sequential systems, and their parallelization is challenging and very different from the earlier examples. As discussed in Section 12.6 of PCW, the algorithm is similar to that used in region finding in image processing (Copty:95a). Parallelism requires consideration (as in domain decomposition for PDEs) of inter- and intra-region issues. As in most computational domains, considerable application expertise is needed to produce realistic codes---computer scientists cannot ``just'' implement the basic physics equations.

In many sparse matrix approaches to PDEs, one uses preconditioners to improve the convergence of iterative methods such as conjugate gradient. Preconditioners can be thought of as reducing the modulus of the largest eigenvalue of the iteration matrix. In the case of Monte Carlo methods, one uses ``importance sampling'' to improve the convergence.
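A minimal sketch of importance sampling as variance reduction follows; the integrand 5x^4 on [0,1] (whose integral is exactly 1) and the sampling density 3x^2 are purely illustrative choices, not taken from the text.

```python
import random

random.seed(2)
N = 100_000   # number of sample points

# plain Monte Carlo: uniform samples on [0, 1], integrand f(x) = 5x^4
plain = [5 * random.random() ** 4 for _ in range(N)]

# importance sampling: draw x with density q(x) = 3x^2 (x = u^(1/3) by
# inverting the CDF) and average the reweighted integrand
# f(x)/q(x) = (5/3) x^2, which varies much less over [0, 1]
weighted = []
for _ in range(N):
    x = random.random() ** (1 / 3)
    weighted.append((5 / 3) * x * x)

def mean_std(xs):
    """Sample mean and the standard error of that mean."""
    m = sum(xs) / len(xs)
    var = sum((v - m) ** 2 for v in xs) / (len(xs) - 1)
    return m, (var / len(xs)) ** 0.5

print(mean_std(plain))
print(mean_std(weighted))   # same mean, noticeably smaller error bar
```

Both estimators converge to the exact answer, but the reweighted one has a much smaller standard deviation, and hence a smaller error for the same number of samples.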
Importance sampling is essentially equivalent to changing variables in the numerical integrals so as to reduce the standard deviation of the integrand, where we remember that the error is proportional to $\sigma/\sqrt{N}$ for $N$ sample points and integrand standard deviation $\sigma$. As described in a separate case study, Monte Carlo methods are the most powerful approach to many economic modelling problems, although some of the more interesting derivative options need sophisticated extensions to the basic technique. This field shows a generalization to metaproblems when pricing a full portfolio consisting of many separate financial instruments.

References

Paul Coddington, NHSE Article on Parallel Random Number Generation
QCD
Japan
Christ Machine
PCW
Copty:95a
Kalos:85a
Metropolis