Since the Metropolis algorithm for spin models is local and regular, it should parallelize very efficiently, even for the Ising model, which requires very little computation per site update.
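To make the locality concrete, here is a minimal Python sketch of one Metropolis sweep for the 2-d Ising model. The checkerboard (red/black) update order, the coupling J = 1, the periodic boundaries, and the name `checkerboard_sweep` are assumptions made for illustration and are not taken from the text. The point is only that each site update reads its four nearest neighbours and nothing else.

```python
import numpy as np

def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep of a 2-d Ising lattice, updated in place.

    Assumptions (illustrative): J = 1, periodic boundaries, spins are +1/-1,
    and both lattice dimensions are even so the two-colouring is consistent.
    """
    rows, cols = np.indices(spins.shape)
    for parity in (0, 1):
        # Sites of one colour have no nearest neighbours of the same colour,
        # so they can all be updated at the same time.
        mask = (rows + cols) % 2 == parity
        # Sum of the four nearest neighbours of every site.
        nbrs = (np.roll(spins, 1, axis=0) + np.roll(spins, -1, axis=0)
                + np.roll(spins, 1, axis=1) + np.roll(spins, -1, axis=1))
        # Energy change if the spin at each site were flipped.
        delta_e = 2.0 * spins * nbrs
        # Metropolis rule: accept with probability min(1, exp(-beta * dE)).
        accept = rng.random(spins.shape) < np.exp(-beta * delta_e)
        spins[mask & accept] *= -1
    return spins

rng = np.random.default_rng(0)
lattice = rng.choice(np.array([-1, 1]), size=(8, 8))
checkerboard_sweep(lattice, beta=0.4, rng=rng)
```

Because only nearest-neighbour values are read, a processor that owns a block of sites needs data from other processors only for the sites on the edge of its block.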
For a 2-d grid of processors, use a (BLOCK,BLOCK) distribution of the sites of a 2-d lattice over the processor grid, so that every processor has an $l \times l$ sub-lattice.
Communication time $\propto l$ (# edge sites of sub-lattice, i.e. perimeter).
Calculation time $\propto l^2$ (# sites of sub-lattice, i.e. volume).
Thus, as long as $l$ is large enough, the communication/calculation ratio, which scales as $1/l$, will be small, and the efficiency will be near 1.
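The perimeter/volume argument can be illustrated with a small Python sketch. The efficiency model used here, 1 / (1 + communication time / calculation time), and the per-site costs `t_comm` and `t_calc` are hypothetical choices made for illustration. They are not measured values and do not come from the text.

```python
def edge_sites(l):
    """Number of sites on the boundary of an l x l sub-lattice."""
    # For l >= 2 the boundary has 4*l - 4 sites; a 1 x 1 block is all boundary.
    return l * l if l < 2 else 4 * l - 4

def efficiency(l, t_comm=10.0, t_calc=1.0):
    """Estimated parallel efficiency for an l x l sub-lattice per processor.

    t_comm: assumed cost of communicating one edge site (arbitrary units).
    t_calc: assumed cost of updating one site (same units).
    """
    comm = t_comm * edge_sites(l)   # scales with the perimeter, ~ l
    calc = t_calc * l * l           # scales with the volume, l^2
    return 1.0 / (1.0 + comm / calc)

for l in (4, 16, 64, 256, 1024):
    ratio = edge_sites(l) / (l * l)
    print(f"l={l:5d}  edge/volume={ratio:.4f}  efficiency={efficiency(l):.3f}")
```

Under these assumed costs the estimated efficiency rises toward 1 as the sub-lattice grows, since the edge/volume ratio falls off roughly as 1/l.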