For a data parallel Wolff algorithm, it is not possible to get the high efficiences that can be obtained for regular (e.g. Metropolis) algorithms, even using a cyclic data distribution. However highly irregular Monte Carlo methods, such as the Wolff algorithm, can be parallelized efficiently, since domain decomposition is not the only level of parallelism available for MC algorithms.
Remember that the basic idea of MC simulation is to approximate an extremely large or infinite sum (or large or infinite dimensional integral) by a finite sum over important configurations. The error is predominantly statistical, that is, proportional to , where N is the number of independent configurations. So here is another way the simulation can be done in parallel --- if different processors are generating independent configurations to add to the sum.