The reason we want to use cluster algorithms is that they offer such a large
reduction in computation time over the traditional Metropolis algorithm.
For example, for the Ising model on a lattice, the number of
Metropolis iterations required to generate an independent configuration is of
order 1000, whereas for the cluster algorithms it is of order 10, i.e., a
100-fold improvement in speed.
However, early attempts at running cluster algorithms on parallel supercomputers gave speed-ups only of order 10, whereas the Metropolis algorithm can easily have speed-ups of 100--1000 on vector and parallel supercomputers.
So on modern supercomputers, the improvement of the cluster algorithm is lost unless an efficient parallel algorithm can be found.
The first attempts at parallelizing irregular cluster algorithms got poor speedups compared to regular Metropolis algorithms. Figure taken from A.N. Burkitt and D.W. Heermann, ``Parallelization of a Cluster Algorithm'', Comp. Phys. Comm. 54, 210 (1989).