Why use Process Groups?
A good example is a typical matrix algorithm:
- Matrix multiplication
- A_{i,j} = \sum_k B_{i,k} C_{k,j}
- The sum runs over k, which indexes the columns of B and the rows of C
Consider a block decomposition of the 16 by 16 matrices B and C, as for Laplace's equation. (This is an efficient decomposition, as we will see later.)
Each sum operation then involves only a subset (group) of 4 processors
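
The grouping can be made concrete with a small sketch. It is not MPI code: it only simulates the bookkeeping in plain Python, under the assumption that the 16 by 16 matrices are spread over 16 processes arranged as a 4 by 4 grid, each owning one 4 by 4 block. The names GRID, BLOCK, rank_of, row_group, col_group and product_block are hypothetical and chosen for illustration only.

```python
# Sketch only: assumes a 4 x 4 process grid, one 4 x 4 block per process.
GRID = 4            # processes along each side of the grid (assumed)
BLOCK = 4           # rows/columns in each block (assumed)
N = GRID * BLOCK    # full matrix size: 16


def rank_of(prow, pcol):
    """Row-major rank of the process at grid position (prow, pcol)."""
    return prow * GRID + pcol


def row_group(prow):
    """Ranks of the 4 processes in grid row prow."""
    return [rank_of(prow, k) for k in range(GRID)]


def col_group(pcol):
    """Ranks of the 4 processes in grid column pcol."""
    return [rank_of(k, pcol) for k in range(GRID)]


def get_block(M, prow, pcol):
    """Extract the BLOCK x BLOCK sub-matrix of M owned by (prow, pcol)."""
    rows = M[prow * BLOCK:(prow + 1) * BLOCK]
    return [row[pcol * BLOCK:(pcol + 1) * BLOCK] for row in rows]


def block_matmul(X, Y):
    """Ordinary product of two BLOCK x BLOCK matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(BLOCK))
             for j in range(BLOCK)] for i in range(BLOCK)]


def block_add(X, Y):
    """Element-wise sum of two BLOCK x BLOCK matrices."""
    return [[X[i][j] + Y[i][j] for j in range(BLOCK)] for i in range(BLOCK)]


def product_block(B, C, prow, pcol):
    """Block (prow, pcol) of A = B C, as a sum over the block index k.

    The B blocks come from the processes in row_group(prow) and the
    C blocks from the processes in col_group(pcol): 4 processes each.
    """
    acc = [[0] * BLOCK for _ in range(BLOCK)]
    for k in range(GRID):
        term = block_matmul(get_block(B, prow, k), get_block(C, k, pcol))
        acc = block_add(acc, term)
    return acc


def full_matmul(B, C):
    """Reference N x N product, used only to check the block version."""
    return [[sum(B[i][k] * C[k][j] for k in range(N))
             for j in range(N)] for i in range(N)]
```

In a real MPI program, groups like these would typically be expressed as sub-communicators, for example by calling MPI_Comm_split with the grid row or grid column as the color, so that a collective operation touches only the 4 processes involved rather than all 16.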