We find a good example when we consider a typical matrix algorithm.
Consider a block decomposition of the 16 by 16 matrices B and C, as for Laplace's equation. (This is an efficient decomposition, as we will see later.)
Each sum operation involves a subset (group) of 4 processors.
k = 2
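The grouping can be made concrete with a small sketch. It assumes, beyond what the notes state, that the matrix algorithm is the product A = BC with a sum over an index k, and that the 16 by 16 matrices are spread over 16 processors arranged as a 4 by 4 grid, each holding one 4 by 4 block. The names `block`, `row_group`, `col_group` and `block_of_product` are hypothetical helpers chosen for illustration, and the loop runs serially rather than on real processors.

```python
N = 16        # matrix dimension, from the text
P = 4         # processors per side of the grid (assumption: 4 x 4 = 16 processors)
BS = N // P   # each processor holds one BS x BS block

def block(M, bi, bj):
    """Return block (bi, bj) of matrix M as a list of rows."""
    return [row[bj * BS:(bj + 1) * BS] for row in M[bi * BS:(bi + 1) * BS]]

def matmul(X, Y):
    """Plain dense product of two small matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    """Elementwise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def row_group(bi):
    """Grid positions holding block row bi of B: one group of P processors."""
    return [(bi, k) for k in range(P)]

def col_group(bj):
    """Grid positions holding block column bj of C: one group of P processors."""
    return [(k, bj) for k in range(P)]

def block_of_product(B, C, bi, bj):
    """Block (bi, bj) of A = B C, built as a sum over the block index k."""
    acc = [[0] * BS for _ in range(BS)]
    for k in range(P):  # the term k = 2 uses B's block (bi, 2) and C's block (2, bj)
        acc = add(acc, matmul(block(B, bi, k), block(C, k, bj)))
    return acc

# Integer test matrices so the comparison below is exact.
B = [[i + j for j in range(N)] for i in range(N)]
C = [[i - j for j in range(N)] for i in range(N)]
A = matmul(B, C)

print(row_group(1))                                    # [(1, 0), (1, 1), (1, 2), (1, 3)]
print(block_of_product(B, C, 1, 2) == block(A, 1, 2))  # True
```

Under these assumptions, the sum for any one block of A draws its B blocks from a single grid row and its C blocks from a single grid column, so each sum touches a group of 4 processors rather than all 16.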