Consider an N by N array of grid points on P processors, where √P is an integer and the processors are arranged in a √P by √P topology |
Suppose N is exactly divisible by √P, so that a general processor has a grain size of n = N^2/P grid points |
Sequential time T_1 = (N-2)^2 t_calc |
Parallel time T_P = n t_calc |
Speedup S = T_1/T_P = P (1 - 2/N)^2 = P (1 - 2/√(nP))^2 |
S tends to P as N gets large at fixed P |
This expresses analytically the intuitive idea that the load imbalance is due to boundary effects and goes away for large N |
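
The speedup formula can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `speedup` and `efficiency` are hypothetical, times are measured in units of t_calc, and communication cost is ignored, as it is in the expressions for T_1 and T_P above.

```python
import math


def speedup(N: int, P: int) -> float:
    """Speedup S = T_1 / T_P for an N x N grid on a sqrt(P) x sqrt(P) processor mesh.

    Assumptions (taken from the formulas above): only the (N-2)^2 interior
    points are updated, the busiest processor updates all n = N^2/P of its
    points, and communication time is neglected.
    """
    root_p = math.isqrt(P)
    if root_p * root_p != P:
        raise ValueError("P must be a perfect square")
    if N % root_p != 0:
        raise ValueError("N must be exactly divisible by sqrt(P)")
    n = N * N // P          # grain size: grid points per processor
    t1 = (N - 2) ** 2       # sequential time in units of t_calc
    tp = n                  # parallel time in units of t_calc
    return t1 / tp


def efficiency(N: int, P: int) -> float:
    """Efficiency S / P, which equals (1 - 2/N)^2 under the same assumptions."""
    return speedup(N, P) / P


if __name__ == "__main__":
    P = 16
    for N in (8, 64, 512, 4096):
        print(f"N={N:5d}  S={speedup(N, P):8.4f}  S/P={efficiency(N, P):.4f}")
```

Running the loop at fixed P = 16 shows S/P rising toward 1 as N grows, which is the vanishing boundary effect described above.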