1 | Consider an N by N grid of points on P processors, with grain size $n = N^2/P$ grid points per processor |
2 | Sequential Time $T_1 = 4 N^2\, t_{float}$ |
3 | Parallel Time $T_P = 4 n\, t_{float} + 4 \sqrt{n}\, t_{comm}$ |
4 | Speed up $S = P\,(1 - 2/N)^2 / \left(1 + t_{comm}/(\sqrt{n}\, t_{float})\right)$ |
5 | Both overheads decrease like $1/\sqrt{n}$ as n increases |
6 | This ignores communication latency but is otherwise accurate |
7 | Speed up is reduced from P by both overheads |
8 | The two overheads are Load Imbalance, the factor $(1 - 2/N)^2$, and Communication Overhead, the term $t_{comm}/(\sqrt{n}\, t_{float})$ |
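
The speed-up formula above can be evaluated numerically to see how both overheads shrink as the grain size grows. The following is a minimal sketch, not part of the original slides: the function names `grain_size` and `speedup` and the sample values of `t_float` and `t_comm` are assumptions chosen for illustration, and, like the formula itself, the sketch ignores communication latency.

```python
import math


def grain_size(N, P):
    """Grid points per processor, n = N^2 / P, for an N by N grid."""
    return N * N / P


def speedup(N, P, t_float, t_comm):
    """Speed up S = P (1 - 2/N)^2 / (1 + t_comm / (sqrt(n) t_float))."""
    n = grain_size(N, P)
    # Load imbalance factor from the formula above.
    load_balance = (1.0 - 2.0 / N) ** 2
    # Communication overhead relative to computation per processor.
    comm_overhead = t_comm / (math.sqrt(n) * t_float)
    return P * load_balance / (1.0 + comm_overhead)


if __name__ == "__main__":
    P = 16  # hypothetical processor count
    for N in (16, 64, 256, 1024):
        # t_float and t_comm are illustrative values in arbitrary time units.
        s = speedup(N, P, t_float=1.0, t_comm=10.0)
        n = grain_size(N, P)
        print(f"N={N:5d}  n={n:9.0f}  S={s:6.2f}  efficiency={s / P:.3f}")
```

Running the loop with P held fixed shows the efficiency S/P approaching 1 as N, and therefore n, increases, which is the 1/√n behaviour stated above.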