The largest communication load on any processor is 16 words, to be compared with calculating 16 updates, each taking time tcalc |
Each communication is one value of φ, probably stored in a 4-byte word, and takes time tcomm |
Then on 16 processors, T16 = 16 tcalc + 16 tcomm |
Speedup S = T1/T16 = 12.25 / (1 + tcomm/tcalc), which corresponds to T1 = 196 tcalc |
or, writing tcalc = 4 tfloat, S = 12.25 / (1 + 0.25 tcomm/tfloat) |
or S ≈ 12.25 * (1 - 0.25 tcomm/tfloat) to first order, when tcomm/tfloat is small |
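
The formulas above can be checked numerically with a minimal sketch. It assumes values inferred from the formulas rather than stated in the text: T1 = 196 tcalc (from 12.25 × 16) and tcalc = 4 tfloat (from the factor 0.25). The function names and the sample ratios are hypothetical, chosen only for illustration. All times are measured in units of tfloat.

```python
def speedup(tcomm_over_tfloat, total_updates=196, max_updates=16,
            max_words=16, tcalc_over_tfloat=4.0):
    """Exact speedup S = T1 / T16 for the busiest processor.

    total_updates=196 and tcalc_over_tfloat=4.0 are assumptions inferred
    from the 12.25 and 0.25 factors in the formulas above.
    """
    t1 = total_updates * tcalc_over_tfloat            # sequential time
    t16 = (max_updates * tcalc_over_tfloat            # compute on busiest node
           + max_words * tcomm_over_tfloat)           # plus its communication
    return t1 / t16


def speedup_first_order(tcomm_over_tfloat):
    """First-order approximation S ≈ 12.25 * (1 - 0.25 * tcomm/tfloat)."""
    return 12.25 * (1.0 - 0.25 * tcomm_over_tfloat)


if __name__ == "__main__":
    # Sample ratios are illustrative, not measurements.
    for ratio in (0.0, 0.1, 1.0, 4.0):
        print(f"tcomm/tfloat = {ratio:4.1f}  "
              f"exact S = {speedup(ratio):7.4f}  "
              f"first-order S = {speedup_first_order(ratio):7.4f}")
```

The two expressions agree closely only while tcomm/tfloat is small. Once communicating a word costs as much as a full update (tcomm/tfloat = 4), the exact speedup has already halved, while the first-order form drops to zero and is no longer meaningful.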