Alternatively, we could send the even-indexed entries of fa to b and the odd-indexed entries of fb to a.
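The exchange just described can be sketched as a small simulation. This is only an illustration under stated assumptions: a and b are taken to be two processors holding the arrays fa and fb, each processor is assumed to keep the half it does not send, and the function name `exchange_halves` is hypothetical.

```python
def exchange_halves(fa, fb):
    """Simulate the swap between two (assumed) processors a and b.

    a starts with fa and b starts with fb. Each sends only half of
    its array, so both do the same amount of communication.
    """
    sent_to_b = fa[0::2]  # even-indexed entries of fa go from a to b
    sent_to_a = fb[1::2]  # odd-indexed entries of fb go from b to a

    # Assumption: each processor keeps the entries it did not send.
    a_holds = {"fa": fa[1::2], "fb": sent_to_a}
    b_holds = {"fa": sent_to_b, "fb": fb[0::2]}

    words_sent = len(sent_to_b) + len(sent_to_a)
    return a_holds, b_holds, words_sent


if __name__ == "__main__":
    a_holds, b_holds, words_sent = exchange_halves([0, 1, 2, 3], [10, 11, 12, 13])
    print(a_holds, b_holds, words_sent)
```

After the swap, a holds the odd-indexed entries of both arrays and b holds the even-indexed entries of both, so the two processors end up with equal amounts of data.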
The communication overhead fcomm = Tparallel * Nproc / Tsequential - 1 is now given by
We have avoided load imbalance and halved the communication
compared to the simple algorithm. In later foils we will find even better
methods that get rid of the log2 N term.
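The definition fcomm = Tparallel * Nproc / Tsequential - 1 can be evaluated directly from measured times. The sketch below is a minimal helper, assuming all times are in the same units; the function name and the sample numbers are illustrative and are not taken from the foils.

```python
def communication_overhead(t_parallel, n_proc, t_sequential):
    """Return fcomm = Tparallel * Nproc / Tsequential - 1.

    A value of 0 means no overhead: the Nproc processors together
    spend exactly the sequential time.
    """
    return t_parallel * n_proc / t_sequential - 1.0


if __name__ == "__main__":
    # Illustrative numbers only: 4 processors, 3 time units in
    # parallel, 8 time units sequentially.
    print(communication_overhead(3.0, 4, 8.0))
```

With these sample values the processors spend 12 time units in total against 8 sequentially, so the overhead is 0.5.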