In our performance analysis, we charge a fixed time for sending one data item and disregard important but complicating issues such as latency.
Naive broadcast: The simplest way to broadcast a block of data from a source processor to a group of processors is to repeat a send, one message per destination. The total time needed is therefore the time to send one data item multiplied by the size of the data and by the number of destination processors.
Log broadcast: In many architectures, such as the hypercube, it is more efficient to send the message via intermediate processors that copy the message and pass it along. Every processor that already holds the data forwards it in parallel, so the number of sequential message steps drops to the base-2 logarithm of the number of processors.
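The two cost estimates can be compared with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation discussed in the text: the function names, the parameter names (`n_items`, `n_procs`, `t_item`), the convention that the source is one of the `n_procs` processors, and the particular doubling schedule (rank `r` forwards to rank `r + 2**k` in round `k`) are all hypothetical choices made here.

```python
def naive_broadcast_time(n_items, n_procs, t_item):
    """Source sends the whole block to every other processor, one after another.

    Assumption: the source is one of the n_procs processors, so there are
    n_procs - 1 sends, each costing n_items * t_item (latency ignored).
    """
    return (n_procs - 1) * n_items * t_item


def tree_broadcast_schedule(n_procs):
    """Return the rounds of a doubling broadcast rooted at rank 0.

    Each round is a list of (sender, receiver) pairs. At the start of a round,
    ranks 0..have-1 hold the data and each forwards it to rank + have,
    provided that rank exists.
    """
    rounds = []
    have = 1  # number of ranks that currently hold the data
    while have < n_procs:
        pairs = [(src, src + have) for src in range(have) if src + have < n_procs]
        rounds.append(pairs)
        have *= 2
    return rounds


def tree_broadcast_time(n_items, n_procs, t_item):
    """Sends within a round overlap, so each round costs one block transfer."""
    return len(tree_broadcast_schedule(n_procs)) * n_items * t_item
```

For 8 processors the schedule has 3 rounds, so under this model the tree broadcast costs 3 block transfers where the naive broadcast costs 7.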
Although we are not modelling differences in communication time here, on a hypercube network the processors can also be ordered so that processors adjacent in the binary tree are only one hop apart on the network.
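The one-hop property can be checked with a small Python sketch. It assumes the usual hypercube labelling, where nodes are numbered in binary and two nodes are linked exactly when their labels differ in one bit, and it assumes a broadcast tree rooted at node 0 in which round `k` sends across bit `k`. The helper names `hops` and `binomial_tree_edges` are hypothetical.

```python
def hops(a, b):
    """Hypercube distance between nodes a and b: the number of differing label bits."""
    return bin(a ^ b).count("1")


def binomial_tree_edges(dim):
    """Edges (sender, receiver) of a broadcast tree on a dim-dimensional hypercube.

    In round k, every node with a label below 2**k already holds the data and
    sends it to the node whose label additionally has bit k set.
    """
    edges = []
    for k in range(dim):
        for src in range(1 << k):
            edges.append((src, src | (1 << k)))
    return edges


# Every tree edge connects two labels that differ in exactly one bit.
all_one_hop = all(hops(src, dst) == 1 for src, dst in binomial_tree_edges(4))
```

With this ordering each sender and receiver are direct neighbours on the hypercube, so no tree edge has to be routed through an intermediate node.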