This matrix multiplication algorithm is similar to the first in that it assumes the block definition of matrix multiply, $C_{ij} = \sum_k A_{ik} B_{kj}$, and computes each $C_{ij}$ in place (owner-computes rule) by moving the pairs of blocks $A_{ik}$ and $B_{kj}$ to the processor on which $C_{ij}$ resides.
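Cannon's algorithm differs in the order in which the pairs of blocks are multiplied. We first "skew" both the A and the B matrix (counting from zero, block row $l$ of A is shifted left by $l$ positions and block column $k$ of B is shifted up by $k$ positions) so that we can then "roll" both matrices, A to the left and B to the top. Rolling circulates the blocks of row $l$ of A and column $k$ of B through the processor that calculates $C_{lk}$.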
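The skew-then-roll order can be illustrated with a small serial simulation. This is only a sketch under stated assumptions: the p x p "processor grid" is modelled as a nested Python list, communication is replaced by re-indexing that list, and the names `cannon_matmul`, `split_blocks`, `join_blocks` and `matmul` are hypothetical helpers that do not come from the reference. It assumes square matrices whose size is divisible by p.

```python
def matmul(X, Y):
    """Plain triple-loop product of two list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def split_blocks(M, p):
    """Cut an n x n matrix into a p x p grid of (n/p) x (n/p) blocks."""
    b = len(M) // p
    return [[[row[j * b:(j + 1) * b] for row in M[i * b:(i + 1) * b]]
             for j in range(p)] for i in range(p)]


def join_blocks(G):
    """Reassemble a p x p grid of square blocks into one matrix."""
    p, b = len(G), len(G[0][0])
    return [sum((G[i][j][r] for j in range(p)), [])
            for i in range(p) for r in range(b)]


def cannon_matmul(A, B, p):
    """Serial simulation of Cannon's algorithm on a p x p grid (sketch)."""
    n = len(A)
    assert n % p == 0, "assumption: matrix size divisible by grid size"
    b = n // p
    Ab, Bb = split_blocks(A, p), split_blocks(B, p)
    # Cb[i][j] stays on "processor" (i, j) for the whole run (owner computes).
    Cb = [[[[0] * b for _ in range(b)] for _ in range(p)] for _ in range(p)]

    # Skew: block row i of A moves left by i, block column j of B moves up by j.
    # Afterwards processor (i, j) holds A[i][(i+j) % p] and B[(i+j) % p][j].
    Ab = [[Ab[i][(j + i) % p] for j in range(p)] for i in range(p)]
    Bb = [[Bb[(i + j) % p][j] for j in range(p)] for i in range(p)]

    for _ in range(p):
        # Each processor multiplies the pair of blocks it currently holds.
        for i in range(p):
            for j in range(p):
                prod = matmul(Ab[i][j], Bb[i][j])
                for r in range(b):
                    for c in range(b):
                        Cb[i][j][r][c] += prod[r][c]
        # Roll: A one step to the left, B one step to the top.
        Ab = [[Ab[i][(j + 1) % p] for j in range(p)] for i in range(p)]
        Bb = [[Bb[(i + 1) % p][j] for j in range(p)] for i in range(p)]

    return join_blocks(Cb)
```

After the skew, the two blocks on any processor share the same inner index, so every local product is a valid term of the block sum; each roll advances that inner index by one, and p rolls cover all terms. In a real distributed implementation the two re-indexing steps would be nearest-neighbour shifts along the rows and columns of the processor grid.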
Reference:
Ho, Johnsson and Edelman