One can however for DIT (and DIF) get much better performance if one avoids the "owner computes rule" and changes the processor which calculates a given FFT component. |
This can seen by examining a typical step in a phase of parallel FFT where communication is needed in current algorithm. Consider any two processors -- called a and b --- which need to swap data in current algorithm
|