As discussed, we have two types of phases: those computed entirely within each processor and those that require communication |
The sequential time Tsequential(phase p) is the same for every phase p: N Tbutterfly/2 |
Tparallel(phases 0 ≤ p ≤ d-P-1) = N Tbutterfly/(2 Nproc), since these phases are perfectly parallel (load balanced) |
At the remaining stages, each butterfly computation requires communication: fE and fO (DIT) are exchanged between two processors. This must be done for every one of the N/Nproc points stored in each processor |
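
The split between the two kinds of phases can be checked with a small sketch. It is illustrative only and rests on assumptions not stated above: N = 2^d points, Nproc = 2^P processors, a block distribution in which processor r holds the contiguous indices r·N/Nproc to (r+1)·N/Nproc - 1, and a butterfly in phase p that pairs points whose indices differ only in bit p. The function name phase_summary and its parameters are hypothetical.

```python
def phase_summary(d, P, t_butterfly=1.0, rank=0):
    """Per-phase compute times and remote-partner counts for one processor.

    Assumptions (not from the notes): N = 2**d, Nproc = 2**P, block
    distribution, and phase p pairs indices that differ only in bit p.
    """
    n = 2 ** d
    n_proc = 2 ** P
    local = n // n_proc                            # N/Nproc points per processor

    t_sequential = n * t_butterfly / 2             # N Tbutterfly/2 per phase
    t_parallel = n * t_butterfly / (2 * n_proc)    # N Tbutterfly/(2 Nproc) per phase

    first = rank * local                           # first global index held by `rank`
    rows = []
    for p in range(d):
        stride = 2 ** p                            # assumed partner distance in phase p
        # Count local points whose butterfly partner is held by another processor.
        remote = sum(
            1
            for i in range(first, first + local)
            if (i ^ stride) // local != rank
        )
        rows.append((p, remote))
    return t_sequential, t_parallel, rows


if __name__ == "__main__":
    t_seq, t_par, rows = phase_summary(d=4, P=2)
    print("Tsequential per phase:", t_seq)
    print("Tparallel per phase:  ", t_par)
    for p, remote in rows:
        kind = "local" if remote == 0 else "needs communication"
        print(f"phase {p}: {remote} remote partners ({kind})")
```

Under these assumptions, with d = 4 and P = 2, phases 0 and 1 (that is, 0 ≤ p ≤ d-P-1) have no remote partners, while in phases 2 and 3 all N/Nproc = 4 local points have a partner on another processor.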