Basic HTML version of Foils prepared April 22 2000

Foil 22 Basic Parallel FFT Algorithms I

From Parallel FFT and use in PDE Solvers Computational Science Class CPS615 -- Winter Semester 2000. by Geoffrey C. Fox


1 Traditionally algorithms are iterative, going across each phase in the tree starting with all the N/2 2 point FFT's at phase d-1 and ending with combining two N/2 point FFT's at phase 0.
  • However FFTW papers at their web site points out that most such implementations make poor use of cache; recursive version can be better in cache use and we will describe this in general later
2 We can decompose in many ways but here we take the simplest case of block decomposition illustrated for N=16 on next foil.
3 Let Nproc = 2P be the number of processors in the parallel computer
4 Consider the d digits in binary representation of m
  • The bottom d-P digits md-P-1 through m0 representation location inside processor labeled by top P digits md-1 through md-P
5 P digits labeling 2P processors
6 d-P digits labeling position
7 in processor

in Table To:


© Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu

If you have any comments about this server, send e-mail to webmaster@npac.syr.edu.

Page produced by wwwfoil on Mon Apr 24 2000