Performance of Simplest Parallel DIT FFT V

Alternatively, we could send the even-indexed members of fa to b and the odd-indexed entries of fb to a.
- We would then take the resultant vector members in processors a and b and combine them in pairs to get the FFT components
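
The exchange above can be sketched as a small serial simulation. This is only an illustration under stated assumptions, not the implementation from these foils: it assumes fa is the transform of the even-indexed samples held on processor a, fb is the transform of the odd-indexed samples held on processor b, and the "combine in pairs" step is the usual DIT butterfly. The names `dft` and `two_processor_combine` are hypothetical, and the two processors are modelled as plain Python variables rather than real message passing.

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT, used only to build fa, fb and as a reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def two_processor_combine(f):
    """Simulate the final DIT combine step split across processors a and b."""
    n = len(f)
    half = n // 2
    fa = dft(f[0::2])  # assumed to reside on processor a
    fb = dft(f[1::2])  # assumed to reside on processor b
    w = [cmath.exp(-2j * cmath.pi * k / n) for k in range(half)]  # twiddles

    # Exchange: a sends even-indexed fa entries to b,
    # b sends odd-indexed fb entries to a.
    a_to_b = {k: fa[k] for k in range(0, half, 2)}
    b_to_a = {k: fb[k] for k in range(1, half, 2)}

    out = [0j] * n
    # Processor a now holds fa[k] and fb[k] for odd k and does those butterflies.
    for k, fb_k in b_to_a.items():
        out[k] = fa[k] + w[k] * fb_k
        out[k + half] = fa[k] - w[k] * fb_k
    # Processor b now holds fa[k] and fb[k] for even k and does those butterflies.
    for k, fa_k in a_to_b.items():
        out[k] = fa_k + w[k] * fb[k]
        out[k + half] = fa_k - w[k] * fb[k]

    words_sent = len(a_to_b) + len(b_to_a)
    return out, words_sent
```

In this sketch each processor performs half of the butterflies, and the two messages together carry N/2 complex values for an input of length N (when N is a multiple of 4).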

We have avoided load imbalance and halved the communication compared to the simple algorithm. In later foils we will find even better methods that get rid of the log2 N term.
