Parallel Computing Rationale
Transistors keep getting cheaper, and it takes only about 0.5 million transistors to make a very high-quality CPU
Such a chip would have little ILP (instruction-level parallelism, i.e. parallelism in “innermost loops”)
Thus the next generation of processor chips more or less has to have multiple CPUs, as the gain from ILP is limited
However, getting much more speedup than ILP provides requires use of “outer loop” or data parallelism.
- This is naturally implemented with threads on the chip
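As a minimal sketch of what “outer loop” parallelism with threads looks like, the example below sums each row of a small matrix, first serially and then with one task per outer-loop iteration. The function names (`row_sums_serial`, `sum_row`, `row_sums_threaded`), the `workers` parameter, and the sample matrix are hypothetical and chosen only for illustration. Note that in standard CPython the global interpreter lock keeps pure-Python threads from running on several cores at once, so this shows the structure of the decomposition rather than a measured speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def row_sums_serial(matrix):
    """Serial version: both loops run on a single thread."""
    totals = []
    for row in matrix:          # outer loop: iterations are independent
        total = 0
        for value in row:       # innermost loop: the level where ILP applies
            total += value
        totals.append(total)
    return totals


def sum_row(row):
    """Body of one outer-loop iteration (hypothetical helper)."""
    total = 0
    for value in row:
        total += value
    return total


def row_sums_threaded(matrix, workers=4):
    """Outer loop split across a pool of threads, one task per row."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input rows
        return list(pool.map(sum_row, matrix))


# Illustrative data: 8 rows of 10 integers each
matrix = [[i * 10 + j for j in range(10)] for i in range(8)]

if __name__ == "__main__":
    print(row_sums_threaded(matrix) == row_sums_serial(matrix))
```

The programmer, not the compiler, decided that the rows are independent and can be handed to separate threads; that decision is what distinguishes outer-loop parallelism from ILP in the innermost loop.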
The March of Parallelism: Multiple boards --> Multiple chips on a board --> Multiple CPUs on a chip
This implies that “outer loop” parallel computing becomes more and more important in the dominant commodity market
Use of “outer loop” parallelism cannot (yet) be automated