1 |
Transistors keep getting cheaper, and it takes only about 0.5 million transistors to make a very high-quality CPU
|
2 |
This chip would have little ILP (Instruction-Level Parallelism, or parallelism in "innermost loops")
|
3 |
Thus the next generation of processor chips more or less has to have multiple CPUs, as the gain from ILP is limited
|
4 |
However, getting much more speedup than ILP gives requires use of "outer loop" or data parallelism.
-
This is naturally implemented with threads on the chip
|
5 |
The March of Parallelism: Multiple boards --> Multiple chips on a board --> Multiple CPUs on a chip
|
6 |
This implies that "outer loop" parallel computing becomes more and more important in the dominant commodity market
|
7 |
Use of "outer loop" parallelism cannot (yet) be automated
|
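
The contrast between "innermost loop" and "outer loop" parallelism in the points above can be illustrated with a small sketch. It is not taken from the slides: the names `row_sum`, `sum_rows_chunk` and `parallel_row_sums`, the row-sum workload, and the block decomposition are all assumptions chosen for illustration. The inner loop is the short sequential work that ILP targets; the outer loop over rows is split by hand across threads. In CPython the global interpreter lock limits the real speedup of such threads, so the sketch shows the shape of the decomposition rather than its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def row_sum(row):
    # Innermost loop: short, sequential work of the kind ILP targets.
    total = 0.0
    for x in row:
        total += x
    return total


def sum_rows_chunk(rows):
    # Outer loop over one thread's share of the rows.
    return [row_sum(r) for r in rows]


def parallel_row_sums(matrix, n_threads=4):
    # Hypothetical helper: the programmer, not the compiler, picks the
    # decomposition (contiguous blocks of rows, one block per thread).
    if not matrix:
        return []
    n_threads = max(1, min(n_threads, len(matrix)))
    chunk = (len(matrix) + n_threads - 1) // n_threads  # ceiling division
    blocks = [matrix[i:i + chunk] for i in range(0, len(matrix), chunk)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() returns results in the same order as the input blocks.
        partial = list(pool.map(sum_rows_chunk, blocks))
    return [s for block in partial for s in block]
```

Calling `parallel_row_sums(matrix, n_threads=2)` on a list of rows returns one sum per row, in row order. The explicit choice of block size and thread count is the manual step that, per the last point, cannot yet be automated.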