Principles of Parallel Computing and Limitations
Principles of Parallel Computing
- Parallelism and Amdahl's Law (see the sketch after this list).
- Granularity.
- Locality.
- Load balance.
- Coordination and synchronization.
- Performance modeling.
----> Together, these issues make parallel programming harder than
sequential programming.
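To make the Amdahl's Law item above concrete, here is a minimal sketch in C of the usual speedup bound. The 10% serial fraction and the processor counts are assumptions chosen only for illustration.

#include <stdio.h>

/* Amdahl's Law: if a fraction s of the work is inherently serial,
 * the speedup on p processors is bounded by 1 / (s + (1 - s) / p). */
static double amdahl_speedup(double serial_fraction, int processors)
{
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors);
}

int main(void)
{
    double s = 0.10;                      /* assumed: 10% of the work is serial */
    int counts[] = {1, 4, 16, 64, 1024};  /* assumed processor counts */

    for (int i = 0; i < 5; i++)
        printf("p = %4d  speedup <= %.2f\n",
               counts[i], amdahl_speedup(s, counts[i]));

    /* As p grows without bound, the speedup approaches 1/s = 10,
     * no matter how many processors are available. */
    return 0;
}

Running this sketch shows the bound flattening quickly: about 3.1 on 4 processors, 6.4 on 16, and under 10 even on 1024, which is why the available parallelism itself is the first limit on performance.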
On Parallelism
Definition: (Due to Almasi and Gottlieb 1989) A parallel
computer is a "collection of processing elements that communicate and
cooperate to solve large problems fast."
These processing elements do not have to form one large and expensive
parallel machine; they can also be a cluster of personal computers or
workstations communicating and cooperating to tackle a
specific computational problem or application.
Here are a few of the main reasons for looking into parallelism:
- It provides an interesting alternative computing paradigm.
- It applies at all levels of system design.
- It is an interesting perspective from which to view architecture.
- It is becoming increasingly central in information processing.
- Parallelism is exploited at several levels:
- Instruction-level parallelism
- Multiprocessor servers
- Large-scale multiprocessors (MPPs)
Our focus here is on the multiprocessor level of parallelism.
What limits the performance of a parallel program?
- Available parallelism.
- Load balance (see the sketch after this list).
- some processors work while others wait due to insufficient
parallelism or unequal-size tasks.
- examples of unequal-size tasks:
- the problem is fundamentally unstructured.
- adapting to only the "interesting parts of the domain".
- dealing with tree-structured computations.
- Extra work.
- managing parallelism
- redundant computation
- Communication.
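The load-balance point can be made concrete with a toy model in which parallel time is set by the busiest processor while the others idle. In the sketch below the task costs and processor count are invented for the example and are not taken from any real workload.

#include <stdio.h>

#define NPROCS 4
#define NTASKS 8

/* Toy model of load imbalance: tasks of unequal size are dealt out
 * round-robin to processors, and the parallel time is set by the
 * most heavily loaded processor, so the others sit idle. */
int main(void)
{
    /* assumed, uneven task costs (arbitrary time units) */
    double task_cost[NTASKS] = {9, 1, 1, 1, 8, 1, 1, 2};
    double load[NPROCS] = {0};
    double total = 0.0, max_load = 0.0;

    for (int t = 0; t < NTASKS; t++) {
        load[t % NPROCS] += task_cost[t];   /* round-robin assignment */
        total += task_cost[t];
    }
    for (int p = 0; p < NPROCS; p++)
        if (load[p] > max_load)
            max_load = load[p];

    /* Speedup = sequential time / parallel time; with perfect balance
     * it would be NPROCS, but the busiest processor drags it down. */
    printf("ideal speedup  = %d\n", NPROCS);
    printf("actual speedup = %.2f\n", total / max_load);
    return 0;
}

With this particular distribution the two large tasks land on the same processor, so the speedup is about 1.4 on 4 processors instead of the ideal 4; extra work and communication (not modeled here) would reduce it further.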
Schematic View of a Generic Multiprocessor
The following diagram shows a collection of complete computers, each
including one or more processors (P) with their own cache ($) and
memory (M), communicating through a general-purpose, high-performance,
scalable interconnect. As shown, each node also contains a
communication assistant (CA) that helps with communication across the
network.
Another schematic view of a parallel processor is shown below:
Each processor has its own cache as well as a second-level cache.