This lecture stream starts with a discussion of parallel computing using analogies from nature.
It uses foils and material from the CSEP chapter on Computer Architecture to discuss how and why to build a parallel computer, including synchronization, memory structure, and network issues.
It covers SIMD and MIMD architectures, with a brief comparison of workstation networks with closely coupled systems.
A look to the future is based on results from the Petaflops workshop.
CPS615-Master95-2 Master Material for Second set of lectures on CPS615 Parallel Computing Overview
CPS615-95A Master Set A of Overview Material on Parallel Computing for CPS615 Foils
CPS615-95B Master Set B of Overview Material on Parallel Computing for CPS615 Foils
CPS615-Master95-2 001 001 CPS615 -- Base Course for the Simulation Track of Computational Science Fall Semester 1995 -- Lecture Stream 2
CPS615-Master95-2 002 002 Abstract of Lecture Stream 2 of CPS615
CPS615-95A 015 003 Elementary Discussion of Parallel Computing
CPS615-95A 016 004 Single nCUBE2 CPU Chip
CPS615-95A 017 005 64 Node nCUBE Board
CPS615-95A 018 006 CM-5 in NPAC Machine Room
CPS615-95A 019 007 Basic METHODOLOGY of Parallel Computing
CPS615-95A 020 008 Concurrent Computation as a Mapping Problem - I
CPS615-95A 021 009 Concurrent Computation as a Mapping Problem - II
CPS615-95A 022 010 Concurrent Computation as a Mapping Problem - III
CPS615-95A 023 011 Finite Element Mesh From Nastran (mesh only shown in upper half)
CPS615-95A 024 012 A Simple Equal Area Decomposition
CPS615-95A 025 013 Decomposition After Annealing (one particularly good but nonoptimal decomposition)
CPS615-95A 026 014 Parallel Processing and Society
CPS615-95A 027 015 Concurrent Construction of a Wall Using N = 8 Bricklayers Decomposition by Vertical Sections
CPS615-95A 028 016 Quantitative Speed-Up Analysis for Construction of Hadrian's Wall
CPS615-95A 029 017 Amdahl's law for Real World Parallel Processing
CPS615-95A 030 018 Pipelining -- Another Parallel Processing Strategy for Hadrian's Wall
CPS615-95A 031 019 Hadrian's Wall Illustrates that the Topology of Processor Must Include Topology of Problem
CPS615-95A 032 020 General Speed Up Analysis
CPS615-95A 033 021 Comparison of The Complete Problem to the subproblems formed in domain decomposition
CPS615-95A 034 022 Hadrian's Wall Illustrating an Irregular but Homogeneous Problem
CPS615-95A 035 023 Some Problems are Inhomogeneous Illustrated by: An Inhomogeneous Hadrian Wall with Decoration
CPS615-95A 036 024 Global and Local Parallelism Illustrated by Hadrian's Wall
CPS615-95A 037 025 Parallel I/O Illustrated by Concurrent Brick Delivery for Hadrian's Wall Bandwidth of Trucks and Roads Matches that of Masons
CPS615-95A 038 026 Nature's Concurrent Computers
CPS615-95A 039 027 Comparison of Concurrent Processing in Society and Computing
CPS615-95B 001 028 Computational Science CPS615 Simulation Track Overview Foilsets B 1995
CPS615-95B 002 029 Abstract of CPS615 Foilsets B 1995
CPS615-95B 003 030 Overview of Parallel Hardware Architecture
CPS615-95B 004 031 3 Major Basic Hardware Architectures
CPS615-95B 005 032 Examples of the Three Current Concurrent Supercomputer Architectures
CPS615-95B 006 033 Parallel Computer Architecture Issues
CPS615-95B 007 034 General Types of Synchronization
CPS615-95B 008 035 Granularity of Parallel Components
CPS615-95B 009 036 Types of Parallel Memory Architectures -- Logical Structure
CPS615-95B 010 037 Types of Parallel Memory Architectures -- Physical Characteristics
CPS615-95B 011 038 Diagrams of Shared and Distributed Memories
CPS615-95B 013 039 Survey of Issues in Communication Networks
CPS615-95B 014 040 Glossary of Useful Concepts in Communication Systems
CPS615-95B 015 041 Switch and Bus based Architectures
CPS615-95B 012 042 Classes of Communication Network include ...
CPS615-95B 016 043 Point to Point Networks (Store and Forward) -- I
CPS615-95B 017 044 Examples of Interconnection Topologies
CPS615-95B 018 045 Degree and Diameter of Ring and Mesh (Torus) Architectures
CPS615-95B 019 046 Degree and Diameter of Hypercube and Tree Architectures
CPS615-95B 052 047 Rules for Making Hypercube Network Topologies
CPS615-95B 053 048 Mapping of Hypercubes into Three Dimensional Meshes
CPS615-95B 054 049 Mapping of Hypercubes into One Dimensional Systems
CPS615-95B 055 050 The One dimensional Mapping can be thought of as for one dimensional problem solving or one dimensional layout of chips forming hypercube
CPS615-95B 056 051 Hypercube Versus Mesh Topologies
CPS615-95B 020 052 Point to Point Networks (Store and Forward) -- II
CPS615-95B 021 053 Latency and Bandwidth of a Network
CPS615-95B 022 054 Transfer Time in Microseconds for both Shared Memory Operations and Explicit Message Passing
CPS615-95B 023 055 Latency/Bandwidth Space for 0-byte message (Latency) and 1 MB message (bandwidth).
CPS615-95B 024 056 Switches versus Processor Networks
CPS615-95B 025 057 Circuit Switched Networks
CPS615-95B 026 058 Let's Return to General Parallel Architectures in more detail
CPS615-95B 027 059 Overview of Computer Architecture Issues
CPS615-95B 028 060 Some Global Computer Architecture Issues
CPS615-95B 029 061 Two General Real World Architectural Issues
CPS615-95B 030 062 MIMD Distributed Memory Architecture
CPS615-95B 031 063 Some MIMD Architecture Issues
CPS615-95B 032 064 SIMD (Single Instruction Multiple Data) Architecture
CPS615-95B 033 065 SIMD Architecture Issues
CPS615-95B 034 066 Shared Memory Architecture
CPS615-95B 038 067 Shared versus Distributed Memory
CPS615-95B 035 068 The General Structure of a full sized CRAY C-90
CPS615-95B 036 069 The General Structure of a NEC SX-3 Classic Vector Supercomputer
CPS615-95B 037 070 Comparison of MIMD and SIMD Parallelism seen on Classic Vector Supercomputers
CPS615-95B 039 071 What will happen in the year 2015 with 0.05 micron feature size and Petaflop Supercomputers using CMOS
CPS615-95B 040 072 CMOS Technology and Parallel Processor Chip Projections
CPS615-95B 041 073 Processor Chip Requirements for a Petaflop Machine Using 0.05 Micron Technology
CPS615-95B 042 074 Three Designs for a Year 2015 Petaflops machine with 0.05 micron technology
CPS615-95B 043 075 The Global Shared Memory Category I Petaflop Architecture
CPS615-95B 044 076 Category II Petaflop Architecture -- Network of microprocessors
CPS615-95B 045 077 Category III Petaflop Design -- Processor in Memory (PIM)
CPS615-95B 046 078 Necessary Latency to Support Three Categories
CPS615-95B 047 079 Chip Density Projections to year 2013
CPS615-95B 048 080 DRAM Chip count for Construction of Petaflop computer in year 2013 using 64 Gbit memory parts
CPS615-95B 049 081 Memory Chip Bandwidth in Gigabytes/sec
CPS615-95B 050 082 Power and I/O Bandwidth (I/O Connections) per Chip through the year 2013
CPS615-95B 051 083 Clock Speed and I/O Speed in megabytes/sec per pin through year 2013
CPS615-Master95-2 Master Material for Second set of lectures on CPS615 Parallel Computing Overview: 1 2
CPS615-95A Master Set A of Overview Material on Parallel Computing for CPS615 Foils: 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
CPS615-95B Master Set B of Overview Material on Parallel Computing for CPS615 Foils: 1 2 3 4 5 6 7 8 9 10 11 13 14 15 12 16 17 18 19 52 53 54 55 56 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 38 35 36 37 39 40 41 42 43 44 45 46 47 48 49 50 51