1
Computer Architecture for Computational Science 2
Abstract of Computer Architecture Overview 3
Some NPAC Parallel Machines Technologies of Relevance 4
Technologies for High Performance Computers 5
Architectures for High Performance Computers - I 6
Architectures for High Performance Computers - II 7
There is no Best Machine! Commodity Driving Forces 8
Architectural Trends I 9
Architectural Trends 10
3 Classes of VLSI Design? 11
Ames Summer 97 Workshop on Device Technology -- Moore's Law - I 12
Ames Summer 97 Workshop on Device Technology -- Moore's Law - II 13
Ames Summer 97 Workshop on Device Technology -- Alternate Technologies I 14
Ames Summer 97 Workshop on Device Technology -- Alternate Technologies II 15
Architectural Trends: Bus-based SMPs 16
Bus Bandwidth 17
Economics Parallel Computing Architectures 18
Important High Performance Computing Architectures 19
Some General Issues Addressed by High Performance Architectures 20
Architecture Classes of High Performance Computers 21
Flynn's Classification of HPC Systems Performance Issues 22
Raw Uniprocessor Performance: Cray v. Microprocessor LINPACK n by n Matrix Solves 23
Raw Parallel Performance: LINPACK 24
Linear Linpack HPC Performance versus Time 25
Top 10 Supercomputers November 1998 26
Distribution of 500 Fastest Computers 27
CPU Technology used in Top 500 versus Time 28
Geographical Distribution of Top 500 Supercomputers versus time 29
Node Technology used in Top 500 Supercomputers versus Time 30
Total Performance in Top 500 Supercomputers versus Time and Manufacturer 31
Number of Top 500 Systems as a function of time and Manufacturer 32
Total Number of Top 500 Systems Installed June 98 versus Manufacturer 33Netlib Benchweb Benchmarks 34Linpack Benchmarks 35Java Linpack Benchmarks 36Java Numerics Sequential Computer Architecture 37
von Neuman Architecture in a Nutshell Pipelining 38
What is a Pipeline -- Cafeteria Analogy? 39
Instruction Flow in A Simple Machine Pipeline 40
Example of MIPS R4000 Floating Point 41
MIPS R4000 Floating Point Stages Caches 42
Illustration of Importance of Cache 43
Sequential Memory Structure 44
Cache Issues I 45
Cache Issues II 46
Spatial versus Temporal Locality I 47
Spatial versus Temporal Locality II Cray T3E as an Example of a Cache 48
Cray/SGI memory latencies 49
Architecture of Cray T3E 50
T3E Messaging System 51
Cray T3E Cache Structure 52
Cray T3E Cache Performance 53
Finite Difference Example for T3E Cache Use I 54
Finite Difference Example for T3E Cache Use II 55
How to use Cache in Example I 56
How to use Cache in Example II Vector Architecture 57
Cray Vector Supercomputers 58
Vector Supercomputers in a Nutshell - I 59
Vector Supercomputing in a picture 60
Vector Supercomputers in a Nutshell - II Parallel Memory Structure 61
Parallel Computer Architecture Memory Structure 62
Comparison of Memory Access Strategies 63
Types of Parallel Memory Architectures -- Physical Characteristics 64
Diagrams of Shared and Distributed Memories Parallel Control Structure 65
Parallel Computer Architecture Control Structure MIMD Architectures 66
Mark2 Hypercube built by JPL(1985) Cosmic Cube (1983) built by Caltech (Chuck Seitz) 67
64 Ncube Processors (each with 6 memory chips) on a large board 68
ncube1 Chip -- integrated CPU and communication channels 69
Example of Message Passing System: IBM SP-2 70
Example of Message Passing System: Intel Paragon 71
ASCI Red -- Intel Supercomputer at Sandia Parallel Computing Cache Issues 72
Parallel Computer Memory Structure 73
Cache Coherent or Not? 74
Cache Coherence Origin 2000 as an Example of Cache Coherence 75
SGI Origin 2000 I 76
SGI Origin II 77
SGI Origin Block Diagram 78
SGI Origin III 79
SGI Origin 2 Processor Node Board 80
Performance of NCSA 128 node SGI Origin 2000 81
Summary of Cache Coherence Approaches SIMD Architectures 82
Some Major Hardware Architectures - SIMD 83
SIMD (Single Instruction Multiple Data) Architecture 84
Examples of Some SIMD machines 85
SIMD CM 2 from Thinking Machines 86
Official Thinking Machines Specification of CM2 Metacomputers 87
Some Major Hardware Architectures - Mixed 88
Some MetaComputer Systems 89
Clusters of PC's 1986-1998 90
HP Kayak PC (300 MHz Intel Pentium II) vs Origin 2000 Special Purpose Devices 91
Comments on Special Purpose Devices 92
The GRAPE N-Body Machine 93
Why isn't GRAPE a Perfect Solution? 94
GRAPE Special Purpose Machines 95
Quantum ChromoDynamics (QCD) Special Purpose Machines Granularity 96
Granularity of Parallel Components - I 97
Granularity of Parallel Components - II Parallel Computer Networks 98
Classes of Communication Networks 99
Switch and Bus based Architectures 100
Examples of Interconnection Topologies 101
Useful Concepts in Communication Systems Network Performance 102
Latency and Bandwidth of a Network 103
Transfer Time in Microseconds for both Shared Memory Operations and Explicit Message Passing 104
Latency/Bandwidth Space for 0-byte message(Latency) and 1 MB message(bandwidth). 105
Communication Performance of Some MPP's 106
Implication of Hardware Performance 107
MPI Bandwidth on SGI Origin and Sun Shared Memory Machines 108
Latency Measurements on Origin and Sun for MPI Architectures according to Culler 109
Two Basic Programming Models 110
Shared Address Space Architectures 111
Shared Address Space Model 112
Communication Hardware 113
History -- Mainframe 114
History -- Minicomputer 115
Scalable Interconnects 116
Message Passing Architectures 117
Message-Passing Abstraction e.g. MPI 118
First Message-Passing Machines Intel SMP 119
SMP Example: Intel Pentium Pro Quad Sun E10000 as Example of UMA Commodity System 120
Sun E10000 in a Nutshell 121
Sun Enterprise Systems E6000/10000 122
Starfire E10000 Architecture I 123
Starfire E10000 Architecture II 124
Sun Enterprise E6000/6500 Architecture 125
Sun's Evaluation of E10000 Characteristics I 126
Sun's Evaluation of E10000 Characteristics II 127
Scalability of E1000 Current Near Term Trends 128
Consider Scientific Supercomputing 129
Toward Architectural Convergence 130
Convergence: Generic Parallel Architecture Emerging Architectures MTA and COMA 131
Tera Multithreaded Supercomputer 132
Tera Computer at San Diego Supercomputer Center 133
Overview of the Tera MTA I 134
Overview of the Tera MTA II 135
Tera 1 Processor Architecture from H. Bokhari (ICASE) 136
Tera Processor Characteristics 137
Tera System Diagram 138
Interconnect / Communications System of Tera I 139
Interconnect / Communications System of Tera II 140
T90/Tera MTA Hardware Comparison 141
Tera Configurations / Performance 142
Performance of MTA wrt T90 and in parallel 143
Tera MTA Performance on NAS Benchmarks Compared to T90 144
Cache Only COMA Machines Application Motivation for PetaFlops 145
III. Key drivers: The Need for PetaFLOPS Computing 146
10 Possible PetaFlop Applications 147
Petaflop Performance for Flow in Porous Media? 148
Target Flow in Porous Media Problem (Glimm - Petaflop Workshop) 149
NASA's Projection of Memory and Computational Requirements upto Petaflops for Aerospace Applications The 3 classes of PetaFlop Designs 150
Supercomputer Architectures in Years 2005-2010 -- I 151
Supercomputer Architectures in Years 2005-2010 -- II 152
Supercomputer Architectures in Years 2005-2010 -- III 153
Performance Per Transistor 154
Comparison of Supercomputer Architectures The Processor in Memory Design 155
Current PIM Chips 156
New "Strawman" PIM Processing Node Macro 157
"Strawman" Chip Floorplan 158
SIA-Based PIM Chip Projections Exotic Technology: Quantum Computing 159
Quantum Computing - I 160
Quantum Computing - II 161
Quantum Computing - III Exotic Technology: Superconducting Technology 162
Superconducting Technology -- Past 163
Superconducting Technology -- Present 164
Superconducting Technology -- Problems
Click outside pointer rectangle to move pointer
Click on Pointer to Hide
Click on Pointer + ALT to toggle message hiding
Click on Pointer + CNTL to abolish pointer
Click on Pointer + Shift to cycle families
Click outside + Alt is Change Image
Click outside + Control is Double Size
Click outside + Shift is Halve Size
Right Mouse Down on Pointer Toggles Index
Shift Right Mouse aligns top with scrolled Page While With Mouse Down on Current Pointer h hides This Message while m restores i Toggles Index Aligned with Page Top j Toggles Index Aligned with Scrolled View Top a Abolishes Pointer while CNTL-Click restores f cycles through pointer families c cycles through members of a family u increases Size Up and d decreases Down Mouse Up-Down between changes of Pointer to process new option