Table of Contents
Computer Architecture for Computational Science
Abstract of Computer Architecture Overview
Some NPAC Parallel Machines
Architectural Trends I
Architectural Trends
3 Classes of VLSI Design?
Architectural Trends: Bus-based SMPs
Bus Bandwidth
Economics
Important High Performance Computing Architectures
Some General Issues Addressed by High Performance Architectures
What is a Pipeline -- Cafeteria Analogy?
Example of MIPS R4000 Floating Point
MIPS R4000 Floating Point Stages
Sequential Memory Structure
Cache Issues I
Cache Issues II
Spatial versus Temporal Locality I
Spatial versus Temporal Locality II
Parallel Computer Memory Structure
Cache Coherence
Raw Uniprocessor Performance: Cray v. Microprocessor LINPACK n by n Matrix Solves
Raw Parallel Performance: LINPACK
Linear Linpack HPC Performance versus Time
Top 10 Supercomputers November 1998
Distribution of 500 Fastest Computers
CPU Technology used in Top 500 versus Time
Geographical Distribution of Top 500 Supercomputers versus time
Node Technology used in Top 500 Supercomputers versus Time
Total Performance in Top 500 Supercomputers versus Time and Manufacturer
Number of Top 500 Systems as a function of time and Manufacturer
Total Number of Top 500 Systems Installed June 98 versus Manufacturer
Two Basic Programming Models
Shared Address Space Architectures
Shared Address Space Model
Communication Hardware
History -- Mainframe
History -- Minicomputer
Scalable Interconnects
Message Passing Architectures
Message-Passing Abstraction e.g. MPI
First Message-Passing Machines
Mark2 Hypercube built by JPL (1985); Cosmic Cube (1983) built by Caltech (Chuck Seitz)
64 Ncube Processors (each with 6 memory chips) on a large board
ncube1 Chip -- integrated CPU and communication channels
Example of Message Passing System: IBM SP-2
Example of Message Passing System: Intel Paragon
Clusters of PC’s 1986-1998
HP Kayak PC (300 MHz Intel Pentium II) vs Origin 2000
Cray Vector Supercomputers
Cray/SGI memory latencies
Architecture of Cray T3E
T3E Messaging System
Cray T3E Cache Structure
Cray T3E Cache Performance
Finite Difference Example for T3E Cache Use I
Finite Difference Example for T3E Cache Use II
How to use Cache in Example I
How to use Cache in Example II
SGI Origin 2000 I
SGI Origin II
SGI Origin Block Diagram
SGI Origin III
SGI Origin 2 Processor Node Board
Performance of NCSA 128 node SGI Origin 2000
Cache Coherent or Not?
Summary of Cache Coherence Approaches
SMP Example: Intel Pentium Pro Quad
Sun E10000 in a Nutshell
Sun Enterprise Systems E6000/10000
Starfire E10000 Architecture I
Starfire E10000 Architecture II
Sun Enterprise E6000/6500 Architecture
Sun’s Evaluation of E10000 Characteristics I
Sun’s Evaluation of E10000 Characteristics II
Scalability of E10000
MPI Bandwidth on SGI Origin and Sun Shared Memory Machines
Latency Measurements on Origin and Sun for MPI
Tera Multithreaded Supercomputer
Tera Computer at San Diego Supercomputer Center
Overview of the Tera MTA I
Overview of the Tera MTA II
Tera1 Processor Architecture from H. Bokhari (ICASE)
Tera Processor Characteristics
Tera System Diagram
Interconnect / Communications System of Tera I
Interconnect / Communications System of Tera II
T90/Tera MTA Hardware Comparison
Tera Configurations / Performance
Performance of MTA wrt T90 and in parallel
Tera MTA Performance on NAS Benchmarks Compared to T90
Cache Only COMA Machines
Examples of Some SIMD machines
Consider Scientific Supercomputing
Toward Architectural Convergence
Convergence: Generic Parallel Architecture
SIMD CM 2 from Thinking Machines
Official Thinking Machines Specification of CM2
GRAPE Special Purpose Machines
Quantum ChromoDynamics (QCD) Special Purpose Machines
ASCI Red -- Intel Supercomputer at Sandia