Computer Architecture for Computational Science

11/18/98



Table of Contents

Computer Architecture for Computational Science

Abstract of Computer Architecture Overview

Some NPAC Parallel Machines

Architectural Trends I

Architectural Trends

3 Classes of VLSI Design?

Architectural Trends: Bus-based SMPs

Bus Bandwidth

Economics

Important High Performance Computing Architectures

Some General Issues Addressed by High Performance Architectures

What is a Pipeline -- Cafeteria Analogy?

Example of MIPS R4000 Floating Point

MIPS R4000 Floating Point Stages

Sequential Memory Structure

Cache Issues I

Cache Issues II

Spatial versus Temporal Locality I

Spatial versus Temporal Locality II

Parallel Computer Memory Structure

Cache Coherence

Raw Uniprocessor Performance: Cray v. Microprocessor LINPACK n by n Matrix Solves

Raw Parallel Performance: LINPACK

Linear Linpack HPC Performance versus Time

Top 10 Supercomputers November 1998

Distribution of 500 Fastest Computers

CPU Technology used in Top 500 versus Time

Geographical Distribution of Top 500 Supercomputers versus Time

Node Technology used in Top 500 Supercomputers versus Time

Total Performance in Top 500 Supercomputers versus Time and Manufacturer

Number of Top 500 Systems as a Function of Time and Manufacturer

Total Number of Top 500 Systems Installed June 98 versus Manufacturer

Two Basic Programming Models

Shared Address Space Architectures

Shared Address Space Model

Communication Hardware

History -- Mainframe

History -- Minicomputer

Scalable Interconnects

Message Passing Architectures

Message-Passing Abstraction e.g. MPI

First Message-Passing Machines

Mark2 Hypercube built by JPL (1985); Cosmic Cube (1983) built by Caltech (Chuck Seitz)

64 Ncube Processors (each with 6 memory chips) on a large board

ncube1 Chip -- integrated CPU and communication channels

Example of Message Passing System: IBM SP-2

Example of Message Passing System: Intel Paragon

Clusters of PCs 1986-1998

HP Kayak PC (300 MHz Intel Pentium II) vs Origin 2000

Cray Vector Supercomputers

Cray/SGI memory latencies

Architecture of Cray T3E

T3E Messaging System

Cray T3E Cache Structure

Cray T3E Cache Performance

Finite Difference Example for T3E Cache Use I

Finite Difference Example for T3E Cache Use II

How to use Cache in Example I

How to use Cache in Example II

SGI Origin 2000 I

SGI Origin II

SGI Origin Block Diagram

SGI Origin III

SGI Origin 2 Processor Node Board

Performance of NCSA 128 node SGI Origin 2000

Cache Coherent or Not?

Summary of Cache Coherence Approaches

SMP Example: Intel Pentium Pro Quad

Sun E10000 in a Nutshell

Sun Enterprise Systems E6000/10000

Starfire E10000 Architecture I

Starfire E10000 Architecture II

Sun Enterprise E6000/6500 Architecture

Sun’s Evaluation of E10000 Characteristics I

Sun’s Evaluation of E10000 Characteristics II

Scalability of E10000

MPI Bandwidth on SGI Origin and Sun Shared Memory Machines

Latency Measurements on Origin and Sun for MPI

Tera Multithreaded Supercomputer

Tera Computer at San Diego Supercomputer Center

Overview of the Tera MTA I

Overview of the Tera MTA II

Tera 1 Processor Architecture from H. Bokhari (ICASE)

Tera Processor Characteristics

Tera System Diagram

Interconnect / Communications System of Tera I

Interconnect / Communications System of Tera II

T90/Tera MTA Hardware Comparison

Tera Configurations / Performance

Performance of MTA with respect to T90 and in parallel

Tera MTA Performance on NAS Benchmarks Compared to T90

Cache Only COMA Machines

Examples of Some SIMD machines

Consider Scientific Supercomputing

Toward Architectural Convergence

Convergence: Generic Parallel Architecture

SIMD CM 2 from Thinking Machines

Official Thinking Machines Specification of CM2

GRAPE Special Purpose Machines

Quantum ChromoDynamics (QCD) Special Purpose Machines

ASCI Red -- Intel Supercomputer at Sandia

Email: gcf@npac.syr.edu

Home Page: http://www.npac.syr.edu