This presentation draws on material developed by David Culler and Jack Dongarra that is available on the Web
See the summary by Saleh Elmohamed and Ken Hawick at http://nhse.npac.syr.edu/hpccsurvey/
We discuss several examples in detail, including the T3E, Origin 2000, Sun E10000, and Tera MTA
These are used to illustrate the major architecture types
We discuss key sequential architecture issues, including cache structure
We also discuss technologies ranging from today's commodities through Petaflop ideas and Quantum Computing
001 Computer Architecture for Computational Science
002 Abstract of Computer Architecture Overview
003 Some NPAC Parallel Machines
004 Architectural Trends I
005 Architectural Trends
006 3 Classes of VLSI Design?
007 Architectural Trends: Bus-based SMPs
008 Bus Bandwidth
009 Economics
010 Important High Performance Computing Architectures
011 Some General Issues Addressed by High Performance Architectures
012 What is a Pipeline -- Cafeteria Analogy?
013 Example of MIPS R4000 Floating Point
014 MIPS R4000 Floating Point Stages
015 Sequential Memory Structure
016 Cache Issues I
017 Cache Issues II
018 Spatial versus Temporal Locality I
019 Spatial versus Temporal Locality II
020 Parallel Computer Memory Structure
021 Cache Coherence
022 Raw Uniprocessor Performance: Cray v. Microprocessor LINPACK n by n Matrix Solves
023 Raw Parallel Performance: LINPACK
024 Linear Linpack HPC Performance versus Time
025 Top 10 Supercomputers November 1998
026 Distribution of 500 Fastest Computers
027 CPU Technology used in Top 500 versus Time
028 Geographical Distribution of Top 500 Supercomputers versus Time
029 Node Technology used in Top 500 Supercomputers versus Time
030 Total Performance in Top 500 Supercomputers versus Time and Manufacturer
031 Number of Top 500 Systems as a function of Time and Manufacturer
032 Total Number of Top 500 Systems Installed June 98 versus Manufacturer
033 Two Basic Programming Models
034 Shared Address Space Architectures
035 Shared Address Space Model
036 Communication Hardware
037 History -- Mainframe
038 History -- Minicomputer
039 Scalable Interconnects
040 Message Passing Architectures
041 Message-Passing Abstraction e.g. MPI
042 First Message-Passing Machines
043 Mark2 Hypercube built by JPL (1985); Cosmic Cube (1983) built by Caltech (Chuck Seitz)
044 64 Ncube Processors (each with 6 memory chips) on a large board
045 ncube1 Chip -- integrated CPU and communication channels
046 Example of Message Passing System: IBM SP-2
047 Example of Message Passing System: Intel Paragon
048 Clusters of PC's 1986-1998
049 HP Kayak PC (300 MHz Intel Pentium II) vs Origin 2000
050 Cray Vector Supercomputers
051 Cray/SGI memory latencies
052 Architecture of Cray T3E
053 T3E Messaging System
054 Cray T3E Cache Structure
055 Cray T3E Cache Performance
056 Finite Difference Example for T3E Cache Use I
057 Finite Difference Example for T3E Cache Use II
058 How to use Cache in Example I
059 How to use Cache in Example II
060 SGI Origin 2000 I
061 SGI Origin II
062 SGI Origin Block Diagram
063 SGI Origin III
064 SGI Origin 2 Processor Node Board
065 Performance of NCSA 128 node SGI Origin 2000
066 Cache Coherent or Not?
067 Summary of Cache Coherence Approaches
068 SMP Example: Intel Pentium Pro Quad
069 Sun E10000 in a Nutshell
070 Sun Enterprise Systems E6000/10000
071 Starfire E10000 Architecture I
072 Starfire E10000 Architecture II
073 Sun Enterprise E6000/6500 Architecture
074 Sun's Evaluation of E10000 Characteristics I
075 Sun's Evaluation of E10000 Characteristics II
076 Scalability of E10000
077 MPI Bandwidth on SGI Origin and Sun Shared Memory Machines
078 Latency Measurements on Origin and Sun for MPI
079 Tera Multithreaded Supercomputer
080 Tera Computer at San Diego Supercomputer Center
081 Overview of the Tera MTA I
082 Overview of the Tera MTA II
083 Tera 1 Processor Architecture from H. Bokhari (ICASE)
084 Tera Processor Characteristics
085 Tera System Diagram
086 Interconnect / Communications System of Tera I
087 Interconnect / Communications System of Tera II
088 T90/Tera MTA Hardware Comparison
089 Tera Configurations / Performance
090 Performance of MTA wrt T90 and in parallel
091 Tera MTA Performance on NAS Benchmarks Compared to T90
092 Cache Only COMA Machines
093 Examples of Some SIMD machines
094 Consider Scientific Supercomputing
095 Toward Architectural Convergence
096 Convergence: Generic Parallel Architecture
097 SIMD CM 2 from Thinking Machines
098 Official Thinking Machines Specification of CM2
099 GRAPE Special Purpose Machines
100 Quantum ChromoDynamics (QCD) Special Purpose Machines
101 ASCI Red -- Intel Supercomputer at Sandia