Full HTML for

Scripted foilset CPS615-Introduction-Course,Driving Technology and HPCC Current Status and Futures

Given by Geoffrey C. Fox at CPS615 Basic Simulation Track for Computational Science on Fall Semester 96. Foils prepared 27 August 1996


Overview of Course Itself! -- and then introductory material on basic curricula
Overview of National Program -- The Grand Challenges
Overview of Technology Trends leading to petaflop performance in year 2007 (hopefully)
Overview of Syracuse and National programs in computational science
Parallel Computing in Society
Why Parallel Computing works
Simple Overview of Computer Architectures
  • SIMD MIMD Distributed (shared memory) Systems ... PIM ... Quantum Computing
General Discussion of Message Passing and Data Parallel Programming Paradigms and a comparison of languages

Table of Contents for full HTML of CPS615-Introduction-Course,Driving Technology and HPCC Current Status and Futures

1 CPS615 -- Base Course for the Simulation Track of Computational Science
Fall Semester 1996 --
Introduction to Driving Technology and HPCC
Current Status and Futures

2 Abstract of The Current Status and Futures of HPCC
3 Basic Course CPS615 Contact Points
4 Course Organization
5 Basic Structure of Complete CPS615 Base Course on Computational Science Simulation Track -- I
6 Basic Structure of Complete CPS615 Base Course on Computational Science Simulation Track -- II
7 Basic Structure of Complete CPS615 Base Course on Computational Science Simulation Track -- III
8 Performance of High End Machines Years 1940-2000
9 Performance of High End Machines Years 1980-2000
10 Peak Supercomputer Performance
11 The Technology
Driving Forces for HPCC

12 Effect of Feature Size on Performance
13 Growing Logic Chip Density
14 Trends in Feature and Die Size as a Function of Time
15 Supercomputer Memory Sizes and trends in RAM Density
16 Comparison of Trends in RAM Density and CPU Performance Increases
17 Three Major Markets -- Logic,ASIC,DRAM
18 Chip and Package Characteristics
19 Fabrication Characteristics
20 Electrical Design and Test Metrics
21 National Roadmap for Semiconductor Technology --1992
22 Status of Parallel Computing and High Speed Networks --
The Grand Challenges and the National Information Infrastructure

23 Superficial Observations on High Performance Computing-I
24 We have learnt that Parallel Computing Works !
25 Advances in Parallel Computer and High Speed Network (HPCC) Technology
26 Superficial Observations on High Performance Computing-II
27 When will Parallel Computing Take Over ?
28 Some Hardware/Software Trends over next 5 years
29 Who Uses High Performance Computers?
30 The Federal High Performance Computing and Communication Initiative 1992--1996
31 The Federal High Performance Computing and Communication Initiative (HPCCI)
32 The High Performance Computing and Communications Initiative
33 HPCCI Goals
34 Note the Trend from Large Scale Numerical Computing to the Integration of Computing and Communication in the NII
35 The Blue Books
Supplements to the President's Fiscal Year Budget

36 http://www.hpcc.gov/
37 The Blue Book Covers
38 Superficial Observations on High Performance Communication
39 Some Implications of HPCC Observations
40 What and Why is Computational Science ?
41 Parallelism Implies Major Changes which have significant educational Implications
42 What is Computational Science?
43 What do we have at Syracuse University?
44 Program in Computational Science
Implemented within current academic framework

45 Methodology for Computation
46 Usefulness of Computational Science Degrees:
47 Syracuse Computational Science Academic Programs -- Masters Degree
48 Syracuse Graduate Computational Science Academic Programs
49 Computational Science Courses -- Typical CPS615 Module
50 Computational Science Courses -- CPS713
51 Some Academic Areas and their Relation to Computational Science
52 Program in Information Age Computational Science Implemented Within Current Academic Program
53 Parallel Processing and Society
54 Concurrent Construction of a Wall
Using N = 8 Bricklayers
Decomposition by Vertical Sections

55 Quantitative Speed-Up Analysis for Construction of Hadrian's Wall
56 Amdahl's law for Real World Parallel Processing
57 Pipelining --Another Parallel Processing Strategy for Hadrian's Wall
58 Hadrian's Wall Illustrates that the Topology of Processor Must Include Topology of Problem
59 General Speed Up Analysis
60 Nature's Concurrent Computers
61 Comparison of Concurrent Processing in Society and Computing
62 Data Parallelism is a Universal Source of Scaling Parallelism
63 We have learnt that Parallel Computing Works !
64 Methodology of Parallel Computing
65 Concurrent Computation as a Mapping Problem -I
66 Concurrent Computation as a Mapping Problem - II
67 Concurrent Computation as a Mapping Problem - III
68 Finite Element Mesh From Nastran
(mesh only shown in upper half)

69 A Simple Equal Area Decomposition
70 Decomposition After Annealing
(one particularly good but nonoptimal decomposition)

71 Comparison of The Complete Problem to the subproblems formed in domain decomposition
72 Hadrian's Wall Illustrating an
Irregular but Homogeneous Problem

73 Some Problems are Inhomogeneous Illustrated by:
An Inhomogeneous Hadrian Wall with Decoration

74 Global and Local Parallelism Illustrated by Hadrian's Wall
75 Parallel I/O Illustrated by
Concurrent Brick Delivery for Hadrian's Wall
Bandwidth of Trucks and Roads
Matches that of Masons

76 Single nCUBE2 CPU Chip
77 64 Node nCUBE Board
78 Technologies for High Performance Computers
79 Architectures for High Performance Computers - I
80 Architectures for High Performance Computers - II
81 There is no Best Machine!
82 Quantum Computing - I
83 Quantum Computing - II
84 Quantum Computing - III
85 Superconducting Technology -- Past
86 Superconducting Technology -- Present
87 Superconducting Technology -- Problems
88 Architecture Classes of High Performance Computers
89 von Neumann Architecture in a Nutshell
90 Illustration of Importance of Cache
91 Vector Supercomputers in a Nutshell - I
92 Vector Supercomputing in a picture
93 Vector Supercomputers in a Nutshell - II
94 Instruction Flow in A Simple Machine Pipeline
95 Flynn's Classification of HPC Systems
96 Parallel Computer Architecture Memory Structure
97 Comparison of Memory Access Strategies
98 Types of Parallel Memory Architectures -- Physical Characteristics
99 Diagrams of Shared and Distributed Memories
100 Parallel Computer Architecture Control Structure
101 Some Major Hardware Architectures - MIMD
102 MIMD Distributed Memory Architecture
103 Some Major Hardware Architectures - SIMD
104 SIMD (Single Instruction Multiple Data) Architecture
105 Some Major Hardware Architectures - Mixed
106 Some MetaComputer Systems
107 Comments on Special Purpose Devices
108 The GRAPE N-Body Machine
109 Why isn't GRAPE a Perfect Solution?
110 Granularity of Parallel Components - I
111 Granularity of Parallel Components - II
112 Classes of Communication Networks
113 Switch and Bus based Architectures
114 Examples of Interconnection Topologies
115 Useful Concepts in Communication Systems
116 Communication Performance of Some MPP's
117 Implication of Hardware Performance
118 Latency and Bandwidth of a Network
119 Transfer Time in Microseconds for both Shared Memory Operations and Explicit Message Passing
120 Latency/Bandwidth Space for 0-byte message(Latency) and 1 MB message(bandwidth).
121 The Federal Program Focusing on 1996 Highlights with many exciting Applications
122 1996 Blue Book
123 1996 Blue Book (1 of 3)
124 1996 Blue Book (2 of 3)
125 1996 Blue Book (3 of 3)
126 The Application Motivation for HPCC
127 High Performance Computing Research Facilities
128 Grand Challenge Applications
129 Applied Fluid Dynamics
130 Coupled Field Problems and GAFD Turbulence
131 Numerical Tokamak Project
132 Meso- to Macro-Scale Environmental Modeling
133 Mathematical Modeling of Air Pollution Dynamics
134 Global Climate Modeling
135 4-D Data Assimilation
136 Eco Simulations
137 Biomedical Imaging and Biomechanics
138 Molecular Biology
139 Molecular Design
140 Biomolecular Modeling and Structure Determination
141 Fundamental Computational Sciences
142 Binary Black Holes Simulation
143 The Binary Black Hole Grand Challenge Alliance
144 BBH: Computational Challenge
145 Adaptive Multilevel Parallel Infrastructure
146 First Principles Simulation of Materials Properties
147 Large Scale Structure and Galaxy Formation
148 Grand-Challenge-Scale Applications
149 Visible Human
150 A Realistic Ocean Model
151 Shoemaker-Levy 9 Collision with Jupiter
152 Advanced Simulation of Crash Simulation
153 National Challenge Applications
154 A Survey of New York State Industrial Opportunities for HPCC was very influential for me and my group(NPAC)
155 Categories of Industrial and Government Applications of HPCC (with reference to academic applications)
156 The 33 Application areas were studied in detail:
Simulation (Roughly the Grand Challenges)

157 The 33 Application areas were studied in detail:
Information Analysis -- DataMining

158 The 33 Application areas were studied in detail:
InfoVision: Information, Video, Imagery and Simulation on Demand

159 The 33 Application areas were studied in detail:
Information Integration combining Simulation, Analysis and InfoVision

160 Some detailed Analysis of Opportunities for HPCC in the Science and Engineering Simulation Arena
161 Opportunities for HPCC in the Science and Engineering Simulation Arena
162 Some Simulation Areas which will be Difficult to exploit in near term
163 Surprisingly Difficult and Surprisingly Promising Areas for HPCC in Simulation
164 Why is it hard to use HPCC in Manufacturing-I?
165 Why is it hard to use HPCC in Manufacturing-II?
166 Multidisciplinary Analysis and Design as a Critical use of HPCC in Manufacturing?
167 Role of Government and DoD in HPCC Simulation Applications
168 The HPCC Software Industry is not Viable in Simulation Area ?
169 Anecdotes from HPCC Software Industry Arena
170 National Challenges will drive the adoption of HPCC in the "Real World"
171 From the Grand(Simulation) Challenges to the National (information) Challenges
172 Characteristics of Grand Challenges
173 Federal 1994 Blue Book Comparison of National and Grand Challenges
174 Come to CPS616 for a detailed discussion of the National Challenges and the National Information Infrastructure

HTML version of Scripted Foils prepared 27 August 1996

Foil 1 CPS615 -- Base Course for the Simulation Track of Computational Science
Fall Semester 1996 --
Introduction to Driving Technology and HPCC
Current Status and Futures

From CPS615-Introduction-Course,Driving Technology and HPCC Current Status and Futures CPS615 Basic Simulation Track for Computational Science -- Fall Semester 96. *
Full HTML Index Secs 31
Geoffrey Fox
NPAC
Room 3-131 CST
111 College Place
Syracuse NY 13244-4100

Foil 2 Abstract of The Current Status and Futures of HPCC

Overview of Course Itself! -- and then introductory material on basic curricula
Overview of National Program -- The Grand Challenges
Overview of Technology Trends leading to petaflop performance in year 2007 (hopefully)
Overview of Syracuse and National programs in computational science
Parallel Computing in Society
Why Parallel Computing works
Simple Overview of Computer Architectures
  • SIMD MIMD Distributed (shared memory) Systems ... PIM ... Quantum Computing
General Discussion of Message Passing and Data Parallel Programming Paradigms and a comparison of languages

Foil 3 Basic Course CPS615 Contact Points

Instructor: Geoffrey Fox gcf@npac.syr.edu 3154432163 Room 3-131 CST
Backup: Nancy McCracken njm@npac.syr.edu 3154434687 Room 3-234 CST
NPAC Administrative support: Nora Downey-Easter nora@npac.syr.edu 3154431722 Room 3-206 CST
CPS615 Powers that be above can be reached at cps615ad@npac.syr.edu
CPS615 Students can be reached by mailing cps615@npac.syr.edu
Homepage will be:
http://www.npac.syr.edu/projects/cps615fall96
See my paper SCCS 736 as an overview of HPCC status

Foil 4 Course Organization

Graded on the basis of Approximately 8 Homeworks which will be due Thursday of week following day given out (Tuesday or Thursday)
Plus one modest sized project at the end of class -- must involve "real" running parallel code!
No finals or written exams
All material will be placed on World Wide Web(WWW)
Preference given to work returned on the Web

Foil 5 Basic Structure of Complete CPS615 Base Course on Computational Science Simulation Track -- I

Overview of National Scene -- Why is High Performance Computing Important
  • Grand Challenges
What is Computational Science -- The Program at Syracuse
Basic Technology Situation -- Increasing density of transistors on a chip
  • Trends to year 2007 using Moore's Law (see UVC Video)
Elementary Discussion of Parallel Computing including use in society
  • why does parallel computing always "work" in principle
Computer Architecture -- Parallel and Sequential
  • Network Interconnections, SIMD v. MIMD, Distributed Shared Memory
  • vectorization contrasted with parallelism

Foil 6 Basic Structure of Complete CPS615 Base Course on Computational Science Simulation Track -- II

Simple base example -- Laplace's Equation
  • How does parallel computing work
This is followed by two sections -- software technologies and applications which are interspersed with each other and "algorithm" modules
Programming Models -- Message Passing and Data Parallel Computing -- MPI and HPF (Fortran 90)
  • Some remarks on parallel compilers
  • Remarks on use of parallel Java
Some real applications analysed in detail
  • Chemistry, CFD, Earthquake prediction, Statistical Physics
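As a hedged illustration of why Laplace's Equation serves as a simple base example, the sketch below uses Jacobi relaxation on a small square grid. Jacobi is an assumption chosen for brevity, not necessarily the method used in the course, and the names jacobi_sweep and solve_laplace are hypothetical. Each interior point is updated only from its four neighbours' old values, so all updates in a sweep are independent and could be divided among processors.

```python
def jacobi_sweep(grid):
    """Return a new grid where each interior point is the average of its
    four neighbours in the old grid. Boundary values are left unchanged."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Every (i, j) update reads only old values, so the updates
            # are independent of each other (the source of parallelism).
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def solve_laplace(grid, sweeps):
    """Apply a fixed number of Jacobi sweeps (no convergence test)."""
    for _ in range(sweeps):
        grid = jacobi_sweep(grid)
    return grid


# Illustrative boundary condition: top edge held at 1.0, other edges at 0.0.
n = 5
start = [[0.0] * n for _ in range(n)]
start[0] = [1.0] * n
result = solve_laplace(start, 200)
for row in result:
    print(" ".join(f"{value:.3f}" for value in row))
```

In a distributed memory version, each processor would hold a block of the grid and exchange only the values on the block edges between sweeps.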

Foil 7 Basic Structure of Complete CPS615 Base Course on Computational Science Simulation Track -- III

This introduction is followed by a set of "vignettes" discussing problem classes which illustrate parallel programming and parallel algorithms
Ordinary Differential Equations
  • N body Problem by both O(N^2) and "fast multipole" O(N) method
Numerical Integration including adaptive methods
Floating Point Arithmetic
Monte Carlo Methods including Random Numbers
Full Matrix Algebra as in
  • Computational Electromagnetism
  • Computational Chemistry
Partial Differential Equations implemented as sparse matrix problems (as in Computational Fluid Dynamics)
  • Iterative Algorithms from Gauss Seidel to Conjugate Gradient
  • Finite Element Methods
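To make the O(N^2) cost of the direct N body method concrete, here is a minimal sketch, assuming two-dimensional point masses with Newtonian gravity. The function name direct_accelerations, the unit value of G and the small softening term are illustrative assumptions, not taken from the course material. The double loop visits every ordered pair of bodies, which is where the N^2 work comes from; the fast multipole method is not shown.

```python
def direct_accelerations(positions, masses, G=1.0, softening=1e-9):
    """Direct-sum gravitational accelerations in 2D.

    The two nested loops over bodies give O(N^2) work. The softening
    term (an assumption) avoids division by zero for coincident bodies.
    """
    n = len(positions)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi = positions[i]
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            r2 = dx * dx + dy * dy + softening
            inv_r3 = r2 ** -1.5
            acc[i][0] += G * masses[j] * dx * inv_r3
            acc[i][1] += G * masses[j] * dy * inv_r3
    return acc


# Three unit masses at arbitrary illustrative positions.
bodies = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
unit_masses = [1.0, 1.0, 1.0]
for body, a in zip(bodies, direct_accelerations(bodies, unit_masses)):
    print(body, a)
```

The outer loop over i is data parallel: each body's acceleration can be computed independently once all positions are known.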

Foil 8 Performance of High End Machines Years 1940-2000

Foil 9 Performance of High End Machines Years 1980-2000

Foil 10 Peak Supercomputer Performance

For "Conventional" MPP/Distributed Shared Memory Architecture
Now(1996) Peak is 0.1 to 0.2 Teraflops in Production Centers
  • Note both SGI and IBM are changing architectures:
  • IBM Distributed Memory to Distributed Shared Memory
  • SGI Shared Memory to Distributed Shared Memory
In 1999, one will see production 1 Teraflop systems
In 2003, one will see production 10 Teraflop Systems
In 2007, one will see production 50-100 Teraflop Systems
Memory is Roughly 0.25 to 1 Terabyte per 1 Teraflop
If you are lucky/work hard: Realized performance is 30% of Peak
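The last two rules of thumb can be applied to the projected peak figures as a rough worked example. This is a sketch only: the function names are hypothetical, and the 30% realized fraction and the 0.25 to 1 Terabyte per Teraflop range are simply the figures quoted above.

```python
# Rules of thumb quoted on the foil; the names below are illustrative.
REALIZED_FRACTION = 0.30            # realized performance as a fraction of peak
MEMORY_TB_PER_TFLOP = (0.25, 1.0)   # Terabytes of memory per peak Teraflop


def realized_teraflops(peak_teraflops):
    """Estimated sustained performance for a given peak."""
    return REALIZED_FRACTION * peak_teraflops


def memory_range_terabytes(peak_teraflops):
    """Low and high estimates of memory size for a given peak."""
    low, high = MEMORY_TB_PER_TFLOP
    return (low * peak_teraflops, high * peak_teraflops)


# Projected production peaks (Teraflops) for the years named above.
projected_peaks = {1999: 1.0, 2003: 10.0}
for year, peak in projected_peaks.items():
    low, high = memory_range_terabytes(peak)
    print(f"{year}: peak {peak} Tflops, realized about "
          f"{realized_teraflops(peak):.1f} Tflops, memory {low} to {high} TB")
```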

Foil 11 The Technology
Driving Forces for HPCC

Foil 12 Effect of Feature Size on Performance

Foil 13 Growing Logic Chip Density

Foil 14 Trends in Feature and Die Size as a Function of Time

Foil 15 Supercomputer Memory Sizes and trends in RAM Density

RAM density increases by about a factor of 50 in 8 years
Supercomputers in 1992 have memory sizes around 32 gigabytes (giga = 10^9)
Supercomputers in year 2000 should have memory sizes around 1.5 terabytes (tera = 10^12)
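The three figures above are mutually consistent, as a short calculation shows: a factor of 50 over 8 years is a compound growth of roughly 1.6 per year, and 32 gigabytes scaled by 50 gives 1600 gigabytes, close to the 1.5 terabytes quoted. The helper names below are hypothetical, and the calculation assumes that memory size scales directly with RAM density.

```python
def annual_growth_factor(total_factor, years):
    """Constant yearly factor that compounds to total_factor over years."""
    return total_factor ** (1.0 / years)


def scale_memory(memory_gigabytes, total_factor):
    """Memory size after density grows by total_factor (assumes memory
    size tracks RAM density one-to-one)."""
    return memory_gigabytes * total_factor


rate = annual_growth_factor(50, 8)
memory_2000_gb = scale_memory(32, 50)
print(f"annual RAM density growth factor: {rate:.2f}")
print(f"projected memory in 2000: {memory_2000_gb} gigabytes "
      f"({memory_2000_gb / 1000} terabytes)")
```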

Foil 16 Comparison of Trends in RAM Density and CPU Performance Increases

Computer Performance is increasing faster than RAM density

Foil 17 Three Major Markets -- Logic,ASIC,DRAM

Overall Roadmap Technology Characteristics from SIA (Semiconductor Industry Association) Report 1994
L=Logic, D=DRAM, A=ASIC, mP = microprocessor

Foil 18 Chip and Package Characteristics

Overall Roadmap Technology Characteristics from SIA (Semiconductor Industry Association) Report 1994

Foil 19 Fabrication Characteristics

Overall Roadmap Technology Characteristics from SIA (Semiconductor Industry Association) Report 1994

Foil 20 Electrical Design and Test Metrics

Overall Roadmap Technology Characteristics from SIA (Semiconductor Industry Association) Report 1994

Foil 21 National Roadmap for Semiconductor Technology --1992

See Chapter 5 of Petaflops Report -- July 95

Foil 22 Status of Parallel Computing and High Speed Networks --
The Grand Challenges and the National Information Infrastructure

Foil 23 Superficial Observations on High Performance Computing-I

Parallel Computing Works!
Technology well understood for Science and Engineering
  • Good parallel algorithms, several examples of major applications in many fields exploring range of issues
  • Data and Message Parallel programming models developed
Supercomputing market small (few percent at best) and probably decreasing in size
  • Essential to have good common software infrastructure
  • Productivity tools -- Software Engineering -- Programming Support tools POOR
  • The parallel software "industry" is very small

Foil 24 We have learnt that Parallel Computing Works !

Data Parallelism - universal form of scaling parallelism
Functional Parallelism - Important but typically modest speedup. - Critical in multidisciplinary applications.
On any machine architecture
  • Distributed Memory MIMD
  • Distributed Memory SIMD
  • Shared memory - this affects programming model
  • This affects generality
  • SIMD ~ 50% academic problems
  • but < 50% commercial

Foil 25 Advances in Parallel Computer and High Speed Network (HPCC) Technology

Performance of both communication networks and computers will increase by a factor of 1000 during the 1990's
  • New uses of Computers to design new drugs, search terabyte databases etc.
  • National Information Infrastructure will see pervasive deployment of upgraded Internet to give megabit/second interactive links to homes and offices allowing interactive realtime video.
  • Greater utility of computers in "Old Applications"
Competitive advantage to industries that can use either or both High Performance Computers and Communication Networks. (United States clearly ahead of Japan and Europe in these technologies.)

Foil 26 Superficial Observations on High Performance Computing-II

No silver programming bullet -- I doubt that a new language will revolutionize parallel programming and make it much easier
  • Hardware (shared memory) could be helpful
Social forces are tending to hinder adoption of parallel computing, as most applications are in areas where large scale computing is already common
  • Parallelizing existing applications (porting sequential software) very hard
  • Opportunities offered by use of MPP's often require major organizational changes

Foil 27 When will Parallel Computing Take Over ?

Switch from conventional to new types of technology is a phase transition
Needs headroom (Carver Mead) which is large (factor of 10 ?) due to large new software investment
Machines such as the nCUBE-1 and CM-2 were comparable in cost performance to conventional supercomputers
  • Enough to show that "Parallel Computing Works"
  • Not enough to take over!
Cray T3D, Intel Paragon, CM-5, DECmpp (Maspar MP-2), IBM SP-2, nCUBE-3 have enough headroom to take over from traditional computers ?

Foil 28 Some Hardware/Software Trends over next 5 years

ATM networks have rapidly transitioned from research Gigabit networks to commercial deployment
  • ATM likely to be a major force in local area as well as wide area networks
Computer Hardware trends imply that all computers (PC's ---> Supercomputers) will be parallel by the year 2000
  • Up to 1993, parallel computers are from small start-up companies (except Intel Supercomputer Division)
  • Now Cray, Convex (HP), Digital, IBM have massively parallel computing systems and Silicon Graphics is becoming a powerful high performance computing vendor
  • Several architectures but only one, the distributed memory MIMD multicomputer, is known to scale from one to very many processors
Software is the challenge and could prevent or delay the hardware trend that suggests parallelism will be a mainline computer architecture
  • We must get systems software correct
  • Simultaneously develop applications software in gradually improving parallel programming environment

Foil 29 Who Uses High Performance Computers?

High Energy Physics
Semiconductor Industry, VLSI Design
Graphics and Virtual Reality
Weather and Ocean Modeling
Visualization
Oil Industry
Automobile Industry
Chemicals and Pharmaceuticals Industry
Financial Applications
Business Applications
Airline Industry

Foil 30 The Federal High Performance Computing and Communication Initiative 1992--1996

Foil 31 The Federal High Performance Computing and Communication Initiative (HPCCI)

Originally $2.9 billion over 5 years starting in 1992 and
  • Rapidly growing Information technology component starting in 1994 and total budget now over $1 billion per year
The Grand Challenges
  • Enabled by teraflop computers and important to economy or fundamental research
    • Global warming - NOAA
    • Oil reservoir and environmental simulation - DOE
    • Structural and aerodynamic calculations - NASA
    • Earth observing satellite - data analysis - NASA
    • Human genome - NIH, DOE
    • Quantum chromodynamics - Fundamental Physics
    • Gravitational waves from black holes - Fundamental Physics
    • Molecular modeling - Fundamental Chemistry
Nearly all grand challenges have industrial payoff but technology transfer NOT funded by HPCCI

Foil 32 The High Performance Computing and Communications Initiative

High Performance Computing Act of 1991

Foil 33 HPCCI Goals

Computational performance of one trillion operations per second on a wide range of important applications
Development of associated system software, tools, and improved algorithms
A national research network capable of one billion bits per second
Sufficient production of PhDs in computational science and engineering

Foil 34 Note the Trend from Large Scale Numerical Computing to the Integration of Computing and Communication in the NII

Foil 35 The Blue Books
Supplements to the President's Fiscal Year Budget

1992: Grand Challenges
1993: Grand Challenges
1994: Toward a National Information Infrastructure
1995: Technology for the National Information Infrastructure
1996: Foundation for America's Information Future

Foil 36 http://www.hpcc.gov/

Foil 37 The Blue Book Covers

Foil 38 Superficial Observations on High Performance Communication

ATM, ISDN, wireless and satellite technologies are advancing rapidly in the commercial arena, which is adopting research rapidly
Social forces (deregulation in the U.S.A.) are tending to accelerate adoption of digital communication technologies
  • These are often NEW applications (porting of POTS relatively easy!) such as interactive TV/Shopping
  • Tremendous competition between different telecommunication sectors encourages new technology now to ensure future success
Not clear how to make money on the Web (Internet), but growing interest/acceptance by general public
  • huge sales in home multimedia PC's -- comparable to TV's in volume
Integration of Communities and Opportunities
  • Computing and Communication and Information Industries merging -- similar impact on academic departments will(should) happen


Foil 39 Some Implications of HPCC Observations

Technology Opportunities in Integration of High Performance Computing and Communication Systems
  • Merging of networking, parallel computing, distributed computing communities
  • This SOLVES previous difficulties observed for high performance computing, as it implies a much larger distributed (world-wide metacomputing) computing base
New Business opportunities linking Enterprise Information Systems to Community networks to current cable/network TV journalism
New educational needs at interface of computer science and communications/information applications
Major implications for education -- the Virtual University


Foil 40 What and Why is Computational Science ?


Foil 41 Parallelism Implies Major Changes which have significant educational Implications

Different machines
New types of computers
New libraries
Rewritten Applications
Totally new fields able to use computers .... ==> Need new educational initiatives Computational Science
Will be a nucleus for the phase transition
and accelerate use of parallel computers in the real world


Foil 42 What is Computational Science?

Computational Science is an interdisciplinary field that integrates computer science and applied mathematics with a wide variety of application areas that use significant computation to solve their problems
Includes the study of computational techniques
  • Science and Engineering - Grand Challenges
  • Society and Business - National Challenge
Includes the study of new algorithms, languages and models in computer science and applied mathematics required by the use of high performance computing and communications in any (?) important application
  • At interface of (applied) computer science and applications
Includes computation of complex systems using physical analogies such as neural networks and genetic optimization.


Foil 43 What do we have at Syracuse University?

Formal Master's Program with reasonable curriculum and course material
PhD called Computer and Information Science but can choose computational science research
Certificates(Minors) in Computational Science at both the Masters and PhD Level
Undergraduate Minors in Computational Science
All Programs are open to both computer science and application (computer user) students
Currently have both a "Science and Engineering Track" ("parallel computing") and an "Information oriented Track" ("the web")


Foil 44 Program in Computational Science
Implemented within current academic framework


Foil 45 Methodology for Computation


Foil 46 Usefulness of Computational Science Degrees:

Conclusions of DOE Conference on Computational Science Education, Feb 1994
Industry and government laboratories want graduates with Computational Science and Engineering training - don't care what degree is called
Universities - want graduates with Computational Science and Engineering training - want degrees to have traditional names
Premature to have BS Computational Science and Engineering


Foil 47 Syracuse Computational Science Academic Programs -- Masters Degree

Master's Degree in Computational Science Course Requirements:
Core Courses:
  • CPS 615: Introduction to Computational Science
  • CPS 675: Design and analysis of algorithms
  • MAT 683: Methods of numerical analysis I
Application Area:
  • Applications of computational science, including a substantial project. Example: CPS713 Case Studies in Computational Science
It is required to take one course in 3 out of the following 4 areas:
  • 1. Parallel programming, algorithms, and architecture
  • 2. Methodology and techniques: Numerical analysis, optimization, simulation
  • 3. High performance software: Compilers, languages, visualization, programming environments
  • 4. Advanced computer science and software engineering: Structured programming and formal methods


Foil 48 Syracuse Graduate Computational Science Academic Programs

Minors in Computational Science
Masters Level Certificate:
  • Available to graduate students enrolled in any SU masters or Ph.D. program
  • Courses required for certificate are one from each area (15 credits)
    • 1. CPS 615: Introduction to Computational Science -- Simulation or
    • 1. CPS 616: Computational Science for Information Applications
    • (probably CPS 606 can be substituted here)
  • 2. Applications of Computer Science (e.g. CPS713/714)
  • 3. High Performance Parallel Computing
  • 4. Methodology and techniques
  • 5. Computational Science elective -Relevant course chosen by student - (e.g. CPS730)
Doctoral level Certificate:
  • 5 courses as above with one more elective (18 credits)
  • Make a contribution to computational science through the research of the dissertation
Doctoral level Certificate in Computational Neuroscience:
  • Joint program: Bioengineering (Institute of Sensory Research), SUNY Health Science Center, Computer Science


Foil 49 Computational Science Courses -- Typical CPS615 Module

Example Course Module
  • Application: Field simulations as in fluid mechanics, solid mechanics and electromagnetics.
  • Numerical methods: Numerical solutions to partial differential equations.
  • Computational algorithms: Parallel techniques for iterative solvers based on finite differences and finite elements.
  • Software -- Message Passing and HPF Implementation for simplest Jacobi solvers
  • Results: Calculating the field of an electrostatic lens; calculating the air flow around an airplane wing.
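
As a rough illustration of the "simplest Jacobi solver" named in the module above, the following sequential Python sketch relaxes Laplace's equation on a small square grid with fixed boundary values. The grid size, the boundary values and the function name are assumptions made for illustration and are not taken from the course material. A message passing or HPF version would distribute blocks of the grid among nodes and exchange edge values after each sweep.

```python
def jacobi_laplace(n=8, sweeps=500, top=1.0):
    """Jacobi relaxation for Laplace's equation on an n x n grid.

    Illustrative boundary condition: the top edge is held at `top`,
    the other three edges at 0. Interior points start at 0.
    """
    u = [[0.0] * n for _ in range(n)]
    for j in range(n):
        u[0][j] = top

    for _ in range(sweeps):
        # Jacobi: every new value is computed from the OLD grid only,
        # so all interior points could be updated concurrently.
        new = [row[:] for row in u]
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                    + u[i][j - 1] + u[i][j + 1])
        u = new
    return u


grid = jacobi_laplace()
```

Because each sweep reads only the previous grid, the update of one subdomain needs nothing from its neighbours except their edge values, which is why this solver is a natural first example for domain decomposition.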


Foil 50 Computational Science Courses -- CPS713

CPS 713 Case Studies in Computational Science
This course emphasizes a few applications and gives an in-depth treatment of the more advanced computing techniques, aiming for a level of sophistication representing the best techniques currently known by researchers in the field.
  • Typically, the course is organized around three or four application topics such as:
  • Analysis of data and parameterization - statistics and optimization methods for massive data sets.
  • Molecular dynamics, as in CHARMM - particle dynamics in very large systems
  • Determining energy levels of large chemical systems, as in MOPAC - eigenvalues by matrix methods
  • Statistical physics - clustering methods
  • Collision of black holes - PDE's by adaptive finite difference meshes
  • Computational fluid dynamics as in NAS problem from NASA - PDE's by finite differences and finite elements
  • Students carry out detailed implementation projects for one or more topics, working either individually or in teams.
Instructor: Professor Geoffrey Fox, Computer Science and Physics


Foil 51 Some Academic Areas and their Relation to Computational Science

Computer Science -- Nationally viewed as central activity
  • Congress thinks Computer Science is activities such as NSF Supercomputer Centers i.e. Computational Science
  • Computer Scientists think of the field as less applied
Computer Engineering -- Historically Mathematics and Electrical Engineering have spawned Computer Science programs -- if from electrical engineering, the field is sometimes called computer engineering
Applied Mathematics is a very broad field in the U.K., where it is equivalent to Theoretical Physics. In the USA, applied mathematics is roughly mathematics associated with fluid flow
  • Field teaches areas such as Scientific Computing even though it ignores many issues needed outside differential equation solution
Computational Physics -- Practitioners will be judged by their contribution to physics and not directly by algorithms and software innovations.
  • Similar remarks about Computational Aerospace, Chemistry etc.


Foil 52 Program in Information Age Computational Science Implemented Within Current Academic Program


Foil 53 Parallel Processing and Society

The fundamental principles behind the use of concurrent computers are identical to those used in society - in fact they are partly why society exists.
If a problem is too large for one person, one does not hire a SUPERman, but rather puts together a team of ordinary people...
cf. Construction of Hadrian's Wall


Foil 54 Concurrent Construction of a Wall
Using N = 8 Bricklayers
Decomposition by Vertical Sections

Domain Decomposition is Key to Parallelism
Need "Large" Subdomains l >> l_overlap


Foil 55 Quantitative Speed-Up Analysis for Construction of Hadrian's Wall


Foil 56 Amdahl's law for Real World Parallel Processing

AMDAHL's LAW or
Too many cooks spoil the broth
Says that
Speedup S is small if efficiency e is small
or for Hadrian's wall
equivalently S is small if length l small
But this is irrelevant as we do not need parallel processing unless problem big!
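
The dependence of speedup on the length l of wall per bricklayer can be sketched numerically. The model below is an assumption made for illustration, not the formula from the foils: each bricklayer does useful work proportional to l and loses time proportional to a fixed overlap length l_overlap at the section boundaries, so that efficiency is e = 1 / (1 + l_overlap / l) and speedup is S = N e. The function names are hypothetical.

```python
def efficiency(l, l_overlap):
    """Assumed model: useful work ~ l, boundary overhead ~ l_overlap."""
    return 1.0 / (1.0 + l_overlap / l)


def speedup(n_bricklayers, l, l_overlap):
    """Speedup S = N * e for N bricklayers, each owning a section of length l."""
    return n_bricklayers * efficiency(l, l_overlap)


# Large sections (l >> l_overlap): speedup close to N = 8.
big_problem = speedup(8, l=100.0, l_overlap=1.0)

# Tiny sections (l comparable to l_overlap): half the effort is overhead.
small_problem = speedup(8, l=1.0, l_overlap=1.0)
```

Under this assumed model the speedup approaches N as l grows, which matches the remark above that poor speedup on small problems does not matter in practice.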


Foil 57 Pipelining --Another Parallel Processing Strategy for Hadrian's Wall

"Pipelining" or decomposition by horizontal section is:
  • In general less effective
  • and leads to less parallelism
  • (N = Number of bricklayers must be < number of layers of bricks)


Foil 58 Hadrian's Wall Illustrates that the Topology of Processor Must Include Topology of Problem

Hadrian's Wall is one dimensional
Humans represent a flexible processor node that can be arranged in different ways for different problems
The lesson for computing is:
Original MIMD machines used a hypercube topology. The hypercube includes several topologies including all meshes. It is a flexible concurrent computer that can tackle a broad range of problems. Current machines use different interconnect structure from hypercube but preserve this capability.
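
One standard way to see that a hypercube "includes" a one dimensional topology such as Hadrian's Wall is the binary-reflected Gray code, which numbers the positions around a ring so that neighbouring positions sit on hypercube nodes whose labels differ in exactly one bit, i.e. nodes joined by a direct link. The foil does not name this construction. It is used here only as an illustrative sketch, and the helper names are hypothetical.

```python
def gray(i):
    """Binary-reflected Gray code of integer i."""
    return i ^ (i >> 1)


def ring_on_hypercube(d):
    """Hypercube node labels for positions 0 .. 2**d - 1 around a ring."""
    return [gray(i) for i in range(2 ** d)]


def hamming(a, b):
    """Number of bit positions in which labels a and b differ."""
    return bin(a ^ b).count("1")


# 8 bricklayers in a line (or ring) placed on a 3-dimensional hypercube.
ring = ring_on_hypercube(3)

# Distance between each position and the next, including the wrap-around.
neighbour_distances = [hamming(ring[i], ring[(i + 1) % len(ring)])
                       for i in range(len(ring))]
```

Applying the same code independently to each coordinate of a grid gives the corresponding embedding of a two or three dimensional mesh.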


Foil 59 General Speed Up Analysis

Comparing Computer and Hadrian's Wall Cases


Foil 60 Nature's Concurrent Computers

At the finest resolution, collection of neurons sending and receiving messages by axons and dendrites
At a coarser resolution
Society is a collection of brains sending and receiving messages by sight and sound
Ant Hill is a collection of ants (smaller brains) sending and receiving messages by chemical signals
Lesson: All Nature's Computers Use Message Passing
With several different Architectures


Foil 61 Comparison of Concurrent Processing in Society and Computing

Problems are large - use domain decomposition; overheads are edge effects
Topology of processor matches that of domain - processor with rich flexible node/topology matches most domains
Regular homogeneous problems easiest but
irregular or
Inhomogeneous
Can use local and global parallelism
Can handle concurrent calculation and I/O
Nature always uses message passing as in parallel computers (at lowest level)


Foil 62 Data Parallelism is a Universal Source of Scaling Parallelism


Foil 63 We have learnt that Parallel Computing Works !

Data Parallelism - universal form of scaling parallelism
Functional Parallelism - Important but typically modest speedup. - Critical in multidisciplinary applications.
On any machine architecture
  • Distributed Memory MIMD
  • Distributed Memory SIMD
  • Shared memory - this affects programming model
  • This affects generality
  • SIMD ~ 50% academic problems
  • but < 50% commercial


Foil 64 Methodology of Parallel Computing

Simple, but general and extensible to many more nodes is domain decomposition
All successful concurrent machines with
  • Many nodes
  • High performance (this excludes Dataflow)
Have obtained parallelism from "Data Parallelism" or "Domain Decomposition"
Problem is an algorithm applied to data set
  • Obtain concurrency by acting on data concurrently.
The three architectures considered here differ as follows:
  • MIMD Distributed Memory -- Processing and Data Distributed
  • MIMD Shared Memory -- Processing Distributed but memory shared
  • SIMD Distributed Memory -- Synchronous Processing on Distributed Data
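
The idea that a problem is "an algorithm applied to a data set", with concurrency obtained by acting on the data concurrently, can be sketched as follows. This is a minimal illustration, assuming a non-empty list and a trivial per-element operation. The function names are hypothetical, and Python threads are used only to show the structure (decompose, work on each piece, gather), not to obtain real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(data, n_nodes):
    """Split data into at most n_nodes contiguous chunks of nearly equal size."""
    size = (len(data) + n_nodes - 1) // n_nodes
    return [data[i:i + size] for i in range(0, len(data), size)]


def node_work(chunk):
    """The same algorithm applied by every node to its own part of the data."""
    return [x * x for x in chunk]


def data_parallel_square(data, n_nodes=4):
    chunks = decompose(data, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        # map preserves chunk order, so results can simply be concatenated.
        results = list(pool.map(node_work, chunks))
    return [y for part in results for y in part]


squares = data_parallel_square(list(range(10)))
```

The amount of available parallelism here is set by the size of the data set rather than by the structure of the algorithm, which is the sense in which data parallelism scales.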


Foil 65 Concurrent Computation as a Mapping Problem -I

2 Different types of Mappings in Physical Spaces
Both are static
  • a) Seismic Migration with domain decomposition on 4 nodes
  • b) Universe simulation with irregular data but static 16 node decomposition
  • but this problem would be best with dynamic irregular decomposition


Foil 66 Concurrent Computation as a Mapping Problem - II

Different types of Mappings -- A very dynamic case without any underlying Physical Space
c) Computer Chess with dynamic game tree decomposed onto 4 nodes


Foil 67 Concurrent Computation as a Mapping Problem - III


Foil 68 Finite Element Mesh From Nastran
(mesh only shown in upper half)


Foil 69 A Simple Equal Area Decomposition

And the corresponding poor workload balance


Foil 70 Decomposition After Annealing
(one particularly good but nonoptimal decomposition)

And excellent workload balance


Foil 71 Comparison of The Complete Problem to the subproblems formed in domain decomposition

The case of Programming a Hypercube
Each node runs software that is similar to sequential code
e.g., FORTRAN with geometry and boundary value sections changed


Foil 72 Hadrian's Wall Illustrating an
Irregular but Homogeneous Problem

Geometry irregular but each brick takes about the same amount of time to lay.
Decomposition of wall for an irregular geometry involves equalizing number of bricks per mason, not length of wall per mason.
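
A small sketch of this rule: the wall is described by the number of bricks in each column, and it is cut into contiguous sections at the points where the running brick count crosses each successive fraction of the total. The wall profile and the function name are made up for illustration, and this greedy cut is only one simple way to balance the work.

```python
def split_by_bricks(heights, n_masons):
    """Cut a wall (bricks per column) into contiguous, brick-balanced sections."""
    total = sum(heights)
    sections, current, running = [], [], 0
    next_cut = 1  # index of the next threshold: next_cut / n_masons of total
    for h in heights:
        current.append(h)
        running += h
        # Close the section once the running count reaches the next threshold.
        if next_cut < n_masons and running * n_masons >= next_cut * total:
            sections.append(current)
            current = []
            next_cut += 1
    if current:
        sections.append(current)
    return sections


# Hypothetical irregular wall: low at the left, tall in the middle.
wall = [1, 1, 1, 1, 4, 4, 2, 2]
sections = split_by_bricks(wall, n_masons=4)
bricks_per_mason = [sum(s) for s in sections]
columns_per_mason = [len(s) for s in sections]
```

For this profile an equal-length split (two columns each) would give the masons 2, 2, 8 and 4 bricks, whereas the brick-balanced split gives every mason the same number of bricks but sections of different lengths.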


Foil 73 Some Problems are Inhomogeneous Illustrated by:
An Inhomogeneous Hadrian Wall with Decoration

Fundamental entities (bricks, gargoyles) are of different complexity
Best decomposition dynamic
Inhomogeneous problems run on concurrent computers but require dynamic assignment of work to nodes and strategies to optimize this
(we use neural networks, simulated annealing, spectral bisection etc.)


Foil 74 Global and Local Parallelism Illustrated by Hadrian's Wall

Global Parallelism
  • Break up domain
  • Amount of Parallelism proportional to size of problem (and is usually large)
  • Unit is Bricklayer or Computer node
Local Parallelism
  • Do in parallel local operations in the processing of basic entities
    • e.g. for Hadrian's problem, use two hands, one for brick and one for mortar while ...
    • for computer case, do addition at same time as multiplication
  • Local Parallelism is limited but useful
Local and Global Parallelism
Should both be Exploited


Foil 75 Parallel I/O Illustrated by
Concurrent Brick Delivery for Hadrian's Wall
Bandwidth of Trucks and Roads
Matches that of Masons

Disk (input/output) Technology is better matched to several modest power processors than to a single sequential supercomputer
Concurrent Computers natural in databases, transaction analysis


Foil 76 Single nCUBE2 CPU Chip


Foil 77 64 Node nCUBE Board

Each node is CPU and 6 memory chips -- CPU Chip integrates communication channels with floating, integer and logical CPU functions


Foil 78 Technologies for High Performance Computers

We can choose technology and architecture separately in designing our high performance system
Technology is like choosing ants, people or tanks as basic units in our society analogy
  • or less frivolously neurons or brains
In HPCC arena, we can distinguish current technologies
  • COTS (Commercial off-the-shelf) Microprocessors
  • Custom node computer architectures
  • More generally these are all CMOS technologies
Near term technology choices include
  • Gallium Arsenide or Superconducting materials as opposed to Silicon
  • These are faster by a factor of 2 (GaAs) to 300 (Superconducting)
Further term technology choices include
  • DNA (Chemical) or Quantum technologies
It will cost $40 Billion for next industry investment in CMOS plants and this huge investment makes it hard for new technologies to "break in"


Foil 79 Architectures for High Performance Computers - I

Architecture is equivalent to organization or design in society analogy
  • Different models for society (Capitalism etc.) or different types of groupings in a given society
  • Businesses or Armies are more precisely controlled/organized than a crowd at the State Fair
  • We will generalize this to formal (army) and informal (crowds) organizations
We can distinguish formal and informal parallel computers
Informal parallel computers are typically "metacomputers"
  • i.e. a bunch of computers sitting on a department network


Foil 80 Architectures for High Performance Computers - II

Metacomputers are a very important trend which uses similar software and algorithms to conventional "MPP's" but have typically less optimized parameters
  • In particular network latency is higher and bandwidth is lower for an informal HPC
  • Latency is time for zero length communication -- start up time
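
The roles of latency and bandwidth can be made concrete with the usual linear cost model, in which sending n bytes takes a fixed start-up time plus n divided by the bandwidth. The model is a common simplification, and the numbers below are hypothetical values chosen only to contrast a closely coupled machine with a department network. They are not measurements from the foils.

```python
def message_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Linear cost model: start-up time plus transfer time."""
    return latency_s + n_bytes / bandwidth_bytes_per_s


# Hypothetical parameters (assumptions for illustration only).
MPP = {"latency_s": 10e-6, "bandwidth_bytes_per_s": 100e6}
METACOMPUTER = {"latency_s": 1e-3, "bandwidth_bytes_per_s": 1e6}

# A zero length message costs exactly the latency.
startup_only = message_time(0, **METACOMPUTER)

# A short message is dominated by latency, a long one by bandwidth.
short_mpp = message_time(100, **MPP)
short_meta = message_time(100, **METACOMPUTER)
long_mpp = message_time(10_000_000, **MPP)
long_meta = message_time(10_000_000, **METACOMPUTER)
```

With higher latency and lower bandwidth, the informal system is slower for both short and long messages, so it favours algorithms that communicate rarely and in large blocks.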
Formal high performance computers are the classic (basic) object of study and are
"closely coupled" specially designed collections of compute nodes which have (in principle) been carefully optimized and balanced in the areas of
  • Processor (computer) nodes
  • Communication (internal) Network
  • Linkage of Memory and Processors
  • I/O (external network) capabilities
  • Overall Control or Synchronization Structure


Foil 81 There is no Best Machine!

In society, we see a rich set of technologies and architectures
  • Ant Hills
  • Brains as bunch of neurons
  • Cities as informal bunch of people
  • Armies as formal collections of people
With several different communication mechanisms with different trade-offs
  • One can walk -- low latency, low bandwidth
  • Go by car -- high latency (especially if can't park), reasonable bandwidth
  • Go by air -- higher latency and bandwidth than car
  • Phone -- High speed at long distance but can only communicate modest material (low capacity)


Foil 82 Quantum Computing - I

Quantum-Mechanical Computers by Seth Lloyd, Scientific American, Oct 95
Chapter 6 of The Feynman Lectures on Computation edited by Tony Hey and Robin Allen, Addison-Wesley, 1996
Quantum Computing: Dream or Nightmare? Haroche and Raimond, Physics Today, August 96 page 51
Basically any physical system can "compute" as one "just" needs a system that gives answers that depend on inputs and all physical systems have this property
Thus one can build "superconducting", "DNA" or "Quantum" computers exploiting, respectively, superconducting, molecular or quantum mechanical rules


Foil 83 Quantum Computing - II

For a "new technology" computer to be useful, one needs to be able to
  • conveniently prepare inputs,
  • conveniently program,
  • reliably produce answer (quicker than other techniques), and
  • conveniently read out answer
Conventional computers are built around bit ( taking values 0 or 1) manipulation
One can build arbitrarily complex arithmetic if one has some way of implementing NOT and AND
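
As a small demonstration of this point, the sketch below builds OR, XOR, a full adder and multi-bit addition using only NOT and AND as logical primitives. The function names are illustrative, and Python shifts are used only to unpack and repack the bits of the operands, not as part of the logic.

```python
def NOT(a):
    return 1 - a


def AND(a, b):
    return a & b


def OR(a, b):
    # De Morgan: a OR b = NOT(NOT a AND NOT b)
    return NOT(AND(NOT(a), NOT(b)))


def XOR(a, b):
    # True when at least one input is 1 but not both.
    return AND(OR(a, b), NOT(AND(a, b)))


def full_adder(a, b, carry_in):
    """Add three bits; return (sum_bit, carry_out)."""
    partial = XOR(a, b)
    sum_bit = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return sum_bit, carry_out


def add(x, y, width=8):
    """Ripple-carry addition of two unsigned integers, modulo 2**width."""
    result, carry = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result
```

Any physical system that can realise these two gates on its bits can therefore, in principle, carry out integer arithmetic.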
Quantum Systems naturally represent bits
  • A spin (of say an electron or proton) is either up or down
  • A hydrogen atom is either in lowest or (first) excited state etc.


Foil 84 Quantum Computing - III

Interactions between quantum systems can cause "spin-flips" or state transitions and so implement arithmetic
Incident photons can "read" state of system and so give I/O capabilities
Quantum "bits" called qubits have another property as one has not only
  • State |0> and state |1> but also
  • Coherent states such as .7071*(|0> + |1>) which are equally in either state
Lloyd describes how such coherent states provide new types of computing capabilities
  • Natural random number as measuring state of qubit gives answer 0 or 1 randomly with equal probability
  • As Feynman suggests, qubit based computers are natural for large scale simulation of quantum physical systems -- this is "just" analog computing
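
The "natural random number" property can be imitated classically. The sketch below assumes the standard rule that the probability of each measurement outcome is the squared magnitude of its amplitude, and uses the amplitude 0.7071 quoted above (approximately 1/sqrt(2)). It is an ordinary pseudo-random simulation with illustrative names, not a quantum computation.

```python
import random

AMPLITUDE = 0.7071  # value quoted on the foil, approximately 1/sqrt(2)


def probabilities(amp0, amp1):
    """Outcome probabilities for real amplitudes, normalised to sum to 1."""
    p0, p1 = amp0 ** 2, amp1 ** 2
    norm = p0 + p1
    return p0 / norm, p1 / norm


def measure(amp0, amp1, rng):
    """Simulate one measurement of the qubit: returns 0 or 1."""
    p0, _ = probabilities(amp0, amp1)
    return 0 if rng.random() < p0 else 1


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [measure(AMPLITUDE, AMPLITUDE, rng) for _ in range(10000)]
fraction_of_ones = sum(samples) / len(samples)
```

With equal amplitudes the two outcomes are equally likely, so the fraction of ones should come out close to one half.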


Foil 85 Superconducting Technology -- Past

Superconductors produce wonderful "wires" which transmit picosecond (10^-12 seconds) pulses at near speed of light
  • Superconducting is lower power and faster than diffusive electron transmission in CMOS
  • At about 0.35 micron chip feature size, CMOS transmission time changes from domination by transmission (distance) issues to resistive (diffusive) effects
Niobium used in constructing such superconducting circuits can be processed by similar fabrication techniques to CMOS
Josephson Junctions allow picosecond performance switches
BUT IBM (1969-1983) and Japan (MITI 1981-90) terminated major efforts in this area

Foil 86 Superconducting Technology -- Present

New ideas have resurrected this concept using RSFQ -- Rapid Single Flux Quantum -- approach
This naturally gives a bit which is 0 or 1 (or in fact n units!)
This gives interesting circuits of similar structure to CMOS systems but with a clock speed of order 100-300 GHz -- a factor of 100 better than CMOS, which will asymptote at around 1 GHz (= one nanosecond cycle time)

Foil 87 Superconducting Technology -- Problems

At least two major problems:
Semiconductor industry will invest some $40B in CMOS "plants" and infrastructure
  • Currently perhaps $100M a year going into superconducting circuit area!
  • How do we "bootstrap" superconducting industry?
Cannot build memory to match CPU speed: current designs have superconducting CPUs (with perhaps 256 Kbytes of superconducting memory per processor) but conventional CMOS memory
  • So, compared with current computers, such machines have a CPU a thousand times faster, a cache at CPU speed that is a factor of four smaller, and basic memory of the same speed as now
  • Can such machines perform well -- need new algorithms?
  • Can one design new superconducting memories?
Superconducting technology also has a bad "name" due to IBM termination!

Foil 88 Architecture Classes of High Performance Computers

Sequential or von Neumann Architecture
Vector (Super)computers
Parallel Computers
  • with various architectures classified by Flynn's methodology (this is incomplete as it only discusses control or synchronization structure)
  • SISD
  • MISD
  • MIMD
  • SIMD
  • Metacomputers

Foil 89 von Neumann Architecture in a Nutshell

Instructions and data are stored in the same memory, for which there is a single link (the von Neumann Bottleneck) to the CPU, which decodes and executes instructions
The CPU can have multiple functional units
The memory access can be enhanced by use of caches made from faster memory to allow greater bandwidth and lower latency
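A rough way to see why a cache helps is the standard weighted-average access time. The sketch below is an illustration under assumed numbers (a 1-cycle cache and a 20-cycle main memory are hypothetical values, not taken from the foils), and `average_access_time` is an illustrative name.

```python
def average_access_time(hit_rate, t_cache, t_memory):
    """Average cycles per access when a fraction hit_rate is served by the cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must lie between 0 and 1")
    return hit_rate * t_cache + (1.0 - hit_rate) * t_memory


if __name__ == "__main__":
    # Assumed timings: cache 1 cycle, main memory 20 cycles.
    for hit_rate in (0.0, 0.5, 0.9, 0.99):
        print(hit_rate, average_access_time(hit_rate, 1.0, 20.0))
```

Under these assumptions, the average access time only approaches the cache time when nearly all references hit the cache.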

Foil 90 Illustration of Importance of Cache

Fig 1.14 of Aspects of Computational Science
Editor Aad van der Steen
published by NCF

Foil 91 Vector Supercomputers in a Nutshell - I

This design enhances performance by noting that many applications calculate "vector-like" operations
  • Such as c(i)=a(i)+b(i) for i=1...N and N quite large
This allows one to address two performance problems
  • Latency in accessing memory (e.g. could take 10-20 clock cycles between requesting a particular memory location and delivery of result to CPU)
  • A complex operation, e.g. a floating point operation, can take a few machine cycles to complete
They are typified by Cray 1, XMP, YMP, C-90, CDC-205, ETA-10 and Japanese Supercomputers from NEC, Fujitsu and Hitachi
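The benefit of pipelining a vector operation such as c(i)=a(i)+b(i) can be estimated with a simple cycle count. This is a sketch under a simplifying assumption: each element needs `stages` cycles from issue to result (memory latency plus the arithmetic itself), and a full pipeline delivers one result per cycle. The function names are illustrative.

```python
def unpipelined_cycles(n, stages):
    """Cycles when each element must finish before the next one starts."""
    return n * stages


def pipelined_cycles(n, stages):
    """Cycles when a new element enters the pipeline every clock cycle."""
    if n == 0:
        return 0
    # The first result appears after `stages` cycles, then one result per cycle.
    return stages + (n - 1)


if __name__ == "__main__":
    stages = 20   # assumed depth, in the range of the 10-20 cycle latency quoted above
    for n in (1, 10, 1000):
        print(n, unpipelined_cycles(n, stages), pipelined_cycles(n, stages))
```

For large N the start-up cost is amortized and the pipelined time approaches one cycle per element, which is why long vectors matter.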

Foil 92 Vector Supercomputing in a picture

A pipeline for vector addition looks like:
  • From Aspects of Computational Science -- Editor Aad van der Steen published by NCF

Foil 93 Vector Supercomputers in a Nutshell - II

Vector machines pipeline data through the CPU
They are not so popular/relevant as in the past as
  • Improved CPU architecture needs fewer cycles than before for each (complex) operation (e.g. 4 now, not ~100 as in the past)
  • 8 MHz 8087 of Cosmic Cube took 160 to 400 clock cycles to do a full floating point operation in 1983
  • Applications need more flexible pipelines which allow different operations to be executed on consecutive operands as they stream through the CPU
  • Modern RISC processors (super scalar) can support such complex pipelines as they have far more logic than CPU's of the past
In fact the excellence of, say, the Cray C-90 is due to its very good memory architecture, which allows one to get enough operands to sustain the pipeline.
Most workstation class machines have "good" CPU's but can never get enough data from memory to sustain good performance except for a few cache intensive applications

Foil 94 Instruction Flow in A Simple Machine Pipeline

Three Instructions are shown overlapped -- each starting one clock cycle after last

Foil 95 Flynn's Classification of HPC Systems

Very High Speed Computing Systems, Proc. of IEEE 54, 12, p. 1901-1909 (1966) and
Some Computer Organizations and Their Effectiveness, IEEE Trans. on Comp. C-21, 948-960 (1972) -- both papers by M.J. Flynn
SISD -- Single Instruction stream, Single Data Stream -- i.e. von Neumann Architecture
MISD -- Multiple Instruction stream, Single Data Stream -- Not interesting
SIMD -- Single Instruction stream, Multiple Data Stream
MIMD -- Multiple Instruction stream and Multiple Data Stream -- dominant parallel system with ~one to ~one match of instruction and data streams.

Foil 96 Parallel Computer Architecture Memory Structure

Memory Structure of Parallel Machines
  • Distributed
  • Shared
  • Cached
and Heterogeneous mixtures
Shared (Global): There is a global memory space, accessible by all processors.
  • Processors may also have some local memory.
  • Algorithms may use global data structures efficiently.
  • However "distributed memory" algorithms may still be important as memory is NUMA (Nonuniform access times)
Distributed (Local, Message-Passing): All memory is associated with processors.
  • To retrieve information from another processor's memory a message must be sent there.
  • Algorithms should use distributed data structures.
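The distributed memory rule, that remote data is obtained only by sending a message to its owner, can be mimicked in a few lines. This is a toy sketch, assuming Python threads stand in for processors and `queue.Queue` objects stand in for the network; the names `Node`, `remote_read` and `demo` are illustrative and do not correspond to any real message passing library.

```python
import queue
import threading


class Node:
    """A processor with private memory, reachable only through its mailbox."""

    def __init__(self, name, memory):
        self.name = name
        self.memory = dict(memory)      # local memory, never touched by other nodes
        self.mailbox = queue.Queue()    # incoming request messages

    def serve(self):
        """Answer read requests until a None shutdown message arrives."""
        while True:
            msg = self.mailbox.get()
            if msg is None:
                break
            key, reply_to = msg
            reply_to.put(self.memory.get(key))


def remote_read(owner, key):
    """Fetch a value from owner's memory by sending a request and awaiting the reply."""
    reply = queue.Queue()
    owner.mailbox.put((key, reply))
    return reply.get(timeout=5)


def demo():
    owner = Node("node1", {"x": 42})
    server = threading.Thread(target=owner.serve)
    server.start()
    value = remote_read(owner, "x")     # the requester never reads owner.memory directly
    owner.mailbox.put(None)             # shut the server loop down
    server.join()
    return value


if __name__ == "__main__":
    print(demo())
```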

Foil 97 Comparison of Memory Access Strategies

Memory can be accessed directly (analogous to a phone call) as in red lines below or indirectly by message passing (green line below)
We show two processors in a MIMD machine for distributed (left) or shared (right) memory architectures

Foil 98 Types of Parallel Memory Architectures -- Physical Characteristics

Uniform: All processors take the same time to reach all memory locations.
Nonuniform (NUMA): Memory access is not uniform so that it takes a different time to get data by a given processor from each memory bank. This is natural for distributed memory machines but also true in most modern shared memory machines
  • DASH (Hennessy at Stanford) is the best known example of such a virtual shared memory machine, which is logically shared but physically distributed.
  • ALEWIFE from MIT is a similar project
  • TERA (from Burton Smith) is Uniform memory access and logically shared memory machine
Most NUMA machines these days have two memory access times
  • Local memory (divided into registers, caches, etc.) and
  • Nonlocal memory with little or no difference in access time for different nonlocal memories
This simple two level memory access model gets more complicated in proposed 10 year out "petaflop" designs

Foil 99 Diagrams of Shared and Distributed Memories


Foil 100 Parallel Computer Architecture Control Structure

SIMD - lockstep synchronization
  • Each processor executes same instruction stream
MIMD - Each Processor executes independent instruction streams
MIMD Synchronization can take several forms
  • Simplest: program controlled message passing
  • "Flags" (barriers, semaphores) in memory - typical shared memory construct as in locks seen in Java Threads
  • Special hardware - as in cache and its coherency (coordination between nodes)
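The "flags in memory" style of synchronization can be sketched with threads sharing one counter. The foil mentions Java Threads; the example below uses Python's `threading` module as an analogue, which is a substitution made here for illustration. `locked_count` is an illustrative name.

```python
import threading


def locked_count(n_threads=4, increments=10000):
    """Have several threads update one shared counter, protected by a lock."""
    counter = 0
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)

    def worker():
        nonlocal counter
        barrier.wait()                  # barrier: all threads start the loop together
        for _ in range(increments):
            with lock:                  # lock: one thread at a time updates the counter
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(locked_count())
```

The lock makes each read-modify-write of the shared counter atomic, so no update is lost regardless of how the threads interleave.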

Foil 101 Some Major Hardware Architectures - MIMD

MIMD Distributed Memory
  • This is now best illustrated by a collection of computers on a network (i.e. a metacomputer)
MIMD with logically shared memory but usually physically distributed. The latter is sometimes called distributed shared memory.
  • In near future, ALL formal (closely coupled) MPP's will be distributed shared memory
  • Note all computers (e.g. current MIMD distributed memory IBM SP2) allow any node to get at any memory but this is done indirectly -- you send a message
  • In future "closely-coupled" machines, there will be built in hardware supporting the function that any node can directly address all memory of the system
  • This distributed shared memory architecture is currently of great interest to (a major challenge for) parallel compilers

Foil 102 MIMD Distributed Memory Architecture

A special case of this is a network of workstations (NOW's) or personal computers (metacomputer)
Issues include:
  • Node - CPU, Memory
  • Network - Bandwidth, Memory
  • Hardware Enhanced Access to distributed Memory

Foil 103 Some Major Hardware Architectures - SIMD

SIMD -- Single Instruction Multiple Data -- can have logically distributed or shared memory
  • Examples are CM-1,2 from Thinking Machines
  • and AMT DAP and Maspar which are currently focussed entirely on accelerating parts of database indexing
  • This architecture is of decreasing interest as it has reduced functionality without significant cost advantage compared to MIMD machines
  • Cost of synchronization in MIMD machines is not high!
  • Main interest of SIMD is flexible bit arithmetic as the processors are "small", but as transistor densities get higher this also becomes less interesting, since full function 64 bit CPUs only use a small fraction of the silicon of a modern computer

Foil 104 SIMD (Single Instruction Multiple Data) Architecture

CM2 - 64 K processors with 1 bit arithmetic - hypercube network, broadcast network can also combine, "global or" network
Maspar, DECmpp - 16 K processors with 4 bit (MP-1), 32 bit (MP-2) arithmetic, fast two-dimensional mesh and slower general switch for communication

Foil 105 Some Major Hardware Architectures - Mixed

Also have heterogeneous compound architecture (metacomputer) gotten by arbitrary combination of MIMD or SIMD, Sequential or Parallel machines.
Metacomputers can vary from full collections of several hundred PC's/Settop boxes on the (future) World Wide Web to a CRAY C-90 connected to a CRAY T3D
This is a critical future architecture which is intrinsically distributed memory, as multi-vendor heterogeneity implies that one cannot have special hardware enhanced shared memory
  • note that this can be a MIMD collection of SIMD machines if have a set of Maspar's on a network
  • One can think of human brain as a SIMD machine and then a group of people is such a MIMD collection of SIMD processors

Foil 106 Some MetaComputer Systems

Cluster of workstations or PC's
Heterogeneous MetaComputer System

Foil 107 Comments on Special Purpose Devices

One example is an Associative memory - SIMD or MIMD or content addressable memories
This is an example of a special purpose "signal" processing machine which can in fact be built from "conventional" SIMD or "MIMD" architectures
This type of machine is not so popular as most applications are not dominated by computations for which good special purpose devices can be designed
If only 10% of a problem is say "track-finding" or some special purpose processing, then who cares if you reduce that 10% by a factor of 100
  • You have only sped up the system by a factor 1.1 not by 100!
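The factor of 1.1 follows from the usual argument (often called Amdahl's law) that the unaccelerated part of the work still takes its full time. A minimal sketch, with `overall_speedup` as an illustrative name:

```python
def overall_speedup(fraction, factor):
    """Speedup of the whole job when `fraction` of it is made `factor` times faster."""
    remaining = (1.0 - fraction) + fraction / factor
    return 1.0 / remaining


if __name__ == "__main__":
    # 10% of the problem accelerated by a factor of 100, as in the text above.
    print(round(overall_speedup(0.10, 100.0), 2))
```

Even an infinitely fast special purpose device for that 10% could not push the overall speedup past 1/0.9, about 1.11.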

Foil 108 The GRAPE N-Body Machine

N body problems (e.g. Newton's laws for one million stars in a globular cluster) can have successful special purpose devices
See GRAPE (GRAvity PipE) machine (Sugimoto et al. Nature 345 page 90,1990)
  • Essential reason is that such problems need much less memory per floating point unit than most problems
  • Globular Cluster: 10^6 computations per datum stored
  • Finite Element Iteration: A few computations per datum stored
  • Rule of thumb is that one needs one gigabyte of memory per gigaflop of computation in general problems and this general design puts most cost into memory not into CPU.
Note GRAPE uses EXACTLY same parallel algorithm that one finds in the books (e.g. Solving Problems on Concurrent Processors) for N-body problems on classic distributed memory MIMD machines
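For orientation, the direct algorithm sums the gravitational pull of every other body on each body, so N stored positions generate N(N-1) interactions. The sketch below is a plain sequential illustration of that sum, not a description of the GRAPE hardware or its pipeline; the softening parameter and the name `direct_accelerations` are assumptions made for the example.

```python
import math


def direct_accelerations(positions, masses, G=1.0, softening=1e-3):
    """Return the acceleration on each body from all others by direct summation."""
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Vector from body i to body j.
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + softening ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
    return acc


if __name__ == "__main__":
    pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    print(direct_accelerations(pos, [1.0, 1.0, 1.0]))
```

The inner loop touches only a position and a mass per interaction, which illustrates why the memory needed per floating point operation is so small for this problem.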

Foil 109 Why isn't GRAPE a Perfect Solution?

GRAPE will execute the classic O(N^2) (parallel) N body algorithm BUT this is not the algorithm used in most such computations
Rather there is the O(N) or O(N log N) so-called "fast-multipole" algorithm, which uses a hierarchical approach
  • On one million stars, fast multipole is a factor of 100-1000 faster than GRAPE algorithm
  • fast multipole works in most but not all N-body problems (in globular clusters, extreme heterogeneity makes direct O(N^2) method most attractive)
So special purpose devices cannot usually take advantage of new nifty algorithms!

Foil 110 Granularity of Parallel Components - I

Coarse-grain: Task is broken into a handful of pieces, each executed by powerful processors.
  • Pieces and processors may be heterogeneous.
  • Computation/Communication ratio very high -- Typical of Networked Metacomputing
Medium-grain: Tens to few thousands of pieces, typically executed by microprocessors.
  • Processors typically run the same code.(SPMD Style)
  • Computation/communication ratio often hundreds or more.
  • Typical of MIMD Parallel Systems such as SP2, CM5, Paragon, T3D

Foil 111 Granularity of Parallel Components - II

Fine-grain: Thousands to perhaps millions of small pieces, executed by very small, simple processors (several per chip) or through pipelines.
  • Processors often have instructions broadcast to them.
  • Computation/Communication ratio often near unity.
  • Typical of SIMD but seen in a few MIMD systems such as Kogge's Execube, Dally's J Machine or commercial Myrinet (Seitz)
  • This is going to be very important in future petaflop architectures as the dense chips of year 2003 onwards favor this Processor in Memory Architecture
  • So many "transistors" in future chips that "small processors" of the "future" will be similar to today's high end microprocessors
  • As chips get denser, not realistic to put processors and memories on separate chips as granularities become too big
Note that a machine of given granularity can be used on algorithms of the same or finer granularity

Foil 112 Classes of Communication Networks

The last major architectural feature of a parallel machine is the network or design of hardware/software connecting processors and memories together.
Bus: All processors (and memory) connected to a common bus or busses.
  • Memory access fairly uniform, but not very scalable due to contention
  • Bus machines can be NUMA if memory consists of directly accessed local memory as well as memory banks accessed by Bus. The Bus accessed memories can be local memories on other processors
Switching Network: Processors (and memory) connected to routing switches like in telephone system.
  • Switches might have queues and "combining logic", which improve functionality but increase latency.
  • Switch settings may be determined by message headers or preset by controller.
  • Connections can be packet-switched (messages no longer than some fixed size) or circuit-switched (connection remains as long as needed)
  • Usually NUMA, blocking, often scalable and upgradable

Foil 113 Switch and Bus based Architectures

Switch
Bus

Foil 114 Examples of Interconnection Topologies

Two dimensional grid, Binary tree, complete interconnect and 4D Hypercube.
Communication (operating system) software ensures that the system appears fully connected even if the physical connections are partial
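The hypercube is easy to describe in code if one assumes the conventional labelling in which the 2^d nodes are numbered 0 to 2^d - 1 and two nodes are linked when their binary labels differ in exactly one bit. The sketch below uses that assumption; the function names are illustrative.

```python
def hypercube_neighbors(node, dim):
    """Labels of the nodes directly wired to `node` in a dim-dimensional hypercube."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hypercube_hops(a, b):
    """Minimum number of links a message crosses between nodes a and b."""
    return bin(a ^ b).count("1")      # number of differing bits


if __name__ == "__main__":
    print(hypercube_neighbors(0b0000, 4))
    print(hypercube_hops(0b0000, 0b1111))
```

In a 4D hypercube each of the 16 nodes has only 4 physical links, yet any pair is at most 4 hops apart; routing software hides the intermediate hops, so the machine looks fully connected to the program.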

Foil 115 Useful Concepts in Communication Systems

Useful terms include:
Scalability: Can network be extended to very large systems? Related to wire length (synchronization and driving problems), degree (pinout)
Fault Tolerance: How easily can system bypass faulty processor, memory, switch, or link? How much of system is lost by fault?
Blocking: Some communication requests may not get through, due to conflicts caused by other requests.
Nonblocking: All communication requests succeed. Sometimes just applies as long as no two requests are for same memory cell or processor.
Latency (delay): Maximal time for nonblocked request to be transmitted.
Bandwidth: Maximal total rate (MB/sec) of system communication, or subsystem-to-subsystem communication. Sometimes determined by cutsets, which cut all communication between subsystems. Often useful in providing lower bounds on time needed for task.
Worm Hole Routing -- Intermediate switch nodes do not wait for full message but allow it to pass through in small packets

Foil 116 Communication Performance of Some MPP's

From Aspects of Computational Science, Editor Aad van der Steen, published by NCF
System                 Communication Speed      Computation Speed
                       Mbytes/sec (per link)    Mflops/sec (per node)
IBM SP2                40                       267
Intel iPSC860          2.8                      60
Intel Paragon          200                      75
Kendall Square KSR-1   17.1                     40
Meiko CS-2             100                      200
Parsytec GC            20                       25
TMC CM-5               20                       128
Cray T3D               150                      300

Foil 117 Implication of Hardware Performance

tcomm = (4 or 8) / (Communication Speed in Mbytes/sec)
  • as there are 4 or 8 bytes in a floating point word
tfloat = 1 / (Computation Speed in Mflops/sec)
Thus tcomm / tfloat is just 4 X Computation Speed divided by Communication Speed (for 4 byte words)
tcomm / tfloat is 26.7, 85, 1.5, 9.35, 8, 5, 25.6, 8 for the machines SP2, iPSC860, Paragon, KSR-1, Meiko CS2, Parsytec GC, TMC CM5, and Cray T3D respectively
Latency makes the situation worse for small messages, and the ratio doubles for the 64 bit arithmetic natural on large problems!
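The quoted ratios can be reproduced from the per-link and per-node speeds in the communication performance table. The sketch below does that arithmetic; the dictionary simply restates the table values, and `comm_to_float_ratio` is an illustrative name.

```python
# (communication speed in Mbytes/sec per link, computation speed in Mflops per node)
MACHINES = {
    "IBM SP2": (40.0, 267.0),
    "Intel iPSC860": (2.8, 60.0),
    "Intel Paragon": (200.0, 75.0),
    "Kendall Square KSR-1": (17.1, 40.0),
    "Meiko CS-2": (100.0, 200.0),
    "Parsytec GC": (20.0, 25.0),
    "TMC CM-5": (20.0, 128.0),
    "Cray T3D": (150.0, 300.0),
}


def comm_to_float_ratio(mbytes_per_sec, mflops, bytes_per_word=4):
    """tcomm / tfloat: time to send one word divided by time for one flop."""
    t_comm = bytes_per_word / mbytes_per_sec    # microseconds per word
    t_float = 1.0 / mflops                      # microseconds per floating point operation
    return t_comm / t_float


if __name__ == "__main__":
    for name, (comm, comp) in MACHINES.items():
        print(name, round(comm_to_float_ratio(comm, comp), 2))
```

Passing `bytes_per_word=8` gives the 64 bit case, in which every ratio doubles.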

Foil 118 Latency and Bandwidth of a Network

Transmission Time for message of n bytes:
T0 + T1 n where
T0 is the latency, containing a term proportional to the number of hops. It also has a term representing interrupt processing time at the beginning and end of the transfer, and the time for the communication network and processor to synchronize
T0 = TS + Td * (Number of hops)
T1 is the inverse bandwidth -- it can be made small if the pipe is large.
In practice TS and T1 are most important and Td is unimportant
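The model is simple enough to evaluate directly. In the sketch below the parameter names mirror TS, Td and T1 from the formulas above; the numerical values in the example run are hypothetical and chosen only to show the behaviour, not measurements of any machine.

```python
def transmission_time(n_bytes, t_s, t_d, hops, t_1):
    """Time to send n_bytes: T0 + T1 * n, with T0 = TS + Td * hops."""
    t_0 = t_s + t_d * hops          # latency: start-up cost plus per-hop delay
    return t_0 + t_1 * n_bytes      # plus inverse bandwidth times message length


if __name__ == "__main__":
    # Hypothetical values in microseconds: TS = 50, Td = 0.5 per hop, T1 = 0.025 per byte.
    for n in (0, 100, 10000, 1000000):
        print(n, transmission_time(n, t_s=50.0, t_d=0.5, hops=4, t_1=0.025))
```

With numbers of this kind, short messages are dominated by TS and long messages by T1 * n, while the per-hop term Td contributes little either way.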

Foil 119 Transfer Time in Microseconds for both Shared Memory Operations and Explicit Message Passing

Dongarra and Dunigan: Message-Passing Performance of Various Computers, August 1995

Foil 120 Latency/Bandwidth Space for 0-byte message (Latency) and 1 MB message (Bandwidth)

Square blocks indicate shared memory copy performance
Dongarra and Dunigan: Message-Passing Performance of Various Computers, August 1995

Foil 121 The Federal Program Focusing on 1996 Highlights with many exciting Applications


Foil 122 1996 Blue Book


Foil 123 1996 Blue Book (1 of 3)

Executive Summary
I. Introduction
II. Program Accomplishments and Plan
1. High Performance Communications
  • Internetworking R&D
  • Gigabit Speed Networking R&D
  • Wireless Technologies
  • R&D for Network Integrated Computing
  • Enhanced Internet Connectivity
2. High Performance Computing Systems
  • Performance Accomplishments
  • Microsystems
  • Embedded Systems
  • Networks of Workstations
  • Rapid Prototyping Facility
  • Specialized Very High Performance Architectures
  • Mass Storage
3. Advanced Software Technologies
  • Systems Software
  • Programming Languages and Compilers
  • Software Tools
  • Computational Techniques
  • Performance Measurement
  • Benchmarking
  • Software Sharing
  • Visualization

Foil 124 1996 Blue Book (2 of 3)

4. Technologies for the Information Infrastructure
  • Information Infrastructure Services Technologies
  • World Wide Web (WWW) and NCSA Mosaic
  • Security and Privacy
  • Information Infrastructure Applications Technologies
5. High Performance Computing Research Facilities
  • NSF Supercomputer Centers
  • NSF Science and Technology Centers
  • NASA Testbeds
  • DOE Laboratories
  • NIH Systems
  • NOAA Laboratories
  • EPA Systems
6. Grand Challenge Applications
  • Applied Fluid Dynamics
  • Meso- to Macro-Scale Environmental Modeling
  • Ecosystem Simulations
  • Biomedical Imaging and Biomechanics
  • Molecular Biology
  • Molecular Design and Process Optimization
  • Cognition
  • Fundamental Computational Sciences
  • Grand-Challenge-Scale Applications

Foil 125 1996 Blue Book (3 of 3)

7. National Challenge Applications
  • Digital Libraries
  • Public Access to Government Information
  • Electronic Commerce
  • Civil Infrastructure
  • Education and Lifelong Learning
  • Energy Management
  • Environmental Monitoring
  • Health Care
  • Manufacturing Processes and Products
8. Basic Research and Human Resources
  • Basic Research
  • Training and Education
III. HPCC Program Organization
IV. HPCC Program Summary
V. References
VI. Glossary
VII. Contacts

Foil 126 The Application Motivation for HPCC


Foil 127 High Performance Computing Research Facilities

NSF Supercomputing Centers
NSF Science and Technology Centers
NASA Testbeds
DOE Laboratories
NIH Systems
NOAA Laboratories
EPA Systems

Foil 128 Grand Challenge Applications

Applied Fluid Dynamics
Meso- to Macro-Scale Environmental Modeling
Ecosystem Simulations
Biomedical Imaging and Biomechanics
Molecular Biology
Molecular Design and Process Optimization
Cognition
Fundamental Computational Sciences
Grand-Challenge-Scale Applications

Foil 129 Applied Fluid Dynamics

Computational Aeroscience
Coupled Field Problems and GAFD (Geophysical and Astrophysical Fluid Dynamics) Turbulence
Combustion Modeling: Adaptive Grid Methods
Oil Reservoir Modeling: Parallel Algorithms for Modeling Flow in Permeable Media
Numerical Tokamak Project (NTP)

Foil 130 Coupled Field Problems and GAFD Turbulence

An image from a video illustrating the flutter analysis of a FALCON jet under a sequence of transonic speed maneuvers. Areas of high stress are red; areas of low stress are blue.

Foil 131 Numerical Tokamak Project

Particle trajectories and electrostatic potentials from a three-dimensional implicit tokamak plasma simulation employing adaptive mesh techniques. The boundary is aligned with the magnetic field that shears around the torus. The strip in the torus is aligned with the local magnetic field and is color mapped with the local electrostatic potential. The yellow trajectory is the gyrating orbit of a single ion.

Foil 132 Meso- to Macro-Scale Environmental Modeling

Massively Parallel Atmospheric Modeling Projects
Parallel Ocean Modeling
Mathematical Modeling of Air Pollution Dynamics
A Distributed Computational System for Large Scale Environmental Modeling
Cross-Media (Air and Water) Linkage
Adaptive Coordination of Predictive Models with Experimental Data
Global Climate Modeling
Four-Dimensional Data Assimilation for Massive Earth System Data Analysis

Foil 133 Mathematical Modeling of Air Pollution Dynamics

Ozone concentrations for the California South Coast Air Basin predicted by the Caltech research model show a large region in which the national ozone standard of 120 parts per billion (ppb) is exceeded. Measurement data corroborate these predictions. Scientific studies have shown that human exposure to ozone concentrations at or above the standard can impair lung functions in people with respiratory problems and can cause chest pain and shortness of breath even in the healthy population. This problem raises concern since more than 30 urban areas across the country still do not meet the national standard.

Foil 134 Global Climate Modeling

The colored plane floating above the block represents the simulated atmospheric temperature change at the earth's surface, assuming a steady one percent per year increase in atmospheric carbon dioxide to the time of doubled carbon dioxide. The surfaces in the ocean show the depths of the 1.0 and 0.2 degree (Celsius) temperature changes. The Southern Hemisphere shows much less surface warming than the Northern Hemisphere. This is caused primarily by the cooling effects of deep vertical mixing in the oceans south of 45 degrees South latitude. Coupled ocean-atmosphere climate models such as this one from NOAA/GFDL help improve scientific understanding of potential climate change.

Foil 135 4-D Data Assimilation

A scientist uses NASA's virtual reality modeling resources to explore the Earth's atmosphere as part of the Earth and Space Science Grand Challenge.

Foil 136 Eco Simulations

Environmental Chemistry
Groundwater Transport and Remediation
Earthquake Ground Motion Modeling in Large Basins: The Quake Project
High Performance Computing for Land Cover Dynamics
Massively Parallel Simulations of Large-Scale, High-Resolution Ecosystem Models

Foil 137 Biomedical Imaging and Biomechanics

Visible Human Project
Reconstruction of Positron Emission Tomography (PET) Images
Image Processing of Electron Micrographs
Understanding Human Joint Mechanisms

Foil 138 Molecular Biology

Protein and Nucleic Sequence Analysis
Protein Folding Prediction
Ribonucleic Acid (RNA) Structure Prediction

Foil 139 Molecular Design

Biological Applications of Quantum Chemistry
Biomolecular Design
Biomolecular Modeling and Structure Determination
Computational Structural Biology
Biological Methods for Enzyme Catalysis

Foil 140 Biomolecular Modeling and Structure Determination

A portion of the Glucocorticoid Receptor bound to DNA; the receptor helps to regulate expression of the genetic code.

Foil 141 Fundamental Computational Sciences

Quantum Chromodynamics
High Capacity Atomic-Level Simulations for the Design of Materials
First Principles Simulation of Materials Properties
Black Hole Binaries: Coalescence and Gravitational Radiation
Scalable Hierarchical Particle Algorithms for Galaxy Formation and Accretion Astrophysics
Radio Synthesis Imaging
Large Scale Structure and Galaxy Formation

Foil 142 Binary Black Holes Simulation

The Alliance will produce an accurate, efficient description of the coalescence of black holes, and the gravitational radiation emitted, by computationally solving Einstein's equations for gravitational fields, with direct application to the gravity-wave detection systems LIGO and VIRGO under construction in the USA and Europe.

Foil 143 The Binary Black Hole Grand Challenge Alliance

The Alliance (Austin, Chapel Hill, Cornell, NCSA, Northwestern, Penn State, Pittsburgh, NPAC) has Formal Goals:
To develop a problem solving environment for the nonlinear Einstein equations describing General Relativity, including a dynamical adaptive multilevel parallel infrastructure
To provide controllable convergent algorithms to compute gravitational waveforms which arise from Black Hole encounters, and which are relevant to astrophysical events and may be used to predict signals for detection by future ground- and space-based detectors.
  • This code will be made available to researchers in Computational Relativity (by publication and via the World Wide Web).
To provide representative examples of computational waveforms.
http://www.npac.syr.edu/projects/bbh/bbh.html

Foil 144 BBH: Computational Challenge

Problem size: Analysis with Uniform Grid
  • Requested spatial resolution: 50 mesh points per black hole (radius of event horizon, R)
  • To extract gravitational waves a space region of ~100 R is necessary
  • Number of mesh points: (50 x 100)^3 => ~10^11
  • Time evolution: 50,000 steps (corresponds to distance ~1000 R with dt=dx)
  • Total number of events: ~10^16
  • Floating point operations per event: ~10^4
  • Total FLOP count: 10^20 => 30 years of a Teraflop machine!
Solution: Adaptive Mesh Refinement
  • one week of a Teraflop machine
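
The arithmetic behind these estimates can be checked with a short sketch. The grid, step and per-event numbers are the foil's own round figures; the `sustained_fraction` parameter is an assumption added here and is not stated on the foil. At full peak speed, 10^20 operations on a 10^12 flop/s machine take roughly 3 years, so the quoted 30 years would be consistent with sustaining about one tenth of peak.

```python
# Back-of-envelope check of the uniform-grid estimate on this foil.
# Inputs are the foil's round numbers; sustained_fraction is an assumption.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def uniform_grid_estimate(points_per_radius=50, domain_in_radii=100,
                          time_steps=50_000, flops_per_event=1e4):
    """Return (mesh_points, events, total_flops) for a uniform 3D grid."""
    mesh_points = (points_per_radius * domain_in_radii) ** 3
    events = mesh_points * time_steps          # one event = one point at one step
    total_flops = events * flops_per_event
    return mesh_points, events, total_flops


def years_on_machine(total_flops, peak_flops_per_sec=1e12,
                     sustained_fraction=1.0):
    """Wall-clock years at a given peak rate and sustained fraction of peak."""
    seconds = total_flops / (peak_flops_per_sec * sustained_fraction)
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    mesh_points, events, total_flops = uniform_grid_estimate()
    print(f"mesh points: {mesh_points:.2e}")    # order 10^11
    print(f"events:      {events:.2e}")         # order 10^16
    print(f"total flops: {total_flops:.2e}")    # order 10^20
    print(f"years at peak:        {years_on_machine(1e20):.1f}")
    print(f"years at 10% of peak: {years_on_machine(1e20, sustained_fraction=0.1):.1f}")
    # Ratio between the quoted uniform-grid time and the quoted AMR time
    print(f"30 years / 1 week:    {30 * 365.25 / 7:.0f}x")
```

Taken at face value, the foil's two figures (30 years against one week) imply that adaptive mesh refinement cuts the work by a factor of about 1,500.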

Foil 145 Adaptive Multilevel Parallel Infrastructure

Einstein's equations can be represented as a coupled system of hyperbolic and elliptic PDEs with non-trivial boundary conditions to be solved using adaptive multilevel methods
We are building a PSE (problem solving environment) that will support:
  • composition of stable, convergent AMR and MG solvers
  • software integration (initial value problem, apparent horizon finders, ...)
  • automatic conversion of sequential unigrid codes into parallel, multigrid versions
  • collaborative visualization environment
To implement the system we use technologies developed by CRPC, in particular MPI and HPF, combined with emerging new Web technologies: JAVA and VRML 2.0.

Foil 146 First Principles Simulation of Materials Properties

Foil 147 Large Scale Structure and Galaxy Formation

Simulation of gravitational clustering of dark matter. This detail shows one sixth of the volume computed in a cosmological simulation involving 16 million highly clustered particles that required load balancing on a massively parallel computing system. Many particles are required to resolve the formation of individual galaxy halos seen here as red/white spots.

Foil 148 Grand-Challenge-Scale Applications

Simulation of Chorismate Mutase
Simulation of Antibody-Antigen Association
A Realistic Ocean Model
Drag Control
The Impact of Turbulence on Weather/Climate Prediction
Shoemaker-Levy 9 Collision with Jupiter
Vortex Structure and Dynamics in Superconductors
Molecular Dynamics Modeling
Crash Simulation
Advanced Simulation of Chemically Reacting Flows

Foil 149 Visible Human

Foil 150 A Realistic Ocean Model

Simulation of circulation in the North Atlantic. Color shows temperature, red corresponding to high temperature. In most prior modeling, the Gulf Stream turns left past Cape Hatteras, clinging to the continental shoreline. In this simulation, however, the Gulf Stream veers off from Cape Hatteras on a northeast course into the open Atlantic, following essentially the correct course.

Foil 151 Shoemaker-Levy 9 Collision with Jupiter

Impact of the comet fragment. Image height corresponds to 1,000 kilometers. Color represents temperature, ranging from tens of thousands of degrees Kelvin (red), several times the temperature of the sun, to hundreds of degrees Kelvin (blue).

Foil 152 Crash Simulation

Illustrative of the computing power at the Center for Computational Science is the 50 percent offset crash of two Ford Taurus cars moving at 35 mph shown here. The Taurus model is detailed; the results are useful in understanding crash dynamics and their consequences. These results were obtained using parallel DYNA-3D software developed at Oak Ridge. Run times of less than one hour on the most powerful machine are expected.

Foil 153 National Challenge Applications

Digital Libraries
Public Access to Government Information
Electronic Commerce
Civil Infrastructure
Education and Lifelong Learning
Energy Management
Environmental Monitoring
Health Care
Manufacturing Processes and Products

Foil 154 A Survey of New York State Industrial Opportunities for HPCC was very influential for me and my group (NPAC)

Foil 155 Categories of Industrial and Government Applications of HPCC (with reference to academic applications)

Define information generally to include both CNN headline news and the insights on QCD gotten from lattice gauge theories
Information Production e.g. Simulation
  • Major concentration of MPP and HPCC at present
Information Analysis e.g. Extraction of location of oil from seismic data, Extraction of customer preferences from purchase data
  • Growing area of importance and Short term major MPP opportunity in decision support combined with parallel databases
Information Access and Dissemination - InfoVision e.g. Transaction Processing, Video-On-Demand
  • Enabled by National Information Infrastructure
  • Very promising medium term market for MPP but need the NII to be reasonably pervasive before area "takes off"
Information Integration
  • Integrates Information Production, Analysis and Access, e.g.
    • Decision support in business
    • Command and Control for Military
    • Concurrent Engineering and Agile Manufacturing
  • Largest Long Term Market for MPP

Foil 156 The 33 Application areas were studied in detail:
Simulation (Roughly the Grand Challenges)

1:Computational Fluid Dynamics
2:Structural Dynamics
3:Electromagnetic Simulation
4:Scheduling
5:Environmental Modelling (with PDE's)
6:Environmental Phenomenology
7:Basic Chemistry
8:Molecular Dynamics
9:Economic Modelling
10:Network Simulations
11:Particle Transport Problems
12:Graphics
13:Integrated Complex Systems Simulations

Foil 157 The 33 Application areas were studied in detail:
Information Analysis -- DataMining

14:Seismic and Environmental Data Analysis
15:Image Processing
16:Statistical Analysis
17:Healthcare Fraud
18:Market Segmentation
Growing Area of Importance and reasonable near term MPP opportunity in decision support combined with parallel (relational) databases

Foil 158 The 33 Application areas were studied in detail:
InfoVision: Information, Video, Imagery and Simulation on Demand

19:Transaction Processing
20:Collaboration Support
21:Text on Demand
22:Video on Demand
23:Imagery on Demand
24:Simulation on Demand (education, financial modelling, etc.) -- simulation is a "media"!
MPP's as High Performance Multimedia (database) servers -- WebServers
Excellent Medium term Opportunity for MPP enabled by National Information Infrastructure

Foil 159 The 33 Application areas were studied in detail:
Information Integration combining Simulation, Analysis and InfoVision

25:Military and Civilian Command and Control (Crisis Management)
26:Decision Support for Society (Community Servers)
27:Business Decision Support
28:Public Administration and Political Decision (Judgement) Support
29:Real-Time Control Systems
30:Electronic Banking
31:Electronic Shopping
32:(Agile) Manufacturing including Multidisciplinary Design/Concurrent Engineering
33:Education at K-12, University and Continuing levels
Largest Application of any Computer and Dominant HPCC Opportunity

Foil 160 Some detailed Analysis of Opportunities for HPCC in the Science and Engineering Simulation Arena

Foil 161 Opportunities for HPCC in the Science and Engineering Simulation Arena

In spite of the large and very successful national activity, simulation will not be a large "real world" sales opportunity for MPP's
  • The difficulties of Thinking Machines may illustrate this
However some areas of national endeavor will be customers for MPP's used for simulation
  • Large Scale Academic Calculations
    • Value of Increased Computation demonstrated in many disciplines
    • Codes are sufficiently small that software engineering considerations of adapting 1,000,000 lines not so important
  • Petroleum Industry
    • Reservoir Simulation
    • Seismic Data Analysis
  • Some Earth and Space Science including
    • Climate and Weather Forecasting

Foil 162 Some Simulation Areas which will be Difficult to exploit in near term

Some areas which may adopt HPCC for simulation in the relatively near future
  • Pharmaceutical Industry
    • Intense and brilliant academic (government research laboratory) effort in biochemical molecular modelling
    • But "Computer Designed Drugs" are not sufficiently promising to clearly justify purchase of large MPP's by drug industry
  • Financial Industry
    • MPP's being used by Prudential but in spite of success, they are not yet being generally adopted
    • Networks of Workstations are severe competition, as many problems are "embarrassingly parallel"
  • Electrical Power Industry
    • Value seems clear for planning and real time control but
    • Industry conservative and faced with growing near term competition

Foil 163 Surprisingly Difficult and Surprisingly Promising Areas for HPCC in Simulation

The role of HPCC in Manufacturing is quite clear and will be critical to
  • Agile Manufacturing and the year 2010 Manufacturing Industry but for
    • Major fields including
    • Aircraft
    • Cars
  • HPCC will not have a major impact for simulation in the next few years
On the other hand for
  • War Games and Simulations of Complex Scenarios
  • Role of MPP's can be expected to grow especially when coupled as in (old) SIMNET with high speed geographically distributed networks
  • Note this is different basic software technology
    • Event driven -- not time stepped -- simulation
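
The distinction between event driven and time stepped simulation can be illustrated with a minimal sketch. The event names and times below are hypothetical, and this is not SIMNET's actual design; it only shows that an event driven loop does work once per event, while a time stepped loop does work once per clock tick whether or not anything happens.

```python
import heapq


def run_event_driven(events):
    """Process (time, name) events in time order; the clock jumps between events."""
    queue = list(events)
    heapq.heapify(queue)                 # priority queue keyed on event time
    log = []
    while queue:
        when, name = heapq.heappop(queue)
        log.append((when, name))         # one loop iteration per event
    return log


def run_time_stepped(events, dt, end_time):
    """Advance a fixed-step clock and fire events whose time has been reached."""
    pending = sorted(events)
    log = []
    index = 0
    n_steps = int(round(end_time / dt))
    for step in range(1, n_steps + 1):   # one loop iteration per tick
        now = step * dt
        while index < len(pending) and pending[index][0] <= now:
            log.append(pending[index])
            index += 1
    return log, n_steps


if __name__ == "__main__":
    # Hypothetical scenario events, given out of order on purpose.
    scenario = [(250.0, "resupply"), (0.5, "detect"), (12.0, "engage")]
    event_log = run_event_driven(scenario)
    stepped_log, ticks = run_time_stepped(scenario, dt=0.5, end_time=300.0)
    print(f"event driven: {len(event_log)} iterations")
    print(f"time stepped: {ticks} iterations for the same {len(stepped_log)} events")
```

For sparse, irregular activity the event driven form wastes no work on idle ticks, which is one reason it calls for different software technology from the time stepped PDE codes discussed elsewhere in these foils.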

Foil 164 Why is it hard to use HPCC in Manufacturing-I?

Return on Investment Unclear:
  • Amdahl's law for use of HPCC in Industrial Simulation
    • If simulation was only 10% or less of the original design and manufacturing cycle, then one can gain at most this 10% by speeding up simulation
    • And this speedup comes at huge software engineering cost!
    • Codes are long and expertise to convert to parallelism may no longer exist in new "slim" companies after layoffs, buyouts and freeze on hiring new employees with knowledge of new technologies such as HPCC
    • New codes must be validated by extensive tests before use
    • Remember we can't solve full Navier-Stokes Equations yet and so some approximations necessary
The Industry is in a very competitive situation and focussed on short term needs
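
The Amdahl's law argument above can be made concrete with a small sketch. The 10% simulation share is the foil's figure; the function name and the sample speedups are illustrative assumptions.

```python
def cycle_speedup(simulation_fraction, simulation_speedup):
    """Amdahl's law: overall speedup when only the simulation part is accelerated.

    simulation_fraction: share of the design/manufacturing cycle spent in simulation
    simulation_speedup:  factor by which HPCC speeds up that share
    """
    if not 0.0 <= simulation_fraction <= 1.0:
        raise ValueError("simulation_fraction must lie in [0, 1]")
    if simulation_speedup <= 0:
        raise ValueError("simulation_speedup must be positive")
    remaining = (1.0 - simulation_fraction) + simulation_fraction / simulation_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    for speedup in (10, 100, float("inf")):
        overall = cycle_speedup(0.10, speedup)
        saved = 1.0 - 1.0 / overall      # fraction of the whole cycle saved
        print(f"simulation {speedup}x faster: cycle {overall:.3f}x faster, "
              f"{saved:.1%} of cycle time saved")
```

Even an infinitely fast simulation saves only the 10% of the cycle that simulation occupied, an overall speedup of about 1.11.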

Foil 165 Why is it hard to use HPCC in Manufacturing-II?

At a March 1994 ARPA meeting in Washington, Boeing (Neves) endorsed parallel databases and not parallel simulation
  • Similar comment made to me by Major Brokerage
    • "Financial Modelling (on MPP) gets the headlines but information services are the critical problem"
Aerospace Engineers are just like University Faculty
  • They prefer to use their own workstations and not central Supercomputers
There is perhaps some general decline of Supercomputer Industry
  • As performance of technology increases
  • Users don't take full advantage of this performance Increase
    • Rather buy somewhat more powerful computers at somewhat lower cost

Foil 166 Multidisciplinary Analysis and Design as a Critical use of HPCC in Manufacturing?

MAD (Multidisciplinary Analysis and Design) links:
  • Structural, Fluid flow, electromagnetic signature, manufacturing process computations
  • Design, Manufacturing, Sales and Support Functions
(Includes MDO -- Multidisciplinary Optimization)
Link Simulation and CAD Processes
  • Technically link CAD databases (using parallel database technology) to MPP simulations
This is a really important application of HPCC, as it addresses "Amdahl's Law": we use HPCC to support the full manufacturing cycle -- not just one part! Thus large improvements in manufacturers' time to market and product quality are possible.
BUT one must change and, even harder, integrate:
  • ALL software used in Manufacturing and this now comes from different vendors
  • The way of doing business in company
    • New job skills and cultures -- the hardest problem

Foil 167 Role of Government and DoD in HPCC Simulation Applications

The limited near-term industrial use of HPCC implies that it is critical for Government and DoD to support and promote it
DoD Simulation: Dual-Use Philosophy implies
  • Can use Commercial MPP hardware and basic systems software
  • Cannot rely on commercial market for application and sophisticated systems software and indeed hardware targeted at engineering and science simulation
Manufacturing Support can lead to future US Industry leadership in advanced HPCC based manufacturing environments 10-20 years from now
  • Industry cannot afford, or even consider, the long-term investment needed to integrate HPCC into manufacturing
  • Government should support long-term not short-term needs
  • Government must involve manufacturing Industry in its plans
  • Currently federal Initiatives are correctly involving Industry in more major fashion than before but focussing on short term needs

Foil 168 The HPCC Software Industry is not Viable in Simulation Area?

An HPCC Software Industry is essential if the HPCC field is to become commercially successful
The HPCC Simulation market is small
This market is not used to paying true cost for software
  • As cost is traditionally bundled with hardware and one can get Federal Grants to develop software yourself ...
There is a lot of excellent available public domain software (funded by federal government)
Small Businesses are natural implementation of HPCC Software Industry
  • Plenty of talented Entrepreneurs
Two InfoMall Success Stories
  • Portland Group: Commercializing High Performance Fortran Compiler developed at NPAC. We are not competing with them but adding value
  • Applied Parallel Technologies: Developing with NIST ATP and Venture Capital portable database exploitation tools
    • NPAC provides HPCC facilities, expertise and contact with best of class international software activities

Foil 169 Anecdotes from HPCC Software Industry Arena

Anecdotes from Thinking Machines (TMC) April 94 before the fall
  • "They would like to be a software company but you can only sell software if bundled with hardware"
  • Customers did not buy TMC hardware because TMC's software was too good and so one couldn't then get the federal grants to improve parallel systems software
Anecdote from Digital September 94:
  • Digital cannot make money on their scientific software package for Alpha workstations
  • If Digital charged true cost of development and maintenance, users would make do with good (but not optimized or as complete) public domain software
  • So if workstation market not viable for simulation software, how can much smaller MPP market lead to viable business plans?

Foil 170 National Challenges will drive the adoption of HPCC in the "Real World"

These can be defined simply as those HPCC applications which have sufficient market to sustain a true balanced HPCC computing Industry with viable hardware and software companies
  • With this definition, some "Grand Challenges" such as Oil Exploration are National Challenges
Alternatively one can define National Challenges by the HPCC technologies exploited
  • High speed geographically distributed (ATM) networks i.e.
  • The National Information Infrastructure (NII) with several hundred million clients and perhaps some 10,000 MPP based high performance multi-media servers
  • Large scale text, Image and Video databases fed by Satellites, Information produced by National Enterprise such as credit card slips etc.

Foil 171 From the Grand (Simulation) Challenges to the National (Information) Challenges

Foil 172 Characteristics of Grand Challenges

Partial Differential Equations
Particle Dynamics and Multidisciplinary Integration
Image Processing
Some:
Visualization
Artificial Intelligence
Not Much:
Network Simulation
Economic (and other complex system) modeling
Scheduling
Manufacturing
Education
Entertainment
Information Processing
BMC3IS (Command & Control in military war)
Decision Support in global economic war

Foil 173 Federal 1994 Blue Book Comparison of National and Grand Challenges

Foil 174 Come to CPS616 for a detailed discussion of the National Challenges and the National Information Infrastructure


© Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu

Page produced by wwwfoil on Wed Aug 27 1997