CPS615-Lecture on Performance(end) and Computer Technologies(start)
Given by Geoffrey C. Fox at Delivered Lectures of CPS615 Basic Simulation Track for Computational Science on 5 September 96. Foils prepared 15 September 1996
- This starts by considering the analytic form for communication overhead and illustrates its stencil dependence in simple local cases -- stressing relevance of grain size
- The implication for scaling and generalizing from Laplace example is covered
- We covered scaled speedup (fixed grain size) as well as fixed problem size
- We noted some useful material was missing and this was continued in next lecture (Sept 10,96)
- The lecture starts coverage of computer architecture with the base technologies: CMOS (covered in an earlier lecture) is contrasted with Quantum and Superconducting technology
Table of Contents for full HTML of CPS615-Lecture on Performance(end) and Computer Technologies(start)
1 Delivered Lectures for CPS615 -- Base Course for the Simulation Track of Computational Science, Fall Semester 1996 -- Lecture of September 5 - 1996
2 Abstract of Sept 5 1996 CPS615 Lecture
3 Communication Overhead
4 Analytical Form of Speed Up for Communication Overhead
5 General Form of Efficiency
6 Communication to Calculation Ratio as a function of template
7 Matrix Multiplication on the Hypercube
8 Abstract of Laplace Example for CPS615
9 Abstract of The Current Status and Futures of HPCC
10 Technologies for High Performance Computers
11 Architectures for High Performance Computers - I
12 Architectures for High Performance Computers - II
13 There is no Best Machine!
14 Quantum Computing - I
15 Quantum Computing - II
16 Quantum Computing - III
17 Superconducting Technology -- Past
18 Superconducting Technology -- Present
19 Superconducting Technology -- Problems
20 Superconducting Technology -- Present
Foil 1 Delivered Lectures for CPS615 -- Base Course for the Simulation Track of Computational Science
Fall Semester 1996 --
Lecture of September 5 - 1996
From CPS615-Lecture on Performance(end) and Computer Technologies(start) Delivered Lectures of CPS615 Basic Simulation Track for Computational Science -- 5 September 96.
- Geoffrey Fox
- NPAC
- Room 3-131 CST
- 111 College Place
- Syracuse NY 13244-4100
Foil 2 Abstract of Sept 5 1996 CPS615 Lecture
- This starts by considering the analytic form for communication overhead and illustrates its stencil dependence in simple local cases -- stressing relevance of grain size
- The implication for scaling and generalizing from Laplace example is covered
- We covered scaled speedup (fixed grain size) as well as fixed problem size
- We noted some useful material was missing and this was continued in next lecture (Sept 10,96)
- The lecture starts coverage of computer architecture with the base technologies: CMOS (covered in an earlier lecture) is contrasted with Quantum and Superconducting technology
Foil 3 Communication Overhead
Critical Information in IMAGE
- Suppose communicating a single word (here a φ value, probably stored in a 4 byte word) takes time tcomm
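As a rough illustration of how tcomm enters the performance analysis, the sketch below assumes the usual edge-over-area model for a 2D nearest-neighbour stencil. Each processor is assumed to hold a square block of n grid points (the grain size), to spend about 4 n tcalc computing and about 4 sqrt(n) tcomm communicating per iteration, so that the fractional overhead is fc = tcomm / (sqrt(n) tcalc) and the speedup on N processors is N / (1 + fc). The constants, the symbol tcalc, the numerical values and the function names are assumptions made for this example; they are not taken from the foil images.

```python
import math


def overhead(n, t_comm, t_calc):
    """Fractional communication overhead fc for a square block of n points."""
    calc_time = 4 * n * t_calc             # assumed: 4 operations per grid point
    comm_time = 4 * math.sqrt(n) * t_comm  # assumed: one word per edge point, 4 edges
    return comm_time / calc_time


def efficiency(n, t_comm, t_calc):
    """Efficiency 1 / (1 + fc) for grain size n."""
    return 1.0 / (1.0 + overhead(n, t_comm, t_calc))


def speedup(num_procs, n, t_comm, t_calc):
    """Speedup N / (1 + fc) on num_procs processors with grain size n."""
    return num_procs * efficiency(n, t_comm, t_calc)


t_comm, t_calc = 10.0, 1.0  # hypothetical ratio tcomm / tcalc = 10

# Scaled speedup: grain size n is fixed, so efficiency is independent of N.
for procs in (4, 16, 64):
    print("scaled", procs, round(speedup(procs, 10000, t_comm, t_calc), 2))

# Fixed problem size: n = total / N shrinks as N grows, so efficiency drops.
total_points = 40000
for procs in (4, 16, 64):
    grain = total_points / procs
    print("fixed", procs, round(efficiency(grain, t_comm, t_calc), 3))
```

Under these assumptions the overhead depends only on the grain size n and the hardware ratio tcomm / tcalc, which is why the scaled (fixed grain size) case keeps constant efficiency while the fixed problem size case does not.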
Foil 4 Analytical Form of Speed Up for Communication Overhead
Critical Information in IMAGE
Foil 5 General Form of Efficiency
Critical Information in IMAGE
Foil 6 Communication to Calculation Ratio as a function of template
Critical Information in IMAGE
Foil 7 Matrix Multiplication on the Hypercube
Critical Information in IMAGE
- Showing linear overhead behavior for fc
Foil 8 Abstract of Laplace Example for CPS615
- This takes Jacobi Iteration for Laplace's Equation in a 2D square and uses this to illustrate:
- Programming in both Data Parallel (HPF) and Message Passing (MPI and a simplified Syntax)
- SPMD -- Single Program Multiple Data -- Programming Model
- Stencil dependence of Parallel Program and use of Guard Rings
- Collective Communication
- Basic Speed Up, Efficiency and Performance Analysis with edge over area dependence and consideration of load imbalance and communication overhead effects.
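The Jacobi iteration named above can be sketched sequentially as follows. This is a minimal illustration, not the HPF or MPI code from the Laplace lecture: the grid is a plain list of lists, the outer ring of cells holds fixed boundary values, and the function names are hypothetical. In a message passing version the same outer ring would serve as the guard ring, filled with copies of the neighbouring processors' edge values before each sweep (that exchange is not shown here).

```python
def make_grid(size, top_value):
    """size x size interior plus an outer ring; top edge held at top_value."""
    grid = [[0.0] * (size + 2) for _ in range(size + 2)]
    grid[0] = [top_value] * (size + 2)
    return grid


def jacobi_sweep(phi):
    """One Jacobi iteration; the outer ring of phi is left unchanged."""
    rows, cols = len(phi), len(phi[0])
    new = [row[:] for row in phi]  # copy so every update reads old values only
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Five-point stencil: average of the four nearest neighbours.
            new[i][j] = 0.25 * (phi[i - 1][j] + phi[i + 1][j]
                                + phi[i][j - 1] + phi[i][j + 1])
    return new


def solve(phi, sweeps):
    """Apply a fixed number of Jacobi sweeps."""
    for _ in range(sweeps):
        phi = jacobi_sweep(phi)
    return phi


result = solve(make_grid(4, 1.0), 50)
print([round(value, 3) for value in result[1]])
```

Each interior update reads only its four nearest neighbours, so a processor owning a square block needs values from other processors only along the block edges; this is the origin of the edge over area dependence mentioned above.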
Foil 9 Abstract of The Current Status and Futures of HPCC
- Overview of Course Itself! -- and then introductory material on basic curricula
- Overview of National Program -- The Grand Challenges
- Overview of Technology Trends leading to petaflop performance in year 2007 (hopefully)
- Overview of Syracuse and National programs in computational science
- Parallel Computing in Society
- Why Parallel Computing works
- Simple Overview of Computer Architectures
- SIMD, MIMD, Distributed (shared memory) Systems ... PIM ... Quantum Computing
- General Discussion of Message Passing and Data Parallel Programming Paradigms and a comparison of languages
Foil 10 Technologies for High Performance Computers
- We can choose technology and architecture separately in designing our high performance system
- Technology is like choosing ants, people or tanks as basic units in our society analogy
- or less frivolously neurons or brains
- In HPCC arena, we can distinguish current technologies
- COTS (Commercial off the shelf) Microprocessors
- Custom node computer architectures
- More generally these are all CMOS technologies
- Near term technology choices include
- Gallium Arsenide or Superconducting materials as opposed to Silicon
- These are faster by a factor of 2 (GaAs) to 300 (Superconducting)
- Further term technology choices include
- DNA (Chemical) or Quantum technologies
- It will cost $40 Billion for next industry investment in CMOS plants and this huge investment makes it hard for new technologies to "break in"
Foil 11 Architectures for High Performance Computers - I
- Architecture is equivalent to organization or design in society analogy
- Different models for society (Capitalism etc.) or different types of groupings in a given society
- Businesses or Armies are more precisely controlled/organized than a crowd at the State Fair
- We will generalize this to formal (army) and informal (crowds) organizations
- We can distinguish formal and informal parallel computers
- Informal parallel computers are typically "metacomputers"
- i.e. a bunch of computers sitting on a department network
Foil 12 Architectures for High Performance Computers - II
- Metacomputers are a very important trend; they use similar software and algorithms to conventional "MPP's" but typically have less optimized parameters
- In particular network latency is higher and bandwidth is lower for an informal HPC
- Latency is time for zero length communication -- start up time
- Formal high performance computers are the classic (basic) object of study and are
- "closely coupled" specially designed collections of compute nodes which have (in principle) been carefully optimized and balanced in the areas of
- Processor (computer) nodes
- Communication (internal) Network
- Linkage of Memory and Processors
- I/O (external network) capabilities
- Overall Control or Synchronization Structure
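The latency and bandwidth bullets above can be made concrete with the common linear cost model, in which sending a message takes the start up time plus the message size divided by the bandwidth. The sketch below assumes that model; the function names and all numerical values for the "formal" and "informal" machines are hypothetical and chosen only to show the trend.

```python
def message_time(num_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to send one message: start up time plus transfer time."""
    return latency_s + num_bytes / bandwidth_bytes_per_s


def break_even_size(latency_s, bandwidth_bytes_per_s):
    """Message size at which start up time equals transfer time."""
    return latency_s * bandwidth_bytes_per_s


# Hypothetical parameters: (latency in seconds, bandwidth in bytes per second).
machines = {
    "formal MPP": (20e-6, 100e6),
    "informal metacomputer": (1e-3, 1e6),
}

for name, (latency, bandwidth) in machines.items():
    for size in (0, 1000, 1000000):
        cost = message_time(size, latency, bandwidth)
        print(name, size, "bytes:", round(cost * 1e3, 4), "ms")
```

A zero length message costs exactly the latency, matching the definition given above; messages much shorter than the break-even size are dominated by start up time, which is one reason higher latency networks favour fewer, larger messages.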
Foil 13 There is no Best Machine!
- In society, we see a rich set of technologies and architectures
- Ant Hills
- Brains as bunch of neurons
- Cities as informal bunch of people
- Armies as formal collections of people
- With several different communication mechanisms with different trade-offs
- One can walk -- low latency, low bandwidth
- Go by car -- high latency (especially if you can't park), reasonable bandwidth
- Go by air -- higher latency and bandwidth than car
- Phone -- High speed at long distance but can only communicate modest material (low capacity)
Foil 14 Quantum Computing - I
- Quantum-Mechanical Computers by Seth Lloyd, Scientific American, Oct 95
- Chapter 6 of The Feynman Lectures on Computation edited by Tony Hey and Robin Allen, Addison-Wesley, 1996
- Quantum Computing: Dream or Nightmare? Haroche and Raimond, Physics Today, August 96 page 51
- Basically any physical system can "compute" as one "just" needs a system that gives answers that depend on inputs and all physical systems have this property
- Thus one can build "superconducting", "DNA" or "Quantum" computers exploiting respectively superconducting, molecular or quantum mechanical rules
Foil 15 Quantum Computing - II
- For a "new technology" computer to be useful, one needs to be able to
- conveniently prepare inputs,
- conveniently program,
- reliably produce answer (quicker than other techniques), and
- conveniently read out answer
- Conventional computers are built around bit (taking values 0 or 1) manipulation
- One can build arbitrarily complex arithmetic if one has some way of implementing NOT and AND
- Quantum Systems naturally represent bits
- A spin (of say an electron or proton) is either up or down
- A hydrogen atom is either in lowest or (first) excited state etc.
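The claim that NOT and AND suffice for arithmetic can be illustrated with a small sketch. Here OR and XOR are derived from NOT and AND alone, a one bit full adder is built from those gates, and a ripple carry adder chains the full adders. The function names and the 8 bit default width are choices made for this example; the shifts and masks in `add` are only bookkeeping to split integers into bits and are not part of the logic.

```python
def NOT(a):
    return 1 - a


def AND(a, b):
    return a & b


def OR(a, b):
    # De Morgan: a OR b = NOT(NOT a AND NOT b)
    return NOT(AND(NOT(a), NOT(b)))


def XOR(a, b):
    # True when at least one input is set but not both.
    return AND(OR(a, b), NOT(AND(a, b)))


def full_adder(a, b, carry_in):
    """Add three bits; return (sum bit, carry out)."""
    partial = XOR(a, b)
    total = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return total, carry_out


def add(x, y, width=8):
    """Ripple carry addition of two non-negative integers, modulo 2**width."""
    carry, result = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result


print(add(5, 9))
```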
Foil 16 Quantum Computing - III
- Interactions between quantum systems can cause "spin-flips" or state transitions and so implement arithmetic
- Incident photons can "read" state of system and so give I/O capabilities
- Quantum "bits" called qubits have another property as one has not only
- State |0> and state |1> but also
- Coherent states such as .7071*(|0> + |1>) which are equally in either state
- Lloyd describes how such coherent states provide new types of computing capabilities
- Natural random number generation, as measuring the state of such a qubit gives answer 0 or 1 randomly with equal probability
- As Feynman suggests, qubit based computers are natural for large scale simulation of quantum physical systems -- this is "just" analog computing
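The equal probability statement for the coherent state .7071*(|0> + |1>) follows from squaring the amplitudes, since .7071 is approximately 1/sqrt(2). The sketch below is a classical simulation of that measurement rule only, using a pseudo random number generator; it is not a quantum computation, and the function names are hypothetical.

```python
import math
import random

AMPLITUDE = 1 / math.sqrt(2)  # approximately 0.7071


def probabilities(amp0, amp1):
    """Measurement probabilities for the state amp0*|0> + amp1*|1>."""
    p0, p1 = abs(amp0) ** 2, abs(amp1) ** 2
    norm = p0 + p1  # renormalise in case the amplitudes are not exactly unit length
    return p0 / norm, p1 / norm


def measure(amp0, amp1, rng=random):
    """Simulate one measurement; returns 0 or 1."""
    p0, _ = probabilities(amp0, amp1)
    return 0 if rng.random() < p0 else 1


print(probabilities(AMPLITUDE, AMPLITUDE))

rng = random.Random(1996)  # fixed seed so the run is repeatable
samples = [measure(AMPLITUDE, AMPLITUDE, rng) for _ in range(10000)]
print("fraction of ones:", sum(samples) / len(samples))
```

States |0> and |1> on their own always give 0 and 1 respectively under the same rule; only the coherent state gives a random outcome.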
Foil 17 Superconducting Technology -- Past
- Superconductors produce wonderful "wires" which transmit picosecond (10^-12 seconds) pulses at near speed of light
- Superconducting transmission is lower power and faster than diffusive electron transmission in CMOS
- At about 0.35 micron chip feature size, CMOS transmission time changes from domination by transmission (distance) issues to domination by resistive (diffusive) effects
- Niobium used in constructing such superconducting circuits can be processed by similar fabrication techniques to CMOS
- Josephson Junctions allow picosecond performance switches
- BUT IBM (1969-1983) and Japan (MITI 1981-90) terminated major efforts in this area
Foil 18 Superconducting Technology -- Present
Critical Information in IMAGE
- New ideas have resurrected this concept using RSFQ -- Rapid Single Flux Quantum -- approach
- This naturally gives a bit which is 0 or 1 (or in fact n units!)
- This gives interesting circuits of similar structure to CMOS systems but with a clock speed of order 100-300 GHz -- factor of 100 better than CMOS which will asymptote at around 1 GHz (= one nanosecond cycle time)
Foil 19 Superconducting Technology -- Problems
- At least two major problems:
- Semiconductor industry will invest some $40B in CMOS "plants" and infrastructure
- Currently perhaps $100M a year going into superconducting circuit area!
- How do we "bootstrap" superconducting industry?
- Cannot build memory to match CPU speed and current designs have superconducting CPU's (with perhaps 256 Kbytes superconducting memory per processor) but conventional CMOS memory
- So compared with current computers, these have a thousand times faster CPU, a factor of four smaller cache (running at CPU speed) and the same speed basic memory as now
- Can such machines perform well -- need new algorithms?
- Can one design new superconducting memories?
- Superconducting technology also has a bad "name" due to IBM termination!
Foil 20 Superconducting Technology -- Present
Critical Information in IMAGE
- New ideas have resurrected this concept using RSFQ -- Rapid Single Flux Quantum -- approach
- This naturally gives a bit which is 0 or 1 (or in fact n units!)
- This gives interesting circuits of similar structure to CMOS systems but with a clock speed of order 100-300 GHz -- factor of 100 better than CMOS which will asymptote at around 1 GHz (= one nanosecond cycle time)
Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu
Page produced by wwwfoil on Mon Sep 16 1996