Full HTML for Scripted foilset CPS615-Lecture on Performance(end) and Computer Technologies(start)

Given by Geoffrey C. Fox at Delivered Lectures of CPS615 Basic Simulation Track for Computational Science on 5 September 96. Foils prepared 15 September 1996


This lecture starts by considering the analytic form for communication overhead and illustrates its stencil dependence in simple local cases, stressing the relevance of grain size
The implications for scaling, and the generalization from the Laplace example, are covered
  • We covered scaled speedup (fixed grain size) as well as fixed problem size
We noted some useful material was missing and this was continued in the next lecture (Sept 10, 96)
The lecture then starts coverage of computer architecture, covering base technologies, with CMOS (covered in an earlier lecture) contrasted with Quantum and Superconducting technology

Table of Contents for full HTML of CPS615-Lecture on Performance(end) and Computer Technologies(start)

1 Delivered Lectures for CPS615 -- Base Course for the Simulation Track of Computational Science, Fall Semester 1996 -- Lecture of September 5, 1996

2 Abstract of Sept 5 1996 CPS615 Lecture
3 Communication Overhead
4 Analytical Form of Speed Up for Communication Overhead
5 General Form of Efficiency
6 Communication to Calculation Ratio as a function of template
7 Speed Up as a Function of Grain Size
8 Technologies for High Performance Computers
9 Architectures for High Performance Computers - I
10 Architectures for High Performance Computers - II
11 There is no Best Machine!
12 Quantum Computing - I
13 Quantum Computing - II
14 Quantum Computing - III
15 Superconducting Technology -- Past
16 Superconducting Technology -- Present
17 Superconducting Technology -- Problems
18 Superconducting Technology -- Present




HTML version of Scripted Foils prepared 15 September 1996

Foil 1 Delivered Lectures for CPS615 -- Base Course for the Simulation Track of Computational Science, Fall Semester 1996 -- Lecture of September 5, 1996

From CPS615-Lecture on Performance(end) and Computer Technologies(start) Delivered Lectures of CPS615 Basic Simulation Track for Computational Science -- 5 September 96.
Geoffrey Fox
NPAC
Room 3-131 CST
111 College Place
Syracuse NY 13244-4100

Foil 2 Abstract of Sept 5 1996 CPS615 Lecture

Foil 3 Communication Overhead

Suppose communicating a single word (here a φ value, probably stored in a 4 byte word) takes time tcomm
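The role of tcomm can be made concrete with a small numerical sketch. The Python below assumes, beyond what this foil states, the usual five-point Laplace stencil on a square block of n grid points per processor, with 4*sqrt(n) edge words exchanged and 4 floating point operations (each taking time t_float) per point in every iteration. The function names and the ratio tcomm/tfloat = 10 are illustrative, not taken from the lecture.

```python
import math

def comm_overhead(n, t_comm, t_float, flops_per_point=4):
    """Fractional communication overhead f_C for a sqrt(n) x sqrt(n) block.

    Assumption: 4*sqrt(n) edge words are communicated per iteration and
    flops_per_point * n floating point operations are performed.
    """
    edge_words = 4 * math.sqrt(n)
    return (edge_words * t_comm) / (flops_per_point * n * t_float)

def speedup(num_procs, n, t_comm, t_float):
    """Speedup S = N / (1 + f_C) under the same assumptions."""
    return num_procs / (1.0 + comm_overhead(n, t_comm, t_float))

if __name__ == "__main__":
    # Illustrative ratio t_comm / t_float = 10; grain size n is varied.
    for n in (16, 256, 4096):
        f_c = comm_overhead(n, t_comm=10.0, t_float=1.0)
        s = speedup(64, n, t_comm=10.0, t_float=1.0)
        print(f"grain size n={n:5d}  f_C={f_c:.3f}  speedup on 64 nodes={s:.1f}")
```

In this model the overhead depends only on the grain size n. With n held fixed (scaled speedup) the speedup grows linearly with the number of nodes, while for a fixed total problem size n shrinks as nodes are added and the overhead grows.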

Foil 4 Analytical Form of Speed Up for Communication Overhead

Foil 5 General Form of Efficiency

Foil 6 Communication to Calculation Ratio as a function of template

Foil 7 Speed Up as a Function of Grain Size

Foil 8 Technologies for High Performance Computers

We can choose technology and architecture separately in designing our high performance system
Technology is like choosing ants, people or tanks as basic units in our society analogy
  • or, less frivolously, neurons or brains
In the HPCC arena, we can distinguish current technologies
  • COTS (Commercial off-the-shelf) Microprocessors
  • Custom node computer architectures
  • More generally these are all CMOS technologies
Near term technology choices include
  • Gallium Arsenide or Superconducting materials as opposed to Silicon
  • These are faster by a factor of 2 (GaAs) to 300 (Superconducting)
Longer term technology choices include
  • DNA (Chemical) or Quantum technologies
The next industry investment in CMOS plants will cost $40 Billion, and this huge investment makes it hard for new technologies to "break in"

Foil 9 Architectures for High Performance Computers - I

Architecture is equivalent to organization or design in the society analogy
  • Different models for society (Capitalism etc.) or different types of groupings in a given society
  • Businesses or Armies are more precisely controlled/organized than a crowd at the State Fair
  • We will generalize this to formal (army) and informal (crowds) organizations
We can distinguish formal and informal parallel computers
Informal parallel computers are typically "metacomputers"
  • i.e. a bunch of computers sitting on a department network

Foil 10 Architectures for High Performance Computers - II

Metacomputers are a very important trend; they use software and algorithms similar to those of conventional MPPs but typically have less optimized parameters
  • In particular network latency is higher and bandwidth is lower for an informal HPC
  • Latency is the time for a zero length communication, i.e. the start up time
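Latency and bandwidth are often combined in a simple linear cost model, time = latency + message size / bandwidth. The following is a minimal sketch of that model. The two parameter sets are hypothetical values chosen only to contrast a closely coupled machine with a department network; they are not measurements from the lecture.

```python
def message_time(num_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to send one message: start-up latency plus transfer time."""
    return latency_s + num_bytes / bandwidth_bytes_per_s

if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    formal = dict(latency_s=10e-6, bandwidth_bytes_per_s=100e6)   # "MPP"-like
    informal = dict(latency_s=1e-3, bandwidth_bytes_per_s=1e6)    # department network
    for size in (0, 4, 1_000_000):
        t_formal = message_time(size, **formal)
        t_informal = message_time(size, **informal)
        print(f"{size:>9d} bytes: formal {t_formal:.6f} s, informal {t_informal:.6f} s")
```

A zero length message costs exactly the latency, so short messages (such as single words) are dominated by start-up time, and long messages by bandwidth.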
Formal high performance computers are the classic (basic) object of study and are "closely coupled", specially designed collections of compute nodes which have (in principle) been carefully optimized and balanced in the areas of
  • Processor (computer) nodes
  • Communication (internal) Network
  • Linkage of Memory and Processors
  • I/O (external network) capabilities
  • Overall Control or Synchronization Structure

Foil 11 There is no Best Machine!

In society, we see a rich set of technologies and architectures
  • Ant Hills
  • Brains as a bunch of neurons
  • Cities as an informal bunch of people
  • Armies as formal collections of people
There are several different communication mechanisms, each with different trade-offs
  • One can walk -- low latency, low bandwidth
  • Go by car -- high latency (especially if you can't park), reasonable bandwidth
  • Go by air -- higher latency and bandwidth than car
  • Phone -- High speed at long distance but can only communicate modest material (low capacity)

Foil 12 Quantum Computing - I

Quantum-Mechanical Computers by Seth Lloyd, Scientific American, Oct 95
Chapter 6 of The Feynman Lectures on Computation edited by Tony Hey and Robin Allen, Addison-Wesley, 1996
Quantum Computing: Dream or Nightmare? Haroche and Raimond, Physics Today, August 96 page 51
Basically, any physical system can "compute", as one "just" needs a system that gives answers that depend on inputs, and all physical systems have this property
Thus one can build "superconducting", "DNA" or "Quantum" computers exploiting, respectively, superconducting, molecular or quantum mechanical rules

Foil 13 Quantum Computing - II

For a "new technology" computer to be useful, one needs to be able to
  • conveniently prepare inputs,
  • conveniently program,
  • reliably produce answer (quicker than other techniques), and
  • conveniently read out answer
Conventional computers are built around manipulation of bits (taking values 0 or 1)
One can build arbitrarily complex arithmetic if one has some way of implementing NOT and AND
Quantum Systems naturally represent bits
  • A spin (of say an electron or proton) is either up or down
  • A hydrogen atom is either in its lowest or (first) excited state, etc.
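As an illustration of the remark that NOT and AND suffice for arithmetic, here is a small sketch that builds OR, XOR, a full adder and a fixed-width adder from those two operations alone. The function names are illustrative; the shifts inside add are only plumbing to split integers into bits and reassemble the result.

```python
def NOT(a):
    """Logical NOT on a single bit (0 or 1)."""
    return 1 - a

def AND(a, b):
    """Logical AND on two bits."""
    return a & b

def OR(a, b):
    # De Morgan: a OR b = NOT(NOT a AND NOT b)
    return NOT(AND(NOT(a), NOT(b)))

def XOR(a, b):
    # True when at least one input is set but not both.
    return AND(OR(a, b), NOT(AND(a, b)))

def full_adder(a, b, carry_in):
    """Add three bits; return (sum_bit, carry_out)."""
    partial = XOR(a, b)
    total = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return total, carry_out

def add(x, y, width=8):
    """Add two non-negative integers bit by bit (result modulo 2**width)."""
    result, carry = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result

if __name__ == "__main__":
    print(add(3, 5), add(100, 27))
```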

Foil 14 Quantum Computing - III

Interactions between quantum systems can cause "spin-flips" or state transitions and so implement arithmetic
Incident photons can "read" the state of the system and so give I/O capabilities
Quantum "bits", called qubits, have another property, as one has not only
  • State |0> and state |1> but also
  • Coherent states such as .7071*(|0> + |1>), i.e. (|0> + |1>)/sqrt(2), which are equally in either state
Lloyd describes how such coherent states provide new types of computing capabilities
  • A natural random number source, as measuring the state of such a qubit gives answer 0 or 1 randomly with equal probability
  • As Feynman suggests, qubit based computers are natural for large scale simulation of quantum physical systems -- this is "just" analog computing
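The random number property can be mimicked classically. This sketch assumes real amplitudes and the standard rule that a measurement returns 0 with probability equal to the squared amplitude of |0>. It is an ordinary pseudo-random simulation, not a quantum computation, and the function name measure is hypothetical.

```python
import math
import random

def measure(amp0, amp1, rng=random):
    """Simulate measuring the qubit amp0*|0> + amp1*|1> (real amplitudes)."""
    p0 = amp0 ** 2
    p1 = amp1 ** 2
    if not math.isclose(p0 + p1, 1.0, rel_tol=1e-9):
        raise ValueError("state is not normalized")
    # Return 0 with probability p0, otherwise 1.
    return 0 if rng.random() < p0 else 1

if __name__ == "__main__":
    a = 1 / math.sqrt(2)            # approximately 0.7071
    rng = random.Random(0)          # seeded so the run is repeatable
    samples = [measure(a, a, rng) for _ in range(10_000)]
    print("fraction of 1s:", sum(samples) / len(samples))
```

The amplitude 0.7071 is 1/sqrt(2), so each outcome has probability 0.7071 squared, which is 1/2.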

Foil 15 Superconducting Technology -- Past

Superconductors produce wonderful "wires" which transmit picosecond (10^-12 seconds) pulses at near the speed of light
  • Superconducting transmission uses less power and is faster than diffusive electron transmission in CMOS
  • At about 0.35 micron chip feature size, CMOS transmission time changes from being dominated by transmission (distance) issues to being dominated by resistive (diffusive) effects
Niobium used in constructing such superconducting circuits can be processed by similar fabrication techniques to CMOS
Josephson Junctions allow picosecond performance switches
BUT IBM (1969-1983) and Japan (MITI 1981-90) terminated major efforts in this area

Foil 16 Superconducting Technology -- Present

New ideas have resurrected this concept using the RSFQ (Rapid Single Flux Quantum) approach
This naturally gives a bit which is 0 or 1 (or in fact n units!)
This gives interesting circuits of similar structure to CMOS systems but with a clock speed of order 100-300 GHz, a factor of 100 better than CMOS, which will asymptote at around 1 GHz (= one nanosecond cycle time)
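The clock speeds quoted here translate directly into cycle times, and the speed of light then bounds how far a signal can travel in one cycle. A quick sketch of that arithmetic follows; the function names are illustrative, and the only constant used is the speed of light in vacuum, about 0.2998 mm per picosecond.

```python
SPEED_OF_LIGHT_MM_PER_PS = 0.299792458  # vacuum value, an upper bound for any wire

def cycle_time_ps(clock_ghz):
    """Clock period in picoseconds for a clock frequency given in GHz."""
    return 1000.0 / clock_ghz

def light_distance_mm(time_ps):
    """Upper bound on the distance a signal can travel in time_ps picoseconds."""
    return SPEED_OF_LIGHT_MM_PER_PS * time_ps

for label, ghz in (("CMOS limit", 1.0), ("RSFQ low", 100.0), ("RSFQ high", 300.0)):
    period = cycle_time_ps(ghz)
    reach = light_distance_mm(period)
    print(f"{label:10s} {ghz:6.1f} GHz -> {period:8.2f} ps per cycle, "
          f"at most {reach:7.2f} mm per cycle")
```

At 100 GHz the cycle time is 10 ps, so even at the speed of light a signal covers only about 3 mm per clock cycle.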

Foil 17 Superconducting Technology -- Problems

At least two major problems:
Semiconductor industry will invest some $40B in CMOS "plants" and infrastructure
  • Currently perhaps $100M a year going into superconducting circuit area!
  • How do we "bootstrap" superconducting industry?
One cannot build memory to match the CPU speed, and current designs have superconducting CPUs (with perhaps 256 Kbytes of superconducting memory per processor) but conventional CMOS memory
  • So, compared with current computers, one has a CPU a thousand times faster, a cache at CPU speed that is a factor of four smaller, and basic memory of the same speed as now
  • Can such machines perform well, or do they need new algorithms?
  • Can one design new superconducting memories?
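A rough way to see why the memory mismatch matters is a two-level cost model: average time per operation = CPU time + (fraction of operations that go to slow memory) x (memory access time). The sketch below applies it. The miss fractions and the 10-cycle memory time are hypothetical numbers chosen for illustration; only the factor of 1000 in CPU speed comes from the foil.

```python
def time_per_op(t_cpu, t_mem, miss_fraction):
    """Average time per operation: compute time plus stalls on slow memory."""
    return t_cpu + miss_fraction * t_mem

def overall_speedup(cpu_factor, t_cpu, t_mem, miss_old, miss_new):
    """Speedup when only the CPU gets cpu_factor times faster.

    miss_old / miss_new are the fractions of operations that reach the
    (unchanged) CMOS memory before and after; a smaller cache would
    typically make miss_new larger than miss_old.
    """
    old = time_per_op(t_cpu, t_mem, miss_old)
    new = time_per_op(t_cpu / cpu_factor, t_mem, miss_new)
    return old / new

if __name__ == "__main__":
    # Times in units of one current CPU cycle; all values hypothetical.
    s = overall_speedup(cpu_factor=1000, t_cpu=1.0, t_mem=10.0,
                        miss_old=0.01, miss_new=0.02)
    print(f"overall speedup: {s:.1f}x for a 1000x faster CPU")
```

With these made-up numbers the machine is only about 5 times faster overall, because nearly all of the time is then spent waiting on memory.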
Superconducting technology also has a bad "name" due to the IBM termination!

Foil 18 Superconducting Technology -- Present

© Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu

Page produced by wwwfoil on Fri Aug 15 1997