GCF Talk 93A
Complex Systems and Parallel Computing
Australian International Conference on Complex Systems
Australian National University, Canberra, Australia, December 14-16, 1992
Geoffrey C. Fox
gcf@nova.npac.syr.edu
http://www.npac.syr.edu
315.443.2163

The Three Themes of the Lecture: Parallel Computers and Complex Systems
  Parallel Computers for the Simulation of Complex Systems
    teraflop performance
    virtual reality
    software standards
    the "global metacomputer"
  Complex Systems for the Theory of Computer and Network Architecture
    physical analogies: space, time, temperature, phase transitions
    problem architecture
    a theory of parallel computing?
  Complex Systems for New Computational Methodologies
    physical computation
    neural networks ...
    simulated annealing and tempering

Issues in Parallel Computers for the Simulation of Complex Systems
  Can one use the 1 to 10 TERAFLOP machines which will be available in the middle to late 1990's?
  What software do you need?
  What about virtual reality and distributed metacomputers?

Standard Performance Graph: heading to 1 to 10 teraflops by the year 2000

When will parallel computing take over?
  When it has enough headroom, e.g. a factor of ten better performance/cost-performance than existing technology
  The definition of headroom is muddied because parallel machines are much harder to use, so it is hard to decide on their value to a given organization
  The nCUBE-1 and CM-2 were comparable to, or perhaps a factor of 2 to 4 better than, conventional supercomputers in cost-performance
    enough to show that "Parallel Computing Works", but not enough to take over
  The DECmpp, IBM SP1, Intel Paragon, CM-5 and nCUBE-2 are (perhaps) a factor of ten better than competitive sequential (vector) machines
    they can take over?

The President's High Performance Computing and Communication Initiative (HPCCI)
  $2.9 billion over 5 years
  The Grand Challenges: enabled by teraflop computers, and important to the economy or to fundamental research
    global warming - NOAA
    oil reservoir and environmental simulation - DOE
    structural and aerodynamic calculations - NASA
    Earth observing satellite data analysis - NASA
    human genome - NIH, DOE
    quantum chromodynamics - fundamental physics
    gravitational waves from black holes - fundamental physics
    molecular modeling - fundamental chemistry
  Nearly all have industrial payoff, but technology transfer is NOT funded by the HPCCI

Challenges and Status of Parallel Computing
  Hardware trends imply that all computers (PCs through supercomputers) will be parallel by the year 2000
  Up to now parallel computers have come from small start-up companies (except the Intel Supercomputer Division)
  Recently Cray, Convex (HP), Digital and IBM have announced massively parallel computing initiatives
  Several architectures "work", but only one - the distributed-memory MIMD multicomputer - is known to scale from one to very many processors on general problems
  Software is the challenge, and could prevent or delay the hardware trend that suggests parallelism will become the mainline computer architecture
    first we must get the systems software correct, then the applications software

High Performance Fortran Overview
  Data Parallelism
    Fortran 90 provides a number of features, notably array syntax, that make it possible for a compiler to detect parallelism (a small sketch follows this overview). However, with existing compiler technology, new language features defining data dependency and locality are necessary to exploit the full capability of concurrent systems.
  High Performance
    addressed primarily by the proposed data distribution features and new intrinsic functions
  Code Tuning
    optional, explicit message passing support for MIMD machines
  Validation
    a benchmarking suite is being prepared by NPAC, Syracuse University
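To make the data-parallelism point concrete, here is a minimal Fortran 90 sketch (illustrative, not from the talk; the array names and operations are arbitrary). Each statement below is a whole-array operation with no loop-carried dependence, so a compiler is free to execute it in parallel:

      PROGRAM ARRAYS
      IMPLICIT NONE
      REAL :: A(10000), B(10000), C(10000)
      CALL RANDOM_NUMBER(A)
      CALL RANDOM_NUMBER(B)
      C = A + B                     ! elementwise add over the whole arrays
      WHERE (C > 1.0) C = C - 1.0   ! masked data-parallel assignment
      PRINT *, SUM(C)               ! reduction intrinsic
      END PROGRAM ARRAYS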
HPF Computational Model
  The underlying intuition behind the HPF computational model is that
    an operation on two or more data objects is likely to be carried out much faster if they all reside on the same processor, and
    it may be possible to carry out many such operations concurrently if they can be performed on different processors
  The main mechanism for achieving parallelism in HPF is its data partition annotation
  Distribution of control is derived from the data distribution, using the owner-computes rule: each assignment is executed by the processor(s) that own the assigned variable. (However, a compiler may relax the owner-computes rule in order to improve performance.)

Example of Fortran-90D source code: Gaussian elimination
  (The slide's source code is not reproduced in this text; a hedged sketch appears below, after the directives are described.)

HPF Directives
  An HPF directive has the form of a comment line:
    !HPF$ hpf-directive
  A goal of the design is that HPF directives should be consistent with Fortran 90 syntax in the following strong sense: if HPF were adopted as part of a future Fortran standard, the only change necessary to convert an HPF program would be to remove the comment character and directive prefix from each directive.
  There is a clear separation between directives that serve as specification statements and directives that serve as executable statements (in the sense of the Fortran standard). Specification statements are carried out on entry to a program unit; after these, executable statements are carried out.

Data Alignment and Distribution Directives
  The model of the allocation of data objects to processor memories is a two-level mapping of data objects to abstract processors. Data objects (typically array elements) are first aligned with an abstract indexing space called a template; a template is then distributed onto a rectilinear arrangement of abstract processors. The implementation then uses the same number, or perhaps some smaller number, of physical processors to implement these abstract processors. This mapping of abstract processors to physical processors is implementation dependent.
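Referring back to the Gaussian elimination slide above: a minimal, hedged sketch of what such Fortran-90D/HPF source might look like, using the FORALL statement and the directives just described. The array names, the absence of pivoting and the BLOCK distribution are illustrative assumptions, not the original slide's code:

      PROGRAM GAUSS
        INTEGER, PARAMETER :: N = 512
        REAL :: A(N,N), B(N)
        INTEGER :: K
!HPF$   TEMPLATE T(N,N)
!HPF$   ALIGN A(I,J) WITH T(I,J)
!HPF$   ALIGN B(I) WITH T(I,1)
!HPF$   DISTRIBUTE T(BLOCK, *)
        CALL RANDOM_NUMBER(A)   ! stand-in initialization
        CALL RANDOM_NUMBER(B)
        DO K = 1, N-1
           ! Eliminate column K below the diagonal (no pivoting, and the
           ! subdiagonal entries are left stale rather than zeroed, for
           ! brevity). Each (I,J) update is independent, so the FORALL
           ! exposes the data parallelism; the BLOCK distribution of T
           ! keeps whole rows on one processor.
           FORALL (I = K+1:N) B(I) = B(I) - A(I,K) / A(K,K) * B(K)
           FORALL (I = K+1:N, J = K+1:N) &
                A(I,J) = A(I,J) - A(I,K) / A(K,K) * A(K,J)
        END DO
      END PROGRAM GAUSS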
Examples of Alignments (1)
      REAL, ARRAY (N, N) :: X3, X4
!HPF$ TEMPLATE B (N, N)
!HPF$ ALIGN X3 (I, J) WITH B (I, J-1)
!HPF$ ALIGN X4 (I, J) WITH B (I-1, J+2)

Examples of Distributions (1)
!HPF$ TEMPLATE A (N, N), B (N, N), C (N, N)
!HPF$ DISTRIBUTE A (BLOCK, *)
!HPF$ DISTRIBUTE B (*, BLOCK)
!HPF$ DISTRIBUTE C (BLOCK, BLOCK)

For More Information on HPF
  The final draft of the HPF proposal is available via anonymous ftp from
    minerva.npac.syr.edu (public/HPFF)
    titan.cs.rice.edu (public/HPFF/draft)
    ftp.gmd.de (hpf-Europe)
    theory.tc.cornell.edu (pub)
    think.com (public/HPFF)
  E-mail servers
    softlib@cs.rice.edu (send hpf-vxx.ps)
    netlib@research.att.com (send hpf-vxx.ps from hpff)
    netlib@ornl.gov (send hpf-vxx.ps from hpf)
  HPFF mailing lists
    hpff-request@cs.rice.edu (to join)
    hpff@cs.rice.edu (to submit)
    hpff-info@cs.rice.edu (for general information)
    hpff-comments@cs.rice.edu (to submit comments on the HPF draft)

FORTRAN-90D: the First Implementation of HPF (NPAC, Syracuse University)
  Current status
    alpha version demonstrated at SUPERCOMPUTING '92 in Minneapolis, MN
      one- and two-dimensional BLOCK, CYCLIC and BLOCK-CYCLIC distributions
      FORALL statement
      full run-time support for regular communications
      PARTI (ICASE) primitives for irregular communications
      intrinsic library
    now running on the nCUBE-2, the iPSC/860 and networks of Sun workstations
    beta release: June 1993
    alliance with industry to develop a commercial HPF compiler optimized for specific architectures
    next target: heterogeneous networks

Common software is needed for the heterogeneous local area network (Ethernet - FDDI - HIPPI - FCS ...)

Importance of MetaProblems
  High Performance Fortran is only suitable for data parallel problems whose data structure is an array
    it will be extended, e.g. to trees or irregular graphs
  But complex systems can be complicated! For instance, image processing, data fusion and multidisciplinary design have several modules; each can be data parallel

Hybrid Problem Structure for Command and Control
  Si: satellite with image processing, Kalman filter and stereo match
  FCi: fire control platform with a loosely synchronous target-weapon pairing algorithm
  E: overall asynchronous expert system

The Mapping of Heterogeneous Problems onto Heterogeneous Computer Systems
  The analogy between metaproblems and metacomputers

SIMCITY is an interesting PC-based complex system simulation. One can use teraflop computers to make such "games" more realistic.

Hierarchy of Complexity
  Virtual class, e.g. a weather or cortex model
  Virtual classroom
  Virtual laboratory: a realistic SIMCITY, combining several classes
  Virtual school ...
Experience virtual classes ---> a virtual school with "Virtual Reality"

Implementation of Complex System Simulation
  Complex system simulation requires
    ===> support of many adaptively linked modules
    ===> an interpreter, not a compiler
    ===> a link to virtual reality

Possible Implementation: AVS as a System Integration Tool
  Basic features
    dataflow model for distributed computing
    AVS program = a network of AVS modules, controlled by the AVS kernel via the RPC mechanism
    AVS module = a C/Fortran program conforming to the AVS interface specification
    strong visualization and 3D graphics support
    visual editing tools for AVS networks
  Highlights
    easy to use, and easy to port existing applications to the AVS framework
    growing popularity in the research community
    growing volume of public domain modules, maintained by the International AVS Center at the NCSC
  Flaws
    poor interactivity (best for complex, static visualization, not for simulation or animation)
    lack of an organizing principle for bookkeeping dataflow modules (not an object-oriented system)
    lack of support for High Performance Computing
    not Open Systems software (two competing commercial products, AVS and Explorer (SGI); two public domain systems, apE and Khoros)

Parallel AVS - a Planned Project at NPAC
  Forms of parallelism in the AVS model
    intramodular data parallelism (e.g. image processing or PDE modules) under a single AVS kernel
    intermodular functional parallelism (single kernel)
    multi-user/kernel/department functional parallelism
  NPAC approach
    develop a 'standard' library of data parallel modules in High Performance Fortran for regular problems (matrix algebra, PDEs, imaging)
    develop hierarchical/clustering algorithms for the automatic decomposition of large AVS dataflow networks into networks of independent AVS subsystems
    unify the existing tools for user-level hierarchical functional organization with the automated tools for network decomposition
    unify the top-down approach for single-user network decomposition with the bottom-up approach for multi-user network integration

Architecture of the Parallel AVS System

VR Operating Shells

Components of the Proposed Televirtuality Server at NPAC
  modern parallel systems (CM-5, nCUBE-2, DECmpp, Intel Touchstone)
  High Performance Fortran and Concurrent C++ software environment (compilers, interpreters, runtime support)
  parallel Oracle + SGI front-ends
  expert system (RT/AI from Intellisys, Inc.)
  clusters of DEC/IBM/Sun workstations on an Ultra-Net / FCS / HIPPI / ATM (?) based gigabit/sec LAN
  visualization front-ends (AVS/Explorer, Khoros/apE)
  multimedia front-ends
  VR front-ends (BOOM, Convolvotron, DataGloves, PROvision and Sense8 systems, new consumer VR products)
  MOVIE as the integration/infrastructure software

A Theory of Parallel Computing Based on Complex Systems
  Use complex systems to describe problems, computers, networks, and their interaction
  Computing as a set of mapping problems
    a hierarchy of mapping problems
    we would like to optimize the overall mapping
  Complex systems can give a theory of computing
    both problems and computers are collections of homogeneous or heterogeneous basic entities, with some sort of static or dynamic connection between them
    modeling involves mapping one "complex problem" into another
    simulation or parallel computing involves the map: complex problem ---> complex computer
    what parameters of these two underlying complex systems control the effectiveness of the map?
  Parallel computing is "just" an optimization problem, even if we can't agree on what to optimize
    execution time - the main focus of the HPCC community?
    user happiness - the main focus of the software engineering community

Concurrent Computation as a Mapping Problem - I
  Two different types of mappings in physical spaces; both are static
  (a) seismic migration with domain decomposition on 4 nodes
  (b) a Universe simulation with irregular data but a static 16-node decomposition; this problem would really be best served by a dynamic irregular decomposition

Concurrent Computation as a Mapping Problem - II
  A different type of mapping - a very dynamic case without any underlying physical space
  (c) computer chess with a dynamic game tree decomposed onto 4 nodes

Computation as a Map of a Set of Complex Systems

Domain Decomposition and Complex Systems?
  All successful parallel computers with
    many nodes (massively parallel) and
    high performance (this excludes architectures like dataflow)
  have obtained their parallelism from "data parallelism" or "domain decomposition"
  This is just the mapping of the data (the "spatial domain") of the complex problem onto the spatial structure of the complex computer, whether the architecture is SIMD or MIMD, distributed or shared memory

Physical Analogy for the Complex Computer
  Space is either
    the number of nodes of the parallel computer, or, at a different resolution,
    the number of memory locations
  Time is either
    measured by the cycle time of a node, or, at a different resolution (on CISC computers),
    measured in machine instructions

The Physical Space/Time Analogy for a General Problem (the correspondence implied by the theory of complex systems)
  Problems have a data domain - "SPACE" - and the computation involves work done on the elements of this domain; label this work by "TIME"
  Examples
    simulation of a 3D physical system
      SPACE = "SPACE", TIME = "TIME"
      e.g. the wave equation in seismic exploration, or the motion of missiles after launch
    full matrix operations (inversion, multiplication)
      matrix elements = "SPACE"
      the label of the rows eliminated or multiplied = "TIME"
    Gauss-Seidel (SOR) iterative approach to solving sparse matrix problems
      the set of nonzero matrix elements = "SPACE"
      the iteration count = "TIME"

Some Temporal Properties of Computation

General Space-Time Complex System Picture for the Problem-to-Computer Mapping
  e.g. on a sequential computer, all aspects of the problem are mapped onto "TIME" in the computer
  Seismograph: the time dependence of an earthquake is mapped onto the spatial extent of the recording paper

Computer Languages and Space-Time Properties
  High Performance Fortran
      DIMENSION A(10000), B(10000)
      A = A + B
    Space (the elements of the arrays A and B) in the problem is mapped into space (i.e. the arrays are preserved) in the virtual problem
    ---> the compiler can recognize the data parallelism
    ---> good parallel performance on all machines
  Fortran 77
      DO 1 I = 1, 10000
    1 A(I) = A(I) + B(I)
    Space in the problem is mapped into time (successive iterations of the DO loop) in the virtual problem
    ---> the compiler has to "undo" this to recognize the parallelism, and generally fails
    ---> good languages preserve problem architecture

Information Dimension d of a General Complex System
  d = the geometric dimension for local problems
  d ~ 2 for computer chips (Rent's Rule)
  d = 1 for ALL long range force problems, whatever their geometric dimension

Performance of a Parallel Computer
  hierarchical multicomputer: spatial and temporal decomposition
  homogeneous multicomputer: purely spatial decomposition
  shared or hierarchical memory computer

Comparison of Cache and Distributed Memory Communication Overhead
  Extension of the space-time picture to treat hierarchical memory, caches etc.
  Both distributed and hierarchical memory need DATA LOCALITY in memory references
    minimize communication
    minimize cache misses
  hierarchical memory ----> "time"
  distributed memory ----> "space"
  All overheads are proportional to 1/(space-time volume)^(1/d) with combined space and time decomposition (cf. BLAS-3 for matrices); the scaling is written out below
  High Performance Fortran expresses data locality ==> it will improve the performance of sequential machines as well
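Written out explicitly, this is a hedged reconstruction of the scaling just quoted, in the spirit of Fox's earlier grain-size analyses (the constant and the precise definition of the grain size n, the space-time volume of problem held per node, are assumptions here):

\[
  f_C \;\approx\; \mathrm{const} \cdot \frac{t_{\mathrm{comm}}}{t_{\mathrm{calc}}} \cdot \frac{1}{n^{1/d}}
\]

where f_C is the fractional overhead (communication for distributed memory, cache misses for hierarchical memory), t_comm/t_calc is the hardware ratio of communication time to calculation time, and d is the information dimension of the problem. Larger grains, and decompositions that block space and time together (as BLAS-3 does for matrices), drive the overhead down.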
Space-Time Decompositions for the Concurrent One-Dimensional Wave Equation

Typical Example of Mapping an Irregular Bunch of Grid Points onto 8 Nodes

Use of Physical Optimization in High Performance Fortran
  The problem is to find MAP in DISTRIBUTE (MAP) in Fortran D / HPF (High Performance Fortran)
    a user (compiler) generated data decomposition
  The user (compiler) supplies
    what is to be distributed
    the "graph" - the algorithmic connection between the entities to be distributed
  These entities are particles, grid points, matrix elements ... depending on the problem

Physics Analogy for Load Balancing
  Label the data points (the members of the problem's spatial domain) by i
  Each i ---> a particle moving in a space of discrete points (the number of points in this space = the number of nodes in the machine), with the topology defined by the interconnection of the nodes
  Let the problem be defined by the parameters
    execution time C_i for each i
    communication time V_ij between data points i and j (V_ij is often zero; it is nonzero only if i and j are linked in the computational graph)
  This complex system is governed by a Hamiltonian H = execution time (a hedged form of H is written out at the end of this section)
  We need the ground state of H

Decomposition of an Arch onto 16 Processors in a Hypercube (from Chrisochoides)

Physics Analogy for Static and Dynamic Load Balancing
  The "particles", and the "springs"/forces between them, including the effect of the cost of redistribution

General Definition of the Temperature T_S of a Complex System
  In a dynamic load balancing problem (a time dependent system), simulated annealing is performed concurrently, on the same computer ensemble used to simulate the system
  Suppose the "optimal" annealing strategy is used. There is then a minimum temperature T_S to which it is possible to "cool" the system before the time spent annealing exceeds the gain from the improved load balance
  Note that T_S is independent of the speed of the computer - it is a property of the system being simulated
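Collecting the terms above into one formula: a hedged reconstruction of the load-balancing energy in the least-squares style of this physical-optimization approach (the relative weight λ and the machine distance function d(·,·) are assumptions of this sketch):

\[
  H \;=\; \sum_{\mathrm{nodes}\;n} \Big( \sum_{i \in n} C_i \Big)^{2}
      \;+\; \lambda \sum_{i,j} V_{ij}\; d\big(\mathrm{node}(i),\, \mathrm{node}(j)\big)
\]

The first term penalizes load imbalance (it is smallest when the work Σ C_i is equal on every node); the second charges each communication V_ij by the distance between the nodes holding i and j. For the dynamic case one would add a further term for the redistribution cost mentioned above. Simulated annealing then seeks the ground state of H.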
Particle Dynamics Problem on a Four-Node System

Block Scattered and Domain Decomposition Distributions Illustrated

Instantaneous Energy Distribution for the Time Dependent Domain Decomposition and Block Scattered Distributions
  Note that the scattered position is independent of time

Time Averaged Energy for the Adaptive Particle Dynamics Problem
  Note that domain decomposition can end up worse than block scattered after averaging

A General Theory of Computation
  The particle-process analogy does not apply to
    asynchronous problems
      databases
      event driven simulations
      phone network traffic
      computer chess and other AI
    loosely synchronous problems where the fundamental entities move between nodes on a microscopic time scale (such problems can be called microscopically dynamic)
  Example - global sums, as used in
      DO 1 I = 1, LIMIT1
         A(I) = 0
         DO 2 J = 1, LIMIT2
    2    A(I) = A(I) + B(I,J)
    1 CONTINUE
    where I and J are distributed over the nodes of a hypercube
  HISTORICALLY, ONE OF THE MOTIVATIONS FOR THIS RESEARCH WAS TO "AUTOMATE" THE KNOWN FOLD ALGORITHM
  FOLD CALCULATES MULTIPLE GLOBAL SUMS BY SUITABLY INTERLEAVING TREES OVER THE HYPERCUBE

The String Formalism for Dynamic Computations
  Originally we used the analogy
    process or "entity" ---> particle
    label by p ---> position x_p(t)
  Now generalize this to
    process or "fundamental thread of computation" ---> string or worldline
    label by p ---> the time ordered set {x_p(t)}
  In this new formalism our loosely synchronous, microscopically static problems are represented by slowly varying strings, as follows.

Loosely Synchronous Static and Adaptive Problems in the String Picture
  Particles p1 and p2 interact with an attractive "communication force", carried by messages (packets), which are the quanta of the force
  So loosely synchronous, microscopically static problems can be described by strings that vary slowly with time; it is sufficient to slice, every now and then, at a fixed time, and treat the strings as particles

An Initial Approach to Computational String Dynamics, or Equivalently the Construction of the Energy Function
  The true problem is to minimize, in the time direction, the length of the string of maximum length. This minimax problem can be treated directly in "full string dynamics", but here we follow the particle analogy and replace it by a least squares formulation
  Again, replace the exact constraints (e.g. a node can only do one or a few things at a time) by least squares constraints
    this is in the spirit of Hopfield and Tank
  New terms appear in the energy (in field theory language) corresponding to the creation and annihilation of strings

Full String Dynamics as an Interacting Field Theory
  One uses as dynamical variables the complete strings {x_p(t)}, or, probably better, specifies them with a neural formalism inspired by Hopfield and Tank's traveling salesperson work:
    S_p(x,t) = 0 if p is not at x at time t
    S_p(x,t) = 1 if p is at x at time t
  with the constraint Σ_x S_p(x,t) = 1 (one string in the field per thread p)
  The associated dynamics is that of interacting strings with the usual repulsive and attractive forces, plus creation and annihilation (a hedged energy sketch follows)
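A hedged sketch of how such a least-squares energy might be assembled in the S_p(x,t) variables; the particular terms and the penalty weights λ_1, λ_2 are assumptions of this illustration, not the talk's formula:

\[
E[S] \;=\; \sum_{t}\sum_{x}\Big(\sum_{p} C_p\, S_p(x,t)\Big)^{2}
\;+\; \lambda_1 \sum_{t}\sum_{p,q}\sum_{x,x'} V_{pq}\, d(x,x')\, S_p(x,t)\, S_q(x',t)
\;+\; \lambda_2 \sum_{p,t}\Big(\sum_{x} S_p(x,t)-1\Big)^{2}
\]

The first term is the least-squares load balance at each time slice, the second charges for communication between interacting threads placed on distant nodes (d is the machine distance), and the third is the least-squares replacement of the exact constraint that each thread is at exactly one node. Creation and annihilation of strings would add further terms.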
Complex Systems Suggest New Computational Methodologies
  cellular automata
  physical computation
    neural networks
    genetic algorithms
    simulated annealing
    simulated tempering

Physical Optimization and Computation: Approaches and their Fields of Origin
  Example of physical computation
    cellular automata - complex systems
  Physical (pertaining to Nature) optimization methods and their associated fields
    genetic algorithms - evolution
    simulated annealing - physics (statistical)
    neural networks - biology (low level)
    information theory (maximum entropy) - electrical engineering
    ...
    elastic networks - physics (deterministic)
    deterministic annealing - physics (deterministic)
  These can be compared with
    heuristics - the particular problem
    combinatorial optimization - mathematics
    expert systems - computer science (high level reasoning)

Genetic Algorithms for Data Decomposition (work of Nashat Mansour)
  Three major genetic operators
  A system is specified by a string of (binary) digits and evolves with a set of transitions generated by the genetic operators

MultiScale Methods in Parallel Data Decomposition
  Parallel algorithms for
    genetic algorithms
    simulated annealing
    neural networks
  are not "trivial", and are not identical to the sequential algorithms
  Large problems (the "real world") require multiscale algorithms, just as multigrid is faster than Gauss-Seidel for large PDEs (take --> log N_node, d = 2, 3)

Results of Various Physical Optimization Methods for Data Decomposition
  This compares parallel genetic algorithm (PGA), parallel simulated annealing (PSA), parallel neural network (PNN) and a wonderful heuristic (PRSB - recursive spectral bisection)

A Similar but Larger Problem
  This comes from the work of Nashat Mansour

Some Overall Questions Relevant in Classifying Optimization Problems and Methods
  What is the shape of the objective function? Correlated or uncorrelated minima?
  Do you need the exact global minimum, or an approximate minimum?
  In the approximate case, do you want the solution always to be within some tolerance of the exact solution, or only to be within tolerance on the average?

Two Types of Global Minimum and their Relation to Local Minima
  Local minima in "physics" problems are closely correlated with the true minimum
  "Computer science" grand challenges in optimization might look different

Typical Formalism for Physical Optimization
  Minimize E = E(parameters y)
    y can be continuous or discrete, or a mix
  Introduce a fake temperature T and set β = 1/T
    often T ~ the distance scale at which you look at the problem
  Probability of state y: P(y) = exp(-βE(y))/Z, where Z = Σ_y exp(-βE(y))
  As β → ∞, the minimum-E state (the ground state) dominates
  Find y(T) by minimizing the free energy F = E - TS = -(1/β) log Z

Global and Local Minima in the Temperature Dependent Free Energy
  Annealing tracks the global minimum by initializing the search at temperature T with the minima found at temperature T + dT
  (Figure: movement at a given temperature; the global minimum moving with decreasing temperature)

Comparison of Physical Optimization Methods
  simulated annealing: find y(T) by Monte Carlo, as a mean over the configurations at that temperature (a code sketch follows this comparison)
  neural networks: y is discrete; find y by the mean field approximation
  elastic net: y is discrete, but use an improved mean field including some or all of the constraints
  deterministic annealing: leave the important y out of the statistical sum and find them by simple iterative optimization
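A minimal sketch of the Monte Carlo step behind simulated annealing, for a toy discrete system. The 1D Ising-chain energy, single-spin move set and geometric cooling schedule are illustrative assumptions, not the data-decomposition application discussed in this talk:

      PROGRAM ANNEAL
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 64
      INTEGER :: S(N), I, STEP
      REAL :: T, DE, R
      CALL RANDOM_SEED()
      DO I = 1, N                          ! random initial configuration
         CALL RANDOM_NUMBER(R)
         S(I) = MERGE(1, -1, R > 0.5)
      END DO
      T = 5.0                              ! start hot
      DO STEP = 1, 200000
         CALL RANDOM_NUMBER(R)
         I = 1 + INT(R * N)                ! pick a random spin
         DE = DELTAE(I)                    ! energy change if it flips
         CALL RANDOM_NUMBER(R)
         IF (DE <= 0.0 .OR. R < EXP(-DE / T)) S(I) = -S(I)  ! Metropolis
         T = T * 0.99997                   ! slow geometric cooling
      END DO
      PRINT *, 'final energy =', ENERGY()
      CONTAINS
         REAL FUNCTION DELTAE(I)
            ! E = -sum_i S(i)*S(i+1) on a periodic chain, so flipping
            ! S(I) changes E by 2*S(I)*(left neighbour + right neighbour)
            INTEGER, INTENT(IN) :: I
            DELTAE = 2.0 * REAL(S(I) * (S(1 + MODULO(I - 2, N)) &
                                      + S(1 + MODULO(I, N))))
         END FUNCTION DELTAE
         REAL FUNCTION ENERGY()
            INTEGER :: J
            ENERGY = 0.0
            DO J = 1, N
               ENERGY = ENERGY - REAL(S(J) * S(1 + MODULO(J, N)))
            END DO
         END FUNCTION ENERGY
      END PROGRAM ANNEAL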
Some Applications of Deterministic Annealing
  general, travelling salesman, quadratic assignment - Durbin and Willshaw, Frean, Yuille, Simic
  scheduling - Peterson and Soderberg (high school classes), Johnston (Hubble Space Telescope)
  track finding - Rose, (Fox, Gurewitz), Ohlsson, Peterson, Yuille
  robot path planning, navigation - Fox
  character recognition - Hinton
  image analysis - Geiger, Girosi
  clustering, vector quantization (coding), electronic packaging - Rose et al.

Simulated Tempering - a New Approach to Monte Carlo Optimization / Simulated Annealing
  A new method, Simulated Tempering, by Marinari (Rome, Syracuse) and Parisi (Rome)
  Conventional Monte Carlo methods do not work well for the Random Field Ising Model (RFIM)
    spins s_i live on a 3D lattice; the s_i = ±1 are to be determined
    fixed magnetic field h_i = |h| r_i, with |h| = 1 and r_i = ±1 at random with probability 1/2

Conventional Simulated Annealing and its Problems for the Random Field Ising Model
  What sort of NP-complete optimization problems are like the RFIM?
  Normally in simulated annealing one chooses a sequence β_m = 1/T_m, increasing with m = 0, 1, 2, ...
  At each β_m the system is equilibrated according to the weight function P_m ∝ exp(-β_m E)
  Monte Carlo gets stuck in local minima (which differ from the true minima by approximately the square root of the system size)
  This local-minimum problem is particularly severe for the RFIM, which has several very widely separated, sharp "almost-global" minima
  Traditional annealing copes with the case where the local minima are clustered around a single global minimum

Key Idea in the Tempering Approach
  Add β to the list of dynamical variables; β is chosen out of the ladder {β_m} (β_1 < β_2 < β_3 < β_4 < β_5, increasing) with probability given by
    Prob(β_m, {s}) ∝ exp(-β_m E({s}) + g_m)
  Now the system can move up in temperature and jump over barriers
  One does not have to decide on an a priori annealing schedule, i.e. the list of β_m and the computer time to be spent at each β_m. Goodbye!

Many Choices - Which is Best When?
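Returning to the tempering step above: with the joint weight Prob(β_m, {s}), the update of the temperature index m is itself an ordinary Metropolis move. A minimal sketch, assuming a user-supplied ladder BETA and offsets G (the proposal of m ± 1 follows Marinari and Parisi, but the code details are illustrative):

      SUBROUTINE TEMPER_STEP(M, E, BETA, G, NBETA)
      ! One Metropolis update of the temperature index M under the
      ! joint weight exp(-BETA(M)*E + G(M)). E is the current
      ! configuration energy; BETA and G are assumed to be supplied
      ! by the surrounding tempering program.
      IMPLICIT NONE
      INTEGER, INTENT(INOUT) :: M
      INTEGER, INTENT(IN) :: NBETA
      REAL, INTENT(IN) :: E, BETA(NBETA), G(NBETA)
      INTEGER :: MNEW
      REAL :: R, DLOGP
      CALL RANDOM_NUMBER(R)
      MNEW = M + MERGE(1, -1, R > 0.5)        ! propose a neighbouring beta
      IF (MNEW < 1 .OR. MNEW > NBETA) RETURN  ! reject moves off the ladder
      DLOGP = -(BETA(MNEW) - BETA(M)) * E + (G(MNEW) - G(M))
      CALL RANDOM_NUMBER(R)
      IF (R < EXP(MIN(0.0, DLOGP))) M = MNEW  ! accept with min(1, e^DLOGP)
      END SUBROUTINE TEMPER_STEP

Between such moves the spins {s} are updated by ordinary Metropolis sweeps at the current β_m; the offsets g_m are tuned (roughly g_m ≈ β_m F(β_m), the free energy at that temperature) so that every rung of the β ladder is visited with comparable probability.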