CPS713 August--December 1994 Description

CPS 713 August to December 1994 - Case Studies in Computational Science -- Overview of Three Areas
Geoffrey Fox
NPAC
Syracuse University
Syracuse NY 13244-4100

Abstract of CPS 713 - 1994 - Overview of Three Case Studies in Computational Science
These foils contain the overview of the three areas:
  I) Statistical Physics and Optimization
  II) Computational Fluid Dynamics and Numerical Relativity, i.e. the solution of partial differential equations
  III) Some Technologies and Applications of the Information Age
This was meant to be enough information to allow students to choose which area to do their project in, as the project had to be chosen after this overview but before any detailed discussion of any of the areas.

CPS 713 August to December 1994 - Course Structure
Instructor: Geoffrey Fox 4432163
No Grader -- Course graded by projects.
One project per student, to be decided by September 12 when we can discuss options in class.
Initially we will discuss the topics to be covered in enough detail that students will be able to choose a project synergistic with course content.
Nancy McCracken 4434687 is backup when I am not available.
Course CPS713 is structured as 3 Case Studies: first we overview the 3 areas and then we discuss them sequentially in detail
  I) Statistical Physics and Optimization
  II) Computational Fluid Dynamics and Numerical Relativity, i.e. the solution of partial differential equations
  III) Some Technologies and Applications of the Information Age

CPS 713--Case Study I August to December 1994 - Case Studies in Computational Science -- Overview of First Area: Applications of Monte Carlo
Geoffrey Fox
NPAC
Syracuse University
Syracuse NY 13244-4100

Overview of CPS713 Case Study I) Statistical Physics (and Optimization)
Statistical approaches underlie most approaches to the study of physical systems with many particles, as detailed dynamics become both impossible and irrelevant.
The Central Limit Theorem or Law of Large Numbers implies that most observable quantities depend only on average properties of the system.
We will give a list of topics in the course, labelling them
  T for technology or computer science oriented
  C for computationally oriented
  A for application (physics) oriented
Note there is a natural link T --> C --> A, and the boundaries T to C and C to A are "grey", so the classification is correspondingly imprecise.

Overview of CPS713 Case Study I) Relation of Statistical Physics and Optimization
Further, one can nearly always formulate the detailed laws of physics as "variational principles" with integrals over possible states.
In the case of high energy physics (QCD or Quantum Chromodynamics, the theory of fundamental particles), this variational principle is Feynman's path integral.
Using Monte Carlo methods for these variational principles leads to a formulation just like statistical physics.
Note QCD is a theory of particles moving in 3D space but is formulated as integrals over a four dimensional space time, i.e. it looks like a 4D statistical problem.

Overview of CPS713 Case Study I) Relation of Statistical Physics and Optimization -- Contd
Variational principles minimize functions, and generally Mother Nature minimizes energy. Thus statistical physics is (nearly) always solving optimization problems.
We can turn this around and use physical analogies, i.e. "physical optimization" methods, to solve general optimization problems.
Simulated Annealing, Simulated Tempering, Genetic Algorithms, and neural networks are some of the algorithms in this class.
Note the so called NP Complete problems in optimization; physics is NP complete (one cannot solve Nature's equations in polynomial time) and so physical optimization is very suitable for NP complete optimization problems.
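As a rough illustration of "physical optimization", the following minimal sketch applies simulated annealing to a small travelling salesperson instance. It is not taken from the course material: the random city coordinates, the segment-reversal move, the geometric cooling schedule and all parameter values are assumptions chosen only to keep the example short.

```python
import math
import random


def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def simulated_annealing(points, t_start=1.0, t_end=1e-3, cooling=0.995,
                        moves_per_temp=50, seed=0):
    """Anneal a tour using segment-reversal moves and Metropolis acceptance."""
    rng = random.Random(seed)
    n = len(points)
    tour = list(range(n))
    energy = tour_length(points, tour)          # "energy" = tour length
    best_tour, best_energy = tour[:], energy
    temperature = t_start
    while temperature > t_end:
        for _ in range(moves_per_temp):
            i, j = sorted(rng.sample(range(n), 2))
            candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
            cand_energy = tour_length(points, candidate)
            delta = cand_energy - energy
            # Metropolis rule: always accept downhill moves; accept uphill
            # moves with probability exp(-delta / T).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                tour, energy = candidate, cand_energy
                if energy < best_energy:
                    best_tour, best_energy = tour[:], energy
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best_tour, best_energy


# Hypothetical test instance: 12 random cities in the unit square.
city_rng = random.Random(1)
cities = [(city_rng.random(), city_rng.random()) for _ in range(12)]
initial_length = tour_length(cities, list(range(len(cities))))
best_tour, best_length = simulated_annealing(cities)
```

At high temperature almost every move is accepted, while at low temperature the loop behaves like greedy descent; the cooling schedule controls the trade-off between the two regimes.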

Case Study I) Topics -- Basic Algorithms and Application to the Ising Model
A: Why Statistical Processes can be used to describe Nature, with examples
T: Monte Carlo Integration (review from CPS615)
A: Introduction to Spin Models with the Ising model (an array of two component, up and down, entities) as simplest example
A: Phase Transitions as a pervasive property of systems and common focus of statistical physics computations
  Ice -- Water transition is a typical physics phase transition
  Breakup of Eastern Europe in 1990 is an example of a phase transition in a "complex system"
C: Use of Monte Carlo Methods in Statistical Physics
T: Importance Sampling -- general method of improving efficiency of computation
T: Markov Chains
A: Detailed balance as property of equilibrium in a physical system
T: Metropolis Algorithm -- excellent importance sampling is needed since incredibly peaked functions are found in statistical physics Monte Carlo integrals.

Case Study I) Topics -- Statistical Physics Computations in the Metropolis Approach
C: Application of Metropolis Algorithm to the Ising Model
C: Software Implementing Metropolis for the Ising Model
C: After we have generated some physical states by Metropolis, what do we do?
T: Calculate Integrals over these states as integration points with suitable functions of the state
  Examples are Energy, Correlations
C: This is called the measurement process
A: Study of Phase Transitions

Case Study I) Topics -- Computational Issues in the Metropolis Approach
C: Tricks of the trade: Computational Issues
C: Thermalization (are we in equilibrium?)
C: Ergodicity (does our method (eventually) visit all points in the space to be integrated over?)
A: Finding critical exponents (characterizing (divergent) behavior of measurements near critical points)
A: Finite Size effects (the simulated system is necessarily much, much smaller than the world being modelled)
A: Boundary Conditions (what to do at the edge of the simulated system)
T: Pseudo-Random Numbers (to be discussed in detail later and in CPS615)
C: Acceptance ratio -- trade-off between safe small changes, which don't change the state very fast, and large changes, which may be rejected as they fall off the "ridge" of large values of the integrand.
A: Critical Slowing Down -- present in nature and simulation

Case Study I) Topics -- Critical Slowing Down and Methods to accelerate Computation
A: Critical Slowing Down -- systems change slowly near critical points as "domains form"
C: Large length scales are not well represented by computations such as Metropolis which only change one site at a time.
T: Acceleration Methods
  T: Over-Relaxation
  T: Multigrid or more generally hierarchical scale methods (A: Renormalization group)
  T: Fourier Transform
  T: Cluster Methods -- potentially most powerful
A: Potts Model -- Spin systems with more than two components
A: Percolation -- Physical system with properties related to some cluster determination methods
T: Cluster Methods in Detail
  A/T: Swendsen-Wang Cluster Algorithm
    Note one needs physics insight to be able to define computationally useful clusters
    Cluster identification is not (just) a computer science problem
  A/T: Wolff Cluster Algorithm
  T: Connected Component Labelling -- needed in all cluster methods and a "pure" computer science problem

Case Study I) Topics -- State Generation in Parallel Metropolis Algorithm
T: Basic Parallel (single site at a time) Metropolis Algorithm
  Analogous to red-black issues in Gauss-Seidel for Iterative PDE Solution
  Historically this was the initial algorithm we studied and the application (algorithm) inspiration in 1981 for the hypercube project at Caltech.
  My graduate students discovered the basic issues needed for parallel Metropolis for QCD and various spin model statistical physics simulations.
C: Software for Parallel Metropolis for Spin Model
A: Study of Ising Model with Parallel Metropolis
C: Efficiency of Parallel Metropolis Algorithm

Topics Covered in CPS713 Case Study I) The Complete Parallel Metropolis Algorithm
T: Parallelization of Measurement Phase of computation
  Metropolis is the initial state generation phase
T: Comparison of Statistical Mechanics and PDE Solvers
T: Parallelization by replication of the problem in different nodes of a MIMD parallel machine
  As opposed to solving a single problem and decomposing the domain of the physical system
  Replication can be thought of as parallelization over random number generation
  Replication requires that each node can hold the full physical system, which is not always possible -- certainly impossible for QCD where one state fills the memory of a supercomputer.

Topics Covered in CPS713 Case Study I) Parallel Cluster Algorithms
Note these are quite hard as clusters are dynamic and irregular in shape.
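To make the labelling step concrete, here is a minimal serial sketch of connected component labelling on a small spin lattice using union-find. It is an illustration only: it joins every pair of neighbouring like spins (open boundaries), whereas Swendsen-Wang would activate such bonds only with a temperature-dependent probability, and the example lattice is invented.

```python
def label_clusters(spins):
    """Label connected clusters of equal spins on a 2D grid (open boundaries).

    Returns a grid of integer labels; two sites share a label exactly when
    they are joined by a path of nearest-neighbour sites with the same spin.
    """
    rows, cols = len(spins), len(spins[0])
    parent = list(range(rows * cols))

    def find(a):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller site index as the cluster label.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    for r in range(rows):
        for c in range(cols):
            site = r * cols + c
            if c + 1 < cols and spins[r][c] == spins[r][c + 1]:
                union(site, site + 1)        # bond to right neighbour
            if r + 1 < rows and spins[r][c] == spins[r + 1][c]:
                union(site, site + cols)     # bond to neighbour below

    return [[find(r * cols + c) for c in range(cols)] for r in range(rows)]


# Hypothetical 3 x 4 configuration of up (+1) and down (-1) spins.
spins = [
    [+1, +1, -1, -1],
    [+1, -1, -1, +1],
    [-1, -1, +1, +1],
]
labels = label_clusters(spins)
num_clusters = len({label for row in labels for label in row})
```

The irregular shapes visible even in this tiny example hint at why a domain-decomposed parallel version is awkward: a cluster can span several processors' subdomains, so labels have to be reconciled across boundaries.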
Further, parallelism is sometimes restricted because one has to cope with modest numbers of clusters.
T: SIMD and MIMD Parallel Cluster Labelling
T: Relation of Cluster Determination and Region Finding in Image Processing
T: Multigrid and Hierarchical Labelling Methods
T: Shiloach-Vishkin SIMD Method (this problem is a classic in the theoretical computer science community)
T: Parallel Algorithms for growing a single cluster

Topics Covered in CPS713 Case Study I) Spin Glasses and Optimization
A: Spin Glass and other Generalizations of Ising Model
C: Monte Carlo Simulation of Spin Glasses
T: Optimization via Simulated Annealing
T: Simulated Quenching and other annealing Schedules
T: Neural Networks as mean field approximation to Simulated Annealing
T: Parallel Simulated Annealing
T: Simulated Tempering for
  A: Statistical Mechanics of Random Field Models
  T: Optimization

Case Study I) Topics -- Other Physical and Practical Optimization Methods
T: The Travelling Salesperson Problem and NP Complete Optimization
T: Branch and Cut -- the best known way of solving the TSP
A: Classic airline scheduling problems
T: Linear Programming -- main practical optimization method -- and its generalizations
T: Genetic Algorithms
  When are these competitive with other approaches?
A: Some real world Optimization Problems
  A: Constrained (avoid obstacles) motion of Robot arms
  A: Scheduling of University Classes
  A: Datamining (see Case Study III) on databases in the information arena
T: General theory of Physical Computation and Complex Systems

Case Study I) Topics -- Computational Formulation of Quantum Chromodynamics (QCD)
C/A: Historical and Physics Background on QCD and its Simulation
A: A short description of High Energy Physics: The Study of the fundamental Description of Matter
A: Gauge Theories and QCD as a gauge theory
  Electromagnetism (Maxwell's equations) is the simplest gauge theory
C: Gauge Theories on a lattice and relation between Lattice Gauge Theory and Statistical
Physics
C: The difficult limit from Lattice to Continuum theory
C: Measurements on a Lattice

Case Study I) Topics -- Computation of Quantum Chromodynamics (QCD)
T: Parallel Computation of both Sites and Measurements in Lattice Gauge Theories
A: Current Status of Computation of particle masses
  Simulation agrees (surprisingly) well with experimental measurements
  Monte Carlo uses a lot of supercomputer time, as the statistical error falls only as the inverse square root of the computer time used
C: Adding Fermions (quarks) to a theory of Gluons (the "carrier" of the strong interaction force)
  Makes computation much lengthier but doesn't change (some) results much
  Needed in any true theory of Nature
T: Hybrid Monte Carlo
  Combination of deterministic (equation of motion) and Monte Carlo Methods

Case Study I) Topics -- Numerical Computation of Random Numbers
T: Basic Methods for Pseudo Random Number Generation -- one doesn't use true random numbers!
  T: Linear Congruential
  T: Multiple Linear Congruential
  T: Lagged Fibonacci
T: Parallel Random Number Generation
T: Testing Quality of Pseudo Random Number Generators
  Several methods have unacceptable correlations
  Note a QCD computation could use 10^12 separate random numbers, and so the more powerful the computer, the more random numbers one will generate and the greater the accuracy and period needed.

CPS 713--Case Study II August to December 1994 - Case Studies in Computational Science -- Overview of Second Case Study: CFD and Numerical Relativity
Geoffrey Fox
NPAC
Syracuse University
Syracuse NY 13244-4100

Abstract for CPS713 Case Study II) Computational Fluid Dynamics and Numerical Relativity
CFD (Computational Fluid Dynamics) and NR (Numerical Relativity) both involve the solution of second order partial differential equations (PDE's) describing physical phenomena.
This case study will study both applications and then look at the computer science (computational) issues which are both common and distinct.
This will allow us to study the requirements of a computational toolkit for the general solution of second order PDE's.
These two applications are by no means the only applications, but they cover a broad range of issues.
CFD can be defined narrowly as confined to aerodynamic flow around vehicles, but it can be generalized to include such areas as weather and climate simulation, flow of pollutants in the earth, and flow of liquids in oil fields (reservoir modelling).

Further Remarks on CPS713 Case Study II) Computational Fluid Dynamics and Numerical Relativity
Our numerical relativity example will be the collision of two black holes, which is the focus of a major NSF Grand Challenge involving NPAC with seven other institutions in a collaboration led by Richard Matzner at Texas.
Note that Numerical Relativity involves solution of Einstein's equations, which include as a special case Maxwell's equations used to describe electromagnetic phenomena.
Thus issues of relevance to computational electromagnetics (used in the study of antennas and radar cross-sections of military aircraft) are implicitly included in this case study.
Nearly all partial differential equations which are commonly encountered can be found by choosing parameters or limits in CFD or NR.

Overview of Topics in CPS713 Case Study II) Computational Fluid Dynamics and Numerical Relativity
Motivation for CFD -- why we need Teraflop and Petaflop performance to design new aircraft and model oil reservoirs and polluted chemical dump sites
Introduction to NAS benchmarks and remarks on performance of today's machines.
Motivation of Numerical Relativity with Collision of two black holes as expected signature for LIGO gravitational wave detector -- expected need for Teraflop performance
General Discussion of Continuum Physics as a model for Nature
The Navier Stokes Equation -- Basic equations of CFD
General discussion of Computational Issues for CFD
Introduction to Numerical formulation of Einstein's Equations
General discussion of computational issues for Numerical Relativity
Comparison of Similarities and differences between CFD and NR

CPS713 Case Study II) CFD+NR -- Motivation of The NAS Benchmarks
The NAS benchmarks were introduced by a group at NASA Ames as a novel approach to benchmarking high performance machines.
Most benchmarks (call these software benchmarks) are presented as specific pieces of software which must be run on a target machine to measure its performance
  Software benchmarks involve both the computer and the compiler (assuming the benchmark is in a language such as Fortran, C, C++ or Ada).
  Further, they implement a particular algorithm to solve the problem.
Particular high performance computers may need different algorithms and/or a different language to achieve good performance
  For instance High Performance Fortran would be needed on a parallel machine instead of simple Fortran77
  The benchmark code may implement in a high level language a basic algorithm such as FFT or matrix solve, while any practical code would use the optimized (assembly language) subroutines.
  The basic algorithm might need to be changed due to the architecture of the machine, e.g. SIMD or vector architectures typically need different optimal algorithms from MIMD machines.
The innovation in the NAS benchmarks was a "pencil and paper" (i.e. mathematical) definition of each benchmark, so each implementation could be optimized (with certain rules) for the target computer as appropriate.
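The "pencil and paper" idea can be sketched in miniature as follows. This is not one of the NAS benchmarks: the toy problem (a tridiagonal system), the two solvers and the residual check are assumptions chosen to show how one mathematical definition plus one verification rule can admit quite different implementations.

```python
def make_problem(n):
    """Mathematical definition: solve T x = b, T = tridiag(-1, 2, -1), b = 1."""
    return n, [1.0] * n


def residual_norm(x):
    """Verification rule shared by every implementation: max |T x - b|."""
    n = len(x)
    worst = 0.0
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        worst = max(worst, abs(2.0 * x[i] - left - right - 1.0))
    return worst


def solve_thomas(n, b):
    """Direct solver (Thomas algorithm): inherently sequential sweeps."""
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = -1.0 / 2.0
    d_prime[0] = b[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c_prime[i - 1]          # diag - lower * c'[i-1]
        c_prime[i] = -1.0 / denom
        d_prime[i] = (b[i] + d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def solve_jacobi(n, b, sweeps=500):
    """Iterative solver (Jacobi): every site updates independently per sweep."""
    x = [0.0] * n
    for _ in range(sweeps):
        x = [
            0.5 * (b[i]
                   + (x[i - 1] if i > 0 else 0.0)
                   + (x[i + 1] if i < n - 1 else 0.0))
            for i in range(n)
        ]
    return x


n, b = make_problem(8)
x_thomas = solve_thomas(n, b)
x_jacobi = solve_jacobi(n, b)
```

Both solvers are judged only by `residual_norm`; which one is faster depends on the machine, which is the point of separating the definition from the implementation.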

CPS713 Case Study II) -- Use of The NAS Benchmarks
The NAS Benchmarks are described in a basic document: "NAS Parallel Benchmarks", RNR-94-007, March 1994, by D. Bailey et al., updating the original 1991 document.
"NAS Parallel Benchmark Results 3-94", RNR-94-006, by D. Bailey, E. Barszcz, L. Dagum, and H.D. Simon contains the latest but obviously ephemeral results.
See also http://www.nas.nasa.gov/RNR/Parallel/NPB/NPBindex.html.
The material used in class selects results from 4 benchmarks described in detail in the above citations:
  EP: Embarrassingly parallel -- no communication -- tests node performance
  The remaining three benchmarks involve significant communication.
  CG: Conjugate Gradient -- basic PDE Solution method discussed in CPS615
  SP: Pentadiagonal Solver -- model of a full CFD solver
  BT: Block Tridiagonal PDE Iterative solver -- another model of a full CFD solver
There are several striking results in the NAS results, with SGI Power Challenge and IBM SP2 leading the way in performance per node.
The full document discusses performance per dollar and the full set of benchmarks.
We will later use the mathematical definition of CFD contained in the BT and SP benchmarks as a way of indicating key computer science issues in CFD.
We will generalize this approach and try to specify numerical relativity in a similar fashion.

CPS713 Case Study II) -- Overview of Computational Toolkit Issues
We will use AIAA-94-2249 "Computational Toolkit for Colliding Black Holes and CFD" by N.P. Chrisochoides, G.C. Fox and T. Haupt as an overview of the issues that need to be studied in producing a PDE (CFD and NR) computational toolkit
  Example of multidisciplinary design as a metaproblem needing integration of many disparate software modules (sec. 1)
  Study of ELLPACK and issues in modular programming for PDE solution. Relevance of domain specific interfaces built using Mathematica or similar package (sec. 2)
  Software Integration issues needed in metaproblems (sec. 3)
  Review of programming paradigms (sec.
4.1)
  High Performance Fortran for individual modules (sec. 4.2)
  Runtime Support from libraries, message passing to programming environment (sec. 5)
  Parallel Grid Generation for Structured (NR) and Unstructured (CFD) meshes (sec. 6)

CPS713 Case Study II) -- Remainder of Basic Module (What will be in The Long Discussion of Subject after one lecture Overview)
Relate CFD to NAS benchmark discussion
Review of various PDE solution methods and their parallel implementations
Discuss in detail numerics and parallel implementation of NAS CFD model problems
Present NR in NAS benchmark form
Discuss model problems from wave equation to simple realistic NR problem
Return to general toolkit Issues:
  Current status of MPI and PVM -- standard message passing systems
  Comparison with thread based approaches (PORTS Collaboration)
  Irregular adaptive Compiler (High Performance Fortran) Language and runtime support
  Parallel Grid Generation -- Integration with HPF and HPC++
  Software Integration with AVS and Fortran-M (CC++)

CPS713 Case Study II) Overview Features of Numerical Relativity
Numerical Relativity is a coupled set of partial differential equations
  Direct formulation has 10 independent equations
  Some implementations have up to 50 coupled functions
Equations can be divided into two classes
  4 Elliptic (Laplace equation-like) constraint equations which must be satisfied at each time
  6 coupled Hyperbolic (Wave equation-like) equations describing time evolution
For CFD, the "physics" determines computational issues, e.g. a shock represents a rapidly varying solution which typically requires an adaptive irregular mesh and finite element solution method
For Numerical Relativity, one can change the nature of the solution by changing "space itself"
  This is called choosing the Gauge
  One gauge could have rapidly varying fields and require adaptive finite elements
  Another Gauge could have slowly varying fields and be soluble with finite differences.
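For the simplest hyperbolic model problem, the wave equation, a finite difference solution can be sketched as below. This is a minimal illustration, not NR code: it assumes a 1D string on [0, 1] with fixed ends, a standing-wave initial condition and a leapfrog (centred in time and space) scheme; the grid size and Courant number are arbitrary choices.

```python
import math


def evolve_wave(nx=101, courant=0.5, steps=400, c=1.0):
    """Leapfrog finite differences for u_tt = c^2 u_xx on [0, 1], u = 0 at ends."""
    dx = 1.0 / (nx - 1)
    dt = courant * dx / c          # stable for courant <= 1
    r2 = courant ** 2
    x = [i * dx for i in range(nx)]

    # Initial displacement sin(pi x) with zero initial velocity.
    u_prev = [math.sin(math.pi * xi) for xi in x]
    u_prev[0] = u_prev[-1] = 0.0

    # First time step from a Taylor expansion (zero initial velocity).
    u_curr = u_prev[:]
    for i in range(1, nx - 1):
        u_curr[i] = u_prev[i] + 0.5 * r2 * (
            u_prev[i + 1] - 2.0 * u_prev[i] + u_prev[i - 1])

    # Remaining steps: u^{n+1} = 2 u^n - u^{n-1} + r^2 * (second difference).
    for _ in range(steps - 1):
        u_next = [0.0] * nx
        for i in range(1, nx - 1):
            u_next[i] = (2.0 * u_curr[i] - u_prev[i]
                         + r2 * (u_curr[i + 1] - 2.0 * u_curr[i] + u_curr[i - 1]))
        u_prev, u_curr = u_curr, u_next

    return x, u_curr, steps * dt


x, u, t_final = evolve_wave()
# Exact standing-wave solution for comparison: sin(pi x) cos(pi c t), c = 1.
exact = [math.sin(math.pi * xi) * math.cos(math.pi * t_final) for xi in x]
max_error = max(abs(a - b) for a, b in zip(u, exact))
```

With a smooth, slowly varying solution like this one, a uniform finite difference grid is adequate; a rapidly varying solution would motivate the adaptive meshes mentioned above.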

CPS713 Case Study II) Comparison of Numerical Relativity with Maxwell's Equations
The Magnetic Field is given in terms of the vector potential by $\mathbf{B} = \nabla \times \mathbf{A}$
Components of the vector potential are not independent, as expressed by the gauge transformation $\mathbf{A} \to \mathbf{A} + \nabla \Lambda$ for any field $\Lambda$
Choosing [expression not recovered from source] implies choosing gauge
The Constraint equations are: [equations not recovered from source]
The evolution Equations are: [equations not recovered from source]
These give waves at infinity

CPS713 Case Study II) Computational Features of Numerical Relativity
The theory is very nonlinear:
  For instance one of the elliptic constraints can be written: [equation not recovered from source]
  CFD is also nonlinear, as it has terms such as the advective term $(\mathbf{v} \cdot \nabla)\mathbf{v}$
Boundary Conditions at Infinity are those of computational electromagnetics (CEM), not those familiar from CFD
  One needs to look for wave solutions, and these waves are precisely what the LIGO experiment will detect
  Curiously I see that methods used in CEM, such as the method of moments, are not being used in Numerical Relativity
  NR solves evolution equations and identifies the oscillatory wave solution
Waves are very sensitive to numerics -- small numerical errors can be amplified as the wave propagates
  Numerical approximation can introduce an effective dissipative term which has a small coefficient but is enhanced by the large propagation distance

CPS713 Case Study II) Computational Features of Numerical Relativity (Contd) -- Singularity Structure
There are no small coefficients of second order derivative terms
  In CFD a small coefficient proportional to viscosity led to rapidly varying fields, so that the product of viscosity times second derivative was comparable in size to the other terms in the CFD equations
  This leads to shocks, boundary layers, turbulent flow in CFD i.e.
  CFD has singularities which are of lower dimension than the solution space
Numerical Relativity has the world's most significant singularity -- Black Holes
  Otherwise singularities are volume based and not like shocks
Correspondingly Numerical Relativity can use finite difference methods
  One does need adaptive block structured meshes but probably not unstructured meshes
  Can use Finite Elements (FEM) and these may be preferable -- CFD is more or less required to use FEM

CPS713 Case Study II) Computational Features of Numerical Relativity (Contd) -- Black Hole Boundary Condition
Boundary conditions at Black Holes involve physics and numerics
  Certainly the finite difference mesh needs some sort of special treatment
  Physics says that any information inside the black hole is irrelevant: it cannot get out and so cannot affect the solution outside the hole
  However we don't know where the Black Hole is until we get the full solution!
This issue is a "Show-Stopper". It may be that difficulties in this area will prevent reliable solutions without a major algorithmic breakthrough or new physics insight
Other NR issues are hard, but there is no reason why we shouldn't solve them reasonably well

CPS713 Case Study II) Computational Needs for CFD and Numerical Relativity
It is possible that a teraflop computer may be sufficient to "solve" the collision of two black holes
  Solution defined as producing a catalog of wave forms which can be used in analysis of initial LIGO data
  This is of course not the only important issue, and other studies may require more or less computer time
NASA has carefully documented estimates of computer needs for various CFD approaches, although it is not entirely clear which CFD approaches are actually relevant to the design of new aircraft or cars.
  Better -- this would mean using more accurate solutions to the full equations
  Faster -- this is perhaps more promising and requires integration of several distinct simulations in the full design cycle
  This Multidisciplinary analysis and design underlies agile manufacturing or concurrent engineering
Snapshot results are:
  Petaflop performance needed for full Navier Stokes equations for full aircraft
  Teraflop performance needed for Multidisciplinary Analysis using Reynolds averaged approximations with turbulence models

CPS713 Case Study II) Some Common Issues between CFD and Numerical Relativity
Both problems are coupled systems of second order partial differential equations
Both can involve solution of metaproblems -- coupling between different systems of equations
  In CFD, Metaproblems are seen in Multidisciplinary Analysis and Design
  In NR, the Metaproblem is seen in the coupling of constraint and evolution equations
Both problems require numerical experimentation to develop working codes
  Many unsolved issues requiring new physics insight implemented well numerically
  One cannot specify today the "right" approach to CFD or NR
Both problems have elliptic and hyperbolic equations
Similarities suggest we develop a toolkit applicable for either or both applications

CPS713 Case Study II) Computer Science Support for CFD and NR -- Portable Scalable Software Tools
Portable means runs on (nearly) all of today's high performance (parallel) computers
Scalable means code written today will run on future high performance machines
These current and future machines include networks of workstations as well as integrated massively parallel machines
  High Performance Fortran and C++: scalable data parallel support
  Fortran-M and CC++: scalable support of task parallelism
  AVS: industry standard for visualization and software integration
  PVM and MPI: standard message passing support
  ADIFOR: differentiates Fortran code; critical tool for optimization problems
  Prototyping Software: needs development of
  Interpreters and other tools

CPS713 Case Study II) Further Computer Science Issues for CFD and NR Computational Toolkit
Domain Specific Software where the user interfaces at the level of mathematical equations -- not C++ or Fortran
  Can build with Computer algebra tools such as Maple or Mathematica, which then must be taught to generate efficient High Performance Fortran, Fortran77 + MPI or equivalent.
  SINAPSE -- built from Mathematica at Schlumberger research
  ELLPACK -- one of the first and best PDE toolkits (for ELLiptic equations)
Runtime support and libraries
  Schedulers and data decomposition tools
  Scientific Libraries such as ScaLAPACK, which can be used directly to solve matrices coming from computational electromagnetics formulated with the method of moments
Parallel Compiler Runtime Consortium
  Integrated support for many languages on many computers
  Can mix programming paradigms such as HPC++ for convenience and elegance with Fortran77 + MPI for efficiency

CPS713 Case Study II) Specific Toolkit Modules Needed
PDE Solvers for Elliptic, Hyperbolic and mixed equations
Geometry packages to define solution space (mainly for CFD used in design of real vehicles)
Visualization including Virtual Reality (VR)
  VR used by Boeing today to study if a particular design can be serviced
  VR study of CFD or NR solution may lead to new physics insight
Optimization needed for Multidisciplinary Analysis and Design
Boundary Conditions can be application specific as they are sensitive to the physical system
  Fluid -- Vehicle Boundary
  NR Infinity boundary condition of waves
  NR Black Hole event horizon (guaranteed to be inside true black hole surface) boundary condition

CPS713 Case Study II) Specific Toolkit Modules Needed -- Parallel Grid Generation
Grid Generation has several important characteristics:
  Adaptive Block Structured for CFD and NR
  Adaptive Unstructured for CFD
  Should be consistent with multigrid and domain decomposition solution methods
  Should be integrated with High Level Language or Domain
  Specific Interface
  Should be linkable between different sets of equations
    For instance the mesh used to simulate the structure of an airframe must be consistent with the volume mesh needed for CFD study of airflow around the aircraft
  Must be able to generate and adapt "in place" and "in parallel" on a parallel machine

CPS713 Prototype of CPS714 August to December 1994 - Case Studies in Computational Science -- Overview of Information Technology Applications
Geoffrey Fox
NPAC
Syracuse University
Syracuse NY 13244-4100

Remarks on Case Study III) Some Technologies and Applications of the Information Age
This case study is a prototype for a new course, referred to informally as CPS714, which is focused on applications supporting CPS616, a new course offered first as CPS600 in Spring 95 and then as CPS616 in Spring 96
CPS615 is Computational Science for Scientific and Engineering Applications
CPS616 is proposed as Computational Science for Information-oriented applications.
CPS615 and CPS616 are aimed as base technology courses, and CPS713 fulfills the application requirement for the Syracuse University Computational Science Academic Curricula.
We have chosen from the CPS616 Curricula four broad topics for Case Study III) of CPS713 this fall
The Four Topics are:
  A: Introduction to the future NII (National Information Infrastructure) and its current prototype -- the Internet
  B: Parallel Rendering and Geographic Information Systems
  C: Parallel and Distributed Databases and related issues such as Data Mining
  D: How to Organize Information in a Multimedia Geographically Distributed High Bandwidth World.

Annotated Version (with CPS713 Case Studies) of CPS616: Technologies and Applications of the Information Age
Draft 1: Geoffrey Fox, August 7, 1994

Background
Computational science can be defined broadly as the discipline on the interface between computer science and applications of computers.
The current Syracuse course CPS615, and others nationwide, can be considered as "Computational Science for Scientific Computing" or "Technologies and applications for Scientific Computing".
The audience is both the technologists (Computer Science, Computer Engineering and Applied Mathematics) and the application fields such as Computational Chemistry, Physics and Aerospace Engineering.
We propose a new course CPS616 playing a similar role to CPS615 but aimed at the Information related applications rather than scientific computing.
At Syracuse University, application students could come from IST (Information Studies, which also covers technologies), Newhouse (Communications), Maxwell (Public Administration), VPA (Visual and Performing Arts), and Education.
Technology students are from Computer Science, Computer Engineering and IST.

Implementation of Information Track of Computational Science
We propose to offer the full course CPS600 in the first semester (January to April) of 1995, with a trial run of reduced scope as part of CPS713 (Applications of Computational Science) this fall.
We will make all teaching material available electronically and have discussed producing a textbook (electronic and conventional).
Many authors and teachers will be needed to cover the field. Not all of these teachers will be at Syracuse University, and videoconferencing may be used for part of the course.
The course is currently structured as about ten independent modules of about three to six hours per module.
We are now seeking comments and offers of help and collaboration.

Overview of Draft Curriculum
The conference proceedings "R and D for the NII: Technical Challenges", obtainable from EDUCOM (nii-forum@educom.com), is one useful general resource.
It would be important to collect other useful general and specialized reference books for teachers and/or students.
There are currently 10 modules listed below.
Possible material for each module will be found in the next sections.
  1) The Internet and Specialized Testbeds as Prototypes of the GII (Global Information Infrastructure)
  2) Physical Network
  3) The Consumer Multimedia Enterprise: Multimedia Videogames, PC's, Settop boxes, and Workstations
  4) Digital Media: Audio, Video, Graphics and Images
  5) User, Application and Service Interfaces
  6) Client and Server High Performance Multimedia Computer Requirements and Architecture
  7) Base Software and Systems Architecture of the GII
  8) Pervasive and Niche Applications for the GII
  9) Generic Services and Middleware on the GII
  10) The Emerging GII Enterprise in Industry, Academia and Society

1: Curriculum of Module: Internet and Specialized Testbeds as Prototypes of the GII (Global Information Infrastructure)
  1) What is Internet including History, Phenomenology and base Technologies ** CPS 713 Topic A
  2) Learn to use gopher, Mosaic etc. ** CPS 713 Topic A
  3) Peruse examples of text, image, video, Information systems ** CPS 713 Topic A
  4) How to prepare and convert HTML, JPEG, MPEG ** CPS 713 Topic A
  5) Gigabit Testbeds

2: Curriculum of Module: The Physical Network
  1) Local Home Delivery -- THE GII Offramps -- Copper pair, coax, fiber, wireless, Cellular, ADSL
  2) Trunk Transmission -- fiber, Satellite
  3) Switching -- ATM, ISDN
  4) Architectures: Cable and Telephone Company, Distributed, Centralized, Multivendor, Military (Global Grid)

3: Curriculum of Module: The Consumer Multimedia Enterprise: Multimedia Videogames, PC's, Settop boxes, and Workstations
  1) CD-ROM
  2) Settop Box
  3) CD-I, 3DO, Nintendo, Sega, Atari (Jaguar)
  4) Specialized Hardware: DVI, Video Accelerator cards
  5) SGI and other high end systems
  6) Multimedia Authoring
  7) Edutainment
  8) Anatomy of selected videogames and Multimedia titles: SIMCITY, MYST, NBA Jam, Crash and Burn, Mortal Kombat, Encarta

4: Curriculum of Module: Digital Media: Audio, Video, Graphics and Images
  1) Rendering and Modeling ** CPS 713 Topic B
2) Photo-CD
3) Compression of Images, Video, Audio and Text -- MPEG, JPEG, Wavelet, Fractal
4) Individual and "crowd" display technology
5) Computer Animation for movies such as Jurassic Park
6) Video browsing
7) Video indexing -- speech recognition
8) Displays: HDTV

5: Curriculum of Module: User, Application and Service Interfaces
1) Virtual Reality
2) X, Motif
3) Mosaic and its future
4) ATM Layers (AAL)
5) Interfaces for real world users such as children

6: Curriculum for Module: Client and Server High Performance Multimedia Computer Requirements and Architecture
1) Multimedia Clients (see module 3)
2) Parallel Video and other Information servers ** CPS 713 Topic C
3) Parallel I/O Issues ** CPS 713 Topic C
4) Disk and Archival Storage Issues ** CPS 713 Topic C
5) Specialized versus General Purpose Architectures (Workstation, Mainframe, Teradata, nCUBE, IBM SP-2 and equivalent) ** CPS 713 Topic C

7: Curriculum for Module: Base Software and Systems Architecture of the GII
1) World Wide Web -- URL and futures
2) Network Protocols, Management and Switching -- data transport
3) What is right/wrong with TCP/IP, PVM, MPI, ISIS etc.
4) Fault Tolerance
5) Distributed Operating Systems
6) Televirtuality
7) Network Resource Allocation
8) Caching

8: Curriculum for Module: Pervasive and Niche Applications for the GII
1) Movies on Demand
2) Interactive TV
3) Digital Library ** CPS 713 Topic D
4) Telemedicine
5) Education
6) Global Grid (Defense)
7) Commerce
8) Manufacturing
9) Distributed Scientific Computing

9: Curriculum for Module: Generic Services and Middleware on the GII
1) Parallel and Distributed Databases ** CPS 713 Topic C
2) Security, Privacy -- cipher/decipher
3) Collaboration -- distributed whiteboards etc.
4) Digital cash
5) Decision Support and Datamining Tools ** CPS 713 Topic C
6) Geographic Information Systems -- Terrain data ** CPS 713 Topic B
7) Organization of Material in Multimedia Systems on the World Wide Web with URL's -- the nonlinear Information Model ** CPS 713 Topic D

10: Curriculum for Module: The Emerging GII Enterprise in Industry, Academia and Society
1) Early (successful) commercial services
2) Convergence of industries
3) Convergence of Academic Fields ** CPS 713 Topic A
4) Convergence of Computing and Communication ** CPS 713 Topic A
5) What (if anything) will happen to society from the GII -- Quality of Life, Jobs, Education -- are there important negative implications? ** CPS 713 Topic A
6) Intellectual property rights on the GII
7) What information is available now (free or for money) and what could be made available ** CPS 713 Topic D
8) Current Internet Assets ** CPS 713 Topic D
9) Kodak Picture Exchange

Remarks on CPS713 Case Study III) Topic B: Geographic Information Systems
Geographic Information Systems (called GIS) are of growing importance in areas such as:
Defense, where they underlie Mission planning and related systems
Commerce, for City Planning and for companies such as Power Utilities whose business involves spatially labelled assets such as gas lines, health care clinics etc.
NASA's Mission to Planet Earth will dramatically increase the availability of data such as that obtained today from Satellites such as LANDSAT and SPOT.
Their multispectral data can be used for many applications such as studying the state of the environment, crop growth etc.
A GIS will overlay such Satellite data on a background map.
Other GIS functions are often typical "scientific computing" algorithms such as Image Processing and the solution of scheduling problems, which can use the optimization methods we will study in Case Study I).
Note that the GIS is the natural multimedia Interface to Spatially labelled data (e.g.
video footage for tourism arranged by vacation location). This contrasts with Mosaic as the natural multimedia document interface.
The magazine GIS World has a wealth of information about real world GIS applications and companies.
Map data is available (almost free) from the USGS (U.S. Geological Survey) and, with additional features, from several commercial companies. (See memo by Paul Coddington)

Remarks on CPS713 Case Study III) Topic B: Parallel Rendering of Three Dimensional Terrain data
Rendering refers to the process of creating images from a model of a scene.
We will only look at the case where the image is of a three dimensional world, but much of our analysis will be generalizable.
We intend to use such a 3D renderer to produce a 3D Image of New York State which can be navigated by children and teachers to provide a virtual field trip.
This "New York State -- The Interactive Journey" is part of our Living Textbook project, which will link remote host parallel machines to client PCs and Macintoshes in 6 schools linked to NPAC by the high speed ATM network NYNET.
Current GIS systems tend to support only two dimensions properly, with 3D either done crudely or in non real time mode.
The Living Textbook project intends to use the power of a parallel machine to produce a full 3D GIS.
Rendering and Map data will be on the host parallel machine.
The GIS Front End will be on a Macintosh or PC in the Schools.
Much rendering research uses ray tracing and optimizes for the best possible Image quality.
We will study the texture mapped polygon method, which is much faster, lets one trade off performance against Image quality, and can guarantee that a real time constraint is met.
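The trade-off between performance and Image quality mentioned above for polygon rendering can be illustrated with a minimal Python sketch. It is not the Living Textbook renderer: the function name, the list of detail levels, and the throughput figure are all hypothetical, and the cost model (rendering time proportional to polygon count) is a simplifying assumption.

```python
def pick_level_of_detail(polygon_counts, polygons_per_second, frame_budget_s):
    """Pick the most detailed terrain mesh that still fits the frame budget.

    polygon_counts      -- available mesh resolutions (hypothetical values)
    polygons_per_second -- assumed renderer throughput
    frame_budget_s      -- real time constraint: seconds allowed per frame
    """
    # Assumed cost model: time per frame = polygons / throughput.
    max_polygons = polygons_per_second * frame_budget_s
    feasible = [n for n in polygon_counts if n <= max_polygons]
    # If nothing fits, fall back to the coarsest mesh (lowest quality).
    return max(feasible) if feasible else min(polygon_counts)


levels = [1_000, 10_000, 100_000, 1_000_000]
chosen = pick_level_of_detail(levels, polygons_per_second=500_000,
                              frame_budget_s=0.1)
print(chosen)  # 10000: the finest mesh under 50,000 polygons per frame
```

A faster host (larger `polygons_per_second`) or a looser frame budget lets the same rule select a finer mesh, which is the sense in which quality is traded against performance.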

Remarks on CPS713 Case Study III) Topic C: Parallel and Distributed Databases
NPAC has unusually good expertise in this area with the availability of Parallel Oracle (the largest commercial relational database) and parallel DB2 (IBM's relatively new relational database).
Note that the industry standard access language is SQL.
SQL is naturally parallel, and so once a parallel database is implemented, applications can be parallelized without major attention to the nature -- parallel or sequential -- of the host machine.
Compare with Fortran, where we produce a parallel (High Performance Fortran) Compiler but every user of HPF must still worry about the parallel algorithm and parallel code.
Academia is studying Object oriented databases, which have attractive features, but currently one expects relational databases to dominate the parallel database field.

Remarks on CPS713 Case Study III) Topic C: Datamining in Parallel and Distributed Databases
Databases contain data which is converted to Information by Datamining.
This use of a database is often called a Data warehouse.
You extract data and then apply Decision Support tools, which are essentially Optimization systems, to extract Information.
High Visibility Commercial Applications are:
Using customer purchase information to optimize store layout: which products should be placed where, and when.
Using Credit card data to plan optimal mailings with "offers" which customers are likely to accept. For instance, credit cards may show that a customer is a football fan who likes to spend Xmas in Florida. An August mailing will discount a combination of a Florida trip with Syracuse University Football tickets.
Using Medicare data to identify fraudulent practices flagged as anomalous (e.g. Doctors claiming to see unusually many patients in a day etc.)
Optimization tools will be those we study in Case Study I).
Thinking Machines produced a package (called Darwin originally) featuring Genetic algorithms, Neural Nets etc.
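As a minimal sketch of the Medicare example above, the following Python flags providers whose claimed daily patient count is far above that of their peers. This is a simple statistical outlier screen (median absolute deviation), not the optimization methods of Case Study I) and not the Darwin package; the function name, the threshold, and the sample data are hypothetical.

```python
from statistics import median


def flag_anomalous_providers(daily_patients, threshold=3.0):
    """Return providers whose patient count is unusually high.

    daily_patients -- dict mapping provider name to patients claimed per day
    threshold      -- assumed cut-off, in units of median absolute deviation
    """
    counts = list(daily_patients.values())
    med = median(counts)
    # Median absolute deviation: a spread estimate that the outliers
    # themselves do not distort much.
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        return []  # all counts (nearly) identical: nothing to flag
    return sorted(name for name, c in daily_patients.items()
                  if (c - med) / mad > threshold)


# Made-up illustrative data, not real claims.
claims = {"dr_a": 22, "dr_b": 25, "dr_c": 19, "dr_d": 24, "dr_e": 140}
print(flag_anomalous_providers(claims))  # ['dr_e']
```

In a data warehouse setting the per-provider counts would come from an SQL aggregation over the claims table, so the expensive step can run inside the parallel database and only the small summary reaches a screen like this one.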

Remarks on CPS713 Case Study III) Topic D: How to Organize Information on the World Wide Web
A traditional book is a relatively consistent set of information arranged in modules (paragraphs and chapters) and typically read in a linear fashion from beginning to end.
An encyclopedia, on the other hand, arranges information in modules of chapter to paragraph size, but one expects to read it "randomly" or nonlinearly as each module "points" you to other modules.
The World Wide Web is similar to an encyclopedia generalized to dynamic rather than static links, with information spatially distributed and accessed over the Network.
Note that, looking at commercial CD-ROM products, my family evaluates the electronic encyclopedias (Encarta, Compton, Grolier) as superior by far to electronic (illustrated) books.
One must enforce standards to allow linked modules to address information in a consistent fashion. For instance, a distributed physics information resource should use common notation and equations.
We refer to our Information enterprise as the "Encyclopedia Galactica" to reflect the importance of the nonlinear model and the prescience of the Hitchhiker's Guide to the Galaxy.