HPC at the Crossroads: Academic Niche or Economic Development Cornucopia
HPCS 95, Montreal, Canada, July 10-12, 1995
Geoffrey Fox
Syracuse University, NPAC
111 College Place, Syracuse NY 13244-4100
Online presentation at http://www.npac.syr.edu/users/gcf/hpcs95/fullindex.html

Abstract of HPC(C) at the Crossroads
What is the status of High Performance Computing and Communications?
In a nutshell, we understand the issues and technologies quite well, but the next step from research to reality is hard and it is not clear where the "killer" (winning) applications are
The current U.S. Federal HPCC Program and particular work at NPAC on industrial implications
The survey of industrial applications and its implications
InfoVision (Information, Video, Simulation, Imagery, on demand) and MPP's as WebServers
Applications to Education, Television and other media, Community Networks
Lessons from a meeting at Pasadena, January 1995: HPCC does not clearly make business sense
Need to expand user (application) and technology base
This will also improve (revolutionize) HPCC Software Infrastructure with productivity and Software Engineering tools

Superficial Observations on High Performance Computing-I
Parallel Computing Works!
Technology well understood for Science and Engineering
Good parallel algorithms, several examples of major applications in many fields exploring range of issues
Data and Message Parallel programming models developed
Supercomputing market small (few percent at best) and probably decreasing in size
Essential to have good common software infrastructure
Productivity tools -- Software Engineering -- Programming Support tools POOR
The parallel software "industry" is very small

Superficial Observations on High Performance Computing-II
No silver programming bullet -- I doubt that a new language will revolutionize parallel programming and make it much easier
Hardware (shared memory) could be helpful
Social forces are tending to hinder adoption of parallel computing, as most applications are in areas where large scale computing is already common
Parallelizing existing applications (porting sequential software) very hard
Opportunities offered by use of MPP's often require major organizational changes

Superficial Observations on High Performance Communication
ATM, ISDN, Wireless, Satellite advancing rapidly in commercial arena, which is adopting research rapidly
Social forces (deregulation in the U.S.A.) are tending to accelerate adoption of digital communication technologies
These are often NEW applications (porting of POTS relatively easy!) such as interactive TV/Shopping
Tremendous competition between different telecommunication sectors encourages new technology now to ensure future success
Not clear how to make money on Web (Internet) but growing interest/acceptance by general public
Huge sales in home multimedia PC's -- comparable to TV's in volume
Integration of Communities and Opportunities
Computing and Communication and Information Industries merging -- similar impact on academic departments will (should) happen

Some Implications of HPCC Observations
Technology Opportunities in Integration of High Performance Computing and Communication Systems
Merging of networking, parallel computing, distributed computing communities
This SOLVES previous difficulties observed for high performance computing, as it implies a much larger distributed (world-wide metacomputing) computing base
New Business opportunities linking Enterprise Information Systems to Community networks to current cable/network TV journalism
New educational needs at interface of computer science and communications/information applications
Major implications for education -- the Virtual University

Current Status of HPCC Applications, Hardware and Software
However we need more than fast enough machines
We also need a large enough market to sustain technology (systems and software)
This is both Grand Challenges augmented by National Challenges but also
Build HPCC technologies on a broad not niche base starting at bottom not top of computing pyramid
A Survey of New York State Industrial Opportunities for HPCC was very influential for me and my group (NPAC)

The 33 Application areas were studied in detail: Simulation (Roughly the Grand Challenges)
1: Computational Fluid Dynamics
2: Structural Dynamics
3: Electromagnetic Simulation
4: Scheduling
5: Environmental Modelling (with PDE's)
6: Environmental Phenomenology
7: Basic Chemistry
8: Molecular Dynamics
9: Economic Modelling
10: Network Simulations
11: Particle Transport Problems
12: Graphics
13: Integrated Complex Systems Simulations

The 33 Application areas were studied in detail: Information Analysis -- DataMining
14: Seismic and Environmental Data Analysis
15: Image Processing
16: Statistical Analysis
17: Healthcare Fraud
18: Market Segmentation
Growing Area of Importance and reasonable near term MPP opportunity in decision support combined with parallel (relational) databases

The 33 Application areas were studied in detail: InfoVision: Information, Video, Imagery and Simulation on Demand
19: Transaction Processing
20: Collaboration Support
21: Text on Demand
22: Video on Demand
23: Imagery on Demand
24: Simulation on Demand (education, financial modelling etc.) -- simulation is a "medium"!
MPP's as High Performance Multimedia (database) servers -- WebServers
Excellent Medium term Opportunity for MPP enabled by National Information Infrastructure

The 33 Application areas were studied in detail: Information Integration combining Simulation, Analysis and InfoVision
25: Military and Civilian Command and Control (Crisis Management)
26: Decision Support for Society (Community Servers)
27: Business Decision Support
28: Public Administration and Political Decision (Judgement) Support
29: Real-Time Control Systems
30: Electronic Banking
31: Electronic Shopping
32: (Agile) Manufacturing including Multidisciplinary Design/Concurrent Engineering
33: Education at K-12, University and Continuing levels
Largest Application of any Computer and Dominant HPCC Opportunity

Some detailed Analysis of Opportunities for HPCC in the Science and Engineering Simulation Arena
From the Grand (Simulation) Challenges to the National (Information) Challenges
Need to Educate People to take advantage of HPCC technologies
WebServers and InfoVision as an example of Opportunity for MPP's on the NII
The Virtual University and Other Opportunities to use HPCC in Education

Some Virtual University Projects with which NPAC is Collaborating
Living Textbook -- Prototype of K-12 Educational Environment of year 2000
ATM delivery to K-12 schools from NPAC's MPP's
Physics 105/106 -- Science for the 21st Century (for non-Scientists) -- Some course modules built around Multimedia Information Systems
Search for Extraterrestrial Intelligence
Mind and Machines
PseudoScience and the Paranormal
Scientific Literacy, Imaging and Evolutionism versus Creationism under development
Distance Learning -- Web Technology provides new (as interactive, hyperlinked and multimedia) approaches
Developing WWW (Perl scripts) support for authoring educational material
Course Material on CPS600 -- Prototype of Core Material for Information track of Computational Science is mostly on the Web
Could offer training (over NYNET or Web) to interested corporations in digital information technology
Interesting world wide opportunities as in FLAG -- Fiberoptic Link Around the Globe

The World Wide WebWindows and our contributions -- WebWork

WebWork -- Figures/Screendumps Index
1: Server-to-Server Communication Diagram
2: WebWork System Overview
3: WebTools CASE tools sample page
4: Java documentation sample page
5: Java class database manager
6: Java screendump -- sorting algorithms
7: Java screendump -- WebFlow Editor prototype
8: Java screendump -- WebFlow application prototype -- Project Manager
9: VRML screendump
10: VRML source code example
11: Java source code example

What Is WebWork -- NPAC, Boston University, Cooperating Systems Collaboration -- I?
WebWork is an open, world-wide distributed computing environment based on computationally extended Web Technologies
The backend computation and information infrastructure is provided by the World-Wide Virtual Machine -- a mesh of computationally extended Web Servers (called Compute Servers)
These servers manage (via CGI mechanisms) a collection of standardized computational units called WebWork Modules.

What Is WebWork -- NPAC, Boston University, Cooperating Systems Collaboration -- II?
Geographically distributed and Web-published WebWork modules interact by HTTP/MIME based message/object passing and form distributed computing surfaces called Compute-Webs
The front-end user/client interfaces are provided by evolving Web browsers with increasing support for two-way interactivity (e.g. Java, VRML) that facilitates client side control and authoring.
A natural user-level metaphor -- WebFlow -- is supported in terms of visual interactive compute-web authoring tools.

Some Key Features of WebWork
Implements the "Viable Base" Enterprise Model of HPCC Software identified in Pasadena2 workshop
This will allow good programming tools to be developed and maintained, as there is a large enough base to support a software industry
Implements a powerful software engineering framework for parallel computing by integrating parallel programming with the World Wide Web Productivity Tools

WebWork Architecture
WebWork is based on a three-layer architecture shown in figure 2, including: World-Wide Virtual Machine (WWVM) in the (bottom) layer 1, Middleware layer 2 of agents, wrappers, mediators etc., and high level programming environments (e.g. HPFCL) and user interfaces (e.g. WebFlow) in the (top) layer 3.
All base WebWork concepts can be implemented in terms of today's Web technologies (HTTP, MIME, CGI) and a prototype is under development at NPAC.
The overall design is open and ready to upgrade the existing (e.g. browsers or servers) and include new (e.g. agents or distributed object brokers) Internet/Web technologies
One starting point for the WebWork construction is provided by NPAC WebTools -- a CGI-extended Web server with enhanced content authoring and database navigation functionalities.
WebTools Server is used as a prototype WebWork node server.
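
The HTTP/MIME object passing between WebWork modules can be illustrated with a small sketch. It is not the NPAC prototype (which the slides describe as CGI/Perl based); it only shows, under stated assumptions, how an object might be wrapped as a MIME message by one module and unwrapped by another. The header name X-WebWork-Port, the function names and the use of application/json as the object type are all hypothetical choices made for illustration, and the HTTP transport itself is left out.

```python
import json
from email import message_from_bytes, policy
from email.message import EmailMessage


def pack_object(payload, port):
    """Wrap a module output as a MIME message.

    The X-WebWork-Port header is a hypothetical name, not part of any standard.
    """
    msg = EmailMessage()
    msg["X-WebWork-Port"] = port
    # Bytes content with an explicit MIME type; base64 is the default encoding.
    msg.set_content(json.dumps(payload).encode("utf-8"),
                    maintype="application", subtype="json")
    return msg.as_bytes()


def unpack_object(raw):
    """Parse a MIME message and return (port name, decoded object)."""
    msg = message_from_bytes(raw, policy=policy.default)
    if msg.get_content_type() != "application/json":
        raise ValueError("unexpected object type: " + msg.get_content_type())
    return str(msg["X-WebWork-Port"]), json.loads(msg.get_content().decode("utf-8"))
```

A receiving module would call unpack_object on the body of an incoming request and reject object types that its port does not publish.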

NPAC WebTools-I (Basic WebWindows Functionality)
NPAC WebTools is a CGI-extended Web server that offers a HyperWorld based metaphor for organized content authoring and navigation, currently implemented in terms of the following tools: HyperWorld Manager, HyperWorld Navigator, On-Line HTML Editor, WebMail and CASE tools for HySource Worlds authoring.
HyperWorld Manager offers database management support for the server document tree, integrated with browser GUI tools for remote file/document and directory/folder handling (create, destroy, copy etc.). The model assures concurrency control, atomicity and integrity of the document database.
Compare to File Manager in MS (becomes Web) Windows and simple UNIX shell cp, mv, rm commands.
Directory structure is a (crude) database structure built into UNIX.
WebWindows has much more powerful natural database support.

NPAC WebTools-II
HyperWorld Navigator offers a consistent navigation metaphor.
Compare to UNIX directory structure and generalized cd
Compare to MSWindows Program Manager
On-Line HTML Editor offers remote authoring support for documents created by the HyperWorld Manager.
WebMail offers the Web interface to the MH mailing system and initial support for collaborative forums.
Enables enhanced MH on all clients from PC's to Supercomputers ...
Will also integrate Oracle with WebMail (and WebTools) for very fast indexed and free text search
CASE tools offer a disciplined WebTools software development environment, integrated with the HyperWorld database.
Enabled by Integration of Computing, Software Development and Databases in WebWindows

Web Productivity Tools and Virtual Software Laboratory (VSL)
NPAC WebTools can be viewed as an instance of Web Productivity Tools (navigators, editors, databases), developed collectively by the Internet/Web community.
We view these emergent open tools as central to developing and maintaining Web based World-Wide Metacomputing.
Software exchange and integration tools are urgently needed.
Without them, the 'pervasive Web' will soon become too complex to maintain and will be dominated by closed corporate products.
One such attempt is made by the HySource CASE package in NPAC WebTools. So far, we have developed the HyPerl World (Screen 3) of the WebTools source code and we are now integrating it with Java (Screen 4) in the form of HyJava World (Screen 5)
These tools will evolve towards a Virtual Software Laboratory -- a collective distributed CASE framework for a virtual corporation of WebWork developers.

World-Wide Virtual Machine
WebWork pilot project is a collaboration between NPAC, Boston University and Cooperating Systems Corporation, MA. It will prototype a candidate VSL, WWVM, Java based user interfaces, and port selected Grand/National Challenge applications to this platform.
The project will use NPAC WebTools to bootstrap the software process and will prototype WWVM in terms of current Web technologies (Screen 1)
Technically, early WWVM will include existing Web Servers with add-on CGI (Perl) scripts that build server-to-server communication and offer document database management, and module publication and linkage/instantiation support.
This base model will be further extended and refined by using and driving evolving Web technologies. For example, the disk-based model in Screen 1a will likely evolve towards a memory-mapped model based on multi-threaded interpreted compute-servers (Screen 1b)

WebFlow Paradigm
User-level WebWork metaphor is given by WebFlow -- a distributed dataflow model built in terms of WebWork modules and MIME object/document communication channels.
Think of it as a Web version of AVS or Khoros
WebWork users will build and control distributed computing applications (compute-webs) using Web browser based visual interactive editors and monitors.
We are currently prototyping such WebFlow front-ends at NPAC using the Java/HotJava model.
WebWork modules are represented by Java threads (Screen 6) and visualized as interactive interconnected icons (Screen 7)

Software Project Manager -- Example of Agent Middleware
One current WebWork/WebFlow application, prototyped at NPAC, is Software Project Manager (Screen 8).
Each software developer runs his/her WebTools server and uses HySource CASE tools. These servers are WWVM-connected to agent and manager servers.
Agent server receives automatic notifications from developers' servers on each software volume update, and uses customizable thresholds to decide when to fire a report to the manager or a deadline reminder to a developer.
Software Project Manager tools contain a simple agent server that mediates between client/consumer (here manager) and servers/producers (here developers).

General WebScript and Agents
More generally, this Middleware Layer 2 will be rather complex and populated by a spectrum of proprietary (e.g. Telescript, ScriptX, CORBA) and public (e.g. Perl, Tcl, Harvest, Java, VRML) scripted languages, brokers, agents, wrappers, mediators etc. (see Screens)
In WebWork, we refer collectively by WebScript to the whole ensemble of these models.
At the current stage, it isn't clear if WebScript as a common intermediate language is a practical concept.
An alternative is to live in the multi-language Web medium and employ interoperability agents to translate between various protocols.
A practical initial implementation platform for this dual approach is provided in WebWork by an integrated collection of WebTools CASE tools based HySource Worlds for various languages.
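
The threshold logic of the agent server in the Software Project Manager example can be sketched as follows. The slides do not say which quantities the thresholds apply to, so this sketch assumes two: a count of updates per developer before a report goes to the manager, and a number of idle days before a reminder goes to a developer. The class name, field names and action tuples are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectAgent:
    """Hypothetical agent mediating between developer servers and a manager."""
    report_threshold: int = 3   # assumed: updates accumulated before a report
    reminder_days: int = 7      # assumed: idle days before a reminder
    pending: dict = field(default_factory=dict)      # developer -> unreported updates
    last_update: dict = field(default_factory=dict)  # developer -> day of last update

    def notify_update(self, developer, day):
        """Record one update notification; return any actions to fire."""
        self.last_update[developer] = day
        self.pending[developer] = self.pending.get(developer, 0) + 1
        if self.pending[developer] >= self.report_threshold:
            count = self.pending.pop(developer)
            return [("report_to_manager", developer, count)]
        return []

    def check_deadlines(self, today):
        """Return reminders for developers idle for reminder_days or more."""
        return [("remind_developer", dev, today - day)
                for dev, day in sorted(self.last_update.items())
                if today - day >= self.reminder_days]
```

Keeping the thresholds as plain fields is one way to make them customizable per project, which is the property the slides emphasize.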

WebWork Integration Model
WebWork interpolates and integrates pervasive Web, HPCC and (non-HPCC) commercial software, as in the following table comparing computing concepts in three "worlds": HPCC -- Commercial mainstream -- Web
Current Web model needs computational extensions for banking/financial applications, manufacturing, interactive shopping/videogames etc.
HPCC can provide the Web with both parallel computing programming models and libraries, and language/runtime concepts which coordinate components of a distributed or parallel system
HPCC needs the Web (or equivalent) to give it a viable distributed computing and software engineering base
The Web interpolates between "flaky" research software and solid but closed corporate solutions. Clear trend away from proprietary towards open software models.
HPCC needs a large enough market to sustain technology (systems and software)
This implies that we look at both Grand Challenges and National Challenges but we suggest this is not enough:
WebWork builds HPCC technologies on a broad not niche base starting at bottom (Web, PC's) not top (MPP's, Supercomputers) of computing pyramid

WebWork -- NPAC, Boston University, Cooperating Systems Collaboration
Implements the "Viable Base" Enterprise Model of HPCC Software identified in Pasadena2 workshop
This will allow good programming tools to be developed and maintained, as there is a large enough base to support a software industry
Implements a powerful software engineering framework for parallel computing by integrating parallel programming with the World Wide Web Productivity Tools
WebTools is a prototype developed at NPAC which is a base on which to build the Compute and Software Engineering Capabilities of WebWork
An early development will be WebFlow -- an AVS/Khoros-like system built on the Web which can be used for BOTH Computing (modules are executable software) and for management of Software Development tasks (modules are source code and people)
Later can develop the full WebHPL -- a hybrid compiled/interpreted environment implementing HPF/HPC++ etc. system with Web infrastructure and front end

PCRC Naturally Fits in with WebWork
PCRC embodies the Parallel Computing Synchronization and collective parallel algorithms and runtime that will enable efficient Web-based computing
Replace user interface of HPF or HPC++ with the Web(work) and use pervasive Web Technologies in infrastructure (World Wide Virtual Machine -- WWVM)

WebWork Summary for PCRC
WebWork is an open, world-wide distributed computing environment based on computationally extended Web Technologies
The backend computation and information infrastructure is provided by the World-Wide Virtual Machine -- a mesh of computationally extended Web Servers (called Compute Servers)
These servers manage (via CGI mechanisms) a collection of standardized computational units called WebWork Modules.
Geographically distributed and Web-published WebWork modules interact by HTTP/MIME based message/object passing and form distributed computing surfaces called Compute-Webs
The front-end user/client interfaces are provided by evolving Web browsers with increasing support for two-way interactivity (e.g. Java, VRML) that facilitates client side control and authoring.
A natural user-level metaphor -- WebFlow -- is supported in terms of visual interactive compute-web authoring tools.
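
The compute-web idea, a dataflow network of modules linked by channels, can be sketched in a few lines. This is a single-process illustration only: in WebWork the modules would sit behind Web servers and exchange MIME objects, whereas here they are plain Python functions. The function name run_compute_web and the module names load, square and total are hypothetical, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def run_compute_web(modules, channels, inputs):
    """Run modules in dataflow order and return every module's output.

    modules:  name -> function taking a list of upstream outputs
    channels: list of (source, destination) module-name pairs
    inputs:   name -> initial object for modules with no incoming channel
    """
    upstream = {name: [] for name in modules}
    for src, dst in channels:
        upstream[dst].append(src)
    results = {}
    # static_order yields each module only after all of its predecessors.
    for name in TopologicalSorter(upstream).static_order():
        args = [results[src] for src in upstream[name]]
        if not args:
            args = [inputs[name]]
        results[name] = modules[name](args)
    return results


modules = {
    "load": lambda a: a[0],
    "square": lambda a: [x * x for x in a[0]],
    "total": lambda a: sum(a[0]),
}
channels = [("load", "square"), ("square", "total")]
results = run_compute_web(modules, channels, {"load": [1, 2, 3]})
print(results["total"])  # prints 14
```

A visual editor such as the WebFlow front-end described in the slides would, in this picture, simply be a tool for building the modules and channels arguments interactively.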

WebWork Terms and Concepts -- I
Agent: A middleware broker module that facilitates WebWork operation
Application: A WWVM-runnable compute-web and its clients
Bottom-Up Process: A software process that extracts reusable modules from applications
Channel: A communication link between two ports used to exchange objects
Client: A Web browser or editor
Compute-Server: Evolving Web Technology Server, driven by WebWork computation
Compute-Web: A composite module given by a dataflow network of modules linked by channels
Database: A server document tree with atomicity, integrity and concurrency control support

WebWork Terms and Concepts -- II
Document: Web-viewable instance of an object
Editor: A Web Browser with enhanced WebFlow authoring functions
HPFCL -- HP-Fickle for High Performance Fortran Coordination Language: Coordination Script and Interface builder for HPF modules
Middleware: Any WebWork Module that is not a client or part of the WWVM
Module: Computational Unit with specified I/O ports and CGI interface to a server
Object: An instance of Object type used by modules as a (communication) unit
Object Type: Internet-public or WebWork-private MIME type
Port: A channel terminal with specified object type published by a module

WebWork Terms and Concepts -- III
Problem: A published compute-web with missing modules
Problem Solving Environment: A WebWork enabled, agents aided collaborative process of matching problems with solutions
Publication: WWVM-runnable module with a Web-published interface
Server: Any Web server with database support or a compute-server
Software Process: A VSL based two-tier (top-down, bottom-up) WebWork Software Engineering process
Solution: A published module to be matched with a problem
Top-down Process: A software process that encapsulates applications as modules

WebWork Terms and Concepts -- IV
VSL or Virtual Software Laboratory: Web Productivity Tools based CASE (Computer Aided Software Engineering) tools that facilitate the software process
WebFlow: User level WebWork dataflow based application development environment
Web Productivity Tools: Any Web Software that facilitates WebWork Authoring
WebScript: WebWork coordination and management language in layer 2 which incorporates agents and enables a software process
WebTools: An instance of Web Productivity Tools developed at NPAC to bootstrap the Virtual Software Laboratory or VSL
WebWork: Hierarchical network of applications and the associated software process
WWVM or World Wide Virtual Machine (Layer 1 of WebWork): WebWork Infrastructure layer given by an interactive surface of interconnected servers

ASOP and Multidisciplinary Analysis and Design (MAD)
A set of manufacturing companies -- Rockwell International, Northrop Grumman, McDonnell Douglas, General Electric and General Motors -- is studying the NII implications for a particular MAD system, "Affordable Systems Optimization Process" (ASOP)
Interesting parameters are that the next major aircraft to be built could involve:
6 major companies and 20,000 smaller supplier subcontractors
Number of engineers involved is about:
50 at conceptual design
200 at preliminary design
2000 at final design
up to 10,000 in manufacturing and development
The design could involve up to 10,000 separate programs running in small linked clusters, which vary from airflow simulation around the plane to an expert system that plans the location of an inspection port to minimize maintenance costs
Critical is configuration management and system database