Fox Presentation Spring 96
A Set of Foils used in 5 Presentations and 1 Tutorial
CRPC Annual Meeting, May 14-17 1996
Geoffrey Fox
NPAC, Syracuse University, 111 College Place, Syracuse NY 13244-4100

Abstract for CRPC Tutorial and Presentations
These foils cover 5 talks and one 2-hour tutorial given at the CRPC Annual Meeting held at Argonne, May 96:
  A Tutorial on Base Web Technologies
  Status of PCRC HPJava and HPF
  Overview of HPCC Applications at NPAC
  Problem Solving Environments and the Web
  Web Technology and Applications for Education
  Implementation and Issues for RSA Factoring on the Web

Real-Time Interactive Distributed Weather Information System
We briefly review 3 topics:
  Capabilities of the Oklahoma Advanced Regional Prediction System (ARPS) code.
  Collaboration with the Center for Analysis and Prediction of Storms (CAPS).
  VRML integration with the Terrain Data Set.

Capabilities of the ARPS code
Non-hydrostatic, compressible dynamics in a terrain-following vertical coordinate.
Six-water-phase microphysics (water vapor, cloud water, rain water, cloud ice, snow, and hail/graupel).
Supports MPPs (T3Ds, SP2s, clusters, ...).
Other current numerical prediction models lack the spatial resolution required to capture small-scale, short-duration events such as snow bands.
Can predict down to microscale phenomena, 0 to 1 hours.
Location of events to within 1 km and timing to within 5 min.
ΔV ±2 m/s, ΔT ±2 Kelvin, precip rate ±2 mm/hr.
Note: Syracuse area weather is very position sensitive.

Collaboration with CAPS
We had a meeting with CAPS during the last week of April, and the outcome is the following:
  They will supply us with an initial data set over the Syracuse region.
  This data set will be taken from the end of April, during severe thunderstorms in the region.
  By mid summer they will try to help us run the code to evolve the initial data.
  The first attempts should result in stable runs, with inaccurate predictions.
  The ETA of accurate simulations should be roughly 1/2 person year.
This timeline would allow for the accurate prediction of lake effect snow.
Lake effect snow is caused mostly by cold air blowing over relatively warmer lake water.

VRML integration with the terrain data set
We will use the current output from ARPS in HDF format.
We will visualize the moisture variables.
We run an isosurface routine on the raw data, which generates output of cloud formation.
We are developing a Java applet which will read in 3D data and determine isosurfaces.
The user will input the isosurface value from a WWW page running the applet.
The output of this applet will be VRML data that will be instantly rendered. It can also be stored in our Illustra object database.
For the integration of the weather data with the existing 3D VRML terrain data set, we will use a 2D mask function which will determine where snow has fallen to a certain level.
VRML data will be stored in a database, allowing the user to interactively turn the weather information over the terrain on and off.

Aspects of Financial World Motivating HPCC
Cooperative distributed (and parallel) computing will become mainstream in financial engineering due to a convergence of the following factors:
  Increased volatility due to globalization of financial markets
  Global distribution of data sources
  Increase in complexity of derivatives and risk management vehicles
  Increased demand for real-time asset allocation decision support
  Increased volume of raw data and need to process large databases
  Increased volume on the retail side of the spectrum, in part due to on-line technologies (Internet and WWW)

Financial Application areas for which High-performance computing technologies are becoming indispensable
HPCC is becoming indispensable in application domains such as:
  Derivative Valuation -- particularly over-the-counter products and exotics
  Portfolio optimization, valuation and asset allocation
  Hedging of large portfolios in real time
  Arbitrage trading
  Risk analysis simulations
  Pattern recognition
  Detection of
fraud
  Credit risk analysis
  Market segmentation
NPAC is engaged in the development of new tools for quantitative financial modeling which take advantage of scalable computer architectures.
The ultimate goal is to integrate various quantitative analyses transparently, using Web technologies, into a seamless cooperative computing environment capable of supporting all aspects of enterprise-wide risk management.

Path Integral Approach to Derivative Valuation
We developed new algorithms for risk-neutral valuation of derivative financial instruments.
Theoretical prices of derivative instruments are obtained by discounting their expected payoffs under the equivalent martingale measure using the money market interest rate.
The core algorithm is Path Integral Monte Carlo, which is used to generate arbitrary distributions of underlying risk factors (stocks, bonds, short interest rates, commodities, indices etc.).
The advantage of the new algorithm is that sensitivities of derivative prices with respect to changes in all model parameters are computed in a single simulation. This is crucial for effective hedging.
The parallel version of the algorithm is written in C and MPI and relies on task parallelism and functional decomposition (could also use HPF).
Monte Carlo samples are generated on multiple processors in embarrassingly parallel fashion.

Parallel Maximum Entropy and optimization
Pricing modules can either run in lock-step with the Monte Carlo module which generates histories of risk factors, or asynchronously perform valuation functions on the histories, which are broadcast as they are generated by the Monte Carlo module.
We are linking this flexible algorithm with a novel scheme, based on the Maximum Entropy method, which generates implied probability distributions from reported option prices.
The implied distributions can be used within the Path Integral Monte Carlo module to price exotic contracts consistently with exchange-traded contracts, and they can also be used to search for arbitrage opportunities.
Estimation of implied distributions requires large scale global optimizers. We are developing two parallel stochastic optimizers based on the mean field approximation (Laplace formula) and the Langevin equation.

Web-based System Integration -- Initial Server Implementation
Derivative valuation functions are integrated using Web technologies into a service which can be accessed from any platform which supports a graphical browser.
Using a combination of HTML forms or a Java front-end, the CGI mechanism, Perl scripts, and modules written in C and MPI, which are executed on multiple NPAC RS 6000 and Sun workstations and the SP-2, the user can:
  retrieve historical data from flat files
  perform statistical analysis
  display charts and histograms of historical data
  estimate parameters of the underlying stochastic processes
  enter own estimates of model parameters
  perform simulations
  display charts and plots of option prices and their sensitivities as functions of time, underlying stock price or option contract exercise (strike) price

Web-based System Integration -- Futures
In the next stage, flat files will be replaced with a parallel Oracle server.
Ultimately, the graphical user interface will be supplemented with an agent-based middleware layer, implemented in Java, where derivative pricing and risk management services will be requested and dispatched to the parallel Monte Carlo engine and returned to the client using an EDI-like protocol encapsulated within the KQML envelope.
This will be a prototype of the new service economy that will flourish on the Web.
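The risk-neutral valuation idea in the Path Integral foils (discount the expected payoff, generate Monte Carlo samples independently on each processor, and obtain sensitivities in the same simulation) can be illustrated with a minimal sketch. It is not the NPAC Path Integral Monte Carlo code: it assumes a single stock following geometric Brownian motion, prices a plain European call, and uses the standard pathwise estimator for the delta. The function names, parameters, and the use of sequential batches as a stand-in for MPI ranks are all hypothetical choices made for illustration.

```python
import math
import random


def price_european_call(s0, strike, rate, sigma, maturity, n_paths, seed=0):
    """Monte Carlo estimate of a European call price and its delta.

    Assumes geometric Brownian motion under the risk-neutral measure
    (an illustrative model choice, not taken from the foils).
    """
    rng = random.Random(seed)
    discount = math.exp(-rate * maturity)
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)

    payoff_sum = 0.0
    delta_sum = 0.0
    for _ in range(n_paths):
        # Terminal stock price for one sampled history.
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        if s_t > strike:
            payoff_sum += s_t - strike
            # Pathwise derivative of the payoff with respect to s0.
            delta_sum += s_t / s0
    price = discount * payoff_sum / n_paths
    delta = discount * delta_sum / n_paths
    return price, delta


def price_in_independent_batches(n_batches, paths_per_batch, **params):
    """Average equally sized batches that share no state.

    Each batch uses its own seed, so the batches could run on separate
    processors; here they run one after another for simplicity.
    """
    results = [
        price_european_call(n_paths=paths_per_batch, seed=batch, **params)
        for batch in range(n_batches)
    ]
    price = sum(p for p, _ in results) / n_batches
    delta = sum(d for _, d in results) / n_batches
    return price, delta
```

Because the batches are equally sized and independent, averaging their results is equivalent to one larger simulation, which is what makes this kind of sampling embarrassingly parallel. For these inputs the estimates can be compared against the Black-Scholes closed form as a sanity check.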
Illustration of WebWindows Concept for Presentation Software
Persuasion and Powerpoint are rather similar monolithic packages which can, for instance, only be clumsily ported to UNIX, as one cannot access the internal data structures defining foils.
WebFoil (NPAC prototype WebWindows presentation package) has:
  Extended open HTML source manipulated by powerful PERL5 scripts, allowing global changes and linkages of foils from many sources
  This plays the role of the outline, which is a somewhat crippled open version of Persuasion/Powerpoint foils defining text alone
  Backend Oracle database illustrating the modular WebWindows approach
  Using appropriate templates, WebFoil uses HotJava or Netscape 1, 2 or 3 to display HTML with full Web power, including applets to enable multimedia and dynamic presentations

Lessons of WebFoil for WebWindows Software Development Scenario
The WebTop Productivity environment will be built in a more modular fashion than the current PC Windows or Macintosh arena, e.g. future WebWindows presentation, word processor etc. packages will be built from many different modules coming from different commercial or public domain sources.
Java or equivalent future technology is key to understanding how WebWindows application/service software will look, as it allows balanced client server applications to be built.
Note: this requires open display software so that one can produce appropriate customized interfaces for browsing, presenting, word processing etc.

Emerging Web and NII Vision - I
WebWindows -- the open nonproprietary operating system of the future, supplanting UNIX, Windows95/NT, Apple etc. Manages with a single interface all machines, either individually or collectively, on the NII.
WebTop Productivity -- Standard PC/workstation applications made universal and powerful with a Web Technology base -- illustrated with the WebFoil discussion but also WebWord, WebExcel, WebLOTUSNotes etc.
Encyclopedia Galactica -- The World's MultiMedia Information at the click of your big toe (using virtual reality Neat WebThing for disabled or creative input). Backbone of Education, Medical Informatics, scholarly research etc.

Overview of HPCC Applications at NPAC
CRPC Annual Meeting, May 15 1996
Geoffrey Fox
NPAC, Syracuse University, 111 College Place, Syracuse NY 13244-4100

Abstract for Overview of HPCC Applications at NPAC
We describe several HPCC Large Scale Simulations in which NPAC is involved and comment on implications for HPF!
  Work in porting the ARPS Weather code to the Syracuse region and integration with VRML Visualization
  Work from InfoMall Industry Outreach on Financial modelling, with Monte Carlo SP2 code linked to the Web for "Pricing on Demand"
  Problem Solving Environment and Adaptive Meshes for the NSF Grand Challenge on Binary Black Hole Collisions
  NASA Grand Challenge on 4D Data Assimilation
  A set of activities (mainly with PNL) on Computational Chemistry -- Relation of HPF and Global Arrays

Status of PCRC HPF and HPJava
CRPC Annual Meeting, May 15 1996
Geoffrey Fox
NPAC, Syracuse University, 111 College Place, Syracuse NY 13244-4100

Abstract for NPAC PCRC and HPF Status
We describe the status of the PCRC Common library and Interoperability between HPF, HPCC++ and the planned extension to Java.
The HPJava Evaluation of possible links between Java and HPCC.
We describe the Compiler testbed developed at NPAC, which includes:
  A new HPF/Fortran90(5) Public Domain Frontend -- benefit from our Collaboration with China
  A new HPF and Fortran90 Compiler Syntax checking system
  Linkage to Regular and Adaptive Runtime Systems
  Some World Wide Web Linkage, with HPF running on top of a network of Web Servers and (soon) a link of Pablo through Servers to a Java client
Performance evaluation
An Analysis of HPF in 4D Data Assimilation and Financial Modelling
See also the NPAC Application discussion for more on evaluation of HPF in HPCC applications.

Problem Solving Environments from Simulation, Medicine
and Defense and the Web
CRPC Annual Meeting, May 16 1996
Geoffrey Fox
NPAC, Syracuse University, 111 College Place, Syracuse NY 13244-4100

Abstract for Problem Solving Environments and the Web
We contrast demands from four areas:
  Healthcare -- especially nursing and the Bridge concept introduced by Balch and Warner
  Defense -- Command and Control, where the Web is a natural COTS technology
  Distance Education, with many issues in common with the Collaboratory needed in Science and Engineering
  Computational Science and Engineering, which is a "small" area which needs to leverage all it can!
We review progress in Web Technology and suggest that commercial efforts for the first three broad based applications can be leveraged with special tools aimed at parallel and distributed computing PSEs.

Web Technology and Applications for Education
http://www.npac.syr.edu/users/gcf/crpcedtechmay96/index.html
CRPC Annual Meeting, May 16 1996
Geoffrey Fox
NPAC, Syracuse University, 111 College Place, Syracuse NY 13244-4100

Abstract for Web Technology for Education
This is abstracted from two more complete presentations:
  http://www.npac.syr.edu/users/gcf/webwisdommar96/index.html
  http://www.npac.syr.edu/users/gcf/webwisdomapr96/index.html
We discuss basic technologies (Java, JavaScript, VRML, Web-linked databases and Digital Video) and illustrate how we use them in a set of projects.
These are Basic University Classes, Distance Education in the context of the WebWisdom Virtual University, Living SchoolBook (ATM linked K-12), and Phy105/106 (Undergraduate Science for non-Science majors).
We stress some analogies with HealthCare, both for information dissemination and for use of virtual reality front ends with home health care and education for the disabled.

RSA Factoring on the Web -- Lessons and Implementation
CRPC Annual Meeting, May 16 1996
Geoffrey Fox
NPAC, Syracuse University, 111 College Place, Syracuse NY 13244-4100

Abstract for RSA Factoring on the Web
We describe the RSA Factoring Problem and the solution developed by Lenstra
and collaborators with sieving techniques of increasing power.
The Web was used successfully in the just completed RSA130 factoring -- an almost embarrassingly parallel but very non-trivial computation.
The mathematicians are preparing code for RSA155 factorization, and the Web will probably be critical here to increase resources from Teraop-hours (RSA129/130) to the needed Teraop-months (RSA155).
We overview the architecture of the FAFNER system used and the lessons drawn for general Metacomputing administration (MetaWeb).
http://www.npac.syr.edu/factoring.html

A Tutorial on Base Web Technologies
CRPC Annual Meeting, Argonne, May 14 1996
and IEEE Dual-Use Conference, Syracuse, June 3 1996
and ICASE and NASA Langley, June 10-13 1996
and Trip to China, July 12-28 1996
http://www.npac.syr.edu/users/gcf/crpctutmay96/index.html
Geoffrey Fox
NPAC, Syracuse University, 111 College Place, Syracuse NY 13244-4100

Abstract for Base Tutorial on Web Technologies
This tutorial is abstracted from two courses taught by NPAC this semester:
  http://www.npac.syr.edu/projects/cps616spring96/index.html
  http://www.npac.syr.edu/projects/ecs400spring96/index.html
You can get your credits from online courses starting this fall!
We review four critical technologies:
  Java -- a Programming Language
  JavaScript -- a Client side Integration System
  VRML 1.0 -- a set of 3D Data Descriptors
  Web Database Linkage

Examples and Why WebWindows will Dominate Software Industry?
Further, WebWindows software will be modular and allow plug and play insertion of capabilities developed around the Web World -- not a bunch of isolated stovepipe solutions.
WebWindows leverages not only universal hardware but also all the world's creative energy.
As an example some of Current Netscape and last year(!)
NPAC's WebTools implements UNIX shell/PC file manager capabilities in terms of CGI scripts -- allows universal access to these capabilities, including powerful Web based (mh) mail.
NPAC's WebFoil is a HotJava/Netscape 1, 2, 3 open replacement for Powerpoint/Persuasion.
Particular application areas (Business, Healthcare, Education) will be built on top of generic NII services, so that for instance Healthcare video delivery builds on technology developed for CNN etc.

Living SchoolBook Collaboration -- ATM linked K-12 Schools
Main collaborators are:
  Syracuse University School of Education and NPAC
  Columbia's Teachers College
  NYNEX, supplying ATM link NYNET
  Rome Laboratory
  3 Upstate and 3 Downstate K-12 Schools
We have excellent Content Providers, including:
  The Discovery Channel
  Reuters News Service -- Video News Clips
  NewsBank -- text CDs
  Bob Frye -- Emmy Winning Producer of Documentaries

Living SchoolBook Results/Developments
2D Java and 3D VRML FrontEnd for Terrain Renderer for Virtual FieldTrip
Uses Illustra database to store terrain and clickable objects
Web Interface to Illustra allows teachers or students to add content, e.g. local landmarks with associated Web Site
Text Indexed Video Database using Programming or Closed Captioning
Teachers very interested in Reuters Spanish feed, as Current Events in Spanish creates student interest
Common database search of text and video databases
Successful Use of Commercial Collaborative Tools
Developing VRML Front End for Physical Simulations -- Lake Effect Snow Based on CAPS code

Science for the 21st Century Phy105/106
4 credit General Science for non-science majors over 2 semesters
Modules have Interdisciplinary Themes
Mix of lectures, demonstrations, lab, team projects and web-based learning
Enrollment steadily increases while University enrollment decreases!
NSF MRA will develop Java applets as client-side simulations and display of HPCC results: material science, neural networks, basic math (see vector cross product applet)
Also explore VRML and Interactive Digital Video

The Binary Black Hole Grand Challenge Alliance Austin- Chapel Hill- Cornell- NCSA- Northwestern- Penn State- Pittsburgh- NPAC has Formal Goals
To develop a problem solving environment for the nonlinear Einstein's equations describing General Relativity, including a dynamical adaptive multilevel parallel infrastructure.
To provide controllable convergent algorithms to compute gravitational waveforms which arise from Black Hole encounters, which are relevant to astrophysical events, and which may be used to predict signals for detection by future ground-based and space-based detectors. This code will be made available to researchers in Computational Relativity (by publication and via the World Wide Web).
To provide representative examples of computational waveforms.
http://www.npac.syr.edu/projects/bbh/bbh.html

Adaptive Multilevel Parallel Infrastructure
Einstein's equations can be represented as a coupled system of hyperbolic and elliptic PDEs with non-trivial boundary conditions, to be solved using adaptive multilevel methods.
We are building a PSE that will support:
  composition of stable, convergent AMR and MG solvers
  software integration (initial value problem, apparent horizon finders, ...)
  automatic conversion of sequential unigrid codes into parallel, multigrid versions
  collaborative visualization environment
To implement the system we use technologies developed by CRPC, in particular MPI and HPF, combined with emerging new Web technologies: Java and VRML 2.0.

HPF and DAGH Implementation Strategies
Main Approach is DAGH: Distributed Adaptive Grid Hierarchy (J. Browne, M. Parashar at Texas)
  a set of programming abstractions in which computations on dynamic hierarchical grid structures are directly implementable (C++).
  a set of distributed dynamic data structures providing transparent scalable distribution of the grid hierarchy across processors (MPI).
  a set of computational modules and AMR/MG support (shared with HPF when HPF supports this adaptive data structure)
  (http://godel.ph.utexas.edu/Members/parashar/DAGH/dagh.html)
HPF (Syracuse) relegated to reserve status
  data parallel implementation of unigrid PDE solvers, as HPF does not support necessary distributions
  Fortran90/HPF dynamic memory management
  (http://www.npac.syr.edu/projects/bbh/more.html)
DAGH is a similar model to HPF, with much less general capability but excellent support for AMR data structures and distribution.

Syracuse Contributions to Black Hole GC
The Alliance Software Library
  Coordination of the development of the base ADM (one of two major algorithms) evolution code (T2)
  DAGH implementation of T2 (with Austin)
  HPF implementation of T2 on SP-2, T3D and DEC Alphas
  Library Components of the AMR/MG systems
Design and implementation of the PSE
  Standardization of the module interfaces
  AMR drivers (DAGH, Fortran 90, HPF)
  Java based GUI
  Visualization and Collaborative tools
Development of CS course module (CPS713) on Numerical Relativity and CFD

HPF implementation of T2
Scalable, portable performance of the unigrid code
http://www.npac.syr.edu/users/haupt/bbh/HPF/index.html

HPF Application Experience at NPAC
6 "full" size applications:
  BBH: ADM Evolution Code (integrates Einstein's equations)
  BBH: model for waveform extraction (for scalar waves)
  Electromagnetic Simulations
  Financial Modelling
  4D Data Assimilation (Study -- Incomplete)
  1D PIC Plasma Simulations (using extrinsics)
HPFA application kernels
HPF taught in CPS615 and CPS713 courses
Many student projects, including quite complex CFD codes from Aerospace Engineering -- students (currently) are confused by the HPF Environment

HPF: Some Problems we found
We find that our best results with HPF codes come from those designed as parallel codes.
The traditional model of "dusty deck + HPF directives" often does not work well.
HPF is not as intuitive as expected. Students have problems with getting performance. Reasons:
  Poor knowledge of Fortran 90.
  Many "details" are not handled well without some code massaging.
  Often HPF actions are highly non-intuitive for inexperienced users.
  Poor and cryptic compiler diagnostics and feedback.
Profilers (both PGI and DEC) help to identify sources of inefficiencies, though.

Special Problems with PGI HPF Compiler
The PGI Compiler is typically very successful, but ...
PGHPF is a translator, not a "real" compiler.
We encountered situations in which the Fortran 77 code generated by PGHPF cannot be compiled by the native IBM xlf compiler at a decent optimization level (O3), which leads to lousy node performance (with linear parallel speedup).
We suspect that the problem lies in the array index expressions, which are always one dimensional(?).

Problems with Adaptive Mesh Refinement in HPF
Lack of support for pointers to distributed arrays (vendors promise that soon, however).
HPF-1 does not allow for distribution of components of derived types (tensor notation, tree structure of grids in AMR). Wait for HPF-2.
HPF-1 is too restrictive with respect to data and computation distribution (irregular block, subset of processors). Wait for HPF-2.

RSA Factoring on the World-Wide Computer
Bellcore - Boston U. - Cooperating Systems - NPAC - Oxford U.
http://www.npac.syr.edu/factoring
This approach to factoring uses world-wide distributed computing and is based on computationally enhanced Web servers.
The project demonstrates the potential for self-organization of the resources of the World-Wide Web into a general-purpose parallel computing surface, overcoming geographic dispersion, architectural heterogeneity, and varying network connectivity.
The innovation of the WWC approach is a new way to execute HPC applications over highly heterogeneous platforms.
The WWC uses the Web's client/server model to administer and optimize the distribution of work, and to manage the entry of new Web volunteers into the pool of available computing resources.

Web Virtual Machine and Server-Server Communication Model

Proposed Architecture of WWVM

Hierarchical FAFNER Servers
http://cooperate.com/cgi-bin/FAFNER/factor.pl
Features:
  Fill out a form and click to check out
  "Server in a Box" includes server code and initial task allocation
  Automatically refills from the original source
  Configurable to meet local standards of decency: selective availability of services
Months of runtime, dozens of collaborators, eight nations, four continents
Hardware platforms from an i386 laptop to an IBM SP/2 (including HPs, Alphas, MIPS, Suns, SGI machines, RS6000s)
Most Heterogeneous and Geographically Dispersed Award, 3rd Annual HPC Challenge, Supercomputing '95.

Features of FAFNER Server Code
Implemented as Perl scripts, invoked via CGI
Hierarchy of cooperating World-Wide Web servers used for many functions in the collaboration:
  sieving task distribution
  email-to-HTTP gateway
  user registration services (including anonymity)
  computational status updates
  solution data collection
  automated archival services

Features of CLIENT CODE
General Number Field Sieve (GNFS)
  legacy C code
  uniprocessor (not network-aware)
  internally fault-tolerant
GNFSD Wrapper Code
  make a daemon out of GNFS
  add knowledge of "task servers"
  add external fault-tolerance to GNFS

TECHNICAL CHALLENGES
Partition workload onto multiple servers
Avoid redundant task allocation
Accumulate large relation datasets
Manage an evolving software base that's widely distributed
Coordinate many volunteer clients
Requires distributed Web based cluster management support

Social/Administrative CHALLENGES
Offload administration (divert blame)
Coordinate volunteers with different computational capabilities
Encourage anonymity, minimize exposure to security risks
Tune task scheduling according to the individual
workstation owner preferences
We have designed MetaWeb to extend FAFNER to address technical and administrative issues

RSA130 Factorization is completed!
http://www.npac.syr.edu/factoring/status.html
Web Sieving started in September 1995.
On April 10, 1996, we found that
RSA-130 = 1807082088687404805951656164405905566278102516769401349170127021450056662540244048387341127590812303371781887966563182013214880557
has the following factorization:
RSA-130 = 39685999459597454290161126162883786067576449112810064832555157243 * 45534498646735972188403686897274408864356301263205069600999044599
Sieving was done on a great variety of workstations at many different locations:
  28.37% by Bruce Dodson (Lehigh University)
  27.77% by Marije Elkenbracht-Huizing (CWI, Amsterdam)
  19.11% by Arjen K. Lenstra (Bellcore)
  17.17% by contributors to the www-factoring project (organized by Jim Cowie, Wojtek Furmanski, Tom Haupt, and Arjen Lenstra, among others)
  4.36% by Matt Fante (IDA)
  1.66% by Paul Leyland (Oxford University)
  1.56% by Damian Weber (University of Saarland)

Computational Chemistry at NPAC
http://www.npac.syr.edu/users/bernhold/comp_chem
Use of modeling in chemistry has exploded in recent years, driving the push for larger and more accurate calculations to simulate "real world" chemical phenomena.
Chemistry applications range in cost from N^2 to N^4, N^6, and higher (N proportional to the size of the molecule). Can be both CPU- and memory-intensive.
Interested both in legacy and "HPCC-designed" applications
Existing codes are often quite large (100,000+ lines) and embody perhaps decades of effort -- not rewritten lightly!
Many legacy codes can be retrofitted with simple parallel algorithms that allow reasonable efficiency for small-scale parallelism on local resources (including NOWs).
Large-scale calculations require parallel computing using a distributed-data model. Naive parallel algorithms are generally insufficient. Requires codes constructed from scratch.
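To make the cost range quoted above concrete, the short sketch below tabulates how much more work a calculation needs when the molecule size N grows by a given factor, for methods whose cost scales as N^2, N^4 and N^6. It assumes cost is exactly proportional to a single power of N and ignores prefactors and lower-order terms; the function names are hypothetical.

```python
def relative_cost(size_factor, exponent):
    """Cost multiplier when N grows by size_factor, assuming cost ~ N**exponent."""
    return size_factor ** exponent


def scaling_table(size_factor, exponents=(2, 4, 6)):
    """Map each scaling exponent to its cost multiplier for one size increase."""
    return {p: relative_cost(size_factor, p) for p in exponents}


if __name__ == "__main__":
    # Show the multipliers for doubling and for a tenfold increase in N.
    for factor in (2, 10):
        for exponent, multiplier in scaling_table(factor).items():
            print(f"N x{factor}, cost ~ N^{exponent}: {multiplier}x the work")
```

Under these assumptions, doubling the molecule size multiplies the work by 4, 16 and 64 respectively, which is one way to see why the higher-order methods push towards large-scale distributed-data parallelism.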

Computational Chemistry at NPAC -- MOPAC
http://www.npac.syr.edu/users/thlin/Mopac.html
Widely used nearly-free legacy application, 30,000 lines of Fortran77
Solves the Hartree-Fock/self-consistent field (SCF) problem using "semiempirical" (approximate) representations of the electronic interactions.
Applicable to large molecules (including biomolecules).
Majority of computations is concentrated in:
  construction of the Coulson electron density matrix (embarrassingly parallel: implemented in MPI)
  diagonalization (the original Mopac routines replaced by the PEIGS library developed at PNNL)
The parallel implementation is being tested on the Cornell SP-2 now

Computational Chemistry at NPAC -- NWChem
http://www.emsl.pnl.gov:2080/docs/nwchem/nwchem.html
New (begun 1993) computational chemistry package designed specifically for large-scale calculations on MPPs.
NPAC is collaborating with Pacific Northwest National Laboratory (PNNL), which leads the development.
Includes many comp. chem. methods: molecular dynamics, ab initio self-consistent field (SCF) and correlated methods.
Implemented in Fortran77 and C using a distributed-data approach. All data larger than O(N) is distributed.
Based on the Global Array Toolkit -- provides the programmer with a one-sided shared-memory programming model regardless of underlying platform
Portable: implementations for distributed memory, shared memory, distributed clusters of SMP nodes, NOWs, I-WAY
Exposes the NUMA nature common to all platforms to the programmer -- efficient portable algorithms consider or use NUMA
Designed for straightforward migration to HPF

Global Array Toolkit (PNNL)
http://www.emsl.pnl.gov:2080/docs/global/ga.html
Note: Matrix Formation and Algebra underlies much Chemistry
Provides the programmer with a one-sided shared-memory programming model regardless of underlying platform
Interfaces with parallel linear algebra libraries: PeIGS, ScaLAPACK, ISDA, etc.
Exposes the NUMA nature common to all platforms to the programmer -- efficient portable algorithms consider or use NUMA
Portable -- implementations available for:
  Distributed memory (interrupt-driven messages)
  Shared memory (using SysV shared memory features)
  Clusters of SMP nodes, NOWs, etc. (shared memory within cluster, data server process for inter-cluster comms via simple message passing)
  I-WAY (data replicated on distant MPPs)
Designed for straightforward migration to HPF

Computational Chemistry at NPAC -- Related Projects
Web-based Global Arrays (similar to WebHPF) being developed by Kivanc Dincer (NPAC)
Parallel I/O requirements of NWChem algorithms, with Alok Choudhary (SU) and Dan Reed (UIUC)
Global Arrays on top of Active Messages, with Nikos Chrisochoides (Cornell)
Development of new theoretical methods and new algorithms for large-scale correlated calculations
Chemical applications in collaboration with Syracuse and PNNL chemists
Future Plans: Model computational chemistry applications in HPF (port from Global Array-based algorithms)

AskNPAC about Chemistry -- NHSE Discipline Specific Resource
Problem: Knowledge and discussions of interest to chemists are scattered all over the Internet -- hard to find and use!
  Nearly 90 mailing lists/newsgroups identified on the first pass
  Many lists not archived or offer only e-mail retrieval (tedious)
  Search capability very limited or nonexistent
  Info is too widely distributed
Solution: Use the AskNPAC news Web linked database technology to provide a single point of contact, archiving, and search capability via WWW
  AskNPAC already supports archives (primarily newsgroups) in Computers & Software, Education & Kids, Politics, New York State & Health, Jobs
  Use with largely e-mail-based discussions in a particular discipline puts a different "spin" on the technology

AskNPAC about Chemistry -- NHSE
Provides "one stop shopping":
  Archiving (persistence)
  Structured searching (headers vs. body, URLs, phrases, etc.)
  Hypermail-like browser
Future plans:
  Public roll-out when the archive is large enough to make searches worthwhile
  Use search capabilities to extract announcements of software, web resources, etc. for further cataloging (i.e. NHSE)
Contact: David Bernholdt or Gang Cheng, {bernhold,gcheng}@npac.syr.edu

HPJava Study Rationale
Java is rapidly becoming a dominant distributed computing language, driven by the breadth and depth of the World Wide Web.
It implements a natural object or Applet distributed parallelism, combined with a classic lightweight thread mechanism within a given applet, i.e. within a given (SMP) processor.
HPCC has developed technology and the application pull for large scale computation with typically tighter synchronization constraints than those of Java.
Further, HPCC can benefit from the pervasive software base illustrated by the Web in general and Java in particular.
Correspondingly, there are many emerging Web based applications which will need large synchronized computation.

HPJava Study is in Draft Form
For these reasons, we thought it useful to examine the confluence of HPCC and Java -- referred to as HPJava (without knowing what this is!).
In particular, it is natural for PCRC to examine its software infrastructure and see how it should be structured/changed to support HPJava.
http://www.npac.syr.edu/users/gcf/hpjava3.html is not a proposal or plan. Rather, it is an often conflicting(!) study of issues that emerge when you place Java and HPCC next to each other.
Whether data parallelism is useful in Java is controversial!
More thoughtfully, we study:
  Programming Model suggested by HPJava
  What is the role of Optimizing Compilers in (HP)Java?
  What are the performance issues -- can we separate out current implementations from intrinsic issues?
There are a large number of important experiments in the community. Other topics include the role of CORBA, Security, and the Model for communication in Java PSE and the Web --- List of Foils - I PSE and the Web -- Base Concepts PSE and the Web -- Evolution Path Web Phase Transition/Revolution '95/'96 Web Phase Transition/Revolution '95/'96 (cont) Web Expansion Phase -- '96 and Beyond Web Tech Development: Commerce vs Academia NPAC Strategy: Technology and Application Niches Web Technologies at NPAC: Terms and Concepts Web Technologies at NPAC: Current Status Some Web Technologies at NPAC: WebAMR Example PSE and the Web --- List of Foils - II Example WebPSE Applications Multi-purpose Bridge Technology -- Overview Multi-purpose Bridge Technology -- Examples CareWeb for Telemedicine/Nursing - I CareWeb for Telemedicine/Nursing - II Command and Control Distance Education and Science Collaboratory Large Scale Numerical Computing Web based HPCC at NPAC -- URLs PSE and the Web -- Base Concepts We adopt a broad view/definition of a Problem Solving Environment as a distributed system capable of attacking complex, so far unsolvable problems by integrating information and computation. This includes both scientific and large scale enterprise computing systems. Development of such systems has so far been slow/inefficient due to the separation of the (lower end) PC computing world (focused on productivity tools for information processing such as authoring or database) and the (higher end) UNIX computing world (focused on HPCC). Both platforms are now becoming integrated by the Web phenomenon in the form of what we call the WebWindows paradigm. Hence, Web based distributed computing within the WebWindows framework offers a novel, unique and powerful model for PSEs that include and integrate NII and HPCC components. 
PSE and the Web -- Evolution Path We discuss here the multi-prong evolution path of Web computing, starting from the "Web Revolution '95/'96" and extrapolating from the current "Web Expansion Phase" towards the WebWindows based WebTop Systems to come. We review both the base Web Technologies under development (commercially and in academia) and selected pilot projects / prototype applications at NPAC Base NPAC Technologies include: WebTools, WebVM, WebWork, WebFlow, Bridge Topologies, WebWindows, WebTop Systems Prototype NPAC Applications include: Telemedicine, Command and Control, and Large Scale Computation Problems such as RSA Factoring-by-Web, Adaptive Mesh Refinement and Visible->Virtual Human Web Phase Transition/Revolution '95/'96 - I Until '95, the Web was perceived as a hot novelty with yet to be defined serious killer applications (other than Mosaic->Netscape), market segments and business models. In '95/'96, industry started to embrace the Web as a promising viable platform for doing new style agile (faster, cheaper and better) computing in a variety of information and computation domains. There is still no single large scale killer application but rather a growing set of Intranet based information/computation systems that use open Web technologies to prototype proprietary systems for internal corporate use in a more cost effective fashion than in the PC/Windows framework Web Phase Transition/Revolution '95/'96 - II Some factors that contributed to this phase transition are: Netscape2 plug-in support that attracted many software vendors to the Netscape platform Successful Java marketing by Sun Netscape's agile response to and support for Java, augmented by JavaScript/LiveWire add-on Success of VRML/VAG forum that caused all major computer vendors to compete in the open VRML 2.0 design proposal contest Aggressive use of the Web itself (and all of us...) 
by Netscape/Sun for marketing, distribution, debugging and customer feedback support of the Web software products via the sequence of ever changing/never really working alpha/beta releases Why do we put up with so many bugs! Web Expansion Phase -- '96 and Beyond As of today (May '96), we are in the middle of the "expansion phase", triggered by the '95/'96 "revolution" Netscape3 is out with support for Live3D (inlined VRML) and CoolTalk (Internet Phone). LiveMedia (inlined video based on OpenDVE/RTP) is coming soon. FastTrack Web servers now offer page authoring, site management, mailing services, scripted database interfaces and so on. A mortal Netscape-Microsoft war is imminent. Meanwhile, JavaSoft expands with a different strategy, by collaborating with Microsoft and seeking new niches such as support for transparent database backends (JDBC), distributed computing (DMI), CASE tools (JDE), open browser technology (beta HotJava), and a family of extension API's -- Media, Commerce, Security, Collaboration. Meanwhile, Sun is attempting a JavaVM based JavaOS, and GNU is addressing an open multiplatform JavaVM implementation. Meanwhile, the open Web community (with the major concentration in the VRML forum) is starting to address an "open JavaScript" that would challenge JavaScript/LiveWire and eventually Java. Web Tech Development: Commerce vs Academia We correctly predicted the "revolution" and the current "expansion phase" in our WebWindows presentation at the CRPC annual meeting 1 year ago in Houston. What was a prophecy 1 year ago has become everyone's idea today. What is still non-trivial is the proper strategy for academic computer science, given the unprecedented speed of the Web expansion and rapidly melting boundaries between Windows and UNIX. We view the area of distributed Web based computing for PSE as a promising niche, given that industry will continue its focus on client-server aspects of the Web where the near term profit can be made. 
We expect Web based PSE to evolve starting from the top-down Web interfaces to previous generation legacy systems (current state-of-the-art) and moving towards genuine bottom-up Web infrastructure (WebWindows). NPAC Strategy: Technology and Application Niches At NPAC, we follow a two-prong strategy, including component technology prototyping and integration technology development. -- Some component technology prototype projects are short lived -- their goal is to provide us with 'look ahead' and placeholders for the industry modules to come For example, NPAC '95 prototypes such as WebTools, WebMail or WebDBMS are now being (partially) offered by Netscape in their '96 browsers and servers. Meanwhile, however, we were able to make the next step in the conceptual design and start addressing the larger scope Web software system integration issues in terms of our '95 component technology prototypes. We call them WebTop Systems or distributed applications based on leading edge Web technologies and the emergent WebWindows operating system. Web Technologies at NPAC: Terms and Concepts WebTools -- PDA-like Web based environment using CGI (to be improved)-extended personal Web servers to handle file management, e-mail etc. 
WebWisdom -- JavaScript prototype for managing information hierarchies in the electronic presentation space for Virtual University WebWork -- a mesh of WebTools-like servers, coordinated to perform a common (potentially world-wide) distributed computational task WebVM -- an abstract VM implemented in terms of / on top of computationally extended evolving Web technologies WebFlow -- a dataflow paradigm for visual programming on top of WebVM in terms of interactive browser tools (Java based visual flow editor) WebHPL -- a high level object-oriented interpreted language on top of WebVM to support Web based HPCC WebSpace -- pervasive collaboration support, based on Web/Oracle and Java collaboratory servers WebWindows -- a Web(VM) based operating/windowing environment under collaborative development by the Web community WebTop Systems -- an ensemble of toolkits and integration framework for WebWindows application development Web Technologies at NPAC: Current Status WebTools -- CGI/Perl prototype in '95, now being augmented by / integrated with Java, JavaScript and Web/Oracle. WebWisdom -- Prototype done -- Productize next! WebWork -- proof-of-the-concept application to factor RSA130 by a tree of Web servers (completed Apr'96) WebVM -- minimal CGI prototype operational; work in progress on Java server support. WebFlow -- simple Java applet for visual Web programming operational. Near term application testbeds: AMR, 3D Visible Human. WebHPL -- WebHPF operational. Simple interpreted "little language" layer planned for WebAMR. WebWindows -- exploring all three current platform candidates: UNIX, WindowsNT, JavaOS (HotJava, JDE, DMI, JDE). WebTop Systems -- exploring Bridge topology as a reusable integration framework. 
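The WebWork idea of a tree of servers coordinated on one computational task can be illustrated with a minimal in-process sketch. This is not the FAFNER or WebWork code: `ServerNode`, `build_tree` and the placeholder task are hypothetical names, the "servers" are plain Python objects rather than CGI-extended Web servers, and the task simply counts multiples of seven where a real factoring run would sieve for relations.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ServerNode:
    """Hypothetical stand-in for one Web server in the tree (no real HTTP here)."""
    name: str
    children: List["ServerNode"] = field(default_factory=list)

    def run(self, start: int, stop: int, task: Callable[[int, int], int]) -> int:
        """Process the half-open range [start, stop) and return the combined result."""
        if not self.children:
            # Leaf server: do the actual work on its own slice.
            return task(start, stop)
        # Interior server: split the range among its children and sum their results.
        n = len(self.children)
        size = stop - start
        total = 0
        for i, child in enumerate(self.children):
            lo = start + (size * i) // n
            hi = start + (size * (i + 1)) // n
            total += child.run(lo, hi, task)
        return total


def count_multiples_of_seven(start: int, stop: int) -> int:
    """Placeholder task; assumed here only to make the sketch checkable."""
    return sum(1 for x in range(start, stop) if x % 7 == 0)


def build_tree() -> ServerNode:
    """Build a small two-level tree: a root, two interior nodes, five leaves."""
    leaves_a = [ServerNode("a1"), ServerNode("a2")]
    leaves_b = [ServerNode("b1"), ServerNode("b2"), ServerNode("b3")]
    return ServerNode("root", [ServerNode("a", leaves_a), ServerNode("b", leaves_b)])
```

Calling `build_tree().run(0, 1000, count_multiples_of_seven)` gives the same count as running the task directly on the whole range, because the interior nodes hand out contiguous, non-overlapping slices.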
Web Technologies at NPAC: WebAMR Example We illustrate here how the individual component technologies cooperate in a complete application, WebAMR (Adaptive Mesh Refinement) A mesh of computationally extended Web servers, connected via HTTP based message passing, acts as a WebVM that runs PDE solver modules for individual grids In a simple static AMR topology (WebWork model), a tree of refined meshes is constructed by the user via AVS-like visual programming tools (WebFlow) Dynamic AMR trees require interpreted programming support -- a pilot "little language" design towards WebHPL WebAMR applications can be configured and run on heterogeneous clusters, including any WebWindows compliant platform An example of a WebTop System in this domain is a set of WebVM/WebFlow modules, packaged and customized as a PDE Toolkit for a given Grand Challenge community. Example WebPSE Applications CareWeb for Telemedicine -- local community network to support electronic student health record database and collaborative diagnosis by nurses, nurse practitioners and pediatricians. Command and Control -- innovative use of Web technologies for integrating a suite of large scale applications (weather, electromagnetic scattering, telemedicine, GIS) contributing to military Command and Control. Distance Education and Science Collaboratory -- content (Virtual University, Living Schoolbook) and technology (WebFoil, WebSpace/LabSpace) development for delivering education over the Internet and providing collaboratory links between students and mentors. Large Scale Numerical Computing -- A set of pilot projects that explore Web based HPCC starting from simple computational topologies. Current prototypes include: RSA Factoring-by-Web, Adaptive Mesh Refinement for PDEs, 3D Visible Human. Multi-purpose Bridge Technology -- Overview Most of the real world WebTop Systems will involve multi-user collaboratory modules. 
Even for scientific computing, complex toolkits such as WebAMR will be most conveniently supported by interactive consultation between developers and users. Collaboratory multi-user components will be further enhanced in enterprise, commerce and community systems. This is illustrated in our recent telemedicine prototype for nursing triage. Here we start from the collaboratory component involving nurses, nurse practitioners and pediatricians and add HPCC components such as medical imaging and agent based diagnosis. We view the Bridge topology (Warner & Balch '95), underlying such telemedicine systems, as a promising generic framework, applicable also to other problem domains. A generic bridge includes "points of need", "points of expertise" and intelligent middleware that manages information resources and provides connectivity between customers and optimal services. The Bridge point of expertise is consistent with the Anchor Desk in JWID military exercises Multi-purpose Bridge Technology --- Examples We present here examples of the bridge topology, instantiated in various application domains:
Domain | Points of Need | Points of Expertise | Typical Services
TeleMedicine | Nurses, HomeCare Units | Nurse Practitioners | Diagnosis
Command and Control | Troops | Commanders | Decision Making
Distance Education | Learners, Students | Teachers, Consultants | Mentoring
Commerce | Consumers | Vendors | Product Support
Science Collaboratory | Schools, Small Businesses | Scientists | Popular Science, Technology Transfer
CareWeb for Telemedicine/Nursing - I Community collaboration including NPAC, SU College of Nursing, Syracuse City School District and SUNY Health Science Center (Univ. Hospital). Initial goal is to provide electronic student health record database, healthcare education and Web based interactive consultation between nurses, nurse practitioners and pediatricians. Trial demo implementation completed May'96. Trial deployment in select New York and North Carolina schools expected in fall '96. 
The CareWeb core module is an Oracle database at NPAC with a WOW/OWA/Internet gateway, remotely accessed by CareWeb customers. The system integrates and offers customized access to ~30 databases, managing information about users, health education resources, and patient health records. CareWeb for Telemedicine/Nursing - II Typical CareWeb databases include: Customers, Connections, Transactions, Schools, Teachers, Nurses, Nurse Practitioners, Doctors, and Record Components such as Immunizations, Screening Tests, Health Histories, Progress Notes, Visit Logs, Assessment Forms etc. CareWeb Information Pages offer customized educational support for healthcare personnel, students and parents, together with decision tree support to be used in the next project stage for agent-based automated diagnosis generation and verification. Interactive consulting is based on shared record pages, optionally synchronized via phone chat and/or WebCast support, and VIC/VAT video support for 'talking heads' and/or video feeds from (Welch Allyn) multi-purpose fiberscopes for ear, nose and throat inspection. The system offers multi-level security, including Internet guests (with anonymous limited access), CareWeb friends (with registered restricted access) and CareWeb customers (nurses, nurse practitioners, doctors, parents) with secure password based access, individual home pages and customized information/operational spaces. 
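The three access tiers described above (guests, friends, customers) can be pictured as an ordered set of levels checked against a per-resource minimum. The sketch below is only an illustration under assumed names: the `Access` enum, the `REQUIRED` table and the resource labels are hypothetical and do not come from the CareWeb implementation, which sits behind an Oracle gateway.

```python
from enum import IntEnum


class Access(IntEnum):
    """Ordered access tiers; a higher value means more privileges."""
    GUEST = 1      # anonymous Internet guest, limited access
    FRIEND = 2     # registered CareWeb friend, restricted access
    CUSTOMER = 3   # password-authenticated nurse, practitioner, doctor or parent


# Hypothetical mapping from resource type to the minimum tier allowed to view it.
REQUIRED = {
    "information_page": Access.GUEST,
    "education_resource": Access.FRIEND,
    "patient_record": Access.CUSTOMER,
}


def can_view(level: Access, resource: str) -> bool:
    """Return True if the given tier may view the resource.

    Unknown resources default to the strictest tier, so a missing
    table entry never opens access by accident.
    """
    return level >= REQUIRED.get(resource, Access.CUSTOMER)
```

Defaulting unknown resources to the strictest tier is a design choice of this sketch: it fails closed, which suits a system holding patient health records.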
Command and Control Real-time decision support for military Includes telemedicine as a special case for military medical activities It is the classic "system of systems" very suitable for loose integration with Web technologies HPCC applications include Image Processing, Tracking, Spatial Assessment, Weather, Electromagnetic Simulation Incorporates a Java/VRML based GIS in which we hope to integrate 3D terrain with output of weather simulations Netscape2/JavaScript prototype is "exact" copy of deployed system at Cheyenne Mountain -- consistent with COTS philosophy Distance Education and Science Collaboratory A group of Web based education projects at NPAC, including: Virtual University -- scalable CPS certificate to be offered over the Internet WebWisdom -- experiments with Web based electronic presentation technologies (includes instrumented HotJava and Netscape2 + JavaScript prototypes) Living Schoolbook -- broadband multimedia content for K-12 distance education over NYNET NPAC WebSpace -- Web based plug-in for schools and small business into the advanced science laboratory LabSpace developed by ANL Associated Web technologies include Java, HotJava, JavaScript, VRML, Web/Oracle, and a set of prototype collaboratory spaces such as: AskNPAC Chat -- Oracle Server and client pull for realtime e-mail -- apply to chemists for NHSE (Software Exchange) Java Chat --- Java collaboratory server and applet chat client CareWebCast --- remotely guided collaboratory database navigator Large Scale Numerical Computing RSA Factoring-by-Web -- collaboration with Arjen Lenstra and Boston/CSC. New NFS factoring algorithm successfully applied to RSA130 factoring on a tree of Web+CGI servers (FAFNER by Jim Cowie/CSC). SC'95 Teraflop Challenge Award. Next Challenge -- RSA155. WebHPF -- Web front-end to HPF compiler and PVM-based distributed runtime. Supports CASE tools for program development, process management and performance monitoring. 
Adaptive Mesh Refinement -- planned WebVM/WebFlow application to support Grand Challenge PDE solvers. Includes static AMR trees specified by visual authoring and dynamic trees, implemented via interactive scripting modules. Visible->Virtual Human -- 3D reconstruction of the human body, based on the image database from the National Library of Medicine. Currently implemented is the color segmentation stage (embarrassingly parallel), to be followed by WebVM/WebFlow based algorithms with non-trivial internode communication (surface reconstruction, object labelling and grouping). Next Steps in Visual Programming for Chaining and Aggregating Services -- WebFlow!! New powerful Web'96 technologies from Netscape, JavaSoft, Oracle, NeXT etc. will result in a new generation of interactive services A natural next step is to start Chaining (Integrating) such services into a distributed PSE by providing server to server communication and dataflow support However, Web'96 is also becoming increasingly complex with its competing and overlapping multi-lingual standards HTML, CGI, Perl, Java, JavaScript, LiveWire, VRML, VRMLScript Visual Programming for a multi-server Web (we call it WebVM) based dataflow (we call it WebFlow) is a natural next generation user-friendly programming environment We view the area of distributed Web based computing for PSE as a promising niche for NPAC and academic R and D where we expect industry to continue its focus on client-server aspects of the Web where near term profits can be made Web based HPCC at NPAC: URLs Overview --- http://www.npac.syr.edu/projects/webbasedhpcc WebTools --- http://king.syr.edu:2006/WebTools.html RSA Factoring-by-Web -- http://www.npac.syr.edu/factoring Distance Education / Virtual University -- http://www.npac.syr.edu/users/gcf/foilsbyarea.html WebSpace/Labspace -- http://www.npac.syr.edu/projects/webspace Web based Telemedicine -- http://www.npac.syr.edu/projects/careweb
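The WebFlow notion of chaining services into a dataflow, as described in the visual programming foil above, can be sketched as a small graph executor. This is an assumption-laden illustration, not the WebFlow applet: `run_dataflow`, `demo` and the module names are hypothetical, each "service" is a local Python function rather than a module on a Web server, and the ordering uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+).

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List


def run_dataflow(modules: Dict[str, Callable[..., object]],
                 inputs: Dict[str, List[str]]) -> Dict[str, object]:
    """Run each module once all of its upstream modules have produced output.

    modules: module name -> function standing in for a service on some server
    inputs:  module name -> ordered list of upstream module names
    """
    # TopologicalSorter takes a mapping of node -> predecessors.
    graph = {name: inputs.get(name, []) for name in modules}
    results: Dict[str, object] = {}
    for name in TopologicalSorter(graph).static_order():
        # Feed the outputs of the upstream modules, in declared order.
        args = [results[src] for src in inputs.get(name, [])]
        results[name] = modules[name](*args)
    return results


def demo() -> Dict[str, object]:
    """Wire four toy modules into a diamond-shaped flow and run it."""
    modules = {
        "load": lambda: [3, 1, 2],
        "sort": lambda xs: sorted(xs),
        "scale": lambda xs: [2 * x for x in xs],
        "merge": lambda a, b: list(zip(a, b)),
    }
    inputs = {"sort": ["load"], "scale": ["load"], "merge": ["sort", "scale"]}
    return run_dataflow(modules, inputs)
```

In `demo()`, `sort` and `scale` both consume the output of `load`, and `merge` waits for both; in a multi-server setting the two middle modules are the ones that could run on different servers at the same time.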