Computational Infrastructure for distributed collaborative data understanding
Florida State University

The software infrastructure is built around the concept of a collaborative portal, which generalizes and combines earlier work by Fox on the Gateway/WebFlow computing portal and the Tango Interactive collaborative system. The new system is being developed for both computing and education applications at Florida State University using a mix of NSF and DoD funding. The architecture for a collaborative portal comes from an examination of both existing portals and collaboration systems, and it addresses the requirements discovered from the first-generation systems. These reports are online:

a) http://www.new-npac.org/users/fox/documents/collabcompmay00/
b) http://www.new-npac.org/users/fox/documents/generalportalmay00/
c) http://www.new-npac.org/users/fox/documents/wapmay00/
d) http://www.new-npac.org/users/fox/documents/erdctraining.pdf

They cover, respectively, a) collaborative computing and visualization, b) portal architecture and services, c) integration of hand-held devices, and d) experience in distance education using the collaborative system Tango Interactive.

Collaborative portals are built as a four-tier architecture:

Backend resource - Middle-tier broker/server - Personal server - User display

Essential features of our new architecture include:

1) Use of a "personal server" whose interfaces are defined in a new XML-based specification, "portalML", which captures the user view of the system. This controls user customization, layout, etc. In our Gateway system we successfully used the two distinct XML interfaces, with portalML as the user view of objects and services and resourceML describing the system view.

2) The personal server(s) communicate with a conventional middle tier where "resourceML" defines the objects and their properties and methods. "resourceML" is being defined by different communities.
For instance, the Grid Forum activities include the definition for computing, and the IMS (http://www.imsproject.org/) and ADL (http://www.adlnet.org/) SCORM (Sharable Courseware Object Reference Model) organizations have preliminary XML specifications of learning objects. Note that IMS and ADL assume a rather old-fashioned client-server model, and their current standardization work deals with the client side in a way we consider unsatisfactory, as they have no analogue to portalML in their approach. Both Gateway (the system on which we are building) and the follow-on collaborative portal are fully integrated with Globus technology and so should integrate well with the NASA Information Power Grid (IPG).

3) The personal server includes much of the code that we put in the browser for our older systems, Tango and Gateway. The new architecture produces a more robust system and makes it easier to drive different client-side renderings (desktop, palmtop, or Cave (PowerWall at FSU)) from a common personal server, which includes key user-session-specific logic.

4) A federated (client and server side) XML-based event system with a backend database store supporting both asynchronous access and synchronous multicast of events between clients. This leads to robust support of synchronous and asynchronous collaboration. All messages in the system are captured as events, and an event delivery service replaces the traditional collaboration server.

5) The architecture and multiple interfaces in the collaborative portal system are designed to allow integration of outside modules such as the Access Grid (from ANL/NCSA) or RealNetworks audio-video conferencing technology. We also support the inclusion of commercial objects such as those from Blackboard, Lotus, and WebCT in the education area. This is the most difficult aspect of the system -- the Internet world is now so sophisticated that you cannot hope to build a complete system.
Rather, we adopt a standards-compliant collaborative framework and integrate capabilities from multiple sources.

Collaborative Portals in Data Understanding

The data understanding software infrastructure in this proposal will have several important capabilities:

1) Support existing and new data streams and analysis programs, "wrapped" so that they can be used as distributed objects with both Java XML and CORBA interfaces. We will produce a "wizard" that will allow users to register new entities. This wizard will be customized to the data understanding field.

2) The system can be used as a conventional science portal supporting many distributed researchers whose work is integrated "asynchronously" through the production and use of common programs and files (web pages) -- all viewed as distributed objects. In this mode, the web interface allows access to and execution of remote data and programs. This computing portal uses Globus to execute the programs and supports the graphical composition of general execution scripts. The latter is the WebFlow capability. The portal also allows integration of a complete set of information resources: papers, real-time streams, and database records are some of the digital objects that can be customized by an administrator, a user, or both.

3) Synchronous collaboration is supported to allow multiple researchers to share the objects of the system (e.g. a data set), with automatic updates done in real time. This allows shared visualization of both simulations and the geographical information systems used in many Earth science fields.

4) One novel concept supported by the portal is that of a subdomainlet. Many science fields are best thought of in a hierarchical fashion, with domains corresponding to a given resolution divided into separate subdomains. Approaches like multigrid or the renormalization group tackle the problems in each subdomain, integrate these together, and then integrate between resolutions.
Subdomainlets are designed as the natural digital object to support such physical systems. Subdomainlets register with the Jini technology and can be built as XML, Java, or mixed objects. In general they contain data and a program. For instance, in an adaptive mesh, a subdomainlet could be a set of mesh points with an attached program to refine this mesh. Subdomainlets can be prepared by the different collaborators in the portal and then integrated. In data understanding, the subdomainlets could be earthquake fault fragments prepared by different researchers, with programs attached to perform analysis of the integrated domain.

Deliverables

Year 1: Base system customized for the chosen application area, with wrapping wizard
Year 2: Collaborative capabilities exploited for the original and one other application
Year 3: Complete portals with support of subdomainlets. We will give tutorials and otherwise explain use of the system to NASA scientists.

Short Description about FSU Sub-Contractor

This work will be done at Florida State University's new school for Computational Science and Information Technology (CSIT). Here Fox is director of the computational infrastructure facility, which includes a terascale system (an IBM SP that by 2002 will have over 500 nodes and 2.5 teraflops peak performance) and several SGI, Sun, and PC systems to support and integrate information systems. Fox was on the faculty of Caltech for 18 years and has collaborated with JPL on numerous occasions.

Budget

$204.3K per year

PI (Fox) Salary and Fringe (one month)      $18,000
Postdoc Salary and Fringe (12 months)       $53,000
Graduate Student Salaries (3)               $48,300
Graduate Student Tuition                    $10,500
Undergraduate (2 Summer)                     $7,000
Travel                                       $4,000
Expenses                                     $2,000
Overhead                                    $61,500
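
Illustrative sketch of a subdomainlet

The subdomainlet described in item 4 of Collaborative Portals in Data Understanding pairs data with a program and is later integrated with subdomainlets from other collaborators. A minimal sketch of that idea follows, using the adaptive mesh case as the example. Everything named here is hypothetical and chosen only for illustration: the Subdomainlet class, the refine and integrate functions, the one-dimensional mesh, and the researcher names. The proposal does not specify an interface, and Jini registration is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subdomainlet:
    """Hypothetical container: mesh points (data) plus an attached program."""
    owner: str
    points: List[float]
    program: Callable[[List[float]], List[float]]

    def run(self) -> "Subdomainlet":
        # Apply the attached program to the data and return a new subdomainlet.
        return Subdomainlet(self.owner, self.program(self.points), self.program)


def refine(points: List[float]) -> List[float]:
    """Insert the midpoint between each pair of neighboring mesh points.

    Assumes a non-empty, ordered list of 1-D mesh coordinates.
    """
    refined: List[float] = []
    for left, right in zip(points, points[1:]):
        refined.extend([left, (left + right) / 2])
    refined.append(points[-1])
    return refined


def integrate(parts: List[Subdomainlet]) -> List[float]:
    """Merge the points of several subdomainlets into one sorted domain,
    dropping duplicates at shared subdomain boundaries."""
    return sorted({p for part in parts for p in part.points})


# Two collaborators prepare adjacent subdomains that share the boundary 2.0.
a = Subdomainlet("researcher_a", [0.0, 1.0, 2.0], refine)
b = Subdomainlet("researcher_b", [2.0, 4.0], refine)

# Each subdomain is refined on its own, then the results are integrated.
domain = integrate([a.run(), b.run()])
print(domain)
```

Keeping the program inside the object means each collaborator can choose a refinement suited to their own subdomain, while the integration step only needs the resulting data.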