Integrated Three-Tier Architecture for

High-Performance Commodity Metacomputing

 

by

EROL AKARSU

 

B.S., Ege University, Turkey, 1991

M.S., Syracuse University, 1996

 

DISSERTATION

 

Submitted in partial fulfillment of the requirements for the degree

of Doctor of Philosophy in Computer Science

in the Graduate School of Syracuse University

 

December, 1999

 

Approved

                                                                                    Professor Geoffrey C. Fox

 

Date

Copyright 1999 Erol Akarsu

 

All Rights Reserved

Abstract

 

Programming tools that are simultaneously sustainable, highly functional, robust, and easy to use have been hard to come by in the HPCC arena. Thus we have developed a new strategy, HPcc (High Performance Commodity Computing), which builds HPCC programming tools on top of the remarkable new software infrastructure being built for the commercial web and distributed-object areas. This leverage of a huge industry investment naturally delivers tools with the desired properties, with the one (albeit critical) exception that high performance is not guaranteed. Our approach automatically gives the user access to the full range of commercial capabilities (e.g., databases and compute servers), pervasive access from all platforms, and natural incremental enhancement as the industry software juggernaut continues to deliver software systems of rapidly increasing power. We add high performance to commodity systems using a multi-tiered architecture with the Globus metacomputing toolkit as the back end of a middle tier of commodity web and object servers.

More specifically, we design and implement a three-tier system. Tier 1 is a Web browser-based graphical user interface that assists the researcher in selecting suitable applications, generating input data sets, specifying resources, and post-processing computational results. The distributed, object-oriented middle tier maps the user's task specification onto back-end resources, which form the third tier. In this way we hide the underlying complexities of a heterogeneous computational environment and replace them with a graphical interface through which a user can understand, define, and analyze scientific problems.


 

Table of Contents

List of Tables

List of Figures

CHAPTER 1 – INTRODUCTION

1.1 Related Work

CHAPTER 2 – High-Performance Commodity Computing

2.1 Introduction

2.2 Commodity Technologies and Their Use in Multi-Tier Systems

2.3 Three Tier High Performance Commodity Computing

2.4 Comparison of Three-Tiered and Two-Tiered (Client-Server) Architecture Models

CHAPTER 3 – DARP SYSTEM (DATA ANALYSIS AND RAPID PROTOTYPING)

3.1 Introduction

3.2 High-Level Architecture of DARP

3.3 DARP Server: Interactive control over an application

3.4 Instrumentation of the code

3.5 DARP Front End

3.6 Integrated Environment for HPF Compiler and Interpreter

3.7 Runtime Visualizations

3.8 Adding a different visualization engine to DARP system

3.9 Summary

CHAPTER 4 – WebFlow

4.1 Introduction

4.3 Three-Tiered Architecture of the WebFlow

4.3.1 The Front End

4.3.2 The Middle-Tier

4.3.3 Session Manager

4.3.4 Module Manager

4.3.5 Connection Manager

4.4 The Back End

4.5.1 Logging into the WebFlow system

4.5.2 Creating a New Module

4.5.3 Making a connection between modules

4.5.4 Run/stop/destroy modules

4.6 Limitations of Original WebFlow and Other Alternative Solutions

CHAPTER 5 – Gateway

5.1 Impact of Gateway on Seamless Access to HPCC Resources

5.2 Gateway Middle Tier

5.2.1 Motivation

5.2.2 Gateway Middle Tier

5.2.3 Lifecycle Service

5.2.4 Proxy Objects

5.2.5 Interactions of user modules

5.3 Gateway Back End

5.4 The Front End

5.5 Comparison of Gateway with EJB

CHAPTER 6 – Gateway Interfaces and Services

6.1 Gateway Interfaces

6.1.1 BeanContextChild interface

6.1.2 BeanContext interface

6.1.3 DARP Interface

6.2 Persistency and Configuration Service

6.3 Gateway Fault Tolerance Model

6.4 Gateway Security Access

6.4.1 First Security Layer: Secure Web Transactions

6.4.2 Second Security Layer

6.4.3 Third Security Layer: Control of Access to Back End Resources

6.5 Consequences of This Distributed Model

CHAPTER 7 – Gateway Applications

7.1 LMS

7.1.1 Description of the Project

7.1.2 Interaction between Casc2d and Edys simulations

7.1.3 LMS Middle Tier

7.1.4 LMS Back End

7.1.5 LMS Front End

7.1.6 Data Wizard

7.1.9 WMS

7.2 Quantum Simulation (QS)

CHAPTER 8 – Conclusions

8.1 A Possible Framework of a Client-Side Collaborative Environment

8.2 How do user modules written in COM interact with the ones in CORBA?

8.3 Gateway can act as a firewall to remote objects

8.4 Comparisons with other Component Models

REFERENCES

Vitae

GLOSSARY

 

 


 

List of Tables

 

 

Table 1. Data Access commands of HPF server

Table 2. Prototyping commands of HPF Server

Table 3. Control commands of HPF Server

Table 6.1. BeanContextChild interface

Table 6.2. Iterator and Collection interfaces

Table 6.3. GatewayContext interface extending BeanContext and DARP interfaces

Table 6.4. A simple representative DARP method

Table 6.5. How the DARP server gets the next command from the middle tier

Table 6.6. How the DARP server puts the result of a previous client command into the middle tier

Table 6.7. DARP interface

Table 6.8. Base attribute and three different attributes extending from the base

Table 6.9. An IDL definition file of an example user module

Table 6.10. XML description of the user IDL file in Table 6.9

Table 6.11. The definition of two methods for the Helper class of user attributes

Table 6.12. These Gateway API calls construct the configuration in Figure 5.6

Table 6.13. This XML document is saved if the user issues a saveStateInXML request to the master server. In addition, the user can construct the configuration in Figure 5.6 with this document

Table 6.17. Solving multiple declarations of modules with the XML ENTITY element

Table 7.1. Abstract job specification in XML document for configuration in Figure 7.3

 


 

List of Figures

 

Figure 2.1. Industry Three-Tiered View of Enterprise Computing

Figure 2.2. Today's Heterogeneous Interoperating Hybrid Server Architecture

Figure 3.1. DARP implementation within HPcc framework

Figure 3.2. The architecture of DARP

Figure 3.3. Middle-tier DARP manager controls HPF back end and communicates with other servers

Figure 3.4. HPF interpreter

Figure 3.5. A screen dump of a DARP session

Figure 3.6. Adding a new visualization server to the DARP system

Figure 4.1. Top-level view of the WebFlow environment

Figure 4.2. WebFlow Front End Applet

Figure 4.3. Bridge between WebFlow and Globus resources (the Grid)

Figure 4.4. Starting a user session in the WebFlow system

Figure 4.5. Creating a new module

Figure 4.7. Run/Stop/Destroy modules

Figure 5.1. Gateway architecture

Figure 5.2. A hypothetical distributed applet. Each panel (a container) of this applet is placed on a different host

Figure 5.3. A distributed Gateway application

Figure 5.4. A simplified representation of the Gateway middle tier

Figure 5.5. Naming of Gateway contexts and user modules

Figure 5.6. Gateway event model

Figure 5.7. Gateway services at back end

Figure 5.8. Basics of an EJB environment

Figure 6.1. The details of how addNewModule is executed in the Gateway system

Figure 6.2. Interaction of objects during the removal of Module M2

Figure 6.3. Making an association between Modules M1 and M2

Figure 6.4. Making a dissociation between Modules M1 and M2

Figure 6.5. How the DARP server, middle tier, and client interact with each other to fully control the distributed execution

Figure 6.6. Gateway Persistency Model

Figure 6.7. Recovering proxy module in root Gateway server

Figure 6.8. Recovering a remote user module

Figure 6.9. Gateway Security Architecture

Figure 6.10. Gateway Security Model

Figure 7.1. Logical structure of the LMS simulations implemented by this project

Figure 7.2. Exchange of data between casc2d (left-hand side) and Edys (right-hand side). It is important to note that casc2d is run only once. It pauses while waiting for the new data and quits only after all events are processed. In contrast, Edys is launched each time the data are needed

Figure 7.3. WebFlow implementation of LMS

Figure 7.4. WebFlow API calls to construct the configuration in Figure 7.3

Figure 7.5. LMS front-end main panel

Figure 7.6. A screen dump of a WMS session: the central window displays the DEM data just downloaded. The raw data must now be pre-processed, including selecting a watershed region, smoothing, and format conversion

Figure 7.7. LMS front-end simulation panel

Figure 7.8. Logical structure of the Quantum Simulation application

Figure 7.9. WebFlow implementation of the Quantum Simulations problem

Figure 7.10. Example WebFlow Session of QS

 


 

Acknowledgements

 

I would like to express my appreciation to my advisor, Professor Geoffrey C. Fox, for his guidance throughout my research and for his wise and acute observations on how to improve my work.

I gratefully acknowledge Dr. Tomasz Haupt, Dr. Wojtek Furmanski, Dr. David Bernholdt, Dr. Edward Lipson, and Dr. Ehat Ercanli for serving on my defense committee.

I thank Dr. Tomasz Haupt for helping to give shape to my thesis and for preparing many demos of DARP, WebFlow, and Gateway. I cannot forget the long discussions we had in order to find the best solutions to several problems. I also thank him for applying DARP and Gateway to Quantum Simulation and the Land Management System.

I would like to thank Dr. Furmanski for preparing several demos and papers. I worked with his group to integrate DARP with WebFlow, and I was greatly motivated by the many ideas he gave me.

Also, I want to thank my wife for her enormous patience during the course of my work toward a Ph.D. degree.  I could not have finished it without her help.

I am especially grateful to Mrs. Elaine Weinman for her help in English and for her patience in proofreading endless revisions of this manuscript.

I would like to express my appreciation to many friends who contributed to my work with their encouragement. Among them, Dr. Kivanc Dincer, Dr. Haluk Topcuoglu, Mr. Erdogan F. Sevilgen, Mr. Mehmet Sen, Mr. Ozgur Balsoy, Mrs. Zeynep Ozdemir, and Mr. Hasan Timucin Ozdemir deserve special thanks. Their moral support and suggestions were of great value.


CHAPTER 1 – INTRODUCTION

 

Developing large applications is a complex process, and the assistance of adequate programming tools is always welcome. Not surprisingly, there are numerous commercially available tools for this purpose. Visual debuggers, profilers, and data analysis and visualization packages are integral parts of the workstation environments of scientists and engineers. The situation is different for high-performance, parallel, or distributed architectures. Performance tuning, debugging, and data analysis are more difficult there, and yet tools for these purposes that are simultaneously sustainable, highly functional, robust, and easy to use are not widely available in the HPCC arena. This is partially due to the difficulty of developing sophisticated, customized systems for what is a relatively small part of the worldwide computing enterprise. If we consider the entire computing market, the user base of different types of computing can be illustrated as a pyramid with a narrow top and a much wider base. HPCC technologies have been developed at the top of the computing pyramid, mostly by federally funded organizations, where moving down into a broader user base was encouraged [Fox96]. However, the results have never been completely satisfactory, due to the lack of common open interfaces between personal computers with Windows-based user interfaces and Unix workstations.

Even if we had HPCC programming tools, we would still have problems with some types of applications, specifically metacomputing applications (meta-applications). There are many definitions of “metacomputing,” a term whose origin is believed to have been the CASA project. Larry Smarr, the NCSA director, defined it as “the use of powerful computing resources transparently available to the user via a networked environment,” and he is generally credited with popularizing the term. We define metacomputing as “a means of integrating legacy codes into distributed environments and providing the user with seamless access to remote resources.” We consider the metacomputing environment to be a “computational grid” [HPccGridBook] that gives dependable, consistent, and seamless access to computational and remote resources. A computational grid is a dynamic environment that any type of computational resource can join or leave. A meta-application is usually an interdisciplinary application that works on a computational grid and needs heterogeneous machines, scientific instruments, archival storage, visualization, and multiple users working together.

Real-world systems confront the software developers who model them with many computational complexities. A metacomputing application needs not only a substantial number of cycles, but also the use of heterogeneous models and various hardware and software resources with which to implement the various parts of the solution. However, there is little software support available for incorporating such heterogeneous resources into a single virtual program. The resulting model is one in which an application consists of distributed data and individual programs (scientific algorithms or database and visualization servers), where files are used to transfer data from one component to another. There is also a configuration problem: the user has to manually start components separately on various machines and create connections (data and control) between them. There is little software in the HPCC community for configuring such a collection of components into a single virtual application (meta-application). The HPCC and metacomputing research communities have produced remarkable software for various types of parallel machines, as well as software infrastructures for building meta-applications [FK96globus, LG96legion]. Unfortunately, there is no high-level visual development environment in which we can use different parallel machines, commodity machines (Windows NT PCs), and clusters (of PCs or workstations) together as one virtual machine to solve meta-applications.

For example, various DoD- and DoE-funded projects have produced multiple ecosystem management and modeling tools, such as Terrain Modeling and Soil Erosion Simulation, Ecological Modeling for Military Land-Use Decision Support, and the Watershed Modeling System, which are really legacy codes written in different languages and possibly running on different machines. Managing land and water resources is a challenging task that needs an integrated modeling/decision-support environment capable of simulating atmospheric-surface/water-groundwater connectivity, cleaning and rehabilitating contaminated sites, managing coastal zone, watershed, and riverine resources, and so on. Even though these modeling tools help land and water resource managers, they currently are disconnected codes that must be united into an integrated framework to achieve their highest productivity. We emphasize that this integrated framework is the computational grid we must establish. Therefore, a new initiative, the Land Management System (LMS), has begun to design, develop, support, and apply an integrated capability for the modeling and decision-support technologies needed for applications relevant to the management of DoD land, water, and airspace. The most important decision to be made here is whether or not we are going to provide an environment that brings relevant science and technology to DoD land managers in a complete and responsive manner. We have to use existing diverse investments in science and technology (commodity software) to design an evolutionary and scalable computational grid environment that gives a uniform point of access to both scientists and managers, so that we can maximize the synergism between them. Our grid environment allows us to develop protocols for model-to-model and model-to-data connectivity, so that new technology investments in modeling and simulation, basic science, and information technology will seamlessly integrate with new data collection, assimilation, and management activities.

This dissertation proposes and develops Gateway, an integrated environment for High Performance Commodity Metacomputing (HPCM). We consider the contributions of this work to be the following:

·        Gateway provides an integrating layer such that user-defined front ends and high-performance computing or commodity back-end elements (databases, visualization servers, instrumentation servers, directory servers, etc.) can be plugged dynamically into Gateway.

·        Gateway makes it possible to configure and create a hierarchical meta-application across heterogeneous machines through the Gateway API or by preparing one's own abstract job specification in XML and introducing it to Gateway. After starting the meta-application, Gateway gives the user full control over the running application, and it saves and restores the application's distributed state with industry-standard XML.

·        Gateway allows source-level debugging and monitoring of each component of the application when a malfunctioning component is found. Currently, few tools aid in debugging distributed applications.

·        Gateway permits legal program statements to be interpreted in the same language as the component, and thus facilitates the dynamic prototyping of complex software systems.

·        The user can easily and dynamically attach metacomputing, visualization, database, or batch-job-submission services to the Gateway system, and individual components in the system can use them immediately.

·        A unique feature of the Gateway system is that it combines high-performance services with commodity computing tools without sacrificing high performance.

·        Security and transaction protocols can easily be added to the Gateway system simply by putting these policies into the proxy objects for each individual remote component. We already have a facility for generating the necessary JDBC calls for component attributes and for automatically storing and restoring the component state to/from the database.

·        The Gateway system provides a server-side object-collaboration framework and also permits TANGO [Tango97SIAM], a client-side interactive collaboration system, to be plugged into Gateway.

 

We followed our new strategy of High-Performance Commodity Computing (HPcc) [HPCCEuroPar98Gf] to construct the Gateway framework. HPcc builds HPCC programming tools on top of the remarkable new software infrastructure being built for the commercial web and distributed-object areas. This leverage of a huge industry investment naturally delivers tools with the desired properties, with the one (albeit critical) exception that high performance is not guaranteed. The user can build his metacomputing environment as a multi-tier architecture that has a Web-based visual front end and a middle tier consisting of multiple Gateway servers and back-end modules. We add high performance to commodity systems with the Globus metacomputing toolkit as the back end. This three-tiered system architecture is very similar to the Enterprise JavaBeans (EJB) architecture.

We performed two experiments on the way to creating Gateway. First, we built DARP (the Data Analysis and Rapid Prototyping system), an integrated program-development environment. DARP helps the user build a fine-grained meta-application and gives full control over remote applications. Second, together with other NPAC project members, we participated in developing and extending WebFlow, the primitive version of Gateway. Finally, we constructed Gateway based on these experiments.

We applied Gateway to three problems. The first two, Quantum Simulation (QS) and the Land Management System (LMS) described above, are coarse-grained meta-applications. The last one is a “seamless access” infrastructure for HPCC resources, which has a three-tiered architecture whose middle tier is Gateway.

The middle tier is based on the commercial distributed-object technology CORBA and works with the CORBA product of any ORB vendor. We especially benefited from JavaBeans, Enterprise JavaBeans (EJB), the CORBA component and multiple-interfaces specifications, and the adapter and delegation design patterns.

 

1.1 Related Work

 

UNICORE provides seamless and secure access to distributed computing resources using the World Wide Web. It has a three-tier architecture similar to that of Gateway. In tier one, the Job Preparation Agent (JPA) applet lets the user prepare an Abstract Job Object (AJO) from a group of abstract Java classes and send it to the middle tier, the UNICORE site (which operates a gateway running an https server). The middle tier includes a security servlet where the user's certificate is authenticated and mapped to a local Unix userid. The Network Job Supervisor (NJS) interprets the AJO, possibly using Java reflection mechanisms, and either creates a job for the third tier, the local Resource Management System (RMS), or forwards sub-jobs to other UNICORE sites. We provide the facility both to prepare more flexible XML documents to construct abstract jobs and to construct user jobs interactively through the Gateway API.

WebSubmit [WebSubmit], a product of the National Institute of Standards and Technology (NIST), is similar to UNICORE in that it is a mechanism for obtaining a seamless interface to computational platforms using the web. Instead of being Java-based, it uses more traditional CGI scripts, mainly based on Tcl. Tasks to be submitted are generated in an application- and platform-specific format at the time of creation. Job consignment and result delivery are handled as a single (possibly long) transaction. Currently, the implementation is specific to the LoadLeveler batch subsystem.

Legion [LegionACM97] is an object-based project for developing software in support of a “worldwide virtual computer.” The project envisions a system in which a user sits at a Legion workstation and has the illusion of a single, very powerful computer. When the user invokes an application on a data set, it is Legion's responsibility to schedule application components transparently onto processors, manage data transfer and coercion, and provide the necessary communication and synchronization. System boundaries, data location, and faults are to be made invisible. Since the user should not be aware of the specific physical resources comprising the virtual metacomputer, Legion itself must locate, schedule, and synchronize the necessary resources at run time. Legion proposes to establish its own model for security, allowing each user to establish security restrictions on a per-object basis. Legion is a very ambitious project at the forefront of computer science.

Globus [GlobusIntlJ97] is a large U.S.-based project that is developing the fundamental technology needed to build computational grids: execution environments that enable an application to integrate geographically distributed instruments, displays, and computational and information resources. As with the Legion project, the primary focus of Globus is to combine distributed resources into a virtual metacomputer in support of grand-challenge (possibly parallel) problems. This implies a need to synchronize access to resources at run time and to support, in the interest of bandwidth availability, message passing among components of the application over a network connection under the control of Globus components.

Our main intent in this thesis is to build a user-friendly, Web-based metacomputing environment. We used the Globus toolkit whenever we needed high performance in the back end (third tier). But just as front ends for different problem domains can be plugged into the Gateway system, our Gateway middle tier is generic enough that any low-level (Globus), object-based (Legion), or other type of metacomputing toolkit can be substituted for another. Because we follow a commodity-computing model (HPcc), we strongly believe in using whatever software is available for solving any problem.

One of the distributed systems most similar to Gateway is Sun Microsystems' Jini [JINIWeb], whose main goal is to create a new distributed-computing architecture. It is an object-oriented framework that embodies a model of how devices and software interoperate, as well as how distributed systems function. The infrastructure consists of two main components: “Discovery and Join” and “Lookup.” Discovery and Join is a two-phase mechanism by which a device or application identifies itself to the network. First, the entity broadcasts a discovery package that contains sufficient information to enable the network to start a dialogue with the entity that has just joined. Second, once acknowledged, the entity can join by sending a message containing details about its own characteristics. Lookup is a component that stores information about Jini-registered devices and applications. Clients use Lookup to find the services that they wish to access. The distributed programming model used by Jini promotes three technologies: leasing, distributed transactions, and distributed events. Leasing is when an object negotiates the usage of a service for a period of time. Communication within Jini is based on Remote Method Invocation (RMI). The Jini distributed-programming model is based on the JavaSpaces model, which is itself based on Linda.

Jini supports a distributed model based on the flow of objects among processes, as opposed to method calls. Gateway adopts the publish-subscribe model for services, whereas Jini uses the template-matching model: it finds an entry E in a JavaSpace through a lookup operation with an appropriate template T. The template T can match an entry E in the space only if the non-null field values of T are matched exactly by the values in the same fields of E. Gateway identifies services with just a simple name that the user can choose. Whenever a service is attached to or detached from the Gateway system, all of the registered objects in the Gateway system are notified.
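
To make the contrast concrete, the sketch below shows Jini-style template matching against the standard JavaSpaces API; the ServiceEntry type, its fields, and the timeout value are our own illustrative inventions, not part of either Jini or Gateway:

    import net.jini.core.entry.Entry;
    import net.jini.space.JavaSpace;

    // A hypothetical service-entry type. JavaSpaces entries expose public,
    // object-valued fields, and null fields in a template act as wildcards.
    public class ServiceEntry implements Entry {
        public String name;   // e.g., "visualization" -- an invented service name
        public String host;

        public ServiceEntry() {}                      // required public no-arg constructor
        public ServiceEntry(String name, String host) {
            this.name = name;
            this.host = host;
        }

        // Find a service by name: read() returns an entry whose fields exactly
        // match every non-null field of the template.
        public static ServiceEntry find(JavaSpace space, String serviceName)
                throws Exception {
            ServiceEntry template = new ServiceEntry(serviceName, null);
            return (ServiceEntry) space.read(template, null, 10000L);
        }
    }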

The event model of Jini is similar to Gateway's event model. Whenever a Jini client writes an entry matching a particular registered template into a JavaSpace, the registered listeners are automatically notified. Jini's event model supports interposing a third-party object between the event producer and the consumer. This intermediary object can behave as a mailbox, store-and-forward, or notification-filter agent. Gateway supports these types of agents in exactly the same way. All events fired by any object in the Gateway system are captured by the parent context of that object and forwarded (“push” event type) or kept (“pull” event type) until the event consumer explicitly picks them up.

Another emerging technology is the server component model, Enterprise JavaBeans (EJB) [EJBWeb]. The Gateway system has some similarities to EJB. Gateway is a generic middle tier consisting of a tree of two types of components: user modules and contexts (a context is a container that keeps other containers and user modules). Gateway, like EJB, supports many sessions or users simultaneously, and both Gateway and EJB (as of version 1.1) chose XML for their persistency model. Just as the user puts his own EJB objects into an EJB container, Gateway users put user modules into Gateway contexts. Both EJB and Gateway generate automatic code to perform serialization in XML. The EJB object and the general proxy intercept client requests in EJB and Gateway, respectively. Currently, unlike EJB, we do not have a transaction model. The EJB container and the Gateway context handle security in the EJB and Gateway systems, respectively. Gateway has a distributed event model, while the current EJB 1.1 specification does not provide one.

DISCWorld [DiscWHPCN97] develops a middleware infrastructure for distributed high-performance computing applications. The Distributed Information Systems Control World (DISCWorld) is a smart middleware system designed to integrate processing and storage resources across wide-area heterogeneous networks, exploiting broadband communications where available.

NetSolve [Netsolve97IJSP] is a client-server application that enables users to solve complex scientific problems remotely. This system allows users to access both hardware and software computational resources distributed across a network. NetSolve searches for computational resources on a network, chooses the best one available and, using “retry” for fault tolerance, solves a problem and returns the answers to the user. While NetSolve addresses relatively simple problems, Gateway solves both simple and complex problems that may need applications consisting of many user modules or other sub-applications.

Another related project, PAWS [PAWSWeb] (Parallel Application WorkSpace), provides a framework for coupling parallel applications. The coupled applications can be running on different machines, and the data structures in each coupled component can have different parallel distributions. PAWS can help us connect different Gateway applications. As stated earlier, Gateway can support running many applications simultaneously.

Other NPAC research has culminated in the Virtual Programming Laboratory (VPL) [VPL97ConcJ], a Web-based virtual programming environment based on a client-server architecture. It can be accessed from any platform (Unix, PC, or Mac) using a standard Java-enabled browser. Software delivery over the Web imposes a novel set of constraints on design; the VPL work outlines the tradeoffs in this design space, motivates the choices necessary to deliver an application, and details the lessons learned in the process. VPL facilitates the development and execution of parallel programs. The initial prototype supports high-level parallel programming based on Fortran 90 and High-Performance Fortran (HPF), as well as explicit low-level programming with the MPI message-passing interface. Supplementary Java-based platform-independent tools for data and performance visualization are integral parts of VPL. Pablo SDDF trace files generated by the Pablo performance instrumentation system are used for postmortem performance visualization.

VPL is excellent work that was based on emerging Web technologies a couple of years ago, and it is still a very good Web-based system that anybody can take as a starting point. We drew considerable motivation from VPL.

The Sun client-side Java component model, JavaBeans [JavaBeansWeb], also influenced our Gateway project. We adopted the JavaBeans model at the beginning, but later removed its introspection mechanism, which detects attributes from their related get/set methods and events from the corresponding add/removeXListener methods for an event X. We instead designed an introspection model based on an XML document that is generated automatically for user modules with the help of the IR (Interface Repository). We were also guided by the JavaBeans containment model (the JavaBeans Glasgow specification), but in Gateway we have the Gateway context, distributed across machines, as opposed to the JavaBeans context.
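
For reference, the JavaBeans introspection conventions mentioned here infer properties and events purely from method-naming patterns. The minimal bean below illustrates them; the bean class and its property are invented for illustration:

    import java.beans.PropertyChangeListener;
    import java.beans.PropertyChangeSupport;

    // Introspection infers a "temperature" property from the get/set pair
    // and a "propertyChange" event from the add/removeXListener pair.
    public class TemperatureBean {
        private int temperature;
        private final PropertyChangeSupport pcs = new PropertyChangeSupport(this);

        public int getTemperature() { return temperature; }   // property "temperature"
        public void setTemperature(int t) {
            int old = temperature;
            temperature = t;
            pcs.firePropertyChange("temperature", old, t);     // notify listeners
        }
        public void addPropertyChangeListener(PropertyChangeListener l) {
            pcs.addPropertyChangeListener(l);
        }
        public void removePropertyChangeListener(PropertyChangeListener l) {
            pcs.removePropertyChangeListener(l);
        }
    }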

Another related work is SCIRun [SciRun97Press], a scientific programming environment that permits interactive construction, debugging, and steering of large-scale scientific computations. SCIRun supports only data-flow models and allows scientists to modify simulation parameters interactively, while Gateway supports both event-driven and data-flow models. The computational-steering aspect of SCIRun covers only user-defined parameters, but the DARP functionality of Gateway gives more detailed steering over user-defined variables and all of the program variables defined in any program segment. SCIRun uses only local computing resources, while Gateway can also use distributed objects.

In Chapter 2 we explain what we mean by HPcc (High Performance Commodity Computing) and HPCM (High Performance Commodity Metacomputing). Chapter 3 discusses the TCP/IP client-server-based DARP (Data Analysis and Rapid Prototyping) architecture. Chapter 4 explains the architecture of WebFlow, the early stage of Gateway, and Chapter 5 discusses the new distributed-object-based Gateway middle tier that fully orchestrates a distributed meta-application consisting of various components. Chapter 6 describes the Gateway interfaces and services. Chapter 7 presents two applications of Gateway, Quantum Simulation (QS) and LMS, with which we evaluate our Gateway system. Chapter 8 provides conclusions derived from our experiments and an outline of our plans for future work.

CHAPTER 2 – High-Performance Commodity Computing

 

2.1 Introduction

We believe that industry and the loosely organized worldwide collection of commercial, academic, and freeware programmers are developing a remarkable new software environment of unprecedented quality and functionality. We believe this environment can benefit HPCC in several ways, by allowing the development of both more powerful parallel programming environments (DARP) and new distributed metacomputing systems (WebFlow and Gateway) [HpccGridBook, HPCCEuroPar98Gf]. We abstract these to a three-tier model with largely independent clients connected to a distributed network of servers that host various services, including object and relational databases and, of course, parallel and sequential computing. High performance can be obtained by combining concurrency at the middle server tier with optimized parallel back-end services. The resultant system, HPcc (High-Performance Commodity Computing), combines the needed performance of large-scale HPCC applications with the rich functionality of commodity systems. In the second section we define “commodity technologies” and explain the ways they can be used in HPCC. In the third section, we define an emerging HPcc architecture in terms of a conventional three-tier commercial computing model.

2.2 Commodity Technologies and Their Use in Multi-Tier Systems

The Web is not just a document-access system supported by the somewhat limited HTTP protocol. Rather, it is a distributed-object technology that can build general multi-tiered enterprise intranet and internet applications.

There are many driving forces and many aspects to HPcc architecture, but we suggest that the three critical technology areas are the Web, distributed objects, and databases. These are being linked, and we see them subsumed in the next generation of "object-web" technologies, illustrated by the recent Netscape and Microsoft browsers. Databases are older technologies, but their linkage to the web and distributed objects is transforming their use and making them more widely applicable.

In each commodity technology area we have impressive and rapidly improving software artifacts. As examples, we have at the lower level a collection of standards and tools such as HTML, HTTP, MIME, IIOP, CGI, Java, JavaScript, JavaBeans, CORBA, COM, ActiveX, VRML, dynamic Java servers, and clients that include applets and servlets. The new W3C base technologies include XML, DOM, and RDF. At a higher level, collaboration, security, electronic commerce, multimedia, and other applications/services are rapidly developing using standard interfaces or frameworks and facilities. This emphasizes that we have a set of open interfaces enabling distributed modular software development. These interfaces exist at both low and high levels, and the high-level ones generate a very powerful software environment in which large preexisting components can be quickly integrated into new applications. In our work, we designed and built an integrated environment called Gateway that facilitates the construction of large applications from plugged-in user modules.

We believe that there are significant incentives for building HPCC environments in a way that naturally inherits all the commodity capabilities, so that HPCC applications can benefit from the impressive productivity of commodity systems. NPAC's HPcc activity is designed to demonstrate that this is possible and useful, so that one can simultaneously achieve both high performance and the functionality of commodity systems. We demonstrated this with several applications: seamless access to HPCC resources, the Land Management System (LMS), and Quantum Simulation (QS), which will be explained in Chapter 7.

Note that commodity technologies can be used in several ways. XML and Web servers will help with the customization and installation of distributed objects and servers across the Internet. Enterprise JavaBeans, RMI, COM, and CORBA will accelerate the usage of distributed-object technology.

However, our main target is not such pointed solutions, but rather adapting the architecture of commodity systems for high-performance parallel and distributed computing. Even though we have seen many other major broad-based hardware and software developments over the last 30 years, they have not had a profound impact on HPCC software. Based on our many experiments, we strongly believe that the HPcc architecture gives us a worldwide/enterprise-wide distributed computing environment. Previous software revolutions could help individual components of an HPCC software system, but the HPcc architecture is the backbone of a complete HPCC software system. To achieve our goal, we added high performance to this architecture and in this way inherited a multi-billion-dollar investment and what is, in many respects, the most powerful and productive software environment ever built.

2.3 Three Tier High Performance Commodity Computing

We start with a common modern-industry view of commodity computing with the three tiers shown in Figure 2.1. Here we have customizable client and middle-tier systems accessing "traditional" back-end services such as a relational database. A set of standard interfaces allows a rich set of custom applications to be built with appropriate client and middleware software. As indicated in Figure 2.1, the client sitting in the first tier can access the middle tier via a URL, a COM/CORBA request, a low-level socket, an XML-based protocol, or a CGI program. The middle tier can be a simple server waiting on a port, COM/CORBA server(s), or a group of CGI programs, usually written in PERL or other scripting languages. Access to a database at the back end is achieved through a JDBC interface. The database keeps the attribute values of application server objects as well as other related information.

Figure 2.1. Industry Three-Tiered View of Enterprise Computing
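
To make the back-end access path concrete, here is a minimal JDBC sketch of the kind of call implied by Figure 2.1; the connection URL, table, and column names are invented placeholders, not the actual schema used by our servers:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class AttributeStore {
        // Reads one stored attribute value of an application server object.
        // The JDBC URL, credentials, and table/column names are hypothetical.
        public static String readAttribute(String module, String attr) throws Exception {
            try (Connection con = DriverManager.getConnection(
                     "jdbc:mydriver://dbhost/gatewaydb", "user", "password");
                 PreparedStatement ps = con.prepareStatement(
                     "SELECT value FROM attributes WHERE module = ? AND name = ?")) {
                ps.setString(1, module);
                ps.setString(2, attr);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString("value") : null;
                }
            }
        }
    }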

The rapidly evolving commercial architecture is exploring several co-existing approaches in today's distributed information systems. The most powerful solutions involve distributed objects. There are three important commercial object systems: CORBA, COM/DCOM, and Enterprise JavaBeans. We envision the possibility of coming up with an enterprise commodity architecture that allows the use of many different technologies at the same time. Actually, we realized this with our sophisticated and extendible Gateway component model, which will be explained in Chapter 5.

Enterprise JavaBeans is a "pure Java" solution: cross-platform, but, unlike CORBA, not cross-language. Legion is an example of a major HPCC-focused distributed-object approach; currently, it is not built on top of one of the three major commercial standards. The HLA/RTI standard for distributed simulations in the forces-modeling community is another important domain-specific distributed-object system. It appears to be moving toward integration with CORBA standards.

Although a distributed-object approach is attractive, most network services today are provided in a more ad hoc fashion. Originally, these services were client-server, with proprietary access protocols built directly on TCP/IP. Later, web-linked databases and enterprise intranets naturally produced a three-tier distributed service model, with an HTTP server using a CGI program to access the database at the back end. There is a trend toward using Web servers with the servlet mechanism for these services, and today we can build databases as distributed objects, with a middle-tier Enterprise JavaBean or CORBA object using JDBC to access the back-end database. We can summarize this evolution as follows:

client access method          middle-tier object or executable

Java sockets              -->  Java program                  (low-level network standard)

HTTP                      -->  Java servlet or CGI script

IIOP, RMI, or COM         -->  distributed objects           (high-level network standard)
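
For example, the middle row of this progression corresponds to a servlet of roughly the following shape; this is a generic sketch against the standard javax.servlet API, not code from our system:

    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // A middle-tier servlet: it receives an HTTP request from the client tier
    // and could delegate to a back-end service before answering.
    public class StatusServlet extends HttpServlet {
        public void doGet(HttpServletRequest req, HttpServletResponse res)
                throws ServletException, IOException {
            res.setContentType("text/html");
            PrintWriter out = res.getWriter();
            out.println("<html><body>Service is up.</body></html>");
        }
    }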

As shown in Figure 2.2, we see today a "Pragmatic Object Web" mixture of distributed-service and distributed-object architectures. CORBA, COM, JavaBeans, HTTP servers with CGI, Java servers with servlets, databases with specialized network accesses, and other services co-exist in this heterogeneous environment with common themes but disparate implementations. NPAC's HPcc strategy involves building this architecture (Figure 2.2) and adding high performance in the third tier of this system.

Figure 2.2. Today's Heterogeneous Interoperating Hybrid Server Architecture

 

We also believe that the resultant architecture will be integrated with the web, so that the latter will exhibit a distributed-object architecture. More generally, the emergence of IIOP (Internet Inter-ORB Protocol) and the realization that CORBA is naturally synergistic with Java are starting a new wave of "Object Web" developments that could have profound importance. The resultant architecture puts a small object broker (a so-called ORBlet) in each browser, as in Netscape. Most of our remarks are valid for all these approaches to a distributed set of services.

We used this service/object-evolving three-tier commodity architecture as the basis of our HPcc environment. We need to naturally incorporate (essentially) all services of the commodity web and use its protocols and standards wherever possible. We insist on adopting the architecture of commodity distributed systems, because complex HPCC problems require the rich range of services offered by the broader community systems. Porting commodity services to a custom HPCC system requires continued upkeep with each new upgrade of the commodity service. By adopting the architecture of commodity systems, we make it easier to track their rapid evolution, and we expect this to give high functionality to HPCC systems, which will naturally track the evolving Web/distributed-object worlds. This requires us to enhance certain services to obtain higher performance and to incorporate new capabilities, such as high-end visualization (e.g., CAVEs) or massively parallel systems, where needed. This is the essential research challenge for HPcc, for we must not only enhance performance where needed but do so in a way that is preserved as we evolve the basic commodity systems. We have demonstrated clearly with our QS, LMS, and Seamless Access applications that this is possible through deploying our distributed-object-based Gateway middle tier. In order to achieve this, we exploit the three-tier structure and keep HPCC enhancements in the third tier, which is inevitably the home of specialized services in an object-web architecture. This strategy isolates HPCC issues from the control and interface issues in the middle layer. We have successfully built an HPcc environment that offers the evolving functionality of commodity systems without significant re-engineering as advances in hardware and software lead to new and better commodity products.

Returning to Figure 2.2, we see that it elaborates Figure 2.1 in two natural ways. First, the middle tier is promoted to a distributed network of servers; in the "purest" model these are CORBA/COM/Enterprise JavaBeans object-web servers, but any protocol-compatible server is obviously possible. This middle-tier layer includes not only networked servers with many different capabilities (increasing functionality), but also multiple server instantiations to increase performance of a given service. The use of high-functionality but modest-performance communication protocols and interfaces at the middle tier limits the performance levels that can be reached in this fashion. However, this first step gives a modest performance scaling and a parallel HPcc system (implemented, if necessary, in terms of multiple servers) that includes all commodity services such as databases, object services, transaction processing, and collaboration. The next step is applied only to those services with insufficient performance. Naively, we "just" replace an existing back-end (third-tier) implementation of a commodity service by its natural HPCC high-performance version: specifically, we used Globus, MPI, and HPF.

 

2.4 Comparison of Three-Tiered and Two-Tiered (Client-Server) Architecture Models

 

Even though the programming effort for a client-server system is much smaller, it has many disadvantages compared with three-tier systems. For applications to which thousands of clients connect, a two-tier system does not scale well and may not benefit from multithreading. A three-tier system solves the scalability and multithreading problems by pushing them down into the middle tier. In two-tier systems, client code is mixed with user-interface code and some of the actual server code, so the client becomes fat. Three-tier systems have thin client code and put all of the business methods of the server into the third tier. The coordination code for these methods and the code for allocating resources go into the middle tier. Likewise, security and transaction policies are inserted into the middle tier in three-tier systems, as opposed to the server code in two-tier environments. Finally, the Java security sandbox does not allow an applet to connect to any host other than the applet host; introducing a middle tier solves this problem in three-tier systems.
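
As a minimal illustration of the last point, a relay of roughly the following shape can run on the applet host (the middle tier) and forward requests to a back-end machine that the applet itself could not reach; the host names, port numbers, and one-line request protocol are all hypothetical:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;

    // An unsigned applet may open sockets only to the host it was downloaded
    // from, so this relay on that host forwards each request line to a
    // back-end host on the applet's behalf and returns the reply.
    public class RelayServer {
        public static void main(String[] args) throws IOException {
            try (ServerSocket server = new ServerSocket(9000)) {
                while (true) {
                    try (Socket client = server.accept();
                         Socket backend = new Socket("backend.host.edu", 9001);
                         BufferedReader in = new BufferedReader(
                             new InputStreamReader(client.getInputStream()));
                         PrintWriter toBackend = new PrintWriter(backend.getOutputStream(), true);
                         BufferedReader fromBackend = new BufferedReader(
                             new InputStreamReader(backend.getInputStream()));
                         PrintWriter toClient = new PrintWriter(client.getOutputStream(), true)) {
                        String request = in.readLine();          // one request per connection
                        toBackend.println(request);              // forward to the back end
                        toClient.println(fromBackend.readLine()); // return the reply
                    }
                }
            }
        }
    }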


 

CHAPTER 3 – DARP SYSTEM (DATA ANALYSIS AND RAPID PROTOTYPING)

 

3.1 Introduction

 

The development of large distributed/parallel applications is a complex process. We face the problem of software integration, as different software components often follow different parallel programming paradigms. At the same time, we witness the rapid progress of Web-based technologies that are inherently distributed, heterogeneous, and platform-independent. Of particular interest are the definition and standardization of interfaces that enable cross-platform software interoperability.

The integration of compiled and interpreted HPF gives us an opportunity to design a powerful application-development environment targeted at high-performance parallel and distributed systems. This DARP environment includes a source-level debugger, data visualization and data analysis packages, and an HPF interpreter. The capability of alternating between compiled and interpreted modes provides the means for interacting with the code in real time while preserving acceptable performance. Thanks to the interpreted interfaces typical of Web technologies, we can use our system as a software integration tool.

The fundamental feature of our system is that the user can interrupt execution of the compiled code at any point and get interactive access to the data. For visualizations, the execution is resumed as soon as the data transfer is completed. For data analysis, the interrupted code pauses and waits for the user's commands. The set of available commands closely reproduces the functionality of a typical debugger (setting breakpoints, displaying or modifying values of variables, etc.). However, a unique feature of our system is that it can issue HPF commands to modify values of distributed arrays. In this sense, our system can be thought of as an HPF interpreter. For more complex data transformations, the user can dynamically link precompiled functions written in HPF or other languages, which enables rapid prototyping. In particular, parallel libraries that do not necessarily follow the HPF computational model can in this way be integrated dynamically with the HPF code through the HPF extrinsic interface.

Implementing proxy libraries in Java further increases the functionality of our system, allowing us to design and develop the DARP system as a three-tiered system rather than a traditional client-server one. We can now treat components of the DARP system as distributed objects to be implemented as CORBA ORBlets or JavaBeans. We use this mechanism for the dynamic embedding of calls to a visualization system (such as SciViz [Sciviz98ACMJava]) or for coupling this system with WebFlow [WebFlow97Furm].

This chapter is organized as follows. In Section 3.2 we discuss the overall architecture of the system in the context of the High-Performance Commodity Computing paradigm. Sections 3.3 through 3.6 describe the three-tiered design of the DARP system and its components: the tier-2 DARP server, the instrumentation of the code, the DARP front end, and the HPF interpreter, respectively. Sections 3.7 and 3.8 discuss runtime visualizations and demonstrate the integration of the DARP system with a visualization package, using a proxy library. Finally, in Section 3.9 we give our summary and conclusions.

3.2 High-Level Architecture of DARP

The design of the DARP system follows the idea of High-Performance Commodity Computing (HPcc). Conceptually, the architecture of this three-tiered system can be described as follows (cf. Figure 3.1): the DARP system uses an interpreted Web client interacting dynamically with a compiled code. At this time the system uses an HPF back end, but the architecture is independent of the back-end language. The Java or JavaScript front end holds proxy objects produced by an HPF front end operating on the back-end code. These proxy objects can be manipulated with interpreted Java commands to request additional processing, visualization, and other interactive computational steering and analysis.

Figure 3.1. DARP implementation within HPcc framework

 

3.3 DARP Server: Interactive control over an application

As shown in Figure 3.2, the heart of the DARP system is the DARP server, which controls the execution of the application. The server accepts commands from a client implemented as a Java applet. Control over the execution of an application and interactive accesses to the data are achieved by a source-level instrumentation of the code.

Figure 3.2. The architecture of DARP

Since HPF follows a data-parallel paradigm with a single global name space, the DARP server can be implemented simply as an extrinsic HPF LOCAL procedure, in which case the server is part of the application and comes into existence only after the application is launched. In this scenario the application code is instrumented in such a way that the initialization of the DARP server is the first executable statement of the application. Once initialized, the server blocks the application while waiting for the client to connect. From that point on, the execution is controlled by the client. Optionally, the initialization of the server may include processing a script that sets action points and breakpoints and forces the execution to resume without waiting for the user's commands.

In a general SPMD paradigm this simplistic implementation of the DARP server is not sufficient, because the client loses control over the application when the code on a single node dies. We therefore extended the server architecture. Now (Figure 3.3) a manager, which is independent of the application, accepts requests from the client and multicasts them to all nodes participating in the computation. The DARP server is part of the instrumented HPF application and is replicated over the participating nodes. The client communicates only with the DARP manager on a selected node (Figure 3.3). The DARP manager is a separate server that receives requests from the front-end user applet and from the instrumentation server. So that the server can pick up the user's requests even while the application is running, it checks for pending requests at the beginning of each server call; recall that a call to the server is inserted, through instrumentation, before each executable program statement. This mechanism gives the user full control over program execution: the user can monitor successive updates of any program variable and can suspend, continue, or stop the execution dynamically.

Figure 3.3.  Middle-tier DARP manager controls HPF back end and communicates with other servers
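
To illustrate the manager's relay role, a minimal Java sketch of command multicasting follows; the class name, socket transport, and line-based protocol are assumptions made for this example rather than the actual DARP implementation.

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.net.Socket;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical manager-style relay: one socket per node-resident DARP
    // server; each client command is forwarded to all of them.
    public class DarpManagerSketch {
        private final List<Socket> nodeServers = new ArrayList<Socket>();

        public void addNode(String host, int port) throws IOException {
            nodeServers.add(new Socket(host, port)); // one DARP server per node
        }

        // Forward a single text command (e.g., "PAUSE") to every node.
        public void multicast(String command) throws IOException {
            for (Socket node : nodeServers) {
                PrintWriter out = new PrintWriter(node.getOutputStream(), true);
                out.println(command); // autoflush pushes the command out
            }
        }
    }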

 

The interprocessor communication required by the distributed application is not implemented using Web-based protocols (such as CORBA IIOP), as is the case for client-manager interactions. Instead, we use the native HPF runtime support or MPI directly. For meta-computations, which in our approach are controlled by a network of managers, we are considering replacing low-level MPI with Nexus [NexusJPDC97] and other services provided by Globus as the high-performance communication layer.

 

3.4 Instrumentation of the code

The instrumentation of the code involves three steps:

  1.  Adding server functions

  2.  Insertion of calls to the HPF server before each HPF statement

  3.  Identification of the types of all variables used in the application

The process is fully automated and requires no user intervention. The instrumentation is performed by a preprocessor that transforms valid HPF source code into instrumented code. The instrumented code is itself valid HPF, to be compiled by a generic HPF compiler.

We built the preprocessor using the HPF Front End (HPFfe) [HPFfeWeb] developed at NPAC within the PCRC consortium [PCRCWeb]. HPFfe is based on the SAGE++ system [Sage++94Gannon], which, in addition to parsing, provides the means to access and modify the abstract syntax tree (AST), the symbol table, the type tables, and source-level program annotations. For our purposes we developed functions that identify the attributes of all variables used in the HPF application (including data types and runtime memory addresses) and that operate on the AST to insert variable "registration" calls (which allow the server to determine the size and location of the data to be sent) as well as calls to the server.

Since HPF is a superset of Fortran 90, we can apply our preprocessor to any sequential Fortran code and, in particular, to the node code of any parallel application developed in Fortran that uses explicit calls to a message-passing library such as MPI or PVM. The capability of processing HPF compiler directives enhances our system in that we can preserve information on the intended data distributions and assertions on the (lack of) data dependencies.

3.5 DARP Front End

The DARP front end is implemented as a Java applet (Figure 3.5). The user interacts with the code through an interface that closely resembles the interface provided by a typical debugger. The repertoire of commands includes setting breakpoints and action points, stopping and resuming execution of the application (including stepping one instruction at a time or one loop iteration at a time), changing values of the application's variables, requesting data (including distributed arrays), dynamically linking and executing shared objects (including codes generated on the fly by the interpreter), and more. Our prototype implementation supports three types of commands: data access commands (Table 1), prototyping commands (Table 2), and control commands (Table 3). Unless otherwise stated in the fourth column of these tables, the front end submits the chosen processor ID (MIMD case) or no ID (SIMD case), the specific command tag (in the first column), and the input parameters to the DARP manager, which delegates the request to the processor specified in the request. In the opposite direction, after the processor has performed the user's request, it sends the result (with the ACKCOMMAND command tag) back to the manager, which forwards the result to the applet. Looking carefully at the return values of the user requests (in the third column of the tables), one can see that all of the information sent by the user is echoed back. The reason for this symmetry is that our DARP system allows many users to connect to one running program: the manager keeps track of the currently connected users and, whenever it receives a command result from the HPF server, multicasts it to all bound users.
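
As a concrete illustration of this request/reply cycle, the sketch below sends a DISPLAYVARIABLE request for processor 0 and reads the ACKCOMMAND reply. The host name, port, and line-based wire format are illustrative assumptions, not the actual DARP protocol.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.Socket;

    // Hypothetical front-end request: ask the DARP manager for the value of
    // a variable on processor 0 and print the ACKCOMMAND reply (cf. Table 1).
    public class DisplayVariableClient {
        public static void main(String[] args) throws IOException {
            try (Socket manager = new Socket("darp-manager.example.org", 7777)) {
                PrintWriter out = new PrintWriter(manager.getOutputStream(), true);
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(manager.getInputStream()));

                out.println("0 DISPLAYVARIABLE temperature"); // PID, tag, parameter

                // The reply echoes the command and carries LN, PID, an error
                // code, DATADESC, and DATAVALUE.
                System.out.println("manager replied: " + in.readLine());
            }
        }
    }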

Note that since the code is instrumented at the source level, our "debugger" gives access only to source-level data. In particular, we are unable to provide the complete state of the machine (registers, buffers, etc.) at any given time, as many commercial debuggers do and as is recommended by the High-Performance Debugging Forum [HPDForumWeb]. Also, since at this time we exclusively address applications in HPF, we ignore several features necessary for supporting a more general SPMD paradigm. In particular, we assume that interprocessor communications are facilitated by a bug-free HPF runtime system. However, the more advanced implementation of the DARP system with the independent DARP manager (cf. Section 3.3) makes it possible to control applications that use explicit message passing. In any case, the DARP system is not designed to be a system-level debugger; rather, it is meant to perform such actions as manipulating large distributed data objects in order to investigate the convergence and stability of algorithms used in scientific simulations.

Typically, a client-server architecture is used to implement a portable debugger for distributed systems (cf. [DAQV96Sth, FalconWeb, Tuchman91Vis, CumulvsWeb, Panorama93MF, CH94p2d2]). Our approach is unique in that we use a three-tiered architecture and can easily integrate our source-level debugger with the HPF interpreter and a visualization tool, which together comprise a powerful application development environment. Moreover, we envision that the HPCC community will produce many different HPCC codes that must be used together to solve highly complex, computation-intensive applications. Our DARP manager helps to abstract a parallel application running on multiple nodes into a single application, thus allowing the user to manipulate the application through the manager, which can talk to other managers and servers.

Command         | Input Parameters                        | Return Value(s)                                            | Description
----------------|-----------------------------------------|------------------------------------------------------------|------------------------------------------------------------
DISPLAYVARIABLE | Variable name                           | ACKCOMMAND: received command + current line number (LN) + processor PID + error code + DATADESC + DATAVALUE | Get the value of a program variable: DATAVALUE is the value, DATADESC its data description.
DISARRAYSECTION | Variable name + array-section statement | Same as above                                              | Get the array section specified by a Fortran 90 array-section expression.
SETVARIABLE     | Variable name + DATAVALUE               | Same as above except the last two items                    | Set the value of the variable to DATAVALUE.
SETPARVAR       | Variable name + DATAVALUE               | Same as above except the last two items                    | Set the data-parallel (SIMD) array to DATAVALUE.

Table 1. Data access commands of the HPF server

Command         | Input Parameters                                   | Return Value(s)                                            | Description
----------------|----------------------------------------------------|------------------------------------------------------------|------------------------------------------------------------
INTERPRETER     | A legal HPF statement to be interpreted            | None                                                       | On receipt, sends the HPF statement to the instrumentation server.
INST_CODE       | Function created from the user's interpreted statements, achieving their combined effect | ACKCOMMAND: received command + current line number (LN) + processor PID + error code + interpreted HPF statements | Received from the instrumentation server, which builds a function out of the user's statements.
SCIVISENV       | Host and port of the SciVis server                 | Same as INST_CODE except the last item                     | Set the SciVis host and port.
SCIVIS_FUNLIST  | None                                               | Same as SCIVISENV plus SciVis function names               | Send the SciVis interface function names.
SEND_LOCALS     | None                                               | Same as SCIVISENV plus local variable names                | Send the local variable names at the current execution point.
ADD_ACTION      | SciVis function name + LN                          | Same as SCIVISENV                                          | Register a SciVis function call for line LN.
DELETE_ACTION   | Same as above                                      | Same as SCIVISENV                                          | Deregister the function call.
ACTION_LIST     | None                                               | Same as SCIVISENV plus line numbers of action points       | Return the line numbers of all action points.
GET_FUNCTION    | Line number (LN) and function name                 | Same as SCIVISENV plus the arguments of the function at LN | Return the argument names of that function.
STORE_CONFIG    | File name                                          | Same as SCIVISENV                                          | Store action points, breakpoints, and other parameters in the specified file.
GET_PROFILE     | None                                               | Same as SCIVISENV plus profile information for each line   | Return the number of executions and the total time for each line.
SEND_CONFIGS    | None                                               | Same as SCIVISENV plus config files                        | Return the names of the stored config files.
READ_PARSE_TREE | None                                               | Same as SCIVISENV plus parse-tree information              | Return the parse-tree information for all project files.

Table 2. Prototyping commands of the HPF server

Command     | Input Parameters                 | Return Value(s)                                            | Description
------------|----------------------------------|------------------------------------------------------------|------------------------------------------------------------
PAUSE       | None                             | ACKCOMMAND: received command + current line number (LN) + processor PID + error code | Pause the execution at the current program line, LN.
PUTBREAKON  | LN                               | Same as above                                              | Set a breakpoint at line LN.
PUTBREAKOFF | LN                               | Same as above                                              | Remove the breakpoint at line LN.
STEPINTO    | None                             | Same as above                                              | Go to the next line, entering function calls if any.
STEPOVER    | None                             | Same as above                                              | Go to the next line, stepping over function calls.
NSTEPINTO   | N, the number of steps           | Same as above                                              | Perform N STEPINTO commands at once.
NSTEPOVER   | N, the number of steps           | Same as above                                              | Perform N STEPOVER commands at once.
NITERATION  | N, the number of loop iterations | Same as above                                              | If execution is inside a loop, perform N iterations.
CONTINUE    | None                             | Same as above                                              | Continue execution until a breakpoint is reached or the program is stopped.
STACKDOWN   | None                             | Same as above                                              | Go one level down in the stack frame.
STACKUP     | None                             | Same as above                                              | Go one level up in the stack frame.
STOP        | None                             | Same as above                                              | Stop the program execution.

Table 3. Control commands of the HPF server

3.6 Integrated Environment for HPF Compiler and Interpreter

 

The architecture of this system allows for real-time interaction with an executing HPF code. At each synchronization point (when the DARP server is accepting requests), the data can be extracted and processed as if an explicit call to an HPF extrinsic procedure were made. HPF statements, in particular, can be executed in such an interactive fashion.

Figure 3.4. HPF interpreter

In this way the system achieves the functionality of an HPF interpreter. The interaction between the running application and the user's commands is based on dynamic linking of UNIX shared objects with the application. Any precompiled stand-alone or library routine with a conforming interface can be called interactively at a breakpoint or at selected action points. In order to execute new code entered in the form of HPF source, the code must first be converted into a shared object. To do this, the user submits any legal sequence of HPF statements to the manager, which forwards the request to the specified processor(s). The processor(s) send the statements to the instrumentation server, which creates a legal HPF subroutine using the HPFfe system and submits the code back to the processor, where it is compiled with an HPF compiler (Figure 3.4). Since any "interpreted" code is in fact compiled, the efficiency of the resulting code is as good as that of the application itself. Nevertheless, the time needed to create the shared object is prohibitively long for running complete applications, statement after statement, in the interpreted mode. On the other hand, the capability to manipulate and visualize data at any time during the execution of the application, without recompiling and rerunning the whole application, proves to be very time effective.
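
The compile-and-link cycle could be sketched as follows; the compiler name (hpfc), its flags, and the wrapper subroutine are placeholders for whatever the instrumentation server actually generates, not a description of the real tool chain.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    // Sketch of the "interpreter" round trip: wrap the user's HPF statements
    // in a subroutine, compile them into a UNIX shared object, and return its
    // path so the running application can link it dynamically.
    public class HpfSnippetCompiler {
        public static Path compile(String hpfStatements)
                throws IOException, InterruptedException {
            Path src = Files.createTempFile("darp_snippet", ".f90");
            Files.writeString(src,
                    "      SUBROUTINE darp_snippet()\n"
                  + hpfStatements + "\n"
                  + "      END SUBROUTINE darp_snippet\n");

            Path lib = src.resolveSibling("darp_snippet.so");
            Process cc = new ProcessBuilder("hpfc", "-shared", "-fPIC",
                    "-o", lib.toString(), src.toString()).inheritIO().start();
            if (cc.waitFor() != 0)
                throw new IOException("HPF compilation failed");
            return lib; // ready to be loaded into the running application
        }
    }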

3.7 Runtime Visualizations

For visualizations we used the Scientific Visualization System, SciVis, a portable system developed at NPAC entirely in Java. With a very rich and user-extensible set of data filters, and full support for collaborative use, it is a very powerful tool for rapid data analysis. From the user's point of view, it consists of a stand-alone server that is typically run on the user's workstation and a client that supplies the data. In Figure 3.5, the upper left panel shows the front-end applet with a fragment of an HPF code; the action points at which the SciVis proxy library is called are highlighted, and a triangle on the left points to the current line.

 

    Figure 3.5. A screen dump of a DARP session

 

The architecture of the SciVis system makes it particularly attractive for integration with the DARP system. The SciVis client API allows us to design a proxy library in Java with a simple and very intuitive interface. The library, on behalf of the user, automatically creates a SciVis client routine that corresponds to the data type requested by the user. The client is then dynamically linked with the running application and executed at a specified action point; this sends the data to the SciVis server, which in turn displays it on the user's workstation screen.
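
A proxy library of this kind might expose an interface along the following lines; the names are hypothetical and merely suggest the simple, intuitive style described above.

    // Hypothetical proxy-library interface; the real SciVis client API is
    // richer, and these names are illustrative only.
    public interface SciVisProxy {
        void connect(String host, int port);         // the user's SciVis server
        void plot1D(String name, double[] values);   // send a 1-D data set
        void plot2D(String name, double[][] values); // send a 2-D (distributed) array
    }

An action point at a given line then reduces, on the node that owns the data, to a single call such as proxy.plot1D("residual", residual).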

The same mechanism can be used with dedicated proxy libraries to integrate the DARP system with other software packages such as computational libraries, data storage systems, or other visualization systems. By using proxy libraries the DARP system may request or provide services from other tier-2 servers, or become a module in data-flow type computations [WebFlowDARP].

3.8 Adding a different visualization engine to the DARP system

Bringing a different visualization engine to DARP is nothing more than adding a new "Visualization Manager" instance with a special configuration for the new engine (Figure 3.6). The Visualization Manager is a generic module that can be configured for a visualization engine that may accept only particular data formats and may require specific environment variables. The HPF server sends a visualize command, which includes the data along with its descriptive information, to the DARP manager, which forwards the data to the Visualization Manager. That manager converts the received data into the engine's accepted format, according to the configuration supplied by the user at startup time, and sends the converted data to the visualization engine. With this method, the HPF server does not need to deal with any engine-specific operation; it simply sends the data with its definition to its DARP manager.
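
A hedged sketch of such a generic manager follows: a converter function, fixed when the manager is configured at startup, maps the HPF server's data into the engine's accepted format. All class names here are illustrative.

    import java.util.function.Function;

    // Illustrative Visualization Manager: the converter is chosen per engine
    // at startup; visualize() is invoked when the DARP manager forwards data.
    public class VisualizationManagerSketch {
        public interface VisualizationEngine {
            void display(String description, byte[] dataInEngineFormat);
        }

        private final Function<byte[], byte[]> converter; // per-engine conversion
        private final VisualizationEngine engine;

        public VisualizationManagerSketch(Function<byte[], byte[]> converter,
                                          VisualizationEngine engine) {
            this.converter = converter;
            this.engine = engine;
        }

        public void visualize(String dataDescription, byte[] rawData) {
            engine.display(dataDescription, converter.apply(rawData));
        }
    }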

 

Figure 3.6. Adding a new visualization server to the DARP system

 

3.9 Summary

By reusing commodity components and technologies we have built a powerful tool for data analysis and rapid prototyping to be used by an HPF application developer. The most important feature of the system is interactive access to distributed data, which makes it possible to select and send data to a visualization system at an arbitrary point of the application's execution. The data can be modified using either native HPF commands or dynamically linked computational modules.

Consistent with our HPcc strategy, the system implements a three-tiered architecture: the Java front end holds proxy objects produced by an HPF front end operating on the back-end code. These proxy objects can be manipulated by an interpreted Web client interacting dynamically with compiled code through a middle tier (middleware). We successfully ran DARP as a WebFlow module and demonstrated it at SC'97. We later fully integrated DARP functionality into WebFlow, as explained in Chapter 5.

Although targeted for the HPF back end, the system's architecture is independent of the back-end language and can be extended to support other high-performance languages such as HPC++[HPC++Web] or HPJava[HPJavaACM98]. Finally, since we follow a distributed objects approach, the DARP system can be easily incorporated into a collaborative environment such as Tango [TANGOWeb] or Habanero [HabaneroWeb].


CHAPTER 4 – EARLY STAGE OF GATEWAY: WEBFLOW ARCHITECTURE

 

4.1 Introduction

 

Programming tools that are simultaneously sustainable, highly functional, robust, and easy to use have been hard to come by in the HPCC arena. We have therefore developed a new strategy, explained in Chapter 2, called HPcc: High Performance Commodity Computing [HPccGridBook]. It builds HPCC programming tools on top of the remarkable new software infrastructure being built for the commercial web and distributed-object areas. This leverage of a huge industry investment naturally delivers tools with the desired properties, with the one (albeit critical) exception that high performance is not guaranteed. Our approach automatically gives the user access to the full range of commercial capabilities (e.g., databases and compute servers), pervasive access from all platforms, and natural incremental enhancement as the industry software juggernaut continues to deliver software systems of rapidly increasing power. We add high performance to commodity systems using a multi-tiered architecture with the Globus [GlobusIntlJ97] metacomputing toolkit as the back end of a middle tier of commodity web and object servers.

This approach was not possible a few years ago, when enterprise computing was still mainly client-server (two-tiered) and based on expensive custom solutions such as proprietary TP monitors. The onset of the Web and the associated intranets accelerated the development of scalable and open three-tiered standards such as CORBA [OMGWeb], DCOM [COMWeb], and Enterprise JavaBeans (EJB) [EJBWeb]. We can therefore now prototype a dedicated, advanced, yet commercial-quality HPDC system for HPCC applications by integrating open commodity standards for distributed enterprise computing with traditional (such as MPI or HPF) and emerging (Globus) HPCC infrastructures that are optimized for performance.

 

Figure 4.1. Top-level view of the WebFlow environment

Our research addresses the need for high-level programming environments and tools to support distance computing on heterogeneous distributed-commodity platforms and high-speed networks that span across laboratories and facilities. More specifically, we are developing WebFlow--a scalable, high-level, commodity-standards-based HPDC system (Figure 4.1) that integrates the following:

·        High-level front ends for visual programming, steering, run-time data analysis and visualization, and collaboration built on top of the Web and object-oriented commodity standards (Tier 1).

·        Distributed, object-based, scalable, and reusable Web server and object broker middleware (Tier 2).

·        High-performance back end implemented using the metacomputing toolkit of GLOBUS (Tier 3).

Note that this can be applied to either parallel or metacomputing applications and provides a uniform cross-platform, high-level computing environment. We believe that such an ambitious and generic framework as WebFlow can be successfully built only when closely related to some specific large-scale application domains that can provide specification requirements, testbeds, and user feedback during the entire course of the system design, prototyping, development, testing, and deployment. We view the NCSA Alliance and DoD modernization programs as attractive application environments for HPcc because of their clear missions and their advanced computational challenges, opportunities, and requirements.

WebFlow is a specific programming paradigm implemented over a virtual Web-accessible metacomputer and provided by a data-flow programming model (other models under experimentation include data-parallel, collaborative, and televirtual paradigms). A WebFlow application is given by a computational graph visually edited by end users using Java applets. Modules are written by module developers, people who have only limited knowledge of the system on which the modules will run. They need not concern themselves with issues such as allocating and running the modules on various machines, creating connections among the modules, sending and receiving data across these connections, or running several modules concurrently on one machine. The WebFlow system hides these management and coordination functions from the developers, allowing them to concentrate on the modules being developed.

4.2 WebFlow Overview

Another NPAC research group [WebFlow97Furm] originally developed WebFlow, and we extended it to experiment with high-performance simulation codes. The visual HPDC framework introduced by the WebFlow project offers an intuitive Web browser-based interface and a uniform point of interactive control for a variety of computational modules and applications running at various labs on different platforms and networks. New applications can be composed dynamically from reusable components just by clicking on visual module icons, dragging them into the active WebFlow editor area, and linking them by drawing the required connection lines. The modules are executed using Globus [GlobusWeb] optimized components, combined with pervasive commodity services where native high-performance versions are not available. For instance, today one links Globus-controlled MPI programs with WebFlow (Java-connected) Windows NT and database executables. Once Globus is extended to full PC support, the default WebFlow implementation can be replaced by the high-performance code.

Individual modules are typically represented by visualization, control, steering, or collaboration applets, and the system also offers visual monitoring, debugging, and administration of all of the distributed applications and the underlying metacomputing resources. In the following chapter we introduce the more sophisticated, distributed-object-based Gateway architecture, which offers tools for easy conversion of existing (sequential, parallel, or distributed) applications to visual modules via suitable CORBA, COM, or JavaBeans-based [JavaBeansWeb] wrapper/proxy techniques.

New applications created within the WebFlow framework follow a natural modular design that accumulates, in the first phase of a project, a comprehensive, problem-domain-specific library of modules. It is then possible to explore the computational challenges of the project in a visual interactive mode, in an effort to compose the optimal solution of a problem in a sequence of on-the-fly trial applications. The scripting capabilities of WebFlow, coupled with database support for session journalizing, facilitate playback and the reconstruction of optimal designs discovered during such rapid prototyping sessions.

For parallel object and module developers, we will also provide finer-grained visual and scripted parallel software development tools using the Unified Modeling Language (UML) [UMLWeb], recently accepted as an OMG standard. UML offers a spectrum of diagrammatic techniques that allow us to address various stages of the software process and several hierarchy layers of a complex software system. In this way WebFlow will combine the features of UML-based visual tools such as Rational Rose with both high performance and the proven value of data-flow-based visual programming environments such as Khoros and AVS.

Nothing prohibits the user from encapsulating a data-parallel application as a single WebFlow module. In this case the user is solely responsible for interprocessor communications (we used HPF- and MPI-based codes to run WebFlow modules on a multiprocessor system [DARPSC97]). Moreover, using the DARP system [DARP98Conc] implemented as a WebFlow module, we were able to interactively control an HPF application at runtime and dynamically extract the distributed data and send it to a visualization engine. This approach can be used for computational steering, runtime data analysis, debugging, and interprocessor communications on demand. Finally, we integrated two independently written applications that write checkpoint data [SC98Pres]. We used WebFlow to detect the existence of the new data, and transfer it to the other application.

4.3 Three-Tiered Architecture of the WebFlow

 

In our approach we adopted an integrative methodology, i.e., we set up a multiple-standards-based framework in which the best assets of various approaches accumulate and cooperate rather than compete. We started the design from the middleware, which forms the core, or "bus," of modern three-tiered systems, and we adopted Java as the most efficient implementation language for the complex control required by the multi-server middleware. The atomic encapsulation units of WebFlow computations are called "modules," and they communicate by sending objects along channels attached to the modules. Modules can be dynamically created, connected, scheduled, run, relocated, and destroyed.

4.3.1 The Front End

 

The WebFlow Applet is the front end of the system. Through it users can request new modules to be initialized, their ports connected, and the whole application run and, finally, destroyed.

Figure 4.2. WebFlow front-end applet

The WebFlow editor provides an intuitive environment for visually composing (click-drag-and-drop) a chain of data-flow computations from preexisting modules. In the edit mode, modules can be added to or removed from the existing network, and connections between the modules can be updated as well. Once created, a network can be saved (on the server side) to be restored at a later time. The workload can be distributed among several WebFlow nodes (WebFlow servers), with interprocessor communications taken care of by the middle-tier services. With the help of the Globus interface in the back end, execution of particular modules can be delegated to powerful HPCC systems. In the run mode, the visual representation of the meta-application is passed to the middle tier by sending a series of requests (module instantiation, intermodule communications) to the Session Manager.

Control of module execution cannot be exercised just by sending relevant data through the module's input ports. The majority of modules that we have developed require some additional parameters that can be entered via “Module Controls” (in a way similar to that of systems such as AVS). These module controls are Java applets displayed in a card panel of the main WebFlow applet. The communication channels (sockets) between the back-end implementation of a module and its front-end module controls are generated automatically during the instantiation of the module.

Not all applications follow closely the data-flow paradigm. Therefore, it is necessary to define an interface so that different front-end packages can be "plugged in" to the middle tier, giving the user a chance to use the front end that best fits the application at hand. Currently, we offer visual editors based on GEF [GEFWeb] and VGJ [VGJWeb]. In the future, we will add an editor based on the UML standard, and we will provide an API for creating custom editors.

While designing WebFlow we assumed that the most important feature of the front end should be the capability to dynamically create many different networks of modules tailored to application needs. However, working with real users and real applications, we found that this assumption is not always true. WebFlow can be used just as a development tool, taking advantage of the graphical authoring tools to create an application (or a suite of applications). Once created, the same application (i.e., network of modules) can be run by the end user over and over again, without any changes, on different input sets. In such a case the design of the front end should be totally different: it should provide an environment that allows the user to navigate to and choose the application and data that will solve the problem at hand, with the technical nuances of the application hidden from the end user.

The front end provides a platform-independent and web-accessible interface to a high-performance metacomputing environment. Given access to the Internet, the user can create and execute an application using adequate computational resources anywhere, anytime, and even from a laptop personal computer. It is the responsibility of the middle tier to identify and allocate resources and to provide access to the data.

4.3.2 The Middle-Tier

 

Our prototype WebFlow system is given by a mesh of Java-enhanced Web Servers [APACHEWeb] running servlets that manage and coordinate distributed computation. This management is currently implemented in terms of three servlets: Session Manager, Module Manager, and Connection Manager. These servlets are URL addressable and can offer dynamic information about their services and current state. They can also communicate with each other through sockets. Servlets are persistent and application-independent.
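
A rough sketch of such a URL-addressable management servlet is given below; the command name follows Section 4.5.1, while the parameter handling and reply format are assumptions made for the example.

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Illustrative management servlet: URL addressable, persistent, and
    // application independent, in the style of the WebFlow middle tier.
    public class SessionManagerServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            String command = req.getParameter("command"); // e.g., "start-session"
            if ("start-session".equals(command)) {
                resp.getWriter().println("ClientID=" + java.util.UUID.randomUUID());
            } else {
                resp.sendError(HttpServletResponse.SC_BAD_REQUEST,
                               "unknown command: " + command);
            }
        }
    }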

In the implementation of WebFlow we ignored security issues. Again, in line with our HPcc strategy, we closely watched the development of industry standards. At this time the SSL suite of protocols is clearly the dominant technology for authorization, mutual authentication, and encryption mechanisms. The most recent release of Globus implements SSL-based security features. In order to access Globus high-performance computational resources, a user must produce an encrypted certificate digitally signed by the Globus certificate authority, and in return the Globus side (more precisely, the GRAM gatekeeper) presents its own certificate to the user. This mutual authentication is necessary for exchanging encrypted messages between the two parties. However, authorization to use the resources is granted by the system administration that owns the resources, and not by Globus. We are experimenting with a similar implementation for WebFlow.

4.3.3 Session Manager

The Session Manager is the part of the system in charge of accepting user commands from the front end and of executing them by sending requests to the rest of the system. The user requests that the Session Manager honors are: creating a new module, connecting two ports, running the application, and destroying the application. Since the Session Manager and the front end generally reside on separate machines, the Session Manager keeps a representation of the application that the user is building, much like the representation stored in the front end. The difference between these two representations is that the Session Manager needs to worry about the machines on which each of the modules has been started, while the front end worries about the position of the representation of each module on the screen. The Session Manager acts as a server for the front end but uses the services of the Module and Connection Managers. All of the requests received from the user are satisfied by a series of requests to the Module and Connection Managers, which store the actual modules and ports.

4.3.4 Module Manager

The Module Manager is in charge of running modules on demand. When the creation of a module is requested, the request is sent to the Module Manager residing on the particular machine on which the module should run. The Module Manager creates a separate thread for the module (thus enabling concurrent execution of multiple modules) and loads the module code, making the module ready for execution. Upon receipt of a request to run a module, the Module Manager simply calls a run method, which each module is required to have. That method, written by the module developer, implements the module's functionality. Upon receipt of a request to destroy a module, the Module Manager first stops the module's thread of execution and then calls the special destroy method, which is again written by the module developer and performs all the clean-up operations deemed necessary by the developer.
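
The life cycle just described can be condensed into a short sketch; the Module interface and the class names are stand-ins for the actual WebFlow API.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative Module Manager: load a module by class name, run it on its
    // own thread (so several modules can execute concurrently), and destroy
    // it on request.
    public class ModuleManagerSketch {
        public interface Module { void run(); void destroy(); }

        private final Map<String, Module> modules = new ConcurrentHashMap<>();
        private final Map<String, Thread> threads = new ConcurrentHashMap<>();

        public void create(String moduleId, String className) throws Exception {
            Module m = (Module) Class.forName(className)
                                     .getDeclaredConstructor().newInstance();
            modules.put(moduleId, m); // loaded and ready for execution
        }

        public void run(String moduleId) {
            Thread t = new Thread(modules.get(moduleId)::run);
            threads.put(moduleId, t);
            t.start();
        }

        public void destroy(String moduleId) {
            threads.get(moduleId).interrupt();  // stop the module's thread
            modules.remove(moduleId).destroy(); // developer-written clean-up
        }
    }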

4.3.5 Connection Manager

The Connection Manager is in charge of establishing connections between modules, or more precisely, between input and output ports of the modules. As the modules can be executed on different machines, the Connection Manager is capable of creating connections across the network, in which case it serves as a client to the peer Connection Manager on the remote WebFlow server. The handshaking between the Managers follows a custom protocol.

4.4 The Back End

The module API is very simple: each module implements a specific WebFlow Java interface, the MetaModule. In practice, the module developer has to implement three methods: initialize, run, and destroy. The initialize method registers the module and its ports with the Session Manager and establishes the communication between the module and its front-end applet, the module controls. The run method implements the desired functionality of the module, while the destroy method performs clean-up after the processing is completed. In particular, the destroy method closes all socket connections that are not destroyed by the Java garbage collector.
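
A minimal module under this three-method contract might look as follows; the interface name and the trivial bodies are illustrative, since the real API also covers port registration and module controls.

    // Stand-in for the WebFlow module interface described above.
    interface WebFlowModule {
        void initialize(); // register module and ports, open control channel
        void run();        // the module's actual functionality
        void destroy();    // clean-up, e.g., closing socket connections
    }

    public class HelloModule implements WebFlowModule {
        @Override
        public void initialize() {
            System.out.println("HelloModule: registered with the Session Manager");
        }

        @Override
        public void run() {
            System.out.println("HelloModule: doing its work");
        }

        @Override
        public void destroy() {
            System.out.println("HelloModule: resources released");
        }
    }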

It follows that the development of WebFlow modules in Java is straightforward. Taking into account the availability of more and more Java APIs, such as JDBC, this allows the creation of quite powerful, portable applications in Java. To convert existing applications written in languages other than Java, the Java Native Interface can be used. Moreover, the execution of a module can be delegated to an external system capable of resource allocation, such as Globus. Indeed, at Supercomputing '97 in San Jose, California, we demonstrated an HPF application run under the control of WebFlow [DARPSC97]. The WebFlow front end gave us the control needed to launch the application on a remote parallel computer, extract data at runtime, and process the data with WebFlow modules written in Java running on the local machine. The runtime data extraction was facilitated by the DARP system converted to a WebFlow module.

For more complex meta-applications, a more sophisticated back-end solution is needed. As usual, we opt for a commodity solution; since commercial solutions are practically nonexistent, in this case we use technology that comes from the academic environment: the metacomputing toolkit of Globus. The Globus toolkit provides all the functionality we need. The underlying technology is a high-performance communication library, Nexus. MDS (Metacomputing Directory Service) allows resource identification, while GRAM (Globus Resource Allocation Manager) provides secure mechanisms to allocate and schedule resources. The GASS package (Global Access to Secondary Storage) implements high-performance, secure data transfer, augmented with the RIO (Remote Input/Output) library that provides access to parallel data file systems.

In order to run WebFlow over Globus there must be at least one WebFlow node capable of executing Globus commands, such as globusrun. In other words, there must be at least one host on which both Globus and the WebFlow server are installed. This host serves as a "bridge" between two domains (cf. Figure 4.3): a network of WebFlow servers, and a network of resources controlled by Globus, called the Grid. Jobs that require the computational power of massively parallel computers are directed to the Globus domain, while others can be launched on much more modest platforms, such as the user's desktop or even a laptop running Windows NT.
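
Conceptually, the bridge only needs to shell out to Globus on behalf of the middle tier. The sketch below assumes a resource contact string and an RSL specification supplied elsewhere; the flag usage is illustrative.

    import java.io.IOException;

    // Illustrative "bridge" module: delegate a job to the Globus domain by
    // invoking globusrun on the host where both WebFlow and Globus live.
    public class GlobusBridge {
        public static int submit(String resourceContact, String rsl)
                throws IOException, InterruptedException {
            Process job = new ProcessBuilder("globusrun", "-r", resourceContact, rsl)
                    .inheritIO()
                    .start();
            return job.waitFor(); // non-zero exit signals a failed submission
        }
    }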

 

Figure 4.3. Bridge between WebFlow and Globus resources (the Grid)

 

Both Globus and WebFlow gain from this symbiotic coexistence. From the WebFlow perspective, Globus is an optional, high-performance (and secure) back end, while WebFlow serves as a high-level, web-accessible visual interface and job broker for Globus. Together they cover a much wider application domain: Globus adds the HPCC world to WebFlow, and WebFlow adds commodity software, particularly the software that is available only on Microsoft Windows95/98/NT.

We are aware that by providing remote access to Globus resources (either via front-end applets or via WebFlow server-to-server connections) we may introduce a security hole into the system. We made several experiments to upgrade WebFlow security standards in order to match those of Globus. However, at this time we have postponed incorporating any security mechanism into WebFlow until we rebuild the middle tier using CORBA and until some widely accepted standards emerge, preferably defined by DATORR [JGrandeWeb].

WebFlow interacts with Globus via the GRAM gatekeeper. A dedicated WebFlow module serves as a proxy for the gatekeeper client, which in turn sends requests to GRAM. Currently, the proxy is implemented using the Java Native Interface. However, in collaboration with the Globus development team, we are working on a pure Java implementation of the gatekeeper client.

At this time GASS supports only the Globus native x-gass protocol, which restricts its use to systems on which Globus is installed. We expect support for other protocols, notably ftp, soon. This will allow us to use the GASS secure mechanism for data transfer to and from systems outside the Globus domain that are under the control of WebFlow. We also collaborate with the Globus development team to build support for other protocols, namely HTTP and LDAP. In particular, support for HTTP will allow us to implement data filtering on the fly, as the URL given to GASS may point not to the data directly but to a servlet or CGI script instead.

4.5.1 Logging into the WebFlow system

The user starts a session by sending the "start-session" command to the Session Manager (SM), which returns the host, SMHost (on which the SM runs), and the port, SMPort, to the user. It also returns a unique client identifier, ClientID, and the URL of a file, moduleListURL, that includes descriptions of the user modules. This step is done automatically when the user downloads the WebFlow front-end applet from the HTTP server.

Figure 4.4. Starting a user session in the WebFlow system

 

4.5.2 Creating A New Module

While describing the creation of a new module, we will also give the necessary details concerning the Session Manager (SM) and the Module Manager. The user always initiates the commands for the SM through the visual authoring tools inside the applet, with click and drag-and-drop mouse operations; these SM commands are sent automatically on behalf of the user. We will refer to each entry of an "X table" or "X list" as an "X object" in the following discussion.

The SM holds the WFlowSession list (indexed by ClientID). Each entry of this list includes the ModuleRepresentation table (indexed by ModuleID) and the ViewerList of html strings (indexed by htmlKey) for the user modules, as well as the module description list. The ModuleRepresentation object for each module keeps the port CMPort (its Connection Manager port), the port MMPort (its Module Manager port), and the host on which the module lives. The MM (Module Manager) holds the ModuleWrapper table, indexed by ModuleID, with a ModuleWrapper object for each user-defined module. The MM sends commands to the ModuleWrapper (running as a separate thread), which forwards them to the module itself. The advantage of this is that the ModuleWrapper can return control to the MM instantly, especially when running the module, because the ModuleWrapper runs the module's actual run method as a separate thread.

The New-module command, with the parameters ClientID and ClassName for the module to be created, is sent to the SM after the user drops a visual icon for the module onto the Graphic Editor Frame. The SM then finds the corresponding MM for this module and sends it the INITIALIZE command with the module ClassName as a parameter. The SM gets back the MetaModule object for the created module, creates the module's ModuleRepresentation object using information inside the MetaModule object, and inserts the created object into the ModuleRepresentation table. The SM also inserts the htmlString of this module into the ViewerList table. The MetaModule object includes the state (description), ModuleID, port list, and viewerWinName for the new module. The SM forwards the information contained in the received MetaModule object to the applet. If there is a matching front-end applet for this module, the applet sends a "new_viewer" command with parameters htmlKey and ClientID, receives the html string, and pops up the module applet.

Figure 4.5. Creating a new module

 

4.5.3 Making a connection between modules

Creating a connection between modules is more complex than initializing modules. As seen in Figure 4.6, to establish a connection from module fromModule to another module toModule, we need the port IDs (fromPortID and toPortID) and module IDs (fromModuleID and toModuleID) of the module pair. Module IDs and their port IDs are received during the initialization of the modules. We send all of this information, together with ClientID, as a Connect command to the SM. The SM then finds the CMHost and CMPort of module toModule's Connection Manager (CM) and sends them, with the port IDs of the two modules, inside an ESTABLISH command to the peer CM of module fromModule.

The CM keeps a portList table, indexed by port ID, that includes a PortRepresentation object for each module port. The CM also has two ports, one for requests and the other for connections, drawn in Figure 4.6 as a filled small circle at the left CM. In the connection process, the CM receives fromPortID (for the source module) and toPortID (for the target module). The module creates its ports as OutputPort and InputPort objects (inside its initialize method); a PortRepresentation object is created for each port and registered with the CM by calling the CM's static method registerPort. The CM's job is to look up the PortRepresentation objects of the port pair by means of fromPortID and toPortID, make the connection between the two ports, and set the storedPort field (of type Port) of the PortRepresentation objects.

Returning to the connection process: the CM at fromHost sends the RECEIVE command with arguments fromPortID, toPortID, CMHost, and the connection port of fromHost. The CM at toHost makes a socket connection (as a client) to the connection port of fromHost (a port used only for making connections), sets the PortRepresentation of toPort to this established connection, and sends an OK message to the CM at fromHost. The CM at fromHost sets the PortRepresentation of fromPort to the connection returned by the accept method of its ServerSocket object and sends an OK message to the SM, which in turn sends an OK to the applet. This is the general case; if the two CMs are on the same host, steps 3 and 5 are not needed. In that case the module does not send any command to the CM to register its ports but hands its Port objects to the CM directly, because the CM and MM of a module must run on the same host.
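
Seen from the CM on toHost, the cross-host leg of this handshake might be sketched as follows; the message strings and method names are illustrative, not the actual WebFlow protocol.

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.net.Socket;

    // Illustrative client leg of the handshake: after a RECEIVE command, the
    // CM on toHost dials the connection port on fromHost and acknowledges so
    // that fromHost can complete its accept() and report OK upstream.
    public class ConnectionClientLeg {
        public static Socket completeReceive(String fromHost, int connectionPort,
                                             String fromPortID, String toPortID)
                throws IOException {
            Socket channel = new Socket(fromHost, connectionPort);
            // Bind this socket to the PortRepresentation of toPort (not shown),
            // then acknowledge the established connection.
            new PrintWriter(channel.getOutputStream(), true)
                    .println("OK " + fromPortID + " " + toPortID);
            return channel;
        }
    }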

Figure 4.6 Creating a connection between modules

 

4.5.4. Run/stop/destroy modules

The current WebFlow supports only the running of all modules together, not separately. After the user finishes building the visual graph, he or she may send a Run command, with parameter ClientID, to the SM. As seen in Figure 4.7, the SM looks up the WFlowSession object; for each ModuleRepresentation inside this object, the SM sends a RUN command with the ModuleID to the appropriate Module Manager (MM). Each MM finds the ModuleWrapper corresponding to the received ModuleID and calls its runModules method, which then calls the actual run method of the user's module implementation. The Stop and Destroy commands work in the same way as the Run command.

Figure 4.7. Run/Stop/Destroy modules

 

4.6 Limitations of the Original WebFlow and Alternative Solutions

 

To summarize, we have developed a platform-independent, three-tiered system: the visual authoring tools implemented in the front end, integrated with the middle-tier network of servers based on industry standards and following a distributed-object paradigm, facilitate a seamless integration of commodity software components. In particular, we use WebFlow as a high-level, visual user interface for Globus. This not only makes the construction of a meta-application much easier for the end user, but also allows this state-of-the-art HPCC environment to be combined with commercial software, including packages available only on Intel-based personal computers.

 

Although our prototype implementation of WebFlow proved to be very successful, we are not satisfied with it as a complete solution to the problem. Its advantages are the following:

1.      It follows the industry-proven standard of a three-tiered model.

2.      Developing back-end WebFlow modules and their front-end controls can be achieved independently of each other.

3.      WebFlow supports a session concept for each user, so that one user's work doesn't interfere with another's.

4.      Module developers don’t need to deal with low-level issues such as allocating computing resources and connecting modules.

However, WebFlow doesn’t have many of the features a scientist typically expects to see during the development of a distributed application from previously generated modules. These properties have to be developed and the failings must be corrected:

1.      The user must first construct a visual graph representing the back-end application and start all of the modules at once; the flexibility to add and start modules incrementally is also needed.

2.      Whenever the user brings into the front-end palette an icon corresponding to a back-end module, middle-tier processing allocates system resources for the new module, and there is no "undo" operation for this intervening middle-tier process.

3.      When the user makes a connection between two modules in the front-end visual graph, the front-end applet sends the connection command to the middle tier automatically. There is no way to undo this connection process.

4.      There is no utility to save and restore a finished distributed application. The entire visual graph must be rebuilt every time the application is needed.

5.      There is no security and transaction model, and it is difficult to put a security model on top of the current WebFlow.

6.      WebFlow doesn’t provide for replacing an old running module with a new one. An entire application must be created out of new modules every time a module needs to be replaced with a new one.

7.      Even though we tried to extend the original WebFlow so that the user's front-end controls for back-end modules can communicate, much multithreaded coding and many complex operations are required. There has to be an easy way for a front-end control to talk with the matching back-end module, such as a CORBA method call.

8.      Even though we successfully integrated WebFlow with the DARP system by representing DARP as a WebFlow module, much coding was needed, including further synchronization and multithreading work. We have to find a way to couple DARP strongly with WebFlow, possibly by incorporating DARP's functionality into WebFlow's middle tier.

9.      The original WebFlow doesn't support assigning attributes to user modules or saving them to and restoring them from a database.

10.  Because developing distributed applications is a complex process, it is very easy to make a mistake. Therefore, there has to be a monitoring environment for all interactions among front-end and back-end user modules. The original WebFlow doesn’t support this facility.

11.  WebFlow doesn’t allow one user to construct and test multiple distributed applications concurrently.

12.  WebFlow allows only a data-flow model, but we sometimes need event-driven associations among user modules, which we need to provide.

13.  WebFlow doesn't have a service concept. In the high-performance community we have to provide services such as a database server, a directory server, a batch-job submitter, and various metacomputing services such as MDS (Metacomputing Directory Service), GRAM (Globus Resource Allocation Manager), GSS (Globus Security Service), and GASS (Globus Access to Secondary Storage). We need a simple way to attach these kinds of services to, and detach them from, the WebFlow system so that user modules can reach them easily.

Pursuing HPcc goals, we would like to base our implementation on the emerging standards for distributed objects and take full advantage of the possible leverage realized by employing commercial technologies. Our research led us to the following observations: The "Java Platform" or "100% Pure Java" philosophy is being advocated by Sun Microsystems, while the industry consortium led by the OMG pursues a multi-language approach built around the CORBA model. It has been observed recently that the Java and CORBA technologies form a perfect match as complementary enabling technologies for distributed-system engineering. In such a hybrid approach, referred to as the Object Web, CORBA offers the base language-independent model for distributed objects and Java offers a language-specific implementation engine for CORBA brokers and servers.

Meanwhile, other total-solution candidates for distributed objects/components, such as DCOM by Microsoft and WOM (Web Object Model) by the World Wide Web Consortium, are emerging. However, standards in this area and interoperability patterns between the various approaches are still at an early formation stage. A closer inspection of the distributed object/component standard candidates indicates that, while each of the approaches claims to offer a complete solution, each in fact excels only in specific selected aspects of the required master framework. Indeed, it seems that WOM is the easiest, DCOM the fastest, pure Java the most elegant, and CORBA the most realistic complete solution.

Consequently, to implement the new WebFlow system, we chose CORBA as the base distributed-object model at the Intranet level, and the Web as the worldwide distributed-object model instead of patching the original WebFlow to solve the above problems.


CHAPTER 5 – DISTRIBUTED OBJECT-BASED GATEWAY ARCHITECTURE

5.1 Impact of Gateway on Seamless Access to HPCC Resources

 

Seamless access creates the illusion that all the resources needed to complete the user's tasks are available locally. In particular, an authorized user can allocate resources without explicitly logging into the host controlling them. We have many examples of this scenario on other platforms, such as NFS-mounted disks or network printers. A user thinks of his or her home directory files as residing at a single location, though they may be distributed across several NFS-mounted disks. Likewise, the same command is used to print a file on any printer, and the user doesn't need to know where in the network the printer is installed.

A Web browser has become a centerpiece of the desktop. The rapidly evolving Web technologies add functionality to this ubiquitous tool, and what is perhaps even more important, new technologies add functionality to the Web servers. This in turn opens new opportunities for the content providers. Nowadays, the Web is not just a collection of static html documents, but also offers numerous services, from on-line shopping and banking to collaboratory environments used for distance training and for sharing scientific data.

The Gateway system offers a specific programming paradigm implemented over a virtual Web-accessible metacomputer. A meta-application is composed of independently developed modules implemented in Java that follow the distributed, modified JavaBeans model, somewhat similar to the EJB model. This gives the user the complete power of Java and of object-oriented programming in general with which to implement module functionality. However, the functionality of a module does not have to be implemented entirely in Java. Existing applications written in languages other than Java can be easily encapsulated as JavaBeans.

Module developers have only limited knowledge of the system on which the modules will run. They need not concern themselves with issues such as allocating and running the modules on various machines, creating connections among the modules, sending and receiving data across these connections, or running several modules concurrently on one machine. The Gateway system hides these management and coordination functions from the developers, allowing them to concentrate on the modules being developed. In addition to seamless access to modules running across networks, Gateway allows users to construct and work on many different applications composed on several modules concurrently.

Modules often serve as proxies for particular back-end services made available through the Gateway system. For example, access to a database is provided through the JDBC API, which delegates the actual implementation of module functionality to a back-end DBMS. We follow a similar approach in providing access to high-performance resources: a Gateway module "merely" implements an API of back-end services.

The Gateway system supports many different programming models for distributed computations, from coarse-grained data flow to object-oriented to a fine-grained, data-parallel model. In the data-flow regime a Gateway application is given by a computational graph visually edited by the end users. The modules comprising the application exchange data through input and output ports in a way similar to that used in AVS. This model is generalized in our new implementation of the Gateway system. Thanks to the fact that the modules behave as distributed JavaBeans, each module may invoke an arbitrary method of the other modules involved in the computation.

Gateway has a three-tiered architecture, just as WebFlow does (Figure 5.1). A stand-alone application or Web browser-based graphical user interface that assists the researcher in the selection of suitable applications, the generation of input data sets, the specification of resources, and the post-processing of computational results comprises tier 1. The distributed, object-oriented middle tier maps the user-task specification onto back-end resources, which form the third tier. In this way we hide the underlying complexities of a heterogeneous computational environment and replace it with a graphical interface through which a user can understand, define, and analyze scientific problems.

In our design we built on our experience of applying the WebFlow system to real-life applications, as described in our earlier papers [WebFlowSC98et, WebFlowHPCN99te]. It is important to note that we use the Globus metacomputing toolkit to provide access to high-performance resources in tier 3. Conversely, the Gateway system can be regarded as a high-level, Web-based user interface and job broker for Globus. 

Figure 5.1. Gateway architecture

 

 

5.2 Gateway Middle Tier

5.2.1 Motivation

In an object-oriented approach, applications are made of components and containers. One builds a Java applet by placing AWT components (buttons, labels, text fields, and so forth) into frames or panels, which are object containers. This idea can easily be extended to non-graphical components and is implemented as a hybrid of Enterprise JavaBeans and JavaBeans. An important element of the JavaBeans approach is a standardized model for interactions between components through event notification. Information to be shared between components is encapsulated in events and passed to all registered event listeners.
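As a minimal illustration of this event model, the sketch below uses the standard java.beans classes; the TemperatureSource bean and its property name are our own illustrative choices, not part of the Gateway code.

import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

/* A hypothetical non-graphical bean that shares its state through
   property-change events, in the standard JavaBeans style. */
public class TemperatureSource {
    private final PropertyChangeSupport pcs = new PropertyChangeSupport(this);
    private double temperature;

    /* Interested components register themselves as listeners. */
    public void addPropertyChangeListener(PropertyChangeListener l) {
        pcs.addPropertyChangeListener(l);
    }
    public void removePropertyChangeListener(PropertyChangeListener l) {
        pcs.removePropertyChangeListener(l);
    }

    /* A state change is encapsulated in an event and passed
       to all registered listeners. */
    public void setTemperature(double t) {
        double old = this.temperature;
        this.temperature = t;
        pcs.firePropertyChange("temperature", Double.valueOf(old), Double.valueOf(t));
    }
}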

It follows that within the Gateway environment, building a distributed, high-performance application is a process similar to that of building a distributed applet (Figure 5.2). Note that we will use “application,” “context,” and “Gateway context” interchangeably throughout the thesis.

Figure 5.2.  A hypothetical distributed applet. Each panel (a container) of this applet is placed on a different host.

 

By analogy with the distributed applet in Figure 5.2, we have the Gateway context and the user module corresponding to the AWT container and the AWT component, respectively, as shown in Figure 5.3, in which an application is represented by a Gateway context. A user creates a user context, which is a container for all of his or her applications. An application context holds other application contexts or modules, so a user can be represented by a Gateway context containing an arbitrarily complex hierarchy of containers and objects in the middle tier. The Web-based client tier provides tools for visually composing and for distributing this hierarchy.

Figure 5.3. A distributed Gateway application

5.2.2 Gateway Middle Tier

As shown in Figure 5.4, the Gateway middle tier, which actually represents the distributed application in Figure 5.3, consists of Web servers, Gateway contexts, and proxy objects. Although user modules appear in Figure 5.4, they actually belong to the back end. The context residing at the top of the tree is referred to as the master context; we call all other contexts slave contexts. Only the master context can hold proxy objects. A context can run as a separate process, called a Gateway server, or as an object embedded in another object or Gateway server. As shown in Figure 5.4, the Web servers are used to configure and start Gateway, and they hold the descriptions of the user modules. The Web server holding the master context is called the master Web server.

Figure 5.4. A simplified representation of the Gateway middle tier

 

As shown in Figure 5.5 (proxy objects and Web servers removed for simplicity), Gateway contexts and user modules are represented as ellipses and squares, respectively. A slave context registers itself with the master (if it sits directly below the master) or with another slave, and usually runs on a different host than the one on which its parent runs. Usually there is only one master or slave instance on each machine. Slaves can be introduced into the system dynamically, and their parent Gateway servers instantly update their lists of available children. In fact, master and slave contexts are instances of the same object.

The labels on the ellipses and squares indicate the Distinguished Names (DN) of the contexts and user modules, respectively (Figure 5.5). This naming convention is similar to the way in which LDAP directory servers and the CORBA naming service name their components. The component at the tail of a solid arrow holds a reference to the one at the head.

Figure 5.5. Naming of Gateway contexts and user modules

 

5.2.3 Lifecycle Service

 

Compared with the other OMG specifications, this service is more a guideline than a set of standard programming interfaces (although it contains both). One of the key concepts in the Lifecycle service is the object factory, whose purpose is to create other objects. The created objects might be in the same address space, or they could be remote objects somewhere else in the global computing environment. The CORBA architecture and its location transparency help to hide the complexity associated with these differences in location.

Gateway requires a customized Lifecycle service. In particular, instantiating a Gateway module or context on a remote or local host (to be run under the control of the peer Gateway server) requires the creation of a local proxy module in the master context. Therefore, there is a proxy object in the master server for every object created in the system. Each context has an object factory and is responsible for all lifecycle operations of its child objects: contexts and user modules.

 

5.2.4 Proxy Objects

The master creates and maintains proxies for each component in the hierarchy. The original purpose of proxies is to forward requests from the Web client to remote objects (a Web client cannot contact objects on remote slave servers because of the Java sandbox security restrictions). In addition, proxies simplify the association of the distributed components. In our current implementation we generate proxies for all components, including local objects. This symmetric implementation allows the functionality of proxies to be extended; among the most interesting extensions is the capability of logging, tracking, and filtering all messages between components in the system. We use these capabilities to implement fault tolerance, security, and transaction monitors, as well as for debugging.

5.2.5 Interactions of User Modules

For distributed applications we need a mechanism to transport events across address spaces, that is, a distributed-object model. An application is made of contexts and modules that exchange information with each other through the event-notification mechanism. An event is itself a CORBA object that encapsulates the data to be sent from one module to another. To make this work, a registration mechanism must be provided that allows for "connecting" the modules. By "connection" we mean an association of a source object, which fires the event, and a target object, whose registered method is called in response to the event.

Since Gateway modules are developed independently of each other and are connected only at runtime, we need a dynamic mechanism for event binding. This functionality is offered by the CORBA event service, which defines an event-channel object. Both event suppliers and event consumers subscribe to the channel: the supplier first obtains a SupplierAdmin object from the channel, and the consumer obtains a ConsumerAdmin object. The supplier then gets a ProxyPush/PullConsumer object from the SupplierAdmin and starts pumping events into the channel through this proxy; the consumer gets a ProxyPush/PullSupplier object from the ConsumerAdmin and receives events from the channel through this proxy, either by pulling them or by having the channel push them to the consumer.
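For concreteness, a sketch of the push-style subscription just described is given below, using the standard org.omg.CosEventChannelAdmin and org.omg.CosEventComm Java mappings; obtaining the channel reference and implementing the PushConsumer callback are assumed to happen elsewhere, and the event payload is illustrative.

import org.omg.CORBA.Any;
import org.omg.CORBA.ORB;
import org.omg.CosEventChannelAdmin.ConsumerAdmin;
import org.omg.CosEventChannelAdmin.EventChannel;
import org.omg.CosEventChannelAdmin.ProxyPushConsumer;
import org.omg.CosEventChannelAdmin.ProxyPushSupplier;
import org.omg.CosEventChannelAdmin.SupplierAdmin;
import org.omg.CosEventComm.PushConsumer;

public class EventChannelSketch {
    /* Supplier side: obtain a proxy consumer from the channel and push events into it. */
    static void supply(ORB orb, EventChannel channel) throws Exception {
        SupplierAdmin admin = channel.for_suppliers();
        ProxyPushConsumer proxy = admin.obtain_push_consumer();
        proxy.connect_push_supplier(null);           // no disconnect callback needed here
        Any event = orb.create_any();
        event.insert_string("moduleStateChanged");   // illustrative payload
        proxy.push(event);                           // the channel fans this out to every consumer
    }

    /* Consumer side: obtain a proxy supplier and connect a PushConsumer callback. */
    static void consume(EventChannel channel, PushConsumer consumer) throws Exception {
        ConsumerAdmin admin = channel.for_consumers();
        ProxyPushSupplier proxy = admin.obtain_push_supplier();
        proxy.connect_push_consumer(consumer);       // from now on, all channel events are pushed here
    }
}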

Whenever an event comes from any event supplier, the event channel forwards it to all subscribed consumers. There is no event filtering by which one specific type of event is passed to one particular consumer, because the event channel cannot identify the source of an event in order to forward it to a specific consumer. Moreover, we choose not to use these event channels for security reasons: all events in the channel are "public," that is, any object can register itself as the listener for an arbitrary event. Support for point-to-point event exchange will be provided in future releases of CORBA as the event notification service; for now, we are forced to develop our own event-registration service.

Our implementation is based on an event adapter, a simple translation table maintained by each Gateway context. Each entry of the table contains a source-object reference, an event identifier, a target-object reference, a target-method name, and the type of the connection. We support two types of connections, push and pull. In the push model, whenever the source object fires an event, the event is intercepted by the source's parent context, which immediately calls the registered target method. In the pull model, the captured event is kept in the translation table until the target object takes it by explicitly calling the "pull" method on its parent context. This dynamic event binding is achieved using CORBA's dynamic invocation interface (DII) and dynamic skeleton interface (DSI). Figure 5.6 illustrates our event model.
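A minimal sketch of such an event adapter follows; the class and method names are ours rather than the actual Gateway classes, and the real implementation holds CORBA object references and performs the invocation step through DII.

import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Vector;

/* Hypothetical per-context event adapter: each entry of the translation
   table associates (source, eventID) with a target, a target method, and
   a connection type (push or pull). */
class EventAdapter {
    static class Binding {
        String sourceId, eventId, targetId, targetMethod;
        boolean push;
        Binding(String s, String e, String t, String m, boolean p) {
            sourceId = s; eventId = e; targetId = t; targetMethod = m; push = p;
        }
    }

    private final Vector bindings = new Vector();        // the translation table
    private final Hashtable mailboxes = new Hashtable(); // buffered events for pull targets

    void attach(Binding b) { bindings.addElement(b); }

    /* Called when a child module fires an event: push bindings invoke the
       registered target method at once; pull bindings buffer the event. */
    void fireEvent(String sourceId, String eventId, Object event) {
        for (Enumeration e = bindings.elements(); e.hasMoreElements(); ) {
            Binding b = (Binding) e.nextElement();
            if (b.sourceId.equals(sourceId) && b.eventId.equals(eventId)) {
                if (b.push) invokeViaDII(b.targetId, b.targetMethod, event);
                else mailbox(b.targetId).addElement(event);
            }
        }
    }

    /* The pull() call of the BeanContextChild interface ends up here. */
    Vector pull(String targetId) {
        Vector events = mailbox(targetId);
        mailboxes.remove(targetId);
        return events;
    }

    private Vector mailbox(String targetId) {
        Vector v = (Vector) mailboxes.get(targetId);
        if (v == null) { v = new Vector(); mailboxes.put(targetId, v); }
        return v;
    }

    /* Placeholder for the dynamic CORBA invocation of the target method. */
    private void invokeViaDII(String targetId, String method, Object event) { /* ... */ }
}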

In the current implementation, all the binding information for modules or contexts nested inside a container context is kept inside the container, so any request to modify the connections of a user module must be sent to its parent container. It would be possible, however, to expose this binding table as a CORBA object instead of as internal tables.

This simple configuration (Figure 5.6) shows that we created a master server (M) and two slave servers (M/S1 and M/S2). We then created the context M/S1/UC1 inside slave S1 and the context M/S2/UC2 inside slave S2. The modules M1 and M2 are placed in contexts UC1 and UC2, respectively. Inside the master we created one proxy for each instantiated object. The dotted arrows represent the relationship between a proxy in the master and the real object. Suppose we associate modules M1 and M2 with one type of event; the solid arrows then show the execution path when module M1 fires an event.

 

Figure 5.6.  Gateway event model.

Module M1 in context UC1 of S1 fires an event e to invoke a method m of module M2 placed in context UC2 of S2. The small blobs inside the master represent the Gateway components’ proxies maintained by the master. Thin dotted arrows show the relation between the proxies and the actual objects.

1.      Module M1 fires the event, which is intercepted by the proxy of its parent context PUC1.  

2.      The proxy forwards it to the actual context UC1.

3.      The context finds in its translation table the intended recipient (here, method m of module M2) and forwards the event to the target module proxy PM2.

4.      The proxy invokes method m of module M2.

At first, this model may seem unnecessarily complex, and the use of proxies indeed adds some overhead. In practice, however, the performance penalty is barely noticeable, while the advantages of this model outweigh any possible shortcomings. First, we use this long path through proxies only to transfer control logic between objects; the actual data transfer is carried out at the back end with optimized high-performance libraries such as MPI and PVM. Second, referring to the proxy of the target module instead of to the module itself gives module location transparency. Third, firing an event is the only way in which the Web client can access a remote module. Finally, as discussed above, sending events through proxies opens an opportunity for filtering events. Compared to the EJB architecture, these proxies act as generic EJB objects; therefore the proxy for each user module can manage the persistency of the module automatically.

5.3 Gateway Back-End

The back end consists of user modules and various software and hardware resources; we call these resources application services. There are two types of user modules: the principal module, which depends only on its own computational code and not on other services, and the delegate module (or Gateway service), which serves as a proxy for application services. Modules are technically CORBA objects implemented in Java. That does not mean, however, that the actual functionality of a module must be implemented in Java: legacy applications can be easily encapsulated as CORBA objects and thus used as Gateway modules or delegate modules.

A typical task submitted to the Gateway system specifies the software components to be used rather than the actual code. Consequently, Gateway modules do not implement the task functionality directly, but instead act as proxies for that functionality. For example, one of the application services, Globus, offers many metacomputing services and enables a user to obtain high performance whenever needed. The ATD (Abstract Task Descriptor) may, for instance, request the submission of an executable found at a given location and the storage of the output file at another specified location. In such a case, the Gateway delegate module forwards the performance of these tasks to the metacomputing services, supplying them with adequate arguments. In other words, the Gateway delegate module implements the metacomputing service interface and marshals the arguments. An application following any programming paradigm (including parallel) and implemented in any language can thus be run under the control of Gateway modules and the Gateway system.

Accessing a delegate module takes two steps. Because each object in the Gateway system has a proxy, a front-end request for any Gateway service is intercepted by its proxy in the middle tier, which forwards the request to the delegate module. In the second step, the delegate module forwards the request once more, to the actual application service at the back end. A user module running in the back end can ask Gateway for any service; this may cause Gateway to traverse the tree of objects to find the specified service. A user module can also register itself for a specific service, so that whenever that service is added to the system, the module is informed immediately. In this way, Gateway provides not only an environment in which multiple users work on their different applications, but also a single point of service discovery for user modules: a user module can dynamically obtain any service object placed inside any Gateway context. Because the hierarchy of contexts and user modules is very similar to the LDAP architecture, this feature may open up more functionality in the future.

The LDAP and Database services use the LDAP API and the JDBC protocol to access the directory server and the Oracle database, respectively. The NT service can access, through Windows DCOM, various NT services or COM objects developed by the user or by Microsoft.

Figure 5.7. Gateway services at back end

Globus has various metacomputing services implemented on top of Nexus, and Gateway has built-in delegate modules for these services. The services (Figure 5.7) include secure resource allocation (GRAM), secure file transfer (GASS), the Metacomputing Directory Service (MDS), remote I/O for metasystems (RIO), and others.

The GASS (Global Access to Secondary Storage) service, using the Globus API, helps transfer files across different machines in a way that eliminates logging onto a remote site and using ftp. The RIO service provides basic mechanisms for tools and applications that require high-performance access to data located in remote, potentially parallel file systems. The MDS service provides access to MDS, the information infrastructure of the Globus metacomputing toolkit; MDS stores static and dynamic information about the status of a metacomputing environment. The MDS and RIO services access the back-end application services through the LDAP API and the Globus API, respectively.

The GRAM service acts as a client to the GRAM (Globus Resource Allocation Manager) server in the Globus system and is used to submit jobs. More specifically, the GRAM service sends a request expressed in the Globus RSL (Resource Specification Language) that defines the target machine and the locations of the executable and input files, as well as instructions for dealing with the standard output and standard error streams. Optionally, through GASS, both the executable and the input data sets can be staged prior to the execution of the job, and the output files can be uploaded to a specified location after the job completes. The Java code of the Gateway delegate module generates the RSL command; the Gateway module developer never needs to see the actual application, let alone attempt to rewrite it in Java.
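As an illustration, a delegate module might assemble an RSL request along the following lines; the file paths and the helper name buildRsl are hypothetical, while executable, count, and stdout are standard RSL attributes.

/* A sketch of RSL generation inside a hypothetical delegate module. */
String buildRsl(String executable, int count, String stdoutFile) {
    return "&(executable=" + executable + ")"
         + "(count=" + count + ")"
         + "(stdout=" + stdoutFile + ")";
}

/* e.g., buildRsl("/home/user/flow.exe", 4, "/home/user/flow.out") yields
   &(executable=/home/user/flow.exe)(count=4)(stdout=/home/user/flow.out),
   which the GRAM service then submits to the GRAM server. */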

5.4 The Front End

Different classes of applications require different front-end functionality (Figure 5.1). We therefore designed the Gateway system to support many different front ends, from flexible authoring tools or problem-solving environments (PSEs), which allow the dynamic creation of meta-applications from pre-existing modules, to highly specialized front ends customized to the needs of specific applications (Custom Application GUI). We also support many different computational paradigms, from data flow (Data-Flow Visual Authoring) to general object-oriented programming (Object-Oriented Visual Authoring Tools) to a "command line" approach (Standalone Application). This flexibility is achieved by treating the front end as a plug-in implementing the Gateway API. We briefly discuss these front ends below.

A PSE provides an environment in which a scientist can solve a particular problem by constructing a task from several subtasks and specifying the environment and input/output parameters for each subtask. Specifically, for each task the user must give at least the name of the application executable with its input/output parameters (or files), the target host, and the specific environment parameters. The PSE produces an actual job descriptor from the user's selections on the fly and sends it to the middle tier.

The user can also create an application-specific GUI (Custom Application GUI); we have built two such front ends for LMS simulations.

Some applications consist of components with simple relationships among themselves, such that each component takes several inputs from upstream components, runs an actual computational code, and sends its results to downstream components. We have given QS a simple Web-based interface following the Data-Flow Visual Authoring model: we created several reusable modules, and the user clicks and drags a module onto a palette, where he makes data-flow connections among the modules. This type of front end is suitable only for applications with inherently data-flow computations.

Object-Oriented Visual Authoring Tools are the most generic front ends and are similar to a JavaBeans development environment (like Sun's BDK). The user chooses predefined beans and puts them into a bean palette. The user may then make several types of connections between beans after inspecting the properties and fired events of each one. Thus the user can construct arbitrary connections between beans, such that one bean invokes a method of another when a source fires an event. After completion, an application can be saved and restored later.

The stand-alone application is the lowest-level front end because it uses the Gateway API directly to build its specific application.

5.5 Comparison of Gateway with EJB

 

EJB defines a server component model for JavaBeans. The current specification, EJB 1.1, doesn't support attaching Enterprise JavaBeans to one another through event binding. A simplified representation of the EJB model is shown in Figure 5.8. In the EJB model we have a container that provides transaction and security management, state management, and persistency for the Enterprise JavaBeans nested inside it. The container has tools that take Enterprise JavaBeans and produce the helper classes necessary to deploy the EJBs into the container. Usually, to create one Enterprise JavaBean, the user writes, for example, an Account interface and its implementation AccountBean (for session beans), plus AccountHome. Assuming we are using the EJB development environment of the "Acme" company, the container tools take these and automatically produce AcmeAccountHome, AcmeRemoteAccount, and some other classes. AcmeAccountHome has the life-cycle methods implementing the user interface AccountHome. The user first finds AcmeAccountHome in JNDI and uses it to create the AcmeRemoteAccount object, simply an EJB object, which instantiates the real EJB, AccountBean, at the same time. This EJB object behaves as a wrapper for the user's EJB, AccountBean: whenever a method call from the user arrives at the EJB object, the object performs transaction and security operations if needed and forwards the call to the real EJB, AccountBean.
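A sketch of the client's side of this pattern is shown below, assuming the Account and AccountHome interfaces described above; the JNDI name and the create signature are our own assumptions, since they vary with the deployment.

import javax.naming.InitialContext;
import javax.rmi.PortableRemoteObject;

/* Hypothetical EJB 1.1-style client: look up the home interface in JNDI,
   create an EJB object through it, and call business methods on the wrapper. */
public class AccountClient {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        Object ref = ctx.lookup("AccountHome");   // assumed JNDI name
        AccountHome home = (AccountHome)
            PortableRemoteObject.narrow(ref, AccountHome.class);
        Account account = home.create("erol");    // returns the EJB object (e.g., AcmeAccount)
        account.deposit(100.0);                   // the wrapper adds transaction/security handling,
                                                  // then delegates to the real AccountBean
    }
}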

In the Gateway system, our bean is the user module defined in the IDL files. Because we use an interface repository to extract type information from the user modules, we have a generic EJB object (the analogue of AcmeRemoteAccount), called a proxy, for any user-defined module. This proxy can take a request from the client for any user module and forward it to the module. In our model we create user modules simply by instantiating an empty constructor of the user-module implementation, as opposed to calling the "create" methods of the AccountHome implementation classes (AcmeAccountHome). We can again place security and transaction policies into the proxy, as EJB does inside the EJB object.

Just as the EJB tools generate helper and other wrapper classes from the user's EJB interface and implementation classes to provide transaction, security, and persistency management of user EJB objects, we automatically generate wrappers for user-module attributes. These attribute wrappers consist of the methods necessary to transform attributes from one type to another, e.g., a string attribute to its native type or vice versa, or from its native representation to the one in a database. These wrappers support user-module persistency, so that whenever a method call crashes in the middle of its execution, the module's full state can be recovered automatically from the database. We will say more about persistency in a later chapter.

Figure 5.8. Basics of an EJB environment


 

CHAPTER 6 – Gateway Interfaces and Services

 

6.1 Gateway Interfaces

6.1.1 BeanContextChild interface

We started with Sun's JavaBeans model, used its classes, and customized and updated them for a distributed environment. We then wrote our own Gateway interfaces on top of Sun's BeanContext interface and implemented them in CORBA. We define the interface methods below and skip the ones used internally.

interface baseAttribute;

interface BeanContextChild {
    void fireEvent(in baseAttribute eventObj);
    string saveStateInXml(in string saveCase);
    void restoreXMLProperties();
    void saveStateInDataBase();
    void savePropertiesWithJDBC(in string tableName);

    void setEntityFlag(in boolean flag);
    boolean getEntityFlag();
    void destroy();
    void removeMyself();
    void changeImpl(in string objName);
    membersArray pull();
    void setObjectID(in string objectId);
    string getObjectID();
    void setMyProxy(in Object proxyObj);
    Object getMyProxy();
    void setBeanContextChildPeer(in Object peer) raises(NullPointerException);
    void setBeanContext(in Object bc) raises(event::PropertyVetoException);
    Object getBeanContext();
    Object getBeanContextChildPeer();

    void addPropertyChangeListener(in string name, in Object pcl);
    void removePropertyChangeListener(in string name, in Object pcl);
    void addVetoableChangeListener(in string name, in Object vcl);
    void removeVetoableChangeListener(in string name, in Object vcl);
};

Table 6.1. BeanContextChild interface

 

Our BeanContextChild interface (Table 6.1) is an extended version of Sun's BeanContextChild interface with our own special methods. We have an implementation class, BeanContextChildSupport, of this interface. Each Gateway module implementation must extend the BeanContextChildSupport class so that the module code can access the fireEvent method to fire any event. The Gateway context uses these interface methods to easily control the life cycle and run-time information of the user modules.

saveStateInXml saves the current state of a user-module object by generating an XML document of its defined attributes and returning it as the result value; the saveCase argument selects one of two forms for storing the attributes, "ASCII" for text form and "binary" for CORBA CDR (Common Data Representation). Currently we support the basic data types plus sequence and structure types. restoreXMLProperties restores the previous state of the called object from XML: it reads the values of all the attributes defined for the object and sets them. These method calls are intercepted by the proxy of the object and forwarded to the configuration server.

The destroy method disconnects the object from the ORB. The removeMyself method removes the caller object from the children list of its parent context, removes all of the object's incoming and outgoing connection entries from the binding table of the parent, and finally calls the destroy method. The changeImpl method first calls removeMyself and then instantiates a new user module named by the objName parameter by sending an addModule call to the object's parent. Finally, changeImpl substitutes this new instance into the vacant place of the old object.

The pull method collects, in an array, all of the events targeted at this object at a particular time and returns the array. This method should be called inside the implementation of a user module after the module has been attached to another module as an event consumer with the "pull" type of connection.

The methods setObjectID and getObjectID set and get, respectively, the object identifier of the called object (we note that CORBA doesn’t have an exact solution for identifying objects). The methods setMyProxy and getMyProxy set and get, respectively, the proxy created for the called object. These methods are called internally by other methods such as addModule and addContext of the Gateway interface.

The method setBeanContextChildPeer sets the implementation object of this interface to the peer object and is called internally during the instantiation of the implementation; setBeanContext sets the parent context of this object to the bc object; getBeanContextChildPeer and getBeanContext return the implementation peer of this interface and the parent context, respectively.

The methods addPropertyChangeListener and removePropertyChangeListener add and remove, respectively, the property-listener object pcl, under a given name, for the fired property events of the called object; the methods addVetoableChangeListener and removeVetoableChangeListener do the same for the vetoable-property-listener object vcl. These listener-adding and -removing methods are called automatically on behalf of the object when events are attached.

The fireEvent method is explained later in this chapter, and the saveStateInXml, restoreXMLProperties, saveStateInDataBase, savePropertiesWithJDBC, setEntityFlag, and getEntityFlag methods are explained in Section 6.3.

6.1.2 BeanContext interface

interface Iterator{
    boolean hasNext();
    Object next();
    void remove();
};

interface Collection{
    boolean add(in Object o)
        raises(IllegalArgumentException, IllegalStateException, NullPointerException);
    boolean addAll(in Collection c);
    void clear();
    boolean contains(in Object o);
    boolean containsAll(in Collection c);
    boolean equals(in Object o);
    boolean isEmpty();
    Iterator iterator();
    boolean remove(in Object o);
    boolean removeAll(in Collection c);
    long size();
    membersArray toArray();
};

 

 

Table 6.2. Iterator and  Collection interfaces

 

The Gateway interface (Table 6.3) is the IDL definition of the so-called "context" or "container" that holds other containers or user modules. The implementation of this interface, GatewayContextOps, is instantiated whenever one context is added inside another through the addContext call of its parent container, or when the outermost context, acting as a master or slave server, is started by hand or by URL. This interface provides nested components with a single point of service discovery and a logical environment in which to live.

interface DARP;

interface GatewayContext : BeanContext, DARP
{
    void recoverDownObjects(in string xmlfileName);
    void setChildDeleted(in string childID);
    boolean isAllChildrenDeleted();
    Object getContext(in string bindingName);
    void deactivate();

    //Push events
    void attachPushProperty(in Object source, in string eventID,
                            in Object targetObject, in string targetMethod);
    void detachPushProperty(in Object source, in string eventID,
                            in Object targetObject, in string targetMethod);
    void attachPushVetoableProperty(in Object source, in string eventID,
                            in Object targetObject, in string targetMethod);
    void detachPushVetoableProperty(in Object source, in string eventID,
                            in Object targetObject, in string targetMethod);
    void attachPushEvent(in Object source, in string eventID,
                            in Object targetObject, in string targetMethod);
    void detachPushEvent(in Object source, in string eventID,
                            in Object targetObject, in string targetMethod);

    //We give only the attach and detach methods for the pull type of generic events;
    //we don't list the other, similar methods for customized property and vetoable events.
    void attachPullProperty(in Object source, in string eventID, in Object targetObject);
    void detachPullProperty(in Object source, in string eventID, in Object targetObject);

    //Pull the events that came into my mailbox
    membersArray pullEvents();

    //Event/property adapter methods
    void propertyChange(in event::PropertyChangeEvent evt)
                raises(event::PropertyVetoException);
    void vetoablePropertyChange(in event::PropertyChangeEvent evt)
                raises(event::PropertyVetoException);

    Object addNewModule(in string productName);
    Object addNewContext(in string contextName)
                raises(event::PropertyVetoException, NullPointerException);
    stringArray getModuleList();
    long getMyColor();
    string createObjectID(in string productName);

    //Remove modules and contexts
    void removeModule(in Object source);
    void removeContext(in Object source);
    void removeLocalChildren();
    void copyModuleBinding(in Object sourceModule, in Object there);

    //Gateway services methods
    Object getService(in string serviceName);
    void addService(in string serviceName);
    void revokeService(in string serviceName);
    void serviceAvailable(in Object serviceEvt);
    void serviceRevoked(in Object serviceEvt);
    boolean hasService(in string serviceName);
    void addBeanContextServicesListener(in Object bcsListener);
    void removeBeanContextServicesListener(in Object bcsListener);
};

Table 6.3. The GatewayContext interface, extending the BeanContext and DARP interfaces

 

As a containment service, the Gateway interface, extending the Collection interface in Table 6.2, includes add and remove methods that add or remove the object specified in their parameter to or from the children of the called object. It also includes addAll and removeAll to add and remove a collection of objects, and the clear method to remove all children. In addition, it provides inquiry methods: contains and containsAll check whether an object or a collection of objects exists among the children, and isEmpty checks whether there are no children. The toArray and iterator methods return the children in an array and in an Iterator object, respectively. We used these methods to implement the high-level methods of the Gateway interface.

We assign an object identifier (objectID) to each object instantiated in the system with the method createObjectID, which takes the abstract (relative) name of an object in the productName parameter and returns its ID. For example, if a user adds a module named "fileManager" to the context with objectID "uc_1/uc_2", then the object ID of this module will be "uc_1/uc_2/fileManager"; getObjectID returns the ID of the called object.
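In essence this is a one-line concatenation; the sketch below is ours, not the actual code, and relies on the getObjectID method described earlier.

/* Hypothetical sketch of createObjectID: the child's ID is the parent's ID
   followed by the child's relative (abstract) name. */
public String createObjectID(String productName) {
    return getObjectID() + "/" + productName; // e.g., "uc_1/uc_2" + "/" + "fileManager"
}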

The addNewModule method assigns an object ID to the module whose name, productName, is specified as a string constant in the IDL definition of the user modules, and inserts the module into this Gateway context. The method also creates the proxy object for the new module instance at the Gateway server holding the downloaded front-end applet. In a similar way, the addNewContext method adds a new context inside the called context by instantiating a Gateway object.

removeModule and removeContext remove a module and a context, respectively, specified by the parameter source, from their parent contexts and disconnect them from the CORBA ORB. These methods also remove the incoming and outgoing connections of the source object, if any, and update the related binding tables. If the removed object is an outermost context, that is, a master or slave server, it is deactivated or shut down immediately. removeContext also recursively visits the children of the object in a depth-first traversal (DFS) and calls removeModule or removeContext on each one, according to the kind of child visited. removeMyself is a general method that performs the same function as removeModule or removeContext, depending on the object it is called on. The removeLocalChildren method, which should be called on a context object, makes a DFS traversal over the children and calls removeModule or removeContext on each one, according to the kind of child visited.

Figure 6.1. The details of how addNewModule is executed in the Gateway system

 

As shown in Figure 6.1, the addNewModule request is an eight-step process. First, the user sends the addNewModule request to the proxy object of the context named "uc_1" into which he wants to add a module named "M." The proxy forwards this call to context "uc_1", which then instantiates the specified module from the Module Factory. The context assigns a unique identifier (objectID) to the created module object, Mobj, and returns it to its proxy. The proxy creates a new proxy for Mobj and adds it to the children of its real corresponding object, context "uc_1". addNewContext has the same semantics as addNewModule, but it instantiates a context object, not a user module.

The removeModule operation (Figure 6.2) is the most complicated of the Gateway methods. As indicated in Figure 6.2, the M1-M2 and M2-M3 connections were established earlier. First, the user sends a removeModule call, with module M2 as its parameter, to the proxy of context UC1. Context UC1 then removes the local connections going to or from module M2 (currently, only the M2-M1 connection). During its firePropertyChangeEvent call, UC1 removes module M2 and its proxy, PM2, and calls the propertyChange method of its parent context, UC0. In practice, UC1 sends this request to the proxy of UC0, PUC0, which forwards it to UC0. Finally, UC0 removes the association entry, M1-M2, from its binding table.

Figure 6.2. Interaction of objects during the removal of Module M2

 

The deactivate method is called internally when removeContext encounters a context acting as a server, and shuts the server down. copyModuleBinding first copies all incoming and outgoing connections of sourceModule to the target object there, then calls removeModule on sourceModule, and puts the object there into the resulting vacant position.

The Gateway interface has numerous event-attaching methods for the two types of events, push and pull. Each event type has three subcategories: generic events, property events, and vetoable events. A push-type attach takes four parameters: the event source object, source; the identifier of the event, eventID; the event target object, target; and the method of the target, targetMethod, that is called when the event is fired from the source object. The fired event is encapsulated in a CORBA object and delivered automatically on behalf of the firing object.

For a generic push event, attachPushEvent and detachPushEvent attach and detach the event binding, respectively. The push property event has the attachPushProperty and detachPushProperty methods to connect and disconnect this type of event, and the vetoable push property has attachPushVetoableProperty and detachPushVetoableProperty.

The attaching (attachEvent) and detaching (detachEvent) of all types of events can be pictured as in Figures 6.3 and 6.4. As shown in Figure 6.3, to create an event binding from module M1 to M0 for the event "propName," the user sends to the proxy of context UC1 an attachEvent request with four parameters: PM0 and PM1 (the proxies of M0 and M1), "propName," and the target method of M1, "targetMethod." Context UC1 adds the association entry M1-M0 to its binding table and adds itself as a "beanContext" property-change listener of modules M0 and M1.

detachEvent, as shown in Figure 6.4, is the opposite of the attachEvent call. Instead of adding itself as a property listener to modules M0 and M1, however, UC1 removes itself as a listener for modules M0 and M1 if no connections to these modules remain.

Figure 6.3. Making an association between Modules M1 and M2.

Figure 6.4. Making a dissociation between Modules M1 and M2

 

In the pull type of event, an event is not sent automatically to the target; it is captured and stored in a buffer area of the parent context of the firing user module. The event target object therefore has to pull an event explicitly, through the pull method call of the BeanContextChild interface. This pull method actually calls the pullEvents method of the Gateway interface; pullEvents first identifies the requestor object calling the method, checks for any events waiting for this object, collects them in an array, and returns it. Because the target method is not needed for this type of event, we dropped the targetMethod parameter; the other three parameters have the same semantics as in the push-type methods.

For a generic pull event, attachPullEvent and detachPullEvent attach and detach the event binding, respectively. The pull property event has the attachPullProperty and detachPullProperty methods to connect and disconnect this type of event, and the vetoable pull property has attachPullVetoableProperty and detachPullVetoableProperty.

The getContext method returns the context object whose object ID is given in the bindingName parameter, which must include the full path of the object. For example, to find the context with the relative name "myContext" inside the context with ID "uc_1/uc_2/uc_3", the user has to give "uc_1/uc_2/uc_3/myContext" as bindingName.

For persistency, the Gateway interface has only one method, saveStateInXml; how the current state of the object is saved is determined by the saveCase parameter. The state of a context object includes the properties of the objects nested recursively inside the context and the connections among those objects. A saveCase of "ascii" indicates that the property values are stored as separate XML elements containing them in string form; "binary" means that the properties are saved in binary form. saveStateInXml makes a DFS traversal over the nested child objects and generates an XML document saving their state.

The user can also save the distributed state with saveStateInDataBase, which stores all the information related to the application in the database (it, too, traverses the object tree). When it encounters a user module, it saves the module's attributes, such as the object ID, CORBA IOR, parent IOR, and some other information, with the method call savePropertiesWithJDBC. When it finds a context, it saves, first, the information about the context object itself; second, recursively, its child objects, consisting of other contexts and modules; and third, the translation table holding the outgoing connections from the nested modules.
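The traversal can be pictured with the following sketch; saveContextRecord and saveBindingTable are hypothetical helpers standing in for the actual database writes, and the real implementation works on CORBA object references rather than plain Java references.

/* Hypothetical sketch of the depth-first save performed by saveStateInDataBase. */
void saveContext(GatewayContext ctx) {
    saveContextRecord(ctx);                          // first: the context object itself
    for (Iterator it = ctx.iterator(); it.hasNext(); ) {
        Object child = it.next();
        if (child instanceof GatewayContext)
            saveContext((GatewayContext) child);     // second: recurse into nested contexts
        else                                         // or save a module's attributes via JDBC
            ((BeanContextChild) child).savePropertiesWithJDBC("modules");
    }
    saveBindingTable(ctx);                           // third: outgoing connections of nested modules
}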

The Gateway system supports a single point of service discovery; we treat service objects as normal user modules. For this we have several methods: addService adds the service named serviceName to the children list of the Gateway object by calling addModule with the parameter serviceName. After adding the service object, we call the serviceAvailable method of all the listeners registered through addBeanContextServicesListener in order to inform them of the new service object.

The getService method looks for the service whose relative name is specified in serviceName, making a DFS traversal of the whole hierarchical tree from the root node of the master server; during the traversal it tries to find a service whose relative name matches the serviceName parameter. If it cannot find one, it instantiates one with an addService call and returns the created service. revokeService is the opposite of addService, but it calls the serviceRevoked method of all the listeners, with the removed service object as input, and removes these listeners from their parent context with removeBeanContextServicesListener; hasService checks whether the service named serviceName is available.
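A sketch of this lookup, starting from the root context of the master server, might read as follows; relativeName is a hypothetical helper that extracts the last component of an object's DN, and the fallback to addService happens in the caller.

/* Hypothetical sketch of getService's depth-first search. */
Object findService(GatewayContext ctx, String serviceName) {
    for (Iterator it = ctx.iterator(); it.hasNext(); ) {
        Object child = it.next();
        if (relativeName(child).equals(serviceName))
            return child;                            // found a matching service module
        if (child instanceof GatewayContext) {
            Object found = findService((GatewayContext) child, serviceName);
            if (found != null) return found;
        }
    }
    return null; // caller falls back to addService(serviceName) and returns the new instance
}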

6.1.3 DARP Interface

Our development of DARP and Gateway involved two major experiments. Although we succeeded in using the two systems together in the SC97 demo, we were not able at the time to present a clear architecture of how they interact. Because WebFlow does not have the capability of adding the DARP functions directly into its middle tier, we describe here the complete architecture of how DARP and Gateway can work together.

As stated in Chapter 3, we have a DARP manager that abstracts a parallel application consisting of multiple nodes into a single node. Through the DARP manager, the user can control a specific processor node independently of the other nodes. On closer inspection, it is natural to incorporate the manager functionality into the behavior of the GatewayContext object; that is, GatewayContext and the DARP manager are merged into a single object. As a result, the manager functionality is implemented as a CORBA object, and DARP's front end interacts with the manager through CORBA requests instead of through the TCP/IP protocol with XDR encoding. Similarly, the HPF server receives DARP commands through a combination of the send_deferred and poll_response method calls of the CORBA object instead of through TCP/IP.

The complete path for sending DARP requests from the front end to the processor nodes is somewhat complicated. For a processor pX, the front end calls the DARP method putNextCommand of the Gateway context object, which stores the request with its parameters in a temporary buffer, the requestBuffer. This buffer, also called the DARP request buffer, holds the requests for each processor. The method call is based on the asynchronous DII methods send_deferred and poll_response. An example implementation of the manager's putNextCommand could look like the method in Table 6.4.

 

 

/* Put a simple DARP command without parameters for processor pn into the request buffer */
public synchronized void putNextCommand(String command, int pn)
{
    synchronized(requestTable){
        Vector list = (Vector) requestTable.elementAt(pn);
        list.add(command); /* append the new incoming command to the list of commands for pn */
        requestAvailable[pn] = true;
        notifyAll(); /* wake up the sleeping getNextCommand method */
    }
    while(!resultAvailable[pn]){ /* block until this DARP command is processed by processor pn */
        try{
            wait();
        }catch(InterruptedException ex){}
    }
    resultAvailable[pn] = false;
    notifyAll(); /* enable another DARP command */
    /* Get the result that arrived for this request from resultTable and return it,
       either through the parameters or as the return value, depending on the DARP method */
    /* Remember that this method is just a simple representative of the DARP methods */
}

 

 

Table 6.4. A simple representative DARP method

 

Processor pX calls the asynchronous communication method send_deferred, which sends a getNextCommand request to the manager to receive the next user command after the processing of the previous DARP command completes. A possible realization of the manager method getNextCommand is shown in Table 6.5.

In the TCP/IP-based implementation of DARP, the HPF server checks for any DARP command before each executable statement of the user code by making a select function call. The new, object-based implementation of the HPF server instead makes a poll_response call to check whether a response to the previous DII request made through send_deferred is available.

 

public synchronized String[] getNextCommand(int pn) /* pn: processor ID */
{
    Vector list = null;
    while(!requestAvailable[pn]){ /* block until a DARP command arrives for this processor */
        try{
            wait();
        }catch(InterruptedException ex){}
    }
    Object [] olist = null;
    synchronized(requestTable){
        try{ /* get the DARP command(s) waiting for this processor in the request buffer */
            list = (Vector) requestTable.elementAt(pn);
        }catch(ArrayIndexOutOfBoundsException ex){}
        olist = list.toArray();
        list.removeAllElements();
        requestAvailable[pn] = false;
    }
    String [] rlist = new String[olist.length];
    for(int k = 0; k < olist.length; k++) rlist[k] = (String) olist[k];
    return rlist;
}

 

 

Table 6.5. How the DARP server gets the next command from the middle tier

 

The HPF server gets the DARP command(s) and processes them, if any are available, and then calls the manager's putNextResult method to put the result of the previous DARP command into the manager's resultBuffer. The resultBuffer, like the DARP request buffer, keeps results and has one slot for each client registered with this manager. A possible implementation of putNextResult is shown in Table 6.6.

After the result arrives at the manager and is put into the slot specified in the DARP result table, the current DARP request, waiting for this result, wakes up and gets the result from the resultBuffer. The current request then returns the result either in parameters or as a return value, depending on which DARP method is being used. This design is a workaround for the Java security sandbox problem: if the front-end applet is signed, we can instead interpose a CORBA event channel between the manager and the applet, which then runs a CORBA server, registers the server with the channel, and gets automatic notification of the command results. All of these invocation sequences are illustrated in Table 6.6.

 

public synchronized void putNextResult(String result, int pn)
{
    synchronized(resultTable){
        /* find the entry inside resultTable for processor pn */
        Vector list = (Vector) resultTable.elementAt(pn);
        list.add(result); /* append the result of the previous command to this entry */
        resultAvailable[pn] = true;
        notifyAll(); /* wake up the client waiting for this result */
    }
}

 

 

Table 6.6. How the DARP server puts the result of a previous client command into the middle tier

 

As shown in Figure 6.5, multiple clients can connect and register with a running parallel application organized and orchestrated by the DARP manager; each client can access the state of the application through the DARP data-access methods, and the clients can steer and control the application in collaboration with one another.

Notice that for each DARP command in the previous TCP/IP-based implementation we have a CORBA method in the GatewayContext interface; all of the semantics of the DARP commands are reflected in the matching interface methods. Configuration and the starting and stopping of parallel applications are achieved through the DARP manager. The SciVis visualization and instrumentation servers can be attached to the Gateway system as Gateway services. As a result, we have a completely distributed program-development environment. Notice also that instrumentation can be performed directly with YACC and LEX, attaching the necessary actions to the ends of the matching grammar productions, instead of with SAGE++. These actions produce the instrumented program, inserting the HPF server calls and the function calls for registering program variables.

Figure 6.5. How the DARP server, middle tier, and client interact with each other to fully control the distributed execution


interface DARP{

typedef sequence<long> longArray;
typedef sequence<any> anyArray;

typedef struct basicCommandDataDef{
    string commandName;
    long lineNumber;
    long pid;
    long errorCode;
} basicCommandData;

typedef struct dataAccessResultDef{
    basicCommandData basicData;
    longArray dataDesc;
    any dataValue;
} dataAccessResult;

typedef struct protoTypingResultData{
    basicCommandData basicData;
    any dataValue;
} protoTypingResult;

any getNextReqFromManager();
void putPrevReqResult(in any theResult);

//Data access methods
void displayVariable(in long pn, in string varName,
                     out dataAccessResult theResult);
void displayArraySection(in long pn, in string varName,
                     in string arraySectionStmt,
                     out dataAccessResult theResult);
void setVariable(in long pn, in string varName, in any varValue,
                     out basicCommandData theResult);
void setParallelVariable(in long pn, in string varName, in any varValue,
                     out basicCommandData theResult);

//Prototyping commands
void createInterpretedStmts(in long pn, in string stmt,
                     out protoTypingResult theResult);
void interpretStmts(in long pn, in string functionFromStmts,
                     out protoTypingResult theResult);
void setSciVisEnv(in long pn, in long port, in string hostName,
                     out protoTypingResult theResult);
void getSciVisFuncList(in long pn, out protoTypingResult theResult);
void sendLocalVariables(in long pn, out protoTypingResult theResult);
void addSciVisFunc(in long pn, in string funcName, in long lineNumber,
                     out protoTypingResult theResult);
void deleteSciVisFunc(in long pn, in string funcName, in long lineNumber,
                     out protoTypingResult theResult);
void getActionLineNumbers(in long pn, out protoTypingResult theResult);
void getFunctionParamNames(in long pn, in string funcName, in long lineNumber,
                     out protoTypingResult theResult);
void storeCurrentConfiguration(in long pn, in string configFileName,
                     out protoTypingResult theResult);
void getProfilingInfo(in long pn, out protoTypingResult theResult);
void getConfigurationFileNames(in long pn, out protoTypingResult theResult);
void readParseTreeinfo(in long pn, out protoTypingResult theResult);

//Control commands
void pause(in long pn, out basicCommandData theResult);
void putBreakOn(in long pn, in long lineNum, out basicCommandData theResult);
void putBreakOff(in long pn, in long lineNum, out basicCommandData theResult);
void stepInto(in long pn, out basicCommandData theResult);
void stepOver(in long pn, out basicCommandData theResult);
void multipleStepInto(in long pn, in long numOfSteps,
                     out basicCommandData theResult);
void multipleStepOver(in long pn, in long numOfSteps,
                     out basicCommandData theResult);
void multipleLoopIteration(in long pn, in long numOfSteps,
                     out basicCommandData theResult);
void continue(in long pn, out basicCommandData theResult);
void moveStackDown(in long pn, out basicCommandData theResult);
void moveStackUp(in long pn, out basicCommandData theResult);
};

Table  6.7.  DARP interface


 

We briefly describe how the DARP methods correspond to the commands of the TCP/IP-based implementation. When a method of the DARP manager is called, it takes the input parameters, constructs CORBA DynAny data from them, and inserts this into the requestBuffer. Each user module running on a processor gets this DynAny through an asynchronous call of the getNextReqFromManager method and extracts information from it, depending on the called DARP method and the process being used. After the user module finishes processing, it constructs the special struct data, fills in its member values, and sends the struct to the manager inside a CORBA any by calling the putPrevReqResult method; for prototyping commands, for example, it creates struct data of type protoTypingResult. putPrevReqResult finally puts the received result data into the resultBuffer. The awakened client-side DARP method pulls this result, a CORBA any, from the resultBuffer, fills in the special struct of the DARP method, and returns.

6.2 Persistency and Configuration Service

 

We have a prototype of the Gateway persistency and configuration model based on XML and a database (Oracle). The user can choose either XML or the database, or switch back and forth between the two (Figure 6.6). Our persistency model is not a general one: for each module, the user can define in its IDL file which attributes should be stored. We use attributes as a way of notifying others of the state changes of objects, following the JavaBeans event model; state updates are encapsulated in event objects. We define three types of properties: general, customized, and vetoable.

 

 

Figure 6.6. Gateway Persistency Model

The user can define the attribute types of his modules by extending the corresponding base interfaces in the IDL file. The base interfaces of the three attribute types are shown in Table 6.8. Each extends the baseAttribute interface, which has a group of management methods for the attributes sent in events. The parseXmlValue method of the baseAttribute interface produces an XML definition of the indicated attribute. The parseSimpleValue, parseStructValue, and parseSequenceValue methods take stringified values of attributes of simple, struct, and sequence types, respectively, and convert them to the CORBA any data type. The Gateway middle tier calls these methods to start a distributed application from, or save it to, its XML description; setNewValue and getNewValue set and get the attribute value in the CORBA any format. Gateway has an IDLtoXML translator that takes the user attribute definitions in the IDL file and produces implementation helper classes for each user-defined attribute.

For example, in the user IDL file for the timer module (Table 6.9), the user defines three attribute types: timerEvent, timerProperty, and timerVetoableProperty, which represent a general property, a customized property, and a vetoable property, respectively, and which extend three different base interfaces. Each attribute interface has a constant string field, "attributeName", whose value identifies the attribute in the Gateway system, and a "value" field that holds the actual value of the attribute. As seen in Table 6.9, one extra level of abstraction is used in defining user attributes, both to distinguish the three attribute types and to assign an abstract name through which the user can access the attribute in his application code. Once the user has defined an attribute interface, it can be used to declare an attribute in the "timer" module interface.

//Base interface for all event types
interface baseAttribute{
  Object getSource();
  void setSource(in Object source);
  string getPropertyName();
  any getNewValue();
  void setNewValue(in any newval);
  any getOldValue();
  void setOldValue(in any oldval);
  string parseXmlValue();
  void parseSimpleValue(in string value);
  void parseStructValue(in stringArray values, in longArray dims);
  void parseSequenceValue(in stringArray values, in longArray dims);
};

 

interface PropertyChangeAttribute: baseAttribute {}; //Property change event

interface VetoableChangeAttribute: baseAttribute {}; //Vetoable property change event

interface GenericAttribute: baseAttribute {}; //Generic event

 

Table 6.8. Base attribute and three different attributes extending from the base

 

Keeping the module descriptions in an XML document has several benefits. First, the front end uses these descriptions to instantiate modules, make connections between modules, and configure modules. Second, the middle tier uses this document to store and recover the distributed state of the modules, with their attributes, whenever necessary; this makes it possible for the user to feed the initial values of module attributes and other environmental values in through the XML document. Third, it supports saving and restoring the distributed configuration that the user created during application development. The values of the attributes of running modules can be stored either as binary or as text. If stored as text, the user has an opportunity to edit and update the last state of the modules in the XML; later, the user can incarnate the distributed configuration with these updated values of module properties or connections. A simple module definition in the file "test1.idl" is shown in Table 6.9.

 

//test1.idl

//an array definition
typedef sequence<long> longArray;

//a structure
typedef struct timerstruct{
    longArray pastValues;
    string currenttime;
}STRUCT_1;

typedef sequence<STRUCT_1> structArray;

//another structure
typedef struct anothertimerstruct{
    structArray pastValues;
    double currenttime;
}STRUCT_2;

//for general property
interface timerEvent:event::GenericAttribute{
    const string attributeName="timerEventData";
    attribute string value;
};

//vetoable property
interface timerVetoableProperty:event::VetoableChangeAttribute{
    const string attributeName="timerVetoablePropertyData";
    attribute STRUCT_2 value;
};

//customized property
interface timerProperty:event::PropertyChangeAttribute{
    const string attributeName="timerPropertyData";
    attribute STRUCT_1 value;
};

interface timer_1:BeanContextChild{
    const string moduleName="timerModule_1";
    attribute timerEvent genericTimerEvent;
    attribute timerProperty timerPropertyEvent;
    attribute timerVetoableProperty timerVetoableEvent;
    //other user module methods
    void method_1(in string p1, in long p2);
    void method_2();
};

interface timer_2:BeanContextChild{
    const string moduleName="timerModule_2";
    //other user module methods
    void targetMethod(in timerEvent tevent);
};

 

            Table 6.9. An IDL definition file of an example user module.

 


//store this XML document in file "module_properties.xml"
<UserModule idlFile="test1.idl" componentID="timerModule_1">
    <repositoryID>IDL:Gateway/test/timer_1:1.0</repositoryID>
    <properties>
        <simpleDef attID="timerEvent" eventType="genericEvent">
            <wrapper_interface>IDL:Gateway/test/timerEvent:1.0</wrapper_interface>
            <simple type="string" name="genericTimerEvent">
                <description>this is an optional description</description>
                <repositoryID>string</repositoryID>
            </simple>
        </simpleDef>
        <structDef attID="timerProperty" eventType="propertyEvent">
            <wrapper_interface>IDL:Gateway/test/timerProperty:1.0</wrapper_interface>
            <struct type="struct" name="timerPropertyEvent">
                <description>this is an optional description</description>
                <repositoryID>IDL:Gateway/test/timerstruct:1.0</repositoryID>
                <sequence type="sequence&lt;long&gt;" name="pastValues">
                    <description>this is an optional description</description>
                    <repositoryID>IDL:Gateway/test/longArray:1.0</repositoryID>
                </sequence>
                <simple type="string" name="currenttime">
                    <description>this is an optional description</description>
                </simple>
            </struct>
        </structDef>
        <structDef attID="timerVetoableProperty" eventType="vetoableEvent">
            <wrapper_interface>IDL:Gateway/test/timerVetoableProperty:1.0</wrapper_interface>
            <struct type="struct" name="timerVetoableEvent">
                <description>this is an optional description</description>
                <repositoryID>IDL:Gateway/test/anothertimerstruct:1.0</repositoryID>
                <sequence type="sequence&lt;alias&gt;" name="pastValues">
                    <description>this is an optional description</description>
                    <repositoryID>IDL:Gateway/test/structArray:1.0</repositoryID>
                </sequence>
                <simple type="double" name="currenttime">
                    <description>this is an optional description</description>
                </simple>
            </struct>
        </structDef>
    </properties>
    <methodList>
        <method name="method_1" returnType="void">
            <parameter name="p1" type="string"/>
            <parameter name="p2" type="long"/>
        </method>
        <method name="method_2" returnType="void"> </method>
    </methodList>
</UserModule>
<UserModule idlFile="test1.idl" componentID="timerModule_2">
    <repositoryID>IDL:Gateway/test/timer_2:1.0</repositoryID>
    <methodList>
        <method name="targetMethod" returnType="void">
            <parameter name="tevent" type="IDL:Gateway/test/timerEvent:1.0"/>
        </method>
    </methodList>
</UserModule>

 

 

Table 6.10. XML description of user IDL file in Table 6.9

 

IDLtoXML translates user module IDL definitions, with their attribute types, into XML documents. Using this translator we can generate module descriptions in XML from the user module IDL files, as is done in Table 6.10 for the IDL interfaces defined in Table 6.9. As this translation shows, we currently support basic types (integer, string, etc.), arrays, and structures. IDLtoXML also produces helper classes for user-defined attributes, as explained above. Table 6.11 shows two methods of the helper class generated automatically by IDLtoXML for the timerVetoableProperty attribute of the user module in Table 6.9.

A user can drive the system in two ways: through the direct middle-tier API or through job descriptions in an XML document (Figure 6.6). It is possible to transform among three different representations: a standalone Gateway application, XML space, and a database holding an integrated collection of Gateway objects. The user can create a distributed application through direct use of the Gateway API, or describe his application in XML and submit it to the middle tier, which parses and runs it. During the application's run, he may ask the middle tier to translate the distributed state into an XML document or into database tables. The user can also store the XML version of the application in the database. If the user is running a sequence of sessions, the state can be converted to XML after each intermediate result; the result values can then be inspected and modified, and another session started with the new initial parameter values.

We will give one simple example of using the API. The Gateway API calls shown in Table 6.12 create the connection between modules M1 and M2 indicated in Figure 5.6, including the instantiation of the modules and their contexts. Assume that master M and slaves S1 and S2 were started manually or through a URL, and that modules M1 and M2 correspond to the two modules timer_1 and timer_2 defined in the IDL file of Table 6.9.

 

public void parseStructValue(String []values, int []dims){
    int next = 0;
    WebFlow.translator_exam1.anothertimerstruct attValue;
    attValue = new WebFlow.translator_exam1.anothertimerstruct();
    attValue.pastValues = new WebFlow.translator_exam1.timerstruct[dims[0]];
    for(int i0=0; i0<dims[0]; i0++)
    {
      attValue.pastValues[i0] = new WebFlow.translator_exam1.timerstruct();
      attValue.pastValues[i0].pastValues = new int[dims[1]];
      for(int i1=0; i1<dims[1]; i1++)
      {
        int t_3 = Integer.parseInt(values[next++]);
        attValue.pastValues[i0].pastValues[i1] = t_3;
      }
      String t_4 = values[next++];
      attValue.pastValues[i0].currenttime = t_4;
    }
    double t_5 = Double.parseDouble(values[next++]);
    attValue.currenttime = t_5;
    value(attValue);
}

public String parseXmlValue(){
    String xmls = "";
    xmls += "    <propertyVal attrRef=\""+"timerVetoablePropertyData"+"\">\n";
    WebFlow.translator_exam1.anothertimerstruct attValue = newValue;
    xmls += "      <structVal>\n";
    xmls += "        <sequenceVal>\n";
    for(int i0=0; i0<attValue.pastValues.length; i0++){
      xmls += "          <structVal>\n";
      xmls += "            <sequenceVal>\n";
      for(int i1=0; i1<attValue.pastValues[i0].pastValues.length; i1++){
        xmls += "              <simpleVal>"+String.valueOf(attValue.pastValues[i0].pastValues[i1])+"</simpleVal>\n";
      }
      xmls += "            </sequenceVal>\n";
      xmls += "            <simpleVal>"+String.valueOf(attValue.pastValues[i0].currenttime)+"</simpleVal>\n";
      xmls += "          </structVal>\n";
    }
    xmls += "        </sequenceVal>\n";
    xmls += "        <simpleVal>"+String.valueOf(attValue.currenttime)+"</simpleVal>\n";
    xmls += "      </structVal>\n";
    xmls += "    </propertyVal>\n";
    return xmls;
}

Table 6.11. The definition of two methods of the helper class for user attributes

 

After the user constructs the simple distributed configuration of Figure 5.6, the complete current state can be saved in XML by sending a saveStateInXml method call to the master server. In the opposite direction, the user may assign new values to attributes in the XML document and apply them with the restoreXMLProperties method. If the user initialized all of the attributes of the timer_1 and timer_2 modules (Table 6.9), the middle tier automatically produces the XML document shown in Table 6.13. Later, the user can restore the old configuration of Figure 5.6 simply by submitting that document. Thus the user has the flexibility of constructing his application either through API calls or by specifying the entire abstract job in XML. One of the greatest benefits of saving state in XML is that, if the user chose to save attributes as ASCII, he can later inspect the stored module attributes and update them for the next iterations of the run. If the user does not want to inspect the saved attributes, they can be stored in the XML document in binary form. Notice that the attributes in the XML file of Table 6.13 were given random values in the user-module code and saved as ASCII.

org.omg.CORBA.Object s1_obj, s2_obj, uc1_obj, uc2_obj, m1_obj, m2_obj;
GatewayContext uc1, uc2, s1, s2;
//get IOR string from fixed URL
String ref = getIORFromURL(masterURL);
//convert IOR to CORBA object
org.omg.CORBA.Object obj = orb.string_to_object(ref);
//narrow CORBA object to specific CORBA server type
GatewayContext master = GatewayContextHelper.narrow(obj);
//get the proxy reference of slave server S1
s1_obj = master.getContext("M/S1");
s1 = GatewayContextHelper.narrow(s1_obj);
//add new user context with name "uc1"
uc1_obj = s1.addNewContext("uc1");
//narrow it to GatewayContext
uc1 = GatewayContextHelper.narrow(uc1_obj);
m1_obj = uc1.addNewModule("timerModule_1");
//get the proxy reference of slave server S2
s2_obj = master.getContext("M/S2");
s2 = GatewayContextHelper.narrow(s2_obj);
//add new user context with name "uc2"
uc2_obj = s2.addNewContext("uc2");
//narrow it to GatewayContext
uc2 = GatewayContextHelper.narrow(uc2_obj);
m2_obj = uc1.addNewModule("timerModule_2");
//Finally, make a "push" type of connection between modules M1 and M2. Make sure
//the "attachPushEvent" call is issued to the parent of the event source; here M1
//is the source of the event.
uc1.attachPushEvent(m1_obj, "timerEventData", m2_obj, "targetMethod");

Table 6.12. These Gateway API calls construct the configuration in Figure 5.6

<GatewayContext componentID="M">
   <GatewayContext componentID="M/S2">
      <GatewayContext componentID="M/S2/uc2">
      </GatewayContext>
      <connections> </connections>
   </GatewayContext>
   <GatewayContext componentID="M/S1">
      <GatewayContext componentID="M/S1/uc1">
         <ModuleInstance>
            <componentRef>timerModule_2</componentRef>
            <moduleID>M/S1/uc1/timerModule_2</moduleID>
         </ModuleInstance>
         <ModuleInstance>
            <moduleID>M/S1/uc1/timerModule_1</moduleID>
            <propertyInstances>
               <propertyVal attrRef="timerPropertyData">
                  <structVal>
                     <sequenceVal>
                        <simpleVal>1</simpleVal>
                        <simpleVal>2</simpleVal>
                        <simpleVal>3</simpleVal>
                        <simpleVal>4</simpleVal>
                     </sequenceVal>
                     <simpleVal>startTime</simpleVal>
                  </structVal>
               </propertyVal>
               <propertyVal attrRef="timerVetoablePropertyData">
                  <structVal>
                     <sequenceVal>
                        <structVal>
                           <sequenceVal>
                              <simpleVal>1</simpleVal>
                              <simpleVal>2</simpleVal>
                              <simpleVal>3</simpleVal>
                              <simpleVal>4</simpleVal>
                           </sequenceVal>
                           <simpleVal>startTime</simpleVal>
                        </structVal>
                        <structVal>
                           <sequenceVal>
                              <simpleVal>1</simpleVal>
                              <simpleVal>2</simpleVal>
                              <simpleVal>3</simpleVal>
                              <simpleVal>4</simpleVal>
                           </sequenceVal>
                           <simpleVal>startTime</simpleVal>
                        </structVal>
                     </sequenceVal>
                     <simpleVal>3333.55</simpleVal>
                  </structVal>
               </propertyVal>
               <propertyVal attrRef="timerEventData">
                  <simpleVal>Tue May 18 13:48:12 EDT 1999</simpleVal>
               </propertyVal>
            </propertyInstances>
         </ModuleInstance>
         <connections>
            <connect eventRef="timerEventData" typeOfConnection="push">
               <sourceObject>
                  <moduleRef>M/S1/uc1/timerModule_1</moduleRef>
               </sourceObject>
               <targetObject>
                  <moduleRef>M/S1/uc1/timerModule_2</moduleRef>
                  <targetMethod>targetMethod</targetMethod>
               </targetObject>
            </connect>
         </connections>
      </GatewayContext>
      <connections> </connections>
   </GatewayContext>
   <connections> </connections>
</GatewayContext>

 

 

Table 6.13. This XML document is saved when the user issues a saveStateInXml request to the master server. The user can also reconstruct the configuration of Figure 5.6 from this document

 

For the job of Table 6.13 to start, all of the specified Gateway servers must already have been started, manually or in some other way such as through a URL. To start a Gateway server automatically if it is down when the Gateway contexts and user modules represented in the XML are instantiated, we have another way of configuring an entire distributed application, one that includes the Gateway servers themselves. In Table 6.13 we always describe modules before their instantiation in the document. If the same module declarations are used in several places, the module descriptions end up duplicated across different XML documents. To solve this problem we use the XML ENTITY element, which lets us put one module description at a URL-addressable location and point to that location every time the description is used.

<?xml version="1.0"?>
<!DOCTYPE distributed_application SYSTEM "webflow.dtd" [
    <!ENTITY timer-master-decls
       SYSTEM "http://osprey2:1998/WebFlow/module-decls/master-module.xml">
    <!ENTITY timer-slave-decls
       SYSTEM "http://osprey2:1998/WebFlow/module-decls/slave-module.xml">
]>
<distributed_application>
<WebFlowContext
     componentID="ntserver"
     servlet_href="http://osprey2:1998/servlets/wfm?installer=&#34;http://osprey2:1998/WebFlow/module-decls/master-module.xml&#34;"
     entityFlag="no"
>
    <!-- insert the module declarations sitting at another Web site -->
    &timer-master-decls;
    <WebFlowContext componentID="uc_1" entityFlag="no">
       <WebFlowContext componentID="uc_2" entityFlag="no">
          <WebFlowContext componentID="uc_3" entityFlag="no">
             <ModuleInstance>
                 <componentRef>helloModule_1</componentRef>
                 <moduleID>ntserver/uc_1/uc_2/uc_3/helloModule_1</moduleID>
             </ModuleInstance>
             <!-- the remaining module instantiations, their attributes,
                  and the connections among them are skipped for clarity -->
          </WebFlowContext>
          <WebFlowContext
                componentID="mickey"
                servlet_href="http://osprey2:1998/servlets/wfm?installer=&#34;http://osprey2:1998/WebFlow/module-decls/slave-module.xml&#34;"
                entityFlag="no"
          >
             <!-- insert the module declarations sitting at another Web site -->
             &timer-slave-decls;
             <!-- do some module instantiations and connections among them -->
          </WebFlowContext>
       </WebFlowContext>
    </WebFlowContext>
</WebFlowContext>
</distributed_application>

Table 6.17. Solving multiple declarations of modules with the XML ENTITY element

For example, in Table 6.17 we have two module declarations residing at two different Web locations; we declare no modules in the document itself but instead point to their declaration locations. Notice also that Gateway contexts that run as separate processes (Gateway servers) carry a servlet_href attribute whose value gives the URL of the installer servlet for that Gateway server. For this type of configuration, all of the servlets named by servlet_href attributes must be running. When the Gateway middle tier parses the application description in an XML document and finds a context that has a servlet_href attribute, it simply invokes that servlet, which in turn starts the specified Gateway server and returns its IOR. The user therefore starts a set of Web servers only once; afterwards he can start and configure an arbitrary distributed application in an efficient and simple way, which demonstrates the simplicity of configuring and maintaining the Gateway system.

               

6.3 Gateway Fault Tolerance Model

 

Gateway has simple fault tolerance that applies only when a proxy at the root of a Gateway server, or a real remote user module, is down. The model cannot recover from a failure of the communication link between interacting objects. It uses a combination of resources (a monitor servlet, Web servers, a database, and the CORBA Interface Repository) to fix problems that may occur while the user's distributed application is running. We briefly explain how this process takes place.

Two principal problems may occur in the Gateway system: either a module proxy living in the root of a Gateway server may be down, or the remote module object itself may be down. We need to consider the worst possible scenario, in which multiple proxies and remote modules fail simultaneously; the recovery algorithm should not fail even then.

First of all, a user who wants a module to be rolled back when a fault occurs has to mark it as an entity module by calling setEntityFlag; the user normally sets the entity flag when the module is created and added to a context. Only modules whose entity flags are set will be recovered. With the entity flag set, the state of a user module is stored across its method invocations with the savePropertiesWithJDBC method call.

Figure 6.7. Recovering a proxy module in the root Gateway server

As shown in Figure 6.7, a proxy object is deduced to be down when a client tries to access a remote module, or when another module wants to send an event to the remote module corresponding to that proxy. (In the normal case, that event is intercepted by the proxy and forwarded to the remote object.) Transparently to the user, the module that sends the event queries the database to find the previous state of the session and creates a new proxy for the remote module. When the user himself discovers that a proxy is down, he must invoke a recovery servlet, without any parameters.

Because in Gateway's current architecture all proxy objects are created in the master Gateway server, recovering from a failure of the master server itself is more complicated than the case described above. This state is detected when a client tries to make a call on a module. When the host for the master server is down, we face a single point of failure; in this case we can try to configure the master server on a different active host. It is possible to distribute proxy objects across different hosts, but this configuration does not solve the Java sandbox problem.

 

In the second scenario (Figure 6.8), a remote module is discovered to be down when its proxy tries to call one of its methods. The user encounters this condition when calling a method of the remote object, or when another proxy tries to send an event to that same remote object. The proxy gets an exception indicating that the remote object is down; it then looks up the previous state in the database, re-creates the remote object, and re-sends the previous method call that was interrupted by the remote object's failure.

 

Figure 6.8. Recovering a remote user module

 

 

 

 

6.4 Gateway Security Access

 


 

Figure 6.9. Gateway Security Architecture (GWS: Gateway Server)

The Gateway middle tier is given by a tree of CORBA-based Gateway servers, one of which serves as the gatekeeper (Figure 6.9). The gatekeeper comprises three logical components: a (secure) Web server, the AKENTI server, and the CORBA-based Gateway server. The user accesses the Gateway system through a portal Web page served by the gatekeeper's Web server. The portal implements the first layer of Gateway security: the authentication and generation of credentials that will eventually be used to grant the user access to resources. The authorization process is controlled by the AKENTI server. For each authorized user, the Web server creates a session (that is, it instantiates the user context in the Gateway server, as described below) and gives permission to download the front-end applet that is used to create or restore, run, and control user applications. The applet communicates directly with the CORBA-based Gateway server using the IIOP protocol. We currently use the secure ApacheSSL and JigsawSSL commodity Web servers.

The Gateway system supports a three-layer security model (Figure 6.10). The first layer is responsible for secure web access to the system and for establishing the user’s identity and credentials. The second layer enforces secure interactions between distributed objects, including communications between peer Gateway servers and the delegation of credentials. The third layer controls access to back-end resources. In each case we follow industry standards or participate in the creation of standards.

Figure 6.10. Gateway Security Model

 

6.4.1 First Security Layer: Secure Web Transactions

To implement secure Web transactions we use the industry-standard HTTPS protocol and commodity secure Web servers. The server is configured to mandate mutual authentication: to make a connection, the user must accept the server's X.509 certificate (both the Netscape and Internet Explorer Web browsers support this feature) and must present his or her own certificate to the server. A commercial software package (Netscape's certificate server) is used to generate the user certificates, which are signed by the Gateway certificate authority (CA). Since the certificates are generated in PKCS#12 format, which Netscape and Internet Explorer support, they are handled transparently by the browser.

The authorization process is controlled by the AKENTI server [AKENTIWeb], which provides a way to express and to enforce an access policy without requiring a central enforcer and administrative authority. Its architecture is optimized to support security services in distributed network environments.

This component of the security services grants access to the Gateway server associated with the gatekeeper to authorized users only, following policies defined in AKENTI (and thus representing the stakeholders' interests). Access to peer Gateway servers and to back-end services is controlled independently by the level-two and level-three Gateway security services, based on credentials generated during the initial contact with the gatekeeper.

6.4.2 Second Security Layer: Secure Interactions between Distributed Objects

Security features of CORBA are built directly into the ORB and are therefore very easy to use. Once the user's credentials are established, secure operations on distributed objects are enforced transparently. This includes the authorized use of objects and optional per-message security (integrity, confidentiality, and mutual authentication).

Access control is based on access control lists (ACLs), which provide the means to define policies at different granularities, from an individual user to groups defined by a role, and from a particular method of a particular object to entire computational domains. In particular, the role of a user can be assigned according to policies defined in AKENTI. In this way, access to the distributed objects can be controlled by the stakeholders.

In addition, for security-aware applications, the CORBA security service provides access to the user’s credentials, thus allowing access to the back-end resources to be controlled by the owners of the resources and not by the Gateway system, which merely forwards the credentials. 

The CORBA security service is defined as an interface, and the OMG specification is neutral with respect to the actual security technology used: it can be implemented on top of PKI technologies (such as SSL), private-key technologies (such as Kerberos), or GSS-API, to mention the most popular ones.

Distributed objects are inherently less secure than traditional client-server systems. An enhanced risk level comes, among other factors, from the fact that objects often delegate parts of their implementation to other objects (which may be dynamically composed at runtime), thus allowing objects to serve simultaneously as both clients and servers. Because of subclassing, the implementation of an object may change over time, and the original programmer neither knows nor cares about the changes. The policy of privilege delegation is therefore a very important element of system security. CORBA is very flexible here, supporting a no-delegation model (the intermediary object uses its own credentials), a simple delegation model (the intermediary object impersonates the client), and composite delegation (the intermediary object may combine its own privileges with those of the client). We follow the composite model. For security-unaware applications, we use the intersection of the client and intermediary privileges. However, if the application applies its own security measures, we make the initiator's credentials available to it.

6.4.3 Third Security Layer: Control of Access to Back-End Resources

There are no widely accepted standards for secure access to resources; different computing centers apply different technologies: SSH, SSL, Kerberos 5, or others. The design goal of the Gateway system is to preserve the autonomy of the owner of the resources to define and implement security policies. In this respect, we are in a situation similar to that of other research groups that try to provide secure access to remote resources. Our strategy is to participate in the process of defining standards within DATORR and the common Alliance PKI infrastructure. It seems that the current preference is to build future standards on top of the GSS-API specification (and thus to simultaneously support private- and public-key-based technologies). The Globus project pioneered this approach, and we therefore use Globus GRAM to provide secure access to remote resources. To get access to resources available via GRAM, the user must present a certificate signed by the Globus CA (currently an additional item in the Gateway user's set of credentials).

6.5 Consequences of This Distributed Model

The model offers a simple interface to the complete middle tier: the user uses the API of the GatewayContext object both to manage the life cycle of the nested user modules and to establish associations among them. In addition, GatewayContext is used to control and steer user modules at a fine-grained level. The complete architecture thus provides both fine-grained and coarse-grained monitoring and tuning of applications consisting of single or multiple user modules. We have arrived at what we proposed at the beginning of this thesis. Services like databases, visualization, and instrumentation can be attached to any context, so that any module in any context can access the service transparently. This transparency is achieved by treating an entire meta-application as a single unit, a hierarchical tree that composes a heterogeneous application. After the user sets up the application and initiates it, the application may be monitored both at the message level and at the lowest level, statement by statement. A generated proxy for each user module or context performs message-level steering, while line-by-line debugging is achieved through the DARP methods of GatewayContext. Notice that we have a prototype system that combines distributed-object-level computing and parallel computing with simple, high-level debugger functionality.

Managing and establishing a distributed and parallel system is performed by a high-level facility, CORBA, while high performance is delivered with the help of commodity parallel libraries such as MPI, PVM, or Globus whenever needed.


 

CHAPTER 7 Gateway Applications

 

 

We have applied Gateway to two projects: LMS (Landscape Modeling System) and QS (Quantum Simulation). I developed the Gateway infrastructure and the middle-tier backbone, and Tomasz Haupt was the principal coordinator in applying it to LMS and QS. However, I also participated in these applications by developing several modules, including the File Browser and File Transfer Progress modules.

 

 

7.1 LMS

 

7.1.1 Description of the Project

The LMS project was sponsored by the U.S. Army Corps of Engineers, Waterways Experiment Station (CEWES), Major Shared Resource Center (MSRC) at Vicksburg, MS, under the DoD HPC Modernization Program, Programming Environment and Training (PET). The pilot phase of the project can be described as follows. A decision-maker (the end user of the system) wants to evaluate changes in vegetation in some geographical region over a long period caused by some short-term disturbance such as fire or human activity. A critical parameter of the vegetation model is the conditioning of the soil at the time of the disturbance; this, in turn, may be dominated by rainfall around that time. Consequently, the implementation of this project requires:

·        Data retrieval from remote sources, including DEM (digital elevation model) data, land-use maps, soil textures, and dominant flora species and their growing characteristics, to name a few. The data are available from many sources, including public services such as USGS Web servers and proprietary databases, and come in different formats and with different spatial resolutions. Without WebFlow, the data must be prefetched manually.

·        Data preprocessing to prune and convert the raw data to a format expected by the simulation software. This preprocessing is performed interactively using the WMS [WMSWeb] (Watershed Modeling System) package.

·        Execution of two simulation programs: EDYS [WMSWeb] for vegetation simulation, including disturbances, and CASC2D [WMSWeb] for watershed simulation during rainfalls. The latter generates maps of the soil condition after the rainfall. The initial conditions for CASC2D are set by EDYS just before the rainfall event, and the output of CASC2D after the event is used to update the parameters of EDYS. We used test data sets covering time periods with at least two rainfalls; consequently, data had to be transferred between the two codes several times during one simulation. EDYS is not CPU-demanding and is implemented only for Windows 95/98/NT systems. CASC2D, on the other hand, is very computationally intensive and is typically run on powerful UNIX compute servers.

·        Visualization of the results of the simulation. Again, WMS is used for this purpose.

 

 

Figure 7.1. Logical structure of the LMS simulations implemented by this project

 

The purpose of this project was to demonstrate the feasibility of implementing a system that would allow the complete simulation to be launched and controlled from a networked laptop. We successfully implemented it using WebFlow, with WMS and EDYS encapsulated as WebFlow modules running locally on the laptop and CASC2D executed by WebFlow on remote hosts, either at CEWES in Vicksburg, MS, or at NPAC in Syracuse, NY. We demonstrated the system using a Pentium II-based laptop running Windows NT located in Washington, DC (with CASC2D running at Vicksburg), and in Orlando, FL, during Supercomputing '98 (with CASC2D running alternately at Syracuse and Vicksburg).

7.1.2 Interaction between Casc2d and Edys simulations

The casc2d [CASCD] and Edys [EDYS] codes were developed independently of each other. We cannot provide many details on these codes, as that goes beyond our expertise; please contact the authors of the codes directly for further information. The discussion presented here is rudimentary and concentrates on issues directly relevant to the WebFlow-based implementation.

Casc2d simulates watersheds. It runs in a loop over rainfall events, and in each iteration of the loop the program simulates water flow in the area of interest. Once the simulation of the rainfall event is completed (according to some predefined criteria), the simulation switches to a “dry” mode in which the program simulates the condition of soil in the absence of precipitation. A new rainfall starts a new iteration.

 

Edys simulates the evolution of vegetation, taking into account the soil condition at the beginning of the simulation. During the simulation, averaged precipitation data are used rather than data describing actual rainfall, event by event. Edys runs for a specified time period, and before it exits it saves its state on disk, which means that the simulation can be resumed later.

The accuracy of the Edys simulation can be improved by coupling it with casc2d, that is, by feeding Edys with accurate data on soil condition after each rainfall. We implemented the coupling in the following way (cf. Figure 7.2). The simulation starts with casc2d (on a host running Unix). It reads its input files and determines the time of the first rainfall. It writes the data to the disk and starts the loop over events. However, in each iteration, before proceeding with the simulations, casc2d waits until new data on the soil condition generated by Edys are available. Technically, every ten seconds it checks the modification time of its input files.* In the meantime, the data written by casc2d are sent to the host running Edys (a laptop running Windows NT). The received data include the date of the next rainfall. The Edys simulation is launched and continues until that date. The simulation program exits, and its output files are sent to the host of casc2d. Casc2d detects the arrival of the new data and resumes the simulations. As soon as the current simulation is completed, casc2d saves the results to a file and begins a new iteration.  The results are sent to the laptop, Edys is run until the next event, and its output is sent to the Unix host to let casc2d continue. This pattern is repeated until all rainfall events are processed, casc2d exits, and the final run of Edys is performed. The run terminates at a predefined date, typically 20 years after the first rainfall.


Figure 7.2. Exchange of data between casc2d (left-hand side) and Edys (right-hand side). It is important to note that casc2d is run only once. It pauses while waiting for the new data and quits only after all events are processed. In contrast, Edys is launched each time the data are needed.

 

 

Figure 7.3. WebFlow implementation of LMS

 

 

7.1.3 LMS Middle Tier

This application requires two computational modules: one encapsulating EDYS, to be run on a Windows NT box, and the other encapsulating CASC2D, to be run on a Unix workstation. Consequently, we need two application contexts, one on each machine. As usual, we also need the master server, which we place on the Windows NT box (Figure 7.3) because we run the client application there. In addition, we run Web servers on both machines; they are needed primarily to exchange data between the modules. We also publish the master IOR on the Web server on the Windows NT side.

The servers can be started manually, by feeding them the appropriate configuration files as XML documents, or through servlet calls to the Web servers on which the WebFlow servers live.

The servers are then accessed by WebFlow API calls that are part of the LMS front end. With the master and slave servers already started, the code segment in Figure 7.4 creates the state shown in Figure 7.3. First, the client initializes the ORB object (line 1) and reads an IOR of the master server from the URL (line 3). From the IOR it creates a CORBA object obj (line 4) and casts it to the correct type, WebFlowContext (line 5). Now it can call methods of this object. In lines 9 and 10 it uses the method getContext(serverName) to retrieve references to both slave servers. It adds module "runEdys" to the ntserver context (line 11) and module "runCasc2d" to the osprey4 context (line 12). In line 13 it casts object p2 to type runCasc2d, as it needs to invoke one of its methods (in line 16). Then it connects the modules: event "EdysDone," fired by module runEdys, will invoke method "runAgain()" of module runCasc2d, and event "Casc2dDone," fired by runCasc2d, will invoke method "receiveData()" of runEdys. Now everything is ready to start the execution, which is triggered by invoking method run() of module runCasc2d.

Instead of API calls (Figure 7.4), the application can be defined by a static XML document (Table 7.1), available from a Web server and instantiated dynamically at runtime. Moreover, the application can then be modified just by editing the XML file, without introducing any changes to the code.

Notice that to prepare Table 7.1 the user first passes the module IDL interface file through our IDL-to-XML translator and obtains a "properties.xml" XML document, which he includes when preparing the XML configuration file. The user manually starts only the master server, which starts the other slave servers: the master parses the document (Table 7.1) and starts any slave server specified by a WebFlowContext XML element, if necessary, by making a URL request to the address given by its servlet_href attribute.

Module runCasc2d is started from the front end by invoking its run() method, which creates a new Java thread that runs the casc2d code in a separate process. It then invokes the waitForData() method, which waits until casc2d generates the first data set for Edys, copies the files to a location seen by the Web server, and fires the event "Casc2dDone", which invokes the receiveData() method of runEdys.

 

 

 1.  ORB orb = ORB.init(args, new java.util.Properties());
 2.  String masterURL = args[0];
 3.  String ref = getIORFromURL(masterURL);
 4.  org.omg.CORBA.Object obj = orb.string_to_object(ref);
 5.  WebFlowContext master = WebFlowContextHelper.narrow(obj);
 6.  WebFlowContext slave1, slave2;
 7.  try {
 8.      org.omg.CORBA.Object p1, p2;
 9.      slave1 = WebFlowContextHelper.narrow(master.getContext("master/ntserver"));
10.      slave2 = WebFlowContextHelper.narrow(master.getContext("master/osprey4"));
11.      p1 = slave1.addNewModule("runEdys");
12.      p2 = slave2.addNewModule("runCasc2d");
13.      runCasc2d rc = runCasc2dHelper.narrow(p2);
14.      master.attachPushEvent(p1, "EdysDone", p2, "runAgain");
15.      master.attachPushEvent(p2, "Casc2dDone", p1, "receiveData");
16.      rc.run();
17.  } catch(Exception e) {};

Figure 7.4. WebFlow API calls to construct the configuration in Figure 7.3


 

<?xml version="1.0"?>
<!DOCTYPE distributed_application SYSTEM "webflow.dtd" [
    <!ENTITY propfile SYSTEM "properties.xml"> ]>
<distributed_application>
  <WebFlowContext componentID="master"
          servlet_href="http://maine.npac.syr.edu:8001/startMaster">
    <configFile>
        <master localFile="D:\Jigsaw\Jigsaw\WWW\Gateway\IOR\master.ref"
                idlFilesDir="D:\IDLS"> </master>
    </configFile>
    <WebFlowContext componentID="ntserver"
            servlet_href="http://maine.npac.syr.edu:8001/startSlave">
      <configFile>
          <slave masterURL="http://maine.npac.syr.edu:8001/Gateway/IOR/master.ref"
                 idlFilesDir="D:\IDLS"> </slave>
          <!-- include properties file -->
          &propfile;
          <implementation>
              <moduleImpl componentRef="edyModule"
                     implClass="WebFlow.lms.runEdysImpl"> </moduleImpl>
          </implementation>
      </configFile>
      <ModuleInstance>
          <componentRef>edyModule</componentRef>
          <moduleID>master/ntserver/edyModule</moduleID>
      </ModuleInstance>
    </WebFlowContext>
    <WebFlowContext componentID="osprey4"
            servlet_href="http://maine.npac.syr.edu:8001/startSlave">
      <configFile>
          <slave masterURL="http://maine.npac.syr.edu:8001/Gateway/IOR/master.ref"
                 idlFilesDir="/usr.local/haupt/IDLS"> </slave>
          <!-- include properties file -->
          &propfile;
          <implementation>
              <moduleImpl componentRef="casc2dModule"
                     implClass="WebFlow.lms.runCasc2dImpl"> </moduleImpl>
          </implementation>
      </configFile>
      <ModuleInstance>
          <componentRef>casc2dModule</componentRef>
          <moduleID>master/osprey4/casc2dModule</moduleID>
      </ModuleInstance>
    </WebFlowContext>
    <connections>
        <connect eventRef="EdysDone" typeOfConnection="push">
            <sourceObject>
                <moduleRef>master/ntserver/edyModule</moduleRef>
            </sourceObject>
            <targetObject>
                <moduleRef>master/osprey4/casc2dModule</moduleRef>
                <targetMethod>runAgain</targetMethod>
            </targetObject>
        </connect>
        <connect eventRef="Casc2dDone" typeOfConnection="push">
            <sourceObject>
                <moduleRef>master/osprey4/casc2dModule</moduleRef>
            </sourceObject>
            <targetObject>
                <moduleRef>master/ntserver/edyModule</moduleRef>
                <targetMethod>receiveData</targetMethod>
            </targetObject>
        </connect>
    </connections>
  </WebFlowContext>
</distributed_application>

Table 7.1. Abstract job specification in an XML document for the configuration in Figure 7.3


 

When Edys fires the "EdysDone" event, the runAgain() method of runCasc2d is invoked. This method receives data from Edys (using the Java URLConnection class to access files from the Web server on the Edys host) and executes the UNIX touch command on a selected control file. "Touching" this file changes its 'last modified' property, which triggers casc2d to resume its operations. Control then passes to the waitForData() method, described above.

The runEdys module is triggered by the Casc2dDone event, which invokes the receiveData() method. First, a file options.txt is generated. This file defines the input parameters: the start date of the simulation (StartDay), the number of days to be run (DayDiff), parameter 3, a toggle that switches the Edys visualizations on and off, and parameter 4, which defines disturbances, if any. Then the data from the Web server of the casc2d host are downloaded. Finally, the Edys code is launched. After it completes, its output is copied to a location seen by the Web server and the "EdysDone" event is fired.

The sendData() method (which is identical in both modules) actually does not send any data. Instead, it copies data from the Edys (or casc2d) working directory to a document directory of the Web server. This step could be avoided by letting the codes write and read their input and output files directly from the Web server, but that would require slight modifications of the codes, and we had no access to their sources.

 

7.1.4 LMS Back End

This pilot implementation of the Web-based LMS does not require any powerful computational resources, so we provided only limited support for back-end services. In particular, we simply used the Java Runtime class to run the WebFlow modules on the same host on which the WebFlow server runs. As discussed in Section 7.3, we are prepared to provide secure access to remote, high-performance resources when needed.

7.1.5 LMS Front End

For this project we developed a custom front end implemented as a Java application (as opposed to a Web-accessible Java applet). There are several reasons for this choice. First, we were explicitly asked to do so. Second, we did it for performance reasons. Finally, the front end is an extension of the WMS system, which must be installed on the client side anyway; it therefore matters little whether the extensions to WMS are downloaded as applets each time LMS is run or are downloaded once and stored permanently on the client machine.

 



Figure 7.5. LMS front-end main panel

 

The WMS program (Watershed Modeling System) is a rich collection of tools for pre- and post-processing data. Furthermore, it allows us to run the simulation locally and visualize the results. WMS is available on many platforms, including Windows 95/98/NT and numerous varieties of Unix.

We made WMS the centerpiece of our front end. We enhanced it by providing the capability to import raw data sets directly from the Internet, to submit the simulations to remote hosts and, as described above, to make different simulations interact with each other. Consequently, the LMS front end consists of three parts: the data wizard, WMS, and job submission. Each part is accessible by pressing the corresponding button on the LMS main panel (Figure 7.5).

7.1.6 Data Wizard

The data wizard panel (Figure 7.8) allows us to select the data type to be retrieved (currently, we support DEM and Land Use maps) and to define the region of interest. This can be done either by directly typing coordinates of the bounding box into the provided text fields, or by drawing boundaries of the region on a map. In the latter case, the position of the rectangle is automatically translated into coordinates.

Figure 7.8. LMS front-end data wizard panel

Next, the coordinates are translated into names of the corresponding set of maps available from the USGS web site, and the selected maps are downloaded, uncompressed, and saved in a directory accessible by the WMS package.

 

 

7.1.7 WMS

The WMS button on the main panel starts the WMS program on the local host, using the Java Runtime class in a separate thread. The WMS controls are available during the entire LMS session.

Figure 7.6. A screen dump of a WMS session: the just-downloaded DEM data are displayed in the central window. The raw data must now be pre-processed, including selecting a watershed region, smoothing, and format conversion.


7.1.8 Simulations

 

The functionality of this part can be deduced from the above. Figure 7.7 shows the front-end panel that is used to start the simulations.

Figure 7.7. LMS front-end simulation panel.

 

As shown in Figure 7.7, the controls allow the user to select the simulation mode: casc2d alone, Edys alone, or both simulations coupled, as described in Section 7.1.2. In the latter case the user also provides the end date of the Edys run, the Edys visualization toggle and disturbances, and the directory that contains the casc2d input data files; finally, the user selects the host on which the casc2d simulation is to be run. This part of the front end acts as a client to the middle-tier services, and it communicates with the WebFlow servers using CORBA IIOP.

7.2 Quantum Simulation (QS)

 

As a test application for WebFlow we selected Quantum Simulations [QSWeb]. The motivation for designing QS algorithms arises from the computational complexity of solving the Schrodinger equation for systems of many electrons to calculate the electronic structure, as opposed to methods that expand the wave function in a basis. By complexity we mean simply the computation time needed to obtain some property of a system to some specified absolute error, where the error must be a "true" error combining the systematic and statistical errors. To control the errors accurately we cannot use methods such as density functional theory within the local density approximation (LDA) or the generalized gradient approximation (GGA); we have to use algorithms based on a complete representation of the many-body wave function, which have computation time exponential in the number of electrons. One example algorithm, configuration interaction (CI), expands the wave function in Slater determinants of one-body orbitals. Whenever an atom is added to the system, an additional number of molecular orbitals must be considered, and the total number of determinants needed to reach chemical accuracy is multiplied by this factor; the result is a running time exponential in the number of electrons. Quantum simulation instead constructs the wave function (or N-body density function) by sampling it, and therefore does not need its value everywhere; the complexity then usually grows as some power (1-4) of the number of particles. We still have to use parallel machines to solve QS, evenly distributing the computational load across processors.

 

This application can be characterized as follows: a chain of high-performance applications (commercial packages such as GAUSSIAN or GAMESS, as well as custom-developed codes) is run repeatedly for different data sets. Each application can be run on several different (multiprocessor) platforms and, consequently, input and output files must be moved between machines. Output files are visually inspected by the researcher; if necessary, applications are rerun with modified input parameters. The output file of one application in the chain is, after a suitable format conversion, the input of the next one. The logical structure of the application is shown in Figure 7.8.


 

Figure 7.8. Logical structure of the Quantum Simulation application

This example meta-application demonstrates the strength of our WebFlow approach. The WebFlow editor provides an intuitive environment in which to visually compose the chain of data-flow computations from preexisting modules. The modules encapsulate many different classes of applications, from massively parallel codes to custom-developed auxiliary programs to commodity commercial packages (such as DBMSs or visualization packages). The seamless integration of such heterogeneous software components is achieved by employing distributed-object technologies in the middle tier. The high-performance part of the back-end tier is implemented using the GLOBUS toolkit: in particular, we use MDS (Metacomputing Directory Service) to identify resources, GRAM (Globus Resource Allocation Manager) to allocate resources, including mutual, SSL-based authentication, and GASS (Global Access to Secondary Storage) for high-performance data transfer. The high-performance part of the back end is augmented with a commodity DBMS (servicing the Permanent Object Manager) and an LDAP-based custom directory service to maintain the geographically distributed data files generated by the Quantum Simulation project.

The diagram illustrating the WebFlow implementation of the Quantum Simulation is shown in Figure 7.9.

 

Figure 7.9. WebFlow implementation of the Quantum Simulations problem

WebFlow can be applied to many different applications. At Supercomputing '97 [DARPSC97] we demonstrated two other applications, one of which was an AVS-like [AVSWeb] image-processing application. There we took advantage of our platform-independent design (implemented in Java) to integrate computations running on both UNIX and Windows NT systems. The other application employed an HPF-based back end: we demonstrated the DARP system [DARP98ACMJava], an integrated environment for compiled and interpreted HPF, wrapped as a WebFlow module. A typical WebFlow visual graph of a Quantum Simulation application is shown in Figure 7.10.

 

Figure 7.10. Example WebFlow session of QS


CHAPTER 8 - Conclusions

 

We have developed a platform-independent, three-tiered system with visual authoring tools implemented in the front end and integrated with a middle-tier network of servers. It is based on industry standards and follows a distributed-object paradigm, facilitating a seamless integration of commodity software components. In particular, we used Gateway as a high-level, visual user interface for GLOBUS, which not only makes the construction of a meta-application much easier for an end user, but also allows this state-of-the-art HPCC environment to be combined with commercial software that includes packages available only on Intel-based personal computers.

We used Gateway to provide seamless access to remote resources for the two applications that required it. For LMS we used Gateway to retrieve data from many different sources as well as to allocate the remote computational resources needed to solve the problem at hand; Gateway transparently controls the necessary data transfer between hosts. Quantum Simulations requires access to HPCC resources, and we therefore layered Gateway on top of the Globus metacomputing toolkit. We admit that Gateway does not yet comprise a complete solution for seamless access to remote resources, and many issues, notably security, still remain to be solved. We expect to leverage the recent DATORR (Desktop Access to Remote Resources) initiative of the Java Grande Forum when tackling these issues.

Exploiting our experience in developing the WebFlow system, we designed and implemented a new system, Gateway, to provide seamless and secure access to computational resources at ASC MSRC. While preserving the original three-tiered architecture, we re-engineered the implementation of each tier in order to conform strictly to the standards. In particular, we used CORBA and the Enterprise JavaBeans model to build the new middle tier, which facilitates the seamless integration of commodity software components; database connectivity is a typical example of such a component. The most distinctive feature of the Gateway system, however, is that we apply the same commodity-components strategy to incorporate HPCC systems into the Gateway architecture. By implementing the emerging standard interface for metacomputing services, as defined by DATORR, we provide uniform and secure access to high-performance resources. Similarly, by conforming to the Abstract Task Descriptor specification we enable the seamless integration of many different front-end visual-authoring tools.

In addition to composing meta-applications with our visual tools, we provided fine-grained control, through DARP, over each HPF-written component of a meta-application; we successfully incorporated DARP into the Gateway system. By reusing commodity components and technologies we built DARP, a powerful tool for data analysis and rapid prototyping, for the HPF application developer. The most important feature of the system is interactive access to distributed data, which, in turn, makes it possible to select and send data to a visualization system at an arbitrary point of the application execution. Also, data can be modified using either native HPF commands or dynamically linked computational modules. If the user spots a suspect place in an application, he can suspend the execution there and examine it in detail by using the DARP control and prototyping commands.

Consistently with our HPcc strategy, the DARP system implements a three-tiered architecture: the Java front end holds proxy objects produced by an HPF front end operating on the back-end code. These proxy objects can be manipulated with an interpreted Web client interacting dynamically with compiled code through a typical tier-2 server (middleware). Although targeted for an HPF back end, the system's architecture is independent of the back-end language and can be extended to support other high-performance languages such as HPC++ or HPJava by replacing the HPF parser with a parser for the target language. Finally, since we followed a distributed-objects approach, the DARP system can easily be incorporated into a collaboratory environment such as Tango or Habanero. Other solutions for client-side collaboration are explained in Section 8.1.

By adopting the architecture of commodity systems, we have made it easier to track their rapid evolution, and we expect this to give high functionality to HPCC systems. Whenever high performance is needed, we delegate the task to the Globus-based high-performance back end. As the underlying commodity technology evolved from the servlet- and socket-based framework of WebFlow to the distributed-object architecture of Gateway, we preserved high performance; we strongly believe that this is a consequence of our three-tiered commodity architecture. The same scenario occurred in the DARP system: as we reengineered DARP from its client-server model to couple it tightly with the Gateway middle tier, we lost nothing in performance but gained much in flexibility and functionality.

From the user’s point of view, an XML-based abstract job specification makes it easy to establish a distributed application. In addition, after a user creates and starts an application, it can be saved in its complete state in an XML document. By “state,” we mean the attributes of the user modules, as declared in their IDL interfaces, together with the hierarchy of the distributed objects and the connections among them. A user can later restore a saved application and may edit the attributes of the modules before the next startup.
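
To illustrate, a saved application might be rendered along the following lines. This is a minimal sketch; the element names, attributes, and values are hypothetical and do not reproduce the actual Gateway document type:

    <application name="lms-demo">
      <context id="user1">
        <module id="dataWizard" idl="DataRetrieval">
          <attribute name="server" value="http://wms.host.gov/data"/>
        </module>
        <module id="casc2d" idl="Casc2D">
          <attribute name="iterations" value="100"/>
        </module>
        <!-- connections record the hierarchy of objects and their links -->
        <connection from="dataWizard.output" to="casc2d.input"/>
      </context>
    </application>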

In addition to using XML to implement the persistency model, we also have the facility of storing the distributed state in a database. We used the JDBC connection protocol to carry out the saving and restoring of a complete picture of the current user session.
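
A minimal Java sketch of this mechanism is given below. The MODULE_STATE table, its columns, and the constructor arguments are assumptions made for illustration; the actual Gateway schema differs in detail:

    import java.sql.*;

    // Saves and restores the XML state of user modules through JDBC.
    public class SessionStore {
        private final Connection con;

        public SessionStore(String driver, String url, String user, String pwd)
                throws Exception {
            Class.forName(driver);  // load the JDBC driver, e.g. the Oracle driver
            con = DriverManager.getConnection(url, user, pwd);
        }

        // Save the XML description of one module's state under a session id.
        public void save(String session, String module, String stateXml)
                throws SQLException {
            PreparedStatement ps = con.prepareStatement(
                "INSERT INTO MODULE_STATE (SESSION_ID, MODULE_ID, STATE) VALUES (?, ?, ?)");
            ps.setString(1, session);
            ps.setString(2, module);
            ps.setString(3, stateXml);
            ps.executeUpdate();
            ps.close();
        }

        // Restore the saved state of one module of a session, or null if absent.
        public String restore(String session, String module) throws SQLException {
            PreparedStatement ps = con.prepareStatement(
                "SELECT STATE FROM MODULE_STATE WHERE SESSION_ID = ? AND MODULE_ID = ?");
            ps.setString(1, session);
            ps.setString(2, module);
            ResultSet rs = ps.executeQuery();
            String state = rs.next() ? rs.getString(1) : null;
            rs.close();
            ps.close();
            return state;
        }
    }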

There is a dependency on the SAGE++ product for instrumenting the user-module source code. Because parsers are available for most languages, the DARP system can be extended to support the C++ and Java languages by annotating the YACC parser of the language to be instrumented in the same way; this would remove the SAGE++ dependency.

In the integrated environment of Gateway with DARP, it is possible to establish associations among completely different applications, in addition to establishing connections among individual modules. For these cases, DARP and Gateway are capable of extracting and collecting data from the application modules and sending it to another application via high-performance Globus communication channels, TCP/IP, CORBA IIOP, or PAWS [PAWSWeb].

The Gateway architecture allows us to put security and transaction protocols into the proxy code. We already have a simple transaction mechanism that allows the proxy to intercept a request before it is delivered to the user module: the proxy code saves or restores the user properties of the back-end module through the JDBC connection to the Oracle database. Therefore, Gateway can be extended into an EJB-like development environment. By using the newly emerging CORBA facility POA (Portable Object Adapter), Gateway can gain more robust and dynamic behavior.

8.1 A Possible Framework for a Client-Side Collaborative Environment

There are two ways to achieve the sharing of and interaction with front-end pages. One is to use the powerful event mechanism of JavaScript 1.3; the other is to implement the front end as JavaBeans components.

With the new event model, a window or document object can intercept events generated by nested JavaScript objects such as Button, TextField, etc. By putting any JavaScript page into a virtual JavaScript window, we can catch all events fired by components of the original page. We can thus establish a collaborative environment in which the captured events are sent to the Gateway middle tier, which keeps them until the other collaborators explicitly pull them in an asynchronous way. Each virtual JavaScript window object in the collaborative environment parses a received event and routes it by calling the routeEvent method of the window object. Also, just as we establish connections between back-end user modules, we can create connections between front-end pages; again, the Gateway middle tier can easily be employed for this purpose. In this case, the translation table incorporated into Gateway holds the identifications of front-end pages instead of the assigned names of user modules, and the rest of the parameters for attaching two front-end pages are the same. Note that the front-end pages behave as clients, as opposed to the back-end user modules, which work as CORBA servers. Because of this, we have to use the pull type of the Gateway event-attaching model to connect front-end pages.
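
On the middle-tier side, this pull model can be pictured as a set of per-page event queues. The following Java sketch is a deliberately simplified illustration; the class and method names are ours, and the real Gateway event service is CORBA-based rather than a local object:

    import java.util.*;

    // Hypothetical middle-tier queue: front-end pages push captured events,
    // and each collaborator pulls them explicitly (the pull model).
    public class CollaborationQueue {
        private final Map queues = new HashMap();  // page id -> pending events

        public synchronized void register(String pageId) {
            queues.put(pageId, new LinkedList());
        }

        // A captured front-end event, e.g. "button1:click", is queued
        // for every collaborating page except its originator.
        public synchronized void push(String fromPage, String event) {
            for (Iterator it = queues.keySet().iterator(); it.hasNext();) {
                String pageId = (String) it.next();
                if (!pageId.equals(fromPage)) {
                    ((List) queues.get(pageId)).add(event);
                }
            }
        }

        // Collaborators pull asynchronously; each call drains that page's queue.
        public synchronized List pull(String pageId) {
            List pending = (List) queues.get(pageId);
            List result = new LinkedList(pending);
            pending.clear();
            return result;
        }
    }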

The second method of collaboration is to define the front end as JavaBeans components. The Gateway middle tier again behaves as the centralized place to which the events fired by front-end pages arrive, and the other collaborators explicitly pull the received events and update their pages; the event transport mechanism is hidden from users. Again, we put the user’s JavaBeans-defined page into a virtual JavaScript or Java window that has the same behavior as defined above. We can use the LiveConnect facility of Netscape if the window is JavaScript-based; otherwise, we define a Java-based virtual window with the same capability as a JavaScript-based one.
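
For the JavaBeans variant, the virtual window reduces to a listener that forwards bean events to the middle tier. The sketch below reuses the hypothetical CollaborationQueue above and assumes the front-end beans expose bound properties:

    import java.beans.PropertyChangeEvent;
    import java.beans.PropertyChangeListener;

    // Hypothetical virtual window: forwards property changes of front-end
    // beans to the middle tier, where collaborators pull them later.
    public class VirtualWindow implements PropertyChangeListener {
        private final String pageId;
        private final CollaborationQueue queue;

        public VirtualWindow(String pageId, CollaborationQueue queue) {
            this.pageId = pageId;
            this.queue = queue;
        }

        public void propertyChange(PropertyChangeEvent e) {
            // Encode the event as a string for brevity; a real system
            // would marshal the event properly.
            queue.push(pageId, e.getPropertyName() + "=" + e.getNewValue());
        }
    }

A bean is attached with bean.addPropertyChangeListener(new VirtualWindow(pageId, queue)), and a collaborator applies the pulled events to the corresponding beans of its own page.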

8.2 How Do User Modules Written in COM Interact with Ones Written in CORBA?

We propose a framework in which a user’s COM modules can work together with CORBA modules. As previously noted, we create a proxy on the host of the master Gateway server for each instantiated user module, and all of the interactions between user modules and GatewayContext objects pass through these proxy objects. We strongly believe that such a proxy can be written as a mixture of CORBA and COM code: it can implement both the DynamicImplementation interface of CORBA and the IDispatch interface of COM, i.e., it will support COM Automation. Thus, the proxy will function as both a CORBA object and a COM component.
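
The sketch below shows only the CORBA half of such a dual proxy, written against the CORBA Dynamic Skeleton Interface; the repository id and the native comDispatch bridge to COM's IDispatch are hypothetical placeholders for the COM half:

    import org.omg.CORBA.ServerRequest;
    import org.omg.PortableServer.DynamicImplementation;
    import org.omg.PortableServer.POA;

    // Sketch of a bridge proxy: CORBA requests arrive through the DSI
    // and are delegated to a COM component via Automation (IDispatch).
    public class BridgeProxy extends DynamicImplementation {

        public String[] _all_interfaces(POA poa, byte[] objectId) {
            // Repository id of the module interface this proxy stands in for.
            return new String[] { "IDL:gateway/UserModule:1.0" };
        }

        public void invoke(ServerRequest request) {
            // Forward the operation name and arguments to the COM object's
            // IDispatch::Invoke through a hypothetical native bridge.
            comDispatch(request.operation(), request);
        }

        private native void comDispatch(String operation, ServerRequest request);
    }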

8.3 Gateway Can Act as a Firewall to Remote Objects

Another use for Gateway is as a firewall. We can base an application-level firewall on the generated proxies that relay GIOP messages between users and their back-end modules. Recall that our user modules always interact with each other through proxies, so that we have a two-way communication mechanism, including callbacks, between any two modules. Because we can insert a security protocol into the proxy, our firewall already has a secure channel. The security and transaction protocols for user modules can be handled transparently by the Gateway middle tier.

We can configure the proxy generated for each user module to realize two styles of connection through a GIOP proxy: normal and passthrough. In a normal connection the proxy can monitor the GIOP traffic, which raises two security issues. First, a client may not want the proxy to examine the traffic. Second, the client and server may be using an authentication and/or encryption mechanism that is unknown to the proxy. Both of these cases can be addressed by a passthrough connection, which simply forwards all of the GIOP messages it receives to the appropriate party.
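
In outline, a passthrough connection is nothing more than a byte-level relay. The Java sketch below is a minimal illustration with assumed host and port values; a production GIOP proxy would at least parse GIOP message headers to manage connection lifetimes:

    import java.io.*;
    import java.net.*;

    // Minimal passthrough relay: copies raw GIOP traffic between a client
    // and a back-end module without inspecting or altering it.
    public class PassthroughProxy {
        public static void main(String[] args) throws IOException {
            ServerSocket listener = new ServerSocket(9000);    // proxy port (assumed)
            Socket client = listener.accept();                 // one connection, for brevity
            Socket server = new Socket("backend.host", 9001);  // module host/port (assumed)
            pump(client.getInputStream(), server.getOutputStream());
            pump(server.getInputStream(), client.getOutputStream());
        }

        // Copy one direction of the conversation on its own thread.
        static void pump(final InputStream in, final OutputStream out) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        byte[] buf = new byte[4096];
                        int n;
                        while ((n = in.read(buf)) != -1) {
                            out.write(buf, 0, n);
                            out.flush();
                        }
                    } catch (IOException e) { /* connection closed */ }
                }
            }).start();
        }
    }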

8.4 Comparisons with Other Component Models

There are three enterprise component models, the CORBA Component Model (CCM), Enterprise JavaBeans v1.1 (EJB), and DCOM/COM, as well as an ongoing research effort, the DOE Common Component Architecture (DCCA). The CCM specification was released first, and EJB has adopted a very similar architecture, except that EJB is designed specifically for the Java language. Currently, only the COM model and early versions of EJB have been implemented; therefore, we compare Gateway with these models on the basis of their current specifications.

COM, CCM, DCCA, and Gateway have event models, but EJB does not. CCM uses a subset of CORBA’s Notification Service, an extended version of the earlier CORBA Event Service. Gateway has an event service similar to that of CCM. DCCA adopted the “provides/uses interface” rule of CCM, where the “provides” interface acts as an event listener and the “uses” interface acts as an event publisher. COM has the same “provides/uses” concept for communicating events between COM objects.

CCM, EJB, and Gateway use XML for persistency, while DCCA does not yet have a persistency model. Because COM is a binary-standard model, it saves components in binary format.

CCM and EJB have a home interface for creating and finding entity objects, but not session objects; for entity objects, the user has to implement this interface with a naming service or with JNDI. Gateway already has naming-service functionality incorporated into the core Gateway code. CCM and EJB are not able to impose a hierarchy on several containers and nested components, but one of the most important aspects of Gateway is that it permits the user to view many components running at different locations as a single component and to look up any one of them. COM has Monikers as a naming service, but DCCA has no naming-service concept.

Gateway, CCM, and EJB all have the capability of intercepting user requests before delivering them to the real remote object. CCM and EJB have container tools that generate a specific proxy object for each component, whereas Gateway has a generic proxy for all components. COM offers similar functionality in its specialized transaction monitor, MTS. DCCA does not have this property yet.

CCM and EJB have to work with a database to keep objects persistent, but Gateway can work with either a database or XML files, and the two formats can be translated into each other to save the state of objects. We have not modified IDL syntax for Gateway, but EJB, CCM, and DCCA need to extend the base IDL grammar. COM already has a mature IDL syntax.

Gateway, CCM, and EJB have the capability of assigning attributes to components by defining them in IDL and giving them initial values in XML documents. DCCA does not yet have this functionality.

Gateway has a clean architecture for adding user-specific service objects. As explained in previous chapters, a Gateway service is nothing more than an object; therefore, we add a service to a Gateway context just as we add a module, except that the parent context has to inform all the registered objects waiting for this service by calling their “serviceAvailable” method. Finding a service is a bit different: the user asks a Gateway context object to find any object reachable from this context in the object tree, and in this way can find a service attached to any part of the tree. COM, EJB, and DCCA do not have this functionality.
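
The mechanism can be outlined as follows. Apart from the serviceAvailable method named above, the interface and method names in this Java sketch are illustrative rather than the actual Gateway API, and the tree search is reduced to a local lookup:

    import java.util.*;

    // A service is just another object attached to a context; interested
    // objects are told when it appears.
    interface ServiceListener {
        void serviceAvailable(String name, Object service);
    }

    class GatewayContext {
        private final Map services = new HashMap();     // name -> service object
        private final List listeners = new LinkedList();

        void addServiceListener(ServiceListener l) { listeners.add(l); }

        // Adding a service is like adding a module, except that the context
        // notifies every registered object waiting for this service.
        void addService(String name, Object service) {
            services.put(name, service);
            for (Iterator it = listeners.iterator(); it.hasNext();) {
                ((ServiceListener) it.next()).serviceAvailable(name, service);
            }
        }

        // Finding a service asks the context for any object reachable from
        // it in the object tree; reduced here to a local lookup.
        Object findService(String name) { return services.get(name); }
    }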

 

8.5 Lessons Learned from Gateway

Because Gateway is a component-based development (CBD) environment, it provides the user with many properties of CBD that furnish the following benefits:

1. Fast development: programmers can build solutions faster by assembling software from pre-built parts.

2. Lower integration costs: a common set of interfaces for software programs from different vendors means that less custom work is required to integrate components into complete solutions.

3. Improved deployment flexibility: a software solution can be customized for different areas of a company simply by changing some of the components in the overall application.

4. Lower maintenance costs: isolating software functions into discrete components provides a low-cost, efficient mechanism with which to upgrade a component without having to retrofit the entire application.

 

We have demonstrated Gateway with the adaptation of legacy software applications such as QS and LMS. Gateway is a highly generic, multi-purpose, and extendible component model that realizes a platform-independent, three-tiered system and can be customized for different environments such as telecommunications, e-commerce, etc. A user who wants to develop a package from scratch with Gateway will first break up the solution to a problem into simple components and their information-exchange mechanisms. Ready-to-use components will then be employed where possible, and the user's own new components may be written as needed. After that, Gateway can be used to configure, deploy, and connect the components. In this way, complex HPCC packages can be developed and deployed by project members working concurrently; thus HPCC can have quick development and deployment with lower maintenance costs. If a monolithic software approach were taken instead, the integration, maintenance, and customization of different software modules would be difficult, because the modules would be very large and hard to handle. Using Gateway for simple projects might not be of much benefit, but for large projects it will dramatically decrease the time required for development and deployment. Although training people is very important for selling a product, even non-technical people can use Gateway simply by writing an XML document to customize and define a distributed application.

 

Gateway will need a visual development environment like the BDK (the JavaBeans development kit). Currently, it has some browser-based interfaces for customizing and running specific projects across the Internet under the Gateway system; we need a more generalized interface for composing distributed applications visually. We may use VGJ [VGJWeb] or GEF [GEFWeb] to achieve this goal, or we can create our own visual front end. To support such an interface, Gateway provides the properties, fired events, and other related information of each user module as an XML document. All of this information is used by the interface to configure individual modules and to connect them, through event and property binding, into complex distributed applications composed from the modules.


 

REFERENCES

 

 

[ABCH95pooma]    S. Atlas, S. Banerjee, J. Cummings, P. Hinker, M. Srikant, J. Reynders, and M. Tholburn, “POOMA: A High Performance Distributed Simulation Environment for Scientific Applications,” Proceedings of Supercomputing '95, San Diego, CA,    December 1995.

[AKENTIWeb] S. S. Mudumbai, W. Johnston, M. R. Thompson, A. Essiari, G. Hoo, K. Jackson, Akenti – A Distributed Access Control  System, home page:  http://www-itg.lbl.gov/Akenti

[AM94poet] R. Armstrong and J. Macfarlane, “The Use of Frameworks for Scientific Computation in a Parallel Distributed Environment,” Proceedings of the 3rd IEEE Symposium on High Performance Distributed Computing, San Francisco, CA, August 1994, pp.    15-25.

[APACHEWeb] Home page: http://www.apache.org

[ASWB95zoom]  C. Anglano, J. Schopf, R. Wolski, and F. Berman, Zoom, “A Hierarchical Representation for Heterogeneous Applications,” University of California, San Diego, Department of Computer Science and Engineering, Technical Report CS95-451, January  1995.

[AVSWeb] Advanced Visualization System, http://www.avs.com/

[BBB96atlas] J. Baldeschweiler, R. Blumofe, and E. Brewer, “ATLAS: An Infrastructure for Global Computing,” Proceedings of the Seventh  ACM SIGOPS European Workshop: Systems Support for Worldwide Applications, Connemara, Ireland, September 1996.

[BDGM96hence]   A. Beguelin, J. Dongarra, A. Geist, R. Manchek, K. Moore, and V. Sunderam, “Tools for Heterogeneous Network Computing,” Proceedings of the SIAM Conference on Parallel Computing, 1993.

[BM95hetero] F. Berman and R. Moore, eds., “Heterogeneous Computing Environments,” Working Group 9 Report from the Proceedings of the 2nd Pasadena Workshop on System Software and Tools for High Performance Computing Environments, January 1995.

[BRRM95gems]  B. Bruegge, E. Riedel, A. Russell, and G. McRae, “Developing GEMS: An Environmental Modeling System,” IEEE  Computational Science and Engineering, Vol. 2, No. 3, Fall 1995, pp. 55-68.

[CASCD] Fred Ogden, University of  Connecticut, ogden@eng2.uconn.edu

[CD96netsolve]  H. Casanova and J. Dongarra, “Netsolve: A Network Server for Solving Computational Science Problems,” Proceedings of Supercomputing '96, Pittsburgh, PA, November 1996.

[CDHH96toomo]  J. Cuny, R. Dunn, S. Hackstadt, C. Harrop, H. Hersey, A. Malony, and D. Toomey, “Building Domain-Specific Environments  for Computational Science: A Case Study in Seismic Tomography,” International Journal of Supercomputing Applications and  High Performance Computing, Vol. 11, No. 3, Fall 1997.

[CDK94ds] G. Coulouris, J. Dollimore, and T. Kindberg, Distributed Systems: Concepts and Design, 2nd Edition, Addison-Wesley, Inc., New York, NY, 1994.

[CH94p2d2]   D. Cheng and R. Hood, “A Portable Debugger for Parallel and Distributed Programs,” Proceedings of Supercomputing '94, Washington, D.C., November 1994, pp. 723-732.

[COMWeb] COM Home Page http://www.microsoft.com/com

[CumulvsWeb] James Arthur Kohl and Philip M. Papadopoulos, "The Design of CUMULVS: Philosophy and Implementation," in PVM User's Group Meeting, Feb 1996. http://www.epm.ornl.gov/cs/cumulvs.html

[DARP98ACMJava] Erol Akarsu, Tomasz Haupt and G. Fox, DARP: Java-based Data Analysis and Rapid Prototyping Environment for Distributed High Performance  Computations, ACM 1998 Workshop on Java for High-Performance Network  Computing.

[DARP98Conc] Erol Akarsu, Tomasz Haupt and G. Fox, DARP: Java-based Data Analysis and Rapid Prototyping Environment for Distributed High Performance Computations, Concurrency: Practice and Experience, Vol. 10(1), 1-9 (1998).

[DARPSC97] G. Fox, W. Furmanski and T. Haupt, SC97 handout: High Performance Commodity Computing (HPcc), http://www.npac.syr.edu/users/haupt/SC97/HPccdemos.html

[DARPWeb] E. Akarsu, G. Fox, T. Haupt, "The DARP System," http://www.npac.syr.edu/users/haupt/HPFI

[DiscWHPCN97] K.A.Hawick, H.A.James, C.J.Patten and F.A.Vaughan, “DISCWorld: A Distributed High Performance Computing Environment,” Proc. of High Performance Computing and Networks (HPCN) Europe '98, Amsterdam, April 1998.

[EDYS] EDYS was written by Michael Childress, Shepherd Miller, Inc., mchildress@shepmill.com

[EJBWeb] Enterprise JavaBeans http://java.sun.com/products/ejb/

[ETM95taxonomy] I. Ekmecic, I. Tartalja, and V. Milutinovic, “EM3: A Taxonomy of Heterogeneous Computing Systems,” IEEE Computer, Vol. 28, No. 12, December 1995, pp. 68-70.

[FalconWeb] Karsten Schwan, John Stasko, Greg Eisenhauer, Weiming Gu, Eileen Kraemer, Vernard Martin, and Jeff Vetter, "The Falcon Monitoring and Steering System," 1996, http://www.cc.gatech.edu/systems/projects/FALCON/.

[FK96globus]    I. Foster and C. Kesselman, “Globus: A Metacomputing Infrastructure Toolkit,” Proceedings of the Workshop on Environments and Tools for Parallel Scientific Computing, Lyon, France, August 1996.

[Fox96] G.C. Fox, “An Application Perspective on High Performance Computing and Communications,” Technical Report, Northeast Parallel Architectures Center, Syracuse University, 1995.

[FT96nexusjava]   I. Foster and S. Tuecke, “Enabling Technologies for Web-Based Ubiquitous Supercomputing,” Proceedings of the 5th IEEE Symposium on High Performance Distributed Computing, Syracuse, New York, August 1996.

[GatewayGrande99te] Tomasz Haupt, Erol Akarsu, and G. Fox, The Gateway System: Uniform Web Based Access to Remote Resources, ACM 1999 Java Grande Conference    

[GatewayHPDC99et] Erol Akarsu, Geoffrey Fox, Tomasz Haupt, Using Gateway System to Provide a Desktop Access to High Performance Computational Resources, HPDC-8, 1999.

[GEFWeb] GEF: The Graph Editing Framework home page: http://www.ics.uci.edu/pub/arch/gef/

[GHR94pse] E. Gallopoulos, E. Houstis, and J. Rice, “Computer as Thinker/Doer: Problem-Solving Environments for Computational Science,” IEEE Computational Science and Engineering, Vol. 1, No. 2, Summer 1994, pp. 11-23.

[GlobusIntlJ97] I. Foster and C. Kesselman, “Globus: A Metacomputing Infrastructure Toolkit,” Int'l J. Supercomputer Applications, 11(2):115-128, 1997.

[GlobusWeb] I. Foster, C. Kesselman, "Globus", http://www.globus.org/

[GNW95legion] A. Grimshaw, A. Nguyen-Tuong, and W. Wulf, “Campus-Wide Computing: Results Using Legion,” University of Virginia Computer Science Department, Technical Report CS-95-19, March 1995.

[GSSAPIWeb] RFC 1508, RFC 2078

[HabaneroWeb] “NCSA Habanero,” http://www.ncsa.uiuc.edu/SDG/Software/Habanero/index.html

[Hood96p2d2]  R. Hood, “The p2d2 Project: Building a Portable Distributed Debugger,” Proceedings of ACM SIGMETRICS Symposium on Parallel and Distributed Tools (SPDT '96), Philadelphia, PA, May 1996.

[HPC++Web] D. Gannon et al., "HPC++", http://www.extreme.indiana.edu/sage

[HPCCEuroPar98Gf] G.C.Fox, W. Furmanski, T. Haupt, E. Akarsu, and H. T. Ozdemir, ”HPcc as High Performance Commodity Computing on top of integrated Java, CORBA, COM and Web  standards,” Proc. Of Euro-Par '98, Aug 1998

[HPccGridBook] G. Fox and W. Furmanski, "HPcc as High Performance Commodity Computing," Chapter 10 in I. Foster and C. Kesselman, eds., The Grid: Blueprint for a New Computing Infrastructure, pp. 238-255.

[HpccWeb] G. Fox, W. Furmanski, "HPcc as High Performance Commodity Computing," http://www.npac.syr.edu/users/gcf/hpdcbook/HPcc.html

[HPDForumWeb] High Performance Debugging Forum, http://www.ptools.org/hpdf/

[DAQV96Sth] Steven T. Hackstadt and Allen D. Malony, "Distributed Array Query and Visualization for High Performance Fortran," in Proc. of Euro-Par '96, Aug 1996, http://www.cs.uoregon.edu/~hacks/research/daqv/.

[HPFfeWeb] Guansong Zhang et al., "The HPF frontEnd system," http://www.npac.syr.edu/users/zgs/frontEnd/

[HPJavaACM98] Bryan Carpenter, Guansong Zhang, Geoffrey Fox, Xinying Li, and Yuhong Wen, “HPJava: Data parallel extensions to Java,” ACM 1998 Workshop on Java for High-Performance Network Computing, Palo Alto, California. Concurrency: Practice and Experience, 10(11-13):873-877, 1998.

[HRJW95mpse]    E. Houstis, J. Rice, A. Joshi, S. Weerawarana, E. Sacks, V. Rego, N. Wang, C. Takoudis, A. Sameh, and E. Gallopoulos, “MPSE: Multidisciplinary Problem Solving Environments,” Purdue University, Department of Computer Sciences, Technical Report CSD-TR-95-047, 1995.

[JavaBeansWeb] JavaBeans Component Architecture by SUN, www.javasoft.com/beans/index.html

[JGrandeWeb] Java Grande Forum, home page: http://www.javagrande.org

[JigsawWeb] Jigsaw home page: http://www.w3.org/Jigsaw/

[JINIWeb] The SUN Jini Technology, http://java.sun.com/products/jini

[JWORBWeb] G. C. Fox, W. Furmanski and H. T. Ozdemir, “JWORB - Java Web Object Request Broker for Commodity Software based Visual Dataflow Metacomputing Programming Environment,” NPAC Technical Report, Available at http://tapetus/iwt98/pm/documents/hpdc98/paper.html

[KPSW93chal] A. Khokhar, V. Prasanna, M. Shaaban, and C. Wang, “Heterogeneous Computing: Challenges and Opportunities,” IEEE Computer, Vol. 26, No. 6, June 1993, pp. 18-27.

[LegionACM97] A. S. Grimshaw, W. A. Wulf, and the Legion team, “The legion vision of a worldwide virtual computer,” Communications of the ACM, 40(1):39-45, 1997

[LG96legion] M. Lewis and A. Grimshaw, “Using Dynamic Configurability to Support Object-Oriented Programming Languages and Systems in Legion,” University of Virginia Computer Science Department, Technical Report CS-96-19, December 1996.

[LMSCewesRep99] Tomasz Haupt, Erol Akarsu, Geoffrey C. Fox, Landscape Management System, A WebFlow Application Technical Report, April 1999 at CEWES

[Netsolve97IJSP] Henri Casanova and Jack Dongarra, “NetSolve: A Network Server for Solving Computational Science Problems,” The International Journal of Supercomputer Applications and High Performance Computing, Volume 11, Number 3, p.p. 212-223, Fall 1997.

[NexusJPDC97] I. Foster, C. Kesselman, and S. Tuecke, “The Nexus approach to integrating multithreading and communication,” J. Parallel and Distributed Computing, 45:148-158, 1997.

[NexusWeb] I. Foster, C. Kesselman, "The Nexus Multithreaded Runtime System," http://www.mcs.anl.gov/nexus/

[NHSS87sigops]  D. Notkin, N. Hutchinson, J. Sanislo, and M. Schwartz, “Heterogeneous Computing Environments: Report on the ACM SIGOPS Workshop on Accommodating Heterogeneity,” Communications of the ACM, Vol. 30, No. 2, February 1987, pp.   132-140.

[OMGWeb] CORBA - OMG Home Page http://www.omg.org

[ORBacusWeb] Object Oriented Concepts, Inc., http://www.ooc.com/ob.html

[Panorama93MF] J. May & F. Berman, "Panorama: A portable, Extensible Parallel Debugger," Proceedings of ACM/ONR Workshop on Parallel and Distributed Debugging, May 1993, pp. 96-106

[PAWSWeb] PAWS (Parallel Application WorkSpace) provides a framework for coupling parallel applications, http://acts.nersc.gov/paws/main.html

[PCRCWeb] PCRC, http://www.npac.syr.edu/projects/pcrc/

[PICSimSC97ea] Erol Akarsu, Kivanc Dincer, Tomasz Haupt and G. Fox, Particle-in-Cell Simulation codes in High Performance Fortran, Supercomputing '96, November 1996.

[QSWeb] Quantum Simulations, http://www.ncsa.uiuc.edu/Apps/CMP/cmp-homepage.html

[RHCA97pooma] J. Reynders et al., “POOMA: A Framework for Scientific Simulations on Parallel Architectures,” available from http://www.acl.lanl.gov/PoomaFramework/, January 1997.

[Rice96pse]   J. Rice, “Scalable Scientific Software Libraries and Problem Solving Environments,” Purdue University, Department of  Computer Sciences, Technical Report CSD-TR-96-001, January 1996.

[RMN96tools]  D. Rover, A. Malony, and G. Nutt, “Summary of Working Group on Integrated Environments Vs. Toolkits, Debugging and Performance Tuning for Parallel Computing Systems,” M. Simmons, A. Hayes, J. Brown, and D. Reed, eds., IEEE Computer, Society Press, Los Alamitos, CA, 1996, pp. 371-389.

[Sage++94Gannon] F. Bodin, P. Beckman, D. Gannon, J. Gotwals, S. Narayana, S. Srinivas, B. Winnicka. "Sage++: An Object Oriented Toolkit and Class Library for Building Fortran and C++ Restructuring Tools,"  Proc. Oonski `94.

[SC92meta]   L. Smarr and C. Catlett, “Metacomputing,” Communications of the ACM, Vol. 35, No. 6, June 1992, pp. 44-52.

[SC98Pres] T. Haupt, “WebFlow High-Level Programming Environment and Visual Authoring Toolkit for HPDC (desktop access to remote resources),” SC’98 technical presentation, http://www.npac.syr.edu/users/haupt/WebFlow/papers/SC98/foils/index.htm

[SciRun97Press] S. Parker, D. Weinstein, and C. Johnson, “The SCIRun computational steering software system,” in E. Arge, A. Bruaset, and H. Langtangen, eds., Modern Software Tools in Scientific Computing, pages 1-44, Boston: Birkhauser Press, 1997.

[Sciviz98ACMJava] Byeongseob Ki and Scott Klasky, “Collaborative Scientific Data Visualization,” ACM 1998 Workshop on Java for High-Performance Network  Computing.

[ScivizWeb] K. Li, S. Klasky, "Scivis," http://kopernik.npac.syr.edu:8888/scivis/

[SDA96support] H. Siegel, H. Dietz, and J. Antonio, “Software Support for Heterogeneous Computing,” ACM Computing Surveys, Vol. 28, No. 1, March 1996, pp. 237-239.

[SSLWeb] SSL, Netscape Communications, Inc, http://home.netscape.com/eng/ssl3/index.html

[SUN] Sun Microsystems, Inc., http://java.sun.com

[Tango97SIAM] L. Beca, G. Cheng, G. C. Fox, T. Jurga, K. Olszewski, M. Podgorny, P. Sokolowski, and K. Walczak, "Web Technologies for Collaborative Visualization and Simulation" in Proceedings of the 8th SIAM Conference on Parallel Processing for Scientific Computing, March 16-19 1997, Minneapolis, MN, http://trurl.npac.syr.edu/tango/papers.html .

[TangoWeb] M. Podgorny et al; "Tango, Collaboratory for the Web," http://trurl.npac.syr.edu/tango/

[Tuchman91Vis] A. Tuchman, D. Jablonowski, and G. Cybenko, “Runtime Visualization of Program Data,” in Proc. Visualization '91 (IEEE, 1991), pp. 225-261.

[UMLWeb] UML Home Page http://www.rational.com/uml

[VGJWeb] VGJ, Visualizing Graphs with Java home page: http://www.eng.auburn.edu/department/cse/research/graph_drawing/graph_drawing.html

[VPL97ConcJ] K. Dincer and G. C. Fox, “Using Java and JavaScript in the Virtual Programming Lab: A Web-Based Parallel Programming Environment,” Concurrency: Practice and Experience Journal, June 1997.

[WebFlow97Furm] D. Bhatia, V. Burzevski, M. Camuseva, G. C. Fox, W. Furmanski and G. Premchandran, "WebFlow - a visual programming paradigm for Web/Java based coarse grain distributed computing," Concurrency: Practice and Experience, Vol. 9 (6), pp. 555-577, June 1997

[WebFlowAlliance98et] Erol Akarsu, Tom Haupt and G. Fox, Quantum Simulations Using WebFlow - a High Level Visual Interface for Globus, Alliance'98 poster and demo.

[WebFlowDARP] W. Furmanski, T. Haupt, "DARP System as a WebFlow module", http://www.npac.syr.edu/users/haupt/HPFI/webflow/

[WebFlowFGCS99te] Tomasz Haupt, Erol Akarsu, and G. Fox, Web-Based Metacomputing, Special Issue on Metacomputing for the FGCS International Journal on Future Generation Computing Systems,1999

[WebFlowFurmWeb] W. Furmanski et al., "WebFlow," http://osprey7.npac.syr.edu:1998/iwt98/products/webflow/

[WebFlowHPCN99te] Tomasz Haupt, Erol Akarsu and G. Fox, WebFlow: a Framework for Web-Based Metacomputing, High Performance Computing and Networking '99 (HPCN), Amsterdam,  April 1999.

[WebFlowHPCNJournal] The above HPCN paper was selected by the program committee as one of the best submitted to the proceedings and has been invited for publication in a special issue of the Elsevier journal Future Generation Computer Systems.

[WebFlowSC98et] Erol Akarsu, Tomasz Haupt and G. Fox, WebFlow - High-Level Programming Environment and Visual Authoring Toolkit for High Performance Distributed Computing, Supercomputing '98, November 1998.

[WebSubmit] “WebSubmit:  A Web Interface to Remote High-Performance Computing Resources,” http://www.itl.nist.gov/div895/sasg/websubmit/websubmit.html

[WMSWeb] The WMS, EDYS, and CASC2D codes have been made available to us by CEWES. EDYS is written by Michael Childress and CASC2D is written by Fred Ogden, http://www.wes.hpc.mil


 

Vitae

 

 

NAME:                                   EROL AKARSU

DATE OF BIRTH:                20 SEPTEMBER 1965

PLACE OF BIRTH:              USAK, TURKEY

EDUCATION:                      

DECEMBER 1999                  Ph.D. in Computer Science

Department of Electrical Engineering and Computer Science, Syracuse University,

Syracuse, NY U.S.A.

MAY 1996                              M.S. in Computer Science

Department of Electrical Engineering and Computer Science, Syracuse University,

Syracuse, NY U.S.A.

JULY 1993                              M.S. in Computer Engineering,

Istanbul Technical University

                                                Istanbul, TURKEY

JULY 1991                              B.S. in Computer Engineering, Ege University

                                                Izmir, TURKEY.

EXPERIENCE:

JULY 1 1995 – NOVEMBER 8 1999 Graduate Research Assistant

                                                                        Northeast Parallel Architectures Center

                                                                        Syracuse University

                                                                        Syracuse, NY U.S.A.

 

SEPTEMBER 1 1991 – OCTOBER 1 1993    Graduate Research Assistant

                                                                        Information Processing Center

                                                Istanbul, TURKEY.

GLOSSARY

 

 

Applet    A partial Java application program designed to run inside a web browser with help from some predefined support classes.

ATD   Abstract Task Descriptor, written in XML.  Usually does not specify hardware resources for the jobs defined.

AVS   A commercial data visualization package provided by Advanced Visual Systems.

BDK    Javabeans Development Kit.  A graphical user interface for composing front ends.

CGI   Common Gateway Interface. A non-Java technique for sending data from HTML forms in browsers to server programs written in C, Python, Tcl, or Perl. Such programs typically do database searches or process the data in HTML forms and send back MIME-typed results.

CM   Connection Manager.  A running servlet.  Part of  the WebFlow middle tier that works to create associations among user modules.

COM   Common Object Model.  Microsoft's windows object model, which is being extended to distributed systems and multi-tiered architectures. ActiveX controls are an important class of COM objects that implement the component models of software.

DCOM  Distributed version of COM

ComponentWare   An approach to software engineering with software modules developed as objects with specific design frameworks and with visual editors both to interface to properties of each module and to link modules together.

Computational Grid   A recent term used by the HPCC community to describe large-scale distributed computing that draws on analogies with electrical power grids.

CORBA   Common Object Request Broker Architecture. An approach to cross-platform, cross-language distributed objects developed by a broad industrial group, the OMG. CORBA specifies basic services (such as naming, trading, persistence) and the protocol IIOP used by communicating ORBS.  It is developing higher level facilities that are object architectures for specialized domains.

DARP  Data Analysis and Rapid Prototyping. A system that integrates compiled and interpreted environments and provides a web-based interface for the user.

DII   Dynamic Invocation Interface. An interface defined in CORBA that allows the invocation of operations on object references without compile-time knowledge of the objects' interface types.

DSI   Dynamic Skeleton Interface.  An interface defined in CORBA that allows servers to dynamically interpret incoming invocation requests of arbitrary operations.

EJB   Enterprise Javabeans.  Enhanced Javabeans for server-side operations with capabilities such as multi-user support. A cross-platform component architecture for the development and deployment of multi-tier, distributed, scalable, object-oriented Java applications.

Event   A noteworthy state change of an object or signal that involves the behavior of an object. An event can signal the creation, termination, classification, declassification, or change in value of an object. For example, the creation of a new circuit design or the debit of $300 to a particular account.

Exception    An indication that some invariant has not, or cannot, be satisfied.  Mechanisms for handling exceptions are often added to OO programming languages and environments. For example, Java, C++, and CORBA all have built-in exception handling.

GASS   Global Access to Secondary Storage.  Implements a high-performance secure data transfer.

GEF   A graphics package written in Java.

Globus   A metacomputing infrastructure toolkit that provides a bag of services.

GRAM   Globus Resource Allocation Manager. Provides secure mechanisms to allocate and schedule resources.

GSSAPI   Generic Security Services Application Programmer Interface. Provides a standard programming interface that is authentication-mechanism independent, which allows the application programmer to design an application and application protocol that can be used for alternative authentication technologies, including Kerberos.

HPcc   High Performance commodity computing.  An NPAC project that developed a commodity computing-based high performance computing software environment. (Note that we have dropped the word "communications" referred to in the classic HPCC acronym--not because it is unimportant, but rather because a commodity approach to high performance networking is already being adopted. We focus on high-level services such as programming, data access and visualization that we abstract to the rather wishy-washy "computing" in the HPcc acronym.)

HPCC   High Performance Computing and Communication. Originally a formal federal initiative, but even after that ended in 1996, this term has been used to describe the field devoted to solving large-scale problems with powerful computers and networks.

HPF   High Performance Fortran Language.  Extended from Fortran90.

HTTP   Hypertext Transfer Protocol. A stateless transport protocol allowing control information and data to be transmitted between web clients and servers.

IDL   Interface Definition Language.  A language, platform, and methodology-independent notation for describing objects and their relationships. IDL is used to describe the interfaces that client objects call and that object implementations provide.

IIOP   Internet Inter-ORB Protocol.  A stateful protocol allowing CORBA ORBs to communicate with each other and transfer both the request for a desired service and the returned result.

IR   Interface Repository. A container, typically a database, of OMG IDL interface definitions. The interface to the interface repository is defined in the CORBA specification. Implementations of this interface are supplied by CORBA vendors.

Javabean  Part of the Java 1.1 enhancements defining design frameworks (particularly naming conventions) and inter-Javabean communication mechanisms for Java components with standard (Bean box) or customized visual interfaces (property editors). Javabeans are Java's component technology and in this sense are more analogous to ActiveX than either COM or CORBA. However, Javabeans augmented with RMI can be used to build a "pure Java" distributed-object model.

JDBC   Java Data Base Connection. A set of interfaces (Java methods and constants) in the Java 1.1 enterprise framework that defines uniform access to relational databases. JDBC calls from a client or server Java program link to a particular "driver" that converts these universal database-access calls (establishing a connection, an SQL query, etc.) to the particular syntax needed to access essentially any significant database.

Jini   Sun's protocol for devices to identify each other using TCP/IP protocol. It will be used in small devices such as telephones to  allow a new device to be plugged into the system while everything is running. This device automatically finds out about everything else on the net and allows other devices to find it.

LDAP   Lightweight Directory Access Protocol. A client-server protocol for accessing a directory service. Initially used as a front-end to X.500, but can also be used with stand-alone and other kinds of directory servers.

Legacy System   A production system designed for technology assumptions that are no longer valid or  that are expected to become invalid in the foreseeable future. When  deploying new applications or new system  architectures, a legacy system is one that may be accessed, but will typically not be modified to support new architecture.

LMS   Landscape Modeling Simulation.  A legacy code for managing land and water resources.

MDS   Metacomputing Directory Service.  Allows resource identification.

MM   Module Manager.  A running servlet.  A part of the WebFlow middle tier that deals with life cycles of WebFlow modules.

MPI   Message Passing Interface.  Designed as a standard for optimized communication among parallel processors.

Object    Anything that can be referred to; anything that can be identified, named, or perceived as an object; anything to which a type applies; an instance of a type or class. An instance of a class is comprised of the values linked to the object (the object state) and can respond to the requests specified for the class.

Object Interface   The set of requests that an object can respond to, i.e., its behavioral specification. An object interface is the union of the interfaces of the object's types.

Object Web   The evolving systems' software middleware infrastructure achieved  by merging CORBA with Java. Correspondingly, merging CORBA with Javabeans gives Object Web ComponentWare, which is expected to compete with Microsoft's COM/ActiveX architecture.

OMG   Object Management Group.  An organization of over 700 companies that is developing CORBA through a process of  proposal calls and the development of consensus standards.

ORB   Object Request Broker. Used in both clients and servers in CORBA to enable remote access to objects. ORBs are available from many vendors and communicate via the IIOP protocol.

ORBacus    An example implementation of an OMG CORBA specification.

Proxy   An object that is authorized to act or take action on behalf of another object.

QS (Quantum Simulation)   A simulation code for solving the Schrodinger equation for systems of many electrons, resulting in the calculation of an electronic structure.

Request   An event that is the invocation of an operation. The request includes the operation name and zero or more actual parameters. A client issues a request to cause a service to be performed. Also associated with a request are the results, which can be returned to the client. A message can be used to implement (carry) the request and any results.

RSL   Resource Specification Language. Part of the Globus package. Used to send job requests to GRAM.

Server Object   An entity (e.g., object, class, or application) that provides a response to a client's request for a service.

Servlet   An application designed to run on a server in the womb of a permanently resident CGI mother-program written in Java that provides services for it, much the way an Applet runs in the womb of a Web browser.

SM  Session Manager.  A running servlet, a part of the WebFlow middle tier that handles sessions for different users.

SSL  Secure Socket Layer.  A protocol for secure communication through sockets.

UML   Unified Modeling Language.  A modeling technique designed by Grady Booch, Ivar Jacobson, and James Rumbaugh of Rational Software. It is used for OOAD (Object-Oriented Analysis and Design) and is supported by a broad base of leading industries. It merges the best of the various notations into one single notation style.

VGJ   A graphics package written in Java.

Web Client   Originally, web clients displayed HTML and related pages but now support Java Applets that can be programmed to give web clients the necessary capabilities to support general enterprise computing. The support of signed applets in recent browsers has removed crude security restrictions that handicapped the previous use of applets.

Web Servers   Web servers originally supported HTTP requests for information, basically HTML pages, but also included the invocation of general server-side programs using the very simple but arcane CGI (Common Gateway Interface). A new generation of Java servers has enhanced capabilities, including server-side Java program enhancements (servlets) and support for permanent communication channels.

XDR   External Data Representation.  A protocol developed by Sun Microsystems for sending data among heterogeneous architectures.

XML    Extensible Markup Language.  A W3C-proposed recommendation. Like HTML, XML is based on SGML, an international standard (ISO 8879) for creating markup languages. However, while HTML is a single SGML document type, with a fixed set of element type names (AKA "tag names"), XML is a simplified profile of SGML.

 

 

 



* Ten seconds is a negligibly short time compared to the average time required to complete one CASC2D iteration, which is about 15 minutes on an SGI O2 workstation.