NPSS on NASA's IPG:
Using CORBA and Globus to Coordinate Multidisciplinary Aeroscience Applications

Isaac Lopez, Gregory J. Follen, Richard Gutierrez

NASA Glenn Research Center

Ian Foster, Brian Ginsburg, Olle Larsson, Stuart Martin, Steven Tuecke, David Woodford

Argonne National Laboratory

 

Within NASA's High Performance Computing and Communication (HPCC) program, the NASA Glenn Research Center is developing an environment for the analysis and design of aircraft engines called the Numerical Propulsion System Simulation (NPSS) [1]. The vision for NPSS is to create a "numerical test cell" enabling full engine simulations overnight on cost-effective computing platforms. To this end, NPSS integrates multiple disciplines such as aerodynamics, structures, and heat transfer and supports "numerical zooming" from 0-dimensional to 1-, 2-, and 3-dimensional component engine codes. To facilitate the timely and cost-effective capture of complex physical processes, NPSS uses object-oriented technologies such as C++ objects to encapsulate individual engine components and Common Object Request Broker Architecture (CORBA) Object Request Brokers (ORBs) for object communication and deployment across heterogeneous computing platforms.

Recently, the HPCC and the BASE R&T programs have initiated a concept called the Information Power Grid (IPG) [2], a virtual computing environment that integrates computers and other resources at different sites [3]. IPG implements a range of Grid services such as resource discovery, scheduling, security, instrumentation, and data access, many of which are provided by the Globus toolkit [4]. IPG facilities have the potential to benefit NPSS considerably. For example, NPSS should in principle be able to use Grid services to discover dynamically and then coschedule the resources required for a particular engine simulation, rather than relying on manual placement of ORBs as at present. Grid services can also be used to initiate simulation components on massively parallel computers (MPPs) and to address intersite security issues that currently hinder the coupling of components across multiple sites.

These considerations led NASA Glenn and Globus project personnel at Argonne National Laboratory to formulate a collaborative project designed to evaluate whether and how benefits such as those just listed can be achieved in practice. This project involves, first, development of the basic techniques required to achieve coexistence of commodity object technologies and Grid technologies, and second, the evaluation of these techniques in the context of NPSS-oriented challenge problems.

The work on basic techniques seeks to understand how "commodity" technologies (CORBA, DCOM, Excel, etc.) can be used in concert with specialized "Grid" technologies (for security, MPP scheduling, etc.). In principle, this coordinated use should be straightforward because of the Globus and IPG philosophy of providing low-level Grid mechanisms that can be used to implement a wide variety of application-level programming models. (Globus technologies have previously been used to implement Grid-enabled message-passing libraries, collaborative environments, and parameter study tools, among others.) Results obtained to date are encouraging: a CORBA to Globus resource manager gateway has been successfully demonstrated that allows the use of CORBA remote procedure calls (RPCs) to control submission and execution of programs on workstations and MPPs; a gateway has been implemented from the CORBA Trader service to the Grid information service; and a preliminary integration of CORBA and Grid security mechanisms has been completed.

The challenge problems considered were as follows:

  1. Desktop-controlled parameter study. Here, an Excel spreadsheet is used to define and control a CFD parameter study, via a CORBA interface to a high throughput broker that runs individual cases on different IPG resources.

  2. Multicomponent application. Here, three distinct components--ADPAC, NPSS, and a controller program--are launched (on workstations or MPPs) and controlled via Globus mechanisms. The components then communicate among themselves using CORBA.

  3. Aviation safety. Here, approximately 100 NPSS jobs need to be submitted and run, with their data returned, in near-real time.

In our work to date, we have obtained preliminary results for the first two of these problems. This paper presents the following information:

  1. A detailed analysis of the requirements that NPSS applications place on IPG.

  2. A description of the techniques used to meet these requirements via the coordinated use of CORBA and Globus.

  3. A description of results obtained to date in the first two challenge problems.

The evaluation criteria used to report the results include time to port, execution time, potential scalability of simulation, and reliability of resources.

 

NPSS and the Grid

The NPSS project seeks an architecture that adopts standards, or application program interfaces (APIs), which make it possible to assemble engine simulations from building blocks. In this way, the NPSS Architecture can take advantage of, or "leapfrog" to, the best ideas in the shortest time possible without re-architecting the whole NPSS system. NPSS has already done this in adopting the object-oriented paradigm (leveraging its reusability and extensibility features) and in adopting CORBA as the means of moving objects around a distributed computing simulation. Additionally, this approach has been used successfully in building a CAD API called CAPRI for common access to geometry within the NPSS Architecture.

Following the NPSS roadmap, a 0-dimensional engine system, written in C++ and based upon object-oriented design, was created first. Designed into this system were the appropriate objects to assemble a multi-fidelity and multidisciplinary engine analysis capable of accessing engine component codes across differing computing platforms. This was and is the power of the object-oriented design. As NPSS emerged from concentrating on the 0-dimensional analysis and began to define the architecture required to assemble 1-, 2-, and 3-dimensional codes, the need for batch scheduling software emerged as a focused requirement. While batch-scheduling software has existed for some time (PBS, LSF, Condor), NPSS has never been in a position to dictate one piece of software over another. Indeed, in order to create a simulation using the most desirable codes, NPSS has to create an architecture that does not exclude the use of certain 1D, 2D, or 3D codes simply because those codes use a certain piece of software that is somehow incompatible with the architecture. What is required is portability, which is often defined in terms of making software run everywhere. While NPSS agrees with this position, NPSS further defines this concept to include "reach everywhere." If a particular piece of software executes on only one platform, NPSS should not force the conversion of that code to some favored computing platform. Rather, NPSS should provide a means to reach that platform; that is the power and flexibility of an object-oriented design implemented in C++, CORBA, and now the Grid.

The NPSS project’s specific interest in the Grid centers on a means of providing transparent access to the differing platforms required to execute the various codes that are coupled to create a specific NPSS simulation. The task of "providing access" as we envision it embraces a wide range of problems: resource discovery, authentication, authorization, potential privacy of data, executable staging, scheduling, computation monitoring/control, and so forth.

The heavy use made of CORBA within NPSS introduces another set of concerns heretofore not encountered in developing and using Grid resources. While CORBA provides numerous attractive features for developers of complex systems such as NPSS, CORBA implementations are not typically constructed so as to support the specialized resources encountered in Grid environments (e.g., MPPs, high-speed networks) or to exploit specialized services provided by Grid systems such as NASA’s IPG (e.g., public key authentication). The effective use of Grid concepts within NPSS requires that methods be found that will allow CORBA and Grid services to co-exist.

NPSS's Interest in Globus - Portability, Security, and Reduced Turnaround Time

Portability

Globus has come along at the same time that NPSS has started to formally design the 1D, 2D, and 3D object infrastructure required to assemble the aero CFD, structural, thermal, and acoustic codes that use schedulers such as PBS, LSF, and Condor. Within NPSS’s definition of portability, Globus provides a leapfrogging technique. Assembling a simulation comprising not only multi-fidelity, multidiscipline codes but also multiple batch schedulers would not be possible without a tool like Globus. NPSS strives to be "scheduler indifferent" by adopting an appropriate API to build upon. Stability and extensibility within the NPSS Architecture can be achieved by adoption of the appropriate APIs.

 

Security

NPSS requires a security infrastructure that is capable of spanning multiple administrative domains. Its need to "reach everywhere" implies that different NPSS components may need to run on, and communicate between, resources that execute at different sites. Each resource may be governed by site-specific policies and procedures for remote access and use. The Globus Security Infrastructure is designed specifically for this sort of multi-institutional, distributed computing environment. It provides features such as single sign-on access to resources spanning multiple administrative domains; automatic mapping to local accounts and security mechanisms (e.g., Kerberos) within a domain; and delegation of security credentials so that the NPSS components running on the various resources can act on the user's behalf when authenticating with each other and when accessing storage and other resources.

Reduced Turnaround Time

Researchers at the Glenn Research Center have been cycle hounds for some time. While there has not been any need to cluster Palm Pilots together into a network-addressable secure computing platform, the technology and the knowledge to do just that are currently available. Globus's ability to be "scheduler indifferent" in finding available resources for NPSS simulations helps to reduce or maintain a required overall simulation turnaround time. Without this, certain computing resources known only to the individual schedulers may become overloaded, exhausting the NPSS available resource pool and making the system unable to maintain a desired simulation turnaround time. Globus offers NPSS an extension into the available pool of computing platforms outside a particular scheduler's domain.

Understand that the NPSS architecture has been tasked with supporting simulations that must minimally execute to a solution overnight on cost-effective computing platforms regardless of the complexity of the assembled simulation. As the architecture matures, the simulation execution time must be reduced even further, approaching "realistic time" if not "real time". Positioning to use tools like Globus is an appropriate strategy for NPSS.

CoG Kits: Integrating Commodity Technologies and the Grid

The NPSS project's interest in using Grid/Globus services within a "commodity" (CORBA in this case) context meshed well with the research goals of the Commodity Grid Toolkits (CoG Kits) project being performed by the Globus project team. In concept, the notion of a CoG Kit is straightforward: it defines and implements a set of general components that map Grid functionality into a commodity environment/framework [5], allowing, for example, an application to be expressed in terms of familiar CORBA concepts and services, while still exploiting specialized services (e.g., security, resource discovery) provided by an underlying Grid environment. In practice, defining appropriate components and mappings is far from trivial and indeed can raise challenging research issues.

To date, CoG Kit project participants have developed a preliminary Java CoG Kit, which has already proved useful in a variety of settings, and have prototyped elements of a CORBA CoG Kit. It is the latter that we exploit in the work described here. In brief, the work with a CORBA CoG Kit has addressed the following issues:

 

Desktop-controlled Parameter Study

The objective of the desktop-controlled parameter study was to develop the CORBA infrastructure support within Globus. The NPSS V1.0 code was used for this demonstration. The NPSS code is a 0-dimensional aero-thermodynamic engine model that also has a number of design features for zooming and multidisciplinary coupling. However, used in its simplest form, NPSS V1.0 provides the performance of a given engine over its flight regime in both steady-state and transient operations. In characterizing an engine’s performance, hundreds to thousands of runs of NPSS V1.0 will be executed. To address the need to execute thousands of jobs, NPSS and the Argonne team experimented with the use of a Globus-based system called the High Throughput Broker (HTB), which supports the mapping of a collection of tasks to Globus-accessible resources, handling, for example,

Using Globus to deploy 100-300 NPSS V1.0 jobs and return all of the results back into one time step presents an attractive use of Globus for engine design studies. The sample demonstration involved initiating 100-300 NPSS V1.0 jobs from an Excel spreadsheet. The spreadsheet communicated through a COM/CORBA bridge with the Globus HTB service, which deployed and managed the individual NPSS jobs and returned their results to the spreadsheet for later analysis using the available graphing and editing features of Excel. Pictorially, this is represented by Figure 1.

Figure 1
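The dispatch pattern described above can be illustrated with a minimal sketch. It is not the HTB implementation: the sweep parameters, the resource names, and the run_case function are all hypothetical, and a local thread pool stands in for Globus-accessible resources.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical sweep: the paper does not list the actual study parameters.
ALTITUDES_FT = [0, 10000, 20000, 30000]
MACH_NUMBERS = [0.2, 0.4, 0.6, 0.8]
RESOURCES = ["host-a", "host-b", "host-c"]  # placeholder resource names


def build_cases():
    """Return one small case description per (altitude, Mach) pair."""
    return [
        {"case_id": i, "altitude_ft": alt, "mach": mach}
        for i, (alt, mach) in enumerate(product(ALTITUDES_FT, MACH_NUMBERS))
    ]


def run_case(case, resource):
    """Stand-in for one engine run; returns a made-up result record."""
    # A real broker would stage files and submit the job to `resource`.
    toy_thrust = 1000.0 * (1.0 - case["altitude_ft"] / 100000.0) * (1.0 + case["mach"])
    return {"case_id": case["case_id"], "resource": resource, "toy_thrust": toy_thrust}


def run_study(cases, resources, max_workers=4):
    """Assign cases to resources round-robin and gather results in case order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_case, case, resources[i % len(resources)])
            for i, case in enumerate(cases)
        ]
        return [f.result() for f in futures]


if __name__ == "__main__":
    results = run_study(build_cases(), RESOURCES)
    print(len(results), "cases completed")
```

Collecting the results in submission order mirrors the way the spreadsheet receives one row of results per case, regardless of which resource ran it.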

 

 

 

Results of Parameter Study

The creation of the ModelInfoServer and the COM/CORBA bridge took about two weeks. This time included both experimentation with various versions and debugging. Most of the time was taken up with interfacing Excel, via Visual Basic for Applications (VBA), to the COM object and with getting a working version of Globus onto machines for testing. For much of the development, the HTB was stubbed out so that the rest of the simulation could be tested.

The rapid pace of testing and development was achieved by using Python. Python is an interpreted, interactive, object-oriented, high-level programming language with dynamic semantics. It is often compared with Tcl, Perl, Scheme, or Java. Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for rapid application development, as well as for use as a scripting or glue language to connect existing components.

Python supports CORBA and provides COM integration. The first version of the ModelInfoServer, which was a CORBA server, was coded in one day in C++. For portability (languages like Python and Perl are far more portable than C++/Java) and ease of modification, it was recoded in Python. This took only a couple of hours. The COM integration was even more astonishing. Via some Python and COM magic, a Python class can be turned into a COM server by including a few variables, for things such as the COM classid, and a registration function. Most important, this code does not get in the way of the regular functioning of the Python class. Thus a non-COM test suite could be easily developed, and the same class used to control the simulation could be used in a non-COM/Windows environment. Testing and debugging the COM server object rather than the regular Python class was merely the difference between

# Create and attach to a COM object registered as "Python.NPSSDemo"
import win32com.client
o = win32com.client.Dispatch("Python.NPSSDemo")

and

# Create an instance of NPSSDemo
o = NPSSDemo()
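For readers unfamiliar with the pattern, a minimal sketch of such a class follows, assuming the pywin32 win32com package. The _reg_clsid_, _reg_progid_, and _public_methods_ attributes and the win32com.server.register.UseCommandLine call are the usual pywin32 conventions; the class body and the CLSID value are placeholders rather than the project's actual code.

```python
class NPSSDemo:
    """Plain Python class that can also be registered as a COM server."""

    # COM registration metadata read by pywin32; ignored in plain Python use.
    # Placeholder CLSID: generate a real one with pythoncom.CreateGuid().
    _reg_clsid_ = "{00000000-0000-0000-0000-000000000000}"
    _reg_progid_ = "Python.NPSSDemo"
    _public_methods_ = ["getModelNames"]

    def __init__(self):
        # Hypothetical in-memory model list standing in for the CORBA server.
        self._models = ["model_a", "model_b"]

    def getModelNames(self):
        """Return model names as a plain list, which COM clients can consume."""
        return list(self._models)


def register():
    """Register the class with COM (Windows only, requires pywin32)."""
    import win32com.server.register
    win32com.server.register.UseCommandLine(NPSSDemo)


if __name__ == "__main__":
    # Without registration, the class is exercised like any other object.
    print(NPSSDemo().getModelNames())
```

Because the registration import is confined to register(), the class can be imported and tested on machines that have neither Windows nor pywin32.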

The functionality of the COM/CORBA bridge consisted mostly of adapting the CORBA interfaces to COM interfaces. This included mapping function calls and massaging data from one format to another. The COM/CORBA layer included the so-called business logic, that is, the place where the work gets done. Thus, some functions were easy one-to-one mappings:

def pauseRun(self):
    self.session.pause()

def getSession(self):
    # Create the HTB session on first use and cache it for later calls
    if not self.session:
        print("...creating a session")
        status, session = self.fact.create(self.sessionName)
        if status != HTB.SUCCESS:
            raise RuntimeError('could not create session')
        if session is None:
            raise RuntimeError('session is None')
        self.session = session
    return self.session


 

while others massaged the data, as in this example. Here the data from the infoServer.listModels command is actually returned as an array of structures that VBA cannot understand, so we put the relevant information into a list and return it to VBA through COM.

def getModelNames(self):
    self.safeConnect()
    shortModelList = self.infoServer.listModels()
    ret = []  # creating a new empty list
    for item in shortModelList:
        ret.append(item.name)
    return ret
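Because the bridge is an ordinary Python class, both kinds of method can be exercised without COM, CORBA, or Globus by substituting stubs, much as the HTB was stubbed out during development. The sketch below is illustrative only: StubSession, StubFactory, StubInfoServer, Bridge, ModelInfo, and the SUCCESS constant are hypothetical stand-ins, not the project's classes.

```python
from collections import namedtuple

SUCCESS = 0  # hypothetical status code standing in for HTB.SUCCESS

# Hypothetical structure of the kind returned by listModels().
ModelInfo = namedtuple("ModelInfo", ["name", "version"])


class StubSession:
    def __init__(self):
        self.paused = False

    def pause(self):
        self.paused = True


class StubFactory:
    """Stands in for the HTB session factory."""

    def create(self, name):
        return SUCCESS, StubSession()


class StubInfoServer:
    """Stands in for the CORBA ModelInfoServer."""

    def listModels(self):
        return [ModelInfo("engine_a", 1), ModelInfo("engine_b", 2)]


class Bridge:
    def __init__(self, fact, infoServer, sessionName="demo"):
        self.fact = fact
        self.infoServer = infoServer
        self.sessionName = sessionName
        self.session = None

    def getSession(self):
        # Create the session lazily on first use, then reuse it.
        if not self.session:
            status, session = self.fact.create(self.sessionName)
            if status != SUCCESS or session is None:
                raise RuntimeError("could not create session")
            self.session = session
        return self.session

    def pauseRun(self):
        # One-to-one mapping onto the session interface.
        self.getSession().pause()

    def getModelNames(self):
        # Flatten the structures into a list of plain strings.
        return [item.name for item in self.infoServer.listModels()]
```

A test can then construct Bridge(StubFactory(), StubInfoServer()) and call its methods directly, with no registration step.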

As stated before, the first incarnation of this was done in C++, and although the wonderful tools provided with Visual C++ make it easy to create a COM object, the rapid development and portability of the Python approach won out.

Execution Time

Execution was a little slow. Some of the delay could be attributed to the staging of files, as files would go from the location of the ModelInfoServer to the location of the HTB, and then on to the destination machine. The size of the transferred files was around 28 MB. There were 19 files, plus one special case file generated per task executed. The size of the per-case file was usually less than 1 KB. If we were running 100 jobs, we had 119 files, which would be 28 MB + 100 × (<1 KB). Since the purpose of this first demo was to explore the benefits provided by the CORBA coupling, the focus was on functionality rather than on speed.
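The staging volume quoted above can be bounded with a short calculation. The sketch assumes 1 MB = 1024 KB and takes each per-case file at its 1 KB upper limit. The two-hop factor reflects the ModelInfoServer-to-HTB-to-destination path and assumes every file crosses each hop exactly once, which the text does not confirm.

```python
SHARED_FILES = 19
SHARED_SIZE_KB = 28 * 1024   # about 28 MB of shared input files
CASE_FILE_MAX_KB = 1         # each per-case file is under 1 KB
HOPS = 2                     # ModelInfoServer -> HTB -> destination machine


def staging_estimate(n_jobs):
    """Return (file_count, upper bound on KB moved across all hops)."""
    file_count = SHARED_FILES + n_jobs
    payload_kb = SHARED_SIZE_KB + n_jobs * CASE_FILE_MAX_KB
    return file_count, payload_kb * HOPS


if __name__ == "__main__":
    files, total_kb = staging_estimate(100)
    print(files, "files, at most", total_kb / 1024, "MB moved")
```

For 100 jobs the per-case files contribute under 100 KB in total, so the shared 28 MB dominates the transfer.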

Scalability and Reliability

Since Globus resources, predominantly the HTB, were responsible for staging and running the jobs, scalability and reliability are essentially guaranteed. The issues will probably revolve more around how long it takes the HTB to pump out jobs and how many jobs it can sustain. These will all be handled at the HTB level, though.

Problems did arise with getting a usable version of Globus 1.1 up and running on our machines at first, but those have all been resolved.

Multicomponent Application

The objective of this demonstration was to deploy a mixed-fidelity engine simulation over Globus. Two codes, ADPAC and NPSS V1.0, were chosen to perform the simulation of the complete low-pressure system of the Energy Efficient Engine (EEE). ADPAC is an MPI-based code that modeled the low-pressure subsystem in 3D, while NPSS V1.0 modeled the engine core in 0D. The two codes communicated with each other through CORBA. This simulation was developed to evaluate Globus’s ability to deploy a mixed-fidelity MPI/CORBA simulation. The engineering purpose of the simulation was to determine the RPM at which the power required by the fan is balanced by the power available from the low-pressure turbine. The entire engine was simulated at different fidelity levels depending upon the required accuracy. Figure 2 shows the multi-fidelity engine model simulation. Three-dimensional CFD analysis of the low-pressure subsystem was performed by two instances of ADPAC, represented by the blue area in the figure, while NPSS V1.0 was used for a cycle (0D) analysis of the core, represented by the green area in the figure.

The ADPAC code solves the three-dimensional Reynolds-averaged Navier-Stokes equations using a time-marching, finite-volume algorithm. It employs a flexible multiple-block structured grid discretization scheme with user-configurable boundary conditions. This package provides a flexible aerodynamic simulation environment for complex compressible flows. Numerous features are available for the simulation of multistage turbomachinery flows. The ADPAC code uses coarse-grained domain decomposition for parallel processing with interprocessor message passing. It has been validated for a wide variety of propulsion flow applications, including multistage compressor and turbine turbomachinery predictions, inlets, nozzles, and propellers.

The demonstration at SC'99 was run across various NASA IPG hosts via the Globus utilities "globus-gass-server" and "globusrun." Communication between the various analysis codes was done via CORBA, while communication within the parallel CFD codes was done via MPI. A Java executive implemented a simple iterative solver to determine the RPM and provided a GUI to view the simulation's progress.
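The paper does not specify the algorithm used by the executive's solver. As one illustration of how such a power balance can be found iteratively, the sketch below applies bisection to the difference between the power available from the turbine and the power required by the fan. Both power curves are made-up toy functions; in the demonstration these values came from ADPAC and NPSS over CORBA.

```python
def fan_power_required(rpm):
    """Toy model: required power grows with the cube of shaft speed."""
    return 2.0e-9 * rpm ** 3


def turbine_power_available(rpm):
    """Toy model: available power grows linearly with shaft speed."""
    return 0.5 * rpm


def balance_rpm(lo, hi, tol=1e-3, max_iter=200):
    """Bisect on (available - required) until the bracket is narrower than tol."""
    def residual(rpm):
        return turbine_power_available(rpm) - fan_power_required(rpm)

    if residual(lo) * residual(hi) > 0:
        raise ValueError("bracket does not contain a power balance")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        # Keep the half of the bracket in which the residual changes sign.
        if residual(lo) * residual(mid) <= 0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("balanced shaft speed (toy units):", balance_rpm(1000.0, 30000.0))
```

Bisection needs only one evaluation of each code per iteration and no derivatives, which suits a setting where every evaluation is an expensive remote call; a secant or Newton iteration would be a natural alternative if fewer evaluations were needed.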

 

Figure 2

 

Results of the Multicomponent Application

A full evaluation of this simulation was not completed in time for reporting within this paper. Suffice it to say that the subject simulation passed a believability toll-gate in that Globus did not interfere with the simulation, even though only minimal Globus features have been exercised to this point.

 

 

 

Multicomponent application

  Time to Port: Hard to measure; must include learning Globus and acquiring security ticketing. After this, time to port is minimal.

  Execution Time: Similar to any batch queuing system: sluggish at startup, then no difference.

  Scalability: Very scalable for particular jobs; not good for those that require intense messaging.

  Reliability: No different from any current batch queuing system (LSF, PBS, etc.).

 

 

Summary

 

The team has described a project designed to evaluate the feasibility of combining "Grid" and "commodity" technologies, with a view to leveraging the numerous advantages of commodity technologies in a high-performance Grid environment. Results obtained to date are encouraging. The team has successfully demonstrated a CORBA to Globus resource manager gateway that allows the use of CORBA remote procedure calls to control submission and execution of programs on workstations and MPPs; a gateway from the CORBA Trader service to the Grid information service; and a preliminary integration of CORBA and Grid security mechanisms. We have applied these technologies to two applications related to NPSS, namely a parameter study and a multi-component simulation.

In future work, there are plans to build on these foundations with the goal of enabling complex simulations to be solved in quasi-real time. As an example, NPSS will be supporting the Aviation Safety Program (AvSP) concept of modeling the National Airspace System. In this simulation the NPSS V1.0 code will be deployed onto a Globus computing platform capable of processing 2000-3000 flights per day. NPSS V1.0 will accept flight data on airline departures, routes, and landings for a major US airport currently sized to handle 3000 flights per day. Flight data will be transmitted from NASA Ames Research Center to NASA Glenn Research Center over a Web-based CORBA connection; NPSS V1.0 will process the flight data and return an appropriate number of engine performance and risk assessment parameters to Ames. NPSS expects to handle 5000-6000 engine models per day in realistic to real time.

 

 

 

Acknowledgments

Research activities at Argonne National Laboratory relating to NPSS and Commodity Grid Toolkits were supported by NASA's Information Power Grid project, by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Computational and Technology Research, U.S. Department of Energy, under Contract W-31-109-Eng-38, and by the National Science Foundation.

The authors would also like to express their appreciation to management of the High Performance Computing and Communications Program and the Glenn NPSS team.





[1] A. L. Evans, J. Lytle, G. Follen, and I. Lopez, An Integrated Computing and Interdisciplinary Systems Approach to Aeropropulsion Simulation, ASME IGTI, June 2, 1997, Orlando, FL.

[2] W. Johnston, D. Gannon, B. Nitzberg, Grids as Production Computing Environments: The Engineering Aspects of NASA's Information Power Grid, Proc. 8th High Performance Distributed Computing Symposium, IEEE Press, 1999.

[3] Ian Foster and Carl Kesselman (eds). The Grid: Blueprint for a Future Computing Infrastructure, Morgan Kaufmann Publishers, 1999.

[4] I. Foster and C. Kesselman, The Globus Project: A Status Report, Proceedings of the Heterogeneous Computing Workshop, IEEE Press, 4-18, 1998; see also http://www.globus.org/

[5] Gregor von Laszewski, Ian Foster, Jarek Gawor, Warren Smith, and Steven Tuecke, CoG Kits: A Bridge between Commodity Distributed Computing and High-Performance Grids, submitted for publication, 2000.