A: Innovative Claims

Development of novel technology (Application Emulators, HLAM and PETASIM) that will support rapid prototyping and the design phases for new machines, software and architectures. Because it operates at a high level, this process will be robust and relatively fast, so that large applications can be simulated on very high performance machines such as those contemplated in the PetaFlop initiative. We expect future computational steering systems to be enabled by our technology.

Development and formal specification of a Hierarchical High Level Application Modeling Framework (HLAM). The HLAM will include mechanisms to allow users to provide: a) a hierarchical description of applications, b) a high level specification of a target machine, c) procedures to define how the hierarchical application description is mapped to the machine model and d) a cost model to be used in performance estimation. All user interfaces will be built in Java to ease broad distribution.
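As an illustration only, the four HLAM ingredients listed above could be sketched as follows. All class, field and function names here (Module, MemoryLevel, Machine, estimate_time) are hypothetical and are not part of any HLAM specification; the cost model is a deliberately simple assumption (compute time plus data movement time, summed over the hierarchy).

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Module:
    """a) A node in the hierarchical application description (hypothetical)."""
    name: str
    flops: float = 0.0          # work done directly in this module
    bytes_moved: float = 0.0    # data this module reads or writes
    children: List["Module"] = field(default_factory=list)

@dataclass
class MemoryLevel:
    """One level of the target machine's memory hierarchy."""
    name: str
    bandwidth: float            # bytes per second

@dataclass
class Machine:
    """b) A high level specification of a target machine (hypothetical)."""
    flop_rate: float            # flops per second
    levels: Dict[str, MemoryLevel]

def estimate_time(module, machine, mapping):
    """d) Assumed cost model: compute time plus data movement time.

    `mapping` plays the role of c): it names the memory level that
    serves each module's data.
    """
    level = machine.levels[mapping[module.name]]
    own = module.flops / machine.flop_rate + module.bytes_moved / level.bandwidth
    return own + sum(estimate_time(c, machine, mapping) for c in module.children)
```

A user would build a small Module tree, describe a Machine, choose a mapping from module names to memory levels, and call estimate_time on the root.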

Development of a simulation framework (PETASIM) that is able to generate performance predictions for application-emulators using information provided by the high level application modeling framework HLAM. PETASIM will include support for the generation and use of data parallel aggregates as building blocks for specifying modules. PETASIM will achieve high performance even on large applications by using event driven simulation for the coarse grain task parallel parts of an application while explicitly exploiting the loosely synchronous structure of SPMD data parallel components.
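The hybrid strategy described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the PETASIM design: the function names are hypothetical, the task graph is assumed to run with unlimited processors, and each SPMD phase is assumed to end at a barrier so that its cost is the slowest processor's compute time plus a communication term.

```python
import heapq

def spmd_time(phases):
    """Loosely synchronous SPMD component, costed analytically.

    phases: list of (per-processor compute times, communication time).
    Each phase costs the slowest processor's compute time plus communication.
    """
    return sum(max(compute) + comm for compute, comm in phases)

def simulate(tasks, deps):
    """Event driven simulation of a coarse grain task graph.

    tasks: name -> duration; deps: name -> list of prerequisite names.
    Assumes unlimited processors, so a task starts as soon as its last
    prerequisite finishes. Returns name -> finish time.
    """
    remaining = {t: len(deps.get(t, [])) for t in tasks}
    successors = {t: [] for t in tasks}
    for t, prerequisites in deps.items():
        for p in prerequisites:
            successors[p].append(t)
    # Event queue of (finish time, task), seeded with tasks that have no prerequisites.
    events = [(tasks[t], t) for t in tasks if remaining[t] == 0]
    heapq.heapify(events)
    finish = {}
    while events:
        now, t = heapq.heappop(events)
        finish[t] = now
        for s in successors[t]:
            remaining[s] -= 1
            if remaining[s] == 0:
                heapq.heappush(events, (now + tasks[s], s))
    return finish
```

The point of the split is that a data parallel module contributes a single event whose duration comes from spmd_time, rather than one event per processor per phase.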

Parallel execution of PETASIM will use novel adaptive load balancing and runtime compilation techniques developed by the team under other DARPA contracts.

Development of application-emulators that can be used to predict how future very high end architectures will perform on future leading edge applications. Application-emulators will be produced for two large and crucial application classes: 1) adaptive irregular computational science and engineering applications and 2) applications that carry out data analysis, data exploration and data fusion.

Detailed simulation tools will be used to define the cost model for HLAM (especially in the I/O area) and to selectively validate the results of PETASIM.

The HLAM high level hierarchical machine description will be the basis of future extensions of today's parallel programming environments such as HPF and MPI, which currently embody a simple two level (on and off processor) view of memory.

Technology transfer to several HPCC communities, including commercial and academic designers of new machines and software, the PetaFlop initiative, and the large HPCC education and training enterprise, which will be able to exploit the pedagogical value of our high level descriptions. A General Dynamics/Raytheon/SAIC team will use project technology in developing the new ship design code LAMP.

B: Deliverables

Application-emulators

Construction of two application-emulators motivated by loosely synchronous adaptive applications and by data exploration and data fusion applications. An application-emulator is a suite of programs that, when run, exhibits computational and data access patterns that resemble the patterns observed in a particular type of application.
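To make the definition concrete, here is a toy sketch of what one piece of an irregular application-emulator might look like. It is an assumption for illustration, not project code: the function name and parameters are hypothetical, the edge list is random rather than derived from a real mesh, and the arithmetic is a placeholder. What it preserves is the access pattern of an unstructured mesh edge sweep: indirect reads and indirect updates through an edge list.

```python
import random

def emulate_mesh_sweep(num_nodes, num_edges, sweeps, seed=0):
    """Emulate the data access pattern of an unstructured mesh edge sweep.

    Returns the number of indirect accesses performed and the final node values.
    """
    rng = random.Random(seed)
    edges = [(rng.randrange(num_nodes), rng.randrange(num_nodes))
             for _ in range(num_edges)]
    values = [1.0] * num_nodes
    accesses = 0
    for _ in range(sweeps):
        update = [0.0] * num_nodes
        for a, b in edges:
            flux = 0.5 * (values[a] - values[b])   # stand-in for real physics
            update[a] -= flux
            update[b] += flux
            accesses += 4   # two indirect reads of values, two indirect updates
        values = [v + u for v, u in zip(values, update)]
    return accesses, values
```

Varying num_nodes, num_edges and the edge distribution changes the locality of the emulated accesses without requiring the full application.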

The irregular scientific application emulator will be designed to simulate the behavior of coupled adaptive unstructured mesh codes, integro-differential equation solvers and particle codes.

The data fusion application-emulator will be designed to 1) simulate data intensive applications that run on a single multiprocessor platform and 2) simulate an additional computational step where results from individual data intensive calculations are combined. The data fusion application emulator will emulate operations such as image segmentation, image registration and compositing. The emulator will carry out I/O to disk arrays. Application emulators will be validated by comparing performance (on current multiprocessor architectures) of application-emulators with performance of full applications.

Options:

The irregular scientific application emulator will be extended to simulate behavior of Monte Carlo, multipole and structured adaptive codes.

The data fusion application-emulator will be extended to take tertiary storage into account.

HLAM/PETASIM

Development and formal specification of a Hierarchical High Level Application Modeling Framework (HLAM). The HLAM will include mechanisms to allow users to provide: a) a hierarchical description of applications, b) a high level specification of a target machine, c) procedures to define how the hierarchical application description is mapped to the machine model and d) a cost model to be used in performance estimation.

Development of a simulation framework (PETASIM) that is able to generate performance predictions for application-emulators using information provided by the high level application modeling framework. PETASIM will include support for the generation and use of aggregates as building blocks for specifying modules.

Options:

Development of runtime/compile time techniques that support semi-automatic generation of HLAM/PETASIM aggregates and modules.

PETASIM support for additional collective data movement primitives, which will increase the range of multi-phase loosely synchronous problems handled. This corresponds to optimized support of the advanced MPI calls such as MPI_Gather and MPI_Address.
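As a rough illustration of what a cost entry for a collective primitive could look like, the sketch below uses the common latency/bandwidth approximation for a gather operation. The function names, the parameters and the choice of model are assumptions made for this example, not the PETASIM cost model; the tree variant additionally assumes the processor count is a power of two.

```python
import math

def flat_gather_time(procs, bytes_per_proc, latency, bandwidth):
    """Flat gather: the root receives one message from each other processor in turn."""
    return (procs - 1) * (latency + bytes_per_proc / bandwidth)

def tree_gather_time(procs, bytes_per_proc, latency, bandwidth):
    """Binomial tree gather (procs assumed to be a power of two).

    The root takes part in log2(procs) receive steps and still receives
    (procs - 1) blocks of data in total.
    """
    steps = math.ceil(math.log2(procs))
    return steps * latency + (procs - 1) * bytes_per_proc / bandwidth
```

Under this model the two variants move the same volume of data into the root and differ only in the number of latency terms, which matters most for small messages.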

Development of an optimized parallel implementation of PETASIM. We will focus our efforts on parallelizing the data parallel portions of PETASIM but will tackle the event driven portions of the simulation as necessary. The computational demands associated with PETASIM will vary with the structure of a problem's HLAM hierarchical graph. In an adaptive problem, the structure of the hierarchical graph will change as the program progresses. We consequently anticipate that an efficient PETASIM implementation will require the use of adaptive load balancing methods.
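The need for adaptive load balancing can be illustrated with a small sketch. This is a generic greedy scheme chosen for the example, not the team's load balancing technique: work items (for instance, nodes of the hierarchical graph) are assigned longest-first to the least loaded processor, and the assignment is recomputed only when the measured imbalance exceeds a threshold. All names and the threshold value are hypothetical.

```python
import heapq

def balance(costs, procs):
    """Greedy longest-first assignment of work items to the least loaded processor."""
    loads = [(0.0, p) for p in range(procs)]
    heapq.heapify(loads)
    owner = {}
    for item, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(loads)
        owner[item] = p
        heapq.heappush(loads, (load + cost, p))
    return owner

def imbalance(costs, owner, procs):
    """Ratio of the maximum processor load to the average load (1.0 is perfect)."""
    loads = [0.0] * procs
    for item, p in owner.items():
        loads[p] += costs[item]
    return max(loads) / (sum(loads) / procs)

def rebalance_if_needed(costs, owner, procs, threshold=1.2):
    """Recompute the assignment only when the imbalance exceeds the threshold."""
    if imbalance(costs, owner, procs) > threshold:
        return balance(costs, procs)
    return owner
```

In an adaptive problem the costs dictionary would be refreshed as the graph changes, and rebalance_if_needed would be called between simulation phases.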

Detailed Performance Simulation and Validation of HLAM/PETASIM

We will use the application emulators to produce application and machine specifications at varying levels of granularity and then use PETASIM to estimate performance obtained on selected current and future architectures. We will use detailed simulation tools developed at Maryland and at other sites to characterize the performance of the application emulators on selected current architectures (e.g. IBM SP-2) and on a limited range of future architectures.

We will use various techniques, including detailed simulation tools, instrumented static and runtime compilation, and analytic models, to produce PETASIM cost models.

F: Cost Schedule Milestones

Year 1:

Initial development and formal specification of Hierarchical High Level Application Modeling Framework (HLAM).

Initial definition of PETASIM and the delineation of the relationship between PETASIM and HLAM. Construction of first HLAM and PETASIM prototype.

Initial versions of data fusion and adaptive application emulators. These first emulators will emulate only a single irregular or data parallel application. The adaptive application emulator will focus on emulation of unstructured mesh and particle codes. The data fusion application emulator will emulate processing and I/O associated with data exploration.

Year 2:

A variety of HLAM representations at different levels of granularity are generated for both application emulators. Initial performance predictions are carried out. Results are validated on current multiprocessor architectures using detailed simulation techniques.

Application-emulators extended to represent coupled applications running on network connected collections of multiprocessors.

Application-emulator performance is validated through comparison with performance of real applications.

Year 3:

HLAM representations are generated for application-emulators running on network connected collections of multiprocessors.

Irregular problem aggregation routines incorporated into HLAM/PETASIM software environment.

Irregular adaptive application-emulators extended to represent coupled adaptive unstructured mesh codes, integro-differential equation solvers and particle codes. Data fusion application emulators extended to emulate multiple coupled applications that carry out image segmentation, image registration and compositing.

Options:

Year 1:

The irregular scientific application emulator will be extended to simulate behavior of Monte Carlo, multipole and structured adaptive codes.

Year 2:

The data fusion application-emulator will be extended to take tertiary storage into account.

PETASIM support for additional collective data movement primitives, which will increase the range of multi-phase loosely synchronous problems handled.

Begin development of optimized parallel implementation of PETASIM.

Year 3:

PETASIM cost function will be extended to take tertiary storage into account; results validated using microscope sensor application.

Development of runtime/compile time techniques that support semi-automatic generation of HLAM/PETASIM aggregates and modules.

Optimized parallel implementation of PETASIM.