A National Software Exchange for the High Performance Computing and Communications Program


Jack Dongarra
Geoffrey Fox
Ken Kennedy
Jim Pool
Rick Stevens

Center for Research on Parallel Computation

We propose to facilitate the development and distribution of software enabling technologies for high performance computing by developing a National Software Exchange (NSE) for High Performance Computing and Communications (HPCC). Key components of the project are: distribution via the national information infrastructure; a multilevel software classification system; a network-based catalog that will serve as a "road map" to important HPCC enabling technologies; a system for selectively promoting important emerging technologies; a research and development program in advanced network-based software distribution mechanisms; an outreach and technology transition effort; and a methodology for measuring the success of the project.

The Center for Research on Parallel Computation (CRPC) will implement this system over a five-year period, deploying the simplest mechanisms early in the cycle.

Audience

The potential audience for the NSE has three important components:
  1. The HPCC application and computer science community. These groups will be a major source of material for all aspects of the NSE, but they will not be its major users. The HPCC applications community often needs to develop highly optimized components from scratch, so the generic reusable components in the NSE may be of limited help to it. Furthermore, the Grand Challenge program has already linked application scientists strongly with the computer science community.

  2. Users of NASA, NSF, DoE and other federal centers. These users are an important audience for the NSE because they are less knowledgeable and less performance oriented than the first class. They will be users of good, if not optimal, libraries and templates. Furthermore, a natural support organization is already in place through the consultants at the supercomputer centers that they use. We intend that the NSE group work closely with these consultants and other supercomputer center staff to design, improve, and disseminate the NSE.

  3. Other users of high performance computers. This group includes the many current and potential industrial users for whom the NSE will be a very valuable resource. This category presents special challenges as there is no natural support organization to help these users.

Software Distribution via the National Information Infrastructure

The key to a successful software distribution system will be the establishment of a scalable mechanism for distributing software via the national information infrastructure. Many Internet sites have sizable collections of documents or software. It is both unnecessary and undesirable to require that these collections reside at a single site. We plan to put together a CRPC HPCC Mosaic Home Page that will capture in one place the software from the HPCC effort.

We plan to develop a software exchange capability for the HPCC community based on existing network interfaces such as Mosaic. This capability is intended to give the community mechanisms for accessing a number of existing software packages.

The maximal benefit to the community will be achieved by having an open architecture for software exchange. This will allow a network repository to grow over time. Initially, we will provide existing software technologies to the HPCC community. We will do this by assembling existing components into a prototype that will serve as a model of how such a system should be put together. This prototype can be in place and operational within a month and will evolve through several stages of experiment, prototype, and operating capability.

This prototype for the HPCC National Software Exchange will demonstrate how software can be effectively exchanged and reused. It will be based on the underlying premise (reflecting today's software practices) that successful software modules and packages, although they may take advantage of pre-existing components, are written primarily by a single team. Software is reused either by incorporation as subroutines or by use as a starting point for writing new code.

The system will have the following attributes:

Multilevel Software Classification System

Software in the National Software Exchange will be classified into multiple levels, representing the different levels of evaluation that the software has passed. Initially, there will be two levels:

Level 1: software that is contributed by authors with no review process except for a mechanical check that the software meets certain minimum requirements such as providing simple documentation and keywords for use in a classification system.

Level 2: software that has been reviewed by a process to be described shortly and certified to meet certain minimum quality and usefulness criteria.

Eventually, higher levels of quality, and an associated review process, can be established. Possible criteria might be robustness and level of support provided by the authors.
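The Level 1 mechanical check described above could be sketched as follows. This is only an illustration under stated assumptions: the Submission record, the MIN_DOC_CHARS threshold, and the function names are hypothetical and are not part of the proposal, which specifies only that simple documentation and classification keywords be present.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: the proposal only asks for "simple documentation".
MIN_DOC_CHARS = 20


@dataclass
class Submission:
    """Hypothetical record for a contributed software package."""
    name: str
    documentation: str = ""
    keywords: list = field(default_factory=list)


def mechanical_check(sub):
    """Return a list of problems; an empty list means the check passes."""
    problems = []
    if len(sub.documentation.strip()) < MIN_DOC_CHARS:
        problems.append("documentation missing or too short")
    if not [k for k in sub.keywords if k.strip()]:
        problems.append("no classification keywords supplied")
    return problems


def assign_level(sub, certified=False):
    """Return 1 or 2, or None if the submission fails the mechanical check.

    `certified` stands in for the outcome of the separate Level 2 review.
    """
    if mechanical_check(sub):
        return None
    return 2 if certified else 1
```

A submission with a short description and at least one keyword would enter the exchange at Level 1; one with neither would be sent back with both problems listed.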

The process for certification at Level 2 would be as follows. First, the authors would propose such a move, including a proposed classification for the software. Based on this initial classification, the review would be assigned to one of the CRPC scientists who will serve as area editors in the process. The area editor would then select 2 to 6 reviewers, who would review the software and provide a review report. Based on this review, the area editor would accept the software, return it to the authors for specific improvements, or reject it. Once the software is accepted, the reviewers' reports would be used to incorporate information about it into the road map described in the next section.
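One way to picture the certification process just described is as a small state machine. The sketch below is illustrative only: the State names, the Certification class, and its methods are hypothetical, and the transition from "returned" back to "proposed" assumes that authors may resubmit after making improvements, which the text implies but does not state.

```python
from enum import Enum


class State(Enum):
    PROPOSED = "proposed by authors"
    ASSIGNED = "assigned to area editor"
    IN_REVIEW = "under review"
    ACCEPTED = "accepted at Level 2"
    RETURNED = "returned to authors for improvements"
    REJECTED = "rejected"


# Allowed moves between states.
TRANSITIONS = {
    State.PROPOSED: {State.ASSIGNED},
    State.ASSIGNED: {State.IN_REVIEW},
    State.IN_REVIEW: {State.ACCEPTED, State.RETURNED, State.REJECTED},
    State.RETURNED: {State.PROPOSED},  # assumption: resubmission is allowed
    State.ACCEPTED: set(),
    State.REJECTED: set(),
}

MIN_REVIEWERS, MAX_REVIEWERS = 2, 6  # "2 to 6 reviewers" in the text


class Certification:
    """Hypothetical tracker for one Level 2 certification request."""

    def __init__(self, software, classification):
        self.software = software
        self.classification = classification  # proposed by the authors
        self.state = State.PROPOSED
        self.reviewers = []
        self.history = [State.PROPOSED]

    def advance(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot move from {self.state.name} to {new_state.name}")
        self.state = new_state
        self.history.append(new_state)

    def start_review(self, reviewers):
        if not MIN_REVIEWERS <= len(reviewers) <= MAX_REVIEWERS:
            raise ValueError("the area editor selects 2 to 6 reviewers")
        self.advance(State.IN_REVIEW)
        self.reviewers = list(reviewers)
```

The accept, return, or reject decision itself remains the area editor's judgment; the sketch records only which moves are permitted, not how the editor weighs the review reports.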

The following is a list of the people within CRPC who would serve as senior area editors and would recruit other area editors as needed.

Area Editors:

Technology Transition

We described three classes of users whom we expect to use the NSE. The first class, the ongoing HPCC community, is knowledgeable enough that the NSE will be directly usable without additional outreach efforts. The road map included with the NSE will be sufficient for technology transition to these users. The other users will certainly benefit from the purely electronic NSE, but we expect that additional proactive efforts will be needed for technology transition. The second class of users is supported by existing supercomputer centers, and we propose that the NSE group proactively work with the staff of these centers to allow them to provide the consulting and other help needed by this class of NSE user. In particular, we expect that the supercomputer centers will bring the NSE to the attention of these users and provide us with critical help in designing and implementing the system so that it is effective.

The remaining users, in the third class, do not have such a clear support organization, but we have already started two efforts that are prototypes for supporting these users. In the first, we are working with the NSF supercomputer centers on a set of educational activities that will reach out to the many state and regional supercomputer centers and hence to their support staff. By teaching the consultants at these regional sites, we can extend the cadre of those qualified to help the less sophisticated NSE user.

We have also started direct outreach to small businesses in our InfoMall project, which we can use as a prototype for other needed outreach activities. For instance, New York State has funded NPAC to provide very proactive HPCC business consulting for the mid-Hudson region of the state. We will use this and other initiatives to prototype outreach to nontraditional users of supercomputer and HPCC resources.

Methodology for Measurement

We will conduct systematic surveys to determine the extent of usage of the NSE as a whole and of individual software components within the NSE. Survey forms would be submitted electronically and would be very simple, to ensure a high return rate. They would ask questions designed to determine whether a software component downloaded from the NSE is actually in use and, if so, the volume of usage. We would attempt to survey all usage of Level 2 software and a randomly chosen subset of Level 1 contributions. We would pay particular attention to usage within industry, tracking that as accurately as possible.
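The survey selection rule (every Level 2 package, plus a random subset of Level 1 contributions) could be sketched as below. The catalog layout, the function name, and the default sampling fraction of 10 percent are assumptions made for illustration; the proposal does not specify how large the Level 1 sample should be.

```python
import random


def select_survey_targets(catalog, level1_fraction=0.1, seed=None):
    """Pick the packages whose users would receive a survey form.

    `catalog` is a list of dicts with at least a "level" key (1 or 2).
    All Level 2 entries are kept; Level 1 entries are sampled at random.
    """
    rng = random.Random(seed)  # a seed makes the sample reproducible
    level2 = [p for p in catalog if p["level"] == 2]
    level1 = [p for p in catalog if p["level"] == 1]
    if level1:
        # Sample at least one Level 1 package, never more than exist.
        k = min(len(level1), max(1, round(len(level1) * level1_fraction)))
    else:
        k = 0
    return level2 + rng.sample(level1, k)
```

Sampling packages rather than individual downloads keeps the survey load on Level 1 users small while still covering all certified software.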