Focused Effort Title: Enforcing Scalability of Parallel CMS

PI Name: David Bernholdt
PI EMail Address: bernhold@npac.syr.edu
PI Telephone: 315 443 3857            PI Fax: 315 443 1973

Description: 

To turn our prototype Parallel CMS into a more robust and scalable
parallel module, we need to go more deeply into the CMS code and
perform a more significant reorganization of the inner loop. We propose
to develop such a fully scalable, production-quality Parallel CMS
module as a Year 4 Focused Effort. Apart from experimenting with and
selecting the optimal configuration of compiler pragmas, we will also
explore other parallelization techniques based on the MPI, CRAY SHMEM,
OpenMP and SPEEDES communication models.

Accomplishments:

We are pursuing this task along several complementary lines of attack:

1. We need to understand the CMS source code in more detail to be able
   to re-engineer the inner simulation loop. For this purpose, we will
   need some help from Ft. Belvoir engineers. In turn, they suggested
   that we upgrade our CMS to the latest version they currently use so
   that both groups work with the same version. We have just completed
   the (significant) upgrade and have started exploring the new CMS
   software.

2. In parallel with the ongoing parallelization effort based on SGI
   compiler pragmas, we intend to explore alternative parallelization
   techniques such as the use of SPEEDES as the core CMS engine.
   We recently developed some SPEEDES know-how while acting as an
   external reviewer for the CHSSI projects, and we realized that we
   could use this system, which scales well over a broad range of
   processor arrays, as the Parallel CMS framework. We are currently
   analysing some SPEEDES demos and the engine source code with the
   goal of assessing its adequacy for a scalable Parallel CMS.

3. One possible conclusion of the SPEEDES-for-CMS analysis discussed
   above is that the core SPEEDES parallelization techniques (based
   exclusively on fully portable UNIX fork and shmem constructs) are
   useful for our purposes, but that the SPEEDES system as a whole is
   too heavy and its numerous high-level features duplicate what
   already exists in the CMS code. To prepare for this outcome, we are
   constructing a micro-SPEEDES-like kernel that uses only fork and
   shmem and is otherwise an empty shell, ready to be attached to, and
   to parallelize, any object-based, event-driven simulation.
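
As an illustration of the fork-plus-shmem idea in item 3, the following
is a minimal sketch and not the kernel under construction. It assumes
System V shared memory (shmget/shmat) as the "shmem" construct, a
pre-sorted event list, and a static assignment of simulation objects to
workers. All names (Event, SharedSim, kernel_run, example_handler) are
hypothetical and are not taken from SPEEDES or CMS.

```c
/* Hypothetical sketch: fork + System V shared memory event kernel. */
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_OBJECTS 64
#define MAX_EVENTS  256
#define MAX_WORKERS 16

typedef struct { double time; int object; int payload; } Event;

/* Everything the workers share lives in one shared memory segment. */
typedef struct {
    int   n_events;
    Event events[MAX_EVENTS];    /* assumed already sorted by time */
    long  state[MAX_OBJECTS];    /* per-object simulation state */
    int   handled[MAX_OBJECTS];  /* events handled per object */
} SharedSim;

/* The kernel is an "empty shell": the simulation supplies the handler. */
typedef void (*EventHandler)(SharedSim *sim, const Event *ev);

/* Example handler (assumption): accumulate the payload on the object. */
void example_handler(SharedSim *sim, const Event *ev)
{
    sim->state[ev->object] += ev->payload;
    sim->handled[ev->object] += 1;
}

/* Runs all events on n_workers forked processes and copies the final
 * shared state into *out. Returns 0 on success, -1 on failure. */
int kernel_run(const Event *events, int n_events, int n_workers,
               EventHandler handler, SharedSim *out)
{
    assert(events != NULL && handler != NULL && out != NULL);
    if (n_events < 0 || n_events > MAX_EVENTS ||
        n_workers < 1 || n_workers > MAX_WORKERS)
        return -1;
    for (int i = 0; i < n_events; i++)
        if (events[i].object < 0 || events[i].object >= MAX_OBJECTS)
            return -1;

    int shmid = shmget(IPC_PRIVATE, sizeof(SharedSim), IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;
    SharedSim *sim = shmat(shmid, NULL, 0);
    if (sim == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    memset(sim, 0, sizeof *sim);
    sim->n_events = n_events;
    if (n_events > 0)
        memcpy(sim->events, events, (size_t)n_events * sizeof(Event));

    pid_t pids[MAX_WORKERS];
    int started = 0;
    for (int rank = 0; rank < n_workers; rank++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            /* Worker: handle, in time order, only the events of the
             * objects it owns. Each object has exactly one owner, so
             * no locking is needed in this sketch. */
            for (int i = 0; i < sim->n_events; i++) {
                const Event *ev = &sim->events[i];
                if (ev->object % n_workers == rank)
                    handler(sim, ev);
            }
            _exit(0);
        }
        pids[started++] = pid;
    }

    int ok = (started == n_workers);
    for (int i = 0; i < started; i++) {
        int status = 0;
        if (waitpid(pids[i], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
    }

    *out = *sim;                    /* copy results out of the segment */
    shmdt(sim);
    shmctl(shmid, IPC_RMID, NULL);  /* release the segment */
    return ok ? 0 : -1;
}
```

A caller would fill an Event array, call kernel_run(events, n, workers,
example_handler, &result), and read result.state afterwards. The static
object-to-worker mapping keeps the sketch free of locks; events that
touch objects owned by different workers would need synchronization,
which is not shown here.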

Problems:

none