From: Zhijun Wu <zhijun@caam.rice.edu>
Date: Thu, 28 Oct 1999 14:03:35 -0500 (CDT)
To: gcf@npac.syr.edu
CC: Zhijun Wu <zhijun@caam.rice.edu>, linda@rice.edu
Subject: applications description

Dear Dr. Fox,

Here are two short descriptions of applications of parallel high-performance computing
in computational structural biology and in car crash analysis and simulation. Please let
me know if you have any questions.

Best regards,

Zhijun Wu
Rice University

[Attachment: pcsb.txt]

Applications of Parallel High Performance Computing in Computational 
--------------------------------------------------------------------
Structural Biology
------------------

A fundamental research issue in computational structural biology is the prediction or
determination of the three-dimensional structures of macromolecules such as proteins.
Many approaches to the problem, both theoretical and experimental, have been taken,
all requiring intensive computation. Parallel high-performance computing has been key
to implementing these approaches.

For example, in the potential energy minimization approach, a semi-empirical function is
minimized to find a minimal-potential-energy conformation of a molecule. The function
usually has thousands of variables, and the search for a global minimum is almost
impossible without parallel high-performance computing. The most notable work in the
area has been done by Scheraga's group at Cornell, which has obtained the best structures
for several large proteins on various parallel platforms. Reports on the group's work can
be found at http://www.tc.cornell.edu. Efforts have also been made in the computer-science
and applied-mathematics communities to develop efficient parallel search algorithms
for the potential energy minimization problem. Byrd and Schnabel at the University of
Colorado have developed parallel stochastic global optimization algorithms and applied
them to protein polymers. More' and Wu of Argonne National Laboratory developed parallel
global continuation software on the SP2 for determining protein structures using
distance data from various sources.
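The multistart strategy underlying such parallel global optimization codes can be sketched in a few lines. This is a toy illustration, not any of the codes cited above: the "potential" is an invented one-variable double well (real force fields have thousands of variables), and plain gradient descent stands in for a production local minimizer.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Toy double-well "potential energy" -- a hypothetical stand-in for a
# real molecular force field. Its global minimum is near x = -1.04.
def energy(x):
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    return 4.0 * x * (x * x - 1.0) + 0.3

def local_minimize(x, step=0.01, iters=2000):
    """Plain gradient descent to the nearest local minimum."""
    for _ in range(iters):
        x -= step * gradient(x)
    return x

def multistart(n_starts=32, seed=0):
    """Stochastic multistart: run many independent local searches
    in parallel and keep the lowest-energy conformation found."""
    rng = random.Random(seed)
    starts = [rng.uniform(-2.0, 2.0) for _ in range(n_starts)]
    # Threads here for brevity; production codes distribute the
    # starts over processes or MPI ranks on a parallel machine.
    with ThreadPoolExecutor(max_workers=8) as pool:
        minima = list(pool.map(local_minimize, starts))
    return min(minima, key=energy)

best = multistart()
print(best, energy(best))
```

The independent searches share nothing, which is why this class of methods parallelizes so naturally.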

Another example is the molecular dynamics simulation approach, which studies the
structural changes of molecules over a certain time period. The structural changes are
governed by Newton's second law of motion and are described by a system of ordinary
differential equations, each corresponding to the movement of one of the atoms in the
molecule. Given an initial condition, the system of equations can be solved by
numerically following the solution trajectory with very small time steps. Millions of
time steps may be required even for a nanosecond of simulated time. With current
technology, simulations can only reach several hundred nanoseconds, so much of the
long-time dynamics remains out of reach. For example, protein folding may take several
hundred milliseconds to several seconds; the protein-folding problem would effectively
be solved if simulation on that scale were feasible. Parallel high-performance computing
has been used to speed up molecular dynamics simulation. For example, Duan and
Kollman at UCSF were able to simulate the folding of a small protein with 36 amino
acids on a microsecond scale on a Cray T3E, which has been considered a major
computational breakthrough in structural molecular biology. The result was published in
Science. In a CRPC-related activity in this area, McCammon and Scott at the University
of Houston parallelized molecular dynamics simulation with particle and spatial
decomposition. Two parallel packages, UHGromos and EulerGromos, were developed as a
result of this effort. Details about the work can be found in CRPC
report CRPC-TR93356.
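The core of any such simulation is the time-stepping loop. Below is a minimal sketch of the standard velocity-Verlet integrator applied to a single harmonic "bond"; the force law, masses, and units are invented toy values, and a real MD code would evaluate forces for every atom and distribute that work across processors, by particles or by space, as in the packages above.

```python
# Velocity-Verlet integration of one harmonic degree of freedom --
# the same scheme MD codes apply to every atom, millions of tiny
# time steps at a time. K and M are arbitrary toy parameters.
K = 1.0      # bond force constant (assumed)
M = 1.0      # atom mass (assumed)

def force(x):
    return -K * x           # F = -dV/dx for V = K x^2 / 2

def simulate(x, v, dt=0.01, steps=10_000):
    f = force(x)
    for _ in range(steps):
        x += v * dt + 0.5 * (f / M) * dt * dt    # position update
        f_new = force(x)
        v += 0.5 * (f + f_new) / M * dt          # velocity update
        f = f_new
    return x, v

x, v = simulate(x=1.0, v=0.0)
total_energy = 0.5 * M * v * v + 0.5 * K * x * x
print(x, v, total_energy)   # energy stays near the initial 0.5
```

The integrator's good long-time energy conservation is what makes such small-step trajectories trustworthy at all; the cost is simply the enormous number of steps.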

A final example in computational structural biology is X-ray crystallography computing
for solving protein crystal structures. X-ray crystallography is by far the most
successful approach to structure determination, responsible for more than 80% of the
protein structures solved to date. It relies on obtaining protein crystals, producing
their X-ray diffraction images, and then deriving the structures from the images. The
last step requires a great deal of computation: processing the images, determining the
phases of the diffraction patterns, and computing the electron density map for the
crystal. One of the most difficult and computationally intensive parts is the solution
of the phase problem. The electron density distribution of a crystal can be expanded as
a Fourier series with a set of complex coefficients called structure factors. The
amplitudes of the structure factors can be obtained from the X-ray diffraction data,
but the phases cannot. The phase problem is to determine the phases of the structure
factors given their amplitudes from X-ray diffraction experiments. The problem is
difficult to solve, requires intensive computation, and has been recognized as one of
the grand challenges in the DOE's recent research initiatives in computational science.
Research efforts to use supercomputers to provide faster and more reliable solutions to
the phase problem have been undertaken at various labs and institutions, such as Argonne
National Laboratory and the Hauptman-Woodward Institute.
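The relationship can be made concrete with a small one-dimensional toy example (not crystallographic software): a discrete "density" is expanded into structure factors, and Fourier synthesis recovers it exactly with the correct phases but fails when only the measured amplitudes are kept.

```python
import cmath

# 1-D toy illustration of the phase problem. The "crystal" density
# rho is expanded in a Fourier series whose complex coefficients F_h
# are the structure factors; experiment gives |F_h| but not arg(F_h).
N = 16
rho = [1.0 if j in (3, 4, 11) else 0.0 for j in range(N)]  # toy density

def structure_factors(density):
    return [sum(density[j] * cmath.exp(-2j * cmath.pi * h * j / N)
                for j in range(N)) for h in range(N)]

def density_from(factors):
    # Fourier synthesis: inverse transform of the structure factors.
    return [sum(factors[h] * cmath.exp(2j * cmath.pi * h * j / N)
                for h in range(N)).real / N for j in range(N)]

F = structure_factors(rho)
recovered = density_from(F)              # correct phases: exact recovery
guessed = density_from([abs(f) for f in F])  # amplitudes only, phases zeroed
print(max(abs(a - b) for a, b in zip(rho, recovered)))  # ~0
print(max(abs(a - b) for a, b in zip(rho, guessed)))    # clearly wrong
```

Real phasing methods must recover those missing phases indirectly, over hundreds of thousands of reflections rather than sixteen points, which is where the heavy computation lies.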
  

[Attachment: pcca.txt]

Applications of Parallel High Performance Computing in Car Crash Simulation
---------------------------------------------------------------------------

Crash analysis is an important part of the automobile design process. Because physical
crash tests are costly and never fully adequate, crashes are often studied through
simulation. Several crash codes, such as LSDYNA, have been developed over the years
using explicit finite-element models. A typical model consists of irregular grids of
nodes constrained by elements. Each node has a physical state, including position and
velocity, which must be computed in every time step. Each element is parameterized by
material and geometric type and constrains a fixed number of nodes to behave as if they
were connected by that piece of material. An automobile may be represented by tens of
thousands of elements with several times as many nodes. Simulating such a system
requires a great deal of computation; for example, simulating several seconds of a car
crash may take several weeks of workstation time.
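A minimal sketch of such an explicit scheme is given below, for a one-dimensional chain of nodes connected by bar "elements". All material values are invented toy numbers, and a production code like LSDYNA handles millions of degrees of freedom, many element types, and contact.

```python
# Explicit finite-element sketch in the spirit of the model above:
# nodes carry position and velocity, elements connect fixed sets of
# nodes and generate internal forces each step. Toy values throughout.
N_NODES = 5
pos = [float(i) for i in range(N_NODES)]        # rest spacing 1.0
vel = [0.0] * N_NODES
vel[0] = 0.5                                    # node 0 "impacts" the chain
mass = [1.0] * N_NODES
elements = [(i, i + 1, 10.0) for i in range(N_NODES - 1)]  # (n1, n2, stiffness)

def step(dt):
    force = [0.0] * N_NODES
    # Element loop: each element constrains its two nodes.
    for n1, n2, k in elements:
        stretch = (pos[n2] - pos[n1]) - 1.0     # deviation from rest length
        f = k * stretch
        force[n1] += f
        force[n2] -= f
    # Explicit update of every node's state for one time step.
    for n in range(N_NODES):
        vel[n] += dt * force[n] / mass[n]
        pos[n] += dt * vel[n]

for _ in range(1000):
    step(dt=0.005)   # explicit codes need small steps for stability
print(pos)
```

The two inner loops, over elements and over nodes, are exactly the loops a parallel crash code must distribute across processors.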

Crash simulation is an interesting application for parallel high-performance computing.
First, it has great industrial value and a potential market: car companies would pay
millions of dollars for machines fast enough to make realistic simulations affordable.
Second, crash simulation is a complex process; it is not easy to parallelize and is hard
to scale. For this reason, crash codes such as LSDYNA have long been used as benchmarks
for measuring the performance of parallel architectures. LSDYNA, for example, has
versions for all types of architectures: serial, vector, and parallel with shared or
distributed memory. The performance of the code has been used in industry to justify
purchases of new machines.

The main reasons that crash simulation is hard to parallelize are the following. First,
automobiles are built from a wide variety of materials. When a piece of material
deforms, the underlying physical model changes, and a different computation is required
to achieve accurate results. For example, metal initially behaves elastically as it is
stretched, but at some point it becomes brittle. For efficiency, a typical crash code
groups elements by material type so that instructions can be applied to sets or vectors
of elements, and the computation is organized as a series of vector loops. However, if
there is great variation in the sizes and computational requirements of the groups, it
may be difficult to distribute and balance the workload on multiprocessors. Second, no
matter what strategy is used to distribute the elements or nodes over the processors,
the structures will change over time as they deform. In the presence of contact, the
deformation alters the structures and their interactions; new elements or nodes may be
introduced while others are deleted. These dynamic changes make it difficult to maintain
load balance among the processors. Redistributing elements and nodes, or dynamic load
balancing, may help to improve performance, but the overhead of doing so can be
computationally prohibitive. Third, when a crash occurs, there are many contacts between
different pieces of material or parts. Contact detection and simulation are complicated,
often irregular and random, and the computation is hard to organize on parallel
multiprocessors in a scalable way. In realistic crash simulations, more than 20% of the
computation may involve contact or impact. There are other issues in parallel crash
simulation as well, such as the accuracy and repeatability of results as the number of
processors varies. In any case, parallel crash simulation remains one of the great
challenges for today's parallel high-performance computing.
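The load-balancing difficulty can be illustrated with a toy sketch (the group sizes below are hypothetical, not real crash data): even a sensible greedy assignment of material groups to processors leaves the load badly skewed when one group dominates.

```python
# Hypothetical element groups keyed by material type, with very
# different sizes -- the situation that frustrates load balancing.
groups = {"steel": 42_000, "aluminum": 9_000, "glass": 1_500,
          "rubber": 800, "foam": 300}
N_PROCS = 4

# Greedy heuristic: assign the largest remaining group to the
# least-loaded processor. Real crash codes must also rebalance as
# elements deform, appear, and disappear during contact.
loads = [0] * N_PROCS
assignment = {}
for material, n_elems in sorted(groups.items(), key=lambda kv: -kv[1]):
    p = min(range(N_PROCS), key=loads.__getitem__)
    assignment[material] = p
    loads[p] += n_elems

print(loads)   # one huge group pins a processor; the rest sit idle
```

Splitting the dominant group across processors restores balance but breaks the long vector loops that make the serial code fast, which is precisely the tension described above.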

Two relevant web pages:

http://www.gwuva.gwu.edu/ncac
http://www.lstc.com
      
