Since fmd is derived from namd, Version 1.4, the following holds:
The FMD code is being developed under the CCM-4 CHSSI effort of the DOD High Performance Computing Program. This effort supports the Computational Chemistry and Materials Science (CCM) computational technology area. A partial list of participants in the effort includes: Dr. Ruth Pachter, Dr. James A. Lupo, Dr. Alan M. Mckenney, Dr. Soumya Patnaik and Dr. Zhiqiang Wang (AFRL/MLPJ); Dr. Mark Gordon (Iowa State University); Dr. Lennart Johnsson (University of Houston); Dr. Sarm Krimm (University of Michigan); Dr. Betsy Rice and William Mattson (ARL); and Dr. Greg Voth (University of Utah).
FMD is derived, in part, from the source code of the NAMD program, and is redistributed under the modification and redistribution terms of the NAMD license agreement. The NAMD license agreement states:
Copyright (C) 1995-96 The Board of Trustees of the University of Illinois. All rights reserved.
NOTICE: The program NAMD is *not* in the public domain. However, it is freely available without fee for education, research, and non-profit purposes. By obtaining copies of this and other files that comprise the NAMD program, you, the Licensee, agree to abide by the following conditions and understandings with respect to the copyrighted software:
M. Nelson, W. Humphrey, A. Gursoy, A. Dalke, L. Kale, R. Skeel and K. Schulten, Intl. J. Supercomput. Applics. High Performance Computing, Vol 10, #4, pp.251-268, 1996.
The fast Fourier transform routines, comprising the source code contained in the FFT subdirectory, are from the double precision version of FFTPack as distributed via Netlib. It is apparently Version 4, dated 1985, as authored by Paul N. Swarztrauber of the National Center for Atmospheric Research, Boulder, Colorado. NCAR is sponsored by the National Science Foundation. No restrictions were found in the Netlib distributions of either the single or double precision versions.
This should be considered a working document, and as such should be associated with the specific version of FMD it is released with. It need not describe deprecated features in previous versions, nor correctly describe those in future versions. The information contained within represents only part of the overall FMD effort. It has been adapted, in part, from the documentation released with NAMD 1.4.
The program fmd is a parallel molecular dynamics program. It is designed for high-performance simulations in material science. A general structure based on a small set of generic objects for communication between processors, calculation of atomic forces, and integration of the equations of motion allows fmd to adopt different algorithms for these aspects of the molecular dynamics calculation.
The program uses a spatial decomposition scheme to distribute atoms among processors. The model is broken down into uniform cubes of space referred to as patches. A set of patches is assigned to each processor. Patches are dynamically redistributed during the simulation to achieve approximate load balance. The program is message driven, meaning that the order of computation is determined by the arrival of messages indicating that the data necessary for a given computation is available. Scheduling tasks in this manner allows for the greatest possible overlap of communication and computation. fmd uses the same force field and input files as the program xplor. It will produce trajectory files in DCD and HDF format, and is able to communicate with the DICE visualization environment.
This document is the Programmer's Guide to fmd, and complements the User's Guide. It is intended to provide a detailed guide to exactly what fmd does, how it does it, and why it does it a particular way. The document is divided into several chapters. Chapter 2 details exactly what fmd computes. It includes details of and derivations for the CHARMM/X-PLOR force field and the integration scheme used. Chapter 3 outlines the design considerations that went into fmd. It covers the overall design of the program, including details of its multi-threaded, message-driven operation and its related spatial decomposition scheme. Chapter 4 discusses the basics of working with fmd, including how to set up a working directory for the program, the coding conventions used, and the details of the user input file used to run fmd. The last section of the guide is the longest and most detailed. This section provides the implementation details of fmd, including a detailed description of every C++ class used in the program. Armed with this information, a user should be able to understand how fmd works, and modify it to test new algorithms and methods.
This section details exactly what fmd calculates during a simulation. The first section describes the various components of the force field used, the second describes the integration method, and the third section describes the units and constant values used by the program.
The force field used by fmd is compatible with the force fields used by the programs xplor (see section References) and charmm (see section References). The force field includes energy terms for electrostatic and van der Waals interactions, linear bonds, angular bonds, dihedral bonds, improper bonds, hydrogen bonds, and constraints. The total energy function used by fmd can be expressed as:
E(total) = E(elec) + E(vdw) + E(bond) + E(angle) + E(d/i) + E(constraint)
The following sections describe the individual energy terms in detail, as well as how the forces associated with each term are determined.
The electrostatic interaction is determined by the electrical charges on a pair of atoms and is described by Coulomb's Law: (see Postscript version for equation) where
Epsilon_14 = scaling factor for 1-4 interactions. This is equal to 1.0 for any interaction other than a modified 1-4 interaction. The value used for modified 1-4 interactions is specified in the user input file using the 1-4scaling parameter.
C = 2.31e-19 J nm
q_i, q_j = charges on atoms i and j, as specified in the .psf file.
Epsilon_0 = dielectric constant specified by the user input parameter dielectric.
(see Postscript version for equation)
(see Postscript version for equation)
By differentiating the formula for Coulomb's Law, we arrive at an equation for the force due to the electrostatic interaction: (see Postscript version for equation). where
(see Postscript version for equation).
The van der Waals interactions describe the forces resulting from local interactions of atoms. The van der Waals energy between two atoms i and j is given by: (see Postscript version for equation)
The constants A and B are specified for any pair of atoms in the parameter file using an NBFix entry. With this command, values of A and B for normal interactions and modified 1-4 interactions are specified explicitly. If an NBFix entry is not found for an atom pair, the constants are calculated using the parameters Sigma_ij and Epsilon_ij as read from the parameter file, using the equations: (see printed manual for equations)
Here Sigma_ij and Epsilon_ij are calculated from the Sigma and Epsilon values specified for the atom types of atoms i and j by the NBOnd data records in the parameter file using the equations:
where
Epsilon_i, Epsilon_j = Epsilon values for atoms i and j.
The values of Epsilon and Sigma for a given pair can be related to the minimum energy and distance, E_min and R_min by the equations: (see Postscript version for equations)
For modified 1-4 interactions, the Sigma_ij and Epsilon_ij are calculated as above, except that the Sigma_i^14 and Epsilon_i^14 entries from NBOnd records in the parameter file are used.
The force for the van der Waals interactions is obtained by differentiating the energy function:
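The van der Waals term can be sketched in C++ as follows. The sketch assumes a 12-6 energy of the form E = A/r^12 - B/r^6 and the combination rules Sigma_ij = (Sigma_i + Sigma_j)/2, Epsilon_ij = sqrt(Epsilon_i * Epsilon_j), A = 4 * Epsilon_ij * Sigma_ij^12, and B = 4 * Epsilon_ij * Sigma_ij^6. These are widely used conventions, but they are assumptions here because the equations are only in the printed manual. All names are hypothetical.

```cpp
#include <cmath>

// Hypothetical sketch: 12-6 van der Waals energy E = A/r^12 - B/r^6.
struct LJConstants { double A; double B; };

// Assumed combination rules (not confirmed by this text):
// sigma_ij = (sigma_i + sigma_j) / 2, eps_ij = sqrt(eps_i * eps_j),
// A = 4 * eps_ij * sigma_ij^12, B = 4 * eps_ij * sigma_ij^6.
LJConstants combine(double sigmaI, double epsI, double sigmaJ, double epsJ)
{
    const double sigma = 0.5 * (sigmaI + sigmaJ);
    const double eps = std::sqrt(epsI * epsJ);
    const double s6 = std::pow(sigma, 6);
    return LJConstants{4.0 * eps * s6 * s6, 4.0 * eps * s6};
}

double vdwEnergy(const LJConstants& c, double r)
{
    const double r6 = std::pow(r, 6);
    return c.A / (r6 * r6) - c.B / r6;
}

// Radial force magnitude -dE/dr; a positive value is repulsive.
double vdwForce(const LJConstants& c, double r)
{
    const double r6 = std::pow(r, 6);
    return (12.0 * c.A / (r6 * r6) - 6.0 * c.B / r6) / r;
}
```

Under these assumptions the energy minimum lies at r = 2^(1/6) * Sigma_ij with depth -Epsilon_ij, which is one way to relate Sigma and Epsilon to R_min and E_min.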
Certain pairs of atoms are excluded from electrostatic and van der Waals calculations because of the bonded interactions. The rules to apply in choosing bonded exclusions are specified by the user input parameter exclude. The choices for exclusion policy are none, 1-2, 1-3, 1-4, and scaled1-4. With none, no atom pairs are excluded and all interactions are calculated. With 1-2, only atoms that are connected via a linear bond are excluded. The 1-3 policy excludes pairs of atoms that are connected via a linear bond to a common third atom, as well as those excluded by the 1-2 policy. The 1-4 policy excludes atoms that are connected via three bonds, as well as those covered by the 1-2 and 1-3 policies. The scaled1-4 policy is similar to the 1-4 policy, except that it applies a scaling factor to the forces between atoms connected by three bonds instead of excluding them.
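The exclusion rules can be illustrated with a small C++ sketch that measures how many bonds separate two atoms and then maps that separation to a factor for the nonbonded pair. This is only an illustration of the policy rules as described above. The type and function names are hypothetical, and the single factor for scaled1-4 is a simplification, since the text describes separate 1-4 handling for the electrostatic and van der Waals terms.

```cpp
#include <queue>
#include <vector>

// Hypothetical names throughout; a sketch of the policy rules only.
enum class ExcludePolicy { None, OneTwo, OneThree, OneFour, ScaledOneFour };

// Number of bonds on the shortest path from atom a to atom b (breadth-first
// search over the bond graph), or -1 if b is not reachable within maxBonds.
int bondSeparation(const std::vector<std::vector<int>>& bonded,
                   int a, int b, int maxBonds = 3)
{
    std::vector<int> dist(bonded.size(), -1);
    std::queue<int> q;
    dist[a] = 0;
    q.push(a);
    while (!q.empty()) {
        const int u = q.front();
        q.pop();
        if (u == b) return dist[u];
        if (dist[u] == maxBonds) continue;
        for (int v : bonded[u])
            if (dist[v] < 0) { dist[v] = dist[u] + 1; q.push(v); }
    }
    return -1;
}

// Factor applied to a nonbonded pair: 0 = excluded, 1 = full interaction,
// scale14 = modified 1-4 interaction under the scaled1-4 policy.
double pairFactor(ExcludePolicy p, int sep, double scale14)
{
    if (sep < 1 || p == ExcludePolicy::None) return 1.0;
    if (sep == 1) return 0.0;
    if (sep == 2) return p == ExcludePolicy::OneTwo ? 1.0 : 0.0;
    if (sep == 3) {
        if (p == ExcludePolicy::OneFour) return 0.0;
        if (p == ExcludePolicy::ScaledOneFour) return scale14;
    }
    return 1.0;
}
```

For a linear chain of five atoms, the end atoms are four bonds apart, so `bondSeparation` reports -1 with the default limit and the pair interacts fully under every policy.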
Truncation of the electrostatic and van der Waals forces is handled in one of two ways by fmd. The default way is to simply set the force and energy to zero (truncate) if the interaction distance between two atoms is greater than that specified by the user input parameter keyword cutoff. If the interaction distance is less than cutoff, the forces are calculated as usual. This method leads to discontinuity in the force field, however. As the interaction distance between two atoms changes from greater than to less than cutoff, the electrostatic and van der Waals energies suddenly jump from 0 to some finite value.
The other means of dealing with truncation in fmd is via switching functions. These functions are designed to bring the forces and energies smoothly to zero at the cutoff distance over some user specified interval. This reduces the discontinuity effects introduced by truncation, but does not fully eliminate them. The switching functions used are equivalent to those used in the xplor program when the VSWitch and SHIft options are specified. In fmd, switching is controlled by the user input parameter switching.
Different switching functions are used for the electrostatic and van der Waals interactions. For the electrostatic interaction, the energy function is modified to be: (see equations in printed manual)
For van der Waals interactions, the energy function is modified to be:
(see Postscript version for equations)
where SW is the switching function defined as:
(see Postscript version for equations)
where R_on is a constant set by the user input parameter switchdist, and R_off is specified by the parameter cutoff.
Since the energy functions are modified by multiplying by a function of r, the forces applied are also affected. The electrostatic force under switching becomes: (see Postscript version for equations)
Similarly, the van der Waals forces are modified to be: (see Postscript version for equations)
Bonds describe the connection between two atoms. The bonds are modeled as springs, so the energy between any two atoms i and j is given by: (see Postscript version for equations) where k is the spring constant and r_o is the rest distance. These constants are read from the parameter file.
The force for a bond is found by differentiating the equation, giving: (see Postscript version for equations)
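As an illustration of the bond term, the C++ sketch below computes the energy and the force on atom i for one bond. It assumes E = k * (r - r_o)^2 with no factor of 1/2, as in the CHARMM convention. The exact form used by fmd is given only in the Postscript version, so treat this as an assumption. The function name is hypothetical.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Hypothetical sketch assuming E = k * (r - r0)^2 (no factor of 1/2).
// Returns the energy and writes the force on atom i; atom j gets -fi.
double bondEnergyForce(const Vec3& ri, const Vec3& rj,
                       double k, double r0, Vec3& fi)
{
    Vec3 d{ri[0] - rj[0], ri[1] - rj[1], ri[2] - rj[2]};
    const double r = std::sqrt(d[0] * d[0] + d[1] * d[1] + d[2] * d[2]);
    const double dEdr = 2.0 * k * (r - r0);
    // F_i = -dE/dr * (r_i - r_j) / r; guard against coincident atoms.
    for (int a = 0; a < 3; ++a)
        fi[a] = (r > 0.0) ? -dEdr * d[a] / r : 0.0;
    return k * (r - r0) * (r - r0);
}
```

A stretched bond (r greater than r_o) yields a force on atom i that points toward atom j, as expected for a spring.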
Angles describe the interaction of two atoms connected to a common third atom. The two atoms are modeled by a spring dependent on the angle between them. If the atoms are identified by i, j, and k, with atoms i and k connected to atom j, then the energy is given by: (see Postscript version for equations) where (see Postscript version for equations) are the spring constant and rest angle, respectively. These constants are read from the force field parameter file, and must be specified for each unique interacting pair. The angle, Theta_ijk is calculated from the equation: (see Postscript version for equations)
In order to determine the forces acting on the three atoms involved in an angular bond, the gradient of the energy function must be determined. The derivation begins with the following: (see Postscript version for equations) For convenience, let (see Postscript version for equations) Then the reader can easily show that: (see Postscript version for equations)
Dihedral and improper dihedral angles model the interactions between 4 bonded atoms. They are modeled as a torsional spring between the planes formed by the first three atoms and the second three atoms. The energy for a dihedral or improper dihedral angle between atoms i, j, k, and l is given by: (see Postscript version for equations) where k is the torsion spring constant, Phi is the angle between the plane formed by atoms i, j, and k, and the plane formed by atoms j, k, and l; n is the periodicity constant for the bond specified; and Delta is the bond's phase shift constant. These constants are all specified in the force field parameter file. The angle is calculated from: (see Postscript version for equations)
To determine the force, the negative gradient of the energy must be found. It can be shown that: (see Postscript version for equations) where (see Postscript version for equations)
Using the formula given for Phi above, the gradient is given by: (see Postscript version for equations) However, this can lead to a singularity if sin(Phi) goes to 0. Therefore, following the method used by xplor, if sin(Phi) is nearly 0, then a third vector is defined by: (see Postscript version for equations) and given a new angle, sin(Psi) between (see Postscript version for equations) then the following can be shown to hold: (see Postscript version for equations)
By expressing the vectors in terms of the coordinates of atoms i, j, k, and l, the forces can be expressed in one of two ways. In terms of the first formulation using vectors (see Postscript version for equations) the forces are given by: (see Postscript version for equations)
The second formulation is in terms of the vectors (see Postscript version for equations)
This formulation gives the forces as: (see Postscript version for equations)
Harmonic constraints provide a mechanism for holding certain parts of a molecule relatively immobile during a simulation. Specified atoms are allowed to move about a reference position under a harmonic force law. The constraint energy is given by: (see Postscript version for equations) where (see Postscript version for equations)
Once again, differentiation gives the force due to the constraint: (see Postscript version for equations)
The reference positions, force constants, exponent, and atoms which are to be constrained are defined by the user using several input keywords. The keywords of interest include constraints, consexp, conskfile, conskcol, and consref.
The spherical boundary conditions in fmd allow a molecule to be confined to a spherical region of space using a harmonic force near the surface of the sphere. One or two constraint functions may be applied. If two constraint functions are applied, and one is negative, a surface tension condition can be simulated. The potential energy equation takes the same form for each potential specified: (see Postscript version for equations) where (see Postscript version for equations)
Differentiating the potential equation gives the force equation:
(see Postscript version for equations)
A positive force constant will cause a force that moves atoms back towards the center of the sphere, and a negative force constant will force atoms away from the center of the sphere. Exponent values of 2 or 4 are the only ones considered reasonable. The combination of a negative force constant with an exponent of 2 and a positive force constant with an exponent of 4, with suitably chosen values, can create a potential well near the surface of the sphere, simulating a surface tension effect.
The sphere may be automatically centered about the center of mass of the molecule, or may be specified by the user. The force constants, radii, and exponents for the potentials are defined by user input. The keywords of interest include sphericalbc, sphericalbck1, sphericalbcr1, sphericalbcexp1, sphericalbck2, sphericalbcr2, sphericalbcexp2, and sphericalbccenter.
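A single spherical boundary potential can be sketched in C++ as shown below. The sketch assumes that the potential acts only on atoms outside a given radius and has the form E = k * (d - R)^exp, where d is the distance from the sphere center. This form is an assumption, because the equation is only in the Postscript version. The function name is hypothetical. The parameters correspond loosely to the sphericalbck1, sphericalbcr1, and sphericalbcexp1 keywords.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Hypothetical sketch of one spherical boundary potential, assuming
// E = k * (d - radius)^exponent for d > radius and E = 0 otherwise,
// where d is the distance of the atom from the sphere center.
double sphericalBC(const Vec3& pos, const Vec3& center,
                   double k, double radius, int exponent, Vec3& force)
{
    force = {0.0, 0.0, 0.0};
    Vec3 d{pos[0] - center[0], pos[1] - center[1], pos[2] - center[2]};
    const double dist = std::sqrt(d[0] * d[0] + d[1] * d[1] + d[2] * d[2]);
    if (dist <= radius) return 0.0;
    const double x = dist - radius;
    // dE/dd = k * exponent * x^(exponent - 1); the force is -dE/dd along
    // d / dist, so a positive k pushes the atom back toward the center.
    const double dEdd = k * exponent * std::pow(x, exponent - 1);
    for (int a = 0; a < 3; ++a) force[a] = -dEdd * d[a] / dist;
    return k * std::pow(x, exponent);
}
```

Calling the function twice, once with a negative k and exponent 2 and once with a positive k and exponent 4, and summing the results would give the two-potential arrangement described above.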
Integration of the equations of motion uses the velocity form of the Verlet, or leapfrog, method. Beginning at time step n, and given the mass (M), position (X), velocity (V), and force (F) acting on each atom, the following equations are used to obtain values for the next time step: (see Postscript version for equations)
While this is the most natural way to state the method, this is not actually the order things are performed in fmd. Instead, a normal time step will center velocities in time as follows: (see Postscript version for equations)
In order to keep the interface with the rest of the program more natural, the velocities at the half time step intervals will be stored internally in the Integrate object, and the velocities used by the rest of the program will be the velocities at each time step boundary.
There are slightly different formulations of the Verlet algorithm that rely on centering the positions in time, but it is felt such formulations suffer from a higher degree of round-off error than the velocity formulation.
A fairly simple depth-first search is used to perform energy minimization. This method uses a modified form of the Verlet algorithm described in the Integration section above. The algorithm is modified in two ways. The first is that the velocity of the particles is set to zero after each time step. This means that the direction of movement of each particle during an integration step will always be in the direction of the gradient, thus creating a depth-first search. The second modification is to place a bound on the movement of an atom during any time step, since a structure with very high potential energies may experience very rapid movement of its atoms during the beginning of minimization. The amount of movement per time step can be specified with the user input keyword maximummove. If no value is specified, a default value will be assumed as given by:
(0.75 x pairlistdist) / stepspercycle
which will ensure that no atom can move more than 3/4 of the pairlist distance during a cycle. If greater accuracy in the energy calculation is desired, then maximummove should be set to a value smaller than this.
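The default bound and its effect on a single displacement can be sketched in C++ as follows. The default formula is the one given above. Whether fmd limits the length of the displacement vector or each component separately is not stated, so limiting the vector length is an assumption, and both function names are hypothetical.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Default bound on per-step movement: 0.75 * pairlistdist / stepspercycle.
double defaultMaximumMove(double pairlistDist, int stepsPerCycle)
{
    return 0.75 * pairlistDist / stepsPerCycle;
}

// Hypothetical helper: shorten a displacement so that its length is at most
// maxMove, keeping its direction.
Vec3 clampMove(const Vec3& step, double maxMove)
{
    const double len = std::sqrt(step[0] * step[0] + step[1] * step[1] +
                                 step[2] * step[2]);
    if (len <= maxMove || len == 0.0) return step;
    const double s = maxMove / len;
    return Vec3{step[0] * s, step[1] * s, step[2] * s};
}
```

With a pairlist distance of 10 Angstroms and 20 steps per cycle, the default bound is 0.375 Angstroms per step, so the total movement over a cycle cannot exceed 7.5 Angstroms.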
Simple Langevin dynamics can be performed with fmd. This consists of adding a random force and subtracting a friction force from each atom during each integration step. The random force is calculated such that the average force is zero, and the standard deviation is: (see Postscript version for equations) where (see Postscript version for equations)
The friction force that is applied is given by: (see Postscript version for equations)
In order to apply these forces, fmd uses the same third-order finite difference approximation in dt as does xplor. This approximation uses the following equations to update the atom positions and velocities: (see Postscript version for equations)
Equilibration to a desired temperature is performed by periodically rescaling the velocities of all particles such that a specified temperature, set by the user input keyword rescaletemp, is achieved. This rescaling is accomplished by first applying the integration scheme to obtain a set of velocities for time step n. These velocities are then rescaled to the desired temperature by multiplying by a factor given by:
(see Postscript version for equations)
where
(see Postscript version for equations)
Temperature rescaling will work with either normal dynamics or with Langevin dynamics. The number of time steps between rescalings is specified using the input keyword rescalefreq.
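Velocity rescaling can be sketched in C++ as below. The sketch assumes the usual factor sqrt(T_target / T_current), with the current temperature computed from the kinetic energy over 3N degrees of freedom. Both are assumptions, since the equations are only in the Postscript version. The function names are hypothetical, and the caller is responsible for passing kB in units consistent with mass times velocity squared.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch. Velocities are stored as flat x,y,z triples and kB
// must be given in units consistent with mass * velocity^2.
double kineticTemperature(const std::vector<double>& mass,
                          const std::vector<double>& vel, double kB)
{
    double twiceKE = 0.0;
    for (std::size_t i = 0; i < mass.size(); ++i)
        for (int a = 0; a < 3; ++a)
            twiceKE += mass[i] * vel[3 * i + a] * vel[3 * i + a];
    return twiceKE / (3.0 * mass.size() * kB);  // assumes 3N degrees of freedom
}

// Scale all velocities by sqrt(T_target / T_current); returns the factor used.
double rescaleVelocities(const std::vector<double>& mass,
                         std::vector<double>& vel, double kB, double targetT)
{
    const double current = kineticTemperature(mass, vel, kB);
    if (current <= 0.0) return 1.0;  // nothing to scale
    const double factor = std::sqrt(targetT / current);
    for (double& v : vel) v *= factor;
    return factor;
}
```

In a simulation loop this would be called once every rescalefreq steps with targetT set from rescaletemp.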
This section presents the units and constants which appear in fmd. Some handy conversion factors are also presented.
Measure       Units
============================================================
Length        Angstroms (1.0e-10 m)
Time          femtoseconds (1.0e-15 s)
Energy        Kilocalories per mole (4184.0 J per mole)
Mass          Atomic mass units (1.660440e-24 g)
Charge        Electron charge
Temperature   Kelvin

Name                  Value
==========================================================
Boltzmann's Constant  1.987191e-3 KCal/(mol degree-K)
Avogadro's Number     6.022045e23 1/mol
Coulomb's Constant    332.0636 (KCal Angstrom)/(mol e^2)
Electron Charge (e)   1.6021892e-19 Coulombs
Pi                    3.1415926535898

1 Kcal     = 4184.0 Joule
1 AMU      = 1.6605655e-27 Kilograms
1 Angstrom = 1e-10 Meters
fmd uses a multi-threaded, message driven design built on top of a spatial decomposition of the molecular system to provide a high performance, scalable parallel application. This section contains a description of all aspects of this design and why it was used.
fmd is a parallel message passing program. It consists of P processes, where P is the number of available processors. The process that resides on logical node 0 is referred to as the master process. It has responsibilities beyond those of the other processes for such things as startup, shutdown, I/O, and other tasks.
Communication between processes is accomplished using a Communicate object that resides in each process. This Communicate object provides an interface between the other objects in fmd and the actual message passing system being used. Thus, fmd can be ported to any message passing system by modifying the Communicate class. Currently, only MPI is supported by fmd. However, namd, the code from which fmd was derived, supports PVM.
The parallelization strategy of fmd is based on spatial decomposition. The space occupied by the model is divided into cubes of space called patches. Each patch has dimensions which are the electrostatic cutoff distance plus a small safety margin. These patches are assigned to physical processors in a many to one relationship. That is, multiple patches may be assigned to any given processor. Thus each process can be thought of as being a pseudo multi-threaded process, where each thread of control is a patch. As the simulation progresses, the number of patches assigned to each processor and the location of each patch will be dynamically adjusted to provide load balancing.
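The mapping from an atom position to a patch can be sketched in C++ as shown below. The sketch assumes a uniform grid anchored at a given origin with a cube edge equal to the cutoff plus a safety margin. The function names, the origin parameter, and the flattening order of the patch ID are all hypothetical.

```cpp
#include <array>
#include <cmath>

// Hypothetical sketch: map a position to integer patch coordinates on a
// uniform grid whose cubes have edge length cutoff + margin.
std::array<int, 3> patchIndex(const std::array<double, 3>& pos,
                              const std::array<double, 3>& origin,
                              double cutoff, double margin)
{
    const double edge = cutoff + margin;
    std::array<int, 3> idx{};
    for (int a = 0; a < 3; ++a)
        idx[a] = static_cast<int>(std::floor((pos[a] - origin[a]) / edge));
    return idx;
}

// Flatten patch coordinates into a single patch ID for a grid that is
// nx patches wide in x and ny patches wide in y.
int patchId(const std::array<int, 3>& idx, int nx, int ny)
{
    return idx[0] + nx * (idx[1] + ny * idx[2]);
}
```

Because each patch edge is at least the cutoff distance, every atom within the cutoff of a given atom lies either in the same patch or in one of its immediate neighbors.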
The mapping of patches to processors will be managed by a PatchDistrib object located on each processor. When a patch wants to send a message, it will first query the PatchDistrib object to determine the processor that the destination patch resides on. The patch will then send this message to the Communicate object with both the appropriate patch and processor ID. Also, each message will have a tag specifying the purpose of the message and a time step identifier, to ensure that if one patch gets ahead of other patches, messages will not be incorrectly used for the wrong time step.
Each patch will be responsible for the atoms within its region of space. This means that it is responsible for storing the current position and velocity of each of these atoms, as well as gathering/computing all of the forces necessary to perform integration during each time step.
In order to gather and calculate the necessary forces, each patch will have to communicate with its neighboring patches. This communication will take place in cycles. Any patch reassignment occurs at the beginning of a cycle. At this time, any explicit calculation of long range forces is also performed. During the first time step after this reassignment, each patch will be responsible for again determining the interactions that it has with each neighboring patch.
Each patch is responsible for calculating all local electrostatic, van der Waals, and bonded interactions. These are interactions which involve only local atoms and therefore don't require any communication of coordinates. Each patch is also responsible for calculating the bonded forces for any bonds which contain any local atoms. This requires the atom coordinates from neighboring patches for any atoms involved in bonds with local atoms. This scheme also involves the duplicate calculation of these forces rather than the use of Newton's Third Law to reduce the computation. Due to the small number of these interactions and the sizeable complexity in dealing with all the possible combinations of atoms, bonds, and patches (consider a dihedral bond with each of its four atoms in a different patch), it is thought that this duplicate calculation does not result in significant overhead. Lastly, each patch is responsible for calculating the electrostatic and van der Waals forces between its local atoms and the atoms from some of its neighboring patches. These forces are then communicated back to the neighboring patches to exploit Newton's Third Law. The electrostatic forces between the local atoms and the rest of its neighboring patches will be calculated by the neighboring patches and the resulting forces will be sent back by those patches.
The PatchDistrib class will be responsible for maintaining a list of neighboring patches for each patch, and also assigning the communication direction between each pair of patches. Therefore, at the beginning of each time step, each patch will receive two lists of neighbors from the PatchDistrib object. One list will contain the neighboring patches that it will receive all atom coordinates from and the other will contain the neighboring patches that it will send all atom coordinates to. From the patches that the patch sends all of its atom coordinates to, it will receive coordinates for atoms participating in bonded interactions with its atoms. The determination of the direction of the communication between patches is part of the load balancing algorithm. Figure 1 shows an example of patch assignments.
To overlap communication and computation as efficiently as possible, every calculation, including local interactions, will be triggered by an incoming message. To ensure that messages are handled in the most efficient manner, the messages will be prioritized. The highest priority messages will be those containing coordinates for electrostatic calculations from other nodes. These messages will require calculation, and then an interprocessor message. The next most important are electrostatic coordinates from a local patch, since they will require a local communication. Next will be messages containing bonded coordinates, since they will require calculation, but no return messages. The messages containing forces calculated by a neighbor patch will be next, since these messages will again require calculation with no communication. The least important messages will be the messages sent from a patch to itself to trigger either local force calculation or integration, since these will require only local computation, without even the receipt of incoming coordinates.
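The priority ordering described above can be illustrated with a standard C++ priority queue. This is a minimal sketch, not the fmd implementation. The `MsgKind` enumeration, the `Message` struct, and the tie-breaking rule on the time step are hypothetical choices made for the example.

```cpp
#include <queue>
#include <vector>

// Hypothetical message kinds, listed from highest to lowest priority.
enum class MsgKind {
    RemoteElectrostaticCoords = 0,  // calculation plus an interprocessor reply
    LocalElectrostaticCoords,       // calculation plus a local reply
    BondedCoords,                   // calculation, no reply
    NeighborForces,                 // calculation, no communication
    SelfTrigger                     // local force calculation or integration
};

struct Message {
    MsgKind kind;
    int patchId;
    int timeStep;
};

// Orders the queue so that the smallest enum value (highest priority) is on
// top; ties are broken in favor of the earlier time step (an assumption).
struct LowerPriority {
    bool operator()(const Message& a, const Message& b) const
    {
        if (a.kind != b.kind)
            return static_cast<int>(a.kind) > static_cast<int>(b.kind);
        return a.timeStep > b.timeStep;
    }
};

using MessageQueue =
    std::priority_queue<Message, std::vector<Message>, LowerPriority>;
```

A processor would push every received message into such a queue and always handle the message returned by `top()`, so that work which unblocks another processor is done first.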
Figure 1: Example of Patch Assignments to Processors
As an example, consider the two dimensional patches shown in Figure 2. Patch 5 would send all of its atom coordinates to some of its neighbors, say Patches 2, 3, 6, and 9. It would receive all of the coordinates from Patches 1, 4, 7, and 8. Based on the atoms received from Patches 1, 4, 7, and 8, Patch 5 would calculate the atoms that it needed to send to these patches so that they could compute their bonded interactions, and send these coordinates every time step. So during each time step, Patch 5 would send out atom coordinates to all 8 neighbors. To Patches 2, 3, 6, and 9, it would send all of its atom coordinates. To Patches 1, 4, 7, and 8, it would only send those coordinates necessary for bonded interactions. Patch 5 would expect to receive atom coordinates from all 8 neighbors. From Patches 2, 3, 6, and 9, it would only receive the coordinates necessary for bonded interactions. These coordinates would be used for force calculations and then discarded. No force messages would be sent back to these patches. From Patches 1, 4, 7, and 8, Patch 5 would receive all atom coordinates. Bonded, electrostatic, and van der Waals forces would be calculated, and the electrostatic and van der Waals forces would be communicated back to these patches. Patch 5 would also expect to receive a force message containing electrostatic and van der Waals forces from Patches 2, 3, 6, and 9.
Figure 2: Example Patch Communication
The PatchDistrib object will also be responsible for gathering load statistics for each node. Periodically, this data will be used to redistribute patches to maintain load balance among processors. For the first implementation, this reassignment will be done by sending all of the information to the master process which will determine what reassignments should be done. To avoid this becoming a point of synchronization, the data will be gathered during each cycle, but not used until the next cycle. For example, if each cycle consists of 10 time steps, then the data sent during time step 10 will be used to determine the redistribution that will occur at time step 20. If this system is found to be too slow, then the sending of statistics and the passing out of reassignments can be staggered by 1 cycle. For example, if each cycle consists of 10 time steps, then at step 10 statistics would be sent in, and at step 20 a reassignment would be passed back out. At step 30, statistics would be sent in, and at step 40, another reassignment would be sent out. The reassignments will include not just the movement of patches between processors, but also the switch of the communication direction between patches. A simple algorithm for this is as follows: if the compute times of neighboring processors differ by more than 25%, then a patch is exchanged between them. If the difference is less than 25%, then just the direction between patches would be changed until no more directions could be altered, then the patch would be exchanged. Also, there would have to be some logic to avoid moving the last patch off a processor, since this would definitely not help load balancing.
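The simple rebalancing rule can be sketched in C++ as follows. The 25% threshold and the ban on moving the last patch come from the text. How the time difference is normalized (here, relative to the busier processor) and the idea of counting the communication links that can still be flipped are assumptions, and all names are hypothetical.

```cpp
// Hypothetical sketch of the simple rebalancing rule for two neighboring
// processors; the 25% threshold is the one quoted in the text.
enum class Rebalance { None, FlipDirection, MovePatch };

struct ProcLoad {
    double time;         // measured compute time for the last cycle
    int patchCount;      // patches currently assigned
    int flippableLinks;  // patch pairs whose direction can still be flipped
};

// Decide what the busier processor should hand to the lighter one.
Rebalance decide(const ProcLoad& busy, const ProcLoad& light,
                 double threshold = 0.25)
{
    if (busy.time <= 0.0 || busy.time <= light.time) return Rebalance::None;
    // Relative difference, normalized by the busier processor (assumption).
    const double diff = (busy.time - light.time) / busy.time;
    const bool canMove = busy.patchCount > 1;  // never move the last patch
    if (diff > threshold)
        return canMove ? Rebalance::MovePatch : Rebalance::None;
    if (busy.flippableLinks > 0) return Rebalance::FlipDirection;
    return canMove ? Rebalance::MovePatch : Rebalance::None;
}
```

The master process could apply such a rule to each pair of neighboring processors once per cycle, using the statistics gathered during the previous cycle.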
The following is the logic that each processor will follow during a time step:
for ( all patches )
    // This next will send out coordinate messages as well as send
    // a message from each patch to itself telling it to perform
    // the local force calculation.
    patch->send_messages ();

while ( not all patches are done )
{
    receive_msg ( msg );

    // patch->process_msg is the message processing routine
    // for the appropriate patch.  Incoming messages will be one
    // of the following:
    //   * All coordinates
    //   * Bonded coordinates
    //   * Electrostatic and vdw forces
    //   * Local force self message
    //   * Integration self message
    patch->process_msg ( msg );

    // total_energies gets the energy totals from each patch and
    // sends them off to the master process for output.
    total_energies ();

    // Check the time step and send out the positions and/or
    // the velocities to the I/O object for output to the
    // trajectory files.
    if ( timestep matches frequency )
        send_pos_and_vel ();
}
Each node will have the following objects:
PatchList
Molecule
PatchDistrib
I/O Mechanism
SimParameters
Reassignment Object
Each patch will have the following objects or data structures:
Coordinates
Velocities
Forces
Set of Force Objects
pairlist for electrostatics, or a bonded list for bonded interactions. A description of the general interface for these objects is given in the Force Object Interface subsection below.
Integrator
List of Neighboring Patches
Each force object will have a set of interface routines that will basically perform the following functions:
Local Initialization
pairlist for electrostatics and a bond list for bond interactions.
Neighbor Initialization
Local Initialization above, except that the atoms involved in the interactions are now from another processor. Also, for the bonded force object, this routine will have to produce a list of local atoms that need to be sent to the neighboring patches so that they can complete their bond calculations.
Local Force Calculation
Local Initialization has been done, compute the forces due to local interactions.
Neighbor Force Calculation
Neighbor Initialization has been performed for this neighbor, compute the forces due to interactions with this neighbor.
Energy Calculation
Clean Up
fmd is able to calculate all electrostatic interactions. Two methods are provided for:
The full electrostatic calculation, whether using the full direct or a fast method, is organized within fmd on a per-node basis, in contrast to most other calculations in fmd, which are organized on a per-patch basis. In other words, there is one instantiation of the full-electrostatics class on each node. The interface between the patches and the full-electrostatics object on a node is handled as follows:
The FullDirect class uses the fmd messaging system, and expects fmd to call it with incoming messages for it. Fast methods are expected to do their own message-handling, and keep their messages separate from fmd's (e.g., by using different message type IDs.) Moreover, if the spatial decomposition used by the fast method does not happen to coincide with how the atoms are assigned to nodes, it is FMAInterface's responsibility to move the coordinates to and the forces from the appropriate nodes.
Due to the high cost of a single full electrostatics calculation, regardless of what method is used, electrostatic forces are incorporated using a multiple time step integration scheme. In this scheme, short-range electrostatic forces are directly calculated every time step. Every k'th time step, the full electrostatic algorithm is used to compute the remaining, or long-range, electrostatic forces. These values are then added to the short-range forces during each of the next k time steps. While these forces will become more and more inaccurate as k becomes larger, it is felt that the inaccuracies that are introduced are significantly less than those introduced by the discretization to a finite time step, provided the criterion which defines "short-range" is chosen sufficiently large.
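The multiple time step scheme described above can be sketched as follows. This is a minimal illustration, not fmd's integrator: the struct `MtsForceCombiner` and the callbacks `shortRange` and `longRange` are hypothetical names introduced here, and forces are reduced to a flat array of doubles. The first call is assumed to fall on a step that is a multiple of k, so that a long-range result is always cached before it is reused.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical illustration of the multiple time step scheme: the
// long-range force is refreshed every k steps and reused in between.
using Forces = std::vector<double>;

struct MtsForceCombiner {
    int k;                   // steps per cycle (long-range refresh interval)
    Forces cachedLongRange;  // long-range forces from the last full evaluation

    // Returns the total force for time step `step`. `shortRange` and
    // `longRange` are placeholder callbacks, not fmd routines.
    Forces totalForce(int step,
                      const std::function<Forces()>& shortRange,
                      const std::function<Forces()>& longRange) {
        assert(k > 0);
        if (step % k == 0) {
            cachedLongRange = longRange();  // expensive full electrostatics
        }
        Forces total = shortRange();        // recomputed every step
        assert(total.size() == cachedLongRange.size());
        for (std::size_t i = 0; i < total.size(); ++i) {
            total[i] += cachedLongRange[i];
        }
        return total;
    }
};
```

The cached long-range term is what grows stale as k increases, which is the accuracy trade-off the paragraph above describes.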
The program on which fmd is based, namd, incorporated the Distributed Parallel Multipole Tree Algorithm (DPMTA) implementation of the FMA, which was developed by the Scientific Computing group at Duke University. The DPMTA library utilized PVM. The decision was made to build fmd on MPI, and adopt the Fast Multipole Method in Three Dimensions (FMM3D) implementation developed by the Air Force Research Laboratory (see section References), Wright-Patterson AFB, OH. The following subsection describes FMM3D and the version of the FMA interface used with FMM3D.
At each beginning-of-cycle step, the set of all possible pairs of atoms is split into short- and long-range. Those pairs whose two atoms are closer than some r(short) are considered to be short-range interactions, and are entered into a pairlist. This pairlist specifies the interactions that are to be recomputed every timestep.
Note that this means that by the end of a cycle, some "long-range" atom pairs may in reality be closer than many "short-range" pairs. However, the force between these atoms will remain the same, as if they had not moved at all.
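The short-range/long-range split can be illustrated with a brute-force pairlist builder. This is only a sketch under stated assumptions: `Vec3`, `dist2` and `buildPairlist` are hypothetical names, and the O(N^2) double loop stands in for whatever patch-based search fmd actually performs.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Vec3 { double x, y, z; };

// Squared distance between two positions.
inline double dist2(const Vec3& a, const Vec3& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Builds the short-range pairlist at the beginning of a cycle: every pair
// (i, j), i < j, whose separation is below rShort. All other pairs are
// implicitly long-range until the next cycle. The O(N^2) loop is for
// illustration only.
inline std::vector<std::pair<std::size_t, std::size_t>>
buildPairlist(const std::vector<Vec3>& pos, double rShort) {
    assert(rShort > 0.0);
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    const double r2 = rShort * rShort;
    for (std::size_t i = 0; i < pos.size(); ++i)
        for (std::size_t j = i + 1; j < pos.size(); ++j)
            if (dist2(pos[i], pos[j]) < r2)
                pairs.emplace_back(i, j);
    return pairs;
}
```

Because the list is rebuilt only once per cycle, a pair that drifts inside r(short) mid-cycle stays classified as long-range until the next rebuild.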
The FMA is based on the idea of using power series to handle groups of long-range interactions and using Coulomb's law directly, referred to as the direct method, for handling the short-range interactions. More specifically: a cubic domain enclosing all the charges is divided up into 8^L smaller cubes, or cells. The partial electrostatic field within a given cell due to all charges within all "sufficiently distant" cells is represented by a local expansion, which is a power series in (r - c), where c is the center of the cell. Thus, the forces on a charge in the given cell due to charges in distant cells may be computed by evaluating the field at the location of the given charge. Forces due to charges in the same or nearby cells are explicitly computed using the direct method.
The use of power series introduces one user-specified parameter: p, the order of accuracy for the series. The number of terms is O(p^2), while the relative error is O((epsilon)^p), where (epsilon) is a constant that depends on certain design decisions in the implementation. The choice of p is a trade-off between accuracy and execution speed.
p is specified by the user input parameter FMAmp.
To compute the local expansions, the FMA first computes for each cell a multipole expansion, which is essentially a power series in 1/(r - c); in contrast to a Taylor series, this series converges faster the farther r is from c. The multipole expansion represents the electric field far from its cell due to charges inside its cell.
The local expansions are then computed from the multipole expansions using a multi-scale approach whose computational cost is asymptotically proportional to the number of cells. This phase of the calculation, called the translation phase, requires a hierarchy of divisions into cells. The whole domain is the single level-0 cell. It is divided into 8 level-1 cells. Each level-1 cell is then divided into 8 level-2 cells, giving 64 level-2 cells altogether. This continues down to level L. (This is why the number of cells must be a power of 8.)
The choice of L does not (in principle) affect the accuracy, but it does affect the computational cost. The direct phase of the calculation costs roughly O(N^2 / 8^L) , while the translation phase is O(8^L f(p)) . L must be chosen to minimize their sum (the other phases are essentially independent of L), which in practice means to make the cost of the two phases roughly equal. Since changing L by one changes the cost of each by a factor of roughly 8, "roughly equal" means "within a factor of 10" or so.
L is specified by the user input parameter FMAlevels
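The balance between the direct and translation phases can be explored with a small cost model. The sketch below uses the two estimates given above, N^2 / 8^L and 8^L f(p); the specific form f(p) = p^4 is an assumption made purely for illustration, and `modelCost` and `chooseLevels` are hypothetical helper names, not part of fmd or FMM3D.

```cpp
#include <cassert>
#include <cmath>

// Rough cost model from the text: direct phase ~ N^2 / 8^L,
// translation phase ~ 8^L * f(p). The form f(p) = p^4 is an assumption
// made only for this sketch.
inline double modelCost(double nAtoms, int levels, int p) {
    const double cells = std::pow(8.0, levels);
    const double fp = std::pow(static_cast<double>(p), 4);
    return nAtoms * nAtoms / cells + cells * fp;
}

// Returns the L in [1, maxLevels] with the smallest modelled cost.
inline int chooseLevels(double nAtoms, int p, int maxLevels) {
    assert(nAtoms > 0 && p > 0 && maxLevels >= 1);
    int best = 1;
    for (int L = 2; L <= maxLevels; ++L)
        if (modelCost(nAtoms, L, p) < modelCost(nAtoms, best, p))
            best = L;
    return best;
}
```

Under this assumed f(p), 100000 atoms with p = 8 give a minimum at L = 4. Real timings depend on the implementation constants, so a model like this is at best a starting point for choosing FMAlevels.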
An important design decision in an implementation of the FMA is the criterion for nearby vs. distant cells.
We refer to those cells that are near to a certain cell (including the cell itself) as the cell's neighborhood. One possibility is to use a 5 x 5 x 5 cube of cells with the specified cell in the center. Others include using a 3 x 3 x 3 , or all cells whose centers are within a specified number of cell widths from the center of the specified cell. Note that the dimensions of a neighborhood are always relative to the size of the cell it belongs to.
The choice of neighborhood involves a trade-off. The larger the neighborhood, the smaller (epsilon) will be, and the smaller p needs to be for the local expansions to attain a specified order of accuracy (thus reducing the computational cost), but the more interactions have to be handled by the direct phase.
Figure 3: Multi-scale approach to translations
The same consideration applies to the translation phase. The local expansion for some cell on some level i uses the local expansion for its parent cell, i.e., the level-i-1 cell that it lies in, the expansion being translated, or adjusted to account for the different cell center. However, since the parent cell's neighborhood is larger than the child's (the level-i cell's), using the parent cell's expansion alone would ignore the field due to charges that are in the parent's neighborhood but not the child's. Contributions to account for this must be added to the child's local expansion. The FMA uses the multipole expansions from cells covering this in-between region, the region that is in the parent's neighborhood but not the child's. The set of such cells for a given cell is called the cell's interaction list.
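To make the growth of the interaction list concrete, the following sketch counts the child-level cells that lie in the parent's neighborhood but not in the child's, for cubic neighborhoods of (2h+1) x (2h+1) x (2h+1) cells. It assumes the list is built entirely from child-level cells in an interior region of the domain; `interactionListSize` is a hypothetical helper, not an FMM3D routine.

```cpp
#include <cassert>

// Counts the child-level cells that lie in the parent's neighborhood but
// not in the child's own neighborhood, for cubic neighborhoods of
// (2h+1)^3 cells. This assumes the interaction list is built entirely
// from child-level cells (no substitution of parent expansions).
inline int interactionListSize(int h) {
    assert(h >= 1);
    // Child-level indices along one axis, with the parent's two children
    // at 0 and 1. The parent's neighborhood (parents -h..h) covers child
    // indices -2h .. 2h+1. The child of interest sits at index 0.
    const int lo = -2 * h, hi = 2 * h + 1;
    int count = 0;
    for (int x = lo; x <= hi; ++x)
        for (int y = lo; y <= hi; ++y)
            for (int z = lo; z <= hi; ++z) {
                const bool inChildNeighborhood =
                    x >= -h && x <= h && y >= -h && y <= h && z >= -h && z <= h;
                if (!inChildNeighborhood) ++count;
            }
    return count;
}
```

For a 3 x 3 x 3 neighborhood (h = 1) this gives 189 cells, and for a 5 x 5 x 5 neighborhood (h = 2) it gives 875.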
Figure 3 shows how this works. The cell with center P is the parent cell, the white and dark grey regions are its neighborhood, and the light grey region shows the cells whose charges contribute to its local expansions. The dotted lines show the division into child cells: the letter C is at the center of the child cell of interest, and the white region is C's neighborhood. The dark grey region is the region whose charges' electric field must be added to P's local expansion to get C's local expansion.
The effect on the cost is obvious: the larger the neighborhood, the larger will be the interaction list, and the larger the cost.
There is another design decision here. If the cells covering this in-between region include 8 children of a common parent cell, the multipole expansion for the parent may be used instead of those for the 8 children, for a reduced cost and increased (epsilon) . Figure 3 shows 16 such parent cells. (In the 3-dimensional version of figure 3, there are 98.)
FMM3D allows for a number of neighborhood-interaction list combinations, through the IPARAM argument. This argument cannot be set through a user input parameter, but can be changed by modifying the interface routine fmd_to_fmm3d (in the file fmd2fmm3d.c.) Some documentation is provided in the source code in the file fmm3d.c.
In this section are briefly discussed some design decisions in the writing of FMM3D. Most are based on the expectation that FMM3D would be used in molecular dynamics applications.
The force parameters and other quantities used in molecular dynamics are not known to high accuracy; in addition, the approximations made in reducing the more complex quantum chemical behavior to a ball-and-spring model limit the accuracy obtainable. Thus, it was assumed that relatively small values of p would be used, typically in the range 4 to 10. This led to the following decisions:
In chemical applications, the charge density does not vary greatly: there are regions with no atoms, and regions where there are atoms, and the atom density does not vary by more than a factor of 4 or so.
Those who study the FMM3D source code will note that it includes facilities for computing van der Waals forces and excluding certain interactions. fmd is not set up to use these facilities, so the interface sets the input arguments to FMM3D to turn them off.
Here are some brief notes on some details of the implementation.
The following section describes some strategies for choosing good FMM3D parameter values.
FMAmp
Figure 4: Typical Relative Error vs. p
FMAlevels
StepsPerCycle
FMAOn
This section contains design decisions that have been made for fmd that do not fit into other sections. Right now, this section is just a list of those decisions that are kept here to make sure they don't get lost. Hopefully, they will eventually become part of a more elegantly designed section.
fmdErr, fmdWarn, fmdInfo, and fmdDebug. By using these objects rather than directly outputting messages, schemes such as the current one, where all output is sent to the master process and then output, can be implemented.
Parameter and Molecule objects will be written in such a way that this data could be distributed in the future.
stdin
(ie. UNIX standard input).
Real and BigReal. These will be used in place of float or double to allow flexible switching of accuracy. By default, Real is typedef'd to float and BigReal is typedef'd to double.
#define
'd in the source file
common.h
.
Communicate object will implement a kind of prioritization of messages in that messages that are from other processors will be retrieved before messages that are sent from the same processor. This will ensure that in situations where an incoming message requires a message to be sent in response, it will be handled before those requiring only local communication.
send_message() should be called at the end of mid-cycle finishing time steps rather than at the beginning of new time steps so that the communication time of these messages can be overlapped with the computation being done by other patches on the processor.
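The Real and BigReal convention from the list above might look like the following. This is a sketch only: the preprocessor switch `FMD_DOUBLE_REAL` and the helper `sumSquares` are hypothetical and are not taken from common.h.

```cpp
#include <cassert>

// Sketch of precision typedefs. FMD_DOUBLE_REAL is a hypothetical switch
// used here for illustration; it is not taken from common.h.
#ifdef FMD_DOUBLE_REAL
typedef double Real;
#else
typedef float Real;      // default described in the text
#endif
typedef double BigReal;  // default described in the text

// Accumulate in BigReal even when the inputs are stored as Real.
inline BigReal sumSquares(const Real* v, int n) {
    assert(n >= 0);
    BigReal total = 0;
    for (int i = 0; i < n; ++i)
        total += static_cast<BigReal>(v[i]) * v[i];
    return total;
}
```

Keeping bulk storage in Real and accumulations in BigReal is one common way to use such a pair of types; changing a single typedef then switches the storage precision everywhere.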
fmd source files are kept in a CVS repository. The cvs command checkout will build a set of working files for you. If you are only interested in building an executable, you are referred to the Installation chapter of the User's Guide.
The directory tree for fmd consists of a project directory containing configuration files and several subdirectories containing source files and architecture specific files. Here is what the directory tree looks like:
./FMD
`-----ARCH
|     `-----alpha-dec-osf4.0
|     `-----t3e-cray-unicosmk2.0
|     `-----mips-sgi-irix5.3
`-----Data
`-----Docs
|     `-----Info
`-----Misc
`-----Src
|     `-----Docs
|     `-----FFT
|     `-----FMD
|     `-----FMM3D
|     `-----Include
|     `-----Misc
`-----Validation
|     `-----Canonical
|     |     `-----P
|     |     |-----`vr_00
|     |     |-----`...
|     |     |-----`vr_nn
|     |     `-----S
|     |     |-----`vr_00
|     |     |-----`...
|     |     |-----`vr_nn
|     `-----Data
|     `-----Jobs
|     |     `-----P
|     |     `-----S
|     `-----Runjobs
|     |     `-----LAM
The most important files in the project directory are the make and configuration files. These are discussed fully in the User's Guide. The GNU Autoconf program is used to generate the configure script, capturing all the architecture-specific information needed for the programs to run on a variety of different workstations. The subdirectory ARCH should contain subdirectories with libraries and binaries built for various platforms. The contents of this directory may vary from installation to installation, depending on how the system manager has decided to install the programs, and what architectures are available.
The Data subdirectory contains sample user input files, PDB, PSF, and parameter files for several different models. The subdirectory Docs should contain Postscript versions of the User's Guide, Programmer's Guide, and Quick Reference Guide. It also contains a subdirectory called Info, with GNU-info formatted on-line documentation.
Src is the most important subdirectory for programmers, as it contains all of the source code necessary to compile fmd. Docs contains TeXinfo source files for producing printed or info documentation. The fig source files are also provided for generating the figures used in the documentation. The Quick Reference Guide is formatted as a plain TeX file. The FFT subdirectory contains the double precision version of FFTPack obtained from Netlib. FMD contains the fmd main source files, while FMM3D contains the Fast Multipole Method in 3D library files. Include contains all of the header files used in the system. Misc gathers all of the unusual files, such as GAWK scripts and other such things that aid in building the distribution for a given system.
Each file in the source directories begins with revision information and
comments in the first few lines. Here is an example from the fmd.C
file:
/* $RCSfile: fmd_pg.texi,v $        -*-Mode: c++;-*-

   fmd - Fast Molecular Dynamics program

   $Revision: 1.46 $
   $Date: 1999/12/03 15:59:58 $
The first line should contain the RCSfile CVS keyword. Optionally, it may contain a GNU emacs mode flag so the emacs editor will enter the correct editing mode when the file is read. This is followed by comment lines which provide a basic description of what the file is for. Following these are two lines, one with the Revision CVS keyword, and the last with the Date CVS keyword. Additional comments may follow this. A file revision history is placed at the end of each file. This makes it much easier to scan a file for information without having to wade through a lot of change history. The revision history section in a make file looks like:
###################################################################
#
# Revision History:
#
# Log: fmd_pg.texi,v
# Revision 1.14  1997/07/14 17:21:25  lupoja
# Added BufferMPI description.
#
# Revision 1.13  1997/07/10 13:58:31  lupoja
# Added notes on FMM3D.
#
# Revision 1.12  1997/07/08 18:49:04  lupoja
# Consistent handling of iftex, ifinfo and ifhtml.
#
# Revision 1.11  1997/07/07 15:52:09  lupoja
# Added ifhtml conditionals to include GIF images of figures in the
# HTML versions of the online files. - J. Lupo
#
# Revision 1.10  1997/06/13 15:03:53  lupoja
# Using email.texi to encapsulate response address. Changed
# installation instructions to reflect change from M4 to Autoconf.
#
# . . .
Note that when adding files to the CVS repository, it is not necessary to specify what or if a comment character is necessary. CVS is smart enough to figure this out for itself.
Communicate.C
.
util.c
.
All .c and .C files should have a corresponding .h. With C++, however, it will be possible to have a header file without a corresponding .C file. All code in a header file should be placed inside a security wrapper to prevent multiple inclusion. This would look like:
#ifndef _HEADER_ID_
#define _HEADER_ID_
. . .
#endif
This table describes how various constructs in C and C++ source files should look.
Class Names
Class Member Names
Global Variables
Class Function Names
Global Function Names
fmd_die()
.
Macros
fmd has a wide variety of parameters that the user can set at startup to control fmd's behavior. These options and variable settings determine the exact behavior of fmd, which features will be active or inactive, how long the simulation will run, etc. This section describes syntax expected, and what settings are available.
Each line in the user input file consists of a keyword identifying the option being specified, and a value which is to be assigned to the option. The line can take one of two forms. The keyword and value can be separated by only white-space (i.e., spaces and/or tabs), such as:
keyword value
or they can be separated by an equal sign and white-space, such as:
keyword = value
Blank lines in the file are ignored. Comments are prefaced with a pound sign (#), and may appear at the end of a line with a keyword-value pair:
foo = value1
bar = value2    # Here is a comment
fie = value3
# Skip comment lines like this one or blank lines
It is important to note that keywords are case insensitive. Hence, entries such as coorTrjFile and coortrjfile are recognized as the same keyword. However, values ARE case sensitive. This is particularly important when you specify file names, as they must be written exactly as you expect to find them. There is one exception. The toggle values of off and on may be written with any capitalization desired.
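The input syntax rules above can be summarized in a small line parser. This is a sketch, not fmd's configuration reader: `toLower`, `trim` and `parseLine` are hypothetical helpers, and the sketch assumes that a pound sign never appears inside a value and that values are not quoted.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Lower-cases a copy of s (ASCII only).
inline std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Strips leading and trailing spaces and tabs.
inline std::string trim(const std::string& s) {
    const std::string::size_type b = s.find_first_not_of(" \t");
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(" \t");
    return s.substr(b, e - b + 1);
}

// Parses one input line into keyword and value. Returns false for blank
// lines, comment-only lines, and lines with no value. Keywords are
// lower-cased; values keep their case, except the toggles on/off.
inline bool parseLine(const std::string& raw, std::string& keyword,
                      std::string& value) {
    assert(raw.find('\n') == std::string::npos);  // one line at a time
    const std::string line = trim(raw.substr(0, raw.find('#')));
    if (line.empty()) return false;
    const std::string::size_type keyEnd = line.find_first_of(" \t=");
    if (keyEnd == std::string::npos) return false;
    const std::string::size_type valBegin =
        line.find_first_not_of(" \t=", keyEnd);
    if (valBegin == std::string::npos) return false;
    keyword = toLower(line.substr(0, keyEnd));
    value = line.substr(valBegin);
    const std::string lowered = toLower(value);
    if (lowered == "on" || lowered == "off") value = lowered;
    return true;
}
```

Normalizing keywords to lower case once, at parse time, lets the rest of a program compare against a single spelling such as coortrjfile.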
This section describes those options which a user is required to set. They define the most basic properties of the simulation to be performed.
NumSteps
numsteps x timestep
.
Coordinates
Structure
Parameters
parameters keywords may be used to specify several different parameter files, if required.
parameters params1
parameters params2
parameters params3
The files will be read in the order they appear, and warning messages will be printed if duplicated values are found. The last value read is the value used, so the order in which the files are specified could be important.
Cutoff
Exclude
none, 1-2, 1-3, 1-4, or scaled1-4
none, no bonded pairs of atoms will be excluded. With the value 1-2, all atom pairs that are directly connected via a linear bond will be excluded. With a value of 1-3, all 1-2 pairs will be excluded, along with all pairs of atoms that are bonded to a common third atom (i.e., if atom A is bonded to atom B, and atom B is bonded to atom C, then the atom pair A-C would be excluded). With a value of 1-4, all 1-3 pairs will be excluded along with all pairs connected by a chain of three bonds (i.e., if atom A is bonded to atom B, and atom B is bonded to atom C, and atom C is bonded to atom D, then the atom pair A-D would be excluded). With a value of scaled1-4, all 1-3 pairs are excluded and all pairs that match the 1-4 criteria are modified. The electrostatic interactions for such pairs are scaled by the factor 1-4scaling. The van der Waals interactions are modified by using the special 1-4 parameters defined in the parameter files.
OutputName
.coor will be appended to the prefix to create the name of the coordinate file, and .vel will be appended to create the name of the velocity file. Thus, if one has:
OutputName = /tmp/my_output
then the coordinate file will be named /tmp/my_output.coor and the velocity file will be named /tmp/my_output.vel. The format of the files may be set by outputformat, as described below.
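The Exclude settings described in this section amount to collecting atom pairs separated by at most one, two, or three bonds. The sketch below does this with a depth-limited breadth-first search over the bond graph. It is illustrative only: `excludedPairs` and its argument layout are hypothetical, and fmd's own exclusion data structures are not described in this text.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Returns all atom pairs (i < j) separated by at most maxBonds bonds.
// maxBonds = 1, 2, 3 corresponds to the 1-2, 1-3 and 1-4 settings.
// Illustrative only; fmd's own data structures are not shown in the text.
inline std::set<std::pair<int, int>>
excludedPairs(int nAtoms, const std::vector<std::pair<int, int>>& bonds,
              int maxBonds) {
    assert(nAtoms >= 0 && maxBonds >= 0);
    std::vector<std::vector<int>> adj(nAtoms);
    for (const auto& b : bonds) {
        adj[b.first].push_back(b.second);
        adj[b.second].push_back(b.first);
    }
    std::set<std::pair<int, int>> result;
    for (int start = 0; start < nAtoms; ++start) {
        // Breadth-first search limited to maxBonds steps from `start`.
        std::vector<int> depth(nAtoms, -1);
        std::vector<int> frontier(1, start);
        depth[start] = 0;
        for (int d = 1; d <= maxBonds && !frontier.empty(); ++d) {
            std::vector<int> next;
            for (int a : frontier)
                for (int n : adj[a])
                    if (depth[n] < 0) {
                        depth[n] = d;
                        next.push_back(n);
                        if (start < n) result.insert(std::make_pair(start, n));
                    }
            frontier.swap(next);
        }
    }
    return result;
}
```

For scaled1-4, one would call this with maxBonds = 2 to get the excluded pairs and treat the additional pairs found at depth 3 as the set whose interactions are modified rather than removed.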
These are parameters and options that are not required, but will almost always be specified. These, along with the required options of the previous section, form the fundamental options for a simulation.
InitCoordsFile
initCoordsFile
is specified,
coordinates
must also be specified.
InitCoordsFormat
pdb
, bin
or hdf
bin
InitCoordsDataset
initCoordsFile
. It defaults to 0. The data is checked
to make sure coordinate data is being read. The setting is ignored if
binary format is being used.
Title
Title2
TimeStep
Temperature
temperature
or velocities
must be defined to set initial
velocities for the model. The options can not be used together. NOTE:
if this file represents a continuation of a previous run, FirstTimestep
must be set to something other than 0!
OutputFormat
pdb
, bin
or hdf
pdb
.coor
and .vel
files
written for output (see the outputname
keyword) are in PDB
format. This keyword allows them to be written in binary or HDF format.
Both save space and preserve accuracy. The HDF format is machine independent.
Velocities
temperature
or velocities
must be defined to set initial
velocities for the model. The options can not be used together.
VelocitiesFormat
pdb
, bin
or hdf
pdb
bin
and hdf
formats both save space and preserve
accuracy. The hdf
format has the advantage of being machine
independent.
VelocitiesDataset
velocitiesFile
. It defaults to 0. The data is checked
to make sure velocity data is being read. The setting is ignored if
binary format is being used.
COMmotion
yes
or no
no
yes
, center-of-mass motions
will be allowed (ie. ignored). If set to no
, center-of-mass motions
will not be allowed. Once initial velocities are assigned, they will be
adjusted so as to remove any center-of-mass rotational and translational
motions.
RestartName
outputname. That is, .vel is appended to the prefix to create the velocity file name, and .coord is appended to create the coordinate file name. The files specified here differ from the outputname files only in how often they are written during a simulation: outputname files are written at the end of a simulation, while restartname files are written every restartfreq time steps during the simulation. The final restartname files may or may not equal the outputname files. The files may be written in binary or HDF format if restartformat is set, as described below. If restartname is specified, then restartfreq must also be specified.
RestartFreq
restartfreq
time steps. If restartfreq
is
specified, then restartname
must also be specified.
RestartFormat
pdb
, bin
or hdf
pdb
bin
and hdf
formats preserve accuracy and conserve space. In addition, the HDF format
is machine independent.
Dielectric
dielectric
.
1-4Scaling
exclude
keyword is set to scaled1-4
.
In that case, this factor is used to modify the electrostatic interactions
of 1-4 atom pairs. If the exclude
parameter is set to anything but
scaled1-4
, this parameter is ignored.
CoorTrjFile
coorTrjFile
is
specified, the coorTrjFreq
must also be defined.
CoorTrjFreq
coorTrjFreq
time step, the data is written. If
coorTrjFreq
is specified, then coorTrjFile
must also be
defined.
CoorTrjFormat
EnergyTrjFile
energyTrjFile
is specified, then energyTrjFreq
must also be
defined.
EnergyTrjFreq
energyTrjFreq
time step, the data is written. If
energyTrjFreq
is specified, then energyTrjFile
must also be
defined.
VelTrjFile
velTrjFile
is
specified, then velTrjFreq
must also be defined.
VelTrjFreq
velTrjFreq
time step, the data is written. If velTrjFreq
is specified, then velTrjFile
must also be defined.
VelTrjFormat
CWD
OutputName = job1/run3
RestartName = /tmp/run3
CWD = /scr
then the coordinate files would become
/scr/job1/run3.coord
/tmp/run3.coord
If no CWD is specified, then the file paths are not modified.
Seed
OutputEnergies
outputenergies
time steps.
The default setting of every time step may produce huge amounts of output
during very long simulations.
FirstTimestep
do_integration
are set at the
half timestep boundary.
Switching
on
or off
off
switching
is set to off
, then a truncated cutoff is
performed. If switching
is set to on
, then smoothing functions
are applied to both the electrostatic and van der Waals forces. For a
complete description of the non-bonded force parameters, see below. If
switching
is on
, then switchdist
must also be defined.
SwitchDist
cutoff
switching
is on
. The value of switchdist
must be
less than or equal to the value of cutoff
, since the switching function
is only applied on the range from switchdist
to cutoff
.
PairListDist
cutoff
cutoff
switching
is set
to on
to specify the allowable distance between atoms for
inclusion in the list. This is equivalent to the xplor cutnb
parameter. If no atom moves more than pairlistdist
-cutoff
during one cycle, then there will be no jump in electrostatic or van der
Waals energies when the next pair list is built. Since such a jump is
unavoidable when truncation is used, this parameter may only be
specified when switching
is on
.
StepsPerCycle
Margin
SnapshotFile
SnapshotFormat
SnapshotStart
SnapshotStop
AllForceTrjFile
allForceTrjFile
is specified, the
allForceTrjFreq
must also be defined.
AllForceTrjFreq
allForceTrjFreq
time step, the data is written. If
allForceTrjFreq
is specified, then allForceTrjFile
must also be
defined.
AllForceTrjFormat
ElectTrjName
electTrjName
is specified, the
electTrjFreq
must also be defined.
ElectTrjFreq
electTrjFreq
time step, the data is written. If
electTrjFreq
is specified, then electTrjName
must also be
defined.
ElectTrjFormat
DiceFreq
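The CWD example earlier in this list suggests a simple rule: relative file names receive the CWD prefix, while absolute names and runs with no CWD are left alone. The sketch below encodes that reading. It is inferred from the example only, and `applyCwd` is a hypothetical helper rather than an fmd routine.

```cpp
#include <cassert>
#include <string>

// Applies a CWD prefix to a file name the way the CWD example suggests:
// relative paths are prefixed, absolute paths (leading '/') and an unset
// CWD leave the name unchanged. Inferred from the example, not from
// fmd source.
inline std::string applyCwd(const std::string& cwd, const std::string& path) {
    assert(!path.empty());
    if (cwd.empty() || path[0] == '/') return path;
    if (cwd[cwd.size() - 1] == '/') return cwd + path;
    return cwd + "/" + path;
}
```

With CWD = /scr, the prefix job1/run3 becomes /scr/job1/run3, while /tmp/run3 is returned unchanged, matching the file names shown in the example.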
The FMM3D parameters control use of the three-dimensional fast multipole method for calculation of the long-range electrostatic interactions.
FMA
on
or off
off
on
, not used if set to
off
.
FMALevels
fma
is on
.
FMAMp
fma
is on
.
FMADegSep
FMASuperNode
The harmonic constraint feature of fmd is controlled by its own set of input options. The implementation might be more correctly called harmonic restraints. It follows the implementation seen in xplor. Through the use of these options, harmonic restraints may be applied to any atom or set of atoms in a model.
Constraints
on
or off
off
on
or off
based on the obvious value specified.
If on
is chosen, the parameters consref
, conskfile
,
conskcol
, and consexp
will be recognized.
ConsExp
constraints
is off
.
ConsRef
coordinates
constraints
is on
, the same file used for coordinates
will be read, indicating
the atoms are to be constrained about their initial positions.
ConsKCol
X, Y, Z, O,
or B
O
ConsKfile
coordinates
coordinates
PDB file is assumed to contain the constants.
fmd does have the ability to perform energy minimization using a steepest descent method. While this algorithm is not the fastest to converge, it is sufficient for most applications. There are only two parameters for minimization to set, one to specify that minimization is active, and the other to specify the maximum movement of any one atom.
Minimization
on
or off
off
on
or off
in
the obvious fashion.
MaximumMove
cutoff
/stepspercycle
fmd is capable of performing Langevin dynamics, where additional damping and random forces are introduced into the model. This follows the same implementation as found in xplor.
Langevin
on
or off
off
on
, then the parameter langevintemp
must also be set.
LangevinTemp
langevin
is set to on
.
LangevinFile
coordinates
coordinates
PDB file is assumed to contain the parameters.
LangevinCol
X
, Y
, Z
, O
, or B
O
fmd allows equilibration of a system by means of temperature rescaling. Using this method, all of the velocities in the system are periodically rescaled so that the entire system is set to the desired temperature. The following parameters specify how often rescaling is to be performed, and the temperature to be used.
RescaleFreq
rescaletemp
is required.
RescaleTemp
rescalefreq
is set.
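Temperature rescaling as described above reduces to multiplying every velocity by a common factor. Since the kinetic temperature is proportional to the sum of m v^2, the factor is sqrt(target / current). The sketch below shows this under simplifying assumptions: velocities are a flat array, the current temperature is supplied by the caller, and `rescaleVelocities` and `isRescaleStep` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Rescales velocities so that the kinetic temperature, which is
// proportional to sum(m * v^2), changes from currentTemp to targetTemp.
// Each velocity component is multiplied by sqrt(targetTemp / currentTemp).
inline void rescaleVelocities(std::vector<double>& v, double currentTemp,
                              double targetTemp) {
    assert(currentTemp > 0.0 && targetTemp >= 0.0);
    const double scale = std::sqrt(targetTemp / currentTemp);
    for (std::size_t i = 0; i < v.size(); ++i) v[i] *= scale;
}

// Hypothetical driver check: rescale every rescaleFreq steps.
inline bool isRescaleStep(int step, int rescaleFreq) {
    return rescaleFreq > 0 && step > 0 && step % rescaleFreq == 0;
}
```

A driver loop would call `isRescaleStep` with the RescaleFreq value and, when it returns true, measure the current temperature and rescale toward RescaleTemp.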
The periodic boundary conditions (PBC) in fmd support periodicity in any combination of directions (ie. line, slab, or volume), and arbitrary triclinic unit cells. To use PBC's, the user must enable PBC support, specify four vectors which define the unit cell, and optionally, specify the periodicity type if something other than volume is desired. The current implementation does not support use of fast multipole methods, and uses the nearest neighbor approximation when computing non-bonded forces. The fast multipole method will eventually support infinite images.
UnitCell
on
or off
off
UnitOrigin
UnitX
UnitY
UnitZ
Periodicity
x
, y
, z
,
xy
, xz
, yz
, or xyz
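The nearest neighbor approximation mentioned at the start of this section can be sketched as a wrap of each interatomic displacement onto a nearby periodic image. The code below works in fractional coordinates of the unit cell and only wraps along the selected periodic directions. It is an illustration under assumptions: `wrapDisplacement` is a hypothetical helper, and for strongly skewed triclinic cells simple rounding of fractional coordinates is not guaranteed to return the closest image.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

inline double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Wraps displacement d to a nearby periodic image. ux, uy, uz are the
// cell edge vectors; px, py, pz select the periodic directions.
inline Vec3 wrapDisplacement(Vec3 d, const Vec3& ux, const Vec3& uy,
                             const Vec3& uz, bool px, bool py, bool pz) {
    const double vol = dot(ux, cross(uy, uz));
    assert(vol != 0.0);
    // Fractional coordinates of d along each cell edge.
    const double sx = dot(cross(uy, uz), d) / vol;
    const double sy = dot(cross(uz, ux), d) / vol;
    const double sz = dot(cross(ux, uy), d) / vol;
    // Whole cell edges to subtract in each periodic direction.
    const double nx = px ? std::round(sx) : 0.0;
    const double ny = py ? std::round(sy) : 0.0;
    const double nz = pz ? std::round(sz) : 0.0;
    d.x -= nx * ux.x + ny * uy.x + nz * uz.x;
    d.y -= nx * ux.y + ny * uy.y + nz * uz.y;
    d.z -= nx * ux.z + ny * uy.z + nz * uz.z;
    return d;
}
```

The three boolean flags correspond to the Periodicity choices: for example, a slab periodic in x and y would pass true, true, false.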
The only boundary conditions currently supported by fmd are spherical harmonic boundary conditions. These boundary conditions can consist of a single potential, or a combination of two potentials active at the outer edge of the model.
SphericalBC
on
or off
off
sphericalbcr1
and sphericalbck1
must
also be set.
SphericalBCr1
SphericalBCk1
SphericalBCexp1
SphericalBCr2
SphericalBCk2
SphericalBCexp2
SphericalBCCenter
fmd supports the application of an external electric field. This,
along with setting a dielectric
constant for the model, allows
for a wide variety of electrostatic conditions.
EFieldOn
on
or off
off
EField
must also be set.
EField
EFieldOn
is not set "on", else
it defaults to 0.
fmd was designed to provide many of the same molecular dynamics functions as found in xplor. Thus, there are many similarities between the types of parameters passed to both fmd and xplor. This table lists the fmd parameter and the equivalent xplor parameter.
Cutoff
SwitchDist
PairListDist
1-4Scaling
Dielectric
Exclude
NBXMod
have fmd equivalents. These equivalents are:
NBXMod   exclude     Description
====================================================
1        none        no atom pairs excluded
2        1-2         only 1-2 atom pairs excluded
3        1-3         1-2 and 1-3 pairs excluded
4        1-4         1-2, 1-3, and 1-4 pairs excluded
5        scaled1-4   1-2 and 1-3 pairs excluded,
                     1-4 pair interactions modified
Switching
switching
on
is equivalent to xplor
SHIFt and VSWItch on. Setting to off
is equivalent to xplor
option TRUNcation.
Temperature
RescaleFreq
RescaleTemp
RestartName
RestartFreq
CoorTrjFile
CoorTrjFreq
VelTrjFile
VelTrjFreq
NumSteps
This is a collection of parameters that serve primarily as aids in code development, are experimental, or are of questionable usefulness. Use at your own risk if you find a need.
GlobalTest
true
or false
false
LdbStrategy
none
, random
, nolocality
, bisection
, or other
bisection
LdbStepsPerCycle
ldbstrategy
is specified.
LdbSendStep
ldbstepspercycle > Integer > 0
ldbstrategy is specified.
LongSplitting
sharp
, xplor
, or c1
sharp
MTSAlgorithm
naive
, verleti
, or verletx
naive
Description:
Selects among several methods for time step splitting.
PLMmarginCheck
on
or off
off
TCouple
true
or false
false
TCoupleTemp
tcouple
specified as true
.
TCoupleFile
tcouple
is true
.
TCoupleCol
X
, Y
, Z
, O
, or B
O
tcouple
is
true
. (Data format is %6.2f. Occupancy data in columns 55-60,
beta-coupling data in columns 61-66.)
Dihedral
true
or false
false
Cold
true
or false
false
dihedral
is true
.
ColdTemp
cold
and dihedral
are true
.
ColdRate
dihedral
and cold
are
true
.
ElectForceTrjName
electForceTrjName
is specified, then electForceTrjFreq
must also be defined.
ElectForceTrjFreq
electForceTrjFreq
time step, the data is written. If
electForceTrjFreq
is specified, then electForceTrjName
must also be defined.
AllForceTrjFile
allForceTrjFile
is specified, then allForceTrjFreq
must also be defined.
AllForceTrjFreq
allForceTrjFreq
time step, the data is written. If
allForceTrjFreq
is specified, then
allForceTrjFile
must also be defined.
The HDF files generated by FMD use a common format. Each file contains a header made up of several file level attributes, and 3 data sets. There is one data set for the main data array, one for the elapsed time values, and one for the time step value. Each is an extensible array, with successive data dumps adding 1 to the index of the most significant dimension (the left-most dimension for C/C++ programmers).
The file header contains the following attributes:
title
title2
program
revision_id
revision_date
creator
date
type
class
format
order
The three data sets are created using the value set in the class attribute, and the names "elapsed_time" and "time_step". In most of the trajectory files, the main data array takes the form data[time_index][atom_id][3], where the "time_index" dimension corresponds to the extensible dimension. For the energy trajectory file, the main data array takes the form data[time_index][energies]. In both cases, the data is of type float64. The "elapsed_time" data set contains elapsed time values, also of type float64, while the "time_step" data set contains int32 type values. Both of these data sets are organized in the form data[time_index][1]. Each new data dump results in the value of "time_index" being incremented by one. When the files are read, the "dim[0]" returned by the HDF library info routines is basically the number of dumps written to the file. Since all three arrays are extensible, the file size is limited only by operating system constraints.
This chapter provides a detailed description of all aspects of the fmd implementation. This includes everything from the high level program structure to detailed description of each class that is used within the program.
The first object created is the Communicate object, which makes sure the program is running on all other nodes in the parallel machine. When constructed, this object knows how many nodes are available, and the logical node ID of each node. Node 0 is the master node, and with N nodes assigned, the last node will be N-1.
The master node reads in all data, verifies the structure and parameters, parses the configuration file, and sends the data to all other nodes. All nodes keep a full copy of the parameters and the complete structure of the molecule.
(More to come.)
Global items are defined in common.h; global variables and functions are declared extern there. The global functions are in common.C; global variables are actually declared in fmd.C.
FMD_title(void)
FMD_check_messages(void)
FMD_quit(void)
FMD_die(char *)
BigReal fmdInfo
Inform fmdWarn
Inform fmdErr
Inform fmdDebug
Communicate *comm
Node *fmdMyNode
int fmdNumNodes
PI, TWOPI, ONE, ZERO
TRUE, FALSE, YES, NO
BOLTZMAN
Real
Bool
This chapter contains descriptions of the major classes used in fmd. These descriptions include the class name, interface, examples of use, and files involved. Note that the list of files involved in a class follows UNIX file name wild card conventions (i.e., foo.[Ch] is equivalent to foo.C and foo.h).
The AngleForce class is used to calculate the forces and energies due to the angle between 2 bonds in a molecule. Since angles, dihedral angles, and improper angles are handled exactly alike, except for the force formulae, all three use the abstract base class MultiBond for everything except the calculation of the forces on an angle from the 3 or 4 atom positions. For a description of the public functions and use of AngleForce, see the MultiBond class description, section MultiBond.
AngleForce.[Ch], MultiBond.h, and structures.h
A BondForce object is present in every patch, and it is responsible for calculating the forces and energies due to bonds between two local atoms as well as between one local atom and one atom on a nearby patch. In contrast to MultiBond objects, if a bond connects atoms on two different patches, only one patch will compute the forces and energies, and it will send the force on the non-local atom to the other patch.
BondForce.[Ch] and structures.h
BondForce ( Patch *parentPatch, PatchList *parentList )
The parameters parentPatch and parentList allow the object to know who owns it.
~BondForce()
BondForce objects have 7 public functions, in addition to the constructor and destructor: initialize_timestep(), set_recycle(), local_init(), neighbor_init(), local_force(), neighbor_force(), and get_energy().
At the beginning of each timestep, initialize_timestep() must be called to reset various flags. At the beginning of a cycle, if no atoms have been reassigned to or from this patch, set_recycle() may be called to avoid recomputing certain data structures; otherwise the object must be deleted and recreated, since there is currently no other way to indicate that the bond lists must be rebuilt from scratch.
The next four functions, local_init() through neighbor_force(), update the local forces based on positions supplied to the functions, as data becomes available during the timestep. The functions local_init() and neighbor_init() are called during the first timestep in a cycle, and set up data structures in addition to computing forces. The functions local_force() and neighbor_force() only compute the forces.
First, local_init() or local_force() must be called to compute the forces due to bonds which do not involve neighbor atoms. Then, each time a set of all coordinates from a patch (a message with tag NBCOORTAG) comes in from a neighbor patch, neighbor_init() or neighbor_force() must be called to update the forces due to bonds with atoms in that patch. The functions neighbor_init() and neighbor_force() should not be called for positions from a patch to which this patch sends all positions.
These functions accumulate the forces on the neighbor atoms in a separate array, so that these forces may be sent back to the patch that sent all its positions (in a message with tag NBFORCETAG). The force computation is complete when positions or forces have been received from all neighbors with whom this patch shares any bonds. (No check is made for this.)
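The calling protocol described above can be summarized as a short sketch. It does not use the real BondForce class; the helper bond_call_sequence is hypothetical and only lists, in order, the names of the calls a patch would make during one timestep. The position of set_recycle() relative to initialize_timestep() is an assumption, since the text only requires it to precede the first local_init() or neighbor_init() of the cycle.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the BondForce calls for one timestep, in order.
// 'neighbors' holds the IDs of neighbor patches whose NBCOORTAG messages arrive.
std::vector<std::string> bond_call_sequence(bool firstStepOfCycle, bool recycle,
                                            const std::vector<int> &neighbors) {
    std::vector<std::string> calls;
    // Assumption: set_recycle() is issued first, before any *_init() call.
    if (firstStepOfCycle && recycle) calls.push_back("set_recycle");
    calls.push_back("initialize_timestep");
    // Bonds between two local atoms: init on the first step, force otherwise.
    calls.push_back(firstStepOfCycle ? "local_init" : "local_force");
    // One call per neighbor as its coordinates become available.
    for (int nid : neighbors) {
        std::string name = firstStepOfCycle ? "neighbor_init" : "neighbor_force";
        calls.push_back(name + "(" + std::to_string(nid) + ")");
    }
    calls.push_back("get_energy");
    return calls;
}
```

For a first step with neighbors 3 and 7 this yields initialize_timestep, local_init, neighbor_init(3), neighbor_init(7), get_energy; on later steps the *_init calls are replaced by their *_force counterparts.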
Several decisions made during the design of this object should be reconsidered later. The main one is the way in which the bond list is built. Currently this relies on having a list of the bonds that each atom is involved in. This is quite fast, but expensive and non-scalable in terms of memory usage. If the structure of the molecule were distributed rather than stored on every node, this would become more reasonable.
void initialize_timestep ( void )
void set_recycle ( void )
set_recycle() may be called as an alternative to deleting and recreating the object. It should be called before the first local_init() or neighbor_init() call of the cycle.
void local_init ( int nlocal, int *localGlb, Vector *localx,
Vector *localf )
local_force()
.
local_force(localx,f)
to compute the forces.
void local_force ( Vector *x, Vector *f )
local_init()
, then bad things will happen.
void neighbor_init ( int nid, int nremote, int nlocal,
int *remoteGLB, int *localGlb, Vector *localx,
Vector *remotex, Vector *f, Vector *remotef )
neighbor_force()
for processing positions from neighbor nid
.
neighbor_force(nid,localx,remotex,f,remotef)
to compute the forces due to bonds between one local atom and
one atom on neighbor nid
.
neighbor_force()
have the same number of positions corresponding to the same atoms
in the same order as the call to neighbor_init()
.
This function should only be called once per neighbor,
only for a neighbor who does not call this function for
coordinate messages from this patch
and only during the first time step of each cycle.
neighbor_force ( int nid, Vector *localx, Vector *remotex,
Vector *f, Vector *remotef )
nid
.
neighbor_init() must have been called already; thus neighbor_force() must be called for every timestep except the first in a cycle.
BigReal get_energy ( void )
BoundaryMap.[Ch]
BoundaryMap ( void )
~BoundaryMap ( void )
patchIdIndex( patchId ) will give you the index for a patch with the given patchId; coordIndex( i, j, k ) will give you the index of a patch that is in the i, j, k position of the simulation. Other information about a neighbor can be obtained from the BoundaryMap object. If an index corresponds to a ghost patch, Image(index) will return TRUE, and patchOffset(index) will return the vector offset, in units of the simulation cell, to add to the corresponding real patch to map it into the ghost patch's position. If the simulation has isolated boundary conditions, where there are no ghost patches, then Image(index) will be FALSE and patchOffset(index) will be the zero vector.
The possible boundary conditions are isolated (there is nothing beyond the edge of the simulation cell) or periodic (beyond the boundary is another identical simulation cell). The two boundary conditions may be applied to any dimension regardless of the boundary conditions on any other dimension.
The ID of the i'th neighbor is calculated as:
neighborId [ i ] = map [ map.patchIdIndex ( patchId ) + neighborList [ i ] ]
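The lookup above can be illustrated with a minimal sketch of a map that is padded with one layer of ghost cells. This is not the BoundaryMap implementation: the class PaddedMap, the value -1 for "nothing there", and the helper neighborOffset (which plays the role of one neighborList entry) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for BoundaryMap: a patch grid padded with one layer of
// ghost cells per side. Periodic dimensions map a ghost cell to the wrapped
// real patch; isolated dimensions leave -1 (no patch) in the ghost cell.
struct PaddedMap {
    int nx, ny, nz;        // real patches per dimension
    bool px, py, pz;       // true = periodic in that dimension
    std::vector<int> map;  // patch ID per padded cell

    PaddedMap(int nx_, int ny_, int nz_, bool px_, bool py_, bool pz_)
        : nx(nx_), ny(ny_), nz(nz_), px(px_), py(py_), pz(pz_),
          map((nx_ + 2) * (ny_ + 2) * (nz_ + 2), -1) {
        for (int k = -1; k <= nz; ++k)
            for (int j = -1; j <= ny; ++j)
                for (int i = -1; i <= nx; ++i)
                    map[coordIndex(i, j, k)] = resolve(i, j, k);
    }
    int patchId(int i, int j, int k) const { return (k * ny + j) * nx + i; }
    // Index into the padded map for grid position (i, j, k); ghosts are -1 or n.
    int coordIndex(int i, int j, int k) const {
        return ((k + 1) * (ny + 2) + (j + 1)) * (nx + 2) + (i + 1);
    }
    int patchIdIndex(int id) const {
        return coordIndex(id % nx, (id / nx) % ny, id / (nx * ny));
    }
    // Relative index offset for a neighbor displaced by (di, dj, dk).
    int neighborOffset(int di, int dj, int dk) const {
        return (dk * (ny + 2) + dj) * (nx + 2) + di;
    }

private:
    static bool wrap(int &c, int n, bool periodic) {
        if (c >= 0 && c < n) return true;
        if (!periodic) return false;
        c = (c + n) % n;
        return true;
    }
    int resolve(int i, int j, int k) const {
        if (!wrap(i, nx, px) || !wrap(j, ny, py) || !wrap(k, nz, pz)) return -1;
        return patchId(i, j, k);
    }
};
```

With this layout, map[patchIdIndex(patchId) + neighborOffset(di, dj, dk)] has the same shape as the neighborId formula above: a base index for the patch plus a fixed relative offset per neighbor.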
const int createBoundaryMap ( PatchDistrib& pd, const int *overlap,
const int bt, const int xdim, const int ydim,
const int zdim )
const int& operator[] ( const int index ) const
const int patchIdIndex ( const int patchId ) const
const int coordIndex ( const int x, const int y, const int z ) const
const int middleIndexOfMap ( int *coord ) const;
const Vector& patchOffset ( const int index ) const
const int Image ( const int index ) const
BufferMPI.[Ch]
BufferMPI ( void )
~BufferMPI ( void )
BufferState: ( UNINITIALIZED, RECEIVING, UNPACKING, FREE, PACKING,
SENDING )
BufferMPIError: ( OK = 0, BUSY_BUFFER_ERR, BUFFER_STATE_ERR,
ENLARGE_BUFFER_ERR )
void error_abort ( const char *mess, int error )
void err_if_busy ( const char *name )
void init ( Bool must )
Bool is_busy ( void )
static int get_free_buf ( Bool wait, int nbufs, Buffer_MPI *bufs[] )
void wait ( void )
reset_buffer ( void )
void set_communicator ( MPI_Comm comm = MPI_COMM_WORLD )
int get_size ( void )
int get_data_size ( void )
void hex_dump ( void )
void print_buffer ( void )
int pack_buffer ( void *inbuf, int num, MPI_Datatype type )
int send_buffer ( int node, int tag )
void recv_buffer ( Bool wait, int node, int tag,
Bool &data_present, int &where_error, int &errno )
int unpack_buffer ( void *data, int len, MPI_Datatype type,
Bool &end_of_message )
The Collect class gathers the useful data produced by the patches. This data includes the sum of the various energy terms over all atoms, and the positions and velocities of each atom. The current implementation accumulates the data on the master node, then invokes the output object to report the data. The collection of energies is performed at every time step. Positions and velocities are collected at some frequency determined by the user. A spanning tree is employed for the sum of energies to yield better scalability. Currently, positions and velocities are sent directly to the master node.
Collect.[Ch]
Collect ( int myNode, Output *output )
Initializes the spanning tree, and creates the output object only on the
master node.
~Collect ( void )
A Collect object exists on each node. The constructor builds the spanning tree which is used to propagate the energy data. The interaction between the Collect object and the rest of the program is as follows. At the beginning of each time step, the node object invokes the collect object on the same processor for time step initialization. During the time step, each patch then sends its data, when it is ready, to the Collect object. At the end of the time step, the PatchList object tells the Collect object to propagate the data collected so far (at this point all the patches have contributed their data). The Collect object then carries out the gathering of data. For energy collection, the Collect object combines the partial results from its children (in the spanning tree) and propagates the combined result to its parent. The Collect object sends the positions and velocities directly to the master node. Once the data is assembled on the master node, the output object is invoked.
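The energy reduction over the spanning tree can be sketched as follows. The shape of the tree is not stated here, so this sketch assumes an implicit binary tree rooted at node 0 (the parent of node id is (id - 1) / 2); the function reduce_energy is hypothetical and simulates all nodes in one process rather than exchanging messages.

```cpp
#include <cassert>
#include <vector>

// Hypothetical single-process simulation of the energy reduction.
// partial[id] is the energy accumulated on node id from its own patches.
// Assumption: implicit binary spanning tree, parent(id) = (id - 1) / 2.
double reduce_energy(std::vector<double> partial) {
    // Children always have larger IDs than their parent, so walking from the
    // highest ID down guarantees a node has absorbed all of its children
    // before it forwards its combined result to its own parent.
    for (int id = static_cast<int>(partial.size()) - 1; id > 0; --id) {
        int parent = (id - 1) / 2;
        partial[parent] += partial[id];
    }
    return partial.empty() ? 0.0 : partial[0];  // total ends up on the master
}
```

Each node sends one message to its parent, so the master receives at most two messages instead of one per node, which is the scalability benefit the text refers to.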
maxEnergyCount
void init_timestep ( int timestep )
Collect
object for a particular time step. This
includes initialization of the data areas for partial results, and determining
if the time step is okay for collection of positions and velocities.
void energy ( BigReal *energy )
void coordinate ( int timestep, int natoms, int *globalIndex,
Vector *coords )
globalIndex
.
Collect then combines them on each node, and finally does a parallel collect operation to get all positions to the master node.
Note that, in order for the positions and velocities collected during a given timestep to correspond to the same time, positions are sent at the beginning of a timestep and velocities at the end.
void velocity ( int timestep, int natoms, int *globalIndex,
Vector *vels )
coordinates()
sends coordinates.
void long_force ( int timestep, int natoms, int *globalIndex,
Vector *f )
void short_force ( int timestep, int natoms, int *globalIndex,
Vector *f )
void all_force ( int timestep, int natoms, int *globalIndex,
Vector *f )
coordinates()
sends coordinates.
void propagate ( int )
Communicate.[Ch]
Communicate ( void )
~Communicate ( void )
NOERROR, ERROR, NONODES, NOSEND, NORECEIVE
SEND, RECEIVE
NOW, WAIT
A Message object is created and loaded with the data to be sent. Then the send routine is called with a pointer to the Message. This Message object is not copied, so the user must create the Message object with new before sending it via send. If a message is sent successfully, the Communicate object will automatically free its storage space; if it cannot be sent, the user is responsible for freeing the space.
Each message sent to another node must have a user-supplied tag to uniquely identify the message for the receiver. This tag should be greater than or equal to 0. Some specific tags are used by other components of fmd, such as the Inform messages (tags 1000, 2000, 3000, and 4000). When receiving a message, you can ask to receive a message from a given node with a given tag, or specify a wild card for either node or tag by asking for a message with a node of -1 or a tag of -1.
The Communicate object can send all messages as soon as they are provided, or wait until the routine send_all() is called, which will send out all cached Message objects. This is often preferable when several messages are to be sent to the same node; the Communicate class can combine these into a single message, which is then broken into individual Message objects by the receiver.
By default, the send method is NOW, which means that when a message is requested to be sent, it is sent over the network immediately. If the send method is WAIT, the message will be stored but not actually sent until send_all() is called. The send method can be set via the send_method routine. There is also a send_now function, which will send a message immediately, regardless of the current send method setting.
To receive a message, the receive routine is called, which returns a new instantiation of a Message object if a message has arrived, or NULL if no messages are available. Once a message has been successfully received, the user must retrieve data from it and delete it.
A message can also be broadcast to all other nodes, or to all nodes
including the sender. Broadcasts always occur immediately; they are not
combined with other messages.
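The difference between the NOW and WAIT send methods can be sketched with a mock. This is not the Communicate class: MockComm, its wire member, and the use of std::string in place of Message are assumptions for illustration. Each entry in wire stands for one network transfer, recorded as (destination node, number of messages combined into it).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

enum SendMethod { NOW, WAIT };

// Hypothetical stand-in for Communicate that only records transfers.
class MockComm {
public:
    std::vector<std::pair<int, int>> wire;  // (node, messages in this transfer)

    void send_method(SendMethod m) { method_ = m; }

    // Sends immediately under NOW; caches per destination under WAIT.
    void send(const std::string &msg, int node, int tag) {
        if (method_ == NOW) send_now(msg, node, tag);
        else queued_[node].push_back(Item{msg, tag});
    }
    // Always one transfer carrying one message, whatever the send method is.
    void send_now(const std::string &, int node, int) {
        wire.push_back({node, 1});
    }
    // Flushes the cache: one combined transfer per destination node.
    int send_all() {
        int flushed = 0;
        for (const auto &q : queued_) {
            int n = static_cast<int>(q.second.size());
            if (n == 0) continue;
            wire.push_back({q.first, n});
            flushed += n;
        }
        queued_.clear();
        return flushed;
    }

private:
    struct Item { std::string data; int tag; };
    SendMethod method_ = NOW;
    std::map<int, std::vector<Item>> queued_;
};
```

Under WAIT, two messages queued for the same node leave as a single transfer when send_all() is called, which is the saving the text describes.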
int add_node ( void *id )
int debug ( void ) || void debug ( int )
CommError errorno ( void )
int nodes ( void )
int this_node ( void )
nodes()
-1.
SendMethod send_method ( void )
NOW
or WAIT
.
void send_method ( SendMethod )
NOW
or WAIT
.
int send ( Message *msg, int node, int tag )
Message
object. Whether it is actually sent at
the time the routine is called or cached depends on the current
SendMethod
.
int send_now ( Message *msg, int node, int tag )
Message
object now
, regardless of the
current SendMethod
setting.
int send_all ( void )
Message *receive ( int& node, int& tag )
Message
object, for which the user is responsible
for delete
ing.
int broadcast_all ( Message *msg, int tag )
int broadcast_others ( Message *msg, int tag )
CommunicateMPI ( CommunicateMPI.[Ch] BufferMPI.[Ch] )
Communicate
. Verifies that all nodes requested are
available and communicating. If there is any problem, it assumes only one
node is available.
CommunicateMPI.[Ch]
CommunicateMPI ( int *, char **[], int = 0 )
- The first two
arguments echo the command line arguments. The last is used to set
a debugging flag if not 0.
virtual ~CommunicateMPI ( void )
void pack_message ( Message *, int, Buffer_MPI * )
tag (int)
number of items (int)
type of item 1 (short)
size of item 1, in number of elements (int)
item 1 data (various)
...
type of item N (short)
size of item N, in number of elements (int)
item N data (various)
Message *unpack_message ( Buffer_MPI *, int &tag, int &node )
number of individual messages (int)
sending node (0 ... N-1) (int)
data for message 1 (various)
...
data for message N (various)
Buffer_MPI *get_sendbuf ( void )
virtual int do_send_queue ( int )
virtual int do_send_msg ( Message *, int, int, int delmsg=TRUE )
virtual Message *do_receive ( int &node, int &tag )
ConfigList.[Ch]
ConfigList ( char * filename )
Here, filename is the name of the file to read. The constructor opens the file, reads and parses the information within, and closes the file. The "filename" for standard input is just a hyphen, "-" (this follows a somewhat standard UNIX idiom). If the file could not be opened, then the member function okay(void) returns FALSE; otherwise it returns TRUE. Lines that could not be parsed are written to fmdWarn.
~ConfigList ( void )
ConfigList config ( "bR.in" );
if ( ! config.okay () )
    FMD_die ( "Could not read user input file." );
Information about a keyword can be retrieved with the find function. This returns a linked list of type StringList. For example, to get a list of all the force field parameter files:
StringList *tmp, *params = config.find ( "parameters" );
if ( ! params )
    FMD_die ( "No parameter files listed." );
for ( tmp = params; tmp != NULL; tmp = tmp->next )
    cout << tmp->data << '\n';
StringList
data is a char * to the string value being stored, and next is a StringList * to the next element of the list. The list is NULL terminated.
This typedef has its own constructor and destructor. The constructor takes a char *, creates space for the string, and copies it. The destructor deallocates that space. This structure is used to return the value(s) for a given parameter name.
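A minimal sketch of a structure matching this description is shown below. It is a reconstruction from the text, not the declaration in ConfigList.h; the helper functions append, length, and destroy are hypothetical and exist only to exercise the list.

```cpp
#include <cassert>
#include <cstring>

// Sketch of a StringList node as described above: the constructor copies the
// string into newly allocated space, and the destructor releases that space.
struct StringList {
    char *data;
    StringList *next;

    explicit StringList(const char *s)
        : data(new char[std::strlen(s) + 1]), next(nullptr) {
        std::strcpy(data, s);
    }
    ~StringList() { delete[] data; }

    // Copying would lead to a double delete of 'data', so it is disabled.
    StringList(const StringList &) = delete;
    StringList &operator=(const StringList &) = delete;
};

// Hypothetical helper: append a value, preserving the order of insertion.
StringList *append(StringList *head, const char *value) {
    StringList *node = new StringList(value);
    if (!head) return node;
    StringList *tail = head;
    while (tail->next) tail = tail->next;
    tail->next = node;
    return head;
}

// Hypothetical helper: count elements up to the NULL terminator.
int length(const StringList *head) {
    int n = 0;
    for (; head; head = head->next) ++n;
    return n;
}

// Hypothetical helper: free a list owned by the caller.
void destroy(StringList *head) {
    while (head) {
        StringList *next = head->next;
        delete head;
        head = next;
    }
}
```

In this sketch the caller owns the list and calls destroy; a list returned by ConfigList::find is owned by the ConfigList object and should not be freed by the caller.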
Bool okay ( void )
TRUE if the file could be opened, FALSE otherwise.
StringList *find ( char *keyword )
StringList containing all the data associated with the keyword. If there are no entries in the specified file that match that keyword, then the returned value is NULL. The order of the elements is the same as the order in the file. This function does not allocate any new space, so do not attempt to delete the returned list.
A ConstraintForce object is present in every patch, and it is responsible for calculating constraints for all atoms in the list of constrained atoms.
ConstraintForce.[Ch]
ConstraintForce ( void )
The constructor sets the number of constrained atoms to zero and sets
the array consAtom to NULL.
~ConstraintForce ( void )
The public functions are initialize_timestep(), get_energy(), init(), and force().
There are a few things that go on internally. init() is called during the first step of each cycle. It checks each atom to see if it is to be constrained, and if so, adds it to the list of constrained atoms. This list is referred to when force() and get_energy() are called.
void initialize_timestep ( void )
void get_energy ( void )
void init ( int numAtoms, int *atomInd, Vector *x, Vector *f )
void force ( Vector *x, Vector *f )
The DihedralForce class is used to calculate the forces and energies due to the dihedral angle of each chain of 3 bonds in a molecule. Since angles, dihedral angles, and improper angles are handled exactly alike, except for the force formulae, all three use the abstract base class MultiBond for everything except the calculation of the forces on an angle from the 3 or 4 atom positions. For a description of the public functions and use of DihedralForce, see the MultiBond class description, section MultiBond.
DihedralForce.[Ch], MultiBond.h, and structures.h
An ElectForce object is present in every patch. It is responsible for all local interactions in the local patch, as well as interactions between local and remote atoms.
ElectForce.[Ch]
ElectForce ( Patch *parentPatch, PatchList *parentList )
The passed parameters let this object know which objects own it.
~ElectForce()
The local_init() routine initializes the object for calculation of local interactions and then calculates the force. This function is used only during the first time step of a cycle. The function local_force() is then used during normal mid-cycle steps to calculate the local interactions. Similarly, the function neighbor_init() is used to initialize the object for interactions with a given neighbor, and neighbor_force() is used to calculate the interaction with a given neighbor during mid-cycle time steps. The function get_energy() is used to return the energy calculated during a time step. The function initialize_timestep() is used to get the object ready for a new timestep.
Internally, this object relies on pair lists of interactions. A pairlist is built for the local interactions, and a pairlist is built for each neighbor that sends all of its atoms to this patch. Each pairlist is built by looking at every pair of atoms involved. First the Molecule class is queried to check for exclusions. Explicit exclusions, as well as bonded exclusions, are checked. The bonded exclusions are applied according to the value of the exclude keyword in the user input file. If there are no exclusions, the distance between the two atoms is calculated and compared to the cutoff distance. If the distance is within the cutoff distance, then the pair needs to be calculated. At this point, the constant factor for the electrostatics is computed and stored in the pairlist along with the computed van der Waals parameters A and B. From this point, the energy and forces can be computed by using the distance between the atoms and the pre-computed constants.
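The pairlist construction described above can be sketched as follows. Several details are assumptions, not taken from the FMD source: the stored electrostatic constant is taken to be coulomb * q_i * q_j, the van der Waals energy is taken to have the common A/r^12 - B/r^6 form, A and B are passed in as single values rather than looked up per atom type, and the exclusion check is a plain set lookup standing in for the query to the Molecule class.

```cpp
#include <cassert>
#include <cmath>
#include <set>
#include <utility>
#include <vector>

struct Atom { double x, y, z, q; };
struct Pair { int i, j; double kqq, A, B; };  // indices plus pre-computed constants

double dist(const Atom &a, const Atom &b) {
    double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Build the local pairlist: skip excluded pairs, then apply the cutoff test.
// 'excluded' holds (i, j) with i < j; it stands in for the Molecule query.
std::vector<Pair> build_pairlist(const std::vector<Atom> &atoms,
                                 const std::set<std::pair<int, int>> &excluded,
                                 double cutoff, double coulomb,
                                 double A, double B) {
    std::vector<Pair> list;
    int n = static_cast<int>(atoms.size());
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j) {
            if (excluded.count({i, j})) continue;
            if (dist(atoms[i], atoms[j]) > cutoff) continue;
            // Assumed form of the constant electrostatic factor.
            list.push_back({i, j, coulomb * atoms[i].q * atoms[j].q, A, B});
        }
    return list;
}

// Energy from current positions and the stored per-pair constants.
double pair_energy(const std::vector<Atom> &atoms, const std::vector<Pair> &list) {
    double e = 0.0;
    for (const Pair &p : list) {
        double r = dist(atoms[p.i], atoms[p.j]);
        double r6 = std::pow(r, 6);
        e += p.kqq / r + p.A / (r6 * r6) - p.B / r6;  // assumed energy form
    }
    return e;
}
```

The point of the design is that build_pairlist runs once per cycle, while pair_energy (and the matching force loop) runs every timestep using only distances and the stored constants.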
void initialize_timestep ( void )
void neighbor_init ( int nid, int remoteNum, int *remoteGlb, Vector *remoteX, Vector *remoteF, int localNum, int *localGlb, Vector *localX, Vector *localF )
Initializes the object for interactions with neighbor nid and then calculates the interactions with this neighbor, returning the forces on the local atoms. These are added to the array localF, and forces on the atoms from the neighbor are returned in the array remoteF. This involves building the pairlist and, as each pair is added to the pairlist, computing the forces and energies due to this interaction. The other parameters are: remoteNum is the number of atoms that the neighbor has; remoteGlb is the array of global atom indexes for the neighboring atoms; remoteX is the array of positions for the neighboring atoms; localNum is the number of atoms on the local patch; localGlb is the array of global atom indexes for the local atoms; and localX is the array of positions for the local atoms. This function should be called only during the first time step of each cycle.
void neighbor_force ( int nid, Vector *remoteX, Vector *remoteF, Vector *localX, Vector *localF )
Calculates the interaction with neighbor nid. The forces on local atoms are added to the array localF, and the forces on neighboring atoms are returned in the array remoteF. The arrays remoteX and localX specify the neighboring and local positions of the atoms. This function should only be called during mid-cycle time steps.
void local_init ( int numLocal, int *localGlb, Vector *x, Vector *f )
Initializes the object for calculation of local interactions and then calculates the forces, with the resulting forces going into the array f. This involves building the local pairlist and calculating the forces and energies for each pair. The other parameters are: numLocal specifies the number of local atoms; localGlb is the array containing the global atom indexes for the local atoms; and x is the array of positions for the local atoms. This function should only be called during the first time step of each cycle.
void local_force ( Vector *x, Vector *f )
Calculates the local interactions, with the resulting forces going into the array f. The current positions of the atoms are passed in via the array x.
BigReal get_energy ( void )
FieldForce.[Ch]
FieldForce ( void )
This is the constructor for the FieldForce class. It just gets the eField vector from the SimParameters object and initializes the other attributes to starting values.
~FieldForce ( void )
The public functions are initialize_timestep(), get_energy(), init(), and force(). There are a few things that go on internally. init() is called during the first step of each cycle. It checks each atom and records its partial charge. These are used to compute the force for each time step in the cycle.
void initialize_timestep ( void )
void get_energy ( void )
void init ( int numAtoms, int *atomInd, Vector *x, Vector *f )
void force ( Vector *x, Vector *f )
FMAInterface.[Ch]
FMAInterface ( Bool IamMaster )
The FMM3D version basically just sets the class members
to a defined state.
(Note that FMM3D does not need any per-run initialization.)
~FMAInterface ( void )
Releases any dynamically allocated memory in the object.
The public functions are execute_FMA(), deposit_coords(), and get_patch_forces(). They serve as the means for getting data into and results out of the FMA library.
void execute_FMA ( void )
fmd_to_fmm3d
, which
void deposit_coords ( int pid, int num, int *indexes,
Vector *x )
void get_patch_forces ( int pid, Vector *f, BigReal &patchEnergy )
FullDirect.[Ch]
FullDirect ( void )
The constructor calls the appropriate setup routine in the FMM3D library
and sets up the necessary data structures.
~FullDirect ( void )
void start_direct ( void )
void calc_with_node ( int node, Message *msg )
void add_forces ( Message *msg )
void deposit_coords ( int pid, int num, int *indexes, Vector *x )
void get_patch_forces ( int pid, Vector *f, BigReal &patchEnergy )
void wait_for_calcs ( void )
GenericList<T> is a template class which implements a one-directional list in a somewhat more storage-efficient manner than, say, a simple linked list. Currently, it uses a two-stage table system. It is fully inlined, i.e., there is no .C file, only a .h file.
GenericList.h
GenericList ( int numEntries )
numEntries is the maximum number of atom entries that could be stored in this list. It is used to approximate how big the segments should be. An incorrect number here won't cause failure, but could cause the segment size to be too large or too small. This would lead to either wasted memory or inefficient access.
~GenericList ( void )
GenericList only provides for one-way linear iteration through the list. Provision is made to add individual entries, but not to delete them (the whole list must be reset or deleted and recreated). Objects of type T are created, assigned, copied, and deleted in the usual fashion, so a default constructor, copy constructor, destructor, and assignment operator must be available (possibly just the compiler-generated ones).
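One way to picture a two-stage table is a top-level table of fixed-size segments. The sketch below is an assumption about the general technique, not the GenericList source: the class name SegmentedList, the square-root rule for the segment size, and the use of std::vector for storage are all illustrative choices.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical two-stage list: stage one is a table of segments, stage two is
// the entries inside each segment. Only add, reset, and forward iteration.
template <typename T>
class SegmentedList {
public:
    // numEntries is only a hint used to pick the segment size.
    explicit SegmentedList(int numEntries)
        : segSize_(std::max(
              4, static_cast<int>(std::sqrt(double(std::max(1, numEntries)))))) {}

    int size() const { return count_; }

    void reset() {
        segments_.clear();
        count_ = 0;
        cursor_ = 0;
    }

    void add(const T &entry) {
        if (count_ % segSize_ == 0) {  // current segment is full (or none yet)
            segments_.emplace_back();
            segments_.back().reserve(segSize_);
        }
        segments_.back().push_back(entry);
        ++count_;
    }

    // Restart iteration and return the first entry, or nullptr if empty.
    const T *head() {
        cursor_ = 0;
        return next();
    }

    // Return the next entry, or nullptr when the end has been reached.
    const T *next() {
        if (cursor_ >= count_) return nullptr;
        const T *p = &segments_[cursor_ / segSize_][cursor_ % segSize_];
        ++cursor_;
        return p;
    }

private:
    int segSize_;
    int count_ = 0;
    int cursor_ = 0;
    std::vector<std::vector<T>> segments_;
};
```

Compared with a simple linked list, there is no per-entry next pointer; a poor numEntries hint only changes how many segments are allocated and how full the last one is.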
int size()
void reset()
void add ( Entry entry )
const Entry *head ()
const Entry *next ()
GlobalIntegrate.[Ch]
GlobalIntegrate ( void )
The constructor calls the appropriate setup routine in the FMM3D library
and sets up the necessary data structures.
~GlobalIntegrate ( void )
void process_msg ( int /* node */, Message *msg )
void deposit_atoms ( int pid, int num, int *indexes, Vector *px,
Vector *pv, Vector *pf, Vector *pvh, int cycle, int first, int tstep )
void return_atoms ( PatchInfo *cl )
void do_integration ( void )
void do_verlet ( void )
void do_COLD ( void )
void do_dihedral ( void )
void do_hard_dist_constrs ( void )
HashTable<T> is a template class which implements a simple hash table with entries of type T and non-negative indices of type long. It is fully inlined, i.e., there is no .C file, only a .h file.
The implementation is rather simple-minded, and ignores all of the literature on designing hash tables. It should not be used as a model for writing hash tables. Its only virtues are that it is simple (no divides needed) and that it is fast enough that the total time spent in it is negligible compared with the rest of fmd. For example, the table length is always a power of 2, so that the hash is computed with shifts, adds, and masks. Collisions are handled by having the actual elements in the table array be pointers to linked lists of the entries that hash to the same array element.
Because HashTable creates, deletes, and assigns objects of type T as if they were of a built-in type, if type T has a more complex structure, it must have constructors, destructors, and assignment operators as necessary to allow this sort of use. HashTable does not explicitly initialize the objects; if this is desired, a no-argument constructor must be supplied. Type T must also be able to convert 0 to type T.
An additional feature is a "current entry" pointer. This internal pointer is set by functions that reference by an index to point to the most recently referenced entry, or set to a "reset" state if there is no sensible definition, that is, if the entry referred to by the index has been deleted or did not exist and was not created. It can also be explicitly reset or moved to the next entry in the table, and the current entry's index or value can be returned. It is also used as a "cache": if the same entry is referenced by index twice in a row, the cached values are used directly, rather than searching the table again.
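The power-of-2 table with chained collisions can be sketched as below. This is an illustration of the technique, not the HashTable.h source: the class name SimpleHash, the particular fold used in hash(), and the use of std::forward_list for the chains are assumptions, and the "current entry" pointer and iterator functions are left out.

```cpp
#include <cassert>
#include <forward_list>
#include <utility>
#include <vector>

// Hypothetical hash table keyed by non-negative long indices.
template <typename T>
class SimpleHash {
public:
    SimpleHash(int nentries, long maxindex) {
        // Table length is the smallest power of 2 that is >= nentries.
        while ((1L << bits_) < nentries) ++bits_;
        mask_ = (1L << bits_) - 1;
        // Assumed use of maxindex: fold in the high bits only if indices can
        // exceed the table length.
        fold_ = maxindex > mask_;
        table_.resize(mask_ + 1);
    }

    // Returns a reference to the entry for 'index', creating it if needed.
    T &operator[](long index) {
        auto &chain = table_[hash(index)];
        for (auto &e : chain)
            if (e.first == index) return e.second;
        chain.emplace_front(index, T(0));  // T must be constructible from 0
        ++count_;
        return chain.front().second;
    }

    // Checks for an entry without creating one.
    bool exists(long index) const {
        for (const auto &e : table_[hash(index)])
            if (e.first == index) return true;
        return false;
    }

    int nentries() const { return count_; }

private:
    // Shift, add, and mask only; no division.
    long hash(long index) const {
        return (fold_ ? index + (index >> bits_) : index) & mask_;
    }

    int bits_ = 1;
    long mask_ = 0;
    bool fold_ = false;
    int count_ = 0;
    std::vector<std::forward_list<std::pair<long, T>>> table_;
};
```

Entries whose indices hash to the same slot simply share a chain, so lookups stay correct when the table is undersized; they only get slower.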
HashTable.h
HashTable<T> ( nentries, maxindex )
Creates the table. nentries is a rough estimate of the number of entries expected; it is used to choose the size of the table array. If it is too big, the table will have many unused elements, and if it is too small, there will be more collisions and thus more searching of the linked lists. maxindex is the largest index value that will be used, and is used to choose the hash function.
~HashTable ( void )
Deletes the hash table and its entries.
The main access function is operator [], which returns a reference to the entry corresponding to the index supplied, creating one if necessary. Associated with this function are other functions that check for the presence of entries, create and delete them, and allow one to dump out the entries, etc. There are also "iterator" functions, which implicitly use the current entry pointer in place of an index.
T& operator []( long index )
Bool exists( long index )
TRUE
, otherwise reset the
current entry pointer and return FALSE
.
void create( long index )
void remove( long index )
int nentries( )
void reset_iterator( )
Bool next( )
FALSE
.
If there is a next entry, return TRUE
,
long current_index( )
next()
to advance to the first entry, and
return its index value.
If there is no first entry, return -1
.
T current_data( )
next()
to advance to
the first entry, and return its data value.
If there is no first entry, return T(0)
.
T* current_datap( )
next()
to advance to the first entry, and
return a pointer to the value.
void dump_data( char *formatdata( T& data ) )
printf()
.
The data objects are formatted using formatdata()
, a
user-supplied function.
formatdata()
must return a pointer to a character string, created
with new[]
, containing the formatted data.
The ImproperForce class is used to calculate the forces and energies due to each improper angle, i.e., the non-flatness of each set of 3 atoms directly bonded to a fourth atom in a molecule. Since angles, dihedral angles, and improper angles are handled exactly alike, except for the force formulae, all three use the abstract base class MultiBond for everything except the calculation of the forces on an angle from the 3 or 4 atom positions. For a description of the public functions and use of ImproperForce, see the MultiBond class description, section MultiBond.
ImproperForce.[Ch], MultiBond.h, and structures.h
Integrate.[Ch]
Integrate ( void )
New empty integrate, with no intermediate velocities.
~Integrate ( void )
Each patch creates an Integrate object and uses the same integrate for its lifetime. The integrate is initialized by invoking init() and supplying the number of local atoms maintained by the patch. There are three more functions that are invoked by the patch: do_integration(), add_atoms(), and delete_atoms().
void init ( int )
void add_atoms ( int, Vector * )
int delete_atoms ( int, Vector * )
void do_integration ( Vector *x, Vector *v, Vector *f, int *global_nums, int tstep, int firsttstep )
x is the current position vector, v is the current velocity vector, f is the current force vector, global_nums is the array of global atom indexes for the atoms, tstep is the current time step, and firsttstep is the time step this run started with.
An Inform object should be present on all nodes. All but one of the nodes will just forward their messages to the host node with logical ID 0. The host node will periodically check for messages, and print them out.
Inform.[Ch]
Inform ( char *name [, ison = 1 ] )
Creates a new Inform object, with the string identifier name. This identifier is the name of the object and is printed with every message. By default, messages will be printed. This can be changed by setting the optional argument ison to 0.
~Inform ( void )
Inform
object has an active message buffer. To put data
into this buffer, use the << operator. For example:
inform << "Message number " << 2 << "\n Second line." << sendmsg;The << operator will accept strings and all atomic data types (int, double, float, etc.). The
sendmsg
manipulator can be used to signal that the
message is complete and should be sent. Its use is equivalent to doing
the following:
inform << "Message number " << 2 << "\n Second line.";
inform.send ();
Newline characters in the message are used to break the message into multiple lines when it is printed. When displayed, each line up to a newline is printed with a leading prefix that identifies the sending node and object. The above example, if executed from node 2 by an object named "Messenger", would produce the following output on the host node:
Node 2:Messenger> Message number 2
Node 2:Messenger> Second line.
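The line-splitting behavior described above can be sketched as follows. This is only an illustration of the output format, not the Inform implementation: the helper name format_inform_lines is hypothetical, and the prefix layout is inferred from the sample output.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of Inform): prefixes every line of a
// message body with "Node <node>:<name>> ", mirroring the sample output.
std::string format_inform_lines(int node, const std::string &name,
                                const std::string &body) {
    std::istringstream in(body);
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        out << "Node " << node << ":" << name << "> " << line << "\n";
    }
    return out.str();
}
```

Calling format_inform_lines(2, "Messenger", ...) on a two-line body yields two prefixed lines, one per newline-delimited segment.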
void use_comm ( Communicate *c, int newnode = 0, int newtag = 0 )
Inform
object with data needed to send messages.
The given Communicate
object is used to send messages to the
destination node newnode
, with the given tag newtag
. This
can be called at any time. By default, an Inform
object has no
Communicate
object, and will send to node 0 with tag 0. If no
Communicate
object has been provided when a message is
sent, the message is printed to the console device.
void on ( int ) || int on ( void )
Inform
object on/off, or queries whether the Inform
object will send or display its messages. By default, messages are
sent.
void destination ( ostream * )
cout
, but can be changed to anything,
a log file for example.
int check ( void )
Inform
messages from
the other nodes, and sends them on to the designated ostream. This must
be called periodically to keep the incoming messages from accruing. For
non-host nodes, this does nothing. The number of messages received during
a check is returned.
int send ( void )
An Inform
object has a current message buffer, to which data
is added by using the << operator. Once a message has been set up, the
send
routine takes the current message and forwards it to the host
node. On the host node, this results in the message being printed out
immediately.
IntList.[Ch]
IntList ( void )
~IntList ( void )
int num ( void )
void add ( int newint )
Bool unique ( void )
IntTree
takes a series of integers as
input. It then stores only the distinct values as a binary tree. Upon
request, all the values in the tree can be returned in the form of an
array of integers.
The IntTree
class is used to track atoms that need to be sent to
a neighbor as bonded coordinates. A single IntTree
object is
used and is passed to each of the bond force objects as they initialize
themselves with a neighbor who sends all of their coordinates. Each
object places the local atom indexes of the atom coordinates that need
to be sent to the neighbor into the IntTree
object. Since the
tree only stores distinct values, after the object has been passed to
all the force objects, the contents can be dumped to an array of
integers that then represents all the atoms that need to be sent to this
neighbor.
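The collect-then-dump usage described above can be sketched with the standard library. This is a minimal sketch, not the IntTree source: std::set (a balanced tree) stands in for the class's own binary tree, the class name DistinctInts is hypothetical, and make_array() here returns a std::vector rather than a raw array.

```cpp
#include <set>
#include <vector>

// Sketch only: std::set stands in for the document's own binary tree.
// Member names mirror size / add_value / make_array.
class DistinctInts {
public:
    void add_value(int v) { values_.insert(v); }   // duplicates are ignored
    int size() const { return static_cast<int>(values_.size()); }
    // Returns the distinct values in ascending order; empty if none stored.
    std::vector<int> make_array() const {
        return std::vector<int>(values_.begin(), values_.end());
    }
private:
    std::set<int> values_;
};
```

Each force object would call add_value() for the atoms it needs sent; because duplicates collapse, the final array lists each atom once no matter how many force objects requested it.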
IntTree.[Ch]
IntTree ( void )
~IntTree ( void )
size
,
add_value
, and make_array
.
int size ( void )
void add_value ( int intval )
int *make_array ( void )
NULL
is returned.
LintList.h
LIST_EMPTY
- This is the value returned when the end of the
list is encountered.
LintList ( void )
~LintList ( void )
add()
is used to add a new integer to the list.
Additions are accomplished in constant time. The function head()
returns the first value in the list and sets the current position in the
list to be the head of the list. The function next()
returns the
next value in the list and moves the current position in the list to
this element. Both head()
and next()
return
LIST_EMPTY
if the list is completely empty or if the end of the
list has been reached. Thus, to traverse the list, head()
is
called once, followed by repeated calls to next()
until the value
of LIST_EMPTY
is returned.
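The traversal pattern just described can be sketched as below. The class IntListSketch and the sentinel value -1 are assumptions made for illustration (the real LIST_EMPTY value is defined in LintList.h), and appending at the tail is only one way to obtain constant-time additions.

```cpp
// Sketch of a list with constant-time add and head()/next() traversal.
// LIST_EMPTY_SKETCH = -1 is an assumed sentinel, so -1 cannot be stored.
const int LIST_EMPTY_SKETCH = -1;

class IntListSketch {
public:
    IntListSketch() : first_(nullptr), last_(nullptr), cur_(nullptr) {}
    ~IntListSketch() {
        while (first_) { Node *n = first_->next; delete first_; first_ = n; }
    }
    IntListSketch(const IntListSketch &) = delete;
    IntListSketch &operator=(const IntListSketch &) = delete;

    void add(int v) {                      // O(1): append at the tail
        Node *n = new Node(v);
        if (last_) last_->next = n; else first_ = n;
        last_ = n;
    }
    int head() {                           // rewind to the first element
        cur_ = first_;
        return cur_ ? cur_->value : LIST_EMPTY_SKETCH;
    }
    int next() {                           // advance; sentinel at the end
        if (cur_) cur_ = cur_->next;
        return cur_ ? cur_->value : LIST_EMPTY_SKETCH;
    }
private:
    struct Node {
        int value; Node *next;
        explicit Node(int v) : value(v), next(nullptr) {}
    };
    Node *first_, *last_, *cur_;
};

// Traversal pattern from the text: head() once, then next() until empty.
int sum_list(IntListSketch &list) {
    int total = 0;
    for (int v = list.head(); v != LIST_EMPTY_SKETCH; v = list.next())
        total += v;
    return total;
}
```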
void add ( int addvalue )
int head ( void )
LIST_EMPTY
is returned if the list is empty.
int next ( void )
LIST_EMPTY
is returned.
LoadBalance.[Ch]
LoadBalance ( int myNode )
The constructor initializes the node's data structure.
~LoadBalance ( void )
void init_patches ( int numPatches )
void delete_patches ( void )
void ldb_method1 ( vertex **VArray, int NumCells, int NumProcs )
void down_heap ( procLoad *a, int n, int k )
int cmp_loads ( const void *p1, const void *p2 )
void deposit_load_stats ( LoadStats *lstats )
void send_load_info ()
int receive_load_info ( Message *msg )
void compute_patch_changes ()
void send_patch_changes ( int *numToRecv, IntList *sendList, IntList *destList )
void patch_changes_complete ( void )
LongForce.[Ch]
LongForce ( int patchid )
LongForce ( int patchid, int n, Message *msg )
~LongForce()
initialize_timestep()
, end_cycle()
, init()
,
initialize_first_timestep()
, calc_eff_force()
,
get_long_force()
, send_atom_info()
,
prepare_to_receive_atoms
, receive_atom_info
,
send_forces
, and save_state()
.
void calc_eff_force ( Vector *subForce, BigReal subEnergy )
void init ( int n )
void initialize_first_timestep ( int n, int *localGlb, Vector *x,
int step )
void initialize_timestep ( void )
void get_long_force ( Vector *localF )
void end_cycle ( void )
void send_atom_info ( Message *msg, int numSend,
LintList *send_atoms )
void prepare_to_receive_atoms ( int numAdded, int numRemoved,
LintList *atomsRemoved)
void receive_atom_info ( int nAtoms, Message *msg )
void send_forces ( int nAtoms, int *atoms )
void save_state ( Message *msg )
A Message
object is created, filled with
the data to be sent, and given to a Communicate
object to be
delivered. Each Message
consists of one or more items of
different types. A user can query how many items there are, the type
and size of each item, and retrieve the items.
Message.[Ch]
Message ( void )
~Message ( void )
Types: CHAR, SHORT, INT, LONG, FLOAT, DOUBLE, UNKNOWN
Message
,
first a new Message
object is created, then data is put into the
object using the put()
function. For example:
Message *msg = new Message;
msg->put("Start of message.").put(5,intarray).put(3.14);
communicate->send(msg,node,tag);
The
put()
routine is overloaded to accept all atomic data types,
either single value, or arrays (two arguments, number of elements and
pointer to array).
There are two ways to store the data for each item in the
Message
: make a copy of the data, or store a pointer to the data.
By default, data you provide in a call to put()
is copied into
some new storage area allocated by the Message
object. For
scalar data, a copy of the data is always made. However, for arrays of
data, you can instead choose to just have the Message
object
store the pointer to the data that you provide in the put()
call.
Doing this eliminates the overhead of allocating new memory space and
performing the copy, but requires you to make sure the data is still
available when the Message
is actually sent to the destination
node (which could be some time after the creation of the
Message
).
When just the pointer is stored, the Message
object by default
will not free up the storage when the Message
is deleted, as it
normally does for data it copies to its own storage. You can also
choose to have the storage space for data you put()
be freed up
by the Message
object when it is deleted. This might be
advantageous if a Message
is being constructed with data you have
allocated earlier, but have no further need of other than sending it
out.
When calling put()
for arrays of data, there are two optional
parameters:
TRUE
) or just reference the data (FALSE
) via the
given pointer. This is TRUE
by default (i.e., by default, data is
copied).
Message
for the data (TRUE
) or
just leave the storage space unchanged (FALSE
). This option is
ignored if the third argument is TRUE
, that is, when a copy is
made, the storage is always deleted (since it is allocated by the
Message
object in the first place). This is FALSE
by
default (i.e., by default, if data is not copied, the storage is not
deleted).
put()
routine for scalar data, there is
only one argument ( the data ). There are no optional second and
third arguments. For example, the command given above copies
the data from intarray
, while this example has
msg
store only the pointer intarray
:
msg->put(5,intarray,FALSE,FALSE);
communicate->send(msg,node,tag);
Because the fourth argument is FALSE here, the memory pointed to by intarray
is left untouched when the
Message
is deleted; passing TRUE as the fourth argument instead would have the
Message
free that memory for future use.
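The interplay of the two optional put() arguments can be sketched as a small wrapper around one array item. This is an illustration under stated assumptions, not the Message source: the class IntArrayItem and its accessors are hypothetical, and storage handed over with delstor set is assumed to have been allocated with new[].

```cpp
#include <algorithm>

// Sketch of the copy / delstor choices for a single array item.
// Only the two flags come from the text; everything else is hypothetical.
class IntArrayItem {
public:
    IntArrayItem(int n, int *d, bool copy = true, bool delstor = false)
        : n_(n), data_(d), owns_(copy || delstor) {
        if (copy) {                       // private copy: always freed
            data_ = new int[n];
            std::copy(d, d + n, data_);
        }
    }
    ~IntArrayItem() { if (owns_) delete[] data_; }
    IntArrayItem(const IntArrayItem &) = delete;
    IntArrayItem &operator=(const IntArrayItem &) = delete;

    int size() const { return n_; }
    const int *data() const { return data_; }
    bool frees_storage() const { return owns_; }
private:
    int n_;
    int *data_;
    bool owns_;   // true when the destructor must release data_
};
```

With copy set, later changes to the caller's array do not affect the item; with copy cleared, the item sees whatever the caller's array holds at send time, which is why the data must stay valid until then.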
There are two ways to get data from a Message
:
items()
routine will report how many data items there
are. The type(n)
and size(n)
routines report the type and size
(in number of elements) for each item, and each item can then be retrieved
via the item(n)
function.
A Message
object has a current item. The
reset()
routine sets the current item to the beginning of the
Message
. After this has been done, the type()
, size()
,
and item()
routines with no arguments will return data about the
current item. To retrieve data and automatically move the current item
to the next in the list use the get()
routine. This will retrieve
the data in the current item, place it in the storage given by the argument,
and increment the current item.
Message
. It is
possible to delete individual items from the message
(using the
del(n)
routine), or delete all items in the message (with the
clear()
routine).
ostream &operator<<
<<
operator can be used on an ostream object (such as
cout
or cerr
) to print a summary of the current message.
For example:
Message msg;
cout << "Contents of the message: " << msg;
int items ( void )
Message
.
Items are numbered 0..items()-1.
int size ( int n = (-1) )
Types type ( int n = (-1) )
void *item ( int n = (-1) )
Message &del ( int n = (-1) )
Message &clear ( void )
Message &reset ( void )
Message &skip ( void )
Message &back ( void )
Message &current ( int n )
Message &put ( char * d )
Message &put ( <type> d )
Message &put ( int n, <type> *d, int copy = TRUE, int delstor = FALSE )
put
routine adds new items to the end of a Message
.
The first form adds a null-terminated string, the second adds a scalar
of type Types
, and the third adds an array of type Types
of size n
. The third form has two optional arguments. copy
indicates if a copy of the data is to be made. delstor
indicates
if Message
is responsible for deleting storage when it is done.
Message &get ( <type> &d )
Message &get ( <type> *d )
get
routine retrieves the data from the current item, and
copies it into the argument d
. The caller must know what type of
data is to be retrieved, and call get with the appropriate argument
type. Space must be provided by the caller to store the retrieved data;
it is NOT allocated by get
. The first form retrieves a scalar
value, the second a vector (array) value. If the type of the current
item does not match the argument, or there is no current item,
get
does nothing. The size and type of the current item can
be found with the size()
and type()
functions,
respectively.
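The reset()/get() reading pattern can be sketched with a stripped-down stand-in that holds only scalar int and double items. MiniMessage and its NO_ITEM value are hypothetical and exist only to show the cursor behavior described above, including get() doing nothing on a type mismatch.

```cpp
#include <cstddef>
#include <vector>

// Sketch of the "current item" read pattern; not the Message class.
class MiniMessage {
public:
    enum Types { INT, DOUBLE, NO_ITEM };
    MiniMessage() : cur_(0) {}
    MiniMessage &put(int v) {
        Item it = { INT, v, 0.0 };
        items_.push_back(it);
        return *this;
    }
    MiniMessage &put(double v) {
        Item it = { DOUBLE, 0, v };
        items_.push_back(it);
        return *this;
    }
    int items() const { return static_cast<int>(items_.size()); }
    MiniMessage &reset() { cur_ = 0; return *this; }
    // Type of the current item, or NO_ITEM when the cursor is past the end.
    Types type() const {
        return cur_ < items_.size() ? items_[cur_].type : NO_ITEM;
    }
    // get() copies the current item and advances the cursor; on a type
    // mismatch, or when there is no current item, it does nothing.
    MiniMessage &get(int &d) {
        if (type() == INT) { d = items_[cur_].i; ++cur_; }
        return *this;
    }
    MiniMessage &get(double &d) {
        if (type() == DOUBLE) { d = items_[cur_].d; ++cur_; }
        return *this;
    }
private:
    struct Item { Types type; int i; double d; };
    std::vector<Item> items_;
    std::size_t cur_;
};
```

Because put(), reset(), and get() all return the object itself, calls can be chained, as in m.reset().get(i).get(d).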
Message
s that have been received. It allows the retrieval
of a Message
with a specific tag from any node in constant time.
This is the case used to retrieve virtually all messages in fmd.
It searches for messages with any tag but from a specific node or
from any node with any tag in O(N) time, where N is the total number of
messages currently stored. (There are currently no such searches done in
fmd.)
The class is currently implemented using an array of MessageQueue
objects, where each queue stores messages with a specific tag. There is
one such object for each tag used in the program. Therefore, all
messages with a given tag are stored in a single MessageQueue
object. Two MessageManager
objects are used by each
Communicate
class: one to store messages from other nodes, and
one to store messages sent by patches on the same node.
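The one-queue-per-tag layout can be sketched as follows. This is a simplified illustration, not the class source: StoredMsg, TagQueues, the fixed tag count, and the use of std::deque in place of MessageQueue are all assumptions.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Sketch: one FIFO queue per tag, so retrieval by tag is a single
// index plus a pop. Tags must lie in [0, numTags).
struct StoredMsg { int tag; int node; int payload; };

class TagQueues {
public:
    explicit TagQueues(int numTags) : queues_(numTags), count_(0) {}
    int num() const { return count_; }
    void add_msg(const StoredMsg &m) {            // constant time
        queues_[m.tag].push_back(m);
        ++count_;
    }
    // Oldest message with this tag from any node: constant time.
    bool get_by_tag(int tag, StoredMsg &out) {
        std::deque<StoredMsg> &q = queues_[tag];
        if (q.empty()) return false;
        out = q.front();
        q.pop_front();
        --count_;
        return true;
    }
    // First message from `node` with any tag: scans every queue, O(N).
    bool get_by_node(int node, StoredMsg &out) {
        for (std::size_t t = 0; t < queues_.size(); ++t) {
            std::deque<StoredMsg> &q = queues_[t];
            for (std::deque<StoredMsg>::iterator it = q.begin();
                 it != q.end(); ++it) {
                if (it->node == node) {
                    out = *it;
                    q.erase(it);
                    --count_;
                    return true;
                }
            }
        }
        return false;
    }
private:
    std::vector<std::deque<StoredMsg> > queues_;
    int count_;
};
```

The common case (a known tag, any node) never touches more than one queue, which is where the constant-time behavior comes from.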
MessageManager.[Ch]
MessageManager ( void )
~MessageManager ( void )
int num ( void )
Message
objects currently stored.
void add_msg ( MsgList *new_msg )
Message
enclosed in the structure new_msg
. This
addition is performed in constant time.
MsgList *get_head ( void )
Message
at the head of the queue and takes
it off of the queue. If the list is empty, NULL
is returned. This
operation is performed in constant time.
MsgList *find_msg ( int tag, int node )
tag
and node
. If
tag=-1
, the first message matching node
with any tag
is returned.
The MessageQueue
class is an almost completely inlined class that
provides a FIFO queue of Message
objects. The principal property
of these queues is constant time addition of messages and retrieval of
the head of the queue. Searches of the queue for messages with a
specific property, such as node
or tag
value, are O(N),
where N is the number of messages stored in the queue. The class is
currently implemented as a doubly linked list. The class is presently
used only by MessageManager
, which uses these queues to efficiently
search for stored messages.
MessageQueue.[Ch]
MessageQueue ( void )
~MessageQueue ( void )
int num ( void )
Message
s currently stored in the queue.
void add_msg ( MsgList *new_msg )
Message
enclosed in the structure new_msg
into
the queue. The addition is performed in constant time.
MsgList *get_head ( void )
Message
at the head of the queue and take
it off the queue. If the list is empty, NULL
is returned. This
operation is also performed in constant time.
MsgList *get_msg_by_node ( int node )
Message
on the queue with the node
indicated by node
and remove it from the queue. If there is no
Message
on the queue with a node
that matches, return
NULL
. This operation is accomplished in O(N) time.
MsgList *get_msg_by_tag ( int tag )
Message
on the queue with the tag
indicated by tag
and remove it from the queue. If there is no
Message
with matching tag, NULL
is returned. This
operation is accomplished in O(N) time. In the current implementation
of MessageManager
, this operation is never used, but is provided
for completeness.
The Molecule
class is used to read in, store, and access the
molecular structure.
Information is read from an X-PLOR formatted .psf file.
This information includes a list of all the atoms along with their mass
and charge; lists of all bonds, bond angles, dihedral angles, and
improper dihedral angles; and a list of explicit electrostatic
exceptions.
A Molecule
object will reside on each of the processors, with the
same data in each.
The .psf file is read in on the master node, and the information is
then distributed to the other processors.
Molecule.[Ch]
, structures.h
Molecule ( void )
~Molecule ( void )
read_psf_file()
is used to read in the structure
file specified.
This function should be called only on the master processor (node 0) to
read in the .psf file specified in the user input file.
The master processor then sends all the class data values with a call to
send_Molecule()
, and the other processors receive them with a
call to receive_Molecule()
.
The Parameters
object then verifies that all of the parameters
needed by the .psf file have been read in.
int numAtoms
int numBonds
int numAngles
int numDihedrals
int numImpropers
int numExclusions
int numConstraints
int numMultipleDihedrals
int numMultipleImpropers
numMultipleDihedrals
, but for improper angles.
void read_psf_file ( char *name, Parameters *params )
name
and use the
Parameters
object params
to verify that all of the
parameters necessary for this structure have been specified. When
the NTITLE block is read from the PSF file, the function checks for
the presence of the keyword BASE64 anywhere in the lines. If found,
all numbers are treated as if they were written in base-64 notation, using
the character set "0-9A-Za-z@#".
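Decoding a number written in this notation can be sketched as below. Two details are assumptions, since the text states neither: digit values follow the order in which the character set is listed (0-9, then A-Z, then a-z, then @ and #), and the most significant digit comes first. The function names are hypothetical.

```cpp
#include <string>

// Maps one character of the set "0-9A-Za-z@#" to its assumed digit
// value (0..63); returns -1 for any other character.
int base64_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'Z') return 10 + (c - 'A');
    if (c >= 'a' && c <= 'z') return 36 + (c - 'a');
    if (c == '@') return 62;
    if (c == '#') return 63;
    return -1;
}

// Decodes a whole field, most significant digit first (assumed).
// Returns -1 if the field contains a character outside the set.
long decode_base64_number(const std::string &text) {
    long value = 0;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        int digit = base64_digit(text[i]);
        if (digit < 0) return -1;
        value = value * 64 + digit;
    }
    return value;
}
```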
void send_Molecule ( Communicate *msg )
Molecule
object to another node
(usually from the master to the other nodes.)
void receive_Molecule ( Message *msg )
Molecule
object with them.
If used correctly, the object should be effectively a copy of the one
on the master node.
void build_constraint_params ( StringList *consref,
StringList *conskfile, StringList *conskcol, PDB *initial_pdb,
char *cwd )
consref
file is supplied), and the last is the working directory.
void build_langevin_params ( StringList *langfile,
StringList *langcol, PDB *initial_pdb, char *cwd)
langfile
langcol
initial_pdb
cwd
Real atommass ( int anum )
Real atomimass ( int anum )
Real atomcharge ( int anum )
Index atomvdwtype ( int anum )
Bond *get_bond ( int bnum )
Angle *get_angle ( int anum )
Dihedral *get_dihedral ( int dnum )
Improper *get_improper ( int inum )
long get_atomtype ( int anum )
LintList *get_bonds_for_atom ( int anum )
anum
.
LintList *get_angles_for_atom ( int anum )
anum
.
LintList *get_dihedrals_for_atom ( int anum )
anum
.
LintList *get_impropers_for_atom ( int anum )
anum
.
Bool checkexcl ( int atom1, int atom2 )
TRUE
if the electrostatic force between the atoms with
global number atom1
and atom2
is explicitly (via the PSF
file) or implicitly excluded (according to the value of the
configuration parameter Exclude
.)
1-4 exclusions are considered only if Exclude
is
set to 1-4
.
Bool check14excl ( int atom1, int atom2 )
TRUE
if the electrostatic force between the atoms with
global number atom1
and atom2
is to be modified according
to the value of the configuration parameter 1-4Scaling
.
Use this only if Exclude
is set to scaled1-4
.
Bool is_atom_constrained ( int atomnum )
TRUE
if the atom with global number atomnum
is constrained.
void get_cons_params ( Real &k, Vector &refPos, int atomnum )
atomnum
.
Do not call this if the atom is not constrained.
Real langevin_param ( int atomnum )
Real langevin_force_val ( int atomnum )
void print_atoms ( void )
void print_bonds ( void )
void print_angles ( void )
void print_dihedrals ( void )
void print_impropers ( void )
void print_exclusions ( void )
MultiBond
provides the code and data which are the same in
the classes AngleForce
, DihedralForce
,
and ImproperForce
, which is to say, almost everything.
The derived classes supply only
AngleForce
,
4 for the others), and
MultiBond
is an abstract base class,
any MultiBond
object must also be an
AngleForce
,
DihedralForce
, or
ImproperForce
object.
We will speak of MultiBond
objects, with the
understanding that actual objects will be an instance of one of
the three derived classes just mentioned.
We shall also use the term "multibond" to refer generically to
an angle, a dihedral angle, or an improper angle.
There is an AngleForce
, a DihedralForce
, and a
ImproperForce
object present in every patch, and each is
responsible for calculating the forces due to angles of its type
between
local atoms, as well as between local atoms and atoms from
neighboring patches.
In contrast to BondForce
objects,
each MultiBond
object processes every multibond involving
an atom on its patch, but
computes the forces only on its local atoms.
Thus, each patch must send coordinates to and wait for coordinates
from each neighbor with whom it shares a multibond.
MultiBond.[Ch]
, HashTable.h
, and structures.h
MultiBond(Patch *parentPatch, PatchList *parentList)
AngleForce(Patch *parentPatch, PatchList *parentList)
DihedralForce(Patch *parentPatch, PatchList *parentList)
ImproperForce(Patch *parentPatch, PatchList *parentList)
The parameters parentPatch
and parentList
allow the object
to know who owns it.
~MultiBond()
AngleForce
, DihedralForce
, and ImproperForce
inherit the destructor.
MultiBond
objects have 7 public functions, in addition to
the constructor and destructor:
initialize_timestep(), set_recycle(), local_init(), neighbor_init(), local_force(), neighbor_force(), get_energy()
At the beginning of each timestep,
initialize_timestep()
must be called to reset various flags.
At the beginning of a cycle, if no atoms have been reassigned
to or from this patch,
set_recycle()
may be called to avoid recomputing certain
data structures;
otherwise the object must be delete
ed and recreated;
there is currently no other way to indicate that
the multibond lists must be rebuilt from scratch.
The next four functions -- local_init()
through
neighbor_force()
-- update the local forces
based on positions supplied to the functions,
as data becomes available during the timestep.
The functions local_init()
and neighbor_init()
are called during the first timestep in a cycle,
and set up data structures in addition to computing forces.
The functions local_force()
and neighbor_force()
only compute the forces.
First, local_init()
or local_force()
must be called
to compute the forces due to multibonds which do not involve
neighbor atoms.
Then, each time a set of coordinates comes in from a neighbor
patch,
neighbor_init()
or neighbor_force()
must be called
to update the forces on local atoms, based on the coordinates
that come in.
The idea is that the forces for a given multibond are computed
as soon as the messages received so far include all the atom
positions needed for that multibond.
The force computation is complete when coordinates have been received
from all neighbors with whom this patch shares any multibonds.
(No check is made for this.)
Each patch accumulates
the multibond potential energy for those
multibonds whose first atom belongs to the patch;
this assures that the sum of the multibond potential
over patches will get the right answer, counting
each multibond exactly once.
The energy thus accumulated by a patch is returned by
a call to get_energy()
.
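The counting rule can be sketched as follows. The struct and function names are hypothetical; the point is only that a multibond seen by several patches contributes its energy on exactly one of them.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Sketch of the "first atom owns the energy" rule.
struct MultibondEnergy {
    int firstAtom;     // global number of the multibond's first atom
    double energy;     // potential energy of this multibond
};

// Sums only those multibonds whose first atom is local to the patch.
double patch_energy(const std::vector<MultibondEnergy> &seen,
                    const std::set<int> &localAtoms) {
    double total = 0.0;
    for (std::size_t i = 0; i < seen.size(); ++i)
        if (localAtoms.count(seen[i].firstAtom))
            total += seen[i].energy;
    return total;
}
```

If two patches both process a shared multibond, only the patch holding its first atom adds the energy, so summing patch_energy() over all patches counts each multibond once.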
To hold the atom positions for multibonds
which have to wait for coordinates from a neighbor patch,
a MultiBond
object keeps a list with a structure for
each such multibond.
Each such structure has an entry for the position of each atom
involved in the multibond.
As each atom position comes in
(via neighbor_init()
or neighbor_force()
),
it is entered into the structures for the multibonds
that include that atom.
Once all the position entries have been filled in,
the forces are computed.
To match neighbor atoms with multibond structures efficiently,
a MultiBond
object also keeps, for each neighbor,
an array of pointers, each of which corresponds to an atom
whose positions it gets from that neighbor,
and points to
a list of the multibonds it appears in,
if any.
These arrays and lists are built by neighbor_init()
during the
first timestep of a cycle.
Since neighbor_force()
receives a bare array of
positions, without atom numbers, it must
identify the atom by its index in the position array.
However, the multibond arrays initially only have global atom numbers.
To convert from one to the other, yet another data structure
is built: a hash table, indexed by global atom number,
each of whose entries points to a list of multibonds which the
atom with that global atom number appears in.
neighbor_init()
then uses the global atom number for each
position received to pick out the list of multibonds for that atom.
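The bookkeeping described above (per-multibond position slots, plus a table from global atom number to the multibonds containing that atom) can be sketched as below. All names are hypothetical, std::unordered_map stands in for the document's own hash table, and for simplicity every position, local or remote, is delivered through the same call.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Vec3 { double x, y, z; };

struct PendingMultibond {
    std::vector<int> atoms;      // global atom numbers (3 or 4 of them)
    std::vector<Vec3> pos;       // one position slot per atom
    std::vector<bool> filled;    // which slots have been filled so far
    int missing;                 // slots still waiting for a position
};

class PendingTable {
public:
    // Registers a multibond and indexes it under each of its atoms.
    void add(const std::vector<int> &atoms) {
        PendingMultibond mb;
        mb.atoms = atoms;
        mb.pos.resize(atoms.size());
        mb.filled.assign(atoms.size(), false);
        mb.missing = static_cast<int>(atoms.size());
        bonds_.push_back(mb);
        for (std::size_t i = 0; i < atoms.size(); ++i)
            byAtom_[atoms[i]].push_back(bonds_.size() - 1);
    }
    // Records one atom position and returns the indices of multibonds
    // that just became complete (where forces could now be computed).
    std::vector<std::size_t> deliver(int globalAtom, const Vec3 &p) {
        std::vector<std::size_t> ready;
        auto it = byAtom_.find(globalAtom);
        if (it == byAtom_.end()) return ready;   // atom is in no multibond
        for (std::size_t k = 0; k < it->second.size(); ++k) {
            PendingMultibond &mb = bonds_[it->second[k]];
            for (std::size_t i = 0; i < mb.atoms.size(); ++i) {
                if (mb.atoms[i] == globalAtom && !mb.filled[i]) {
                    mb.pos[i] = p;
                    mb.filled[i] = true;
                    if (--mb.missing == 0) ready.push_back(it->second[k]);
                }
            }
        }
        return ready;
    }
private:
    std::vector<PendingMultibond> bonds_;
    std::unordered_map<int, std::vector<std::size_t> > byAtom_;
};
```

The lookup by global atom number is only needed while the structures are being set up; once each incoming position index is tied to its list of multibonds, later timesteps can go straight from array index to list.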
At the beginning of a cycle, an object knows which atoms belong
to other patches, but not which patches those atoms belong to.
It cannot send its positions to a particular patch
until it has gotten positions from it to know which multibonds
it shares with the local patch.
The apparent deadlock is resolved by the fact that,
for any pair of patches, one of the patches will send all its positions
to the other without waiting for a request for specific positions.
This is managed outside the MultiBond
class;
it is just that in the receiving patch,
neighbor_init()
will be called with all of the
sending patch's positions.
The receiving patch then signals its caller which positions
should be sent back.
There are certain aspects of the current implementation of this object
that are questionable. Currently, the most questionable is the way the
angle list is constructed. It requires a list of the angles that each
atom is involved in to be stored. This is fairly expensive in terms of
memory, but quick. This scheme should be revisited some time later.
void initialize_timestep ()
set_recycle()
.)
set_recycle()
delete
ing and recreating the object, as "recycling"
the object avoids having to rebuild all the data structures.
void local_init ( int nlocal, int *localGlb, Vector *localx,
local_force()
.
local_force(localx,f)
to compute the forces.
void local_force ( Vector *x, Vector *f )
local_init()
, then bad things will happen.
void neighbor_init ( int nid, int nremote, int nlocal,
neighbor_force()
for processing positions from neighbor nid
.
send_tree
is not null,
adds to it those atoms whose positions
need to be sent to the neighbor nid
.
neighbor_force(nid,localx,remotex,f)
to compute
the forces due to multibonds which can only now be computed.
neighbor_force()
have the same number of positions corresponding to the same atoms
in the same order as the call to neighbor_init()
.
This function should only be called once per neighbor, and
only during the first time step of each cycle.
neighbor_force ( int nid, Vector *localx, Vector *remotex, Vector *f )
nid
and
compute the forces due to multibonds which only now have all atom
positions.
neighbor_init()
must have been called already, thus
neighbor_force()
must be called for every timestep
except the first in a cycle.
BigReal get_energy ( void )
NameTable.[Ch]
NameTable ( int initial_size = 64 )
~NameTable ( void )
long index ( const char *name )
long size ( void )
char *name ( long index )
delete[]
'ing the
string.
NeighborList.[Ch]
NeighborList ( void )
~NeighborList ( void )
createNeighborList
which is passed a three integer overlap array
and a BoundaryMap. The overlap array gives the number of patches that
must be considered neighbors in each direction. Therefore the number of
patches considered in a single dimension j is twice overlap[j] plus one. The
factor of two arises because overlap[j] measures
only one direction and there are two directions in every dimension. The
addition of one includes the central patch in the
count of patches in each dimension. The total
number of neighbors is then the product of these per-dimension counts,
minus one to exclude the central patch, because a
patch is NEVER its own neighbor (the code will detect if this
happens, issue an error message, and abort the run).
createNeighborList
then calculates the relative offsets (in terms
of the BoundaryMap indices) of these neighboring patches. This set of
relative offsets is then used with the BoundaryMap to calculate the
patchIds of the neighboring patches of any given patch.
This is seen most clearly in the following figure.
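The counting rule and the relative offsets can also be sketched in code. This is a minimal sketch under stated assumptions: the struct and function names are hypothetical, and the BoundaryMap lookup that turns offsets into patchIds is omitted.

```cpp
#include <vector>

struct Offset3 { int dx, dy, dz; };

// (2*overlap[0]+1) * (2*overlap[1]+1) * (2*overlap[2]+1) - 1
int count_neighbors(const int overlap[3]) {
    int total = 1;
    for (int j = 0; j < 3; ++j) total *= 2 * overlap[j] + 1;
    return total - 1;                 // exclude the central patch
}

// Enumerates every relative offset except (0,0,0), the patch itself.
std::vector<Offset3> relative_offsets(const int overlap[3]) {
    std::vector<Offset3> out;
    for (int dx = -overlap[0]; dx <= overlap[0]; ++dx)
        for (int dy = -overlap[1]; dy <= overlap[1]; ++dy)
            for (int dz = -overlap[2]; dz <= overlap[2]; ++dz) {
                if (dx == 0 && dy == 0 && dz == 0) continue;
                Offset3 o = { dx, dy, dz };
                out.push_back(o);
            }
    return out;
}
```

For an overlap of one patch in every direction this gives 3 x 3 x 3 - 1 = 26 neighbors.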
createNeighborList ( const int *overlap, const BoundaryMap& map )
const int& operator[] ( const int index ) const
const int Length() const
The Node
class is where all of the actual work for the simulation
is done. It can be thought of as the main()
routine which each
processor runs. It contains all of the other objects to be used during
the simulation and the highest level control structure for the
simulation. One Node
object will exist on each processor and
will handle all of the activity for the simulation required of this
processor.
Node.[Ch]
Node ( int id )
- where id
specifies the logical node
number of the processor this Node
is on.
~Node ( void )
Node
object is created on each processor, there are
basically only two routines that are called externally on each Node
.
One performs all of the necessary setup and is different for the master
node and the client nodes. On the master node, the routine used is
master_startup()
and on the client nodes client_startup()
is used. Once the Node
object is started, the
doSimulation()
function is called, and the Node
object
then takes over the control of the simulation.
Node
object is to hold all of the other objects
in the program and control their execution. The objects that are
contained by each Node
object are:
Molecule *structure
Parameters *params
SimParameters *simParams
PatchDistrib *patchMap
PatchList *patchList
Patch
s that are assigned to this
processor.
void master_startup ( int argc, char **argv )
Node
object on the master node. This involves
starting up all the other objects, reading in the appropriate files, and
then sending this information to all of the client nodes. The arguments
argc
and argv
are used to obtain the user input file name
that is passed on the command line.
void client_startup ( void )
Node
object on a client node. This involves
allocating all of the objects and then receiving the information
from the master process.
void doSimulation ( void )
void velocities_from_PDB ( Vector *v, char *fname )
fname
and the array
of velocity vectors to populate is given by v
.
void random_velocities ( Vector *v, BigReal temp )
temp
argument, and the array of velocity vectors to populate is given by
v
.
The Output
class is used to produce the useful output of fmd,
such as the energy values for a time step, trajectory files, restart
files, and the final position and velocity files. This object exists
only on the master node, and is owned and operated by the Collect
object.
Output.[Ch]
, dcdlib.[Ch]
Output()
~Output()
Output
class. They
are used by the Collect
object to pass along values that it has
collected. The Output
object then takes this data and calls the
appropriate private routines to actually output the data. There are
only a handful of public routines, and each is largely self-explanatory.
void energy ( int timestep, BigReal *energies )
timestep
in the
array energies
and outputs them, currently just to standard output.
It also sums all the component energies to get a total energy sum, and
computes and reports the temperature based on the kinetic energy.
void coordinate ( int timestep, int n, Vector *coor )
n
positions for time step timestep
from the array
coor
and calls the appropriate output routines to
write them to the restart file, trajectory files, etc.,
as specified by the configuration parameters.
void velocity ( int timestep, int n, Vector *vel )
n
velocity vectors
for time step timestep
from the array
vel
and calls the appropriate output routines to
write them to the restart file, trajectory files, etc.,
as specified by the configuration parameters.
void short_force ( int timestep, int n, Vector *f )
n
force vectors
for time step timestep
from the array
f
and calls the appropriate output routines to
write them to the trajectory files, etc., for
short-range electrostatic forces,
as specified by the configuration parameters.
Short-range electrostatic forces are
the ones that are recomputed every timestep.
void long_force ( int timestep, int n, Vector *f )
n
force vectors
for time step timestep
from the array
f
and calls the appropriate output routines to
write them to the trajectory files, etc., for
long-range electrostatic forces,
as specified by the configuration parameters.
Long-range electrostatic forces are the electrostatic
forces that are not recomputed every timestep.
void all_force ( int timestep, int n, Vector *f )
n
force vectors
for time step timestep
from the array
f
and calls the appropriate output routines to
write them to the trajectory files, etc., for
the total forces on the atoms,
as specified by the configuration parameters.
The Parameters
class is used to read in, store, and find
parameters from xplor
style parameter files. Parameters read in
include those for forces due to bonds, bond angles, dihedral angles, improper
dihedral angles, single atom van der Waals, and pairwise van der Waals.
Multiple parameter files may be read, with the latest read values
overriding any previously read. All of the parameters are stored in
internal data structures that are efficient for later retrieval of these
parameters, both during the reading of the molecular structure file, and
during the simulation.
A Parameters
object will exist on each node.
The copy on the master node is used to read the parameter files,
interact with the Molecule
object while the PSF file is
read, and send the fully digested parameter information to the
slave nodes.
Thereafter,
any object may access the parameters through the Parameters
object on its node.
Parameters.[Ch]
, structures.h
, NameTable.[Ch]
Parameters ( void )
~Parameters ( void )
Parameters
object goes through three phases,
during which the data are stored in two different
kinds of data structure.
The first set of structures are various kinds of
tree structures which allow easy insertion of new
parameters while maintaining an ordering which
permits binary searches for lookups.
(See Parameters.C for declarations of the node types.)
The second set are linear arrays for all parameters,
except van der Waals constants which are kept in a
two-dimensional array.
(See Parameters.h for declarations of the array element types.)
Note that phases one and two are only performed on the
master node.
In the first phase, the parameter files are read,
through one or more calls to read_parameter_file()
.
This puts the contents of the parameter files into the
tree structures.
The end of this phase is signaled by a call to
done_reading_files()
.
In the second phase, the structure (PSF file) is read (see class
Molecule
.)
During this phase, each atom, bond, angle, etc., is assigned a "type
index," which is an index in the linear arrays that will be created at
the end of this phase.
The Molecule
object calls functions with names of the form
assign_*_index
to get these type indices, and also let the
Parameters
object know which atoms, bonds, angles, etc., are
actually being used, (so that it does not need to keep the entire
contents of all parameter files.)
The end of the phase is signaled by a call to
done_reading_structure()
.
This call puts the parameters that are actually needed into linear
arrays, indexed by the type index just mentioned.
After this, the master Node
object needs to call
send_Parameters()
to broadcast the parameter arrays and
associated variables to the other nodes, and the slave Node
objects need to call receive_Parameters()
to get them and store
them in the slave's Parameters
object.
Thereafter, the parameters are accessed through inlined functions which
just look up the values in the linear arrays.
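The progression from tree storage to linear arrays can be sketched for a single parameter kind. The MiniParameters class and its read_entry, assign_index, and get members are hypothetical names chosen for this example; only done_reading_structure and the Index typedef are taken from the text, and a std::map stands in for the tree.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef unsigned short Index;   // the typedef described in the text

struct VdwParams { double sigma; double epsilon; };

// Hypothetical miniature of the three phases for one parameter kind.
class MiniParameters {
private:
    struct Node {
        VdwParams params;
        int index;                       // -1 until the type is actually used
        Node() : params(), index(-1) {}
    };
    std::map<std::string, Node> tree_;   // phase-one storage
    std::vector<VdwParams> array_;       // storage after phase two
    int next_;

public:
    MiniParameters() : next_(0) {}

    // Phase 1: insert into an ordered tree; a repeated type overwrites.
    void read_entry(const std::string &type, double sigma, double epsilon) {
        Node &n = tree_[type];
        n.params.sigma = sigma;
        n.params.epsilon = epsilon;
    }

    // Phase 2: hand out a type index the first time a type is requested.
    Index assign_index(const std::string &type) {
        std::map<std::string, Node>::iterator it = tree_.find(type);
        assert(it != tree_.end());       // real code would report an error
        if (it->second.index < 0) it->second.index = next_++;
        return static_cast<Index>(it->second.index);
    }

    // End of phase 2: copy only the used entries into a linear array.
    void done_reading_structure() {
        array_.assign(static_cast<std::size_t>(next_), VdwParams());
        std::map<std::string, Node>::const_iterator it;
        for (it = tree_.begin(); it != tree_.end(); ++it)
            if (it->second.index >= 0)
                array_[it->second.index] = it->second.params;
        tree_.clear();                   // unused entries are dropped
    }

    // Phase 3: constant-time lookup by type index.
    const VdwParams &get(Index i) const { return array_[i]; }
    std::size_t size() const { return array_.size(); }
};
```

Types that are read but never requested through assign_index do not appear in the final array, which is why the object need not keep the full contents of every parameter file.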
Type indices are of type Index
, which is currently
typedef
'ed to unsigned short
, usually a 16-bit integer.
However, the type indices in the tree structure nodes are currently
declared short
, which limits the code to 32,767 different
bond types, angle types, etc., in most C++ implementations.
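The mismatch between the two index types can be checked with the standard limits facilities. The helper names below (max_public_index, max_tree_index, index_fits) are hypothetical and exist only for this sketch; the exact limits depend on the C++ implementation.

```cpp
#include <cassert>
#include <limits>

typedef unsigned short Index;  // as described in the text

// Largest type index representable by the public Index type.
inline long max_public_index() {
    return std::numeric_limits<Index>::max();
}

// Largest index representable by a signed short, the type the text says
// is used inside the tree nodes. This is the effective limit.
inline long max_tree_index() {
    return std::numeric_limits<short>::max();
}

// Hypothetical guard: true if an index fits in both representations.
inline bool index_fits(long index) {
    assert(index >= 0);
    return index <= max_tree_index() && index <= max_public_index();
}
```

On an implementation with 16-bit short, max_tree_index() is 32,767 while max_public_index() is 65,535, so the signed type in the tree nodes is the binding constraint.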
void read_parameter_file ( char * filename )
name
and
store them for later use. This routine can be called multiple times
with different file names.
void done_reading_files ( )
long atom_type_index ( char * id )
char *atom_type_name ( long index )
NameTable
.
Index assign_vdw_index ( long atomtype )
Index
for the van der Waals parameters for the atom
type specified by atomtype
.
Index assign_bond_index ( long atomtype1, long atomtype2 )
Index
for the bond parameters for a bond between
atom types atomtype1
and atomtype2
.
Index assign_angle_index ( long atomtype1, long atomtype2,
long atomtype3 )
Index
for the angle parameters for an angle between
atom types atomtype1
, atomtype2
, and atomtype3
.
Index assign_dihedral_index ( long atomtype1, long atomtype2,
long atomtype3, long atomtype4, int multiplicity )
Index
for the dihedral angle parameters for a dihedral
angle between atom types atomtype1
, atomtype2
,
atomtype3
, and atomtype4
.
multiplicity
is the number of times in a row that this particular
dihedral (i.e., this particular sequence of atom numbers, not types)
appears in the PSF file.
The maximum multiplicity specified for a given dihedral type is
what get_dihedral_params()
will return as the number
of sets of dihedral parameters to be used for this dihedral angle.
Index assign_improper_index ( long atomtype1, long atomtype2,
long atomtype3, long atomtype4, int multiplicity )
Index
for the improper angle parameters
for an improper angle between
atom types atomtype1
, atomtype2
,
atomtype3
, and atomtype4
.
multiplicity
is the number of times in a row that this particular
improper (i.e., this particular sequence of atom numbers, not types)
appears in the PSF file.
The maximum multiplicity specified for a given improper type is what
get_improper_params()
will return as the number of sets of
improper parameters to be used for this improper angle.
void done_reading_structure ( )
send_Parameters()
or receive_Parameters()
(depending on the node)
must be called after this function.
After these calls, the parameter access functions may be called.
void send_Parameters ( Communicate *comm )
done_reading_structure()
must be called before this
function.
After this call, the parameter access functions may be called
on any node.
void receive_Parameters ( Message *msg )
void get_bond_params ( Real *k, Real *x0, Index index )
k
and x0
for the bond with index
index
.
void get_angle_params ( Real *k, Real *theta0, Index index )
k
and theta0
for the bond angle
with index index
.
void get_improper_params ( Real *k, int *n, Real *delta, Index index )
k
, n
, and delta
for the
improper dihedral angle with index index
.
void get_dihedral_params ( Real *k, int *n, Real *delta, Index index )
k
, n
, and delta
for the
dihedral angle with index index
.
void get_vdw_params ( Real *sigma, Real *epsilon, Real *sigma14,
Real *epsilon14, Index index )
sigma
, epsilon
,
sigma14
and epsilon14
for the atom with index
index
.
This is used for display
and output (Molecule.C
) purposes.
In the MD calculation, get_vdw_pair_params
is used instead.
VdwCoef *get_vdw_pair_params(Index ind1, Index ind2, Bool is14 )
A
and B
for the atoms with van der Waals indices ind1
and ind2
.
If the atoms form a 1-4 pair and the configuration file
parameter Exclude
is set to 1-4scaled
,
TRUE
should be passed as is14
to get the modified
van der Waals parameters for the pair.
(See Parameters.h
for a definition of type VdwCoef
.)
print_param_summary ( void )
print_bond_summary ( void )
print_angle_summary ( void )
print_dihedral_summary ( void )
print_improper_summary ( void )
print_vdw_summary ( void )
print_vdw_pair_summary ( void )
ConfigList
. It is designed
to simplify the SimParameters
class by providing a uniform
mechanism for determining which options are valid at run time. It is
rather limited as future fmd development will tend towards run-time
communications with a more general controller. Hence, a general parser
is not required. This class is used in four phases: set up the
dependencies, verify the internal consistency, read in the
ConfigList
, and get or use the defined variables.
The first phase tells the class which options are dependent on others.
Only one dependency is allowed. The dependent option is called the
"child" and the option it is dependent upon is called the "parent". The
dependencies can be expressed as a tree with the option "main" defined
as the uppermost parent. Hence, all other options are dependent on
"main". In addition, it can be used to define a range for which a given
option is valid, as well as give its units.
The second phase checks that all elements are derived from the "main",
and that there are no loops.
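The check performed in the second phase can be sketched as a walk up the parent links. This is an illustration under stated assumptions, not the FMD code: the ParentMap type and the check_dependencies function are hypothetical, and each option is modeled simply as a name mapped to its single parent name, with "main" as the implicit root.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical model: each option name maps to its single parent name.
typedef std::map<std::string, std::string> ParentMap;

// Returns true if every option reaches "main" by following parent links,
// with no loops and no parent that was never declared.
inline bool check_dependencies(const ParentMap &parent_of) {
    // "main" is the implicit root and must not be declared as a child.
    assert(parent_of.find("main") == parent_of.end());
    ParentMap::const_iterator it;
    for (it = parent_of.begin(); it != parent_of.end(); ++it) {
        std::string cur = it->first;
        std::size_t steps = 0;
        while (cur != "main") {
            ParentMap::const_iterator p = parent_of.find(cur);
            if (p == parent_of.end()) return false;        // undeclared parent
            if (++steps > parent_of.size()) return false;  // walked too far: loop
            cur = p->second;
        }
    }
    return true;
}
```

A chain such as fmalevels -> fmaon -> main passes, while two options that name each other as parent, or an option whose parent was never declared, cause the check to fail.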
The third phase reads the elements from the ConfigList
and sets
the appropriate options. If desired, a pointer can be passed into
ParseOptions
which will also be set at this time. If options
are out of range or too many are given, an error message is printed.
If extra or unknown options are defined in the ConfigList
, they
are printed as warnings.
The fourth phase is used to read options which were set. This is in some
sense optional, as the same information can be extracted through the
variable pointer passed in during the third phase.
ParseOptions.[Ch]
ParseOptions ( void )
~ParseOptions ( void )
Range
FREE_RANGE, POSITIVE, NOT_NEGATIVE, NEGATIVE,
NOT_POSITIVE
Units
UNIT, FSEC, NSEC, SEC, MIN, HOUR, ANGSTROM, NANOMETER,
METER, KCAL, KJOULE, EV, KELVIN, UNITS_UNDEFINED
const char *string ( Range r )
const char *string ( Units u )
BigReal convert ( Units to, Units from )
from
to
to
.
PARSE_BIGREAL
FLOAT
without giving a pointer (same as
"(BigReal *)NULL
").
PARSE_FLOAT
PARSE_BIGREAL
.
PARSE_VECTOR
VECTOR
without giving a pointer (same as
"(Vector *)NULL
").
PARSE_INT
INT
without giving a pointer (same as
"(int *)NULL
").
PARSE_UINT
UINT
without giving a pointer (same as
"(unsigned int *)NULL
").
PARSE_BOOL
BOOL
without giving a pointer (same as
"(int *)NULL
").
PARSE_STRING
STRING
without giving a pointer (same as
"(char *)NULL
").
PARSE_ANYTHING
STRINGLIST
.
PARSE_MULTIPLES
STRINGLIST
that can have multiple elements (same as "(StringList **)NULL, TRUE
").
fmalevels
is dependent on the option
fmaon
; if the latter (the parent) is not given in the input file,
then the former (the child) is never needed. There are two levels of
dependency: "require" and "optional". In the given example, if it is
required that fmalevels
be given in the user input file whenever
fmaon
is true
, then the appropriate code would be:
    ParseOptions opts;
    . . .
    opts.require ( "fmaon", "fmalevels", "Number of FMA expansion levels" );
    . . .
The first term is a string
(char *)
with the name of the parent,
while the second string is the name of the child. The name strings are
not case sensitive. The third argument is the message string displayed
when a warning or error occurs which involves the child option. The
special parent name main
is used for options which are not dependent
on other options. Now if fmalevels
is optional, the code would look
like:
    ParseOptions opts;
    . . .
    opts.optional ( "fmaon", "fmalevels", "Number of FMA expansion levels" );
    . . .
These functions are overloaded and can take several more parameters. The first passes a pointer of type
(int *)
, (BigReal *)
,
(Vector *)
, (char *)
, or (StringList *)
to the
ParseOptions
class. Then, when the values are set from the
ConfigList
, this pointer is used to set the appropriate variable
automatically. For the int
, BigReal
, and Vector
types, the next term (if it exists) defines a default value which is
used if the option is not defined in the user input file but the parent
is. There is no way to give a default value for (char *)
or
StringList
. The "set" for a (char *)
is actually a
strcpy()
, so you need to have allocated the space.
Using the same example, suppose the value of fmalevels
is to be
placed in the variable fma_levels
. The following examples
show the cases without and with a default value of 5 being specified:
    ParseOptions opts;
    int fma_levels;
    . . .
    opts.require ( "fmaon", "fmalevels", "Number of levels", &fma_levels );
    . . .
    opts.require ( "fmaon", "fmalevels", "Number of levels", &fma_levels, 5 );
    . . .
Functionally, there is no difference between "require" and "optional" if a default value is given. Also, if the pointer is
NULL
, nothing
bad happens. This is useful if you want an option to have a range, but
don't want to use it immediately.
There is a special form of these functions for BOOL
values. fmd
does not have a real "boolean" variable; it uses "typedef int Bool
",
which is indistinguishable from int
to the compiler. Since it is
much nicer to be able to say "ffton yes
" than "ffton 1
", the
functions "requireB" and "optionalB" were made. Boolean terms do more
than define a "yes/no" value (which, by the way, is internally handled
as an integer); they are also used to turn on other blocks of code.
Now normally, if the option is defined in the user input file and other
data is dependent on that parent, then those dependencies are checked.
Using that definition, setting fmaon
to either on
or off
will tell ParseOptions
to check terms like fmalevels
, which are
dependent on that parent. To get around that problem, if a boolean value
is a parent and is defined false
, then it is treated as undefined. This makes
"fmaon no
" identical to not listing fmaon
at all.
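The boolean handling just described can be sketched in two small functions. The names parse_bool and parent_is_defined are hypothetical, the accepted words follow the atoBool description later in this section, and both the return value of 1 for a true word and the case-insensitive matching are assumptions of this example.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns 1 for yes/on/true, 0 for no/off/false, and -1 for anything else.
// Case-insensitive matching is an assumption of this sketch.
inline int parse_bool(const std::string &text) {
    std::string s;
    for (std::string::size_type i = 0; i < text.size(); ++i)
        s += static_cast<char>(
            std::tolower(static_cast<unsigned char>(text[i])));
    if (s == "yes" || s == "on" || s == "true") return 1;
    if (s == "no" || s == "off" || s == "false") return 0;
    return -1;
}

// A boolean option that has children counts as defined only when it is
// true, so "fmaon no" behaves like not listing fmaon at all.
inline bool parent_is_defined(const std::string &text, bool has_children) {
    int value = parse_bool(text);
    assert(value != -1);  // real code would report an input error instead
    if (has_children && value == 0) return false;
    return true;
}
```

With this rule, children such as fmalevels are checked only when their boolean parent is actually switched on.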
Here is how the dependencies could look for fma
:
    ParseOptions opts;
    int fmaon, fmalevels, fmamp, fmaffton, fmafftblock;
    . . .
    opts.optionalB ( "main", "fma", "Should FMA be used?", &fmaon );
    opts.require ( "fma", "fmalevels", "Tree levels in FMA", &fmalevels, 5 );
    opts.require ( "fma", "fmamp", "Number of FMA multipoles", &fmamp, 4 );
    opts.requireB ( "fma", "fmafft", "Use FFT enhancements?", &fmaffton, FALSE );
    opts.require ( "fmafft", "fmafftblock", "FFT blocking factor", &fmafftblock, 4 );
    . . .
By default, at most one option of a given name is allowed. If there are multiple definitions,
ParseOptions
prints that fact to
fmdErr
. Very few options allow multiple inputs of the same
option; the only one at this time is parameters
. The only way to
define this is through the StringList
version of optional/require.
The first three arguments are identical to the other similar functions, and
the fourth takes a "(StringList **)
". The fifth argument, which
by default is FALSE
, defines if multiple instances are allowed.
For example, the following allows multiple parameters
keywords:
    ParseOptions opts;
    . . .
    opts.require ( "main", "parameters", "One entry for each force field file", (StringList **)NULL, TRUE );
    . . .
Many types of input must be positive, for example the timestep size and the number of steps per cycle. Range checking of this type and others can be done in the
ParseOptions
class with the range()
function. This takes the option name and the range it can take (one
of those in the Range enumeration list above). Here's an example of how
to specify that fmalevels
must be positive:
    ParseOptions opts;
    . . .
    opts.range ( "fmalevels", POSITIVE );
    . . .
Some of the options represent physical values, which have units associated with them. The options can assume that the input will always have a specified unit, and the user must use those units by default. However, if the parser understands some common units (like "fs" and "nm"), then the user can specify those directly. The function named
units()
gets and sets the units associated with the given option.
During the parsing, the user input gets translated as need be so that it
is in the specified units. Here's how the code could look:
    ParseOptions opts;
    . . .
    opts.units ( "timestep", FSEC ); // time steps in femtoseconds
    . . .
The following three keyword entries all specify a time step size of 1 fs:
    timestep = 1
    timestep = 1fs
    timestep = 0.000001ns
After the dependency, range and units functions are defined, the
ParseOptions
should be checked to ensure there are no cyclic
dependencies and that all options are accessible from "main".
This is done with the check_consistency()
function. It returns
TRUE
if the system contains no errors.
Once that is done, the values can be set via the function set()
,
which is passed the ConfigList
. This does the dependency and
range checks, as well as set any variables which may have been passed in
via a pointer. If there was a range error, it prints (to fmdErr
)
the reason for the error as well as the message associated with that
option. Errors related to units conversion are also printed at this time.
If there was an option in the ConfigList
that was not in the language,
it prints (to fmdWarn
) all such unknown options. It also prints
(to fmdWarn
) all options for which the keyword was known but was
not required. The last several functions would look like:
    ParseOptions opts;
    . . .
    if ( ! opts.check_consistency() ) {
        fmdErr << "Internal parsing unsuccessful" << sendmsg;
        return 1;
    }
    ConfigList clist ( "test.fmd" ); // open and read the file
    if ( ! clist.okay() ) {
        fmdErr << "Cannot read 'test.fmd'" << sendmsg;
        return 1;
    }
    if ( ! opts.set ( clist ) ) {
        fmdErr << "There were errors in the input file" << sendmsg;
        return 1;
    }
    . . .
There are several ways to access the information. As mentioned earlier, it is possible to have variables set automatically via pointers passed during the dependency definition. It is also possible to use one of the
get()
functions. These all take as the first argument the name of
the option and, for the second argument, a pointer to where the information
should be stored. ParseOptions
knows the data type of
that option from the dependency definition and can perform type
conversion for most cases; the exceptions are listed in the definition
section. All these functions return 1 if successful
and print (to fmdWarn
) a warning if a type conversion took place.
There are two ways to access data with multiple values: as a
"(StringList *)
" or via a "(char *)
". "get(char
*s,int n=0)
" takes as an optional parameter the index of the string to
return in the StringList
. The total number of elements in a
StringList
is accessible via num()
.
For example, the following code could be used to list the given
parameter files:
    int num = opts.num ( "parameters" );
    char s[100];
    for ( int i = 0; i < num; i++ ) {
        opts.get ( "parameters", s, i );
        fmdInfo << " " << s << sendmsg;
    }
Finally,
defined()
tells if a given option was defined during
the set()
, and exists()
tells if a given option was
stated in the dependencies.
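The range and units handling that set() performs on a numeric option can be sketched as follows. This is an illustration only: the TimeUnits enumeration is a time-only subset of the Units list, scan_time, in_range, and seconds_per are hypothetical helpers, and having convert() return a multiplicative factor is an assumption about the function listed in this section. The suffixes "fs" and "ns" are the ones used in the timestep example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

enum Range { FREE_RANGE, POSITIVE, NOT_NEGATIVE, NEGATIVE, NOT_POSITIVE };
enum TimeUnits { FSEC, NSEC, SEC };  // time-only subset of the Units list

// Seconds per unit, used to build conversion factors.
inline double seconds_per(TimeUnits u) {
    switch (u) {
        case FSEC: return 1e-15;
        case NSEC: return 1e-9;
        default:   return 1.0;    // SEC
    }
}

// Factor that converts a value expressed in `from` into `to`.
inline double convert(TimeUnits to, TimeUnits from) {
    return seconds_per(from) / seconds_per(to);
}

inline bool in_range(double v, Range r) {
    switch (r) {
        case POSITIVE:     return v > 0.0;
        case NOT_NEGATIVE: return v >= 0.0;
        case NEGATIVE:     return v < 0.0;
        case NOT_POSITIVE: return v <= 0.0;
        default:           return true;   // FREE_RANGE
    }
}

// Parse text such as "1", "1fs", or "0.000001ns" into `target` units and
// apply the range check. Returns false on any error.
inline bool scan_time(const std::string &text, TimeUnits target,
                      Range range, double *out) {
    assert(out != 0);
    char *end = 0;
    double v = std::strtod(text.c_str(), &end);
    if (end == text.c_str()) return false;    // no number found
    std::string suffix(end);
    TimeUnits given = target;                 // no suffix: already in target units
    if (suffix == "fs") given = FSEC;
    else if (suffix == "ns") given = NSEC;
    else if (!suffix.empty()) return false;   // unknown unit
    v *= convert(target, given);              // 1 ns is 1,000,000 fs
    if (!in_range(v, range)) return false;
    *out = v;
    return true;
}
```

Doing the conversion before the range check means the range always applies to the value in the option's own units, whatever units the user typed.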
void add_element ( DataElement * )
int make_dependencies ( DataElement * )
atoBool
yes
, on
, or
true
; 0 if false
, no
, or off
; and -1 if
anything else.
Bool is_parent_node ( DataElement * )
TRUE
if the given DataElement
has any children.
Used to determine if a BOOLEAN
which is false should be undefined.
int require ( const char *newname, const char *parent, const char *msg,
BigReal *ptr, BigReal default ) or
BigReal *ptr )
BigReal
, with and without a default.
int require ( const char *newname, const char *parent, const char *msg,
Vector *ptr, Vector default ) or
Vector *ptr )
Vector
, with and without a default.
int require ( const char *newname, const char *parent, const char *msg,
int *ptr, int default ) or
int *ptr )
int
, with and without a default.
int requireB ( const char *newname, const char *parent, const char *msg,
int *ptr, int default ) or
int *ptr )
Boolean
(variant of int
), with and
without a default. Looks for "yes/no", "true/false", or "1/0".
int require ( const char *newname, const char *parent, const char *msg,
unsigned int *ptr, unsigned int default ) or
unsigned int *ptr )
unsigned int
, with and without a default.
int require ( const char *newname, const char *parent, const char *msg,
StringList **ptr=NULL, int many_allowed=FALSE )
StringList
return. If TRUE
, the
second argument allows a StringList
with more than one element. There
are no defaults for a StringList
.
int require ( const char *newname, const char *parent, const char *msg,
char *ptr )
(char *)
return. There are no defaults
for a (char *)
. If the option exists in the ConfigList
,
it is copied (via strcpy
) to ptr.
int optional ( const char *newname, const char *parent, const char *msg,
BigReal *ptr, BigReal default ) or
BigReal *ptr )
BigReal
, with and without a default.
int optional ( const char *newname, const char *parent, const char *msg,
Vector *ptr, Vector default ) or
Vector *ptr )
Vector
, with and without a default.
int optional ( const char *newname, const char *parent, const char *msg,
int *ptr, int default ) or
int *ptr )
int
, with and without a default.
int optionalB ( const char *newname, const char *parent, const char *msg,
int *ptr, int default ) or
int *ptr )
Boolean
(variant of int
), with and
without a default (see requireB
above).
int optional ( const char *newname, const char *parent, const char *msg,
unsigned int *ptr, unsigned int default ) or
unsigned int *ptr )
unsigned int
, with and without a default.
int optional ( const char *newname, const char *parent, const char *msg,
StringList **ptr=NULL, int many_allowed=FALSE )
StringList
return. If TRUE
, the
second argument allows a StringList
with more than one element. There
are no defaults for a StringList
.
int optional ( const char *newname, const char *parent, const char *msg,
char *ptr )
(char *)
return. There are no defaults
for a (char *)
. If the option exists in the ConfigList
,
it is copied (via strcpy
) to ptr.
void range ( const char *name, Range newrange )
name
has the range specified
by argument newrange
Range range ( const char *name )
name
.
void units ( const char *name, Units newunits )
name
has the units specified
by argument newunits
Units units ( const char *name )
name
.
Bool scan_float ( DataElement *el, const char *s )
BigReal
. If there are units, do the necessary
conversion. Put the final result in el->fdata
. Returns error
value.
Bool scan_vector ( DataElement *el, const char *s )
Vector
. If there are units, do the necessary
conversion. Put the final result in el->fdata
. Returns error
value.
Bool scan_int ( DataElement *el, const char *s )
int
. If there are units, do the necessary
conversion. Put the final result in el->idata
. Returns error
value.
Bool scan_uint ( DataElement *el, const char *s )
unsigned int
. If there are units, do the necessary
conversion. Put the final result in el->uidata
. Returns error
value.
Bool scan_bool ( DataElement *el, const char *s )
TRUE
or FALSE
. If so,
set el->idata
appropriately. Returns error code.
Bool set_float ( DataElement * )
BigReal
data value is in range (returns FALSE
if not) and sets the (BigReal *)
, if appropriate.
Bool set_vector ( DataElement * )
Vector
pointer is defined, set it to the current values.
Bool set_int ( DataElement * )
int
data value is in range (returns FALSE
if not) and sets the (int *)
, if appropriate.
Bool set_uint ( DataElement * )
unsigned int
data value is in range (returns
FALSE
if not) and sets the (unsigned int *)
, if appropriate.
void set_bool ( DataElement * )
BOOLEAN
data value is in range (returns FALSE
if not) and sets the (BOOLEAN *)
, if appropriate.
void set_stringlist ( DataElement * )
(StringList **)
, if appropriate.
void set_string ( DataElement * )
ConfigList
to the
(char *)
, if appropriate.
Bool check_consistancy ( void )
DataElement
array.
Bool set ( const ConfigList& configlist )
configlist
, checks the internal
data array, sets the appropriate data pointers, prints all
warnings and/or errors, and returns TRUE
if it all worked out.
DataElement *internal_find ( const char *name )
NULL
if it doesn't exist.
get ( const char *name, int *val )
int
value associated with the given
name, doing type conversion if necessary. If conversion was needed,
prints the warning (or error, if it is a Vector
) to the
screen. Returns FALSE
if the name doesn't exist.
get ( const char *name, BigReal *val )
BigReal
value associated with the given
name, doing type conversion if necessary. If conversion was needed,
prints the warning (or error, if it is a Vector
) to the
screen. Returns FALSE
if the name doesn't exist.
get ( const char *name, Vector *val )
Vector
value associated with the given
name. It can only do conversions from STRING
and STRINGLIST
.
Returns FALSE
if the conversion is not possible.
get ( const char *name, StringList **val )
(*val)
to (configList->find(name))
, if name
exists, else it returns FALSE
.
get ( const char *name, char *val, int n=0 )
n
'th element of val
, or returns FALSE
if
name
or the element does not exist.
exists ( const char *name )
TRUE
if an element with the specified name exists in the
internal DataElement
array, else returns FALSE
.
defined ( const char *name )
TRUE
if an element with the specified name exists and
was given either a default or was in the ConfigList
.
DataElement
ParseOptions
and is private to the rest of the code.
Its only functions are constructors, which are used to set the correct
values and ensure that other values are sane. It stores all the
information needed to describe the data elements used by
ParseOptions
. It knows the name of the datum, its parent's name,
the pointer to the parent (if it exists), whether the datum is optional
or required, the type of the data (through run-time data typing), the
default value (if it exists), and so on. Inside ParseOptions
,
the DataElement
's are stored as an array which grows (during
add_element
), as needed.
UNDEF
, FLOAT
, VECTOR
, INT
,
BOOL
, STRINGLIST
, STRING
- the types used for run-time typing.
DataElement ( const char *newname,
const char *newparent,
int optional, const char *err,
BigReal *ptr, BigReal default ) or
BigReal *ptr )
FLOAT
type with the given default value which can set ptr, or
without a default value.
DataElement ( const char *newname,
const char *newparent,
int optional, const char *err,
Vector *ptr, Vector default ) or
Vector *ptr )
VECTOR
type with the given default value which can set ptr, or
without a default value.
DataElement ( const char *newname,
const char *newparent,
int optional, const char *err,
int *ptr, int default ) or
int *ptr )
INT
type with the given default value which can set ptr, or
without a default value.
DataElement ( const char *newname,
const char *newparent,
int optional, const char *err,
unsigned int *ptr, unsigned int default ) or
unsigned int *ptr )
UINT
type with the given default value which can set ptr, or
without a default value.
DataElement ( const char *newname,
const char *newparent,
int optional, const char *err,
StringList **ptr, int many_allowed=FALSE )
STRINGLIST
type which may or may not have multiple elements.
DataElement ( const char *newname,
const char *newparent,
int optional, const char *err,
char *ptr )
STRING
type which can strcpy
to ptr
.
Patch
object is responsible for maintaining a region of
space in the simulation. This includes maintaining the current
positions and velocities of all atoms in this region, calculating all
the forces acting on these atoms, and integrating the equations of
motion for these atoms during each time step.
Patch.[Ch]
Patch
object is created from scratch. It creates an empty patch
with no atoms. The second constructor is used when the Patch
object being created is a patch that has migrated from another processor.
Patch ( int, PatchList * )
PatchList
object it owns.
Patch ( int, PatchList *, Message * )
Message
. After
construction, the Message
is deleted.
~Patch ( void )
Patch
object is responsible for managing most of the work
done during the simulation. It owns a region of space and is
responsible for all of the atoms in that region. This responsibility
includes maintaining the current positions and velocities for the atoms,
calculating and gathering all of the forces necessary to integrate the
equations of motion at each time step, and adding or deleting atoms that have
moved into or out of its region. To accomplish these things, each
Patch
object contains a number of other objects. These objects
include force objects that calculate the various force components for
the local atoms, and an Integrate
object that is responsible for
integrating the equations of motion during each time step.
The Patch
object's main purpose is to provide a control and data
framework which allows each of its member objects to function. To this
end, each time step follows a basic sequence of steps. These steps are:
send_msgs()
function. This allows interactions with neighboring
atoms to be calculated. A self-message is also sent, telling the
patch to actually perform the local calculation. During normal, mid-cycle
time steps, this step actually occurs at the end of the previous time
step, rather than at the beginning of a new time step. See the
Communicate
description for more details.
Integrate
object to obtain new values for the position and velocity of all local
atoms. Then the energies from all the force objects are gathered and
the kinetic energy is calculated. The resulting values are passed to
the local Collect
object so that they can be summed across all
processors.
Patch
object deals with
administrative activities, such as start-up and shut-down, the transfer
of atoms from patch to patch, recalibration of force objects at the
cycle boundaries, etc.
Patch
object:
BondedWithNeighbor
num
atoms
num
that contains the local atom indexes of the
atoms that are to be sent to this neighbor.
coords
Vector
array of size num
that is used to gather the
coordinates to be sent to this neighbor. This provides a buffer to
gather the positions in during each time step so that space doesn't need
to be allocated and freed during each time step.
BondedWithNeighbor
, the act of gathering atoms for a
neighbor structure is accomplished as follows:
    BondedWithNeighbor bwn;
    int i;
    for ( i = 0; i < bwn.num; i++ )
        bwn.coords[i] = x[bwn.atoms[i]];
ReturnForceVectors
num
forces
num
which holds the vectors to be
accumulated. The array is set to all 0's at the beginning of each time
step. When the coordinate message from this neighbor is processed,
these forces are populated and then sent back to the neighbor.
AtomMsgList
Message
's that have been received. During atom reassignment, all
the atom reassignment messages received from neighbors that actually
contain new atoms for this patch are buffered in these lists. Once all
the messages have been received, all the messages are processed at once. The structure contains:
num
msg
Message
object.
next
PatchList *parentList
PatchList
object that owns this patch. This is used
to access items contained in the PatchList
object easily.
int myId
int numAtoms
int currentTimestep
Patch
is currently working on. At present,
patches that reside on the same node will always be working on the same
time step. Patches that reside on different nodes may be working on a
different time step. Thus, at any time, any neighboring patch may be working
on a different time step.
int *atoms
numAtoms
that contains the global indexes for
the local indexes. Thus atoms[i]
contains the global index for
the i'th local atom.
Vector *x
Vector *v
Vector *f
BondForce *bondForce
AngleForce *angleForce
ElectForce *electForce
ImproperForce *improperForce
DihedralForce *dihedralForce
Integrate *integrator
int numSend
int numRecv
int *sendNeighbors
numSend
that contains the patch ID's of the
neighbors that we send all coordinates to.
int *recvNeighbors
numRecv
that contains the patch ID's of the
neighbors that we receive all coordinates from.
BondedWithNeighbor *bondedInfo
numRecv
that contains the information about the
bonded coordinates that we need to send to each neighbor that we send
only bonded coordinates to. The information is placed into this array
in the same order as the patch ID's are placed in recvNeighbors
.
In other words, if patch ID p
satisfies
p=recvNeighbors[i]
, then the information about the bonded
coordinates to send to patch p
is contained in the structure
bondedInfo[i]
.
ReturnForceVectors *returnForces
numSend
that provides a place to accumulate
forces that are to be returned to a neighboring patch that sends this
patch all of its atom coordinates. As with bondedInfo
, the
entries in returnForces
correspond to the patch ID of the
corresponding element in sendNeighbors
.
Bool doneLocal
int tsType
BEGIN_CYCLE
or MID_CYCLE
,
which indicates whether or not this is the first time step in a cycle.
This is important, since during the first time step of each cycle, the
messages received and actions performed are slightly different,
as is the initialization done by each force object. The MID_CYCLE
steps all perform the same operations.
int coorMsgsOutstanding
numSend+numRecv
. As each
bonded-coordinate and all-coordinate message is processed, this value is
decremented by one. When this value reaches 0, the patch has processed
all of the coordinate messages for the current time step.
int forceMsgsOutstanding
numSend
. When this value
reaches 0, the patch has processed all of the force messages for
the current time step.
int atomMsgsOutstanding
int numAtomsRemoved
int numAtomsAdded
LintList *atomsRemoved
AtomMsgList *atomsReceived
AtomMsgList *atomListTail
atomsReceived
so that new messages can
be added to the list in constant time.
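The role of the tail pointer can be shown with a minimal singly linked list. The MsgNode and MsgQueue types and the append function are hypothetical stand-ins for this sketch; in particular, an int payload replaces the buffered Message pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node: `payload` stands in for the buffered Message pointer.
struct MsgNode {
    int payload;
    MsgNode *next;
};

struct MsgQueue {
    MsgNode *head;   // plays the role of atomsReceived
    MsgNode *tail;   // plays the role of atomListTail
    MsgQueue() : head(NULL), tail(NULL) {}
};

// Append in constant time by linking after the tail; the list is never
// walked to find its end.
inline void append(MsgQueue &q, MsgNode *node) {
    assert(node != NULL);
    node->next = NULL;
    if (q.tail == NULL) q.head = node;   // first element
    else q.tail->next = node;            // link after the current last element
    q.tail = node;
}
```

Without the tail pointer, each append would have to traverse the list from the head, so buffering n messages would cost time proportional to n squared rather than n.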
Patch
object is owned and managed by a PatchList
object. Thus, almost all calls to these public functions are done by
the PatchList
object.
void get_initial_positions ( Message *msg )
void get_initial_velocities ( Message *msg )
void initialize_timestep ( int timestep )
void send_msgs ( int timestep )
void process_msg ( Message *msg, int tag )
void end_cycle ( void )
void send_atom_reassignments ( void )
void receive_atom_reassignment ( Message *msg )
void copy_self ( Message *msg )
int getNumAtoms ( void )
int *getAtomList ( void )
int id ( void )
void patch_debug ( void )
Patch
object itself,
and they perform much of the work of the simulation.
void all_coordinate_init ( Message *msg, int nId, int nIndex )
msg
is the object containing the data, nId
is the ID of
the neighbor who sent the message, and nIndex
is the index of
this neighbor into the recvNeighbors
and other arrays. This
function is only used during the first timestep of each cycle.
During mid-cycle steps, the function all_coordinate_force()
is used
to calculate forces without doing the initialization activities.
void all_coordinate_force ( Message *msg, int nId, int nIndex )
msg
is
the object containing the data, nId
is the ID of the neighbor who
sent the message, and nIndex
is the index of this neighbor into
the recvNeighbors
and other arrays. This function is used during
every mid-cycle time step.
bonded_coordinate_init ( Message *msg, int nId )
msg
contains the data sent out, and
nId
is the patch ID of the neighbor it was sent to. This function
is only used during the first time step of each cycle. During mid-cycle
steps, the function bonded_coordinate_force()
is used to calculate
the forces without doing the initialization activities.
void bonded_coordinate_force ( Message *msg, int nId )
msg
contains the coordinates and nId
is the
patch ID of the neighbor who sent the message. This function is used
during mid-cycle time steps.
void local_init ( void )
local_force()
is used to calculate forces during mid-cycle time
steps.
void local_force ( void )
void add_forces ( Message *msg )
void prepare_bonded_coords ( BondedWithNeighbors *bwn )
BondedWithNeighbors
structure that describes the coordinates
that need to be sent are passed in bwn
.
BigReal get_kinetic_energy ( void )
void update_atom_info ( void )
atoms
, x
, v
, and f
to match the new number
of atoms in the patch
int get_list_index ( int value, const int *list, int numvalues )
value
is the integer to be searched for, list
is
the list of integers to be searched, and numvalues
is the number
of integers in list
.
PatchDistrib
object is used to determine and report the
distribution of patches to processors. For the current implementation,
it is responsible for determining what processor every patch on the
system belongs to and distributing this information to every other
PatchDistrib
object on the system. Each object is then also
responsible for returning the processor location of any patch. In order
to be efficient, the PatchDistrib
object allows constant time
access to patch information in two ways: by patch ID number, and by grid
coordinates. Grid coordinates identify the position of a patch in a set
of grid coordinates where the origin is the lowest corner of the model.
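The two constant-time lookups described above can be illustrated with a minimal sketch. It is not the PatchDistrib implementation: the struct name, its members, and the numbering scheme (x varying fastest) are all assumptions made for illustration.

```cpp
#include <vector>

// Hypothetical sketch: constant-time lookup of a patch's processor
// either by patch ID or by (i, j, k) grid coordinates.
struct PatchMapSketch {
    int nx, ny, nz;           // number of patches along each axis
    std::vector<int> nodeOf;  // nodeOf[patchId] = owning processor

    PatchMapSketch(int nx_, int ny_, int nz_)
        : nx(nx_), ny(ny_), nz(nz_), nodeOf(nx_ * ny_ * nz_, 0) {}

    // Assumed numbering: x varies fastest, then y, then z.
    int patchId(int i, int j, int k) const { return i + nx * (j + ny * k); }

    // Both lookups are a single array access.
    int nodeById(int id) const { return nodeOf[id]; }
    int nodeByGrid(int i, int j, int k) const {
        return nodeOf[patchId(i, j, k)];
    }
};
```

Because the grid index is computed arithmetically, no search is needed for either form of access.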
PatchDistrib.[Ch]
PatchDistrib ( void )
~PatchDistrib ( void )
create_initial_distrib()
, which is passed a PDB
object
containing the initial position of all the atoms. This function obtains
the maximum and minimum corners of the molecule. It then determines
how many patches it will take to cover these dimensions. Then, a layer
of empty patches is added around all edges of the system to provide the
buffer zone of empty patches that will be used. Patch structures are
then created for each of these patches. Coordinates, grid positions,
and patch numbers are then assigned to each patch. The number of
neighbors, the patch IDs of the neighbors, and which neighbors to send to
and receive from are also determined. The patches to send to and receive
from are currently determined in a very simplistic way: the grid
coordinates are compared in order from x to y to z.
If the first coordinate that is not equal is found to be greater, then
the patch is sent to. Otherwise, it is received from. Next, these
patches are mapped to processors. A recursive coordinate bisection
algorithm is used to map patches to processors. The
create_initial_distrib()
function invokes the RecBisection
object to accomplish this. If the recursive bisection algorithm fails
for some reason, then a VERY simplistic scheme is used where an equal
number of patches is assigned to each processor in sequential patch ID
order (the function simple_strip_division()
does this).
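The send/receive rule and the fallback strip division described above can be sketched as follows. This is an illustration under stated assumptions, not the code in PatchDistrib.C: the function names are hypothetical, and "greater" is assumed to refer to the neighbor's grid coordinate relative to this patch's.

```cpp
// Returns true if this patch sends all of its atoms to the neighbor.
// Assumption: "greater" refers to the neighbor's grid coordinate.
bool sendsTo(const int self[3], const int neighbor[3]) {
    for (int d = 0; d < 3; ++d) {          // compare x, then y, then z
        if (neighbor[d] != self[d])
            return neighbor[d] > self[d];  // first unequal coordinate decides
    }
    return false;                          // identical coordinates: same patch
}

// Fallback mapping: equal-sized blocks of consecutive patch IDs per node.
int stripNode(int patchId, int numPatches, int numNodes) {
    int perNode = (numPatches + numNodes - 1) / numNodes;  // ceiling division
    return patchId / perNode;
}
```

With this rule exactly one patch of every neighboring pair sends and the other receives. With ceiling division, the last node may receive fewer patches than the others when the counts do not divide evenly.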
Now this distribution can be sent from the master processor to the
client processors. Once it is received on each processor, the
PatchList
object can use this information to determine how many
patches it is responsible for and it can allocate the required
Patch
objects.
Finally, since the PDB
information was already parsed to
determine the initial distribution, this information can now be used to
send the initial coordinates and velocities to each Patch
object.
This isn't necessarily the most logical place for this to be done, but
it turns out to be quite practical.
void create_initial_distrib ( PDB * )
void send_Distrib ( Communicate * )
void receive_Distrib ( Message * )
void send_initial_positions ( Communicate *, PDB * )
PDB
object.
void send_initial_velocities ( Communicate *, Vector * )
int patch_node ( int pnum )
Patch
assigned patch pnum
to.
int get_patch_id ( int i, int j, int k )
IntList *patches_for_node ( int )
IntList
object that contains the
patch numbers assigned to that node. This is intended to be used by a
given Node
object to determine what patches belong to it.
int num_recv_for_patch ( int pnum )
int num_send_for_patch ( int pnum )
int get_maxNeighborPatches ( void )
int get_maxPatchNum ( void )
int get_recv_patches ( int pnum, int *plist )
pnum
will receive all atoms
from is returned in the array plist
int get_send_patches ( int pnum, int *plist )
pnum
will send all atoms
to is returned in the array plist
int simple_strip_division ( int )
int numPatches
Patch
objects
(see section Patch) which reside on each processor node. The
PatchList
contains routines to add, move, or delete patches, and
contains the logic to complete an individual time step. The
Patch
objects (see section Patch) then contain particular atoms
and have the knowledge of how to calculate forces and integrate the
equations of motion.
PatchList.[Ch]
PatchList ( void )
~PatchList ( void )
TSType
: BEGIN_CYCLE
, MID_CYCLE
PatchList
object exists on each processor node; it
holds all the individual patches and loops through them to do a time
step calculation. It is created during the simulation initialization
phase, before any time steps are calculated, and deleted when the
simulation is complete. When constructed, it contains no patches; these
are added by the PatchConfig
object, which calls specific
routines within PatchList
to add patches that should be on the
node.
A node-level algorithm is responsible for determining when each time step
should be calculated; when a time step is to be done, this algorithm calls
do_timestep(int tsType)
in the PatchList
object. The argument
tsType
is a flag to indicate the type of time step this is to be.
In the current design, there are two types:
BEGIN_CYCLE
: The very first time step in a cycle of k steps. For
this time step, data may have to be regenerated for the PatchList
and each patch due to movement of atoms, and redistribution/creation/deletion
of patches.
MID_CYCLE
: All other time steps in a cycle other than the first; for
these steps, the status of atoms and patches is taken to be static, and local
data structures do not have to be generated.
do_timestep()
is called, the PatchList
allows each
Patch
to calculate the forces on its atoms, and to integrate the
equations of motion. The algorithm used to do this is:
BEGIN_CYCLE
, regenerate PatchList
data structures due to
changes in atom and Patch
distributions.
patchesDone = 0
.
patch.send_msgs(tsType)
patchesDone < "patches in PatchList"
:
patch.process_msg(msg)
patch.process_msg(msg)
returned TRUE
, increment
patchesDone
.
patch.report_data(tsType)
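The control flow of the steps above can be sketched with mock objects. Everything below is hypothetical scaffolding: MockPatch stands in for Patch, and the inbox queue of destination patch indices stands in for the Communicate object. Only the order of operations follows the text.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

enum TSType { BEGIN_CYCLE, MID_CYCLE };

// Mock patch: it is "done" once it has processed every message it expects.
struct MockPatch {
    int expected;   // messages still outstanding
    bool reported;
    explicit MockPatch(int n) : expected(n), reported(false) {}
    void send_msgs(int /*tsType*/) {}
    bool process_msg() { return --expected == 0; }  // TRUE when finished
    void report_data(int /*tsType*/) { reported = true; }
};

// 'inbox' holds the destination patch index of each pending message.
// Returns the number of patches that completed the time step.
int do_timestep_sketch(std::vector<MockPatch>& patches,
                       std::queue<int>& inbox, int tsType) {
    if (tsType == BEGIN_CYCLE) { /* regenerate data structures here */ }
    std::size_t patchesDone = 0;
    for (auto& p : patches) p.send_msgs(tsType);
    while (patchesDone < patches.size() && !inbox.empty()) {
        int dest = inbox.front();
        inbox.pop();
        if (patches[dest].process_msg()) ++patchesDone;
    }
    for (auto& p : patches) p.report_data(tsType);
    return static_cast<int>(patchesDone);
}
```

The loop is message-driven: a patch makes progress whenever one of its messages arrives, so patches on the same node can finish in any order.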
Communicate
object
must then return messages of a given tag with the following priority:
send_to_patch(Message *, patch, id)
. The argument id
is a
code used by the patches to distinguish the contents of the Message; it
is not the tag. A specific tag is used for all messages that are sent between
Patch
objects; this tag is defined in common.h
.
void startup ( IntList * )
IntList
that contains the patch
ID's that will be owned by this object, creates all the necessary data
structures, and initializes them.
void receive_initial_positions ( void )
Patch
.
void receive_initial_velocities ( void )
Patch
.
void reconfig ( void )
do_timestep(BEGIN_CYCLE)
is
called. It regenerates the internal data structures when atoms and/or
patches are redistributed.
void send_to_patch ( Message *msg, int patch, int id )
Message
to the given patch
, encoding
within the Message
which patch
the Message
is for,
the id
, and the current time step. The Message
sent has
the following format:
Patch
number.
Patch
number.
void do_timestep ( int tsType )
void add_patch ( Patch * )
Patch
object to the current set of patches being
managed by the PatchList
. This function is called by a
higher-level algorithm, such as by PatchConfig
.
void del_patch ( int patchnum )
Patch
object from the current set of patches
being managed by the PatchList
. This function is called by a
higher-level algorithm, such as by PatchConfig
.
void move_patch ( int patchnum, int destnode )
destnode
with all data required to create a
patch identical to Patch patchnum
, and then deletes Patch
patchnum
from the list of local patches. The remote node is then
responsible for receiving the message, creating a new patch, and adding
it to the PatchList
on that node.
X-PLOR
type
PDB file. In the most general sense, it should organize the various
data types present in the file. However, that generality is not needed.
Thus, all this class does is read the ATOM and HETATM records. The
generality (and complexity) remains in the code because it was taken
from a project to read in and manipulate all PDB records.
The atom information is accessed as an array. There are several ways to
search the information in the PDB object. All these search
functions are named find_atom_*, where the "*" can be one of many
possible search criteria. The result of a search is an IntList
pointer which contains the indices to all the atoms found. Memory is
allocated for this list, so it must be deleted after it is no longer
needed.
PDB.[Ch]
, IntList.h
PDB ( const char *pdbfilename )
pdb.pdb.bnl:/pub/format.desc.ps
. There are
two differences between this standard format and the X-PLOR variant,
which are explained in PDBAtom
(see section PDBAtom, PDBAtomRecord, PDBHetatm). The
constructor reads the file and stores the ATOM and HETATM records into
one linked list. After all the data is read, the linked list is
converted into an array, in order to increase access speed.
The following record is a special type recognized by fmd:
KEYWRD....BASE64
where the "...." are replaced by spaces. If the constructor encounters
this record, it sets a flag indicating that all atom and residue sequence
numbers should be written in base-64 notation using the character set
"0-9A-Za-z@#". This allows a PDB file to contain over 1 billion atoms
and 16 million residues, but is clearly highly non-standard and specific
to fmd alone. Such a notation is also used by the PSF file reader.
~PDB ( void )
PDB
class is best described by this snippet
of code:
    PDB pdbinput ( "pti.pdb" );
    if ( pdbinput.num_atoms() == 0 )
        FMD_die ( "There were no atom records in PDB file." );
    IntList *search = pdbinput.find_atom_name ( "CA" );
    cout << "There are " << search->num() << " alpha carbons.\n";
    // Assumes IntList provides operator[]; index 4 is the fifth match.
    cout << "The fifth is atom number "
         << pdbinput.atom ( (*search)[4] )->serialnumber() << ".\n";
    delete search;
PDBAtomList
PDBAtom *data
PDBAtomList *next
NUL
terminated. This is used internally by the class
to maintain the list of all atoms read from the PDB file before it is
converted into an array.
int num_atoms ( void )
IntList *find_atom_serialnumber ( int serialnumber )
IntList *find_atom_name ( const char *name )
IntList *find_atom_alternatelocation ( const char *alternatelocation )
IntList *find_atom_residuename ( const char *residuename )
IntList *find_atom_chain ( const char *chain )
IntList *find_atom_residueseq ( int *residueseq )
IntList *find_atom_insertioncode ( const char *insertioncode )
IntList *find_atom_segmentname ( const char *segmentname )
IntList *find_atom ( const char *name=NULL,
const char *residue=NULL, int residueseq=-1,
const char *segment=NULL )
NULL
, it will not be
used in the search.
PDBAtom *atom ( int place )
place
. The place numbers
are 0 based, hence atom(1)
refers to the 2nd atom present.
PDBAtomList *atoms ( void )
IntList *find_atoms_in_region ( Real x1, Real y1, Real z1,
Real x2, Real y2, Real z2 )
void find_extremes ( Vector *, Vector * )
PDBData
(see section PDBData, PDBUnknown)
PDBData.[Ch]
Start
STYPE=1 SSERIAL=7 SNAME=13 SALT=17 SCHAIN=22 SRESSEQ=23 SINSERT=27 SSEGNAME=73 SOCC=55 STEMPF=61 SFOOT=68 SRESNAME=18 SX=31 SY=39 SZ=47
Length
Start
, specifies the fixed format for the PDB file.
The Enumerators include:
LTYPE=6 LSERIAL=5 LNAME=4 LALT=1 LRESNAME=4 LCHAIN=1 LRESSEQ=4 LINSERT=1 LOCC=6 LOCCPREC=2 LTEMPF=6 LTEMPFPREC=2 LSEGNAME=4 LCOORPREC=3 LCOOR=8 LFOOT=3
PDBPossibleAtoms
PDBAtom ( char *dataline, PDBPossibleAtoms whichatom )
PDBAtomRecord ( char *dataline )
PDBAtom
,
to make an ATOM record.
PDBHetatm ( char *dataline )
PDBAtom
,
to make a HETATM record.
~PDBAtom ( void )
~PDBAtomRecord ( void )
~PDBHetatm ( void )
void parse ( const char *s )
sprint ( char *s, PDBFormatStyle usestyle = COLUMNS )
int serialnumber ( void )
void serialnumber ( int newserialnumber )
const char *name ( void )
void name ( const char *newname )
const char *alternatelocation ( void )
void alternatelocation ( const char *newalternatelocation )
const char *residuename ( void )
void residuename ( const char *newresiduename )
const char *chain ( void )
void chain ( const char *newchain )
int residueseq ( void )
void residueseq ( int newresidueseq )
const char *insertioncode ( void )
void insertioncode ( const char *newinsertioncode )
Real xcoor ( void )
Real ycoor ( void )
Real zcoor ( void )
void xcoor ( Real newxcoor )
void ycoor ( Real newycoor )
void zcoor ( Real newzcoor )
const Real *coordinates ( void )
void coordinates ( const Real *newcoordinates )
Real occupancy ( void )
void occupancy ( Real newoccupancy )
Real temperaturefactor ( void )
void temperaturefactor ( Real newtemperaturefactor )
int footnote ( void )
void footnote ( int newfootnote )
const char *segmentname ( void )
void segmentname ( const char *newsegmentname )
PDBUnknown
, which is
constructed with the string containing the data record. All it does
is copy and save the string so that it may be printed later.
PDBData *new_PDBData ( const char *data )
(char *)
containing the line from the PDB file. Since the parent class cannot
create the appropriate child (in C++), there must be a helper function
which knows which child to create. For the classes derived from
PDBData
, the appropriate helper function is new_PDBData
.
It knows how to look at the string to figure out which child to create.
It then calls the constructor for that class and returns the pointer.
PDBData.[Ch]
PDBType
HEADER OBSLTE COMPDN SOURCE EXPDTA SPRSDE JRNL REMARK SEQRES FTNOTE HELIX SHEET TURN SSBOND SITE SCALE MTRIX TVECT MODEL ATOM ANISOU SIGUIJ TER ENDMDL CONECT UNKNOWN REVDAT FORMUL ORIGX SIGATM AUTHOR HET CRYST1 HETATM MASTER END
PDBFormatStyle
PDBData ( PDBType newtype )
- where type is the PDB record type.
~PDBData ( void )
PDBType type ( void )
static void scan ( const char *data, int len, int start, int size,
int *ans, int default ) or
Real *ans, Real default ) or
char *ans )
len
, start at position start
and read the next size
characters. If reading an Int or Real,
place the result in ans
, using the default
value if the
field is blank. Otherwise, just return the string of length size
in ans
.
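A fixed-column read of the kind scan() performs might look like the sketch below, shown for the integer case only. It assumes 1-based column numbers, which is consistent with the Start and Length enumerators (for example SSERIAL=7 and LSERIAL=5 for the serial number field). The function name and the handling of short lines are illustrative assumptions.

```cpp
#include <cstdlib>
#include <string>

// Reads 'size' characters beginning at 1-based column 'start' of 'data'
// (which holds 'len' characters) and converts them to an int.
// Stores 'dflt' if the field is blank or lies beyond the end of the line.
void scan_int_sketch(const char* data, int len, int start, int size,
                     int* ans, int dflt) {
    *ans = dflt;
    if (start < 1 || start > len) return;
    int avail = len - start + 1;
    if (size > avail) size = avail;        // clip to the end of the line
    std::string field(data + start - 1, size);
    if (field.find_first_not_of(' ') == std::string::npos) return;  // blank
    *ans = std::atoi(field.c_str());       // atoi skips leading spaces
}
```

For an ATOM record, a call with start 7 and size 5 would extract the serial number field.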
static void field ( const char *data, int fld, char *result )
fld
in the data string and return the
information in result
.
static void sprintcol ( char *s, int start, int len,
const char *val ) or
int val ) or
int prec, Real val )
len
characters from string val
into string
s
starting at position start
. If val
is Real, use
only prec
digits of precision.
virtual void sprint ( char *s, PDBFormatStyle usestyle = COLUMNS )
s
, using
the given format style.
PatchDistrib
object (see section PatchDistrib) and
has direct access to its private data structure which holds the map of
patches. The partitioning algorithm has three steps:
compute_patch_load()
function determines the load of each patch.
PatchDistrib.[Ch]
directions: ( XDIR=0, YDIR, ZDIR )
RecBisection ( void )
~RecBisection ( void )
partition
.
This function is called by the PatchDistrib
object after the
computational domain has been divided into patches. Then,
RecBisection
accesses the private data of PatchDistrib
to
gather the patch information, performs partitioning, and assigns
partitions to processors.
int partition ( void )
void compute_patch_load ( void )
void rec_divide ( int n, const Partition& P )
void assignNodes ( void )
int prev_better ( float, float, float )
TRUE
if the previous bisection point is better than the
current one.
void refine_edges ( void )
void refine_boundaries ( void )
void refine_surface ( void )
SimParameters.h
for a list.
Nearly all correspond closely to the similarly-named
configuration file parameters described in section 4.3.
There are also a number of inlined inquiry functions,
which should be called instead of having many copies of the code
which decodes certain sets of parameters.
Currently, there are only inquiry functions to indicate when
per-atom data vectors should be sent to be written out,
but inquiry functions should probably be used in more places
in fmd.
SimParameters.[Ch]
SimParameters ( void )
~SimParameters ( void )
void initialize_config_data ( ConfigList *, char *&cwd )
void send_SimParameters ( Communicate * )
void receive_SimParameters ( Message * )
is_
return TRUE
if
that particular type of file is going to be written to at timestep
step
; if step
is negative or absent, they return
TRUE
if that type of file will be written to at all during the
run.
The last two such functions are composites,
e.g., is_vel_out_step()
is just
is_velTrj_step() || is_restart_step() || is_last_step()
.
The functions whose names begin with num_
return the number of
data sets of that type that will be written during the run, and are used
to provide data for trajectory file headers.
Bool is_coorTrj_step (int step=-1)
Bool is_velTrj_step (int step=-1)
Bool is_electForceTrj_step (int step=-1)
Bool is_allForceTrj_step (int step=-1)
Bool is_restart_step (int step=-1)
Bool is_snapshot_step (int step=-1)
Bool is_last_step (int step=-1)
Bool is_coor_out_step (int step=-1)
Bool is_vel_out_step (int step=-1)
int num_coorTrj_steps ()
int num_velTrj_steps ()
int num_electForceTrj_steps ()
int num_allForceTrj_steps ()
int num_snapshot_steps ()
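The behavior of the is_ and num_ functions can be illustrated with a small sketch. The member names (N, velTrjFreq, restartFreq) are hypothetical stand-ins for whatever SimParameters.h actually declares, and whether step 0 counts as an output step is an assumption here.

```cpp
// Hypothetical parameter names; the real members are in SimParameters.h.
struct SimParamsSketch {
    int N;            // total number of time steps in the run
    int velTrjFreq;   // steps between velocity trajectory writes (0 = never)
    int restartFreq;  // steps between restart file writes (0 = never)

    static bool every(int freq, int step) {
        if (freq <= 0) return false;          // this output is disabled
        return step < 0 || step % freq == 0;  // negative step: "ever written?"
    }
    bool is_velTrj_step(int step = -1) const { return every(velTrjFreq, step); }
    bool is_restart_step(int step = -1) const { return every(restartFreq, step); }
    bool is_last_step(int step = -1) const { return step < 0 || step == N; }

    // Composite test, mirroring the example given in the text.
    bool is_vel_out_step(int step = -1) const {
        return is_velTrj_step(step) || is_restart_step(step) ||
               is_last_step(step);
    }
    // Assumes step 0 is not written, so N / freq data sets are produced.
    int num_velTrj_steps() const {
        return velTrjFreq > 0 ? N / velTrjFreq : 0;
    }
};
```

Keeping the decoding in one place means callers never repeat the frequency arithmetic, which is the motivation the text gives for using inquiry functions.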
SphericalBCForce.[Ch]
SphericalBCForce ( Vector *origin, BigReal size )
This is the constructor for the SphericalBCForce force object. It
is responsible for getting all the parameters from the
SimParameters object and then determining if this object needs to
perform any computation. It only needs to do so if there is some
portion of the patch that lies outside of the spherical
boundaries.
~SphericalBCForce ( void )
The destructor for the SphericalBCForce force object currently does
ABSOLUTELY NOTHING!!
void initialize_timestep ( void )
BigReal get_energy ( void )
force ( int numAtoms, Vector *x, Vector *forces )
Timer
class is used for timing various aspects of fmd. It
is modeled after the CMMD_node_timers
implemented on the Thinking
Machines Incorporated CM5. Each Timer
can be started and stopped
multiple times, accumulating the total time from each cycle. A Timer
can also be cleared to set the accumulated time to zero. Each Timer
tracks clock time, user cpu time, system cpu time, and total cpu time.
Timer.[Ch]
Timer ( void )
~Timer ( void )
start()
, stop()
, and clear()
are used
to start, stop and clear a Timer
. The functions
clock_time()
, cpu_time()
, user_time()
, and
system_time()
are used to report the current values of a
Timer
.
For example, to time both the total time to complete a loop and each
iteration of the loop, the following code could be used:
    Timer total_time;
    Timer iter_time;

    total_time.start();
    while ( . . . )
    {
        iter_time.start();
        . . .
        iter_time.stop();
        printf ( "ITERATION TOOK %f sec\n", iter_time.clock_time() );
        iter_time.clear();
    }
    total_time.stop();
    printf ( "TOTAL TIME = %f secs\n", total_time.clock_time() );
void start ( void )
void stop ( void )
void clear ( void )
float clock_time ( void )
float cpu_time ( void )
user_time()
+ system_time()
.
float user_time ( void )
float system_time ( void )
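An accumulating timer with the same start/stop/clear behavior can be sketched with the standard library. This is not the Timer class itself: it uses std::chrono and std::clock, and it reports only wall-clock and total CPU time. Splitting CPU time into user and system components would need a platform call such as getrusage, which is left out here.

```cpp
#include <chrono>
#include <ctime>

class TimerSketch {
    typedef std::chrono::steady_clock Clock;
    Clock::time_point wallStart;
    std::clock_t cpuStart;
    double wallTotal, cpuTotal;  // accumulated seconds
    bool running;
public:
    TimerSketch() : cpuStart(0), wallTotal(0), cpuTotal(0), running(false) {}
    void start() {
        wallStart = Clock::now();
        cpuStart = std::clock();
        running = true;
    }
    void stop() {                // adds the elapsed interval to the totals
        if (!running) return;
        wallTotal +=
            std::chrono::duration<double>(Clock::now() - wallStart).count();
        cpuTotal += double(std::clock() - cpuStart) / CLOCKS_PER_SEC;
        running = false;
    }
    void clear() { wallTotal = cpuTotal = 0.0; }
    float clock_time() const { return float(wallTotal); }
    float cpu_time() const { return float(cpuTotal); }
};
```

Because stop() adds to the running totals rather than overwriting them, repeated start/stop pairs accumulate until clear() is called.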
UnitCell.[Ch]
UnitCell ( void )
~UnitCell ( void )
setup()
, which is passed the
origin and the three lattice vectors (There must always be three lattice
vectors even if the simulation uses mixed boundary conditions.
Isolated, meaning non-periodic, boundaries will have margin patches
added as necessary). setup
calculates the reciprocal lattice.
The UnitCell can then be used to calculate the overlap which is needed for
BoundaryMap and NeighborList as well as transform atomic coordinates
across the boundary to implement periodic boundary conditions.
int setup ( Vector O, Vector X, Vector Y, Vector Z );
void numberOfCellsForRadius ( BigReal radius, const int xdim,
c2i ( Vector in, Vector &out )
i2c ( Vector in, Vector &out )
offset ( Vector &v )
removeOffset ( Vector &v )
warp ( Vector &v )
BigReal
values; x, y, and z.
There are two ways to use Vectors. The most natural is through operator
overloading, which allows us to say things like:
However, this means that multiple copies may occur needlessly, so
we provide an alternate means of doing these functions. In essence,
these are "two operand functions". So for example, v1.add(v2)
will add v2 to v1, storing the result in v1.
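The difference between the two styles can be seen in a minimal sketch. VecSketch is a hypothetical stand-in for Vector, and the in-place behavior of add() is an assumption based on the description of "two operand functions".

```cpp
struct VecSketch {
    double x, y, z;
    VecSketch(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}

    // Operator form: builds and returns a new temporary object.
    friend VecSketch operator+(const VecSketch& a, const VecSketch& b) {
        return VecSketch(a.x + b.x, a.y + b.y, a.z + b.z);
    }
    // Two-operand form: updates this vector in place, no temporary.
    void add(const VecSketch& v) { x += v.x; y += v.y; z += v.z; }
};
```

The operator form reads naturally in expressions, while the two-operand form avoids constructing intermediate objects in inner loops.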
Vector.h, common.h
Vector ( void )
Vector ( const Vector &v2 )
Vector ( BigReal x, BigReal y, BigReal z )
~Vector ( void )
    Vector v1(1.1,2.2,3.3);     // Vector with explicit elements
    Vector v2(-1,55,32.1);
    Vector v3(v1+2*v2);
    Vector v4;
    cout << "v1.x = " << v1.x << "\n";
    cout << v1 << " " << v2 << " " << v3 << "\n";
    v4 = v3*5 - v2/4;
    cout << cross(v3,v4) << "\n";
    v3 = (v1 += v2)/-4.25;
    v1.sub(v2);
    v4.mult(3.14);
    cout << v3.length() << " " << v2.dot(v1) << "\n";
    v2 = v4.unit();             // Set v2 to unit vector along v4.
friend int operator == ( const Vector &v1, const Vector &v2 )
TRUE
if they are the same.
friend int operator != ( const Vector &v1, const Vector &v2 )
TRUE
if they are NOT the same.
friend Vector operator + ( const Vector &v1, const Vector &v2 )
friend Vector operator - ( const Vector &v1, const Vector &v2 )
friend Real operator * ( const Vector &v1, const Vector &v2 )
Vector operator * ( const Vector &v1, const Real &f )
Vector operator * ( const Real &f, const Vector &v1 )
Vector operator / ( const Vector &v1, const Real &f )
friend Vector cross ( const Vector &v1, const Vector &v2 )
friend ostream& operator << ( ostream& strm, const Vector &v1 )
Real &operator[](int i)
Vector& operator+=(const Vector &v)
Vector& operator-=(const Vector &v)
void add(const Vector &v)
add_const(BigReal f)
void sub(const Vector &v)
Real length ( void )
Vector unit ( void )
void cross ( const Vector &v )
void mult ( BigReal f )
void div ( BigReal f )
BigReal dot ( const Vector &v )
NameTable
, the ability to
recognize "wild card" names.
These are names ending in "*", which match any name which can be formed
from the wild card by replacing the "*" with 0 or more appropriately
chosen characters (allowing for the reduced character set supported by
NameTable
.)
The user can enter names, some with "*", some without.
The user of the class can later ask for all wildcards that match a given
(non-wild card) name.
This class is derived from NameTable
, and primarily adds
functionality to its base class.
Only the added functions are described here.
NameTable.[Ch]
WildCard ( int initial_size = 64 )
~WildCard ( void )
NameTable
class can
be used with a WildCard
as well.
long enter_wildcard( const char *name );
long *match( const char *name ); long *match( long index )
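The matching rule (a trailing "*" stands for zero or more characters) amounts to a prefix test. The sketch below uses a plain linear scan over std::string values. It does not reflect how WildCard or NameTable store names, and both function names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True if 'wild' (which must end in '*') matches 'name'.
bool wildMatches(const std::string& wild, const std::string& name) {
    if (wild.empty() || wild[wild.size() - 1] != '*') return false;
    std::string prefix = wild.substr(0, wild.size() - 1);
    // The "*" may stand for zero or more characters: a prefix test.
    return name.compare(0, prefix.size(), prefix) == 0;
}

// Returns the indices of every wildcard in 'wilds' that matches 'name'.
std::vector<long> matchAll(const std::vector<std::string>& wilds,
                           const std::string& name) {
    std::vector<long> hits;
    for (std::size_t i = 0; i < wilds.size(); ++i)
        if (wildMatches(wilds[i], name))
            hits.push_back(static_cast<long>(i));
    return hits;
}
```

A single name can match several wildcards, which is presumably why match() returns a list of indices rather than one value.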
This chapter contains descriptions for some of the support functions provided as non-class objects. These include base-64 conversion routines, DCD file support, HDF file support, and string manipulation routines.
ltob64
and
b64tol
. Prototypes for the functions are provided by
base64.h, while the functions and the character set are defined in
base64.c. The character set is defined in a static variable in
base64.c as
static char base64_digit[] = "0123456789" "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "abcdefghijklmnopqrstuvwxyz@#";
base64.[ch]
long b64tol ( char *s )
*s
as a base-64 number and returns
its value as a long. Leading spaces are ignored. Negative signs are allowed,
but are not used by fmd's application.
char *ltob64 ( long num, char *result, int n )
num
and converts it to a base-64
string in *result
. n
is the size of *result
and is
checked to make sure the conversion result will fit. It should be at least
two greater than the number of base-64 digits expected, to allow for a
possible "-" sign, and the trailing null.
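A minimal sketch of the two conversions, using the character set shown above, follows. The function names carry a _sketch suffix because this is not the code in base64.c. Error handling (stopping at the first invalid character, returning a null pointer when the result does not fit) is an assumption. With this digit set a 5-column field can hold values up to 64^5 - 1 = 1,073,741,823 and a 4-column field up to 64^4 - 1 = 16,777,215.

```cpp
#include <cstring>

static const char base64_digit[] =
    "0123456789" "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "abcdefghijklmnopqrstuvwxyz@#";

// Interprets s as a base-64 number; leading spaces are ignored.
long b64tol_sketch(const char* s) {
    while (*s == ' ') ++s;
    bool neg = (*s == '-');
    if (neg) ++s;
    long value = 0;
    for (; *s != '\0'; ++s) {
        const char* p = std::strchr(base64_digit, *s);
        if (p == 0) break;                 // stop at the first non-digit
        value = value * 64 + (p - base64_digit);
    }
    return neg ? -value : value;
}

// Writes num in base 64 into result, whose capacity n includes the
// trailing null. Returns result, or a null pointer if it does not fit.
char* ltob64_sketch(long num, char* result, int n) {
    char tmp[32];                          // digits, least significant first
    int len = 0;
    bool neg = num < 0;
    unsigned long mag = neg ? 0UL - static_cast<unsigned long>(num)
                            : static_cast<unsigned long>(num);
    do {
        tmp[len++] = base64_digit[mag % 64];
        mag /= 64;
    } while (mag != 0);
    if (neg) tmp[len++] = '-';
    if (len + 1 > n) return 0;             // no room for text plus null
    for (int i = 0; i < len; ++i) result[i] = tmp[len - 1 - i];
    result[len] = '\0';
    return result;
}
```

The capacity check is why n should exceed the expected digit count by two: one byte for a possible sign and one for the null.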
    // DEFINE ERROR CODES THAT MAY BE RETURNED BY DCD ROUTINES
    #define DCD_DNE         -2  // DCD file does not exist
    #define DCD_OPENFAILED  -3  // Open of DCD file failed
    #define DCD_BADREAD     -4  // read call on DCD file failed
    #define DCD_BADEOF      -5  // premature EOF in DCD file
    #define DCD_BADFORMAT   -6  // format of DCD file is wrong
    #define DCD_FILEEXISTS  -7  // output file already exists
    #define DCD_BADMALLOC   -8  // malloc failed

    // FUNCTION ALLUSIONS
    typedef int DCDFileDesc;    // File handle type
dcdlib.[ch]
DCDFileDesc open_dcd_read ( const char *filename )
*filename
containing the name of the DCD file
to be read, this function opens the file and returns a file handle to be
used by the other DCD functions when referencing the file. It returns
DCD_DNE
if the file does not exist, DCD_OPENFAILED
if some
other failure occurred, or a valid DCD file handle if successful.
int read_dcdheader ( DCDFileDesc fd, int *N, int *NSET,
int *ISTART, int *NSAVC, double *DELTA, int *NAMNF,
int **FREEINDEXES )
fd
, it returns data from the DCD file
header.
N
NSET
ISTART
NSAVC
DELTA
NAMNF
FREEINDEXES
int read_dcdstep ( DCDFileDesc fd, int N, float *X, float *Y,
float *Z, int num_fixed, int first, int *indexes )
fd
. N
is the desired time step number, and coordinates
are returned in *X, *Y, and *Z
, respectively. The argument first must be
set to 1 the first time the function is called. *indexes
are the
free atom indices, if any. The function returns 0 upon success, or a
DCDlib error code on failure.
DCDFileDesc open_dcd_write ( const char *name )
int write_dcdheader ( DCDFileDesc fd, const char *filename,
int N, int NSET, int ISTART, int NSAVC, double DELTA,
const char *title, const char *title2 )
fd
*filename
N
NSET
ISTART
NSAVC
DELTA
int write_dcdstep ( DCDFileDesc fd, int N, float *X, float *Y,
float *Z )
fd
for time step
N
. *X, *Y,
and *Z
are arrays containing the
respective coordinates.
int close_dcd_read ( DCDFileDesc fd, int num_fixed,
int *indexes )
int close_dcd_write ( DCDFileDesc fd )
void pad ( char *s, int len )
*s
with spaces to reach the length given
by len
. If the string is longer than len
, it is truncated
to len
.
HDF2_FORMAT "Version 2"
HDF2_MAX_ALEN 128
HDF2_MAX_RANK 2
HDF2_MAX_FILES 64
HDF2_DATASETS 3
HDF2_READ DFACC_READ
HDF2_WRITE DFACC_CREATE
HDF2_MAX_ATTRS 11
HDF2_<name>
, where <name>
is replaced
with the desired attribute. The attributes are:
0 : title
1 : title2
2 : program
3 : revision_id
4 : revision_date
5 : creator
6 : date
7 : type
8 : class
9 : format
10 : order
typedef char attrs_t[HDF2_MAX_ATTRS][HDF2_MAX_ALEN];
hdflib.[ch]
int hdf2_open ( int mode, char *filename, char *dataclass,
attrs_t attrs, int *rank, int ddims[], int *nsets )
FAIL
if something bad happens.
mode
*filename
*dataclass
attrs
*rank
ddims
rank
specifying the size of each rank dimension.
nsets
int hdf2_close ( int fid )
SUCCESS
or FAIL
, as appropriate.
int32 hdf2_dimids ( int fid, char *nm1, char *nm2 )
SUCCESS
or FAIL
, as appropriate
int32 hdf2_getdimids ( int fid, char *nm1, char *nm2 )
SUCCESS
or
FAIL
as appropriate.
int32 hdf2_init_read ( int32 sd_fid, char *class, attrs_t attrs,
int32 *rank, int32 ddims[], int32 *nsets )
attrs
, the rank and
dimensions of the main data set are returned in *rank
and ddims
,
respectively, and the number of data sets. The function returns
SUCCESS
or FAIL
, as appropriate.
int32 hdf2_init_write ( int32 sd_fid, char *class, attrs_t attrs,
int32 rank, int32 ddims[], int32 *nsets )
sd_fid
class
attrs
rank
ddims
nsets
int hdf2_read ( int fileid, int nset, VOIDP data,
float64 *elapsed_time, int *time_step )
SUCCESS
or FAIL
as appropriate.
int hdf2_write ( int fileid, int nset, VOIDP data,
float64 *, int * )
nset
of the HDF file specified by the file
handle. Returns SUCCESS
or FAIL
as appropriate.
int32 hdf2_rw ( int mode, int fid, int nset, VOIDP data,
float64 *elapsed_time, int *time_step )
int32 hdf2_storeattr ( int32 idx, char *str, attrs_t attrs )
strlib.[ch]
void FMD_truncate ( char *s )
s
.
int FMD_read_line ( FILE *fd, char *buf )
buf
. Returns 0 on success or -1 on failure (usually EOF).
int FMD_find_word ( char *s, char *w )
w
is contained in string
s
. Returns 1 if the word is found, 0 otherwise.
int FMD_blank_string ( char *s )
s
and returns 1 if
isspace()
is true for each (i.e., all whitespace), or 0 otherwise.
void FMD_find_first_word ( char *s, char *w )
s
. The first word is
defined to be the first set of continuous non-space characters in a
string. So in the string " AB14^ FDGFD GFDG" the first word would be
"AB14^". The word is returned in the string pointed to by w
.
int FMD_read_int ( FILE *fd, char *msg, Bool base64 )
void FMD_pad ( char *s, int n )
s
out to length n
with spaces. If
s
is already longer than n
characters nothing happens.
The user MUST ensure that the memory allocated for s
is sufficient to
hold n
characters, or memory overwrites may occur.
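Two of the routines described above, FMD_find_first_word and FMD_pad, are simple enough to sketch directly from their descriptions. The versions below are illustrative and are not taken from strlib.c. The _sketch names are hypothetical, and the first-word routine takes a const pointer, unlike the documented prototype.

```cpp
#include <cctype>
#include <cstring>

// Copies the first run of non-space characters in s into w.
// The caller must make w large enough to hold the word and a null.
void find_first_word_sketch(const char* s, char* w) {
    while (*s != '\0' && std::isspace(static_cast<unsigned char>(*s))) ++s;
    while (*s != '\0' && !std::isspace(static_cast<unsigned char>(*s)))
        *w++ = *s++;
    *w = '\0';
}

// Pads s with spaces out to length n. The buffer must hold n + 1 bytes.
void pad_sketch(char* s, int n) {
    int len = static_cast<int>(std::strlen(s));
    if (len >= n) return;            // already long enough: leave unchanged
    std::memset(s + len, ' ', n - len);
    s[n] = '\0';
}
```

Neither routine can check the size of the destination buffer, which is why the memory warning above matters.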
void FMD_remove_comment ( char *s )
s
is just truncated at the point where
the mark is found.
strcasecmp ( const char s[], const char t[] )
strncasecmp ( const char s[], const char t[], int n )
The fmd program can be validated by running known models and comparing results against previously accepted results. There is a subdirectory, Validation, in the FMD source tree which helps accomplish this. This directory provides fmd input files for a variety of sequential and parallel runs, sets of known, or canonical, output files used for comparison purposes, and an execution script, runjobs, which allows execution of one or more of the models. The various models are identified with the nomenclature vr_nn, where the vr means "validation run", and nn is the run number. As seen below, the validation system is tightly coupled to these run names.
The basic structure of the Validation subdirectory is as follows:
    ./FMD
    `-----Validation
          |     `-----Canonical
          |     |     `-----P
          |     |     |     |-----`vr_00
          |     |     |     |-----`...
          |     |     |     |-----`vr_nn
          |     |     `-----S
          |     |     |     |-----`vr_00
          |     |     |     |-----`...
          |     |     |     |-----`vr_nn
          |     `-----Data
          |     `-----Jobs
          |     |     `-----P
          |     |     `-----S
          |     `-----Runjobs
          |     |     `-----LAM
The Data subdirectory contains the PDB and PSF files required for each model used. The types of runs are divided into parallel and sequential (single processor), hence the P(arallel) and S(equential) subdirectories under Canonical and Jobs.
A validation run is executed via the runjobs script in the Validation subdirectory.
Usage: runjobs [-h | [-p range,range | no] [-s range,range | no]]
where
    -h .. displays a help message
    -p .. selects specific parallel jobs. Default is all defined.
    -s .. selects specific sequential jobs. Default is all defined.
For -p and -s, range lists are specified as comma delimited lists of job numbers using the form "j-k,m,..". This means jobs numbered "j" through "k" inclusive, job "m", and so on. "no" will deselect all jobs.
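The range-list syntax can be made concrete with a small parser. This is only an illustration of the notation, written in C++ for consistency with the rest of this guide. It is not code from the runjobs script, and the function name is hypothetical.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Expands a list such as "0-2,5" into {0, 1, 2, 5}. "no" selects nothing.
std::vector<int> expandRanges(const std::string& spec) {
    std::vector<int> jobs;
    if (spec == "no") return jobs;
    std::istringstream in(spec);
    std::string item;
    while (std::getline(in, item, ',')) {   // split on commas
        if (item.empty()) continue;
        std::string::size_type dash = item.find('-');
        int lo = std::atoi(item.substr(0, dash).c_str());
        int hi = (dash == std::string::npos)
                     ? lo                   // single job number
                     : std::atoi(item.substr(dash + 1).c_str());
        for (int j = lo; j <= hi; ++j) jobs.push_back(j);
    }
    return jobs;
}
```

Under this reading, "-p 0-2,5 -s no" would select parallel jobs 0, 1, 2 and 5 and no sequential jobs.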
Prior to execution, the environment variable "FMD_PARTEST" must be set to YES or NO. This is simply to force awareness that runjobs must be hand configured to run fmd based on the architecture and MPI environment. When runjobs executes, it generates a log file with the name formed from the current date, "yymmdd.log", and records the jobs run and test results.
The actual process of executing a job involves several steps.
The main reason runjobs currently requires hand configuration is the variety of ways in which MPI environments launch parallel jobs. The standard mpirun command has several syntaxes, and special actions may or may not be needed to secure the required number of processors (lamboot with LAM-MPI, for instance). Sample runjobs scripts, configured for two different LAM-MPI environments, are provided in the Validation/Runjobs/LAM subdirectory.
Note that some jobs depend on output from previous jobs (to test restarts, for instance). Hence not all can be run in isolation. A description of the various tests performed can be found in the "descriptions" file in each of the Jobs subdirectories.
B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan and M. Karplus, "CHARMm: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations.", J. Comput. Chem., 4(2), pages 187-217, 1983.
Axel T. Brünger, "X-PLOR, Version 3.1: A System for X-ray Crystallography and NMR", The Howard Hughes Medical Institute and Department of Molecular Biophysics and Biochemistry, Yale University, 260 Whitney Avenue, P.O. Box 6666, New Haven, CT 06511, 1992.
L. Greengard and V. Rokhlin (a), "A Fast Algorithm for Particle Simulations", J. Comp. Phys., 73, pages 325-348, 1987.
L. Greengard and V. Rokhlin (b), "Rapid evaluation of potential fields in three dimensions." In C. Anderson and Claude Greengard, editors, Vortex Methods, volume 1360 of Lecture Notes in Mathematics, pages 121-141, Berlin, May 1988. Springer Verlag.
A. McKenney and R. Pachter, "Implementation Issues for Fast Multipole Implementations for Molecular Dynamics Simulations", SIAM Annual Meeting, July, 1996.
M.T. Nelson, W.F. Humphrey, A. Gursoy, A. Dalke, L.V. Kalé, R.D. Skeel, K. Schulten and R. Kufrin, "NAMD: A Parallel, Object-Oriented Molecular Dynamics Program", International Journal of Supercomputer Applications and High Performance Computing, 10, #4, pages 251-268, 1996.
M. Nelson, W. Humphrey, A. Gursoy, A. Dalke, L. Kale, R. Skeel, K. Schulten and R. Kufrin, "MDScope: A Visual Computing Environment for Structural Biology", Comp. Phys. Comm., 1995 (in press).
This document was generated on 20 March 2000 using the texi2html translator version 1.52.