LAM Installation Guide
The LAM source directory is packaged as a compressed tape archive.
lam.tar.Z
The release number will also be present in the filename.
Uncompress the archive and extract the LAM sources.
% uncompress lam.tar.Z
% tar xvf lam.tar
A second distribution file with the same release number is a tape
archive containing patches for serious bugs. Extract the patches from
this archive in the same way as the sources. Read the preamble to
each patch, then apply the relevant patches to the specified files
using your favourite editor. Alternatively, apply all the patches with
the UNIX patch(1) utility, run from the source directory.
% tar xvf lam61-patch.tar
% cat lam61-patch[0-9][0-9] | patch -p0
Machine-Dependent Configuration
The machines and UNIX flavours listed below are supported by the current
version of LAM. Other variations may require source code changes.
- Sun, SunOS 5.4
- SGI, IRIX 6.2
- IBM, AIX version 3, release 2
- DEC, DEC UNIX V4.0
- HP, HP-UX 10.01
- Intel X86, LINUX v2.0.24
In the source directory, create the symbolic link Config/config pointing
at one of the machine-dependent configuration files under Config/.
% ln -s config.sun4_sol Config/config
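The target of a symbolic link is resolved relative to the directory that holds the link, so a link named Config/config whose target is config.sun4_sol points at Config/config.sun4_sol. The sketch below checks this in a scratch directory. It assumes a Bourne-type shell and mktemp(1); the scratch layout and the empty stand-in file are illustrative only and are not part of LAM.

```shell
# Scratch copy of the layout. In practice, run only the ln and ls
# lines, from the top of the LAM source directory.
src=$(mktemp -d)
mkdir "$src/Config"
: > "$src/Config/config.sun4_sol"     # empty stand-in for the real file
cd "$src"

ln -s config.sun4_sol Config/config   # target resolves relative to Config/
ls -l Config/config

# -e follows the link, so it fails if the link is dangling.
if [ -e Config/config ]; then
    echo "link resolves"
else
    echo "dangling link" >&2
fi
```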
Customizing LAM
The configuration file contains information for building and installing the
libraries and executables on a particular architecture. Most of its variables
are rarely changed, or are not intended to be changed at all. Only one
variable must be set.
HOME = installation directory
Libraries and executables are built in the source directory (actually in object
subdirectories of the source directory) and installed in the installation
directory. The installation directory, which can be located anywhere, is
configured by the HOME variable.
Other variables which can be set are described below. However, they should
be modified from the default values only if you thoroughly understand their
usage.
Timers
- TO_BOOT: The time after which LAM considers a new daemon started on a
remote node to be dead because it has not made initial contact.
- TO_DLO_ACK: The time after which a previously transmitted network packet
is re-transmitted because an acknowledgement has not been received.
- TO_DLO_IDLE: The time after which a heartbeat message is sent to a quiet
node to verify that it is still alive. This feature is enabled with the
-x option to lamboot(1).
Counters
- TO_DLO_ESTIMATE: After this many requests to the LAM daemon, a pending
timeout period is declared to have expired. This compensates for
deficiencies in select(2).
- DOMAXRESEND: After this many network packet retransmissions, the
destination node is considered dead. This feature is enabled with the
-x option to lamboot(1).
- MPI_GER: The minimum number of message envelopes protected for each
process pair in an MPI application.
Building LAM
The build procedure is a two-step process. First, the Makefiles establish
dependencies on header and source files. Second, the executables and
libraries are compiled, linked and archived as necessary. Both steps are
performed by changing to the top of the source directory and running
the make(1) utility with the default target. We suggest that you capture
standard output and error in case you need to diagnose problems with
the build.
% make >& LOG &
You can monitor progress with the tail(1) utility.
% tail -f LOG
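The ">&" redirection above is C shell syntax. Under a Bourne-type shell, the same capture of standard output and error can be sketched as follows. The build function is a hypothetical stand-in for make(1) so that the sketch runs in an empty scratch directory; in a real build, replace it with make.

```shell
work=$(mktemp -d)      # scratch directory; use the LAM source directory in practice
cd "$work"

# Hypothetical stand-in for make: one line on stdout, one on stderr.
build() {
    echo "compiling"
    echo "example warning" >&2
}

# Send stdout to LOG, then point stderr at the same file, in the background.
build > LOG 2>&1 &
wait                   # the sketch waits for the job; with a real build use tail -f LOG

cat LOG
```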
A convenient feature of LAM is the ability to support multiple installations
for experimentation. To build another installation directory, simply repeat
the building steps with a different value of the HOME variable. (A command
line setting of this variable is not sufficient - you must edit the
configuration file.) Users choose among different installations by setting
their shell's search path accordingly.
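As a rough illustration of choosing an installation through the search path, the sketch below creates two hypothetical installation directories, each with a stand-in script named lamboot that only prints the name of its installation. The directory names lam-stable and lam-test are assumptions made for this example. Whichever bin/ directory comes first in PATH is the one that is used.

```shell
base=$(mktemp -d)

# Two hypothetical installations, each with a stand-in "lamboot".
for inst in lam-stable lam-test; do
    mkdir -p "$base/$inst/bin"
    printf '#!/bin/sh\necho "%s"\n' "$inst" > "$base/$inst/bin/lamboot"
    chmod +x "$base/$inst/bin/lamboot"
done

orig_path=$PATH

# Put the experimental installation first in the search path.
PATH="$base/lam-test/bin:$orig_path"
chosen=$(lamboot)
echo "selected installation: $chosen"
```

Switching back is a matter of putting the other installation's bin/ directory first instead.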
Boot Schema
A boot schema is a description of a multicomputer on which LAM will be run.
You can create boot schema files (see bhost(5) for syntax) for typical
configurations of the local multicomputer(s). Place these files under boot/
in the installation directory. They will be found by LAM tools such as
lamboot(1), recon(1) and wipe(1).
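A minimal sketch of creating a boot schema file follows. It assumes the simplest form of the syntax, one machine name per line with "#" starting a comment; check bhost(5) for the syntax your release actually accepts. The host names, the file name bhost.lab and the scratch directory standing in for the installation directory are all hypothetical.

```shell
inst=$(mktemp -d)            # stand-in for the LAM installation directory
mkdir -p "$inst/boot"

# Hypothetical three-node multicomputer, one machine per line.
cat > "$inst/boot/bhost.lab" <<'EOF'
# lab cluster
node1.example.com
node2.example.com
node3.example.com
EOF

# List the machines, skipping comment lines.
grep -v '^#' "$inst/boot/bhost.lab"
```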
Using LAM
If the LAM installation directory is moved after it is built, users must
set the LAMHOME environment variable to the new location. On each UNIX
machine, users must add the LAM executable directory to their shell's
search path. LAM executables are found under bin/ in the installation
directory. These steps must be taken on each and every machine that
might be part of a multicomputer running LAM. Set the variables in
the shell's start-up file, not the .login file.
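For users of a Bourne-type shell, the start-up file lines might look like the sketch below. The path /opt/lam is a hypothetical new location for a moved installation directory. C shell users would use setenv and "set path" instead.

```shell
# Hypothetical new location of the moved installation directory.
LAMHOME=/opt/lam
export LAMHOME

# LAM executables live under bin/ in the installation directory.
PATH="$LAMHOME/bin:$PATH"
export PATH
```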
The recon(1) tool checks if LAM can be started on the given boot schema.
There are several prerequisites that enable LAM to be started on a remote
machine.
- The machine must be reachable and operational.
- The user must have an account on the machine.
- The user must be able to rsh(1) to the machine (permissions must be
set in either the /etc/hosts.equiv file or the user's .rhosts file on
the machine).
- The LAM executables must be locatable on that machine, using the
shell's search path and possibly the LAMHOME environment variable,
as described above.
- The shell's start-up script must not print anything on standard error.
The user can take advantage of the fact that rsh(1) will start
the shell non-interactively. The start-up script can exit early in
this case, before executing many commands relevant only to interactive
sessions and likely to generate output.
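The early exit described in the last item can be sketched for a Bourne-type shell as follows. The body of a hypothetical start-up file is wrapped in a function named startup so that the sketch can be run directly; in a real start-up file the same test would sit at the top level. The path /opt/lam and the greeting are assumptions made for this example.

```shell
# Body of a hypothetical start-up file, wrapped in a function.
startup() {
    # Settings needed by every shell, including those started by rsh(1).
    PATH="/opt/lam/bin:$PATH"      # hypothetical installation path

    # $- contains "i" only when the shell is interactive.
    case $- in
        *i*) ;;                    # interactive: continue below
        *)   return 0 ;;           # non-interactive: stop before any output
    esac

    # Interactive-only commands that may print something.
    echo "Welcome back"
}

startup
```

In the C shell, the usual equivalent is to test whether the prompt variable is set, for example "if (! $?prompt) exit".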
Refer users to the lam(7) manual page to get started using LAM tools
and libraries.
Clearing Space
After LAM has been built, all of the objects can be removed by running the
make(1) utility with the "clean" target in the source directory.
% make clean
If further space is required, the source directory can be taken off-line.
Only the installation directory need be maintained on-line.
Ohio Supercomputer Center, lam@tbag.osc.edu, http://www.osc.edu/lam.html