Full HTML for Basic foilset: MPI Message Passing Interface

Given by Geoffrey C. Fox and Nancy McCracken for Computational Science for Simulations, Fall Semester 1998. Foils prepared 19 September 1998.

Summary of Material


This covers MPI from a user's point of view and is meant to be a supplement to other online resources: the MPI Forum documents, David Walker's tutorial, Ian Foster's book "Designing and Building Parallel Programs", and Gropp, Lusk and Skjellum's "Using MPI".
An overview is based on a subset of six routines covering send/receive, environment inquiry (for rank and total number of processes), initialization and finalization, with simple examples.
Processor groups, collective communication and computation, topologies, and derived datatypes are also discussed.

Table of Contents for full HTML of MPI Message Passing Interface


1 CPS615 Introduction to Computational Science The Message Passing Interface MPI
2 Abstract of MPI Presentation
3 MPI Overview -- Comparison with HPF -- I
4 MPI Overview -- Comparison with HPF -- II
5 Some Key Features of MPI
6 What is MPI?
7 History of MPI
8 Who Designed MPI?
9 Some Difficulties with MPI
10 Sending/Receiving Messages: Issues
11 What Gets Sent: The Buffer
12 Generalizing the Buffer in MPI
13 Advantages of Datatypes
14 To Whom It Gets Sent: Process Identifiers
15 Generalizing the Process Identifier in MPI
16 Why use Process Groups?
17 How It Is Identified: Message Tags
18 Sample Program using Library
19 Correct Library Execution
20 Incorrect Library Execution
21 What Happened?
22 Solution to the Tag Problem
23 MPI Conventions
24 Standard Constants in MPI
25 The Six Fundamental MPI routines
26 MPI_Init -- Environment Management
27 MPI_Comm_rank -- Environment Inquiry
28 MPI_Comm_size -- Environment Inquiry
29 MPI_Finalize -- Environment Management
30 Hello World in C plus MPI
31 Comments on Parallel Input/Output - I
32 Comments on Parallel Input/Output - II
33 Blocking Send: MPI_Send(C) or MPI_SEND(Fortran)
34 Example MPI_SEND in Fortran
35 Blocking Receive: MPI_RECV(Fortran)
36 Blocking Receive: MPI_Recv(C)
37 Fortran example: Receive
38 Hello World:C Example of Send and Receive
39 HelloWorld, continued
40 Interpretation of Returned Message Status
41 Collective Communication
42 Some Collective Communication Operations
43 Hello World:C Example of Broadcast
44 Collective Computation
45 Examples of Collective Communication/Computation
46 Collective Computation Patterns
47 More Examples of Collective Communication/Computation
48 Data Movement (1)
49 Examples of MPI_ALLTOALL
50 Data Movement (2)
51 List of Collective Routines
52 Example Fortran: Performing a Sum
53 Example C: Computing Pi
54 Pi Example continued
55 Buffering Issues
56 Avoiding Buffering Costs
57 Combining Blocking and Send Modes
58 Cartesian Topologies
59 Defining a Cartesian Topology
60 MPI_Cart_coords or Who am I?
61 Who are my neighbors?
62 Periodic meshes
63 Motivation for Derived Datatypes in MPI
64 Derived Datatype Basics
65 Simple Example of Derived Datatype
66 Derived Datatypes: Vectors
67 Example of Vector type
68 Why is this interesting?
69 Use of Derived Types in Jacobi Iteration
70 Derived Datatypes: Indexed
71 Designing MPI Programs
72 Jacobi Iteration: The Problem
73 Jacobi Iteration: MPI Program Design
74 Jacobi Iteration: MPI Program Design
75 Jacobi Iteration: Fortran MPI Program
76 Jacobi Iteration: create topology
77 Jacobi iteration: data structures
78 Jacobi Iteration: send guard values
79 Jacobi Iteration: update and error
80 The MPI Timer
81 MPI-2
82 I/O included in MPI-2




Foil 1 CPS615 Introduction to Computational Science The Message Passing Interface MPI
September 9, 1998
Geoffrey Fox, Nancy McCracken
  • NPAC, Syracuse University
  • This presentation contains material from:
  • Designing and Building Parallel Programs,
  • by Ian Foster, Gina Goff, Ehtesham Hayder and Charles Koelbel
  • and
  • Tutorial on MPI: The Message Passing Interface
  • by William Gropp

Foil 2 Abstract of MPI Presentation
This covers MPI from a user's point of view and is meant to be a supplement to other online resources: the MPI Forum documents, David Walker's tutorial, Ian Foster's book "Designing and Building Parallel Programs", and Gropp, Lusk and Skjellum's "Using MPI".
An overview is based on a subset of six routines covering send/receive, environment inquiry (for rank and total number of processes), initialization and finalization, with simple examples.
Processor groups, collective communication and computation, topologies, and derived datatypes are also discussed.

Foil 3 MPI Overview -- Comparison with HPF -- I
MPI collected ideas from many previous message passing systems and put them into a "standard" so we could write portable (runs on all current machines) and scalable (runs on future machines we can think of) parallel software
MPI plays the same role to message passing systems that HPF does to data parallel languages
  • BUT whereas MPI has essentially all one could want -- as message passing is "fully understood"
  • HPF and related technologies will still evolve as there are many unsolved data parallel compiler issues
    • e.g. HPC++ -- the C++ version of HPF has important differences
    • and there is no data parallel version of C due to pointers (there is a C* language which has restrictions)
    • HPJava is our new idea but again not same as HPF or HPC++
  • whereas MPI is fine with Fortran, C, C++ and even Java

Foil 4 MPI Overview -- Comparison with HPF -- II
HPF runs on SIMD and MIMD machines and is high level as it expresses a style of programming or problem architecture
MPI runs on MIMD machines (in principle it could run on SIMD but unnatural and inefficient) -- it expresses a machine architecture
Traditional Software Model is
  • Problem --> High Level Language --> Assembly Language --> Machine
    • (the High Level Language expresses the Problem; the Assembly/Machine Language expresses the Machine)
So in this analogy MPI is the universal "machine language" of parallel processing
MPI can be built efficiently at low risk, whereas an HPF compiler is a difficult project with many unsolved issues

Foil 5 Some Key Features of MPI
An MPI program defines a set of processes, each executing the same program (SPMD)
  • (usually one process per parallel computer node)
... that communicate by calling MPI messaging functions
  • (point-to-point and collective)
... and can be constructed in a modular fashion
  • (communication contexts are the key to MPI libraries)
Also
  • Support for Process Groups -- messaging in subsets of processors
  • Support for application dependent (virtual) topologies analogous to distribution types in HPF
  • Inquiry routines to find out properties of the environment such as number of processors

Foil 6 What is MPI?
A standard message-passing library
  • p4, NX, PVM, Express, PARMACS are precursors
MPI defines a language-independent interface
  • Not an implementation
Bindings are defined for different languages
  • So far, C and Fortran 77, C++ and F90
  • Java Grande Forum is defining Java bindings
Multiple implementations

Foil 7 History of MPI
Began at Williamsburg Workshop in April 1992
Organized at Supercomputing 92 (November 92)
Followed HPF Forum format and process
  • Met every 6 weeks for two days
  • Extensive, open email discussions
  • Drafts, readings, votes
Pre-final draft distributed at Supercomputing 93
Two-month public comment period
Final version of draft in May 1994
Widely available now on the Web, ftp sites, netlib
Public and optimized Vendor implementations available for Unix and Windows NT
Further MPI Forum meetings through 1995 and 1996 to discuss additions to the standard
Standard announced at Supercomputing 1996

Foil 8 Who Designed MPI?
Broad Participation
Vendors
  • IBM, Intel, TMC, Meiko, Cray, Convex, Ncube
Message Passing Library writers
  • PVM, p4, Zipcode, TCGMSG, Chameleon, Express, Linda
Application specialists and consultants
Companies:
  • ARCO, Convex, Cray Research, IBM, Intel, KAI, Meiko, NAG, nCUBE, Parasoft, Shell, TMC
Laboratories:
  • ANL, GMD, LANL, LLNL, NOAA, NSF, ORNL, PNL, Sandia, SDSC, SRC
Universities:
  • UC Santa Barbara, Syracuse, Michigan State, Oregon Grad Institute, New Mexico, Miss. State, Southampton, Colorado, Yale, Tennessee, Maryland, Western Michigan, Edinburgh, Cornell, Rice, San Francisco

Foil 9 Some Difficulties with MPI
MPI was designed by the "kitchen sink" approach: it has 129 functions, each with many arguments
  • This completeness is strength and weakness!
  • Hard to implement efficiently and hard to learn all its details
It is not a complete operating environment and does not have ability to create and spawn processes etc.
PVM was the previously dominant approach
  • It is very simple with much less functionality than MPI
  • However it runs on essentially all machines including heterogeneous workstation clusters
  • Further it is a complete albeit simple operating environment
However it seems clear that MPI has been adopted as the standard messaging system by parallel computer vendors

Foil 10 Sending/Receiving Messages: Issues
Questions:
  • What is sent?
  • To whom is the data sent?
  • How does the receiver identify it?

Foil 11 What Gets Sent: The Buffer
First generation message passing systems only allowed one to transmit information originating in a contiguous array of bytes
  • Hid the real data structure from hardware and programmer
    • Might make it hard to provide efficient implementations as implied a lot of expensive memory accesses
  • Required pre-packing dispersed data, e.g.:
    • Rows (in Fortran, columns in C) of a matrix must be transposed before transmission
  • Prevented convenient communication between machines with different data representations

Foil 12 Generalizing the Buffer in MPI
MPI specifies the buffer by starting address, datatype, and count
  • starting address is obvious
  • datatypes are constructed recursively from
    • Elementary (all C and Fortran datatypes)
    • Contiguous array of datatypes
    • Strided blocks of datatypes
    • Indexed array of blocks of datatypes
    • General structures
  • count is number of datatype elements

Foil 13 Advantages of Datatypes
Combinations of elementary datatypes into a derived, user-defined datatype allow clean communication of collections of disparate types in a single MPI call.
Elimination of length (in bytes) in favor of count (of items of a given type) is clearer
Specifying application-oriented layouts allows maximal use of special hardware and optimized memory use
However this wonderful technology is problematical in Java where layout of data structures in memory is not defined in most cases
  • Java's serialization subsumes user defined datatypes as a general way of packing a class of disparate types into a message that can be sent between heterogeneous computers

Foil 14 To Whom It Gets Sent: Process Identifiers
1st generation message passing systems used hardware addresses
  • Was inflexible
    • Had to recode on moving to a new machine
  • Was inconvenient
    • Required programmer to map problem topology onto explicit machine connections
  • Was insufficient
    • Didn't support operations over a submachine (e.g., sum across a row of processes)

Foil 15 Generalizing the Process Identifier in MPI
MPI supports process groups
  • Initial "all" group
  • Group management routines
    • Split group
    • Define group from list
All communication takes place in groups
  • Source/destination identifications refer to rank in group
  • Communicator = group + context
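As an illustration of splitting the initial "all" group (not from the original foils -- the processes-per-row mapping and names like row_comm are just for this sketch), MPI_Comm_split can be used as follows:
  #include <stdio.h>
  #include <mpi.h>
  int main(int argc, char *argv[])
  {
    int rank, row, row_rank;
    MPI_Comm row_comm;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    row = rank / 4;   /* assume 4 processes per "row" for this sketch */
    /* all processes supplying the same color (row) join the same new group */
    MPI_Comm_split(MPI_COMM_WORLD, row, rank, &row_comm);
    MPI_Comm_rank(row_comm, &row_rank);   /* rank within the subgroup */
    printf("world rank %d is rank %d in row %d\n", rank, row_rank, row);
    MPI_Comm_free(&row_comm);
    MPI_Finalize();
    return 0;
  }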

Foil 16 Why use Process Groups?
We find a good example when we consider typical Matrix Algorithm
  • (matrix multiplication)
  • A(i,j) = Σk B(i,k) C(k,j)
  • summed over k, using the k'th column of B and the k'th row of C
Consider a block decomposition of 16 by 16 matrices B and C as for Laplace's equation. (Efficient Decomposition as we will see later)
Each sum operation involves a subset (group) of 4 processors
(Figure: the processors in one such group, shown for k = 2)

Foil 17 How It Is Identified: Message Tags
1st generation message passing systems used an integer "tag" (a.k.a. "type" or "id") to match messages when received
  • Most systems allowed wildcard on receive
    • wildcard means match any tag i.e. any message
    • Unsafe due to unexpected message arrival
  • Most could match sender id, some with wildcards
    • Wildcards unsafe; strict checks inconvenient
  • All systems let users pick the tags
    • Unsafe for libraries due to interference

Foil 18 Sample Program using Library
Calls Sub1 and Sub2 are from different libraries
Same sequence of calls on all processes, with no global synch
  • Sub1();
  • Sub2();
We follow with two cases showing possibility of error with messages getting mixed up between subroutine calls

Foil 19 Correct Library Execution

Foil 20 Incorrect Library Execution

Foil 21 What Happened?
Each library was self-consistent
  • Correctly handled all messages it knew about
Interaction between the libraries killed them
  • "Intercepting" a message broke both
The lesson:
  • Don't take messages from strangers
Other examples teach other lessons:
  • Clean up your own messages
  • Don't use other libraries' tags
  • Etc. ...

Foil 22 Solution to the Tag Problem
Generalize tag to tag and communicator
A separate communication context for each family of messages
  • Used for queuing and matching
  • This is the context for communicators
No wild cards allowed in communicator, for security
Communicator allocated by the system, for security
Tags retained for use within a context
  • wild cards OK for tags
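One common way a library exploits this (a hedged sketch, not from the foils; lib_init and lib_comm are illustrative names) is to duplicate the caller's communicator once, so that the library's messages live in their own context and cannot be intercepted by user code:
  #include <mpi.h>
  static MPI_Comm lib_comm;   /* private communication context for the library */
  void lib_init(MPI_Comm user_comm)
  {
    /* same process group as user_comm, but a fresh system-allocated context */
    MPI_Comm_dup(user_comm, &lib_comm);
  }
  void lib_done(void)
  {
    MPI_Comm_free(&lib_comm);
  }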

Foil 23 MPI Conventions
All MPI routines are prefixed by MPI_
  • C is always MPI_Xnnnnn(parameters) : C is case sensitive
  • Fortran is case insensitive but we will write MPI_XNNNNN(parameters)
MPI constants are in upper case as are MPI datatypes, e.g. MPI_FLOAT for floating point number in C
Specify overall constants with
  • #include "mpi.h" in C programs
  • include "mpif.h" in Fortran
C routines are actually integer functions and always return an integer status (error) code
Fortran routines are really subroutines and return the status code as their last argument
  • Please check on status codes although this is often skipped!

Foil 24 Standard Constants in MPI
There is a set of predefined constants in the include files for each language; these include:
MPI_SUCCESS -- successful return code
MPI_COMM_WORLD (everything) and MPI_COMM_SELF(current process) are predefined reserved communicators in C and Fortran
Fortran elementary datatypes are:
  • MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION, MPI_COMPLEX, MPI_DOUBLE_COMPLEX, MPI_LOGICAL, MPI_CHARACTER, MPI_BYTE, MPI_PACKED
C elementary datatypes are:
  • MPI_CHAR, MPI_SHORT, MPI_INT, MPI_LONG, MPI_UNSIGNED_CHAR, MPI_UNSIGNED_SHORT, MPI_UNSIGNED, MPI_UNSIGNED_LONG, MPI_FLOAT, MPI_DOUBLE, MPI_LONG_DOUBLE, MPI_BYTE, MPI_PACKED

Foil 25 The Six Fundamental MPI routines
MPI_Init (argc, argv) -- initialize
MPI_Comm_rank (comm, rank) -- find process label (rank) in group
MPI_Comm_size(comm, size) -- find total number of processes
MPI_Send (sndbuf,count,datatype,dest,tag,comm) -- send a message
MPI_Recv (recvbuf,count,datatype,source,tag,comm,status) -- receive a message
MPI_Finalize( ) -- finalize (shut down MPI)

Foil 26 MPI_Init -- Environment Management
This MUST be called to set up MPI before any other MPI routines may be called
For C: int MPI_Init(int *argc, char **argv[] )
  • argc and argv[] are conventional C main routine arguments
  • As usual MPI_Init returns an error
For Fortran: call MPI_INIT(mpierr)
  • nonzero (more pedantically values not equal to MPI_SUCCESS) values of mpierr represent errors

Foil 27 MPI_Comm_rank -- Environment Inquiry
This allows you to identify each process by a unique integer called the rank which runs from 0 to N-1 where there are N processes
If we divide the region 0 to 1 by domain decomposition into N parts, the process with rank r controls
  • subregion covering r/N to (r+1)/N
  • for C:int MPI_Comm_rank(MPI_Comm comm, int *rank)
    • comm is an MPI communicator of type MPI_Comm
  • for FORTRAN: call MPI_COMM_RANK (comm, rank, mpierr)
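A minimal C sketch of the decomposition described above (assuming the interval 0 to 1 is split evenly among the processes):
  #include <stdio.h>
  #include <mpi.h>
  int main(int argc, char *argv[])
  {
    int rank, size;
    double left, right;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    left  = (double) rank / size;          /* r/N */
    right = (double) (rank + 1) / size;    /* (r+1)/N */
    printf("process %d of %d controls [%g, %g)\n", rank, size, left, right);
    MPI_Finalize();
    return 0;
  }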

Foil 28 MPI_Comm_size -- Environment Inquiry
This returns in integer size number of processes in given communicator comm (remember this specifies processor group)
For C: int MPI_Comm_size(MPI_Comm comm,int *size)
For Fortran: call MPI_COMM_SIZE (comm, size, mpierr)
  • where comm, size, mpierr are integers
  • comm is input; size and mpierr returned

Foil 29 MPI_Finalize -- Environment Management
Before exiting an MPI application, it is courteous to clean up the MPI state and MPI_FINALIZE does this. No MPI routine may be called in a given process after that process has called MPI_FINALIZE
for C: int MPI_Finalize()
for Fortran:call MPI_FINALIZE(mpierr)
  • mpierr is an integer

Foil 30 Hello World in C plus MPI
/* all processes execute this program */
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[])
{
  int ierror, rank, size;
  MPI_Init(&argc, &argv);                    /* initialize */
  /* in the following, find the process number (rank) */
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  if( rank == 0 )
    printf("Hello World!\n");
  /* in the following, find the total number of processes */
  ierror = MPI_Comm_size(MPI_COMM_WORLD, &size);
  if( ierror != MPI_SUCCESS )
    MPI_Abort(MPI_COMM_WORLD, ierror);       /* abort */
  printf("I am processor %d out of total of %d\n", rank, size);
  MPI_Finalize();                            /* finalize */
  return 0;
}

Foil 31 Comments on Parallel Input/Output - I
Parallel I/O has technical issues -- how best to optimize access to a file whose contents may be stored on N different disks which can deliver data in parallel and
Semantic issues -- what does printf in C (and PRINT in Fortran) mean?
The meaning of printf/PRINT is both undefined and changing
  • In my old Caltech days, printf on a node of a parallel machine was a modification of UNIX which automatically transferred data from nodes to "host e.g. node 0" and produced a single stream
  • In those days, full UNIX did not run on every node of machine
  • We introduced new UNIX I/O modes (singular and multiple) to define meaning of parallel I/O and I thought this was a great idea but it didn't catch on!!

Foil 32 Comments on Parallel Input/Output - II
Today, memory costs have declined and ALL mainstream MIMD distributed memory machines whether clusters of workstations/PC's or integrated systems such as T3D/ Paragon/ SP-2 have enough memory on each node to run UNIX or Windows NT
Thus printf today means typically that the node on which it runs will stick it out on "standard output" file for that node
  • However this is implementation dependent
If on the other hand you want a single stream of output with the information in order
  • Starting with that from node 0, then node 1, then node 2, etc.
  • (this was the default on the old Caltech machines)
Then in general you need to communicate the information from nodes 1 to N-1 to node 0 and let node 0 sort it and output it in the required order
MPI-IO standard links I/O to MPI in a standard fashion

Foil 33 Blocking Send: MPI_Send(C) or MPI_SEND(Fortran)
call MPI_SEND (
  • IN message start address of data to send
  • IN message_len number of items (length in bytes determined by type)
  • IN datatype type of each data element
  • IN dest_rank Process number (rank) of destination
  • IN message_tag tag of message to allow receiver to filter
  • IN communicator Communicator of both sender and receiver group
  • OUT error_message) Error Flag (absent in C)

Foil 34 Example MPI_SEND in Fortran
integer count, datatype, dest, tag, comm, mpierr
real sndbuf(50)
comm = MPI_COMM_WORLD
tag = 0
count = 50
datatype = MPI_REAL
dest = 1    ! destination rank (any valid rank in comm)
call MPI_SEND (sndbuf, count, datatype, dest, tag, comm, mpierr)

Foil 35 Blocking Receive: MPI_RECV(Fortran)
call MPI_RECV(
  • IN start_of_buffer Address of place to store data(address is Input -- values of data are of course output starting at this address!)
  • IN buffer_len Maximum number of items allowed
  • IN datatype Type of each data element
  • IN source_rank Processor number (rank) of source
  • IN tag only accept messages with this tag value
  • IN communicator Communicator of both sender and receiver group
  • OUT return_status Data structure describing what happened!
  • OUT error_message) Error Flag (absent in C)
Note that return_status is used after completion of the receive to find the actual received length (buffer_len is the maximum length allowed), the actual source processor source_rank, and the actual message tag

Foil 36 Blocking Receive: MPI_Recv(C)
In C the syntax is
int error_message = MPI_Recv(
  • void *start_of_buffer,
  • int buffer_len,
  • MPI_DATATYPE datatype,
  • int source_rank,
  • int tag,
  • MPI_Comm communicator,
  • MPI_Status *return_status)

Foil 37 Fortran example: Receive
integer status(MPI_STATUS_SIZE)   ! an array to store the status of the received information
integer mpierr, count, datatype, source, tag, comm
real recvbuf(100)
count = 100
datatype = MPI_REAL
comm = MPI_COMM_WORLD
source = MPI_ANY_SOURCE   ! accept any source processor
tag = MPI_ANY_TAG         ! accept any message tag
call MPI_RECV (recvbuf, count, datatype, source, tag, comm, status, mpierr)
  • Note source and tag can be wild-carded

Foil 38 Hello World:C Example of Send and Receive
/* All processes execute this program */
#include <stdio.h>
#include <string.h>
#include "mpi.h"
int main( int argc, char **argv )
{
  char message[20];
  int i, rank, size, tag = 137;            /* any value of tag allowed */
  MPI_Status status;
  MPI_Init (&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &size);    /* number of processes */
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* who is this process */

Foil 39 HelloWorld, continued
  if( rank == 0 ) {   /* we are on the "root" -- process 0 */
    strcpy(message, "Hello MPI World");    /* generate message */
    for(i = 1; i < size; i++)   /* send message to the size-1 other processes */
      MPI_Send(message, strlen(message)+1, MPI_CHAR, i, tag, MPI_COMM_WORLD);
  }
  else {   /* any process except the root -- process 0 */
    MPI_Recv(message, 20, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
  }
  printf("This is a message from node %d saying %s\n", rank, message);
  MPI_Finalize();
  return 0;
}

Foil 40 Interpretation of Returned Message Status
In C status is a structure of type MPI_Status
  • status.MPI_SOURCE gives the actual source process
  • status.MPI_TAG gives the actual message tag
In Fortran the status is an integer array and different elements give:
  • in status(MPI_SOURCE) the actual source process
  • in status(MPI_TAG) the actual message tag
In C and Fortran, the number of elements (called count) in the message can be found with
call MPI_GET_COUNT (IN status, IN datatype, OUT count, OUT error_message)
  • where as usual in C the last argument is missing: the error code is returned by the function call
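A small C sketch of this pattern (not from the foils; the buffer size and datatype are illustrative):
  #include <stdio.h>
  #include <mpi.h>
  /* receive up to 100 doubles from any source with any tag, then inspect the status */
  void receive_and_inspect(void)
  {
    double buf[100];
    MPI_Status status;
    int count;
    MPI_Recv(buf, 100, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);
    printf("source = %d, tag = %d\n", status.MPI_SOURCE, status.MPI_TAG);
    MPI_Get_count(&status, MPI_DOUBLE, &count);   /* actual number of elements received */
    printf("received %d doubles\n", count);
  }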

Foil 41 Collective Communication
Provides standard interfaces to common global operations
  • Synchronization
  • Communications, i.e. movement of data
  • Collective computation
A collective operation uses a process group
  • All processes in group call same operation at (roughly) the same time
  • Groups are constructed "by hand" with MPI group manipulation routines or by using MPI topology-definition routines
Message tags not needed (generated internally)
All collective operations are blocking.

Foil 42 Some Collective Communication Operations
MPI_BARRIER(comm) Global Synchronization within a given communicator
MPI_BCAST Global Broadcast
MPI_GATHER Concatenate data from all processors in a communicator into one process
  • MPI_ALLGATHER puts result of concatenation in all processors
MPI_SCATTER takes data from one processor and scatters over all processors
MPI_ALLTOALL sends data from all processes to all other processes
MPI_SENDRECV exchanges data between two processors -- often used to implement "shifts"
  • this is viewed as purely point-to-point by some

Foil 43 Hello World:C Example of Broadcast
#include "mpi.h"
main( int argc, char **argv )
{ char message[20];
  • int rank;
  • MPI_Init (&argc, &argv);
  • MPI_Comm_rank(MPI_COMM_WORLD, &rank); # Who is this processor
  • if( rank == 0 ) # We are on "root" -- Processor 0
    • strcpy(message,"Hello MPI World"); # Generate message
  • # MPI_Bcast sends from root=0 and receives on all other processor
  • MPI_Bcast(message,20, MPI_CHAR, 0, MPI_COMM_WORLD);
  • printf("This is a message from node %d saying %s\n", rank, message);
  • MPI_Finalize(); }
Note that all processes issue the broadcast operation, process 0 sends the message and all processes receive the message.

Foil 44 Collective Computation
One can often perform computing during a collective communication
MPI_REDUCE performs reduction operation of type chosen from
  • maximum(value or value and location), minimum(value or value and location), sum, product, logical and/or/xor, bit-wise and/or/xor
  • e.g. the operation labeled MPI_MAX stores, in the location result on the root process (called rank here), the global maximum of the values of original on each processor, as in
  • call MPI_REDUCE(original, result, 1, MPI_REAL, MPI_MAX, rank, comm, ierror)
    • One can also supply one's own reduction function
MPI_ALLREDUCE is same as MPI_REDUCE but it stores result in all -- not just one -- processors
MPI_SCAN performs reductions with result for processor r depending on data in processors 0 to r
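MPI_SCAN gets no worked example later, so here is a minimal C sketch of a prefix sum (each process contributes one integer; the values chosen are illustrative):
  #include <stdio.h>
  #include <mpi.h>
  int main(int argc, char *argv[])
  {
    int rank, value, partial;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    value = rank + 1;   /* process r contributes r+1 */
    /* after the scan, process r holds the sum of the values on processes 0..r */
    MPI_Scan(&value, &partial, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    printf("process %d: prefix sum = %d\n", rank, partial);
    MPI_Finalize();
    return 0;
  }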

Foil 45 Examples of Collective Communication/Computation
Four Processors where each has a send buffer of size 2
  • 0 1 2 3 Processors
  • (2,4) (5,7) (0,3) (6,2) Initial Send Buffers
  • MPI_BCAST with root=2
  • (0,3) (0,3) (0,3) (0,3) Resultant Buffers
  • MPI_REDUCE with action MPI_MIN and root=0
  • (0,2) (_,_) (_,_) (_,_) Resultant Buffers
  • MPI_ALLREDUCE with action MPI_MIN (no root argument is needed)
  • (0,2) (0,2) (0,2) (0,2) Resultant Buffers
  • MPI_REDUCE with action MPI_SUM and root=1
  • (_,_) (13,16) (_,_) (_,_) Resultant Buffers

Foil 46 Collective Computation Patterns
(Figure: collective computation patterns shown across processors and memory locations)

Foil 47 More Examples of Collective Communication/Computation
Four Processors where each has a send buffer of size 2
  • 0 1 2 3 Processors
  • (2,4) (5,7) (0,3) (6,2) Initial Send Buffers
  • MPI_SENDRECV with 0,1 and 2,3 paired
  • (5,7) (2,4) (6,2) (0,3) Resultant Buffers
  • MPI_GATHER with root=0
  • (2,4,5,7,0,3,6,2) (_,_) (_,_) (_,_) Resultant Buffers
  • Now take four Processors where only rank=0 has send buffer
  • (2,4,5,7,0,3,6,2) (_,_) (_,_) (_,_) Initial send Buffers
  • MPI_SCATTER with root=0
  • (2,4) (5,7) (0,3) (6,2) Resultant Buffers

Foil 48 Data Movement (1)
(Figure: data movement patterns across processors and memory locations for data items A, B, C, D, E)

Foil 49 Examples of MPI_ALLTOALL
All to All Communication with i'th location in j'th processor being sent to j'th location in i'th processor
Processor 0 1 2 3
Start (a0,a1,a2,a3) (b0,b1,b2,b3) (c0,c1,c2,c3) (d0,d1,d2,d3)
After (a0,b0,c0,d0) (a1,b1,c1,d1) (a2,b2,c2,d2) (a3,b3,c3,d3)
There is an extension MPI_ALLTOALLV to handle the case where data is stored in a noncontiguous fashion in each processor and where each processor sends different amounts of data to other processors
Many MPI routines have such "vector" extensions
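A minimal C sketch of the pattern above, exchanging one integer between every pair of processes (buffer names are illustrative):
  #include <stdio.h>
  #include <stdlib.h>
  #include <mpi.h>
  int main(int argc, char *argv[])
  {
    int rank, size, i;
    int *sendbuf, *recvbuf;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    sendbuf = (int *) malloc(size * sizeof(int));
    recvbuf = (int *) malloc(size * sizeof(int));
    for (i = 0; i < size; i++)
      sendbuf[i] = 100*rank + i;   /* element i is destined for process i */
    /* element i of sendbuf on process j arrives as element j of recvbuf on process i */
    MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);
    for (i = 0; i < size; i++)
      printf("process %d got %d from process %d\n", rank, recvbuf[i], i);
    free(sendbuf); free(recvbuf);
    MPI_Finalize();
    return 0;
  }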

Foil 50 Data Movement (2)

Foil 51 List of Collective Routines
"ALL" routines deliver results to all participating processes
Routines ending in "V" allow different sized inputs on different processors

Foil 52 Example Fortran: Performing a Sum
call MPI_COMM_RANK( comm, rank, ierr )
if (rank .eq. 0) then
read *, n
end if
call MPI_BCAST(n, 1, MPI_INTEGER, 0, comm, ierr )
! Each process computes its range of numbers to sum
lo = rank*n+1
hi = lo+n-1
sum = 0.0d0
do i = lo, hi
sum = sum + 1.0d0 / i
end do
call MPI_ALLREDUCE( sum, sumout, 1, MPI_DOUBLE_PRECISION,
&                   MPI_SUM, comm, ierr)

Foil 53 Example C: Computing Pi
#include "mpi.h"
#include <math.h>
int main (argc, argv)
int argc; char *argv[];
{
int n, myid, numprocs, i, rc;
double PI25DT = 3.14159265358979323842643;
double mypi, pi, h, sum, x, a;
MPI_Init(&argc, &argv);
MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank (MPI_COMM_WORLD, &myid);

Foil 54 Pi Example continued
while (1)
{ if (myid == 0)
  { printf ("Enter the number of intervals: (0 quits) ");
    scanf ("%d", &n);
  }
  MPI_Bcast (&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
  if (n == 0) break;
  h = 1.0 / (double) n;
  sum = 0.0;
  for (i = myid+1; i <= n; i += numprocs)
  { x = h * ((double) i - 0.5); sum += 4.0 / (1.0 + x*x); }
  mypi = h * sum;
  MPI_Reduce (&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
  if (myid == 0)
    printf("pi is approximately %.16f, Error is %.16f\n", pi, fabs(pi - PI25DT));
}
MPI_Finalize(); return 0; }

Foil 55 Buffering Issues
Where does data go when you send it?
  • Multiple buffer copies, as in A)?
  • Straight to the network, as in B)?
B) is more efficient than A), but not always correct
(Figure: A) data copied through multiple buffers; B) data going straight to the network)

Foil 56 Avoiding Buffering Costs
Copies are not needed if
  • Send does not return until the data is delivered,
  • or
  • The data is not touched after the send
MPI provides modes to arrange this
  • Synchronous: Do not return until recv is posted
  • Ready: Matching recv is posted before send
  • Buffered: If you really want buffering
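Each mode is selected simply by calling a different send routine; a hedged sketch (the three calls below are alternatives, each needing its own matching receive; the count, destination and tags are illustrative):
  #include <mpi.h>
  void send_in_three_modes(double *data, int dest)
  {
    /* synchronous: does not complete until the matching receive has started */
    MPI_Ssend(data, 100, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
    /* ready: correct only if the matching receive is already posted */
    MPI_Rsend(data, 100, MPI_DOUBLE, dest, 1, MPI_COMM_WORLD);
    /* buffered: copies into a user buffer previously supplied with MPI_Buffer_attach */
    MPI_Bsend(data, 100, MPI_DOUBLE, dest, 2, MPI_COMM_WORLD);
  }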

Foil 57 Combining Blocking and Send Modes
All combinations are legal
  • (in the original figure the fastest combinations are marked in red, the slow ones in blue)

Foil 58 Cartesian Topologies
MPI provides routines to provide structure to collections of processes. Although it also has graph topologies, here we concentrate on cartesian.
A Cartesian topology is a mesh
Example of a 3 x 4 mesh with arrows pointing at the right neighbors:
(0,0)  (0,1)  (0,2)  (0,3)
(1,0)  (1,1)  (1,2)  (1,3)
(2,0)  (2,1)  (2,2)  (2,3)

Foil 59 Defining a Cartesian Topology
The routine MPI_Cart_create creates a Cartesian decomposition of the processes, with the number of dimensions given by the ndim argument. It returns a new communicator (in comm2d in example below) with the same processes as in the input communicator, but different topology.
ndim = 2;
dims[0] = 3; dims[1] = 4;
periods[0] = 0; periods[1] = 0; // periodic is false
reorder = 1; // reordering is true
ierr = MPI_Cart_create (MPI_COMM_WORLD, ndim,
dims, periods, reorder, &comm2d);
  • where reorder specifies that it's o.k. to reorder the default process rank in order to achieve a good embedding (with good communication times between neighbors).

Foil 60 MPI_Cart_coords or Who am I?
Given the rank of the process (returned by MPI_Comm_rank on the new communicator) in a variable myrank, this routine gives a two element (for a two dimensional topology) array (coords in the example below) with the (i, j) coordinates of this process in the new Cartesian communicator.
ierr = MPI_Cart_coords (comm2d, myrank, ndim, coords);
coords[0] and coords[1] will be the i and j coordinates.

Foil 61 Who are my neighbors?
The routine MPI_Cart_shift finds the neighbors in each direction of the new communicator.
dir = 0; // in C 0 for columns, 1 for rows
// in Fortran, it's 1 and 2
disp = 1; // specifies first neighbor to the right and left
ierr = MPI_Cart_shift (comm2d, dir, disp, &nbrbottom, &nbrtop);
This returns the process numbers (ranks) of the bottom and top neighbors for a communication.
If a process in a non-periodic mesh is on the border and has no neighbor, then the value MPI_PROC_NULL is returned. This process value can be used in a send/recv, but it will have no effect.

Foil 62 Periodic meshes
In a periodic mesh, as shown below, the processes at the edge of the mesh wrap around in their dimension to find their neighbors; for example, the right neighbor of a process in the last column is the process in the first column of the same row.
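A small C sketch of creating such a periodic mesh (a 3 x 4 grid as in the earlier example, so it assumes at least 12 processes; variable names are illustrative):
  #include <mpi.h>
  int main(int argc, char *argv[])
  {
    int dims[2] = {3, 4};
    int periods[2] = {1, 1};   /* both dimensions wrap around */
    int rank, left, right;
    MPI_Comm comm2d;
    MPI_Init(&argc, &argv);
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &comm2d);
    if (comm2d != MPI_COMM_NULL) {   /* processes beyond 3*4 get MPI_COMM_NULL */
      MPI_Comm_rank(comm2d, &rank);
      /* with periods set, edge processes get wrapped neighbors instead of MPI_PROC_NULL */
      MPI_Cart_shift(comm2d, 1, 1, &left, &right);
      MPI_Comm_free(&comm2d);
    }
    MPI_Finalize();
    return 0;
  }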

Foil 63 Motivation for Derived Datatypes in MPI
These are an elegant solution to a problem we struggled with a lot in the early days -- all message passing is naturally built on buffers holding contiguous data
However often (usually) the data is not stored contiguously. One can address this with a set of small MPI_SEND commands but we want messages to be as big as possible as latency is so high
One can copy all the data elements into a single buffer and transmit this but this is tedious for the user and not very efficient
It has extra memory to memory copies which are often quite slow
So derived datatypes can be used to set up arbitrary memory templates with variable offsets and primitive datatypes. Derived datatypes can then be used in "ordinary" MPI calls in place of primitive datatypes such as MPI_REAL or MPI_FLOAT.

Foil 64 Derived Datatype Basics
Derived Datatypes should be declared integer in Fortran and MPI_Datatype in C
Generally have form { (type0,disp0), (type1,disp1) ... (type(n-1),disp(n-1)) } with list of primitive data types typei and displacements (from start of buffer) dispi
call MPI_TYPE_CONTIGUOUS (count, oldtype, newtype, ierr)
  • creates a new datatype newtype made up of count repetitions of old datatype oldtype
one must use call MPI_TYPE_COMMIT(derivedtype, ierr) before one can use the type derivedtype in a communication call
call MPI_TYPE_FREE(derivedtype, ierr) frees up space used by this derived type

Foil 65 Simple Example of Derived Datatype
integer derivedtype, ...
call MPI_TYPE_CONTIGUOUS(10, MPI_REAL, derivedtype, ierr)
call MPI_TYPE_COMMIT(derivedtype, ierr)
call MPI_SEND(data, 1, derivedtype, dest, tag, MPI_COMM_WORLD, ierr)
call MPI_TYPE_FREE(derivedtype, ierr)
is equivalent to simpler single call
call MPI_SEND(data, 10, MPI_REAL, dest, tag, MPI_COMM_WORLD, ierr)
and each sends 10 contiguous real values at location data to process dest

Foil 66 Derived Datatypes: Vectors
MPI_TYPE_VECTOR (count, blocklen, stride, oldtype, newtype, ierr)
  • IN count Number of blocks to be added
  • IN blocklen Number of elements in block
  • IN stride Number of elements (NOT bytes) between start of each block
  • IN oldtype Datatype of each element
  • OUT newtype Handle(pointer) for new derived type
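A C analogue (not in the foils): since C stores a matrix row by row, one column is a strided set of elements and MPI_Type_vector describes it directly; the array dimensions here are illustrative.
  #include <mpi.h>
  #define NROWS 8
  #define NCOLS 10
  /* build a datatype describing one column of a row-major NROWS x NCOLS double array */
  MPI_Datatype make_column_type(void)
  {
    MPI_Datatype coltype;
    /* NROWS blocks of 1 element, successive blocks NCOLS elements apart */
    MPI_Type_vector(NROWS, 1, NCOLS, MPI_DOUBLE, &coltype);
    MPI_Type_commit(&coltype);
    return coltype;
  }
  /* usage sketch: MPI_Send(&a[0][2], 1, coltype, dest, tag, comm) would send column 2
     of a double a[NROWS][NCOLS] */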

Foil 67 Example of Vector type
Suppose in Fortran, we have an array
  • u (0 : nxblock+1, 0 : nyblock+1)
  • where we want to send rows and columns of elements from 1 : nxblock and 1 : nyblock
  • in Fortran, arrays are stored column major order (C is row major)
In Fortran, successive elements down a column are contiguous in memory
MPI_Type_vector( nyblock, 1, nxblock+2, MPI_REAL, rowtype, ierr )
defines a row of elements 1:nyblock
(Figure: the array u with first index 0:nxblock+1 and second index 0:nyblock+1)

Foil 68 Why is this interesting?
In Jacobi-like algorithms, each processor stores its own nxblock by nyblock array of variables as well as guard rings containing the rows and columns from its neighbours. One loads these guard rings at the start of each computation iteration and only updates points internal to the array
(Figure: the local array surrounded by its guard rings)

Foil 69 Use of Derived Types in Jacobi Iteration
The Fortran version is shown (interchange rows and columns for C)
integer nbrtop, nbrbottom, nbrleft, nbrright
! These are the processor ranks of the 4 nearest neighbors (top, bottom, left and right respectively) -- found from MPI_CART_SHIFT (see later)
integer rowtype, coltype   ! the new derived types
call MPI_TYPE_CONTIGUOUS(nxblock, MPI_REAL, coltype, ierr)
call MPI_TYPE_COMMIT(coltype, ierr)
call MPI_TYPE_VECTOR(nyblock, 1, nxblock+2, MPI_REAL, rowtype, ierr)
call MPI_TYPE_COMMIT(rowtype, ierr)
! Now transmit the internal edge columns and rows to the guard rings on the neighbors
call MPI_SEND( u(1,1), 1, coltype, nbrleft, 0, comm, ierr)
call MPI_SEND( u(1,nyblock), 1, coltype, nbrright, 0, comm, ierr)
call MPI_SEND( u(1,1), 1, rowtype, nbrtop, 0, comm, ierr)
call MPI_SEND( u(nxblock,1), 1, rowtype, nbrbottom, 0, comm, ierr)

Foil 70 Derived Datatypes: Indexed
Array of indices, useful for gather/scatter
MPI_TYPE_INDEXED (count, blocklens, indices, oldtype, newtype, ierr)
  • IN count Number of blocks to be added
  • IN blocklens Number of elements in each block -- an array of length count
  • IN indices Displacements (an array of length count) for each block
  • IN oldtype Datatype of each element
  • OUT newtype Handle(pointer) for new derived type
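A minimal C sketch of an indexed type that picks three scattered blocks out of a double array (the block lengths and displacements are illustrative):
  #include <mpi.h>
  /* describe elements 0-1, 5 and 9-11 of a double array with a single datatype */
  MPI_Datatype make_indexed_type(void)
  {
    int blocklens[3] = {2, 1, 3};   /* lengths of the three blocks */
    int indices[3]   = {0, 5, 9};   /* displacement (in elements) of each block */
    MPI_Datatype scattered;
    MPI_Type_indexed(3, blocklens, indices, MPI_DOUBLE, &scattered);
    MPI_Type_commit(&scattered);
    return scattered;   /* use as: MPI_Send(array, 1, scattered, dest, tag, comm) */
  }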

Foil 71 Designing MPI Programs
Partitioning
  • Before tackling MPI
Communication
  • Many point-to-point and collective operations
Agglomeration
  • Needed to produce MPI processes
Mapping
  • Handled by MPI

Foil 72 Jacobi Iteration: The Problem
Used to numerically solve a Partial Differential Equation (PDE) on a square mesh -- below is Poisson's Equation
Method:
  • Update each mesh point by the average of its neighbors
  • Repeat until converged
(Figure: the solution u plotted over the (x,y) square, together with the right hand side f(x,y))

Foil 73 Jacobi Iteration: MPI Program Design
Partitioning is simple
  • Every point is a micro-task
Communication is simple
  • 4 nearest neighbors in Cartesian mesh
  • Reduction for convergence test
Agglomeration works along dimensions
  • 1-D packing for high-latency machines (as minimizes number of messages)
  • 2-D packing for others (most general as minimizes information sent)
  • One process per processor practically required

Foil 74 Jacobi Iteration: MPI Program Design
Mapping: Cartesian grid directly supported by MPI virtual topologies
For generality, write as the 2-D version
  • Create a 1 x P (or P x 1) grid for the 1-D version
Adjust array bounds, iterate over local array
  • For convenience, include shadow region to hold communicated values (not iterated over)
(Figure: the local array indexed 0:nxlocal by 0:nylocal, holding its part of the nx by ny points in an npx by npy decomposition)

Foil 75 Jacobi Iteration: Fortran MPI Program
PARAMETER(nxblock=nx/nxp, nyblock=ny/nyp, nxlocal=nxblock+1, nylocal=nyblock+1)
REAL u(0:nxlocal,0:nylocal),
unew(0:nxlocal,0:nylocal),
f(0:nxlocal,0:nylocal)
INTEGER dims(1:2), coords(1:2)
LOGICAL periods(1:2), reorder
dims(1) = nxp; dims(2) = nyp;
periods(1) = .false.;
periods(2) = .false.
reorder = .true.
ndim = 2

Foil 76 Jacobi Iteration: create topology
call MPI_CART_CREATE(MPI_COMM_WORLD,
ndim,dims,periods,
reorder,comm2d,ierr)
CALL MPI_COMM_RANK( comm2d, myrank, ierr )
CALL MPI_CART_COORDS( comm2d, myrank, 2, coords, ierr )
CALL MPI_CART_SHIFT( comm2d,
2, 1, nbrleft, nbrright, ierr )
CALL MPI_CART_SHIFT( comm2d,
1, 1, nbrbottom, nbrtop, ierr )

Foil 77 Jacobi iteration: data structures
CALL MPI_TYPE_VECTOR( nyblock, 1, nxlocal+1, MPI_REAL, rowtype, ierr)
CALL MPI_TYPE_COMMIT( rowtype, ierr )
dx = 1.0/nx; dy = 1.0/ny; err = tol * 1.0e6
DO j = 0, nylocal
  • DO i = 0, nxlocal
    • f(i,j) = -2*(dx*(i+coords(2)*nxblock))**2
    • + 2*dx*(i+coords(2)*nxblock)
    • u(i,j) = 0.0
    • unew(i,j) = 0.0
  • END DO
END DO

Foil 78 Jacobi Iteration: send guard values
DO WHILE (err > tol)
  ! status is an integer array of size MPI_STATUS_SIZE, needed by the receives;
  ! these blocking sends rely on system buffering -- MPI_SENDRECV would avoid a possible deadlock
  CALL MPI_SEND(u(1,1), nxblock, MPI_REAL, nbrleft, 0, comm2d, ierr)
  CALL MPI_RECV(u(1,nylocal), nxblock, MPI_REAL, nbrright, 0, comm2d, status, ierr)
  CALL MPI_SEND(u(1,nylocal-1), nxblock, MPI_REAL, nbrright, 1, comm2d, ierr)
  CALL MPI_RECV(u(1,0), nxblock, MPI_REAL, nbrleft, 1, comm2d, status, ierr)
  CALL MPI_SEND(u(1,1), 1, rowtype, nbrtop, 2, comm2d, ierr)
  CALL MPI_RECV(u(nxlocal,1), 1, rowtype, nbrbottom, 2, comm2d, status, ierr)
  CALL MPI_SEND(u(nxlocal-1,1), 1, rowtype, nbrbottom, 3, comm2d, ierr)
  CALL MPI_RECV(u(0,1), 1, rowtype, nbrtop, 3, comm2d, status, ierr)

Foil 79 Jacobi Iteration: update and error
myerr = 0.0
DO j=1, nylocal-1
  • DO i = 1, nxlocal-1
    • unew(i,j) = (u(i-1,j)+u(i+1,j)
    • +u(i,j-1)+u(i,j+1)+f(i,j))/4
    • myerr = max(myerr, ABS(unew(i,j)-u(i,j)))
  • END DO
END DO
CALL MPI_ALLREDUCE(myerr,err,
1,MPI_REAL,MPI_MAX,MPI_COMM_WORLD,ierr)
DO j=1, nylocal-1
  • DO i = 1, nxlocal-1
    • u(i,j) = unew(i,j)
  • END DO
END DO
END DO

Foil 80 The MPI Timer
The elapsed (wall-clock) time between two points in an MPI program can be computed using MPI_Wtime:
  • double t1, t2;
  • t1 = MPI_Wtime ( );
  • . . .
  • t2 = MPI_Wtime ( );
  • printf ("Elapsed time is %f \n", t2-t1 );
The times are local; the attribute MPI_WTIME_IS_GLOBAL may be used to determine if the times are also synchronized with each other for all processes in MPI_COMM_WORLD.

Foil 81 MPI-2
The MPI Forum produced a new standard which includes MPI 1.2 -- clarifications and corrections to MPI 1.1
MPI-2 new topics are:
  • process creation and management, including client/server routines
  • one-sided communications (put/get, active messages)
  • extended collective operations
  • external interfaces
  • I/O
  • additional language bindings for C++ and Fortran-90

Foil 82 I/O included in MPI-2
Goal is to provide model for portable file system allowing for optimization of parallel I/O
  • the portable POSIX I/O interface was judged not to allow enough optimization
Parallel I/O system provides high-level interface supporting transfers of global data structures between process memories and files.
Significant optimizations required include:
  • grouping, collective buffering, and disk-directed I/O
Other optimizations also achieved by
  • asynchronous I/O, strided accesses and control over physical file layout on disks.
I/O access modes defined by data partitioning expressed with derived datatypes

© Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu
