More general versions of MPI_?SEND, with associated inquiry routines to see whether messages have arrived. These let you overlap communication and computation. In general this is not used, even though it is more efficient
-
Also used in more general asynchronous applications -- blocking routines are most natural in loosely synchronous communicate-compute cycles
|
Application Topology routines let you find the ranks of nearest-neighbor processors (North, South, East, West) in Jacobi iteration
|
Packing and Unpacking of data to make single buffers -- derived datatypes are usually a more elegant approach to this
|
Communicators to set up subgroups of processors (remember matrix example) and to set up independent MPI universes as needed to build libraries, so that messages generated by a library do not interfere with those from other libraries or user code
-
Historically (in my work) these WOULD have been useful to distinguish debugging and application messages
|