In MPI, this is a single call |
CALL MPI_ALLREDUCE (MPI_IN_PLACE,TEST,1,MPI_REAL,MPI_MAX,comm,ierr) |
The reduction operation MPI_MAX specifies a global maximum |
The implementation is quite subtle and usually involves a logarithmic tree, so the number of communication steps grows as the logarithm of the number of processors |
A clever extension delivers the maximum to all processors in the "same" time as delivering it to one processor, on hypercubes and other architectures |
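One common way to get the result on every processor in a logarithmic number of steps is a pairwise-exchange ("recursive doubling") pattern along the edges of a hypercube. The sketch below simulates that pattern in plain Python with no MPI involved. It is an illustration under stated assumptions, not a description of what any particular MPI library does internally: the function name `hypercube_allreduce_max` is hypothetical, and the number of ranks is assumed to be a power of two.

```python
def hypercube_allreduce_max(values):
    """Simulate a max-allreduce by pairwise exchange on a hypercube.

    values[r] is the local value held by rank r.
    Assumption: len(values) is a power of two (a full hypercube).
    Returns (list of final values per rank, number of exchange steps).
    """
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("number of ranks must be a power of two")

    current = list(values)
    steps = 0
    bit = 1
    while bit < p:
        # Each rank swaps its partial maximum with the neighbour whose
        # rank number differs in exactly this one bit (one hypercube edge),
        # and both keep the larger of the two values.
        current = [max(current[r], current[r ^ bit]) for r in range(p)]
        bit <<= 1
        steps += 1
    return current, steps


if __name__ == "__main__":
    local = [3.0, 7.5, 1.25, 9.0, 4.0, 2.0, 8.0, 6.0]
    result, steps = hypercube_allreduce_max(local)
    print(result, steps)
```

With 8 simulated ranks the loop runs 3 times (log2 of 8), and every rank ends up holding the global maximum. That is the same number of steps a tree reduction needs to deliver the maximum to a single rank.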