GA Operations

GA_INITIALIZE

      subroutine ga_initialize()

Allocate and initialize internal data structures for Global Arrays.

This is a collective operation.

GA_INITIALIZE_LTD

      subroutine ga_initialize_ltd(limit)
      integer limit  - amount of memory in bytes per process [input]

Allocate and initialize internal data structures for global arrays and set limit for memory used in global arrays.

limit < 0 means "allow unlimited memory usage"

This is a collective operation.

GA_CREATE

      logical function ga_create(type, dim1, dim2, array_name, chunk1, chunk2, g_a)

      integer type, dim1, dim2, chunk1, chunk2, g_a
      character*(*) array_name

      array_name        - a unique character string [input]
      type              - MA type [input]
      dim1/2            - array(dim1,dim2) as in FORTRAN [input]
      chunk1/2          - minimum size that dimensions should
                          be chunked up into [input]
      g_a               - integer handle for future references [output]

Creates an array.

Setting chunk1=dim1 gives distribution by vertical strips (chunk2*columns); setting chunk2=dim2 gives distribution by horizontal strips (chunk1*rows). Actual chunks will be modified so that they are at least the size of the minimum and each process has either zero or one chunk. Specifying chunk1/2 as <= 1 will cause that dimension to be distributed evenly.

Note: current distribution algorithm is primitive.

This is a collective operation.

GA_CREATE_IRREG

      logical function ga_create_irreg(type, dim1, dim2, array_name, map1, nblock1, map2, nblock2, g_a)

      integer type, map1(*), map2(*), nblock1, nblock2, g_a
      character*(*) array_name

       array_name    - a unique character string [input]
       type          - MA type [input]
       dim1/2        - array(dim1,dim2) as in FORTRAN [input]
       nblock1       - no. of blocks first dimension is divided into [input]
       nblock2       - no. of blocks second dimension is divided into [input]
       map1          - ilo for each block [input]
       map2          - jlo for each block [input]
       g_a           - integer handle for future references [output]

Creates an array by following the user-specified distribution.

This is a collective operation.

GA_DUPLICATE

      logical function ga_duplicate(g_a, g_b, array_name)
      integer         g_a, g_b;
      character*(*)   array_name;

      array_name    - a character string                 [input]
      g_a           - Integer handle for reference array [input]
      g_b           - Integer handle for new array       [output]

Creates a new array by applying all the properties of another existing array.

This is a collective operation.

GA_DESTROY

      logical function ga_destroy(g_a)
  
      integer g_a     [input]

Deallocates the array and frees any associated resources.

This is a collective operation.

GA_TERMINATE

      subroutine ga_terminate()

Delete all active arrays and destroy internal data structures.

This is a collective operation.

GA_SYNC

      subroutine ga_sync()

Synchronize processes (a barrier) and ensure that all global array operations are complete (or at least appear complete).

This is a collective operation.

GA_ZERO

      subroutine ga_zero(g_a)

      integer g_a                         [input]

Sets value of all elements in the array to zero.

This is a collective operation.

GA_DDOT

      double precision function ga_ddot(g_a, g_b)

      integer g_a, g_b                    [input]

Computes the element-wise dot product of the two arrays which must be double precision, the same shape and identically aligned:

         ga_ddot = SUM_ij a(i,j)*b(i,j)

This is a collective operation.

GA_ZDOT

      double complex function ga_zdot(g_a, g_b)

      integer g_a, g_b                    [input]

Computes the element-wise dot product of the two arrays which must be double complex, the same shape and identically aligned:

         ga_zdot = SUM_ij a(i,j)*b(i,j)

This is a collective operation.

GA_DDOT_PATCH

      double precision function ga_ddot_patch(g_a, ta, ailo, aihi, ajlo, ajhi,
                                              g_b, tb, bilo, bihi, bjlo, bjhi) 

      integer     g_a, g_b                   [input]
      integer     ailo, aihi, ajlo, ajhi     [input] g_a patch coordinates
      integer     bilo, bihi, bjlo, bjhi     [input] g_b patch coordinates
      character*1 ta, tb                     [input] transpose flags

Computes the element-wise dot product of the two (possibly transposed) patches which must be double precision and have the same number of elements.

This is a collective operation.

GA_ZDOT_PATCH

      double complex function ga_zdot_patch(g_a, ta, ailo, aihi, ajlo, ajhi,
                                            g_b, tb, bilo, bihi, bjlo, bjhi)

      integer     g_a, g_b                   [input]
      integer     ailo, aihi, ajlo, ajhi     [input] g_a patch coordinates
      integer     bilo, bihi, bjlo, bjhi     [input] g_b patch coordinates
      character*1 ta, tb                     [input] transpose flags

Computes the element-wise dot product of the two (possibly transposed) patches which must be double complex and have the same number of elements.

This is a collective operation.

GA_MATMUL_PATCH

     subroutine ga_matmul_patch(transa, transb, alpha, beta,
                                 g_a, ailo, aihi, ajlo, ajhi,
                                 g_b, bilo, bihi, bjlo, bjhi,
                                 g_c, cilo, cihi, cjlo, cjhi)

     integer g_a, ailo, aihi, ajlo, ajhi    [input]  patch of g_a
     integer g_b, bilo, bihi, bjlo, bjhi    [input]  patch of g_b
     integer g_c, cilo, cihi, cjlo, cjhi    [input]  patch of g_c 
     double precision/complex alpha, beta   [input]
     character*1    transa, transb          [input]

ga_matmul_patch is a patch version of ga_dgemm:

         C[cilo:cihi,cjlo:cjhi] := alpha* AA[ailo:aihi,ajlo:ajhi] *
                                   BB[bilo:bihi,bjlo:bjhi] ) + beta*C[cilo:cihi,cjlo:cjhi],

where AA = op(A), BB = op(B), and op( X ) is one of

         op( X ) = X   or   op( X ) = X',

Valid values for transpose arguments: 'n', 'N', 't', 'T'. It works for both double precision and double complex data tape.

This is a collective operation.

GA_SCALE

      subroutine ga_scale(g_a, s)
 
      integer g_a                         [input]
      double precision/complex/integer s  [input]

Scales an array by the constant s.

This is a collective operation.

GA_SCALE_PATCH

      subroutine ga_scale_patch(g_a, ailo, aihi, ajlo, ajhi, s)

      integer g_a                         [input]
      double precision/complex/integer s  [input]
      integer  ailo, aihi, ajlo, ajhi     [input] g_a patch coordinates

Scales a patch of an array by the constant s.

This is a collective operation.

GA_ADD

      subroutine ga_add(alpha, g_a, beta, g_b, g_c)

      integer g_a, g_b, g_c                          [input]
      double precision/complex/integer alpha, beta   [input]

The arrays (which must be the same shape and identically aligned) are added together element-wise

           c = alpha * a  +  beta * b.

The result (c) may replace one of the input arrays (a/b).

This is a collective operation.

GA_ADD_PATCH

      subroutine ga_add_patch (alpha, g_a, ailo, aihi, ajlo, ajhi,
                               beta,  g_b, bilo, bihi, bjlo, bjhi,
                                      g_c, cilo, cihi, cjlo, cjhi)

      integer g_a, g_b, g_c                        [input]
      double precision/complex/integer alpha, beta [input]
      integer  ailo, aihi, ajlo, ajhi              [input] g_a patch coordinates
      integer  bilo, bihi, bjlo, bjhi              [input] g_b patch coordinates
      integer  cilo, cihi, cjlo, cjhi              [input] g_c patch coordinates

Patches of arrays (which must have the same number of elements) are added together element-wise.

This is a collective operation.

GA_FILL_PATCH

      subroutine ga_fill_patch(g_a, ailo, aihi, ajlo, ajhi, s)

      integer g_a                         [input]
      double precision/complex/integer s  [input]
      integer  ailo, aihi, ajlo, ajhi     [input] g_a patch coordinates

Fills the patch of an array with value s. Type of s and array must match.

This is a collective operation.

GA_SUMMARIZE

      subroutine ga_summarize(verbose)
      logical verbose                     [input]! If true print 
                                                   distribution info

Prints info about allocated arrays.

GA_SYMMETRIZE

      subroutine ga_symmetrize(g_a)

      integer g_a                         [input/output]

Symmetrizes matrix A represented with handle g_a: A:= .5 * (A+A').

This is a collective operation.

GA_TRANSPOSE

      subroutine ga_transpose(g_a, g_b)

      integer g_a                         [input]  ! remains unchanged
      integer g_b                         [output]

Transposes a matrix: B = A', where A and B are represented by handles g_a and g_b.

This is a collective operation.

GA_DIAG

      subroutine ga_diag(g_a, g_s, g_v, eval)

      integer g_a               [input]   ! Matrix to diagonalize
      integer g_s               [input]   ! Metric
      integer g_v               [output]  ! Global matrix to return evecs
      double precision eval(*)  [output]  ! Local array to return evals

Solve the generalized eigen-value problem returning all eigen-vectors and values in ascending order. The input matrices are not overwritten or destroyed.

This is a collective operation.

GA_DIAG_REUSE

      subroutine ga_diag_reuse(control, g_a, g_s, g_v, eval)

      integer control           [input]   ! Control flag 
      integer g_a               [input]   ! Matrix to diagonalize
      integer g_s               [input]   ! Metric
      integer g_v               [output]  ! Global matrix to return evecs
      double precision eval(*)  [output]  ! Local array to return evals

Solve the generalized eigen-value problem returning all eigen-vectors and values in ascending order. Recommended for REPEATED calls if g_s is unchanged. Values of the control flag:

             value       action/purpose
   
              0          indicates first call to the eigensolver
             >0          consecutive calls (reuses factored g_s)
             <0          only erases factorized g_s; g_v and eval unchanged
                         (should be called after previous use if another 
                         eigenproblem, i.e., different g_a and g_s, is to 
                         be solved)

The input matrices are not destroyed.

This is a collective operation.

GA_DIAG_STD

      subroutine ga_diag_std(g_a, g_v, eval)

      integer g_a               [input]   ! Matrix to diagonalize
      integer g_v               [output]  ! Global matrix to return evecs
      double precision eval(*)  [output]  ! Local array to return evals

Solve the standard (non-generalized) eigenvalue problem returning all eigenvectors and values in the ascending order. The input matrix is neither overwritten nor destroyed.

This is a collective operation.

GA_LU_SOLVE

      subroutine ga_lu_solve(trans, g_a, g_b)

      character trans           [input]          ! transpose or not transpose 
      integer g_a               [input]          ! coefficient matrix
      integer g_b               [output/output]  ! rhs/solution matrix

Solve the system of linear equations op(A)X = B based on the LU factorization.

op(A) = A or A' depending on the parameter trans:

trans = 'N' or 'n' means that the transpose operator should not be applied.

trans = 'T' or 't' means that the transpose operator should be applied.

Matrix A is a general real matrix. Matrix B contains possibly multiple rhs vectors. The array associated with the handle g_b is overwritten by the solution matrix X.

This is a collective operation.

GA_CHOLESKY

      integer function ga_cholesky(uplo, g_a)
      character uplo            [input]           ! lower/upper triangle
      integer g_a               [input/output]    ! matrix

Computes Cholesky factorization for an SPD matrix. The result overwrites the original matrix. The input flag uplo ('L' or 'U') determines if lower or upper triangular matrix is created.

It returns

            = 0 : successful exit
            > 0 : the leading minor of this order
                  is not positive definite and the
                  factorization could not be completed

This is a collective operation.

GA_SPD_INVERT

      integer function ga_spd_invert(g_a)
      integer g_a               [input/output]    ! matrix

It computes the inverse of a double precision using the Cholesky factorization of a NxN double precision symmetric positive definite matrix A stored in the global array represented by g_a. On successful exit, A will contain the inverse.

It returns

            = 0 : successful exit
            > 0 : the leading minor of this order is not positive
                  definite and the factorization could not be completed
            < 0 : it returns the index i of the (i,i)
                  element of the factor L/U that is zero and
                  the inverse could not be computed

This is a collective operation.

GA_LLT_SOLVE

      integer function ga_llt_solve(g_a, g_b)
      integer g_a               [input]          ! coefficient matrix
      integer g_b               [output/output]  ! rhs/solution matrix

Solves a system of linear equations

               A * X = B

using the Cholesky factorization of an NxN double precision symmetric positive definite matrix A (epresented by handle g_a). On successful exit B will contain the solution X.

It returns:

            = 0 : successful exit
            > 0 : the leading minor of this order is not positive
                  definite and the factorization could
                  not be completed

This is a collective operation.

GA_SOLVE

      integer function ga_solve(g_a, g_b)
      integer g_a               [input]          ! coefficient matrix
      integer g_b               [output/output]  ! rhs/solution matrix

Solves a system of linear equations

               A * X = B

It first will call the Cholesky factorization routine and, if sucessfully, will solve the system with the Cholesky solver. If Cholesky will be not be able to factorize A, then it will call the LU factorization routine and will solve the system with forward/backward substitution. On exit B will contain the solution X.

It returns

            = 0 : Cholesky factoriztion was succesful
            > 0 : the leading minor of this order
                  is not positive definite, Cholesky factorization
                  could not be completed and LU factoriztion was used

This is a collective operation.

GA_DGEMM

      subroutine ga_dgemm(transa, transb, m, n, k, alpha, g_a, g_b, beta, g_c )
      Character*1        transa, transb        [input]
      Integer            m, n, k               [input]
      Double Precision   alpha, beta           [input]
      Integer            g_a, g_b,             [input]
      Integer            g_c                   [output]

Performs one of the matrix-matrix operations:

         C := alpha*op( A )*op( B ) + beta*C,

where op( X ) is one of

         op( X ) = X   or   op( X ) = X',

alpha and beta are scalars, and A, B and C are matrices, with op( A ) an m by k matrix, op( B ) a k by n matrix and C an m by n matrix.

On entry, transa specifies the form of op( A ) to be used in the matrix multiplication as follows:

               transa = 'N' or 'n',  op( A ) = A.
               transa = 'T' or 't',  op( A ) = A'.
 
      m      - On entry,  m  specifies  the number  of rows  of the  matrix
               op( A )  and of the  matrix  C.  m  must  be at least  zero.
      n      - On entry,  n  specifies the number  of columns of the matrix
               op( B ) and the number of columns of the matrix C. n must be
               at least zero.
      k      - On entry,  k  specifies  the number of columns of the matrix
               op( A ) and the number of rows of the matrix op( B ). K must
               be at least  zero.

This is a collective operation.

GA_COPY

      subroutine ga_copy(g_a, g_b)
 
      integer g_a, g_b                    [input]

Copies elements in array represented by g_a into the array represented by g_b. The arrays must be the same type, shape, and identically aligned.

This is a collective operation.

GA_COPY_PATCH

      subroutine ga_copy_patch(trans, g_a, ailo, aihi, ajlo, ajhi,
                                  g_b, bilo, bihi, bjlo, bjhi)
 
      character trans                     [input] transpose operator
      integer g_a, g_b                    [input]
      integer ailo, aihi, ajlo, ajhi      [input] g_a patch coordinates
      integer bilo, bihi, bjlo, bjhi      [input] g_b patch coordinates

Copies elements in a patch of one array into another one. The patches of arrays must have the same number of elements.

trans = 'N' or 'n' means that the transpose operator should not be applied.

trans = 'T' or 't' means that transpose operator should be applied.

This is a collective operation.

GA_GET

      subroutine ga_get(g_a, ilo, ihi, jlo, jhi, buf, ld)

      integer g_a                          [input]
      integer ilo, ihi, jlo, jhi           [input]
      double precision/complex/integer buf [output]
      integer ld                           [input]

Performs the equivalent of the following Fortran-90.

      dimension A(1:dim1,1:dim2)
      dimension buf(1:ld, 1:*)

      buf(1:ihi-ilo+1, 1:jhi-jlo+1) = A(ilo:ihi, jlo:jhi)

where g_a represents a handle to the array A.

Array bounds are always checked.

GA_PUT

      subroutine ga_put(g_a, ilo, ihi, jlo, jhi, buf, ld)

      integer g_a                          [input]
      integer ilo, ihi, jlo, jhi           [input]
      double precision/complex/integer buf [output]
      integer ld                           [input]

Performs the equivalent of the following Fortran-90:

      dimension g_a(1:dim1,1:dim2)
      dimension buf(1:ld, 1:*)

      g_a(ilo:ihi, jlo:jhi) = buf(1:ihi-ilo+1, 1:jhi-jlo+1)

where g_a represents a handle to the array A.

Array bounds are always checked.

GA_ACC

      subroutine ga_acc(g_a, ilo, ihi, jlo, jhi, buf, ld, alpha)

      integer g_a                         [input]
      integer ilo, ihi, jlo, jhi          [input]
      double precision/complex buf        [input]
      integer ld                          [input]
      double precision/complex alpha      [input]

Performs the equivalent of the following Fortran-90. Array bounds are always checked. Double precision/complex types are supported. The operation is atomic.

      dimension g_a(1:dim1,1:dim2)
      dimension buf(1:ld, 1:*)

      A(ilo:ihi, jlo:jhi) = A(ilo:ihi, jlo:jhi) + 
                              alpha*buf(1:ihi-ilo+1, 1:jhi-jlo+1)

where g_a represents a handle to the array A.

GA_DISTRIBUTION

      subroutine ga_distribution(g_a, iproc, ilo, ihi, jlo, jhi)

      integer g_a, iproc                  [input]
      integer ilo, ihi, jlo, jhi          [output]
 
       g_a             - array handle [input]
       iproc           - process number [input]
       ilo/ihi/jlo/jhi - range held by process iproc [output]

If no array elements are owned by process iproc, the range is returned as [0, -1] for the i and [0,-1] for the j dimensions.

GA_COMPARE_DISTR

      logical function ga_compare_distr(g_a, g_b)

      integer g_a, g_b                   [input]

Compares distributions of two global arrays. Returns .TRUE. if distributions are identical.

This is a collective operation.

GA_ACCESS

      subroutine ga_access(g_a, ilo, ihi, jlo, jhi, index, ld)

      integer g_a                         [input]
      integer ilo, ihi, jlo, jhi          [input]
      integer index                       [output]
      integer ld                          [output]

Provides access to the specified patch of array. Returns leading dimension ld and and MA-like index for the data. This routine is intended for writing new GA operations. Call to ga_access should normally follow a call to ga_distribution that returns coordinates of the patch associated with a processor. You need to make sure that the coordinates of the patch are valid (test values returned from ga_distribution).

Your code should include a MA include file, mafdecls.h. The addressing convention:

       
          dbl_mb(index)  - for double precision data
          int_mb(index)  - for integer data

          dcpl_mb(index) - for double complex data

provides access to the first element (ilo,jlo) of the patch.

For a given subroutine:

          subroutine foo(A,  nrows, ncols lda)
          double precision A(lda,*)
          integer nrows, ncols
             ....
          end

you can reference A(ilo:ihi,jlo:jhi) in the following way:

          call foo(dbl_mb(index), ihi-ilo+1, jhi-jlo+1, lda)

NOTE: You have to worry about mutual exclusion in simultaneous overlapping access to the data by multiple processors (critical sections are not protected).

Each call to ga_access has to be followed by a call to either ga_release or ga_release_update. You can access only local data.

GA_RELEASE

      subroutine ga_release(g_a, ilo, ihi, jlo, jhi)

      integer g_a                         [input]
      integer ilo, ihi, jlo, jhi          [input]

Releases access to a global array when the data was read only.

Your code should look like:

           call ga_distribution(g_a, myproc, ilo,ihi,jlo,jhi)
           call ga_access(g_a, ilo, ihi, jlo, jhi, index, ld)
               
                < operate on the data >

           call ga_release(g_a, ilo, ihi, jlo, jhi)

NOTE: see restrictions specified for ga_access

GA_RELEASE_UPDATE

      subroutine ga_release_update(g_a, ilo, ihi, jlo, jhi)

      integer g_a                         [input]
      integer ilo, ihi, jlo, jhi          [input]

Releases access to the data. It must be used if the data was accessed for writing. NOTE: see restrictions specified for ga_access.

GA_READ_INC

 integer function ga_read_inc(g_a, i, j, inc)

      integer g_a                         [input]
      integer i, j, inc                   [input]

Atomically read and increment an element in an integer array.

      *BEGIN CRITICAL SECTION*
      return value = a(i,j)
      a(i,j) += inc
      *END CRITICAL SECTION*

GA_SCATTER

      subroutine ga_scatter(g_a, v, i, j, n)

      integer g_a                         [input]
      double precision v(n)               [input]
      integer i(n), j(n), n               [input]

Scatters the array elements into the array. The contents of the input arrays (v, i, j) are preserved, but their contents might be (consistently) shuffled on return.

      do k = 1, nv
             a(i(k),j(k)) = v(k)    
      enddo

GA_GATHER

      subroutine ga_gather(g_a, v, i, j, n)

      integer g_a                         [input]
      double precision v(n)               [output]
      integer i(n), j(n), n               [input]

Gathers specified elements (i,j) from the global array into a local single-dimesional array. The contents of the arrays (v, i, j) might be (consistently) shuffled on return.

      do k = 1, nv
           v(k) =  a(i(k),j(k))
      enddo

GA_ERROR

      subroutine ga_error(message, code)
      character*1 message(*)              [input]
      integer     code                    [input]

To be called in case of an error. Print an error message and an integer value that is intended to be the error code. Releases some system resources. This is the recomended way of aborting the program execution.

GA_LOCATE

      logical function ga_locate(g_a, i, j, owner)

      integer g_a, i, j                   [input]
      integer owner                       [output]

Return in owner the GA compute process id that 'owns' the data. If i/j are out of bounds .FALSE. is returned, otherwise .TRUE..

GA_LOCATE_REGION

      logical function ga_locate_region(g_a, ilo, ihi, jlo, jhi, map, np )

      integer g_a, ilo, ihi, jlo, jhi              [input]
      integer map(5,*), np                         [output]

Parts of the specified patch might be actually 'owned' by several (precisely np) processes. Return the list of the GA processes id that 'own' the data. If i/j are out of bounds .FALSE. is returned, otherwise .TRUE..

        map(1,*)    -  ilo
        map(2,*)    -  ihi
        map(3,*)    -  jlo
        map(4,*)    -  jhi

specify coordinates of a subpatch 'held' by the process which number is stored in map(5,*)

GA_INQUIRE

      subroutine ga_inquire(g_a, type, dim1, dim2)

      integer g_a                        [input]
      integer type                       [output]
      integer dim1                       [output]
      integer dim2                       [output]

Returns type and dimensions of array.

GA_INQUIRE_MEMORY

    integer function ga_inquire_memory()

Returns amount of memory (in bytes) used in the allocated global arrays on the calling processor.

GA_INQUIRE_NAME

      subroutine ga_inquire_name(g_a, array_name)

      integer g_a                        [input]
      character*(*) array_name           [output]

Returns the name of an array represented by the handle g_a.

GA_MEMORY_AVAIL

   integer function ga_memory_avail()

Returns amount of memory (in bytes) left for allocation of new global arrays on the calling processor.

Note: If ga_uses_ma returns true, then ga_memory_avail returns the lesser of the amount available under the GA limit and the amount available from MA (according to ma_inquire_avail operation). If no GA limit has been set, it returns what MA says is available.

If ( ! ga_uses_ma() && ! ga_memory_limited() ) returns < 0, indicating that the bound on currently available memory cannot be determined.

GA_USES_MA

      logical function ga_uses_ma()

Returns .true. if memory in global arrays comes from the Memory Allocator (MA). .false. means that memory comes from another source, for example System V shared memory is used.

GA_MEMORY_LIMITED

      logical function ga_memory_limited()

Indicates if limit is set on memory usage in Global Arrays on the calling processor.

GA_PROC_TOPOLOGY

      subroutine ga_proc_topology(g_a, proc, prow, pcol)

      integer g_a                         [input]
      integer proc                        [input] 
      integer prow, pcol                  [output]

Based on the distribution of the array associated with handle g_a, determines block row and column coordinates for the array section 'owned' by specified processor. The numbering starts from 0. The value of -1 means that the processor doesn't 'own' any section of array represented by g_a.

GA_PRINT_PATCH

      subroutine ga_print_patch(g_a,ilo,ihi,jlo,jhi,pretty)
      
      integer g_a                         [input]
      integer ilo,ihi,jlo,jhi             [input] coordinates of the patch
      integer pretty                      [input]

Prints a patch of g_a array to the standard output. If pretty has the value 0 then output is printed in a dense fashion. If pretty has the value 1 then output is formatted and rows/columns labeled.

This is a collective operation.

GA_PRINT

      subroutine ga_print(g_a)
      
      integer g_a                         [input]

Prints an entire array to the standard output.

This is a collective operation.

GA_PRINT_STATS

      subroutine ga_print_stats()

This non-collective (MIMD) operation prints information about:

number of calls to the GA create/duplicate, destroy, get, put, scatter, gather, and read_and_inc operations
total amount of data moved in the GA primitive operations
amount of data moved in GA primitive operations to logicaly remote locations
maximum memory consumption in global arrays, and
number of requests serviced in the interrupt-driven implementations by the calling process.

GA_CHECK_HANDLE

      subroutine ga_check_handle(g_a, string)
      integer g_a                         [input]
      character *(*) string               [input]

Check that the global array handle g_a is valid ... if not call ga_error with the string provided and some more info.

GA_INIT_FENCE

      subroutine ga_init_fence()

Initializes tracing of completion status of data movement operations.

GA_FENCE

      subroutine ga_fence()

Blocks the calling process until all the data transfers corresponding to GA operations called after ga_init_fence complete. For example, since ga_put might return before the data reaches the final destination ga_init_fence and ga_fence allow process to wait until the data is actually moved:

           call ga_init_fence()
           call ga_put(g_a, ...)
           call ga_fence()

ga_fence must be called after ga_init_fence. A barrier, ga_sync, assures completion of all data transfers and implicitly cancels outstanding ga_init_fence. ga_init_fence and ga_fence must be used in pairs, multiple calls to ga_fence require the same number of corresponding ga_init_fence calls. ga_init_fence/ga_fence pairs can be nested.

ga_fence works for multiple GA operations. For example:

           call ga_init_fence()
           call ga_put(g_a, ...)
           call ga_scatter(g_a, ...)
           call ga_put(g_b, ...)
           call ga_fence()

The calling process will be blocked until data movements initiated by two calls to ga_put and one ga_scatter complete.

GA_CREATE_MUTEXES

      logical function ga_create_mutexes(number)

      integer number             [input]

Creates a set containing the number of mutexes. Returns .true. if the opereation succeeded or .false. when failed. Mutex is simple synchronization object used to protect Critical Section. Only one set of mutexes can exist at a time. Mutexes can be created and destroyed as many times as needed.

Mutexes are numbered: 0, ..., number -1.

This is a collective operation.

GA_DESTROY_MUTEXES

       logical function ga_destroy_mutexes()

Destroys the set of mutexes created with ga_create_mutexes. Returns .true. if the operation succeeded or .false. when failed.

This is a collective operation.

GA_LOCK

   subroutine ga_lock(mutex)

   integer mutex      [input]   ! mutex id

Locks a mutex object identified by the mutex number. It is a fatal error for a process to attempt to lock a mutex which was already locked by this process.

GA_UNLOCK

        subroutine ga_unlock(mutex)

        integer mutex [input] ! mutex id

Unlocks a mutex object identified by the mutex number. It is a fatal error for a process to attempt to unlock a mutex which has not been locked by this process.

GA_NODEID

      integer function ga_nodeid()

Returns the GA process id (0, ..., ga_nnodes()-1) of the requesting compute process.

NOTE: the GA process id might not be the same as message-passing node id.

GA_NNODES

   integer function ga_nnodes()

Returns the number of the GA compute (user) processes.

NOTE: the number of the GA processes might not be the same as the number of the message-passing nodes.

GA_BRDCST

      subroutine ga_brdcst(type, buf, lenbuf, root)
      integer type       [input]         ! message type for broadcast
      byte buf(lenbuf)   [input/output]
      integer lenbuf     [input]
      integer root       [input]

Broadcast from process root to all other processes a message of length lenbuf.

This is a convenience operation available regardless of the message-passing library that GA is running with.

This is a collective operation.

GA_DGOP

      subroutine ga_dgop(type, x, n, op)
      integer type             [input]
      double precision x(n)    [input/output]
      character*(*) op         [input]

Double Global OPeration.

X(1:N) is a vector present on each process. DGOP 'sums' elements of X accross all nodes using the commutative operator OP. The result is broadcast to all nodes. Supported operations include '+', '*', 'max', 'min', 'absmax', 'absmin'. The use of lowerecase for operators is necessary.

This is a convenience operation, available regardless of the message-passing library that GA is running with.

This is a collective operation.

GA_IGOP

      subroutine ga_igop(type, x, n, op)
      integer type              [input]
      integer x(n)              [input/output]
      character*(*) OP          [input]

Integer Global OPeration. The integer version of ga_dgop described above, also include the bitwise OR operation.

This is a convenience operation available regardless of the message-passing library that GA is running with.

This is a collective operation.

GA_LIST_NODEID

      subroutine ga_list_nodeid(list, n)
      integer n                 [input]
      integer list(n)           [output]

Returns message-passing process ID (rank in MPI) for the GA processes in the range 0 .. n-1. For MPI, the ranks are in MPI_COMM_WORLD.

GA_MPI_COMMUNICATOR

      void ga_mpi_communicator(GA_COMM)
      MPI_Comm *GA_COMM;       [output]

Available in C only if MPI is used. Returns communicator handle for GA processes (group).