HPF Compiler (HPFC)
Subset of HPF Specification Version 2.0 and Extensions

Although the HPF Compiler's frontEnd fully supports HPF V2.0, the backEnd
implements only a subset of the HPF specification. In addition, several
extensions are added to support out-of-core computation, which involves
processing very large amounts of data.

Features and Extensions
=========================

The compiler has the following features:

1) HPF data mapping directives
   The HPF compilation system handles all kinds of HPF directives defined
   in HPF V1.0, with one constraint on the processors declaration (see
   Constraints).
   Supported directives:
     1. PROCESSORS statement
     2. TEMPLATE statement
     3. ALIGN statement
     4. DISTRIBUTE statement
     5. REALIGN statement
     6. REDISTRIBUTE statement
     7. INHERIT directive (supported for efficient procedure calls)
     8. DYNAMIC directive

2) Forall statement and Where statement support
   Almost all kinds of forall statements are implemented, including mask
   expressions in the forall header and irregular patterns. Where
   statements are translated into equivalent forall statements.

3) Forall construct and Where construct
   Currently, forall constructs and where constructs are translated by
   transforming them into forall statements and where statements.

4) Ghost area is supported.
   By analysing a forall statement, the compiler can decide the size of
   the ghost area and use a more efficient communication pattern (shift).
   The programmer can specify the ghost area size with the SHADOW
   directive in case the compiler fails or is unable to calculate it.
   For example,
       !HPF$ SHADOW A(1:1)
   This feature is very useful for improving the performance of procedure
   calls.

5) Runtime communication detection
   The compiler detects communication at compile time, which requires
   that all the parameters needed are constant. This requirement cannot
   always be met, so runtime communication detection code is generated,
   and the communication pattern is then determined at run time.

6) Compiler and runtime support for OUT-OF-CORE computation
   In out-of-core computation, a global array is divided into local
   arrays, each belonging to one processor. Since the local arrays are
   out of core, they have to be stored in files on disk. The data are
   stored in a Local Array File (LAF). The node program explicitly reads
   from and writes to the file when required. During computation, data
   are read from the LAF into in-core memory, the computation is
   performed on them, and the result is then written back to the LAF.

7) Parallel I/O
   The compiler supports all I/O statements, including format,
   read/write, and other I/O control statements, provided that the
   statement can be compiled by the Fortran compiler. The runtime support
   for parallel I/O is transparent to programmers.
   The compiler also provides runtime support for C-style I/O. In this
   case, only the essential functions are implemented, and formatted
   files are not currently supported.

8) Dynamic allocation and pointers are implemented.
   Allocate and deallocate statements are supported for dynamic memory
   allocation. Pointer assignment and nullify are also implemented. The
   STAT return value is not supported at present.

9) Separate compilation is partly supported in the frontEnd.
   E.g., to compile two HPF programs, f1.hpf and f2.hpf, use the script
   'hpfc':
       hpfc -o f f1.hpf f2.hpf
   This generates the intermediate dep file 'f.dep', which combines the
   AST and ST trees of f1.hpf and f2.hpf into a new program tree. Later
   program translation and node program generation are performed on this
   combined program tree.

10) Depending on the local Fortran compiler, more data types are
    supported, such as BYTE and WORD, which are currently supported by
    most Fortran77 compilers.

11) Independent Do is partly supported.
    The constraints are:
      1. No communication occurs under the owner-computes rule.
      2. All the LHSs have the same computation set.

12) Good portability and support for heterogeneous platform computing
    Because the runtime is based on MPI, portability and heterogeneous
    computing are ensured by MPI.

Extensions include:
In order to support out-of-core computation, several new directives are
added.

1) OUT_OF_CORE directive
   Specifies that an array is an out-of-core array, meaning that the
   array is too large to fit in memory, e.g.:
       !HPF$ OUT_OF_CORE A,B
   which specifies that arrays A and B are out-of-core arrays.

2) IN_CORE_SIZE directive
   Specifies the in-core size, which is the memory that a disk-resident
   out-of-core array can use, e.g.:
       !HPF$ IN_CORE_SIZE (100000)
   Thus every out-of-core array has 100K of in-core memory to use.

3) MEMORY directive
   Specifies the dynamic memory size that the node program can use, e.g.:
       !HPF$ MEMORY (5000000)
   which specifies about 5M of dynamic memory for the node program.
   A lot of temporary memory is needed at run time. Since Fortran77 does
   not support dynamic memory allocation, a Memory Allocator (MA) is used
   to implement it. The dynamic memory must be allocated at the node
   program's initialization time for later use. If this directive is
   absent, the compiler estimates the memory size for the user.

Constraints
=============

As an implementation of a subset of HPF V2.0, the compiler has some
constraints and unsupported features.

1) About PROCESSORS
   The number of logical processors must be the same as the number of
   physical processors used to run the node program. However, to define a
   processors arrangement of unknown size, the intrinsic function
   `number_of_processors()' can be used in the HPF program.
   Another constraint is that only one unique logical processors
   declaration is allowed in an HPF program, i.e., all the logical
   processors arrangements defined in the program must have the same size
   and the same shape.

2) About Forall Statement
   Intrinsic reduction functions cannot appear in a forall statement.

3) About procedure and function calls
   Whenever a procedure or function is called, the interface block of the
   procedure must be declared in scope. If the interface of a procedure
   is absent, all parameters are considered to have the INHERIT
   attribute.

4) Only some of the HPF intrinsic functions are implemented.
   The following transformational intrinsic functions are currently
   supported:
       sum, maxloc, minloc, maxval, minval, count, transpose, matmul,
       dot_product, cshift, any, all, lbound, ubound, size,
       number_of_processors.
   The elemental intrinsic functions supported by Fortran77 are also
   supported.

5) Independent forall is not supported.

6) Some features, such as contains blocks, are supported only if the
   local Fortran compiler is a Fortran90 compiler.

7) Derived types are not implemented.

8) Modules and Block Data are not implemented.

9) Storage and Sequence Association are not supported.
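
Illustration of the out-of-core scheme
   The read/compute/write-back cycle described under the OUT-OF-CORE
   feature can be pictured with the small Python sketch below. It is not
   HPFC code and does not reflect the real LAF format or runtime: the
   file layout (raw double-precision values), the function names
   (create_laf, read_laf, process_out_of_core, demo) and the slab size
   are all assumptions made for illustration only. The slab size plays
   the role that IN_CORE_SIZE plays for an out-of-core array.

```python
import os
import tempfile
from array import array

# Assumption: the local array is stored as raw double-precision values.
ITEM = array("d").itemsize


def create_laf(path, values):
    """Write a local array to a (hypothetical) Local Array File."""
    with open(path, "wb") as f:
        array("d", values).tofile(f)


def read_laf(path):
    """Read the whole file back as a list (only sensible for small demos)."""
    n = os.path.getsize(path) // ITEM
    data = array("d")
    with open(path, "rb") as f:
        data.fromfile(f, n)
    return data.tolist()


def process_out_of_core(path, in_core_elems, compute):
    """Apply compute() to every element, holding at most in_core_elems
    elements in memory at any time."""
    n = os.path.getsize(path) // ITEM
    with open(path, "r+b") as f:
        for start in range(0, n, in_core_elems):
            count = min(in_core_elems, n - start)
            # 1. Read one slab from the file into in-core memory.
            slab = array("d")
            f.seek(start * ITEM)
            slab.fromfile(f, count)
            # 2. Compute on the in-core slab.
            result = array("d", (compute(x) for x in slab))
            # 3. Write the result back to the same position in the file.
            f.seek(start * ITEM)
            result.tofile(f)


def demo():
    """Double a 10-element array using slabs of 4 elements."""
    with tempfile.TemporaryDirectory() as d:
        laf = os.path.join(d, "A.laf")
        create_laf(laf, range(10))
        process_out_of_core(laf, 4, lambda x: 2.0 * x)
        return read_laf(laf)


if __name__ == "__main__":
    print(demo())
```

   In this sketch a smaller slab size means less memory but more file
   operations, which is the trade-off the in-core size controls.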