The intermediate representation used in the package is adopted from Sage++; we give a brief introduction here. An abstract syntax tree (AST) and several structure tables constitute the major part of the intermediate representation of a source program.
There are two types of nodes in the syntax tree. One is the bif node, which corresponds roughly to a statement in the source program; the other is the low-level node, which corresponds to an expression within a statement.
First, we consider only the bif nodes and their relations in the AST, which are based on the program's control constructs. The root is called the global node, and each node has potentially two distinct children, its control children: the true branch and the false branch. The false branch is used mainly in purely conditional constructs such as if-then-else; most bif nodes have an empty false branch.
Each branch may represent a list of statements at the same level, as the example below shows.
Among other information, each node contains two important items. One is the id, a unique integer identifier that distinguishes the node from all others. The other is the variant, an integer tag that indicates the nature of the node. Thus, we may use the notation N[tag] to refer to a node, where N is the id and tag is the variant.
To describe the control children, we will use an expression of the form
    ( N[tag] (true-branch) (false-branch) )

The following simple example in HPF illustrates this structure.
    PROGRAM MAIN
    REAL A(100), X
    !HPF$ PROCESSORS P(4)
    !HPF$ DISTRIBUTE A(BLOCK) ONTO P
    FORALL (i=1:100) A(i) = X + 1
    CALL FOO(A)
    END

    SUBROUTINE FOO(X)
    REAL X(100)
    !HPF$ INHERIT X
    IF (X(1).EQ.0) THEN
      X = 1
    ELSE
      X = X + 1
    END IF
    RETURN
    END

The bif nodes for this program are structured as follows:
    (1[GLOBAL]                        ! 2 components in true-branch
      ((2[PROG_HEDR]                  ! 6 components in true-branch
         (3[VAR_DECL]
          4[PROCESSORS_STMT]
          5[DISTRIBUTE_DECL]
          (6[FORALL_STMT]
            (7[ASSIGN_STAT]
             8[CONTROL_END] )
            NULL )
          9[PROC_STAT]
          10[CONTROL_END] )
         NULL )
       (11[PROC_HEDR]
         (12[VAR_DECL]
          13[INHERIT_DECL]
          (14[LOGIF_NODE]             ! both branches are non-empty
            (15[ASSIGN_NODE]
             16[CONTROL_END] )
            (17[ASSIGN_NODE]
             18[CONTROL_END] ) )
          19[RETURN_STAT]
          20[CONTROL_END] )
         NULL ) )
      NULL )

The integer identifier associated with each node is the key to accessing the rest of the node's properties.
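The nested notation can be produced mechanically from a tree of bif nodes. The following is a minimal Python sketch of that idea, not the Sage++ implementation: the class `BifNode` and the function `show` are hypothetical names introduced for illustration, and the variant is stored as a string rather than an integer tag for readability.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BifNode:
    """Hypothetical stand-in for a bif (statement) node."""
    id: int                      # unique integer identifier
    variant: str                 # tag naming the kind of statement
    true_branch: List["BifNode"] = field(default_factory=list)
    false_branch: List["BifNode"] = field(default_factory=list)


def show(node: BifNode) -> str:
    """Render a node as ( N[tag] (true-branch) (false-branch) )."""
    label = f"{node.id}[{node.variant}]"
    if not node.true_branch and not node.false_branch:
        return label             # a statement without control children

    def branch(stmts: List[BifNode]) -> str:
        # An empty branch is written NULL, as in the listing above.
        if not stmts:
            return "NULL"
        return "(" + " ".join(show(s) for s in stmts) + ")"

    return f"({label} {branch(node.true_branch)} {branch(node.false_branch)})"


# The IF construct of subroutine FOO (nodes 14 to 18 of the example).
logif = BifNode(
    14, "LOGIF_NODE",
    true_branch=[BifNode(15, "ASSIGN_NODE"), BifNode(16, "CONTROL_END")],
    false_branch=[BifNode(17, "ASSIGN_NODE"), BifNode(18, "CONTROL_END")],
)

# The FORALL statement of the main program (nodes 6 to 8).
forall = BifNode(
    6, "FORALL_STMT",
    true_branch=[BifNode(7, "ASSIGN_STAT"), BifNode(8, "CONTROL_END")],
)

print(show(logif))
print(show(forall))
```

Keeping each branch as an ordered list reflects the statement that a branch may hold several statements at the same level; only the conditional node fills its false branch.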
Other information contained in a statement node includes:
- the name and line number of the source file containing this statement,
- the identifier of the control parent of this node,
- a symbol reference such as a do loop parameter in Fortran or a subroutine name in a call statement,
- two or three expressions associated with the node (see the following sections for details),
- the control children of the node.
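Since the integer identifier is the key to a node's other properties, one way to picture a statement node is as a record in a table indexed by id. The sketch below assumes exactly that layout; `StmtNode`, `table`, `add`, and `control_path` are hypothetical names, the file name `main.hpf` and the line numbers are made up, and expressions are abbreviated to strings although the real representation stores them as expression nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StmtNode:
    """Hypothetical record for one statement (bif) node."""
    id: int                                 # unique integer identifier
    variant: str                            # tag naming the kind of statement
    file_name: str                          # source file containing the statement
    line: int                               # line number in that file
    parent_id: Optional[int] = None         # id of the control parent
    symbol: Optional[str] = None            # e.g. subroutine name in a call
    # Up to three expression slots; their order here is an assumption.
    exprs: List[Optional[str]] = field(default_factory=lambda: [None] * 3)
    true_branch: List[int] = field(default_factory=list)   # ids of children
    false_branch: List[int] = field(default_factory=list)


table: Dict[int, StmtNode] = {}


def add(node: StmtNode) -> None:
    """Register a node and append it to its parent's true branch."""
    table[node.id] = node
    if node.parent_id is not None:
        table[node.parent_id].true_branch.append(node.id)


def control_path(node_id: int) -> List[int]:
    """Ids from a node up to the global node, following control parents."""
    path: List[int] = []
    cur: Optional[int] = node_id
    while cur is not None:
        path.append(cur)
        cur = table[cur].parent_id
    return path


# A few nodes of the example program (placeholder file name and lines).
add(StmtNode(1, "GLOBAL", "main.hpf", 0))
add(StmtNode(2, "PROG_HEDR", "main.hpf", 1, parent_id=1))
add(StmtNode(6, "FORALL_STMT", "main.hpf", 5, parent_id=2))
add(StmtNode(7, "ASSIGN_STAT", "main.hpf", 5, parent_id=6,
             exprs=["A(i)", "X + 1", None]))
add(StmtNode(9, "PROC_STAT", "main.hpf", 6, parent_id=2, symbol="FOO"))

print(control_path(7))        # [7, 6, 2, 1]
print(table[2].true_branch)   # [6, 9]
print(table[9].symbol)        # FOO
```

Storing child and parent links as ids rather than object references mirrors the role of the identifier as the lookup key, at the cost of one table access per step.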
Among these, the expressions give rise to one level of refinement of the syntax tree. A statement node can have up to three expression nodes associated with it; we also call them low-level expressions.
Here, an expression means a list or an algebraic expression built from variables, constants, or functions. For example, a Fortran assignment statement has a left-hand-side expression and a right-hand-side expression.
Besides an integer id, each expression node has four components. The first is its variant. The second is an optional symbol reference. The third and fourth components are the left and right operand expression nodes. To describe expressions we can use the following notation.
    (N[tag, symbol_id] left_subtree right_subtree)

where N is the id, tag is the variant, and symbol_id identifies the associated symbol table entry, if any. If both left_subtree and right_subtree are empty, we may simply denote the expression node as N[tag, symbol_id].
Thus, the expression X+1 in the above example program takes the form
    (1[ADD_OP] 2[VAR_REF, X] 3[INT_VAL, 1])

All the symbols in the program have a record in the symbol table. During the construction of the program IR, these records are linked to the bif nodes and the low-level nodes described above, so all information about a symbol can be obtained while traversing the parse tree.
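The expression notation and the link to the symbol table can be sketched together. This is an illustration under assumptions, not the Sage++ data layout: `Symbol`, `ExprNode`, `show`, and `symbols_used` are hypothetical names, and the separate `value` field for constants is an assumption made so that `3[INT_VAL, 1]` can be printed as in the text.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass
class Symbol:
    """Hypothetical symbol table record."""
    name: str
    type_name: str


@dataclass
class ExprNode:
    """Hypothetical low-level (expression) node."""
    id: int
    variant: str
    symbol: Optional[Symbol] = None      # optional symbol reference
    value: Optional[int] = None          # assumed slot for constants
    left: Optional["ExprNode"] = None    # left operand
    right: Optional["ExprNode"] = None   # right operand


def show(e: ExprNode) -> str:
    """Render as (N[tag, symbol_id] left_subtree right_subtree)."""
    extra = e.symbol.name if e.symbol is not None else e.value
    label = f"{e.id}[{e.variant}" + (f", {extra}" if extra is not None else "") + "]"
    if e.left is None and e.right is None:
        return label
    kids = [show(c) if c is not None else "NULL" for c in (e.left, e.right)]
    return "(" + " ".join([label] + kids) + ")"


def symbols_used(e: Optional[ExprNode]) -> Iterator[Symbol]:
    """Yield the symbol records reached while walking an expression."""
    if e is None:
        return
    if e.symbol is not None:
        yield e.symbol
    yield from symbols_used(e.left)
    yield from symbols_used(e.right)


# Symbol table with the scalar X declared in the main program.
symtab: Dict[str, Symbol] = {"X": Symbol("X", "REAL")}

# The expression X + 1 from the FORALL statement.
expr = ExprNode(
    1, "ADD_OP",
    left=ExprNode(2, "VAR_REF", symbol=symtab["X"]),
    right=ExprNode(3, "INT_VAL", value=1),
)

print(show(expr))
print([(s.name, s.type_name) for s in symbols_used(expr)])
```

Because each `VAR_REF` node holds a reference to its symbol record, a walk over the expression reaches the declaration information without a separate lookup by name.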
Email contact: zgs@npac.syr.edu