Jikes Compiler (version 0.47.1) Documentation (version 19990611)


This project documents my understanding of the internals of the Jikes Compiler. For copyright information, see the copyright notice.

To that end, we will walk the compilation of the world's most widely implemented program through the actual compiler execution, following its path through the source tree. The program is:

   // 19990603000 tpd created

   public class Hello
   {
     public static void main(String[] args)
     { System.out.println("Hello, World");
     }
   }

  
First, some comments about the program. I have some automated tools to handle source code, so this is in my standard format. The first line has the change date encoded as YYYYMMDDnnn, where nnn is a simple increment for any changes made that day. This is followed by my initials and a comment.
Next we have a standard class, except that I violate the Java "standard" format. My style puts opening braces on their own line, indented 2 characters. The standard format puts opening braces at the end of the line, with 4-character indents. This is religion, and my pretty printer does it my way.
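
The YYYYMMDDnnn change stamp on the first line of the program is easy to take apart mechanically. The following is a minimal sketch, not one of my actual tools: the names ChangeStamp and ParseChangeStamp are hypothetical, and it only checks that the stamp is 11 decimal digits (it does not validate the calendar date).

```cpp
#include <cctype>
#include <string>

// Hypothetical holder for the fields of a YYYYMMDDnnn change stamp.
struct ChangeStamp
{
  int year;
  int month;
  int day;
  int sequence;  // nnn: increment for changes made on the same day
};

// Split an 11-digit stamp such as "19990603000" into its fields.
// Returns false (leaving 'out' untouched) unless the text is exactly
// 11 decimal digits. No calendar validation is attempted.
bool ParseChangeStamp(const std::string &text, ChangeStamp &out)
{
  if (text.size() != 11) return false;
  for (unsigned char c : text)
    if (!std::isdigit(c)) return false;

  out.year     = std::stoi(text.substr(0, 4));
  out.month    = std::stoi(text.substr(4, 2));
  out.day      = std::stoi(text.substr(6, 2));
  out.sequence = std::stoi(text.substr(8, 3));
  return true;
}
```

Called on the stamp from the Hello program, this yields year 1999, month 6, day 3 and sequence 0.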

The Trail


The entry point of the compiler is main.
   main sets up an out-of-memory handler,
   turns off some potential floating point exceptions for some platforms,
   stores the command line (and file redirect) arguments,
   looks for options and sets flag variables,
   then constructs a Control class to process the files,
   and finally gets the return code and exits.
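
The sequence of steps in main can be sketched as follows. This is an illustration of the order of operations only, not the Jikes source: the three structs are hypothetical stand-ins that share names with the real classes, CompilerMain and OutOfMemory are invented names, and the return code convention (0 when at least one file was given) is an assumption made so the sketch has something to return.

```cpp
#include <cstdio>
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

// Hypothetical stand-in: just copies the arguments after the program name.
struct ArgumentExpander
{
  std::vector<std::string> arguments;
  ArgumentExpander(int argc, char *argv[])
  {
    for (int i = 1; i < argc; i++) arguments.push_back(argv[i]);
  }
};

// Hypothetical stand-in: treats every argument as a file name.
struct Option
{
  std::vector<std::string> files;
  explicit Option(const ArgumentExpander &expander)
    : files(expander.arguments) {}
};

// Hypothetical stand-in: "processes" the files and records a return code.
struct Control
{
  int return_code;
  explicit Control(const Option &option)
    : return_code(option.files.empty() ? 1 : 0) {}  // assumed convention
};

// Called by operator new when memory is exhausted.
void OutOfMemory()
{
  std::fputs("out of memory\n", stderr);
  std::exit(1);
}

// Stand-in for main, following the steps listed above.
int CompilerMain(int argc, char *argv[])
{
  std::set_new_handler(OutOfMemory);       // 1. out-of-memory handler
  // 2. platform-specific floating point exception masking would go here
  ArgumentExpander arguments(argc, argv);  // 3. store the arguments
  Option option(arguments);                // 4. look for options
  Control control(option);                 // 5. process the files
  return control.return_code;              // 6. hand back the return code
}
```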

   Command line arguments are stored in an ArgumentExpander class.
   This class takes care of the file redirection for reading command line
   arguments from a file.

   Option setting is done by the Option class.
   It checks all of the flags on the command line and sets public boolean
   variables in the class for all to reference.
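
The idea of a storage class with public booleans looks roughly like this. It is a sketch under stated assumptions, not the real Option class: the member names are hypothetical, the two flag spellings are used only as examples (option.cpp is the authority on what Jikes accepts), and the constructor takes a plain vector of strings rather than an ArgumentExpander.

```cpp
#include <string>
#include <vector>

// Sketch of an option storage class. Every member is public so the rest
// of the compiler can simply read the flags.
class Option
{
public:
  bool debug;                             // hypothetical flag variable
  bool verbose;                           // hypothetical flag variable
  std::vector<std::string> files;         // arguments that are not flags
  std::vector<std::string> unrecognized;  // flags this sketch does not know

  explicit Option(const std::vector<std::string> &arguments)
    : debug(false), verbose(false)
  {
    for (const std::string &arg : arguments)
    {
      if (arg == "-g") debug = true;               // example spelling
      else if (arg == "-verbose") verbose = true;  // example spelling
      else if (!arg.empty() && arg[0] == '-') unrecognized.push_back(arg);
      else files.push_back(arg);
    }
  }
};
```

Collecting unrecognized flags instead of failing immediately leaves room for reporting all bad options at once.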
  

Data structures


It is important to understand the internal data structures. Important classes are described here.

All of the files.


access.h
ast.cpp
ast.h
body.cpp

bool.h
   This file contains the typedef bool.
   bool is a typedef that works around early C++ compilers, which did not
   provide a built-in bool type.
   It is probably no longer useful.
  
bytecode.cpp
bytecode.h
case.cpp
case.h
class.h
code.cpp
code.h
config.cpp
config.h
control.cpp
control.h
decl.cpp
definite.cpp
depend.cpp
depend.h
diagnose.cpp
diagnose.h
double.cpp
double.h
dump.cpp
error.cpp
error.h
expr.cpp
getclass.cpp
getclass.h
incrmnt.cpp
index.html
init.cpp
javaact.cpp
javaact.h
javadcl.h
javadef.h
javag.html
javaprs.h
javasym.h
jikes.cpp
long.cpp
long.h
lookup.cpp
lookup.h
lpginput.cpp
lpginput.h
modifier.cpp
mymake.html
op.cpp
op.h

option.cpp
option.h
  These files contain 4 classes:
    ArgumentExpander
      which reads the command line arguments and stores them.
      It handles @file redirection so arguments can be in files.
    KeywordMap
      simple struct class that holds keyword information.
    OptionError
      simple class to store bad option information.
    Option
      storage class that reads the command line arguments from an
      ArgumentExpander class and sets class-public booleans.
  

parser.cpp
parser.h
scanner.cpp
scanner.h
semantic.h
set.cpp
set.h
spell.h
stream.cpp
stream.h
symbol.cpp
symbol.h
system.cpp

tab.cpp
tab.h
   These files contain the class Tab.
   Tab is a simple class to hold the number of tabs used during output.
  
table.h
tuple.h
unicode.h
unzip.cpp
unzip.h
zip.cpp
zip.h

Jikes extension projects


An interesting extension would be to make Jikes a JNI method to dynamically compile a file or a byte stream. The resulting class file and all internal data structures could be returned to the caller for review. This would be a useful way to build things like dynamic XML or HTML.

Documentation Alternatives


There are several thoughts about the best way to document the compiler. We use a highly decorated source tree, which means we modify the original source code to add hypertext links.
An alternative is to use patch files and construct diffs from the original source tree. Then when changes to the compiler are received we only have to reapply the patches. This would be easier in the long run.
Another alternative is to actually parse the code and decorate symbols with HTML on the fly. Then we could cross-reference the symbols dynamically. We do not have a C++ parser available, but this is a very interesting idea for Java: modify Jikes to generate HTML-decorated Java with automatic cross-references. This is possible once we understand the current source code.
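
To make the decoration idea concrete, here is a minimal sketch of what "decorate symbols with HTML" could produce for one line of source. It does not parse anything: it scans for identifier-shaped words and links the ones found in a supplied set, so it would also link words inside strings and comments. DecorateLine and the "#name" anchor scheme are hypothetical choices for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Return 'line' as HTML, with each identifier found in 'symbols' wrapped
// in a link to an anchor of the same name. The characters <, > and & are
// escaped so the source text displays correctly.
std::string DecorateLine(const std::string &line,
                         const std::set<std::string> &symbols)
{
  std::string html;
  std::size_t i = 0;
  while (i < line.size())
  {
    unsigned char c = line[i];
    if (std::isalpha(c) || c == '_')
    {
      // Collect one identifier-shaped word.
      std::size_t start = i;
      while (i < line.size() &&
             (std::isalnum((unsigned char) line[i]) || line[i] == '_'))
        i++;
      std::string word = line.substr(start, i - start);
      if (symbols.count(word))
        html += "<a href=\"#" + word + "\">" + word + "</a>";
      else
        html += word;
    }
    else
    {
      switch (c)
      {
        case '<': html += "&lt;";  break;
        case '>': html += "&gt;";  break;
        case '&': html += "&amp;"; break;
        default:  html += line[i]; break;
      }
      i++;
    }
  }
  return html;
}
```

A version built into Jikes would take its symbols from the compiler's own symbol table instead of a hand-supplied set, which is what would make the cross-references automatic.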
Another interesting alternative is to generate an application that would dynamically parse and display compiler information for a Java file.