Paper Number: C440
Title: An Annotation-aware JVM Implementation
Authors: Ana Azevedo, Alex Nicolau, and Joe Hummel

The paper describes a way of attaching annotations to Java class files that describe potential optimizations, and a way of modifying a JIT compiler to produce more efficient native code by taking advantage of those annotations. The annotations contain information that is available at compile time but is expensive to produce at runtime. The authors also provide preliminary information on verifying annotations, and on the performance impact of using the annotations to guide the JIT compiler.

Recommendation: Accept with revisions.

The material seems relevant, and some interesting work has been done. Most of my recommendations below relate to the form of the presentation only, rather than the content. However, there are a few remarks on content (namely, major comments 3, 4, 6, and 10) that need to be addressed for the final version of this paper. There is also an astonishingly large number of English grammar, spelling, and usage errors. The paper should be thoroughly and carefully edited to remove them.

Comments for the author
-----------------------

Major comments:

1. Sections 5 and 6 should be swapped. It seems very strange to have the description of your system separated from performance measurements of your system by related work.

2. Some of your figures are too small to read easily, especially figures 2 and 6.

3. It should be noted that guavac, the compiler on which this project is based, is no longer being developed. Furthermore, development stopped before full JDK 1.1 compatibility was achieved. The upshot of all this is that your project is based on a JDK 1.0.3 compiler with partial JDK 1.1 support, and it is unlikely to ever achieve either full JDK 1.1 support or JDK 1.2 support. In your future work, note whether you plan to support guavac yourselves, or move to another compiler base.

4.
Page 7, lines 1-3: VRA is generated assuming an unbounded number of registers (page 5, last full paragraph). At the very least, one virtual register is used for each local variable (page 6, last paragraph). The JVM specification allows up to 65535 local variables per method (counting longs and doubles as 2 variables). However, this portion of page 7 indicates that virtual register indexes are stored in a byte, so there are at most 256 virtual registers per method. What happens when more than 256 virtual registers are needed?

5. In section 2.1, note whether the possibility that a method will be executed concurrently by multiple threads affects your algorithm or not. If it does, explain how.

6. Also in section 2.1, the paper states that "references inside loops [count] 10 times more and [are] scaled by the nesting level." Why 10? Do your results change if a value other than 10 is used? If the analysis is extended to the cross-module case, as suggested in the conclusion, how will you prioritize variables found in method loops (i.e. recursion)? Will you ignore it? Treat it as a 10?

7. Figure 6 is just so much random code to me. Either explain it in a great deal more detail, or reduce it to more easily understandable pseudocode.

8. Section 4 mentions a "Gosling property" and refers to [22], the Java Virtual Machine Specification. I cannot find any mention of a Gosling property in that book. Provide a page number, or at the very least a chapter number, when referring to specific information in a book.

9. On page 13, the acronym AJBC suddenly appears, without explanation. This is the first use of that acronym in the paper. What does it stand for?

10. Section 4 describes how Java class file verification can be extended to cover annotations as well. However, the section concludes by remarking that not all cases may have been covered. If you aren't sure that your verifier is sufficient, why did you bother making the reader wade through all the preceding material in that section?
Why not just list off, in a paragraph or two, the verifications you already know how to do, and then mention that you will continue to work on this topic?

11. On page 20 the claim is made that, "our AJIT system is capable of producing machine code that executes up to twice as fast as current JIT technology." The results only demonstrate that fact with respect to Kaffe. It may not be true in general. Also, no numbers have been shown regarding the increased size of classfiles, the time to verify the annotations, or the time required to generate the annotations. You argue that the runtime register allocator is "fast." How much will a thorough verification process slow down execution, since it will require time-consuming algorithms to check UD-chains, liveness, etc.? How long does the program need to run before the cost of verification is amortized away?

12. In Section 6, why did you use such old versions of Kaffe and Sun's JDK? I can understand that the lag time between submission and publication of a journal paper can cause the printed form of a paper to refer to old versions. But you are just now submitting the paper. These versions were already quite old at the time of submission.

Minor comments:

Throughout the paper, the following phrases appear: "the Java Bytecodes", "Java bytecode", "Java Bytecode". Settle on something standard. I suggest the 2nd form, which is the one that appears in literature from Sun.

Throughout the paper, there is confusion between singular and plural forms. There are too many instances to list.

Throughout the paper, both definite and indefinite articles are used inappropriately (i.e., they are present when they should not be, and incorrectly omitted in other places). There are too many instances to list.

A spellcheck should be run on the paper to catch a number of misspelled words.

Page 2, 3rd paragraph, line 4: I don't think that "sequentializes" is a word. I suggest "serializes".
Page 3, 1st paragraph and Figure 1: The IR has not been explained at this point (nor indeed is it explained elsewhere in the paper). This makes understanding Figure 1 very difficult. The two examples in the paragraph are a single sentence apiece; ordinarily, this would not be a problem, but the reader is already slightly confused from not understanding the IR.

Page 4, line 4: Consider this evidence that you've been working with Java too intensely. Replace "finalize" with "conclude".

Page 4, line 12: Remove an "in" from "In in Section 5 ...".

Page 6, lines 3-4: The sentence, "For each bytecode operation type there is a distinct VRA annotation format" is confusing. The rest of the paper seems to indicate that each bytecode operation type has multiple VRA annotation formats, and that some VRA annotation formats are shared by multiple bytecode operation types.

Page 6, line -1: CONST is used undefined here, and later in the paper.

Page 7, first full paragraph: Merge with the previous paragraph.

Page 8, figure 5, last bytecode name: the 't' in 'putfield' is missing.

Page 9, last line of section 2.1: replace 'wrongly' with 'incorrectly', and change 'may be' to 'is' or 'is not', as appropriate, or at least indicate why deciding is difficult.

Page 9, first line of section 3: Kaffe is not public domain. It is distributed freely under the GNU General Public License (GPL). There is a big difference.

Page 10, line 7: remove the phrase "the needs".

Page 10, line 15: there is a split infinitive here. Change "... translation be entirely skipped ..." to "... translation be skipped entirely ...".

Page 10, 1st and 2nd lines of 3rd full paragraph: change "... representation in order to produce register allocation" to "... representation for register allocation".

Page 10, last line of 3rd full paragraph: change "... at the moment of ..." to "... when ...".

Page 10, line -6: delete the phrase "number of".

Page 12, line 1: replace "lower" with "reduce".

Page 12, line -6: replace "...
the type contents of local variables" with "... the type of each local variable".

Page 13, line 8: replace "... operand types characterizes an invalid annotation information" with "... operand types indicates an invalid annotation".

Page 13, 2nd full paragraph, 1st line: Reword the 2nd sentence; who is "us"?

Page 14, figure 7: part (c) is incorrectly labeled as (b). In part (d), "Trusted Annotated Bytecode Stream" should be "Untrusted Annotated Bytecode Stream".

Page 15, 2nd full paragraph, 1st line: Reword the 1st sentence; the grammatical structure is too complex.

Page 15, 2nd full paragraph, line 8: remove a "to" in "... it is not enough to to have ...".

Page 15, line -7: I do not understand the final clause of this sentence ("... as in the virtual register ...").

Page 16, lines 6-7 and page 20, lines 19 and 24: you are submitting your paper to a journal. Why isn't this the final version? Maybe the unfinished work to which you refer should just go into a separate paper.

Page 16, section 5, line 4: what is the difference between an intermediate form and a language?

Page 16, section 5, line 5: in the references for JIT compilers, what about [4] and [20]? Do they belong in that list?

Page 17, 2nd full paragraph, 1st line: delete "... described in ...".

Page 18, 1st full paragraph: This paragraph does not really belong here. Since Kaffe is an integral part of your work, this paragraph should probably be in the Results section instead.

Page 18, 2nd full paragraph: Consistently capitalize "slim binaries" throughout this paragraph.

Section 6: Consistently capitalize "kaffe" throughout this section. Also, are the benchmarks used in this section meant to be representative of all programs? Of OO programs? Page 20 states that the "smallest performance gain was observed for the code with the highest number of subroutine calls." Since "true" OO programs often have this characteristic, will this fact greatly reduce the success of your optimization?
Page 19: Consider combining these two tables to make more efficient use of space.

Page 20, 2nd full paragraph, line 3: remove "maintained".

Page 20, 2nd full paragraph, line 6: remove "implemented in the last".

Page 20, final sentence of section 6: remove for the final version of this paper.

Page 21, first full paragraph: sentences 5 and 6 are out of place here.

Page 21, first full paragraph, line -5: replace "We will be also refining ..." with "We will also be refining ...".

References, 12, 14, 22, and 27: The references consistently use the first initial and surname of each author, except for the four listed references.

References, 23: Is "tcc" supposed to be capitalized?

References, 25: the given URL does not exist.

References, 27: the title is not consistently capitalized.
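
Illustration for major comment 4: the sketch below shows what I am worried about. It assumes, as page 7 seems to indicate, that each virtual register index is written into a single unsigned byte. The class name, method names, and constants are hypothetical and are not taken from the paper or from Kaffe; the authors' actual encoding may differ.

```java
// Hypothetical illustration only: none of these names come from the paper.
final class VirtualRegisterIndexDemo {
    // Largest local-variable count a method can declare (max_locals is a 16-bit field).
    static final int MAX_JVM_LOCALS = 65535;
    // Number of distinct values that fit in one unsigned byte.
    static final int BYTE_INDEX_SPACE = 256;

    // Assumed encoding: keep only the low 8 bits of the virtual register index.
    static byte encode(int virtualRegister) {
        return (byte) virtualRegister;
    }

    // Read the stored byte back as an unsigned index in the range 0..255.
    static int decode(byte stored) {
        return stored & 0xFF;
    }

    // True when the index survives a round trip through one byte.
    static boolean fitsInByte(int virtualRegister) {
        return virtualRegister >= 0 && virtualRegister < BYTE_INDEX_SPACE;
    }

    public static void main(String[] args) {
        int[] samples = {0, 255, 256, 300, MAX_JVM_LOCALS - 1};
        for (int vr : samples) {
            int roundTrip = decode(encode(vr));
            String note = fitsInByte(vr) ? "" : "  (aliases a lower register)";
            System.out.println("virtual register " + vr + " -> stored as " + roundTrip + note);
        }
    }
}
```

Under this assumed encoding, any index of 256 or above silently collides with a lower-numbered register, so the paper should state whether such methods are rejected, left unannotated, or handled by a wider index format.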