Referee 1 *****************************************************

Title:  Jaguar: Enabling Efficient Communication and I/O in Java
Authors:Matt Welsh and David Culler

a)Overall Recommendation
------------------------
ACCEPT

b)Words suitable for authors
----------------------------

This is a fine paper. However, I'm slightly disappointed by section 5
(preserialized objects). Apart from buffer handling, the most serious
performance problems of serialization are
(a) the encoding of type information to handle both transparent class
    loading and the subtype problem mentioned below,
(b) cycle detection for arbitrary graphs of objects, and
(c) the ability to ship any objects (even those unknown at compile
    time).
As far as I understand it, preserialized objects avoid all those
problems (or significant portions of them). It is therefore slightly
exaggerated to call the mechanism "serialization".  I strongly suggest
that the authors try to stress the fact that this is a very special &
restricted form of serialization. Moreover, section 5.2 should discuss
the limitations in more detail.
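
To make costs (a) and (b) concrete, here is a minimal sketch using the
standard java.io serialization classes. The names SerializationCosts,
Node, TaggedNode and tag are made up for this illustration and are not
taken from the paper. The sketch only shows what general serialization
has to preserve (runtime subtypes and cyclic references); it says
nothing about how preserialized objects are implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationCosts {
    // Hypothetical node types, used only for this illustration.
    public static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        public Node next;
    }

    public static class TaggedNode extends Node {
        private static final long serialVersionUID = 1L;
        public int tag = 7;
    }

    /** Serializes an object graph to a byte array and reads it back. */
    public static Object roundTrip(Object root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                // Writes class descriptors (type information) and uses
                // back-references for objects it has already written.
                out.writeObject(root);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    /** Builds a two-node cycle: Node -> TaggedNode -> back to the Node. */
    public static Node buildCycle() {
        Node head = new Node();
        TaggedNode tail = new TaggedNode();
        head.next = tail;   // field of static type Node holds a subtype
        tail.next = head;   // closes the cycle
        return head;
    }

    public static void main(String[] args) {
        Node copy = (Node) roundTrip(buildCycle());
        System.out.println(copy.next instanceof TaggedNode); // subtype kept
        System.out.println(copy.next.next == copy);          // cycle kept
    }
}
```

Both checks in main should print true. A mechanism that carries no
type information and does no cycle detection cannot give these
guarantees, which is why I would like the restrictions spelled out.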

Some minor remarks:
-------------------

- In figure 1 I would like to see the numbers formatted so that the
decimal points are lined up, like:

        .985 us
       1.31  us
    1706.0   us

Similar for figures 7 and 9.

- In section 4.3 I would like to see measurements instead of estimates
for bandwidth over JNI.

- Section 5.2: since you don't have any form of type information in
the preserialized objects, I don't see how object graphs of type T can
be handled where the nodes can be arbitrary subtypes of T. Either
discuss this or mention the limitation in 5.2.

- I cannot understand the second paragraph of section 5.2.

- Insert white space:
  4th line from bottom of column that holds Fig 7. "offset.Next"
  1st line of 2nd column that holds Fig 9. "timings(from"

- 4th line of Acknowledgements: remove "design."
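
Regarding the alignment requested for figures 1, 7 and 9: the
following is one possible way to produce it, given as a small sketch
and not as a prescription. DecimalAlign and align are hypothetical
names; the input is assumed to be plain decimal strings without signs
or exponents.

```java
import java.util.ArrayList;
import java.util.List;

public class DecimalAlign {
    /** Pads each numeric string so that all decimal points share a column. */
    public static List<String> align(List<String> values, String unit) {
        int maxInt = 0;   // widest integer part (characters left of the point)
        int maxFrac = 0;  // widest fractional part (characters right of it)
        for (String v : values) {
            int dot = v.indexOf('.');
            int intLen = (dot < 0) ? v.length() : dot;
            int fracLen = (dot < 0) ? 0 : v.length() - dot - 1;
            maxInt = Math.max(maxInt, intLen);
            maxFrac = Math.max(maxFrac, fracLen);
        }
        List<String> out = new ArrayList<>();
        for (String v : values) {
            int dot = v.indexOf('.');
            int intLen = (dot < 0) ? v.length() : dot;
            int fracLen = (dot < 0) ? 0 : v.length() - dot - 1;
            StringBuilder sb = new StringBuilder();
            sb.append(" ".repeat(maxInt - intLen));   // left padding
            sb.append(v);
            // A value without '.' needs one extra space in the point's column.
            int rightPad = maxFrac - fracLen + ((dot < 0 && maxFrac > 0) ? 1 : 0);
            sb.append(" ".repeat(rightPad));          // right padding
            sb.append(' ').append(unit);
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : align(List.of(".985", "1.31", "1706.0"), "us")) {
            System.out.println(line);
        }
    }
}
```

For the three values above this reproduces the layout shown in the
first remark, with the units in one column as well.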

References:
-----------
Updated versions of [3] and [22] will appear in the same issue.
    Please cite the journal articles instead.
[2] Self instead of self
[3],[7],[8],[12],[17],[19],[22],[24],[25],[29] Java instead of java
[4] Myrinet instead of myrinet
[8] and [24] no URLs or URLs for both
[13] HotSpot instead of hotspot
[19] JIT instead of jit
[22] RMI instead of rmi
[25] PVM instead of pvm
[28] Berkeley JAWS instead of berkeley jaws
[28] avoid linebreak (-) in URL

Referee 2 *****************************************************************

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Overall Recommendation: Accept
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Overall, I think this is a very good paper. Even though it has not
been through the initial round of reviewing that the other Java '99
papers have, I would definitely recommend it for inclusion in the
Java Grande issue of "Concurrency: Practice and Experience". I like
the fact that the paper discusses both the strengths and weaknesses
of the Jaguar approach. I do think, however, that some of the
strengths are over-stated (see comments below).

Comments on Content
~~~~~~~~~~~~~~~~~~~

1. The thing that bothered me most about the paper was the claims
   that the Jaguar approach is inherently safer than using JNI. These
   claims are repeated throughout the paper: in the introduction, in
   section 2.2, in section 3.1, and again in section 3.4. Although
   I understand that the Jaguar philosophy is to write short, straight-
   line machine code in the unsafe code mappings, as far as I can
   tell there is nothing to prevent someone from adopting the same
   approach using JNI (of course, the JNI approach has serious
   performance problems, but those concerns are orthogonal to the
   issue of safety). And it is probably easier to shoot oneself
   in the foot with x86 assembly code than it is with C code.

   I think the claims of Jaguar's inherent safety should be toned down
   substantially. At the very least, they should not be repeated so many
   times. The strong performance arguments and benchmark results
   presented in the paper are more than enough to convince me of Jaguar's
   merits.

2. In a similar vein, the authors argue in section 3.4 that the use of
   machine code in the code mappings is an advantage because it is so
   difficult to use that people will be less inclined to provide "overly
   complex functionality as a single Jaguar primitive".  Talk about positive
   spin control! I think the authors should seriously consider removing
   that sentence from the paper.

3. In the second paragraph of section 2, the authors may also want to
   cite two other papers that appeared at Java Grande 99:

     Practical Guidelines for Boosting Java Server Performance
     Reinhard Klemm

     Performance Limitations of the Java Core Libraries
     Allan Heydon and Marc Najork

4. The second paragraph in section 4.2 (i.e., the paragraph that begins
   with "Exposing the JaguarVIA API to Java...") repeats material covered
   earlier in the paper; it could easily be omitted.

5. I think the example presented in section 4.2 would be a bit more
   understandable if the authors included just a little more of the Java
   source code. In particular, I think it would help to show the VIA_VI
   class and its TxDoorbell field (and perhaps other fields as well for
   completeness). Also, in the example that is shown, the return statement
   should appear on its own line.

6. Based on the x86 assembly code shown in Figure 4, I wonder how familiar
   someone must be with the JIT being used in order to write code mappings.
   How hard would it be for someone other than the authors to write correct
   code mappings? What symbolic (i.e., variable) names can the assembly
   code refer to?

7. How difficult would it be to actually measure the JNI performance
   reported in the right graph of Figure 5 rather than "simulating" it?
   Although your methodology for computing the estimates seems sound,
   it does not seem like performing the experiments should be that much
   work, and it would make the results much more convincing.

8. In the second paragraph of section 5.2, the authors enumerate a problem
   with PSOs, and then claim, "This is not as limiting as it might seem."
   Yet they don't really provide any basis for that claim. So far, they've
   only used PSOs in toy benchmark programs, so I think their claim is
   overstated.

Comments on Presentation
~~~~~~~~~~~~~~~~~~~~~~~~

1. The figures appear out of order (Figs 2 & 3 are swapped, as are Figs
   8 & 9). I presume this problem will be corrected in the journal
   publication.

2. In the left half of Figure 5, it is very difficult to distinguish
   the C curve (solid) from the Jaguar one (solid with tiny dots?).

3. The titles of many papers in the bibliography contain lower-case letters
   that should probably be upper-case (e.g. "evm", "java", "jit", "rmi",
   "k", "kvm", "jpvm", etc.). Capitalization can be preserved in LaTeX
   bibliographies using curly braces. Also, the reference for the K virtual
   machine should say "Sun Microsystems Inc.", not "Inc. Sun Microsystems".

4. Pet peeve: there are many, many instances of the word "which" that
   should be replaced by "that". For a guide to the correct usage, see
   the topic on "Which-Hunting" in "A Handbook for Scholars", Mary-
   Claire van Leunen, Oxford University Press, 1992.

5. Typos and suggested improvements:

  Pg 1, col 2, line -8 (from bottom):
     "...through compile-time code transformation..."
  -> "...through a compile-time code transformation..."

  Pg 2, col 2, line 15:
     "A large number of of user level..."
  -> "A large number of user level..."

  Pg 4, col 1, line 5:
     "...operation of a Java compiler is..."
  -> "...operation of a Java just-in-time compiler is..."

  Pg 4, col 1, section 3.1, line 3:
     "...the javac bytecode compiler and..."
  -> "...the javac compiler and..."

  Pg 4, col 2, line -4:
     "...of both field name and type, ..."
  -> "...of both a field's name and type, ..."

  Pg 5, col 2, line -4:
     "..., which prohibit overloadding."
  -> "..., which prohibits overloading."

  Pg 6, col 1, section 4, line 6:
     "... which enable high-bandwidth and ..."
  -> "... which enables high-bandwidth and ..."

  Pg 6, col 2, paragraph 3:
     "...packet buffers must be first registered ..."
  -> "...packet buffers must first be registered ..."

  Pg 7, col 2, line -19:
     "They are implemented as the class..."
  -> "They are implemented by the class..."

  Pg 9, col 1, line -11:
     "...can be thought of a Java object..."
  -> "...can be thought of as a Java object..."

  Pg 11, col 1, mid-column:
     "This was accomplished by..."
  -> "The latter was accomplished by..."

  Pg 12, col 1, section 6, line 9:
     "...as much the JVM and the compiler itself."
  -> "...as much as the JVM and the compiler itself."

  Pg 13, col 1, section 7, line 6:
     "...as a Java object in this way provides..."
  -> "...as Java objects provides..."

  Pg 13, col 2, Acknowledgements:
     Remove sentence fragment "design.".

Referee 3 ***************************************************

C423 JGSI Review
Overall recommendation: accept
Comments:

Good paper.  I enjoyed reading it.  It would be nice to compare Jaguar
with Microsoft's J/Direct technology in the related work section.
They use source-level annotations to annotate pinned objects
(e.g. "embedded arrays") which are propagated to the JIT compiler
through unused byte-code attributes.  Because of a poor implementation
(unclear whether it is due to some MSJVM shortcoming or to plain
incompetence), the measured J/Direct performance is awful :-) So
there would be no need to show those numbers (although doing so would
further strengthen your case!).