I enclose 3 referee reports on your paper. We would be pleased to accept it; could you please send me a new version before November 5 99. Please send a memo describing any suggestions of the referees that you did not address. Ignore any aggressive remarks you don't think appropriate, but please tell me. I trust you! Thank you for your help in writing and refereeing papers!

Referee 1 *****************************************************************
Subject: C427 JGSI Review

Title: Object-Serialization for Marshalling Data in a Java Interface to MPI
Authors: Bryan Carpenter, Geoffrey Fox, Suang Hoon Ko, and Sang Lim

a)Overall Recommendation
------------------------
ACCEPT

b)Words suitable for authors
----------------------------
- On p6 you describe that you need two messages: the first sends the size, the second sends the data. Why is this idea better than using a bigger first message that sends the size and the first segment of the data? The second message can be optimized away if the total array has been sent in the first message.
- From the text I cannot see which JIT was used on the Solaris machines. Was it HotSpot? The HotSpot people mentioned that they have built in better support for arrays of primitive types. I'm curious.
- Since I've seen nice performance numbers for Symantec's JIT on Wintel machines, I would like to see the results of your benchmarks between PCs.
- Figures: The lines in Fig 3 are not explained. Which line corresponds to which equation? I don't see any equations numbered 1 to 3 in Table 1. Figure 6/byte should have the same axis (0,50,100,150,200,...) as Fig 3. Same for Fig 7; moreover, shared/byte should use 300 as the upper end of the axis (instead of 500). Figures 6 and 7 are hard to digest. The reader must remember what the lines and the open icons are. Only with this information in mind is the reader able to see the improvement. What is the reason for showing the lines again?
  An important problem is that the reader no longer sees the goal: from my understanding, the goal is to approach the triangles.
- Please try to give an average improvement factor for both versions of your streams in the text.
- Please explain why the byte[][] got *slower* in Figure 6/byte.
- A main cost factor when sending arrays of primitive types is accessing them in the JNI. Most likely the JVM will copy the data from a Java-internal area into a C array. Please discuss this problem. Please avoid "zero-copy" since there is *copying*.
- What do you do in the case of heterogeneous clusters where you have big- and little-endian machines? You will need some form of processing of float arrays. It is insufficient to just pass the C-float[] around.
- KaRMI tries to reduce the copying as well. KaRMI can use specific communication hardware (e.g. VIA) and could use native MPI routines for communication. One can plug in the mechanism of choice.

Some minor bugs:
----------------
- 3rd line of 2nd paragraph of section 1.1: "serialiation"
- end of item 2 on section 3: "the this buffer descriptor"
- two paragraphs below: "presented in the next suggest that ..."
- 2nd line on p6: "subset of the of array"
- end of 1st paragraph of Section 4: "elements.2."
- Footnote 2: "that there some debate" ... "various proposals for for optimized"
- Remove "All timings are in microseconds" from the 1st paragraph on p8. Instead, put it behind all numbers in Table 1. I would prefer to see all t's lined up in Table 1.
- find a consistent spelling for: ping-pong, Pingpong
- Caption of Figure 4: "...for handling arrays" please add "of primitive type"
- 3rd paragraph on p11: "Figure 3 shows the effect". Should be Figure 6.

References:
-----------
- [5] an updated version will appear in the same issue of CPE. Same for [17] and [18]
- [6] "jmpi"?
  I'm not sure, it might be "JMPI"

Referee 2 *************************************************************
Subject: C427 JGSI Review

a) Overall: Good

b) I think that this paper shows a very good idea: combining Java's Serialization with C and MPI's efficient data communication. I think your choice to abandon implementation of MPI_TYPE_STRUCT is pretty reasonable, but at the same time, it is possible to pass some "simple objects", which are instances of a final class and contain references, through the MPI interface. If this special case is implemented, then, for example, an array of complex numbers can be sent via the MPI native interface.

The methods and results shown in Section 5 are very interesting. If you have the chance, please compare these results with those of the improved RMI routine described in Paper [17] (in the bibliography of your paper).

Referee 3 **************************************************************
Subject: C427 JGSI Review

1. Overall
********
This is an interesting and well written article on an important topic. The details of serialization as an implementation of an MPI Object type are covered in detail. The assertion that object serialization is essential to Java message passing is well expressed and well supported. Performance issues are discussed and illustrated, though the conclusions are not well supported in terms of numbers and statistics.

2. Comments for Author(s)
*********************
Use of acronyms. A large number of acronyms were used without introduction, ranging from fairly common ones, such as API and JDK, to more esoteric ones, like MPI and RMI, and ones I don't know, like JPVM and VIA. JPVM, in particular, was referenced to a paper on "jmpi", which served to increase the alphabet soup but not clarify it.

Good science and good statistics: I assume the authors did some regressions on the formulae (1) through (3) to get to Table 1.
Missing are statements regarding sample size, residuals, general statistics and general error analysis (for example, there appear to be consistent systematic errors in the plots).

The Discussion section is disjoint and difficult to follow. The first paragraph is good, the second paragraph seems to unravel entirely, the third paragraph is fine, the next two paragraphs don't seem to connect. The second to the last paragraph is a bit scattered, and the sentence early on, "We consider this case unproven" seems disconnected. The rest of the paragraph is fine, and the conclusion pointing to one-dimension underlying data supports my own work, but needs a stronger argument. The final paragraph/sentence is to the point and closes the article excellently.

* Extra preposition - Last sentence, para 2, page 4, should be "(But effects similar to these..."
* Missing evidence - First sentence, para 3, page 5, beginning "Evidently..." implies there is evidence, but it is not evident.
* Missing word - Last sentence, para 3, page 5, should be "Benchmark results presented in the next section suggest that..."
* Confused sentence - Last sentence at the bottom of page 5, beginning "To support MPI derived datatypes..." This sentence is long, confusing, and contains one or more extraneous "to".
* Use of the word "state" - Last sentence, para 1, and fourth sentence, para 2, page 6, the word "state" is used in an obscure way.
* Repeated word - Second sentence, footnote 2, page 6, repeated "to" in the sentence beginning "There are various proposals..."
* Typo - First sentence, para 3, page 8, "communication" is misspelled.
* Missing verb - Second to last sentence, para 1, page 10, should be "...previous section can be drastically reduced or eliminated."
* Incorrect hyphenation - First sentence, para 2, page 11, "straightforward" is incorrectly hyphenated.
* Incorrect figure - First sentence, para 3, page 11, should reference Figure 6.
* Repeated word - Second sentence, footnote 5, page 13, there's an extra "of"

3. Comments for Editor(s)
********************
None.
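
Illustration of Referee 1's first comment (not part of the reports, and not the paper's implementation): the alternative to sending the size and the data as two separate messages is a fixed-size first message that carries the size plus the first segment of the data, with a second message only when the data does not fit. A minimal sketch of that framing is given here in Python for brevity. The 1024-byte capacity, the 4-byte big-endian length header, and the names split_messages and join_messages are all assumptions made for illustration.

```python
import struct

# Assumed framing: a 4-byte big-endian length header at the start of the first message.
HEADER = struct.Struct(">I")
# Hypothetical fixed capacity of the first message, in bytes (header included).
FIRST_CAPACITY = 1024


def split_messages(payload, capacity=FIRST_CAPACITY):
    """Return the one or two messages needed to send payload.

    The first message holds the total length plus as much data as fits.
    A second message is produced only if data remains.
    """
    room = capacity - HEADER.size
    first = HEADER.pack(len(payload)) + payload[:room]
    rest = payload[room:]
    return [first] if not rest else [first, rest]


def join_messages(messages):
    """Rebuild the payload on the receiving side from one or two messages."""
    (length,) = HEADER.unpack_from(messages[0])
    data = messages[0][HEADER.size:]
    if len(data) < length:
        # The header says more data is coming, so a second message is expected.
        data += messages[1]
    if len(data) != length:
        raise ValueError("received length does not match header")
    return data
```

With these assumptions, a payload of up to 1020 bytes travels in a single message, which is the case the referee suggests can avoid the second send. The receiver learns from the header alone whether to post a second receive.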