Subject: C434 JGSI Review
Resent-Date: Thu, 30 Sep 1999 23:17:54 -0400
Resent-From: Geoffrey Fox
Resent-To: p_gcf@npac.syr.edu
Date: Fri, 17 Sep 1999 22:56:32 -0700
From: heydon@pa.dec.com (Allan Heydon)
To: Geoffrey Fox
CC: najork@pa.dec.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Overall Recommendation: Accept
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This paper describes the content of and the design philosophy behind a
benchmark suite for Java Grande applications, and presents the results
of benchmarking a half dozen hardware/JVM combinations. The paper is
well-written, interesting, and highly relevant to the Java Grande
community.

General Comments
~~~~~~~~~~~~~~~~

I have only fairly minor comments on the paper's content.

1. The last paragraph of section 2 touches on "a feature peculiar to
   Java benchmarking, which is that it is possible to distribute the
   benchmark without revealing the source code." The importance of
   this point is lessened by the wide availability of Java
   de-compilers, which produce quite readable Java source code.

2. At the top of the right column on page 2, it is stated that I/O
   components of the benchmarks have been removed, presumably because
   they are not considered relevant to Java Grande applications. Yet
   in the first paragraph of the introduction, large network
   requirements are considered one of the hallmarks of Grande
   applications (and perhaps disk usage should be added to that list).
   Perhaps the paragraph in section 3 meant to refer to "*terminal*
   I/O"?

3. In the discussion in section 4.1 on the meaning of the temporal and
   relative performance metrics, I think it would be worth adding an
   explicit statement that bigger values under both metrics indicate
   *better* performance.

4. The last paragraph of section 7 (describing how to obtain the
   benchmarks) does not constitute future work; it should probably be
   moved to the very beginning of section 5.

Detailed Comments
~~~~~~~~~~~~~~~~~

These comments are just minor nitpicks.

1. The acronyms EPCC and MPI should be spelled out on the first use of
   each.

2. In section 5, the descriptions of some benchmarks begin with a verb
   (e.g., "measures", "tests"), while others begin with "This
   benchmark measures..." or "This benchmark tests...". For
   uniformity, they should all be changed to follow the same
   convention. Similarly, some descriptions say "Performance units
   are..." while others say "Results are reported in...". Finally,
   some say something like "This kernel/benchmark exercises ...",
   while others simply say something like "Memory and integer
   intensive."

3. There are a couple of places where a comma appears immediately
   before a parenthetical remark; these commas should be moved after
   the closing right parenthesis.

4. Pet peeve: there are many, many instances of the word "which" that
   should be replaced by "that". For a guide to the correct usage, see
   the topic on "Which-Hunting" in "A Handbook for Scholars",
   Mary-Claire van Leunen, Oxford University Press, 1992.

5. Typos and suggested improvements:

   Pg 1, col 1, section 1, line 3:
     "...well outside its original design specifications." ->
     "...well outside its original design goals."

   Pg 1, col 1, line -3 (counted from bottom):
     "Show that real large scale codes can be..." ->
     "Show that real, large-scale codes can be..."

   Pg 1, col 1, line -3:
     "...can be written and provide..." ->
     "...can be written, and provide..."

   Pg 1, col 2, line 2:
     "...execution environments thus allowing..." ->
     "...execution environments, thus allowing..."

   Pg 1, col 2, line 6:
     "...to Grande Applications and in doing so encourage..." ->
     "...to Grande Applications, and in doing so encourage..."

   Pg 1, col 2, section 2, line 6:
     "...a number of benchmarks [] ..." ->
     "...a number of micro-benchmarks [] ..."

   Pg 1, col 2, last line:
     "These are useful in that they can be representative..." ->
     "These are useful in that they are representative..."

   Pg 2, col 1, "Robust" item:
     "The performance of suite ..." ->
     "The performance of the suite ..."

     Regarding the Robustness criterion, I am dubious as to whether it
     is possible to eliminate hardware factors (such as cache size)
     from a performance measurement.

   Pg 2, col 1, "Portable" item:
     "...a variety of Java environments as possible." ->
     "...a variety of Java platforms as possible."

   Pg 2, col 1, line -12:
     "..., we provide three types of benchmark, ..." ->
     "..., we provide three benchmark types, ..."

   Pg 2, col 1, line -4:
     "...of real applications running under the Java environment." ->
     "...of real Java applications."

   Pg 2, col 2, line 12:
     "We also choose the kernels..." ->
     "We also chose the kernels..."

   Pg 2, col 2, line -24:
     "..., as well as ensuring adherence to..." ->
     "..., as well as to ensure adherence to..."

   Pg 3, col 1, line 14:
     "Relative performance is the ration of temporal performance ..." ->
     "Relative performance is the ratio of temporal performance ..."

   Pg 3, col 1, line 15:
     "... that is a chosen JVM/operating system/hardware combination." ->
     "... that is, a chosen JVM/operating system/hardware combination."

   Pg 3, col 1, line -16:
     "Accessing benchmark methods as class methods." ->
     "Accessing benchmark methods as static methods."
     (This ain't Smalltalk ;-)

   Pg 3, col 1, line -8:
     "We can force compliance to common structure..." ->
     "We can force compliance to a common structure..."

   Pg 3, col 2, line -14:
     "...to different types of variable." ->
     "...to different types of variables."

   Pg 5, col 1, line 9:
     "This performs a one-dimensional forward transform..." ->
     "This performs a one-dimensional Fourier transform..."

   Pg 5, col 1, Sparse:
     The first and third sentences can be merged as follows:

       Multiplies an N x N sparse matrix stored in compressed-row
       format with a prescribed sparsity structure by a dense vector
       200 times.

     "This kernel exercises indirection-addressing and..." ->
     "This kernel exercises indirect-addressing and..."

   Pg 5, col 1, Search:
     "... using a alpha-beta pruned search technique." ->
     "... using an alpha-beta pruned search technique."

   Pg 5, col 2, section 6.1, line 3:
     "Also of interest is language comparisons, comparing the
     performance of Java versus other programming languages such as
     Fortran, C and C++." ->
     "Also of interest are language comparisons, that is, comparing
     the performance of Java to other programming languages such as
     Fortran, C and C++."

   Pg 5, col 2, section 6.1, line 6:
     "Currently, the LUFact and MolDyn benchmarks, allow..." ->
     "Currently, the LUFact and MolDyn benchmarks allow..."

   Pg 5, col 2, section 6.1, lines 7-10:
     "It is intended, however, that the parallel part of the suite
     will contain versions of well-known Fortran and C parallel
     benchmarks, ..." ->
     "However, we intend the parallel part of the suite to contain
     versions of well-known Fortran and C parallel benchmarks, ..."

   Pg 5, col 2, section 6.1, lines 11-17:
     "Measurements have been taken for the Linpack Benchmark (on a
     1000 x 1000 problem size) and the Molecular Dynamics benchmark
     (2048 particles), using Java (Sun JDK 1.2.1 02 production
     version, and Sun JDK 1.2.1 reference version + Hotspot 1.0),
     Fortran and C on a 250MHz Sun Ultra Enterprise 3000 with 1Gb of
     RAM and the results are shown in Figure 3." ->
     "Measurements have been taken for the LUFact benchmark (on a
     1000 x 1000 problem size) and the MolDyn benchmark (2048
     particles) using Java (Sun JDK 1.2.1 02 production version, and
     Sun JDK 1.2.1 reference version + Hotspot 1.0), Fortran and C on
     a 250MHz Sun Ultra Enterprise 3000 with 1Gb of RAM. The results
     are shown in Figure 3."

   Pg 8, col 2, line 1:
     "Consideration of these issues has lead us to decide ..." ->
     "Consideration of these issues has led us to decide ..."

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~