Subject: Re: Request to review a paper C506
From: Victor Eijkhout <eijkhout@cs.utk.edu>
Date: Mon, 04 Jun 2001 11:54:09 -0400
To: fox@csit.fsu.edu

> Jack Dongarra suggested that you could referee this paper from Japan for a Special issue I
> am preparing for Concurrency and Computation: Practice and Experience

 "Practice and experience". Ok, with that in mind ...

Referee report on
Performance Optimization of GeoFEM on Various Computer Architecture
by Kazuo Minami

This paper discusses various techniques that were applied to the GeoFEM
code to improve its performance. The architectures alluded to in the title
are the SX-4, VPP5000, SR8000, and the DEC Alpha.

First off, the English of this paper is really bad. In many sentences the
reader has to guess the meaning.

Then, the author omits explanation of various relevant details. For
one, there is mention of forward and backward substitution, but there
is no explicit mention of what the preconditioner is. I presume a
block Jacobi with an ILU local solve, but I would like to have an
explicit statement.
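
To make concrete what I am presuming (this is my guess, not something
the paper states): with block Jacobi, each diagonal block is factored
once as L U (incompletely, in the ILU case), and applying the
preconditioner is an independent forward and backward substitution per
block. A minimal Python sketch under those assumptions follows. The
function names and the dense list-of-lists storage are purely
illustrative and are not taken from GeoFEM.

```python
# Illustrative sketch only: names and dense storage are assumptions,
# not the paper's data structure.

def forward_substitution(L, b):
    """Solve L y = b, assuming L is unit lower triangular."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y


def backward_substitution(U, y):
    """Solve U x = y, assuming U is upper triangular with nonzero diagonal."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x


def apply_block_jacobi(blocks, r):
    """Apply z = M^{-1} r, where M is block diagonal.

    blocks is a list of (L, U) factor pairs, one per diagonal block,
    in the order the blocks appear in r. The blocks are independent,
    so each local solve could run on its own processor.
    """
    z = []
    offset = 0
    for L, U in blocks:
        n = len(L)
        y = forward_substitution(L, r[offset:offset + n])
        z.extend(backward_substitution(U, y))
        offset += n
    return z
```

An explicit statement at this level of detail (which factorisation,
which blocks, which substitution loops) is what I am asking the author
to supply for the actual code.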

More importantly, the author never describes the shape of the problem
domain, the discretisation used on it, and the exact data structure,
all important factors for the understanding of the code
transformations. Section 3.2 mentions the "cyclic multicolor on
hyperplace/RCM", but this phrase is never explained. The author should
delete the explanation of pipelining (section 3.1) and replace it with
a clear explanation of the data structure. Without this, the code
fragments, starting with figure 1, are unintelligible. Figure 3
especially puzzles me. I do not recognise any kind of solver in this.

The main content of this paper is the five "performance factors" shown
at the end of section 3.6, and the code transformations used to
satisfy them. However, there is insufficient explanation of how the
transformations achieve this. By the way, in factor (4), "latency"
should be "ratio", I think.

The data structure transformation of section 3.7 might be interesting,
but the author should first explain what the original data structure
was, and devote more space to how and why the transformation is
beneficial.

The beginning of section 3.8 is a good example of the author's writing
style. "Both (1) and (2) of performance factor [...] was implemented
to the model coding of Fig.4." Apart from this being execrable
English, it would be so much easier to understand this if the author
would simply state what performance factors 1 and 2 and figure 4
are about. As it is, the reader has to leaf back and forth through the
paper to understand this sort of comment.

According to its caption, figure 12 is the code of "Model coding of
direct access". The author should mention that the algorithm is the
substitution part of the solver.

One more general comment: performance tests are done on two vector
pipeline machines and one superscalar chip. It looks to me as though
all the transformations are inspired solely by the pipelines. The
author nowhere remarks on architectural differences, or on whether
different optimisations would have been preferable for the Alpha chip.

In sum, the code transformations presented here are moderately
interesting, and could be of use to readers, but the presentation
precludes any such usefulness.


> Referee Recommendations. Please indicate overall recommendations here, and
> details in following sections.
> 1. publish as is
> 2. accepted provided changes suggested are made
> 3. reject

 Reject for now. I'm willing to take another look at it provided the
author makes some drastic changes in the presentation.


> D: Referee Comments (For Editor Only)
> E: Referee Comments (For Author and Editor)

 See above.


> F: Presentation Changes

 This needs a lot of work: the English and the style must improve, and
the author needs to be much clearer in his explanations.