Subject:
I send my paper
From:
minami <minami@tokyo.rist.or.jp>
Date:
Mon, 05 Mar 2001 16:56:38 +0900
To:
fox@csit.fsu.edu

Dear Professor Fox,

This is Kazuo Minami of RIST, Japan.

I apologize for the delay in sending my paper for the special issue of
"Concurrency and Computation".
I am sending the paper in PDF format.


Best Regards,

Kazuo Minami


--------------------------------------------------
  E-mail: minami@tokyo.rist.or.jp
--------------------------------------------------

Research by the GeoFEM team gives us the prospect that GeoFEM will achieve
good parallel performance on various computer architectures. In this research
we focus on single-processor performance, and we investigated a common data
structure and coding style that let GeoFEM run at high performance on various
computer architectures.

<b>Test Code for the Solver Part of the structure and fluid analysis codes</b>
The data structure and coding style were evaluated on scalar, vector, and
pseudo-vector architectures. A new data structure and a new direct access
method were introduced in the fluid analysis solver. As a result, 21% of peak
performance was obtained on the pseudo-vector architecture, and scalar
execution performed as well as pseudo-vector execution. 25% of peak
performance was obtained on the vector architecture.
A new direct access method was also introduced in the structure code solver.
As a result, 27% of peak performance was obtained on the pseudo-vector
architecture and 34% on the vector architecture.
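The "percent of peak" figures above presumably mean the sustained floating-point rate divided by the machine's theoretical peak rate. The following is a minimal sketch under that assumption; the function name `percent_of_peak` is hypothetical and does not come from GeoFEM. It uses the only sustained/peak pair that this abstract states explicitly (2.06 GFlops against a 9.6 GFlops peak on the VPP5000).

```python
def percent_of_peak(sustained_gflops: float, peak_gflops: float) -> float:
    """Return sustained performance as a percentage of theoretical peak.

    Assumption: "percent of peak" = 100 * sustained rate / peak rate.
    """
    if peak_gflops <= 0:
        raise ValueError("peak_gflops must be positive")
    return 100.0 * sustained_gflops / peak_gflops


if __name__ == "__main__":
    # Whole test code on the VPP5000, as quoted in this abstract.
    print(f"{percent_of_peak(2.06, 9.6):.1f}% of peak")
```

With these inputs the ratio comes to roughly 21.5% of peak. The peak rates behind the other percentages are not given here, so they cannot be checked the same way.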
<b>Test Code for the Matrix Assembly Part of the structure analysis code</b>
Code that removes the dependency was completed, and its performance was
evaluated on vector and scalar machines. The matrix assembly process reached
736.8 MFlops on the SX-4. The whole test code reached 900.7 MFlops on the SX-4
and 2.06 GFlops on the VPP5000 (peak: 9.6 GFlops). The matrix assembly process
reached 124 MFlops on an Alpha system (Alpha 21164, 533 MHz).

[...] should first explain what the original data structure
was, and devote more space to how and why the transformation is
beneficial.

The beginning of section 3.8 is a good example of the author's writing
style. "Both (1) and (2) of performance factor [...] was implemented
to the model coding of Fig.4." Apart from this being execrable
English, it would be so much easier to understand this if the author
would simply substitute what performance factors 1 and 2 and figure 4
are about. Now the reader has to leaf back and forth through the paper
to understand this sort of comment.

According to its caption, figure 12 is the code of "Model coding of
direct access". The author should mention that the algorithm is the
substitution part of the solver.

One more general comment: performance tests are done on two vector
pipeline machines and one superscalar chip. It looks to me as though
all the transformations are inspired solely by the pipelines. The
author in no place remarks on architectural differences, and whether
for the Alpha chip different optimisations would have been preferable.

In sum, the code transformations presented here are moderately
interesting, and could be of use to readers, but the presentation
precludes any such usefulness.