5: Java and Fortran Issues for a Parallel CFD
- CFD in Fortran: The CFD simulation was written in HPF. It uses a
4-stage Runge-Kutta time-stepping algorithm and a finite-volume
central-difference technique to compute the solution. In addition, it
uses a numerical dissipation model to damp spurious oscillations and
prevent the solution from blowing up in the presence of shock waves.
(A sketch of the time-stepping scheme follows this item.)
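
As a rough illustration only, one common 4-stage Runge-Kutta scheme of
this type can be sketched in Java as below. The names are hypothetical
and the text does not give the exact coefficients; the real solver
works on the full multi-dimensional conserved-variable arrays.

    // Minimal sketch of a 4-stage Runge-Kutta time step (hypothetical
    // names and coefficients). residual() stands in for the
    // finite-volume central-difference right-hand side plus the
    // numerical dissipation terms.
    class Rk4Sketch {
        static final double[] ALPHA = {0.25, 1.0 / 3.0, 0.5, 1.0};

        // Stand-in residual: a simple central difference on a 1-D field.
        static double[] residual(double[] u) {
            double[] r = new double[u.length];
            for (int i = 1; i < u.length - 1; i++)
                r[i] = 0.5 * (u[i + 1] - u[i - 1]);
            return r;
        }

        // u^(k) = u^n - alpha_k * dt * R(u^(k-1)),  k = 1..4
        static void rk4Step(double[] u, double dt) {
            double[] u0 = u.clone();
            for (double alpha : ALPHA) {
                double[] r = residual(u);
                for (int i = 0; i < u.length; i++)
                    u[i] = u0[i] - alpha * dt * r[i];
            }
        }
    }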
- CFD in Fortran: For parallelism, the code uses many FORALL
statements. This construct asserts that the iterations carry no
dependencies, so the compiler can parallelize them safely. Also,
arrays were distributed along their longest dimension, which
corresponds to the lengthwise direction of the simulated geometry
(i.e. the nozzle). This kind of distribution allows any number of
processors to be used. Directives such as DISTRIBUTE and ALIGN can be
used for this sort of mapping. (A rough Java analogue of such a
dependency-free loop follows this item.)
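
HPF's FORALL has no direct Java counterpart, but the property it
asserts, namely that iterations are independent, can be sketched with
Java parallel streams (hypothetical names):

    import java.util.stream.IntStream;

    class ForallSketch {
        // Each iteration writes a distinct element of uNew and reads
        // only uOld, so there are no loop-carried dependencies and the
        // runtime is free to execute the iterations in parallel.
        static void update(double[] uOld, double[] uNew) {
            IntStream.range(1, uOld.length - 1).parallel()
                .forEach(i -> uNew[i] = 0.5 * (uOld[i - 1] + uOld[i + 1]));
        }
    }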
- CFD in Fortran: Communication is required especially in the
artificial dissipation routine, where second- and fourth-order
derivatives are computed. This communication and any load imbalance in
the computation are the two main obstacles to perfect scaling. (The
stencils behind this requirement are sketched after this item.)
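
The reason communication concentrates here: a fourth-order difference
at point i reads values two points to each side, so a block-distributed
array needs a two-cell halo from each neighboring processor before the
loop can run. A sketch with hypothetical names, in 1-D for brevity:

    // Second- and fourth-order difference stencils used by the
    // artificial dissipation (hypothetical 1-D version). The 4th-order
    // stencil reaches two cells to each side, hence the halo exchange.
    class DissipationSketch {
        static void differences(double[] u, double[] d2, double[] d4) {
            for (int i = 2; i < u.length - 2; i++) {
                d2[i] = u[i - 1] - 2 * u[i] + u[i + 1];
                d4[i] = u[i - 2] - 4 * u[i - 1] + 6 * u[i]
                        - 4 * u[i + 1] + u[i + 2];
            }
        }
    }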
- CFD in Fortran: Coding was simple, but optimizing based on
intermediate output was not easy. Debugging HPF code is not easy
either.
- CFD in Java: Uses the same numerical algorithm as the HPF
implementation. Coding was not a problem at all, provided the user is
familiar with the relevant aspects of the language. The
object-oriented paradigm fits well with this kind of engineering
problem, and the design lends itself to a degree of self-organization
(see the sketch after this item). Visualization can help not only in
understanding various properties of the algorithm, but also in
modifying and tuning classes, methods, and so on. Therefore, with some
help from the AWT library, the code can be optimized to do even
better.
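
Purely as an illustration of the object-oriented fit (the text does
not give the actual class layout), one natural decomposition might
look like this; all names here are hypothetical:

    // Hypothetical decomposition; names are illustrative only.
    class Grid { double[][] x, y; }                     // nozzle geometry
    class Flow { double[][] rho, rhoU, rhoV, energy; }  // conserved variables
    class Solver {
        Grid grid;
        Flow flow;
        void step(double dt) { /* one Runge-Kutta time step */ }
        void dissipation()   { /* 2nd/4th-order difference terms */ }
    }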
- CFD in Java: Parallelism was achieved by adding message passing
(MPI) to the sequential version. The big advantage of this approach is
that very few modifications to the original sequential version were
needed. In all fairness, we could probably say the same of HPF. (A
minimal skeleton follows this item.)
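
A minimal skeleton, assuming an mpiJava-style binding (class and
method names follow that binding and may differ in other Java MPI
libraries):

    import mpi.*;

    public class CfdMain {
        public static void main(String[] args) throws Exception {
            MPI.Init(args);                    // start message passing
            int rank = MPI.COMM_WORLD.Rank();  // this process's id
            int size = MPI.COMM_WORLD.Size();  // number of processes

            // ... each rank runs the (mostly unchanged) sequential
            // solver on its block of the grid; only the dissipation
            // methods exchange messages ...

            MPI.Finalize();
        }
    }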
- CFD in Java: Message passing was added to the methods computing
artificial dissipation, where the second- and fourth-order derivatives
are computed. Those methods are used continuously throughout the
simulation; because of that, I judged them to be the only methods in
the code that could benefit from parallelism. We confined our message
passing to simple send/receive and broadcast operations (sketched
below).
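
For instance, the halo exchange before each dissipation evaluation can
be written with plain send/receive calls; a global quantity such as
the time step would use MPI.COMM_WORLD.Bcast in the same style. A
sketch, again assuming an mpiJava-style API and a hypothetical local
array with two ghost cells at each end:

    import mpi.*;

    // Two-cell halo exchange along the distributed dimension.
    class HaloSketch {
        static void exchange(double[] u, int rank, int size)
                throws Exception {
            int n = u.length, tag = 0;
            if (rank > 0) {                    // talk to left neighbor
                MPI.COMM_WORLD.Send(u, 2, 2, MPI.DOUBLE, rank - 1, tag);
                MPI.COMM_WORLD.Recv(u, 0, 2, MPI.DOUBLE, rank - 1, tag);
            }
            if (rank < size - 1) {             // talk to right neighbor
                MPI.COMM_WORLD.Recv(u, n - 2, 2, MPI.DOUBLE, rank + 1, tag);
                MPI.COMM_WORLD.Send(u, n - 4, 2, MPI.DOUBLE, rank + 1, tag);
            }
        }
    }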
- CFD in Java: The parallel version benefited greatly from the
efficient and well-designed sequential version, in which inlining and
other tune-ups had already been done. The result was a robust parallel
version.
- CFD in Java: The inlining helped a great deal in speeding up
execution without any degradation in the quality of the solution.
Depending on the network and the compute nodes, we noticed that the
JavaMPI version ran close to, and sometimes even faster than, the HPF
version using the same number of processors. I noticed that it
definitely ran much faster on a dedicated network of nodes, but the
architecture of those nodes is different from that of the nodes used
for the HPF tests.
- CFD in Java: There is still room for improvement. In addition to
inlining, packaging the classes in a JAR file can greatly improve
download speed. Another option is thread pooling: creating a ready
supply of sleeping threads at the beginning of execution. Because the
thread startup process is expensive in terms of system resources,
thread pooling makes startup a little slower but improves runtime
performance, since sleeping (or suspended) threads are awakened only
when they are needed to perform new tasks. (A small sketch follows
this item.)
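
A minimal sketch of the idea using the java.util.concurrent API, which
post-dates this work; at the time the pool would have been managed by
hand, but the principle is the same:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    class PoolSketch {
        public static void main(String[] args)
                throws InterruptedException {
            // Pay the thread-creation cost once, up front.
            ExecutorService pool = Executors.newFixedThreadPool(4);

            // Idle workers wake only when a task arrives.
            for (int step = 0; step < 8; step++) {
                final int s = step;
                pool.submit(() -> System.out.println("task for step " + s));
            }

            pool.shutdown();                   // accept no new tasks
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }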
- CFD in Java: Method inlining, which I have already done, increases
performance by reducing the number of method calls the program makes
(see the small before/after example below). Other things that could be
done are to try streamlined synchronization, JIT compilation, and
other performance-enhancing approaches. These are all quite useful
tools to try, especially for large-scale applications.
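
A tiny before/after illustration of manual inlining (hypothetical
accessor; modern JIT compilers often perform this automatically):

    class InlineSketch {
        static class Cell {
            double density;
            double getDensity() { return density; }
        }

        // Before: one method call per loop iteration.
        static double sumBefore(Cell[] cells) {
            double s = 0.0;
            for (Cell c : cells) s += c.getDensity();
            return s;
        }

        // After manual inlining: the field is read directly, removing
        // the per-iteration call overhead.
        static double sumAfter(Cell[] cells) {
            double s = 0.0;
            for (Cell c : cells) s += c.density;
            return s;
        }
    }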
Saleh Elmohamed