Given by Jack Dongarra, Christian Deane, Keith Seymour, and Clint Whaley at the SC98 (Orlando) Java Grande Panel on November 13, 1998. Foils prepared December 6, 1998.
Summary of Material
Java Grande Forum Homepage |
SC98 Java Grande Panel Presentation |
Motivation and Implementation of Translation of LAPACK into Java |
Discussion of I/O, GOTO translation, and future work |
ATLAS automatic BLAS Optimization and extension to Java |
Jack Dongarra |
Christian Deane |
Keith Seymour |
Clint Whaley |
University of Tennessee |
Oak Ridge National Laboratory |
Provide well-known and reliable libraries |
Avoid re-writing numerical code |
Quick and reliable translation |
Performance |
Linear algebra library in Fortran 77 (binding to C)
|
Block algorithms
|
User interface provides similar calls in:
|
Used by vendors: HP-48G to Teraflop/s Machines |
Can only do arithmetic on data at the top of the hierarchy |
Higher level BLAS lets us do this |
Development of blocked algorithms important for performance |
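The point about blocked algorithms can be illustrated with a small matrix multiply. The following is only a sketch, not LAPACK or BLAS code: the class name, method names, and block size parameter `nb` are assumptions made for illustration. Matrices are square, stored column-major in 1D arrays, as Fortran would store them.

```java
// Illustrative sketch: C = C + A*B on n-by-n column-major matrices.
// Names and the block size parameter nb are hypothetical, not from LAPACK.
final class BlockedMultiplySketch {

    // Unblocked reference version; element (i, j) lives at index i + j*n.
    static void multiplyNaive(int n, double[] a, double[] b, double[] c) {
        for (int j = 0; j < n; j++) {
            for (int k = 0; k < n; k++) {
                double bkj = b[k + j * n];
                for (int i = 0; i < n; i++) {
                    c[i + j * n] += a[i + k * n] * bkj;
                }
            }
        }
    }

    // Blocked version: the three outer loops walk over nb-by-nb tiles, so the
    // data touched by the inner loops is small enough to be reused from cache.
    static void multiplyBlocked(int n, int nb, double[] a, double[] b, double[] c) {
        for (int jj = 0; jj < n; jj += nb) {
            int jEnd = Math.min(jj + nb, n);
            for (int kk = 0; kk < n; kk += nb) {
                int kEnd = Math.min(kk + nb, n);
                for (int ii = 0; ii < n; ii += nb) {
                    int iEnd = Math.min(ii + nb, n);
                    for (int j = jj; j < jEnd; j++) {
                        for (int k = kk; k < kEnd; k++) {
                            double bkj = b[k + j * n];
                            for (int i = ii; i < iEnd; i++) {
                                c[i + j * n] += a[i + k * n] * bkj;
                            }
                        }
                    }
                }
            }
        }
    }
}
```

Both methods compute the same result; the blocked one only changes the order in which memory is visited.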
600K lines of code |
extensive test package |
developed over 10 years with input from the linear algebra community |
state of the art methods |
Automatically translate to Java |
Outline of Project |
Phase 1: Write Fortran front end to lex and parse subset Fortran 77. |
Phase 2: Generate Java source and Jasmin assembly code for use with JVM. |
Phase 3: Test, document and distribute BLAS and LAPACK class files. |
Array Access/Argument Passing |
All arrays are declared as 1D and accessed with index arithmetic. |
Array indices must be passed separately as arguments, which changes the user interface. |
Primitives are passed in object wrappers to emulate pass-by-reference, but only when needed. |
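The three conventions above can be sketched in a few lines of Java. This is a hand-written illustration, not f2j output: the class `ArrayPassingSketch`, the wrapper `IntRef`, and the routine `columnSum` are hypothetical names chosen for the example.

```java
// Illustrative sketch of the calling conventions; all names are hypothetical.
final class ArrayPassingSketch {

    // Wrapper object standing in for a Fortran INTEGER passed by reference.
    static final class IntRef {
        int val;
        IntRef(int v) { val = v; }
    }

    // Column-major access: 0-based element (i, j) of a matrix with leading
    // dimension lda that starts at position offset of the 1D array.
    static double get(double[] a, int offset, int lda, int i, int j) {
        return a[offset + i + j * lda];
    }

    // Stands in for a Fortran routine called as CALL COLSUM(M, A(1,J), INFO):
    // the array and its starting offset travel as two separate arguments,
    // and INFO is written through the wrapper.
    static double columnSum(int m, double[] a, int aOffset, IntRef info) {
        if (m < 0) {
            info.val = -1;   // report an illegal argument, LAPACK style
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < m; i++) {
            sum += a[aOffset + i];
        }
        info.val = 0;
        return sum;
    }
}
```

To sum column `j` of a matrix with leading dimension `lda`, a caller would pass the offset `j * lda`, which is where the user interface differs from the Fortran original.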
GOTO Translation |
First Step |
Try to identify Fortran constructs containing GOTO statements that can be translated to equivalent Java constructs (which cannot contain a goto statement). |
Fortran source:
   10 CONTINUE
      IF (C .EQ. ONE) THEN
         A = 2 * A
         GO TO 10
      END IF
Translated Java:
      while (c == one)
      {
         a = 2 * a;
      }
GOTO Translation |
Second Step |
Remaining GOTO statements must be transformed at the bytecode level. |
Bytecode Transformer |
Use bytecode parsing code from javab (Indiana University). |
Provides efficient translation of GOTO statements. |
Input/Output |
Small subset of WRITE/FORMAT has been implemented -- just enough to allow translation of BLAS/LAPACK test routines. |
Some unformatted READ statements supported. |
File I/O not supported yet, but may be necessary for future testing. |
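As a rough idea of what translating a simple WRITE/FORMAT pair involves, the sketch below maps two Fortran edit descriptors onto Java format specifiers. It is an assumption about one possible mapping, not a description of how f2j implements I/O; the class and method names are hypothetical.

```java
// Illustrative sketch only; not the f2j implementation.
final class FormatSketch {

    // Approximates the Fortran statements
    //       WRITE(*, 100) N, RESID
    //   100 FORMAT(' N =', I3, '  RESID =', F8.3)
    // I3 maps to %3d (integer, width 3) and F8.3 to %8.3f (width 8, 3 decimals).
    // Locale.ROOT keeps the decimal point independent of the user's locale.
    static String formatLine(int n, double resid) {
        return String.format(java.util.Locale.ROOT, " N =%3d  RESID =%8.3f", n, resid);
    }

    public static void main(String[] args) {
        System.out.println(formatLine(5, 0.125));
    }
}
```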
Current Status of Project |
f2j: formal compiler of Fortran 77 subset sufficient for BLAS, LAPACK and other numerical libraries. |
Most I/O statements are not fully supported. |
Double precision BLAS levels 1, 2, and 3 successfully tested. |
Double precision LAPACK routines successfully tested. |
released: May 22, 1998 |
Future of Fortran-to-Java Project |
Extend to wider subsets of Fortran and translate more numerical libraries. |
Support for Complex data type. |
Provide a large reliable Java numerical software repository. |
Focus on optimization |
Today's processors can achieve high performance, but this requires extensive machine-specific hand tuning. |
Routines have a large design space with many parameters
|
A few months ago there was no tuned BLAS for the Pentium under Linux. |
Need for quick deployment of optimized routines. |
ATLAS - Automatic Tuned Linear Algebra Software |
A package that adapts itself to differing architectures via code generation coupled with timing
|
Package contains:
|
Currently provided:
|
Tuned BLAS require many man-hours per platform
|
Some operations may be important but not general enough for a standard |
Allows for portably optimal codes |
Pentiums running Linux |
Code is iteratively generated & timed until optimal case is found. We try:
|
Cache based multiply optimizes for:
|
Takes a couple of hours to run. |
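The generate-and-time loop can be sketched in miniature. The code below is not ATLAS: it tunes a single parameter (the block size of a simple blocked multiply) by timing each candidate and keeping the fastest, whereas ATLAS searches a much larger space of generated code. All names and the candidate list are assumptions for illustration.

```java
// Illustrative sketch of empirical tuning; names are hypothetical, not ATLAS.
final class EmpiricalTuningSketch {

    // Blocked C = C + A*B on n-by-n column-major matrices with tile size nb.
    static void multiplyBlocked(int n, int nb, double[] a, double[] b, double[] c) {
        for (int jj = 0; jj < n; jj += nb) {
            int jEnd = Math.min(jj + nb, n);
            for (int kk = 0; kk < n; kk += nb) {
                int kEnd = Math.min(kk + nb, n);
                for (int ii = 0; ii < n; ii += nb) {
                    int iEnd = Math.min(ii + nb, n);
                    for (int j = jj; j < jEnd; j++) {
                        for (int k = kk; k < kEnd; k++) {
                            double bkj = b[k + j * n];
                            for (int i = ii; i < iEnd; i++) {
                                c[i + j * n] += a[i + k * n] * bkj;
                            }
                        }
                    }
                }
            }
        }
    }

    // Best-of-reps wall-clock time, in nanoseconds, for one candidate.
    static long timeCandidate(int n, int nb, int reps) {
        double[] a = new double[n * n];
        double[] b = new double[n * n];
        double[] c = new double[n * n];
        for (int i = 0; i < n * n; i++) { a[i] = 1.0; b[i] = 1.0; }
        long best = Long.MAX_VALUE;
        for (int r = 0; r < reps; r++) {
            long start = System.nanoTime();
            multiplyBlocked(n, nb, a, b, c);
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    // Try every candidate and keep the one that ran fastest on this machine.
    static int pickBlockSize(int n, int[] candidates, int reps) {
        int bestNb = candidates[0];
        long bestTime = Long.MAX_VALUE;
        for (int nb : candidates) {
            long t = timeCandidate(n, nb, reps);
            if (t < bestTime) { bestTime = t; bestNb = nb; }
        }
        return bestNb;
    }

    public static void main(String[] args) {
        int nb = pickBlockSize(256, new int[] {8, 16, 32, 64}, 3);
        System.out.println("fastest block size on this machine: " + nb);
    }
}
```

The chosen value differs from machine to machine and from run to run, which is the reason the search is done empirically at install time.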
Keep a repository of kernels for specific machines. |
Develop a means of dynamically downloading code |
Extend work to allow sparse matrix operations |
Extend work to include arbitrary code segments |
See: http://www.netlib.org/atlas/ |
1. The Java calling program (jdmmtst.java) |
2. The Java loadLibrary class (CWRAP.java) |
3. The Java loadLibrary header file (CWRAP.h) |
4. The C functions (C wrappers to call ATLAS and the ATLAS C software library). |
5. The shared object (libCWRAP.so) which contains executable binaries of the C wrapper functions and all the ATLAS functions. |
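The Java side of the five pieces listed above might look roughly like the sketch below. The library name `CWRAP` comes from libCWRAP.so in the list; everything else (the class name `CWrapSketch`, the native method `nativeDgemm` and its signature, the pure-Java fallback) is an assumption for illustration and not the contents of the original CWRAP.java.

```java
// Illustrative sketch of a JNI loadLibrary class; names other than the
// library name "CWRAP" are hypothetical.
final class CWrapSketch {

    static final boolean NATIVE_AVAILABLE;

    static {
        boolean ok;
        try {
            System.loadLibrary("CWRAP");   // resolves to libCWRAP.so on Linux
            ok = true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            ok = false;                    // library not installed on this machine
        }
        NATIVE_AVAILABLE = ok;
    }

    // Hypothetical native entry point that a C wrapper would implement and
    // forward to the ATLAS matrix multiply.
    static native void nativeDgemm(int m, int n, int k, double[] a, double[] b, double[] c);

    // C = A*B with A m-by-k, B k-by-n, C m-by-n, all column-major.
    // Uses the native routine when it is available, plain Java otherwise.
    static void dgemm(int m, int n, int k, double[] a, double[] b, double[] c) {
        if (NATIVE_AVAILABLE) {
            try {
                nativeDgemm(m, n, k, a, b, c);
                return;
            } catch (UnsatisfiedLinkError e) {
                // library loaded but symbol missing: fall through to Java
            }
        }
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < m; i++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++) {
                    sum += a[i + p * m] * b[p + j * k];
                }
                c[i + j * m] = sum;
            }
        }
    }
}
```

The C header for such a class is generated from its native method declarations (by `javah` in JDKs of that period), and the C wrapper functions are then compiled and linked with ATLAS into the shared object.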
Extend these ideas to Java directly |
References: |
http://www.netlib.org/ |
http://www.netlib.org/atlas/ |
http://www.cs.utk.edu/f2j/ |
http://www.netlib.org/utk/people/JackDongarra/ |