Subject: C429 JGSI Review
Resent-Date: Thu, 30 Sep 1999 23:19:29 -0400
Resent-From: Geoffrey Fox <gcf@npac.syr.edu>
Resent-To: p_gcf@npac.syr.edu
Date: Mon, 20 Sep 1999 11:45:25 -0400 (EDT)
From: Bill Pugh <pugh@cs.umd.edu>
To: gcf@npac.syr.edu

Paper: C429
Title: Performance Limitations of the Java Core Libraries
Authors: Allan Heydon and Marc Najork

Overall recommendation: interesting and worth reading, but frustrating
as well. Weak recommendation for publication.

I found this article interesting but frustrating. While the authors have
made some interesting observations, they fail to provide important details.

For example, in Section 2.1, the authors note that they have a tool that
can give detailed information about the number of synchronization operations
performed by a Java application. Then they talk about all of the
transformations they performed to reduce the number of synchronization
operations. But they never actually report the number of synchronization
operations before and after those transformations.

Similarly, it is unclear which of the issues raised in the paper are
significant, and which, if any, are insignificant.

I suspect that the reason for this is that they performed the optimizations
as Mercator evolved, and they don't have two versions of Mercator
that they can compare (e.g., versions with the same functionality, but with
and without the changes described in this paper). This would be most
unfortunate; the paper would be much stronger with this information.

Also, JVMs vary greatly in how efficiently they implement synchronization.
In many of the early JVMs, synchronization was very expensive. In more
modern JVMs, it is relatively cheap, particularly in the uncontended
case. How efficient is synchronization on the JVM the authors used?
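
One rough way to answer that question is to time an uncontended
synchronized call against an unsynchronized one on the JVM in question.
The sketch below is illustrative only: the class and method names are
hypothetical, it uses System.nanoTime (which postdates the JDKs discussed
here), and a modern JIT may elide or cheapen the lock, so the numbers are
indicative at best.

```java
public class SyncCost {
    private long plain;
    private long locked;

    void incPlain() { plain++; }

    // Same work as incPlain, but acquires this object's monitor on each call.
    synchronized void incLocked() { locked++; }

    // Returns {plain nanos, locked nanos, plain count, locked count}.
    // Only one thread ever touches the object, so the lock is uncontended.
    static long[] measure(int n) {
        SyncCost c = new SyncCost();
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) c.incPlain();
        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) c.incLocked();
        long t2 = System.nanoTime();
        return new long[] { t1 - t0, t2 - t1, c.plain, c.locked };
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        measure(n); // warm-up pass so both loops get compiled
        long[] r = measure(n);
        System.out.println("unsynchronized: " + (double) r[0] / n + " ns/call");
        System.out.println("synchronized:   " + (double) r[1] / n + " ns/call");
    }
}
```

A per-call figure of this kind, reported for the JVM actually used, would
let readers judge how much each removed synchronization operation is worth.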

It isn't clear which of the performance issues were fixed in 1.2.2, besides
the InetAddress.getByName cache lock. It isn't clear to me that performance
bugs which have been fixed are interesting at all. I would be happy to
see discussion of all bugs fixed in 1.2.2 removed from the paper.

It really is unclear to me if the authors are performing
premature/unnecessary optimizations in some places. For example, in Section
5, the authors complain that the design of the java.net library causes
hundreds of allocations per second. On a 533 MHz Alpha, this is a problem?
Maybe so, but my gut impression is that this would be in the noise. If
the authors provided information on the performance impact of these
allocations, I might be convinced otherwise.

Some of the "inefficiencies" the authors describe are there for a reason.
The authors complain that some of the String constructors are inefficient;
that is just wrong. Some of them perform more computationally demanding
tasks (e.g., converting from a UTF8 encoding to Unicode) than others, and
that is not an inefficiency. Now, if the problem is that going through the
constructor that takes an encoder, and providing an encoder that simply
uses a zero for the high byte, is much slower than using the constructor
that doesn't take an encoder, then that is an efficiency problem.
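
The comparison I have in mind could be sketched as follows. This is only
an illustration under stated assumptions: it uses ISO-8859-1 as the
"encoder" that yields a zero high byte, it uses the StandardCharsets class
from later Java releases, and the class and method names are hypothetical.
The two paths produce identical strings, so any timing gap between them is
pure overhead.

```java
import java.nio.charset.StandardCharsets;

public class StringCtorCompare {
    // Deprecated constructor: each byte becomes a char with high byte 0.
    @SuppressWarnings("deprecation")
    static String viaHighByte(byte[] b) {
        return new String(b, 0);
    }

    // Decoder-based constructor; ISO-8859-1 maps each byte to the same char.
    static String viaDecoder(byte[] b) {
        return new String(b, StandardCharsets.ISO_8859_1);
    }

    // Times reps conversions; the length sum keeps the work from being dropped.
    static long time(boolean useDecoder, byte[] data, int reps) {
        long sink = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            String s = useDecoder ? viaDecoder(data) : viaHighByte(data);
            sink += s.length();
        }
        long elapsed = System.nanoTime() - t0;
        if (sink != (long) reps * data.length) throw new IllegalStateException();
        return elapsed;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1024];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        time(false, data, 100_000); // warm-up
        time(true, data, 100_000);  // warm-up
        System.out.println("high-byte ctor: " + time(false, data, 100_000) + " ns");
        System.out.println("decoder ctor:   " + time(true, data, 100_000) + " ns");
    }
}
```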

Similarly, if StringBuffer were made unsynchronized, then converting a
StringBuffer into a String would have to copy the character array.
Otherwise, unsynchronized access might allow a String to be mutated after
it is created, which would cause all kinds of security problems.
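
To make the hazard concrete, here is a minimal sketch. Text is a
hypothetical stand-in for String, not the real class, and the late write
is done on one thread rather than through an actual race; the point is
only what a shared array permits and what the defensive copy prevents.

```java
import java.util.Arrays;

public class SharedArrayHazard {
    // Hypothetical stand-in for String: meant to be immutable.
    static final class Text {
        private final char[] chars;

        private Text(char[] chars) { this.chars = chars; }

        // Shares the caller's array: cheap, but the caller can still write to it.
        static Text sharing(char[] chars) { return new Text(chars); }

        // Takes a private copy: costs an allocation, but is truly immutable.
        static Text copying(char[] chars) {
            return new Text(Arrays.copyOf(chars, chars.length));
        }

        @Override public String toString() { return new String(chars); }
    }

    public static void main(String[] args) {
        char[] buf = { 's', 'a', 'f', 'e' };
        Text shared = Text.sharing(buf);
        Text copied = Text.copying(buf);

        buf[0] = 'c'; // a write that lands after the "immutable" values exist

        System.out.println(shared); // prints "cafe": the value changed
        System.out.println(copied); // prints "safe": the copy is unaffected
    }
}
```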

And again, if FileInputStream, FileOutputStream and RandomAccessFile
publicly supported reopen methods, it would have serious security
implications (passing a FileOutputStream to a method would allow that
stream to be reassociated with a different file).
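
A small sketch of that concern follows. ReopenableOutput and its reopen
method are hypothetical (no such method exists on the java.io classes),
and in-memory buffers stand in for files so the example is self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class ReopenHazard {
    // Hypothetical stream with a public reopen method.
    static final class ReopenableOutput extends OutputStream {
        private OutputStream target;

        ReopenableOutput(OutputStream target) { this.target = target; }

        // Anyone holding the stream can point it somewhere else.
        public void reopen(OutputStream newTarget) { this.target = newTarget; }

        @Override public void write(int b) throws IOException { target.write(b); }
    }

    // Stands in for code the caller does not control.
    static void untrustedCallee(ReopenableOutput out, OutputStream elsewhere) {
        out.reopen(elsewhere);
    }

    // Returns {contents of intended target, contents of the other target}.
    static String[] demo() {
        ByteArrayOutputStream intended = new ByteArrayOutputStream();
        ByteArrayOutputStream other = new ByteArrayOutputStream();
        ReopenableOutput out = new ReopenableOutput(intended);
        untrustedCallee(out, other);
        try {
            out.write("entry".getBytes()); // caller believes this goes to 'intended'
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String[] { intended.toString(), other.toString() };
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println("intended: [" + r[0] + "]"); // prints "intended: []"
        System.out.println("other:    [" + r[1] + "]"); // prints "other:    [entry]"
    }
}
```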