Subject: C429 JGSI Review
Resent-Date: Thu, 30 Sep 1999 23:19:29 -0400
Resent-From: Geoffrey Fox
Resent-To: p_gcf@npac.syr.edu
Date: Mon, 20 Sep 1999 11:45:25 -0400 (EDT)
From: Bill Pugh
To: gcf@npac.syr.edu

Paper: C429
Title: Performance Limitations of the Java Core Libraries
Authors: Allan Heydon and Marc Najork

Overall recommendation: interesting and worth reading, but frustrating as well. Weak recommendation for publication.

I found this article interesting but frustrating. While the authors have made some interesting observations, they fail to provide important details. For example, in Section 2.1, the authors note that they have a tool that can give detailed information about the number of synchronization operations performed by a Java application. Then they talk about all of the transformations they performed to reduce the number of synchronization operations. But they don't actually tell us the before and after numbers of synchronization operations. Similarly, it is unclear which of the issues raised in the paper are significant, and which, if any, are insignificant.

I suspect that the reason for this is that they performed the optimizations as Mercator evolved, and they don't have two versions of Mercator that they can compare (e.g., versions with the same functionality, but with and without the changes described in this paper). This would be most unfortunate; the paper would be much stronger with this information.

Also, JVMs vary greatly in how efficiently they implement synchronization. In many of the early JVMs, synchronization was very expensive. In more modern JVMs, it is relatively cheap, particularly for the uncontested cases. How efficient is synchronization on the JVM the authors used?

It isn't clear which of the performance issues were fixed in 1.2.2, besides the InetAddress.getByName cache lock. It isn't clear to me that performance bugs which have been fixed are interesting at all.
I would be happy to see discussion of all bugs fixed in 1.2.2 removed from the paper.

It really is unclear to me whether the authors are performing premature or unnecessary optimizations in some places. For example, in Section 5, the authors complain that the design of the java.net library causes hundreds of allocations per second. On a 533 MHz Alpha, this is a problem? Maybe so, but my gut impression is that this would be in the noise. If the authors provided information on the performance impact of these allocations, I might be convinced otherwise.

Some of the "inefficiencies" the authors describe are there for a reason. The authors complain that some of the String constructors are inefficient: that is just wrong. Some of them perform more computationally demanding tasks (e.g., converting from a UTF-8 encoding to Unicode) than others, and doing more work is not an "inefficiency". Now, if the problem is that going through the constructor that takes an encoder, and supplying an encoder that simply uses zero for the high byte, is much slower than using the constructor that doesn't take an encoder, then that is an efficiency problem.

Similarly, if StringBuffer were to be made unsynchronized, then when a StringBuffer is converted into a String, you would have to copy the character array (otherwise, unsynchronized access might allow you to mutate a String, which would cause all kinds of security problems).

And again, if FileInputStream, FileOutputStream and RandomAccessFile publicly supported reopen methods, it would have serious security implications (passing a FileOutputStream to a method would allow that stream to be reassociated with a different file).
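
The StringBuffer point above can be made concrete with a small sketch. It is a toy model, not the JDK's code: ToyBuffer, FrozenText, prepareWrite, freezeShared and freezeCopied are hypothetical names, and the model assumes a buffer that hands its character array to an immutable wrapper and relies on a "shared" flag to copy before the next write. The sketch replays, in a single thread, the interleaving that an unsynchronized buffer would permit: the writer checks the flag, another thread freezes the buffer, and then the writer completes its store.

```java
// Toy model only; these classes are hypothetical and are NOT java.lang code.
class SharingSketch {

    // Stands in for an immutable string that trusts the array it is given.
    static final class FrozenText {
        private final char[] chars;
        FrozenText(char[] chars) { this.chars = chars; } // shares, no copy
        @Override public String toString() { return new String(chars); }
    }

    // Stands in for an unsynchronized, copy-on-write character buffer.
    static final class ToyBuffer {
        private char[] value;
        private boolean shared; // true once 'value' was handed out

        ToyBuffer(String s) { value = s.toCharArray(); }

        // First half of a write: copy if shared, then return the target array.
        // Without a lock, another thread may run between this and the store.
        char[] prepareWrite() {
            if (shared) { value = value.clone(); shared = false; }
            return value;
        }

        // Cheap conversion: hand out the same array and mark it shared.
        FrozenText freezeShared() { shared = true; return new FrozenText(value); }

        // Conversion that is safe without locking: always copy.
        FrozenText freezeCopied() { return new FrozenText(value.clone()); }
    }

    // Replays the racy interleaving with the shared-array conversion.
    static String raceReplay() {
        ToyBuffer buf = new ToyBuffer("safe");
        char[] target = buf.prepareWrite();   // writer sees shared == false
        FrozenText text = buf.freezeShared(); // "other thread" freezes now
        target[0] = 'X';                      // writer finishes its store
        return text.toString();               // the "immutable" text changed
    }

    // Same interleaving, but the conversion copies the array.
    static String copyReplay() {
        ToyBuffer buf = new ToyBuffer("safe");
        char[] target = buf.prepareWrite();
        FrozenText text = buf.freezeCopied();
        target[0] = 'X';
        return text.toString();               // unaffected by the late store
    }

    public static void main(String[] args) {
        System.out.println(raceReplay()); // prints "Xafe"
        System.out.println(copyReplay()); // prints "safe"
    }
}
```

In this model, the only ways to keep the frozen text immutable are to make the flag check and the store atomic (which is what synchronization buys) or to copy the array on every conversion, which is the cost the review says an unsynchronized StringBuffer would have to pay.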