Subject: C441 JGSI Review Resent-Date: Thu, 30 Sep 1999 23:15:04 -0400 Resent-From: Geoffrey Fox Resent-To: p_gcf@npac.syr.edu Date: Fri, 10 Sep 1999 07:57:33 +0200 From: Michael Philippsen To: Geoffrey Fox CC: phlipp@ira.uka.de Title: Design of the Kan Distributed Object System Authors:Jerry James and Ambuj Singh a)Overall Recommendation ------------------------ REJECT b)Words suitable for authors ---------------------------- The paper is very unfocussed and much too long. It is hard to see what the contributions of the paper are. I can see three possible contributions: a) If the intended contribution is "Kan, the language" the authors should provide an exact spec of the new language features and then argue that exactly those features are required to achieve the goals mentioned in the abstract. Otherwise it is *just another concurrent object oriented language*. From my point of view, the asynchronous method invocation and the futures are well used in other concurrent object oriented languages, the only interesting aspect are the nested transactions. I would like to see a paper on the nested transactions. Please give us an exact spec of the mechanism, give technical details of its transformation into regular Java and provide performance benchmarks. Such a paper should take 10-12 pages. b) If the intended contribution is "how to map Kan features onto regular Java", please provide us with an exact spec of the language mechanisms and show their transformation into Java. Provide performance benchmarks. Such a paper should take 10-12 pages. c) If the intended contribution is "Optimization of Kan mechanisms", it is hard to see what is new in the paper. It is well known that thread pools are a good idea. Thread inlining is extensively used for example in Orca, the positive effects on performance have been discussed. Finally, swizzling is common knowledge - the next release of Sun's RMI will have shortcuts for local access. The key idea of the global optimization mentioned in the paper is that "migration is worthwhile if there is not much contention". This is trivial. Finally, it is known that reflection is a costly mechanism, RMI's skeleton-mechanisms avoids costly reflection. Finally, Kan's performance numbers are too slow to be interesting. Jaguar, JaVIA and KaRMI achieve much better performance (micro-seconds). I think that a paper on the optimization ideas would only be acceptable if all the effects are discussed on real applications instead of synthetic benchmarks. Minor comments: --------------- - abstract: "The model constructs help the programmer avoid common concurrent programming errors, allowing clean expressions of concurrent algorithms." There is no proof for that claim in the paper. - provide citations for Java only once. - section 2 should define the language constructs *exactly*. I'll discuss the problems for the asynchronous call; similar comments apply to the nested transactions as well. Assume that the reader is familiar with the concept of futures. An open question is how exceptions are handled in futures? the Java programmer is used to declare all exceptions that might be thrown. If a general future object is used to capture an exception, a general java.lang.Exception needs to be re-thrown upon resultInt() (for example). This is a significant disadvantage compared to Java's synchronous method invocations. Moreover, since the futures do not have a concrete primitive type, I don't understand how type conflicts are handled. Assume a situation where a float value is stored to the future and an int is retrieved. - The readers know that guards are a neat idea. But there is no reason to re-invent them in Kan. Furthermore, the reader knows that it is more difficult to get multi-threaded code right than single threaded. For a Java-version of this knowledge see Dough Lea's book. Hence, I was bored by Fig 5 and its explanation. - For the nested transactions it is left open, how the "associated object" can be determined. And - as a related matter - whether guards can appear in static code. For example, consider Fig 3: which is the "associated object"? Is there any compiler support to make sure that guards are "read-only" methods? How does Kan implement guards with Java's basic mechanisms? Is there an efficient way to do it? - last paragraph of 2.2 is boring. last paragraph of 2.3.0 is repeated in the application section. - The authors cite the Brose papers repeatedly. One of the main points in these papers is that mixing of pass-by-reference and pass-by-copy is unwise for transparent distributed programming. Unfortunately, I suspect (could not determine it for sure from the text) that some objects are passed by copy in Kan as well. Why? (For example, what does *may* mean in "Note that if such references are shared, the object may be copied.") - Section 3.1 can be condensed into 3 lines: Kan uses sockets + serialization + reflection (instead of skeleton objects) + explicit messages handled by thread pool. - Section 3.2 can be condensed into 3 lines: Kan uses global object IDs, replication and lock objects. There is no distributed GC. An important problem that seems unaddressed in Kan is the thread ID. If threads migrate from machine to machine, their ID changes (it is one of the threads from the thread pool that takes on a method invocation) Therefore, Java's "synchronized"-mechanism does no longer work for nested invocations in such an environment unless special care is taken. What does Kan do? - section 4.1: nothing new in here - section 4.2: Kan is too slow to be interesting. It is hard to believe that a non-RMI implementation with explicit message passing is as slow as full RMI itself (including, distributed GC, dynamic class loading, etc.). Since KaRMI does a lot better than RMI, Kan's communication layer must be suboptimal. How much more will Kan's performance suffer when a distributed GC will be added in future? - section 4.4.3: dynamic type detection scheme would be interesting. - section 4.5: expand! although announced, I don't see the discussion of "several" applications. - section 5: instead of a long list of individual systems try to focus on some, for example the Java-relatives. Then provide a clear analysis of the features that distinguish Kan from them. Some minor bugs: ---------------- - insert dot on p5, 2nd paragraph: "is issued) If the programmer" - section 2.1, paragraph 3: why is Figure 3 mentioned? - section 4.1.4, 2nd paragraph: "The table in Table 4" --> "Table 4" - Figure 7 and its discussion is too detailed. I suggest to remove it completely from the paper. - end of p20: it is more than 2 orders of magnitude. 4? - Figure 9 is quite boring since there is nothing unexpected. - Figure 17 is boring: the reader needs to know the times for a single access and can then imagine the graph. - Figure 18: it is obvious that the time for a single access (see Fig 17) remains constant. Why plot? Why plot with different scale? - last line of section 5.1.3 "creation nd manipulation" --> "... and ..." - JavaParty's solution *is* portable across JVMs. There is no native code in JavaParty. JavaParty's serialization and RMI are plug-in replacements. - 3rd line of section 5.2.2: "serializble" --> "serializable" References: ----------- [10] Java instead of java [27] if this paper is accepted for the special issue, you should refer to an updated version of Herlihy's paper in the same issue. same for [42] [31] Springer --> Springer Verlag.