I enclose 3 referee reports on your paper. We would be pleased to accept it; could you please send me a new version before November 5, 99? Please send a memo describing any suggestions of the referees that you did not address. Ignore any aggressive remarks you don't think appropriate, but please tell me. I trust you! Thank you for your help in writing and refereeing papers!

Referee 1 ************************************************
Subject: C432 JGSI Review

C432: A Mobile Agent Based Push Methodology for Global Parallel Computing

a) Overall Recommendation
======================
This paper is not very well written, though the idea is interesting. There is a big background section, but the details on the actual system are rather sketchy.

b) Words suitable for authors
==========================
1. The `true' benefits of TRAVELLER are not clear -- if you are using a shared memory idea, why is this more interesting than other approaches such as JINI?
2. On page 2, last paragraph (section 1), you mention that Java relies on a `high performance' execution model. I thought one of the limitations of Java was its execution performance. I am not sure what you mean by this.
3. page 4, section 2.1, paragraph 4: change `http' to `HTTP'.
4. page 4, last paragraph, section 2.1: insert `to be' between `demonstrated' and `simple' (4th line from end of page).
5. page 4, last line: I am not sure what is meant by `... approach will activate the tasks'.
6. I do not agree with the second line on page 6. Voyager, Aglets, etc. all enable the creation of multithreaded agents.
7. Section 3.1, page 7, last line of paragraph 1, mentions that `Brokers are organised in a hierarchical way', but it does not mention how this is achieved. Agent naming is a big problem, and the paper does not clarify how this is achieved in this particular system. Details about what the broker does, how it compares server statistics, how it performs load balancing, etc. are not clear.
8. I am not clear why a complete AgentTask is sent to the broker first. Does this not make the broker a bottleneck? In figure (1) on page 7, this is illustrated but not commented upon in the text. Why not just send an AgentTask reference to the broker instead?
9. The last 2 lines on page 7 identify that there is no standard way of characterising workload; however, the authors do not identify how they do it in their work. How does the broker make a decision about where to place an AgentTask?
10. page 9, section 4.1, second last line: change `the' to `this' and `reasons' to `reason'.
11. On page 10, the migration of an AgentTask is identified, but what happens to the data associated with this operation? Data migration is not mentioned or clearly identified. Does the broker also keep track of where the data is when new servers are identified and an AgentTask needs to be migrated? Similarly, the overheads of achieving the actual migration are not commented upon.
12. Figure (2) on page 10 does not agree with the text immediately following the figure. The numbers on the figure do not agree with what is described in the text.
13. page 10, last line, change: `plase be refered to' to `can be found in'
14. page 11, 3rd paragraph, last line, change: `invalidationand' to `invalidation and'
15. page 11, first bullet point: change `addressing' to `address' and `... threads which dynamically spawned on demands' to `... threads which are dynamically spawned on demand'
16. page 11, third bullet point: `help' to `helps'
17. page 12, section 5.1: incorrect typewriter font for second character (AgentTask)
18. page 12: in the code segment, the `grain' parameter is briefly described as identifying the `granularity of coherence for data replication'. I am not sure what this means. Perhaps an example to illustrate what you mean by this would be useful.
19. page 13, first line: how does the broker identify how many threads to create? This is not clear from the code sample provided immediately below this.
Perhaps more text is needed to explain the code?
20. page 14, second paragraph: delete one `on'
21. page 14, figure 4: the y-axis label is skewed
22. page 15, third line: what do you mean by a `submission'? Is this a single AgentTask or multiple ones?
23. page 15, section 6.2, 3rd paragraph, last line -- not clear what is meant by this. How does the DSA's caching influence performance exactly?
24. The examination of costs is divided into either RMI serialization (for complex objects) or DSA. Can you think of other overheads? What about migrating an AgentTask and its associated data? Are there any management decisions that need to be made for this other than RMI-based ones?
25. page 17: I do not see the method proposed here as `novel'. There is no mention of the `Contract Net' or other such protocols that have been used for load balancing. There is a great deal of work in this area in the network management literature; you may be interested in exploring this further. A good reference is:
http://www.irisa.fr/solidor/work/astrolog.html
You can find papers on the Contract Net and extensions at:
http://www.cs.wustl.edu/~mas/publications.html
Additional papers are at:
http://www.cs.umbc.edu/agents/papers/
http://www.cs.cf.ac.uk/User/O.F.Rana/agents99/

c) Further Remarks
===================
There are 5.5 pages of introduction -- perhaps this should be reduced? The TRAVELLER system should be described in more detail. The application examples should perhaps be extended to describe what TRAVELLER is doing in these instances.

Referee 2 *****************************************************************
Subject: C432 JGSI Review

>a)Overall Recommendation
Interesting work technically, but the presentation needs to be overhauled. Publish only if they change the emphasis of their presentation to more clearly portray what their contribution is and is not.
>b)Words suitable for authors
In the abstract, the question arises as to whether compute servers really have network connectivity that is that unreliable.
In the first bullet item on page 3, the statement regarding the dependence on lifetime reliable network connections isn't true for applications distributed across different administrative domains. There are numerous projects that have used global computing architectures, such as DSP for SETI, DES cracking, and others where work is parceled out and then connectivity isn't an issue for long periods of time. The second bullet item also doesn't address this class of global computing applications.
In the next to last paragraph of section one, you fail to mention that one of the big problems for certain classes of applications is data movement. Only problems that don't require large amounts of data to be moved (such as the ones above) are appropriate for global distribution. I wouldn't necessarily consider an ATM network that connects computer-servers directly to be particularly global---at least not yet. It would also be useful here to mention that DSA provides a solution to data access, at least in the sort of network architecture that you have experimented with.
In the first paragraph of section 2.2, change "Mobile agents have their" to "Mobile agents have as their".
In the paragraph before the second code example on page 6, change "directory for agent" to "directory for an agent".
In the next to last paragraph of page 7, you should make it clear that DSAs are used for application data as well as synchronization.
In the third paragraph on page 11, mention what the performance impact is.
In the first bullet item on page 11, change "spawned on demands" to "spawns on demand".
In the second bullet item on page 11, change "facilities" to "facilitates".
In the first sentence of section 5.1, make "AgentTask" all one font.
The code examples on page 13 should use the same font as the other code examples in the paper.
In the paragraph between the first and second code examples on page 13, "TaskAgent" should be in the same font as other code fragments.
In the first paragraph of section 6, make the verb tense consistent.
In figure 4, correct the label on the y-axis to read either vertically or horizontally, not both.
In the last paragraph of page 15, change "critical" to "critically".
On page 16, I have the following comment. In some sense, what you're doing is implementing a distributed VM using Agent/Broker/server technology for applications that are appropriately decomposable. The ability to decompose the application is measured by how effectively your DSA works for the application.
In section 7, the text beginning "Traveler provides an agent wrapper" is good and should go earlier, perhaps in your abstract or introduction.

>c)Words for me if necessary

Referee 3 ************************************************************
Subject: C432 JGSI Review

a) publish

b) This is a nicely written paper, with a rich bibliography setting the paper well in the context of related research. The general idea of agent based computations, although not entirely new, and thanks to Java more feasible than ever, is very interesting and worth pursuing. The authors describe a pilot implementation of such a system, and present some benchmark results based on two simple applications. The results are very encouraging.

I have two general comments on the paper (which do not affect the high quality of the paper).

1. We need a better understanding of what "global computing" actually means. We do not want to make a false promise of free CPU from the Internet. What is the mechanism to participate in the computation, for example, providing resources? I know, the technology is (almost) there: SSL, Kerberos. Globus faces the same problem. Potential users (site administrators) ask the question: "how I will be credited for CPU I provide (while making resources available to the Grid)?"
What are the authentication/authorization policies? How to protect the CPU provider against abuse? The technology is there, but what is the model of use? I realize that the authors concentrated on providing a proof of concept implementation. That's fine. But in the future, these questions will have to be answered. Note that the pull model is much easier to understand (as far as security is concerned). Also, agent based computation may become very useful for "private" clusters or "intranets", where security is not an issue. So I would tend to see "global computing", for practical purposes, as intranet computing.

2. What is the cost of retargeting a legacy application to the TRAVELLER environment? It is my understanding that it requires rewriting the code. And even though the net number of new lines to be inserted is really small, it is a major effort to examine the algorithm of the application in hand to make it right. I believe that this model is better suited for coarse-grain, task-parallel distributed computation that shares data only through input and output.

There is one misspelling (p. 8, UnicastRemoeObject should be UnicastRemoteObject) and a minor flaw in Fig. 4 (msec). Figs. 2 and 3, I guess originally in color, can be slightly improved: the blob representing the client is so dark that it is virtually impossible to read its label.

c) none