Replied: Mon, 17 Feb 2003 10:34:12 -0500
From: Hermann Hellwagner
Return-Path: hellwagn@itec.uni-klu.ac.at
Date: Mon, 17 Feb 2003 15:00:25 +0100
To: gcf@indiana.edu
Subject: Re: Request to referee C663: A PC Cluster System Employing the IEEE 1394

Geoffrey,

Please find enclosed my review on paper C663, in HTML form since I downloaded the HTML form from the Website given in your request mail. If you have any questions on the review, please do not hesitate to contact me.

Best regards
Hermann

---
Geoffrey Fox wrote:
> Tony Hey suggested that you might be able to provide a referee report
> Thank you for considering this
> Geoffrey Fox
> ----------------
>
> I thought you might be able to provide me a referee report by February
> 14 2003 on the paper
>
> C663: A PC Cluster System Employing the IEEE 1394

Attachment: CC-PE - Referee Report for Paper C663.html

REFEREE'S REPORT

Concurrency and Computation: Practice and Experience


A: General Information

Please return to:
Geoffrey C. Fox
Electronically Preferred gcf@indiana.edu
Concurrency and Computation: Practice and Experience
Computer Science Department
228 Lindley Hall
Bloomington
Indiana 47405
Office Phone 8128567977(Lab), 8128553788(CS) but best is cell phone 3152546387


Fax: 8128567972

Please fill in Summary Conclusions (Sec. C) and details as appropriate in Secs. D, E and F.

B: Refereeing Philosophy

We encourage a broad range of readers and contributors. Please judge papers on their technical merit and separate comments on this from those on style and approach. Keep in mind the strong practical orientation that we are trying to give the journal. Note that the forms attached provide separate paper for comments that you wish only the editor to see and those that both the editor and author receive. Your identity will of course not be revealed to the author.

C: Paper and Referee Metadata

  • Paper Number: C663
  • Date: Feb. 16, 2003
  • Paper Title: A PC Cluster System Employing the IEEE 1394
  • Author(s): K. Hyoudou, R. Ozaki, Y. Nakayama
  • Referee: Hermann Hellwagner
  • Address: Dept. of Information Technology, University Klagenfurt, Austria

Referee Recommendations: Reject

D: Referee Comments (For Editor Only)

The paper appears to be an extended version of a poster presentation at Cluster'2000. However, the cluster system itself apparently does not go substantially beyond the Cluster'2000 "FireCluster". There are more experiments and results, and there is an MPI implementation, yet the submitted paper seems to be essentially two years old. This is evident from the references (the most recent is from 2001), from the technical data of the experimental cluster platform, and from the fact that IEEE 1394 has not been seriously considered for clustering by other researchers or by industry since 2000. Furthermore, the technical merit of the paper is too low to warrant publication in the journal; see the comments for the author in Sec. E. In summary, the paper is not really original, it is not technically sound enough, and its results do not contribute significantly to the state of the art; I therefore recommend rejecting the paper.

E: Referee Comments (For Author and Editor)

Overall comments

The submitted paper is well structured. It appears to be an extended version of the authors' poster presentation at Cluster'2000 [6]. The paper adds an MPI implementation and more experiments and results, but the cluster system itself apparently does not go significantly beyond the "FireCluster" of Cluster'2000. Moreover, the system and the paper reflect the state of the art of cluster computing of the year 2000 rather than today's. The technical depth and soundness of the paper also seem too low to warrant publication in the journal. Some issues supporting these impressions of the reviewer are given below.

Detailed comments

Comparison to Fast Ethernet and other networks: It is neither scientifically sound nor fair to state in general that the performance of Fast Ethernet for clustering "is not very good". This depends on the communication software being used. The statement is certainly correct for the TCP/IP stack, but there are other approaches and libraries that deliver better performance than the standard communication stack (e.g., U-Net, VI Architecture implementations). In a similar vein, the comparison of clustering networks in Table 1 is too coarse and lacks detail. Additional issues that need to be considered include the communication software (libraries), communication patterns, specific topologies, and the number of nodes. Furthermore, which "latency" is reported in Table 1: one-way or round-trip time? The latency of IEEE 1394 is reported as 7.5 microseconds, while the rest of the paper states an RTT of 17.2 microseconds. What explains this discrepancy?
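
To make the discrepancy concrete: if Table 1 were meant to list a one-way latency, half of the stated RTT should roughly match the table entry. The following is a minimal sketch of that check, using only the two figures quoted above. The symmetric-path assumption (one-way latency = RTT/2) is the reviewer's illustration and is not stated in the paper; the variable names are hypothetical.

```python
# Illustrative arithmetic only; both figures are those quoted from the paper.
TABLE1_LATENCY_US = 7.5   # latency listed for IEEE 1394 in Table 1
REPORTED_RTT_US = 17.2    # round-trip time stated in the rest of the paper


def one_way_from_rtt(rtt_us: float) -> float:
    """Estimate one-way latency as half the RTT (assumes a symmetric path)."""
    return rtt_us / 2.0


one_way_us = one_way_from_rtt(REPORTED_RTT_US)
gap_us = one_way_us - TABLE1_LATENCY_US

print(f"RTT/2 = {one_way_us:.1f} us, "
      f"Table 1 = {TABLE1_LATENCY_US:.1f} us, "
      f"gap = {gap_us:.1f} us")
```

Half of the stated RTT is 8.6 microseconds, about 1.1 microseconds more than the Table 1 value, so the two figures do not agree even under a one-way reading of the table.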

Originality/novelty: The most recent reference is [7], from the year 2001. This gives the impression that the paper is two years old, an impression reinforced by the technical data of the experimental system and by the fact that recent work (2001-2) is not discussed at all.

Technical depth and soundness: The authors claim that the paper presents the "design and implementation" of the cluster system and specifically of the communication library (CF). However, Sect. III only lists, at a very high level of abstraction, some "design policies" and some features of the library, e.g., the double-buffering strategy for point-to-point communication. In the reviewer's view, this is not an implementation description. Many questions remain open. Examples are: What happens if the second buffer at the sender side is about to overflow? Is the sending process blocked? If so, how? How does the CF library guarantee in-order delivery? Are there sequence numbers for messages? Sect. III-A raises a number of further questions, e.g.: What does it mean that the "communication library handles the links", or that "networks/connections are [not] programmable"? (In the reviewer's understanding, a "programmable network" refers to an "active network", a recent research topic in networking.) "Implementation without OS modifications": Device drivers and the RGM have to be installed at kernel level (Fig. 3), so there are some changes to the OS after all. "Careful implementation" to achieve low latency: Is this done at user level or at kernel level? This is unclear at this point (Sect. III-A). In the rest of Sect. III, many more details about the communication software should be given for a journal paper.
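
To illustrate the level of detail the buffer questions above are asking for, here is a minimal sketch of one possible answer. It is not the CF library's mechanism, which the paper does not describe: it assumes a sender with two buffer slots that blocks when both are full, and a receiver that checks per-message sequence numbers. All names (`NUM_SLOTS`, `sender`, `receiver`, `run`) are hypothetical.

```python
import queue
import threading

NUM_SLOTS = 2       # "double buffering": two send slots (assumed)
NUM_MESSAGES = 10   # number of messages in this toy run


def sender(slots: queue.Queue) -> None:
    for seq in range(NUM_MESSAGES):
        # put() blocks while both slots are occupied, so the sending
        # process is throttled instead of overflowing the second buffer.
        slots.put((seq, f"payload-{seq}"))
    slots.put(None)  # end-of-stream marker


def receiver(slots: queue.Queue, delivered: list) -> None:
    expected = 0
    while True:
        item = slots.get()
        if item is None:
            break
        seq, _payload = item
        # A sequence number lets the receiver detect loss or reordering.
        if seq != expected:
            raise RuntimeError(f"out of order: got {seq}, expected {expected}")
        delivered.append(seq)
        expected += 1


def run() -> list:
    slots = queue.Queue(maxsize=NUM_SLOTS)
    delivered = []
    worker = threading.Thread(target=sender, args=(slots,))
    worker.start()
    receiver(slots, delivered)
    worker.join()
    return delivered


delivered = run()
print(delivered)
```

A journal-level implementation description would state which of these choices the library actually makes (blocking versus dropping versus returning an error on a full buffer, and whether ordering comes from sequence numbers or from properties of the IEEE 1394 bus itself).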

Experimental results: See the comment on the comparison to Fast Ethernet above. In addition, what is the number of nodes for the Ethernet+UDP results? What are the results for the other NPB benchmarks? Why are exactly these four presented? Is MPICH-CF inferior to MPICH-p4 for other benchmarks? What about communication patterns that are unfavourable for a bus, e.g., many-to-one (gather) communication? Results would be interesting here. In summary, the claims made later on (Sect. VI) are, in the reviewer's opinion, not sufficiently substantiated. Example: "multicast ... achieved a bandwidth of 200 Mbps, regardless of the number of receiving nodes". This was only shown for <=7 receivers.

Related work: Related work is not discussed in sufficient detail or scope. For instance, user-level communication approaches that deliver very good performance (e.g., U-Net, VI Architecture) are not covered at all. The most recent related work (2001-2) is also missing. There is nothing about InfiniBand, and nothing about other communication approaches using Remote Write/RDMA (Remote Direct Memory Access).

F: Presentation Changes

---
