Referee 1 **********************************************************

This paper is very well written, interesting to read, and provides a large number of experimental results. The authors provide adequate discussion of their results. I am providing here detailed comments on several points that I think could be made clearer.

Section 2: I think that this section should contain some qualitative discussion of why one model is more "user-friendly" than the other. Figure 2 does not make enough of a point on its own, and the text could develop a little more the fact that explicit yields are more complicated; maybe a more complex example would do. The text also misses an explicit statement about which model is less intuitive or attractive to developers. (An illustrative sketch of the two styles follows this report.) This brings me to a general comment about the paper. The abstract foreshadows a trade-off between "complicated code" and "performance", which makes sense. However, throughout the paper the concern is mostly performance; in fact, the only place where programming styles are mentioned is in Section 6 (an 8-line paragraph on page 24). I would have liked to see more mention of that trade-off during the description of experimental results, if only to remind the reader why all this is worth the trouble.

Section 3.1: I would maybe rename this section "Key CP Implementation Issues", as it is entirely about the CP programming paradigm.

Sections 3.2.1 and 3.2.2: I felt that these two sections were a little difficult to read. After spending quite a lot of time analyzing them I believe they are correct. A table or a paragraph describing the use of locks for each strategy on each platform would help, as in: "SR_ic on a single processor: no locking; SR_ts on a single processor: a simple locked flag; etc." That way the reader can simply refer to that table while following the development, rather than constantly referring back to Section 3.1 while reading the paper.

Section 5: I think that this section should explicitly contrast the results with those of the previous section for SR. There are so many results in Section 4 that the reader needs a reminder of how the Java results are similar to or different from the SR results.

Section 6: This section should really start with a summary of results. Throughout the paper the reader finds out facts such as "SR_ic is good when there is a lot of synchronization", "SR_ic is bad when there is little work", etc. I suggest you go through Sections 4 and 5, find all such statements, and summarize the trends at the beginning of Section 6. My experience reading the paper was that I kept going back and forth between sections to remind myself of previous results and the reasons for them. I believe everything is there, but a summary is necessary. Maybe a section right before Section 6 called "Summary of results and trends" would be good, or summaries at the end of Sections 4 and 5. I realize that some trends are not as clear-cut as one would hope, but I still think the authors can provide a coherent summary.

F: Presentation Changes

The presentation of the experiments and results is really good and the paper is extremely well written. Besides my comments above I do not have further comments.
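[Editor's note: to make the trade-off raised under Section 2 concrete, here is a minimal Java-flavoured sketch of the two styles. It is an illustration only, not code from the paper; the class and method names are invented, and real Java threads are preemptively scheduled, so Thread.yield() only approximates an SR-style cooperative yield.]

    // Editor's illustration only; names are hypothetical and not taken from the paper.
    class Counter {
        private int value = 0;

        // Preemptive ("concurrent programming") style: the scheduler may switch
        // threads between any two instructions, so shared state needs a lock.
        synchronized void incrementPreemptive() {
            value++;
        }

        // Cooperative ("explicit yield") style: the thread keeps the processor
        // until it volunteers to give it up, so no lock is needed, but the
        // programmer must remember to insert yield points by hand.
        void incrementCooperative() {
            value++;            // safe only if no preemption can occur here
            Thread.yield();     // explicit scheduling point; in Java this is
                                // merely a hint, unlike a true cooperative yield
        }
    }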
Referee 2 **********************************************************

The paper presents a detailed analysis of various concurrent programming models. In particular, the authors modeled and evaluated the runtime efficiency and code complexity of a cooperative multithreading paradigm and a concurrent programming paradigm, using modified versions of the SR programming language and toy benchmarks running on Linux PC systems. The authors were especially thorough in their analysis by introducing context-switching modeling into their concurrent programming model. This addition (context switching) made the comparison between the two models (concurrent programming versus cooperative multithreading) especially realistic.

Overall, this paper was excellent. The modeling methodology is sound and the results are thoroughly analyzed. Readers who work with grid computing or other large-scale parallel systems can use this paper to help determine which approach (concurrent programming or cooperative multithreading) best suits their parallel programming needs.

My only concern about this paper is that the authors should add a sentence or two somewhere around the third paragraph of Section 1 to better distinguish their definition of multithreading from the multithreading currently coming out in commercial processors (IBM Power 4, Alpha 21464, Intel Xeons). These new processors execute multiple threads concurrently, not one at a time as the authors' definition of multithreading assumes.

F: Presentation Changes

I suggest using numbers, rather than Roman numerals, to refer to tables. Numbers tend to be less confusing, especially to readers who are skimming the paper. For example, Table XI -> Table 11.

Referee 3 **********************************************************

A comparison of two models of quasi-parallel computation, concurrent programming and cooperative multi-tasking, is made on the basis of various timing experiments. The experiments compare the performance of several implementations of both models on four example problems: producer-consumer, reader-writer, dining philosophers, and Jacobi iteration. It is assumed that the number of processes greatly exceeds the number of processors (in the paper the maximum number of processors used for experiments is two); thus, the comparison seeks to analyse the efficiency of scheduling processes onto processors in both computational models. The models are realised in two implementation languages, namely SR and Java. The results are not conclusive but indicate that cooperative multi-tasking is generally the more efficient model (albeit only marginally better for large work loads). The results are not surprising; one would expect a model in which process activity can be controlled explicitly (by commands such as yield) to produce more efficient results - given good implementations - than a model in which process control is managed behind the scenes by an operating system. The conclusions are weak - "cooperative multithreading models deserve further exploration". I am not convinced that a reader would be satisfied with this conclusion after working through a 26-page paper.

The subject matter of the paper is ideally suited to Concurrency and Computation: Practice and Experience. The work is an extension of an earlier Euro-Par paper. The current paper contains the same four examples, and considerable portions of text have been reproduced (are there copyright implications?).
However, the results have been updated (more efficient implementations), the section comparing the implementations of the two models in Java is entirely new, and there are some new experiments which analyse, for example, the effect of changing the granularity of the time slice. The conclusion of the current paper adds little to the Euro-Par conclusion.

The strength of the paper is the detail with which (i) each timing experiment is described and (ii) each result is analysed. There are often practical details which might be of interest to a general audience - for example, the authors describe the (unwanted) effects that a Java garbage-collection thread can have on timing experiments.

My reservation about the paper is that the timing results are often what one would expect. Would it be possible to design new experiments which would shed further light on the sequence of process activation (i.e. give a trace of process activity in an execution)? Or perhaps report on experiments which analyse whether processes carry out the same amount of work in each time slice? Perhaps it would then be possible to make further remarks on the fairness of process activation.

I cannot make a strong case for accepting the paper as it stands:

1. The conclusion section is little more than a summary. The authors should try to say in what circumstances one model is to be preferred over the other.

2. I am unconvinced that timing experiments alone allow one to conclude which model is better. It may be possible to improve the interest level of the paper by conducting a small number of further experiments which analyse both fairness of activation and the amount of work done by a process in a given time slice (a brief illustrative sketch follows this report).

3. The presentation of the paper should be improved (see below).

F: Presentation Changes

1. There are too many tables in the paper (26) - the authors should try to compress much of this data into graphs (for example, some of the results for the four problems could be compressed into a single graph, as in Figure 2 of the Euro-Par paper).

2. Experimental information is often repeated - the authors should try to avoid saying the same thing several times.

3. The descriptions of the Green-threads and native-threads versions of Java are not clear.
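[Editor's note: for point 2 above, the kind of instrumentation suggested might look roughly like the following minimal Java sketch. It is an editor's illustration with invented names, not code from the paper; each worker thread counts the work units it completes, so the spread across workers gives only a rough indication of fairness of activation, not a per-time-slice trace.]

    // Editor's illustration; class and variable names are hypothetical.
    import java.util.concurrent.atomic.AtomicLongArray;

    public class FairnessProbe {
        public static void main(String[] args) throws InterruptedException {
            final int workers = 4;
            final AtomicLongArray unitsDone = new AtomicLongArray(workers);
            Thread[] pool = new Thread[workers];

            for (int i = 0; i < workers; i++) {
                final int id = i;
                pool[i] = new Thread(() -> {
                    long sink = 0;
                    long deadline = System.nanoTime() + 1_000_000_000L;  // run for about one second
                    while (System.nanoTime() < deadline) {
                        for (int k = 0; k < 10_000; k++) sink += k;      // one "work unit" of busy work
                        unitsDone.incrementAndGet(id);
                    }
                    if (sink == 42) System.out.print("");                // keep the busy loop from being optimised away
                });
                pool[i].start();
            }
            for (Thread t : pool) t.join();

            // A large spread between workers would suggest unfair activation.
            for (int i = 0; i < workers; i++) {
                System.out.println("worker " + i + ": " + unitsDone.get(i) + " work units");
            }
        }
    }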