Dr. J.S. Reeve
Concurrent Computation Group
Department of Electronics and Computer Science
University of Southampton
Southampton SO17 1BJ
UK

Dear Dr. Reeve
We append two referee reports on your paper

C459: A Parallel Viterbi Decoding Algorithm

We would be happy to publish your paper if you address the changes
suggested by the referees.
Please include a discussion of your changes and your answers to the
referees in your resubmittal. If this is persuasive, we can publish your paper
without further refereeing.
I thank you for your interest in Concurrency: Practice and Experience and apologize
for not replying earlier.
Please send all communication electronically if possible using the address 
fox@csit.fsu.edu


If you should need a "real address", please use:
Geoffrey Fox
Computational Science and Information Technology
Florida State University
400 Dirac Science Library
Tallahassee Florida 32306-4130
850-644-4587 but easiest is cell phone 3152546387

C459 Referee Reports

Referee One
-----------

This paper would greatly benefit from a more detailed explanation of the Viterbi
algorithm. Many readers will not be familiar with how to interpret some of the
figures, particularly Figures 1 and 2. Also, given that memory is presented as the
prime reason for wanting a parallel algorithm, it would be useful to know what
lengths of the generating shift register are typical in applications, and how
strong the motivation is to go to greater lengths. This would allow the potential
impact of the parallel algorithm to be assessed. In the summary section, the
author says that the timing difference between just the communication in the
parallel algorithm and the complete parallel algorithm is always less than 5%.
This means that more than 95% of the time in the parallel algorithm is spent on
communication. This doesn't seem to be borne out by the results shown in the
tables. For example, in Table IV for the code (127,106,7) the time on one
processor is 259 seconds and on four processors is 174 seconds. This would imply
that the time spent on communication is 174 - 259/4 = 109.25 seconds, which is
only about 63% of the total time.
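
The estimate above can be reproduced with a short calculation. This is only a
sketch of the referee's reasoning: it assumes the computation divides evenly
across the processors, so that whatever remains of the parallel run time is
communication. The function name and its parameters are illustrative and do
not come from the paper.

```python
def communication_share(t_serial, t_parallel, n_procs):
    """Estimate communication time in a parallel run.

    Assumption (not from the paper): the compute part scales perfectly,
    so it costs t_serial / n_procs on n_procs processors.
    """
    compute = t_serial / n_procs        # ideal per-processor compute time
    comm = t_parallel - compute         # remainder attributed to communication
    return comm, comm / t_parallel      # seconds, and fraction of total time


# Figures quoted from Table IV for the code (127,106,7)
comm, share = communication_share(t_serial=259.0, t_parallel=174.0, n_procs=4)
print(f"{comm:.2f} s, {share:.1%}")     # prints "109.25 s, 62.8%"
```

If the compute part scales worse than ideally, the communication share would
be smaller still, which widens the gap from the 95% figure.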
Can the use of FPGA technology address the large memory requirements of the
Viterbi algorithm mentioned in the second paragraph of the introduction?
There are a couple of minor issues that need to be addressed.
1. In line 6 of the introduction the word "codes" is repeated.
2. In Figure 2 the text says that thick lines are used to show path branches for
input bit 0. All the lines look the same thickness to me. The only difference is
that some of the arrowheads are larger than others.
3. The text says that Figure 3 shows the matrix for BCH (15,7,2), whereas the
figure caption says it shows the matrix for BCH (31,16,7).

Referee Two
-----------

This paper needs a stronger introduction to motivate the Viterbi algorithm
and the need for its parallelization. More recent references need to be
cited to demonstrate the importance of your work.
Please provide a general description of the Viterbi decoding algorithm and its
parallelization.