Subject: Re: Please Suggest Some Referees!
Resent-Date: Fri, 03 Sep 1999 14:15:21 -0400
Resent-From: Geoffrey Fox
Resent-To: p_gcf@npac.syr.edu
Date: Mon, 02 Aug 1999 12:24:14 -0500
From: "Daniel A. Reed"
To: gcf@npac.syr.edu

Geoffrey:

For C409, I'd suggest Roy Campbell here at Illinois.

At 11:18 AM 8/2/99, you wrote:
>
>Dear Board Member,
>
>Here is the latest set of abstracts for the papers submitted to
>Concurrency: Practice and Experience. I would be very grateful for a few
>insightful suggestions for referees! Please send your ideas to me
>(gcf@npac.syr.edu).
>
>Thank you
>
>C409: Title: An Efficient Parallel Algorithm for Motion Estimation in Very
>Low Bit-Rate Video Coding Systems
>
>Author: Charalampos Konstantopoulos, Andreas Svolos and Christos Kaklamanis
>
>_ _ _ _ _ _ _ _
>
>Motion estimation is widely used in video coding schemes in order to
>reduce the inherent temporal redundancy among the frames of a video
>stream. In particular, low and very low bit rate video coding schemes
>need sophisticated motion models which usually require a large number of
>arithmetic operations. In this paper we present a parallel algorithm for
>the most practical of these models. Specifically, we implement the affine
>motion model on a hypercube-based multiprocessor. This model covers the
>most usual kinds of motion and requires only a modest number of
>arithmetic operations. Also, the hypercube network can efficiently handle
>the non-regular data flow resulting from the parallel implementation of
>this model. In addition, we assume that our multiprocessor is fine
>grained, in contrast to most programmable architectures used in video
>coding, where processors usually have large local memory. Apart from its
>practicality, the constraint of limited local memory makes the algorithm
>design more challenging and thus more theoretically interesting.
>Finally, compared with other proposals in the literature, our scheme is
>more general: whereas our scheme covers all kinds of motion supported by
>the affine motion model, the other proposals deal only with a subset of
>these kinds.
>
>C412: Title: Performance Analysis of Hybrid Network Multiprocessor
>Architecture
>
>Author: A. Averbuch, Y. Roditty and B. Shoham
>
>_ _ _ _ _ _ _ _
>
>In this paper we investigate architectures that combine message-passing
>and shared-memory technologies, called hereinafter hybrid architectures.
>We introduce hybrid architectures in which the large buses of a
>shared-memory system are split into a number of small, high-performance
>shared-memory blocks, which are connected via a message-passing
>architecture such as a hypercube, grid or ring. This way we avoid the
>possible degradation of the achieved performance caused by the fact that
>bus performance does not scale well as the number of processors it
>connects increases.
>
>We study the saturation situations of several hybrid network
>architectures, where adding processors does not reduce the overall
>execution time. We show that the use of hybrid network architectures
>leads to a significant improvement of the system's price/performance
>ratio, by significantly improving the performance with almost no system
>cost increment. Therefore, the usage of hybrid architectures demonstrates
>how minimal "cost" spending can significantly increase the system
>performance.
>
>C413: Title: Agent Based Networks for Scientific Simulation and Modeling
>
>Author: L. Boloni, D.C. Marinescu, J.R. Rice, P. Tsompanopoulou and E.A.
>Vavalis
>
>_ _ _ _ _ _ _ _ _
>
>The simulation and modeling of complex physical systems often involves
>many components because (1) the physical system itself has components of
>differing natures, (2) parallel computing strategies require many
>(somewhat independent) components, and (3) existing simulation software
>applies only to simpler geometrical shapes and physical situations. We
>discuss how agent based networks are applied to such multi-component
>applications. The network agents are used to (a) control the execution of
>existing solvers on sub-components, (b) mediate between sub-components,
>and (c) coordinate the execution of the ensemble. This paper focuses on
>partial differential equation (PDE) models as an instance of the approach
>and describes the implementation of networks using the PELLPACK problem
>solving environment for PDEs and the Bond system for agent based
>computing.
>
>C415: Title: Parallel VLSI Test in a Shared-Memory Multiprocessor
>
>Author: C. Gil, J. Ortega, and M.G. Montoya
>
>_ _ _ _ _ _ _ _ _
>
>This paper presents three parallel procedures implemented in a shared
>memory multiprocessor to generate the patterns that allow the testing of
>digital circuits. The implementation of these procedures in a
>multiprocessor uses the system memory better than in a distributed memory
>multicomputer, since it is not necessary to store the circuit structure,
>or other common structures, in the local memory of each processor. The
>parallel test generation procedures are based on a new sequential
>algorithm which mixes both the Boolean difference and digital spectral
>techniques. It is thus different from other proposed methods that deal
>with the parallelization of test generation algorithms that carry out an
>implicit enumeration of the input pattern space.
>The first procedure distributes the set of faults using a back tracing
>procedure starting from a primary output and allocating a similar number
>of lines to each processor. The second procedure distributes the set of
>faults among the processors taking into account the distance from each
>line to its nearest primary output; it then applies the algorithm to
>generate the test pattern with some modifications. The third procedure
>uses a circuit partitioning procedure which allows similar sized parts of
>the circuit to be assigned to each processor while communications between
>processors are minimised. The experimental results obtained when the
>procedures are applied to the usual benchmark circuits (the ISCAS set)
>show better speedup figures than on a multicomputer, although fewer
>processors are used.
>
>C416: Simulation of Complete Binary Tree Structures in a Faulty Flexible
>Hypercube
>
>Author: Huan-Chao Keh and Jen-Chih Lin
>
>_ _ _ _ _ _ _ _ _
>
>The Flexible Hypercube is a generalization of binary hypercube networks
>in that the number of nodes can be arbitrary, in contrast to a strict
>power of 2. Restated, the Flexible Hypercube retains the connectivity and
>diameter properties of the corresponding hypercube. Although the
>embedding of complete binary trees in faulty hypercubes has received
>considerable attention, to our knowledge, no investigation has
>demonstrated how to embed a complete binary tree in a faulty Flexible
>Hypercube. Therefore, this investigation presents a novel algorithm to
>facilitate the embedding job when the Flexible Hypercube contains faulty
>nodes. Of particular concern are the network structures of the Flexible
>Hypercube that balance the load before as well as after faults start to
>degrade the performance of the Flexible Hypercube.
>Furthermore, to obtain the replaceable node of the faulty node,
>2-expansion is permitted such that up to (n-2) faults can be tolerated
>with congestion 1, dilation 4 and load 1. That is, (n-1) is the dimension
>of a Flexible Hypercube. Results presented herein demonstrate that
>embedding methods are optimized.
>
>C421: Title: Parallel Thinning Algorithms on Multicomputers: Experimental
>Study on Load Balancing
>
>Author: M.G. Montoya, C. Gil and I. Garcia
>
>_ _ _ _ _ _ _ _ _
>
>In this work, practical implementations of two parallel thinning
>algorithms on a multicomputer system are described. The solution has been
>conceived for a multiprocessor using the SPMD (Single Program Multiple
>Data) programming model. Our main goal is to describe our experiences on
>data partition / distribution among processors for parallel thinning
>algorithms, as a representative type of algorithm where communications
>take place between neighbor processors and the work load for each
>processor depends on the input data. It will be shown how the efficiency
>of the parallel implementation can be optimized through the application
>of a preprocess. This preprocess is based on the analysis of the work
>load balance. An analysis of the communication cost is also made.
>Although the results shown here concern the implementations of two
>parallel thinning algorithms, we think that our proposals about data
>distribution are general and useful for a wide set of algorithms in the
>field of image processing.
>
>C422: Parallel Adaptive Wavefront Algorithms Solving Lyapunov Equations
>for the Cholesky Factor on Message Passing Multiprocessors
>
>Jose M. Claver and Vincente Hernandez
>
>The order of the matrices involved in several algebraic problems
>decreases during the solution process.
>In these cases, parallel algorithms which use adaptive solving block
>sizes offer better performance results than the ones obtained with
>parallel algorithms using traditional constant block sizes. Recently, new
>parallel wavefront algorithms solving the Lyapunov equations for the
>Cholesky factor using Hammarling's method on message passing
>multiprocessor systems have been designed [4]. In this paper, new
>parallel adaptive versions of these parallel algorithms are described and
>experimental results obtained on an SGI Power Challenge and a SUN
>UltraSparc cluster are presented.

=========================================================================
Daniel A. Reed
Professor and Head                    Internet: reed@cs.uiuc.edu
Department of Computer Science        Research phone: (217) 333-3807
University of Illinois                Admin phone: (217) 333-3373
1304 West Springfield Avenue          Research FAX: (217) 244-6869
Urbana, Illinois 61801                Admin FAX: (217) 333-3501
WWW Project Pointer: http://www-pablo.cs.uiuc.edu/
=========================================================================