CPS-615 -- Assignments Page -- Fall 1996

CPS-615 -- Course Assignments -- Fall 1996

Table of Contents

  • General Note about Assignment Preparation
  • Assignment-1
  • Assignment-2
  • Assignment-3
  • Assignment-4
  • Assignment-5
  • Assignment-6
  • Assignment-7
  • Assignment-8
  • Assignment-9
  • Assignment-10 (mini-project)
  • Back to class homepage

    General Note about Assignment Preparation

    Assignment-1 --- Due September 5, 1996

    Read my short overview of HPCC. Browse other HPCC resources and write a short (say 2 pages equivalent) HTML commentary describing areas that you found that look important and were not covered well in my paper. Here are a starting set of Web Resources. Your report should be something that we could hyperlink as an addon resource to expand material in my short somewhat dated review. As your report is HTML it can and should include references as clickable Web links. Take a look at the following general references:

    Go to solutions page

    Assignment-2 --- Due September 12, 1996

    This assignment consists of two parts.
    1. Consider the analysis in class of the performance (speedup S) of Laplace's Equation in two dimensions, found in lecture foilset: Laplace Example -- Programming Models and Performance. You need to do the following:
    2. Hadrian's Palace is L meters by L meters in size. Its floor is to be covered by tiles which are 0.99 by 0.99 meters in size and separated by 0.02 meters of grout in between each tile. Take the same rather informal approach given in the lectures to the analysis of the construction of Hadrian's wall and modify it for this case. Note Hadrian is keen on dancing and so his Palaces have no internal walls and can be considered as an unobstructed uniform two dimension expanse of tiles. In this part you need to do the following:

    A good reference is Parallel Computing Works!.

    Go to solutions page

    Assignment-3 --- Due September 24,1996

    Note: NO Lecture September 19 (Thursday)

    Consider the Sun Cluster in NPAC which has 8 UltraSparc2 based machines which are each capable of 100 megaflop (millions of floating point) per second performance. we wish to solve 2 problems on this machine:

    1. Laplace Equation given in class but modified to a N by N by N grid in three dimensions.
    2. A model problem for three dimensional Computational Fluid Dynamics which can be thought of as Lapace's equation with following modifications: These changes can be summarized as operationally:
    Derive the expected speedup (for upto 8 nodes) for the following choices of networks connecting the UltraSparcs:
    1. The first choice is a SINGLE SHARED Ethernet peaking at 10 megabits/second bandwidth .
    2. The second choice is switched fast Ethernet (each Sun system can simultaneously transmit at given bandwidth) with a peak performance of 100 megabits per second per link.
    3. The third choice is switched OC-12 ATM (each Sun system can simultaneously transmit at given bandwidth) with a peak performance of 620 Megabits per second.
    Assume that any of these networks can realize 50% of its rated peak performance. Derive the problem size (value of N) as a function of the number of processors which will lead to an efficiency of 80%. How much memory per node would we need for the 80% efficiency in computation?

    References:

    Go to solutions page

    Assignment-4 --- Due October 1 1996

    This assignment is about the Discussion of HPCC Software so you will need to review materials particularly on Fortran90, HPF and MPI and submit a review report. Here is a list of references that you might like to use for the review:

    Assignment-5 --- Due October 7 1996

    1. For this assignment you will need to familiarize yourself with VPL Web interface to HPF, Fortran90 and MPI.
    2. Write a Fortran90 routine implementing Jacobi Iteration for Laplace's equation in two dimensions and run it with 64 by 64 grid and unit boundary conditions on the square. See that the program produces correct answer, which is?

    Go to solutions page

    Assignment-6 --- Due October 17 1996

    In the previous assignment, you were asked to familiarize yourself with VPL and demonstrate it by running an F90 program. For this assignment you need to do the following:
    1. Write a short (about a page or so) critique of VPL features describing any additional feature(s) that you may think would be useful to add to it. We are interested in all comments (positive or negative) and would like them to be as specific as possible.
    2. Please put your critique (in text) on the Web (on your VPL area) with your other homeworks.
    3. Store your homeworks in directories called 'homework1', 'homework2', and so on. If the problem is divided into sub-problems, try to put your answers in clearly marked separate files in the appropriate directory.
    4. Comments, if any, and grades will appear shortly there in files -- each named 'comments-and-grade'.

    Please direct your comments/suggestions about VPL either to grader, Geoffrey or Kivanc (dincer@npac.syr.edu).

    Go to solutions page

    Assignment-7 --- Due October 24 1996

    This assignment consists of two sub-problems: Notes:
    1. A tolerance of 0.000003 should be used in the Jacobi iteration algorithm. Each program should return: number of iterations necessary to solve the Laplace heat equation. cumulative error between the experimental and theoretical solutions at the mesh points.
    2. Programs should be written using the Fortran 90 or HP Fortran languages. Double precision variables should be used throughout the calculations.
    3. It is essential that you use the VPL interface for compiling and running the program. If the problem is divided into sub-problems, try to put your answers in clearly marked separate files in the appropriate directory.
    4. If you encounter any difficulties with VPL, please contact the TA.
    References:
    1. Fox, G.C., Johnson, M.A., Lyzenga, G.A., Otto, S.W., Salmon, J.K. and Walker, D.W. (1988). Solving Problems on Concurrent Processors, Volume I, General Techniques and Regular Problems, Prentice-Hall, Englewood Cliffs, New Jersey, pp. 117-133.
    2. Gerald, C.F. and Wheatley, P.O. (1989). Applied Numerical Analysis, Fourth Edition, Addison-Wesley, pp. 471-489.

    Go to solutions page

    Assignment-8 --- Due November 7 1996

    N-Body and Random Number Generation.

    In N-Body Solver - NPAC benchmark suite there is a collection of N-body solvers and our interest is the HPF source which is an N-body source code written by Tom Haupt and runs on the alphafarm. This code has a speedup problem, or in other words, as we increase the number of processors that will run on we get an opposite of what we desire, that is we get a slowdown rather than a speedup. This deficiency (slowdown) is mainly due to the use of the random number generator. As you can see in the code, the RNG is written in F90 since HPF does not provide parallel random number generators.

    The question is to parallelize the F90 source for the random number generator to speedup the abovementioned N-body program. No need to strictly follow the F90 source to do the parallelization but come up with your own ideas of putting together an efficient parallel generator. Show your speedup timings.

    Hints:
    1. As mentioned above, HPF does not currently provide a parallel random number generator so a parallel implementation would require that the numbers generated by different processors are not correlated.
    2. A nice feature is to try not to correlate the results with the number of processors used.
    3. Take a look at Tom Haupt's HPF tutorial page. In particular, pay a visit to the page on Extrinsic Procedures. Since in the extrinsic mode the parallel RNG can reuse sequential F90 intrinsics, the generators work independently; therefore, in this approach the problem reduces to finding "right" seeds on each node.
    References:
    1. Paul Coddington's Random Number Generators for Parallel Computers.
    2. Parallel Computing Works! has a section on the N-body problem but this is the more advanced fast multipole algorithm. The older book Solving Problems on Concurrent Processors describes the basic N-body methods used in sample code.
    3. Parallel N-Body Simulations are described in Parallel N-Body Simulations contains algorithm analysis and pointers to very interesting papers and simulations.
    4. Some of John Salmon's publications on the subject are available here.

    Go to solutions page

    Assignment-9 --- Due November 22 1996

    MPI for Laplace Solver.

    This assignment consists of two related parts:

    Comments and Suggestions:
    1. Basically, the method is to implement the Gauss-Seidel (GS) algorithm in parallel. One possibility is to use what is commonly referred to as the odd-even (or maybe the even-odd) GS method. See Discussion in CPS615 Module on Iterative PDE Solvers.
    2. One can start with a 2-d grid of M points, then partition it into two sets of points. Updating goes as follows:

      So, essentially what constitute the GS algorithm is this multi-step approach of update-then-compute.

    3. On the aspect of parallelism, we could say that parallelism of the SOR method is pretty much similar to parallelism of the GS except for the over relaxation parameter (omega) which is used in the SOR update (see class notes): z^{k+1} (SOR) = omega z^{k+1} (GS) + (1 - omega)z^k
    References:
    1. Intro. to Parallel Computing: design and analysis of algorithms. V. Kumar et. al, Benjamin/Cummings publ. 1994. (mainly section 11.2, pages 426-433, will be handed in in the class).
    2. Solving Problems on Concurrent Processors, vol. I, G. Fox, M. Johnson, G. Lyzenga, S. Otto, J. Salmon, and D. Walker. Prentice-Hall 1988. (Chapter 7 is very useful).
    3. In your VPL accounts, you will find materials-for-hwk9 directory containing various MPI examples. There is a program laplace.f90 which gives an elementary solution of Laplace's equation in 2 dimensions using the point-Jacobi iteration. Maybe you could start from there.
    4. Some MPI resources, etc. can accessed through the software tools and algorithms section of the class homepage.
    5. A useful MPI tutorial (1995) with examples in C and fortran is at this page.

    Go to solutions page

    Assignment-10 (mini project) --- Due December 20 1996

    Here are some of the mini-projects done by students:

    Back to class homepage