Full HTML for

Basic foilset Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion

Given by Geoffrey C. Fox at CPS615 Spring Semester 00 on March 00. Foils prepared March 15 00
Outside Index Summary of Material


This Introduces the three fundamental types of PDE's -- Elliptic, Parabolic and Hyperbolic and studies the numerical solution of Elliptic Equations
The sparse matrix formulation is used and iterative approaches -- Jacobi, Gauss Seidel and SOR are defined
These are motivated by analogies between equilibrium of diffusive equations and elliptic systems
Parallel Computing is Discussed for Gauss Seidel
Eigenvalue analysis is used to discuss convergence of methods
We discuss Multigrid methods at a simple level

Table of Contents for full HTML of Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion

Denote Foils where Image Critical
Denote Foils where HTML is sufficient

1 CPS 615 -- Computational Science in Simulation Track Solution of Simple Partial Differential Equations and Iterative Solvers
2 Abstract of Simple Partial Differential Equations and Iterative Solvers
3 Boundary Conditions I
4 Boundary Conditions II
5 Boundary Conditions III
6 Iterative Methods for Solving Sparse Matrices
7 Formalism for Iterative Methods
8 Preconditioning
9 Convergence of Jacobi in One Dimension
10 What is Easy/Hard for Jacobi?
11 Information Moves Slowly
12 Lowest Eigenvalue converges fastest
13 Successive Over Relaxation I
14 Successive Over Relaxation II
15 Some Mathematical Details
16 Multigrid Methods
17 Gauss Seidel is Slow I
18 Gauss Seidel is Slow II
19 Multigrid Philosophically
20 Multigrid Hierarchy
21 Basic Multigrid Ideas
22 Multigrid Algorithm: procedure MG(level, A, u, f)
23 Multigrid Cycles
24 What can we do for Parallel Gauss Seidel?
25 16 by 16 Wavefront Parallel Gauss Seidel
26 Wavefront
27 Red Black Parallel Gauss Seidel I
28 Red Black
29 Red Black Parallel Gauss Seidel II
30 Parallel Red Black
31 Red Black Parallel Gauss Seidel III
32 Red Black Parallel Gauss Seidel IV

Outside Index Summary of Material



HTML version of Basic Foils prepared March 15 00

Foil 1 CPS 615 -- Computational Science in Simulation Track Solution of Simple Partial Differential Equations and Iterative Solvers

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Spring Semester 2000
Geoffrey Fox
Northeast Parallel Architectures Center
Syracuse University
111 College Place
Syracuse NY
gcf@npac.syr.edu
gcf@cs.fsu.edu

HTML version of Basic Foils prepared March 15 00

Foil 2 Abstract of Simple Partial Differential Equations and Iterative Solvers

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
This Introduces the three fundamental types of PDE's -- Elliptic, Parabolic and Hyperbolic and studies the numerical solution of Elliptic Equations
The sparse matrix formulation is used and iterative approaches -- Jacobi, Gauss Seidel and SOR are defined
These are motivated by analogies between equilibrium of diffusive equations and elliptic systems
Parallel Computing is Discussed for Gauss Seidel
Eigenvalue analysis is used to discuss convergence of methods
We discuss Multigrid methods at a simple level

HTML version of Basic Foils prepared March 15 00

Foil 3 Boundary Conditions I

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Cauchy Boundary Conditions are ? and ?? / ?n given on a curve C where n is normal direction to curve. This is appropriate for hyperbolic equations
C
Region determined by given boundary conditions

HTML version of Basic Foils prepared March 15 00

Foil 4 Boundary Conditions II

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
? and ?? / ?n cannot be given on a closed curve as then solutions say from gotten from integrating from XZ1Y up and XZ2Y down will clash
Dirichlet Boundary Conditions: ? given on closed curve C
Neumann Boundary Conditions: ?? / ?n given on closed curve C

HTML version of Basic Foils prepared March 15 00

Foil 5 Boundary Conditions III

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
General Equation is of form:
a ? 2? /? x2 + b ? 2? /? x? y + c ? 2? /? y2 = f(x,y,t)
b2 > 4ac Hyperbolic Equations
b2 = 4ac Parabolic Equations
b2 < 4ac Elliptic Equations
Boundary Equation Type | Curve | Conditions | Example
Hyperbolic | Open | Cauchy | Wave Equation
Parabolic | Open | Dirichlet | Diffusion Equation
or Neumann
Elliptic | Closed | Dirichlet | Poisson Equation
or Neumann

HTML version of Basic Foils prepared March 15 00

Foil 6 Iterative Methods for Solving Sparse Matrices

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
There are many iterative methods which can be applied to solve any matrix equation but are particularly effective in sparse matrices as they directly exploit "zero structure"
Here we look at three simplest or stationary methods - so called because iteration equation is the same at each iteration
The Jacobi method is based on solving for every variable locally with respect to the other variables; one iteration of the method corresponds to solving for every variable once. The resulting method is easy to understand and implement, but convergence is slow.
The Gauss-Seidel method is like the Jacobi method, except that it uses updated values as soon as they are available. In general, it will converge faster than the Jacobi method, though still relatively slowly.
Successive Overrelaxation (SOR) can be derived from the Gauss-Seidel method by introducing an extrapolation parameter ?. For the optimal choice of ?, SOR converges faster than Gauss-Seidel by an order of magnitude.

HTML version of Basic Foils prepared March 15 00

Foil 7 Formalism for Iterative Methods

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Generic Iteration Strategy is to split matrix
  • A = M-N
  • so that it is much easier to invert M than A and in particular that inverting M can exploit zero structure of A
Ax = b implies Mx = Nx + b
So write Mxk = Nx(k-1) + b (*)
All iteration methods have this form for different choices of M, N for given A and b
We must ensure that Iteration (*) converges and we naturally prefer that it converges fast!

HTML version of Basic Foils prepared March 15 00

Foil 8 Preconditioning

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
One can change A and b by preconditioning
M1-1 A (M2-1 M2)x = M1-1 b
is same equation as before for any choice of matrices M1 and M2
All these choices are designed to accelerate convergence of iterative methods
Anew = M1-1 A M2-1
xnew = M2 x
bnew = M1-1 b
We choose M1 and M2 to make our standard methods perform better
There are specialized preconditioning ideas and perhaps better general approaches such as multigrid and Incomplete LU (ILU) decomposition
Anew xnew = bnew has same form as above and we can apply any of the methods that we used on A x = b

HTML version of Basic Foils prepared March 15 00

Foil 9 Convergence of Jacobi in One Dimension

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
For Jacobi the iteration matrix G = M-1N takes the form D-1(A-D) where D is diagonal part of A
One can calculate analytically the eigenvalues in this case and find that if one has N gridpoints, largest eigenvalue of G is
Thus if one requires an error of 10-6, then the number of iterations is given by a dreadfully large number K where

HTML version of Basic Foils prepared March 15 00

Foil 10 What is Easy/Hard for Jacobi?

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Slowest Convergence is for a discrepancy (residual) that varies slowly with grid point.

HTML version of Basic Foils prepared March 15 00

Foil 11 Information Moves Slowly

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
For this smooth discrepancy to be "noticed" by iterative method, the points along way away need to affect each other. This will take N steps to happen as information moves slowly (one grid point per iteration) in Jacobi
Then after impact, one needs to actually correct value
Thus convergence takes N2 iterations
On the other hand, if residual varies rapidly with position, that will be corrected quickly as only needs a few iterations for information to be communicated
Thus rapidly varying residual corresponds to lowest eigenvalue

HTML version of Basic Foils prepared March 15 00

Foil 12 Lowest Eigenvalue converges fastest

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index

HTML version of Basic Foils prepared March 15 00

Foil 13 Successive Over Relaxation I

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Jacobi and Gauss Seidel give a formula for xk in terms of x(k-1) call this xbasick
Overrelaxation forms xSORk = ? xbasick + (1- ?)xbasic(k-1)
Typically only 0 < ? < 2 is sensible and ? < 1 is relaxation 1 < ? < 2 is over relaxation
It is "over" because if ? > 1, you go "further" in direction of new estimate than calculated
In "relaxation", you conservatively, average prediction with old value
Unfortunately, the best value of ? cannot be calculated except in simple cases and so SOR is not so reliable for general problems

HTML version of Basic Foils prepared March 15 00

Foil 14 Successive Over Relaxation II

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
You can show that for Jacobi, best choice is ? = 1 i.e. that relaxation strategy is not very useful
For Gauss-Seidel, then in simple one dimensional case with N grid points, one can show that there is an optimal choice ? = 2 ( 1 - ? / N) i.e. almost equals 2
Then the number of iterations needed by optimal SOR to get a given error can be shown to be proportional to N not N2 as in Jacobi or Gauss Seidel. So Ratio of Iterations is:
Jacobi 2 N2/ ?2
Gauss-Seidel N2/ ?2
SOR on Gauss-Seidel N/ ?

HTML version of Basic Foils prepared March 15 00

Foil 15 Some Mathematical Details

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
The remaining foils discuss some details that are not essential
Details on convergence of Gauss Seidel
Detailed derivation of Convergence of Jacobi in one dimension
Discussion of SOR with computation of optimal ? for Jacobi case
Motivation of Iterative Methods with an artificial time whose discretization gives index k for iteration

HTML version of Basic Foils prepared March 15 00

Foil 16 Multigrid Methods

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
We remarked that key problem with iterative methods is that got detail (short wavelength) correct but that convergence was controlled by coarse (long wavelength) structure
Then in simple methods one needs of order N2 iterations to get good results
Ironically one goes to large N to get detail as if all you wanted was coarse structure, a smaller mesh would be fine
Basic idea in multigrid is key in many areas of science
  • Solve a problem at multiple scales
We get coarse structure from small N and fine detail from large N
  • Good qualitative idea but how do we implement?
Material taken from http://www.mgnet.org/mgnet/www/mgnet/www/mgnet/www/mgnet/www/ Tutorial by Ulrich Ruede

HTML version of Basic Foils prepared March 15 00

Foil 17 Gauss Seidel is Slow I

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Take Laplace's equation in the Unit Square with initial guess as ? = 0 and boundary conditions that are zero except on one side
For N=31 by 31 Grid it takes around 1000 (N2) iterations to get a reasonable answer
Boundary Conditions Exact Solution

HTML version of Basic Foils prepared March 15 00

Foil 18 Gauss Seidel is Slow II

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
1 Iteration
10 Iterations
100 Iterations
1000 Iterations

HTML version of Basic Foils prepared March 15 00

Foil 19 Multigrid Philosophically

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Suppose we have a finest level M(1) with N by N points (in 2D)
Then the k'th coarsest approximation M(k) to this has N/2k by N/2k points
One way to think about Multigrid is that M(k+1) can form a preconditioner to M(k) so that one can replace natural matrix A(k) by A-1(k+1)A(k)
  • A-1(k+1)A(k) is a nicer matrix than A(k) and iterative solvers converge faster as long wavelength terms have been removed
Basic issue is that A(k) and A(k+1) are of different size so we need prolongation and restriction to map solutions at level k and k+1
  • We apply this idea recursively

HTML version of Basic Foils prepared March 15 00

Foil 20 Multigrid Hierarchy

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Relax
Interpolate
Restrict
Relax
Relax
Relax
Relax

HTML version of Basic Foils prepared March 15 00

Foil 21 Basic Multigrid Ideas

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
In picture, relax is application of standard iteration scheme to "solve" short wavelength solution at a given level i.e. use Jacobi, Gauss-Seidel, Conjugate Gradient
Interpolation is taking a solution at a coarser grid and interpolating to find a solution at half the grid size
Restriction is taking solution at given grid and averaging to find solution at coarser grid
Interpolate
Restrict

HTML version of Basic Foils prepared March 15 00

Foil 22 Multigrid Algorithm: procedure MG(level, A, u, f)

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
if level = coarsest then
  • solve coarsest grid equation by another method "exactly"
else
  • smooth Alevel u = f (m1 times)
  • Compute residual r = f - Alevel u
  • Restrict F = R r ( R is Restriction Operator)
  • Call MG( level + 1, A(level+1), V, F) (mc times)
  • Interpolate v = P V (Interpolate new solution at this level)
  • correct unew = u + v
  • smooth Aunew = f (m2 times) and
  • set u = unew
endif
endprocedure
Alevel uexact = f
uexact = u + v
Alevel v = r = f - Alevel u

HTML version of Basic Foils prepared March 15 00

Foil 23 Multigrid Cycles

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
The parameter mc determines the exact strategy for how one iterates through different meshes
One ends up with a full cycle as shown
Full Multigrid Cycle
Coarsest Grid
Finest Grid
Finest
Coarsest

HTML version of Basic Foils prepared March 15 00

Foil 24 What can we do for Parallel Gauss Seidel?

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Must Exploit special structure of matrix
Gauss Seidel for "general matrices" cannot be parallelized
k-1
k-1
k
k

HTML version of Basic Foils prepared March 15 00

Foil 25 16 by 16 Wavefront Parallel Gauss Seidel

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
This is best approach to parallelizing Gauss Seidel for the natural ordering along rows -- first in x, the y (and then z in 3D)
4 Processors with cyclic row decomposition
  • Processor 1 rows 1 5 9 13
  • Processor 4 rows 4 8 12 16 etc.
Consider points on lines parallel to diagonal
  • 31 such lines
  • Execute each phase consecutively starting at bottom left
All points in a given phase can be updated in parallel as one only needs updated (iteration k) points from previous phase -- need iteration (k-1) from next phase
Load Imbalance but at worst processors differ by 1 in load and effect goes away for large systems
  • This example has speed up of 3.4 on 4 processors
  • 8 phases 1 unit, 8 phases 2 units, 8 phases 3 units, 7 phases 4 units of work

HTML version of Basic Foils prepared March 15 00

Foil 26 Wavefront

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
3
5
Suppose conventional
Gauss Seidel ordering
of update --
along rows
starting at
bottom left
All points needed by general point P
at level k are below dotted line
parallel to diagonal
and through P
Iteration Number k or k-1
Phase Number runs 1 to 31
31

HTML version of Basic Foils prepared March 15 00

Foil 27 Red Black Parallel Gauss Seidel I

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
The Pipeline method has high communication costs (as will in fact use "block cyclic" to preserve locality) and is complex to implement well
Thus instead we note that we can get new versions of Gauss Seidel by reordering update order -- this could (in principle) make for a better or worse method (or more likely we won't know if better or worse!)
There is a natural reordering which is typically as good if not better for which parallelism is "trivial"
This ONLY works for nearest neighbor stencil but there are versions of red black for similar stencils

HTML version of Basic Foils prepared March 15 00

Foil 28 Red Black

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
K = (G2/2) + 1
where grid is G by G
First 4 and Middle 4 labelling of Gauss Seidel Update Order shown

HTML version of Basic Foils prepared March 15 00

Foil 29 Red Black Parallel Gauss Seidel II

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
In red-black, you color every alternate point red and every other point black
This gives a checkerboard pattern shown on previous foil
Now label update so first update all red points, then update all black points
Updating red points to iterate k only requires black points at level k-1
Updating black points to level k requires NO black points but just read points at level k
So can divide parallel update into two phases
  • Phase I: Update all Red Points
  • Phase II: Update all Black Points

HTML version of Basic Foils prepared March 15 00

Foil 30 Parallel Red Black

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
1
K
2
K+1
3
K+2
4
K+3

HTML version of Basic Foils prepared March 15 00

Foil 31 Red Black Parallel Gauss Seidel III

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Do normal block decomposition
Parallel Phase I: Update all Red Points
  • Communicate black points at k-1 to halo in each processor
  • Compute red points to iterate k
Parallel Phase II: Update all Black Points
  • Communicate red points at k to halo in each processor
  • Compute black points to iterate k
This has similar efficiency analysis to Jacobi except a little more sensitive to latency
  • Same amount of communication but twice as many messages
In electrical power system and similar simulations, one gets irregular sparse matrices and no way to get such a clean parallel algorithm
  • In fact not clear if there is a sequential Gauss Seidel that converges

HTML version of Basic Foils prepared March 15 00

Foil 32 Red Black Parallel Gauss Seidel IV

From Collection of Extra Foils for CPS615 PDE Iterative Solution Discussion CPS615 Spring Semester 00 -- March 00. *
Full HTML Index
Fortunately very irregular problems like power systems tend not to be huge.
  • 100's or 10,000's not billions of points
  • Thus can use very difficult optimized "full matrix exploiting zero structure" algorithms
In PDE case, one can show for some elliptic problems that red black can be better than original ordering
For some hyperbolic equations used in computational fluid dynamics, this is not so clear and SOR methods are used
If one has more complex differencing schemes e.g. fourth order differencing, then red-black does not work but a scheme with more colors (and more update phases) can be devised
  • Difficulty is irregular graphs not complex stencils

© Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu

If you have any comments about this server, send e-mail to webmaster@npac.syr.edu.

Page produced by wwwfoil on Mon Mar 20 2000