Given by Geoffrey C. Fox at CPS615 Spring Semester 00 on March 00. Foils prepared March 15 00
Spring Semester 2000
Geoffrey Fox
Northeast Parallel Architectures Center
Syracuse University
111 College Place
Syracuse NY
gcf@npac.syr.edu
gcf@cs.fsu.edu
This introduces the three fundamental types of PDEs -- Elliptic, Parabolic and Hyperbolic -- and studies the numerical solution of Elliptic Equations
The sparse matrix formulation is used and iterative approaches -- Jacobi, Gauss-Seidel and SOR -- are defined
These are motivated by analogies between the equilibrium of diffusive equations and elliptic systems
Parallel computing is discussed for Gauss-Seidel
Eigenvalue analysis is used to discuss the convergence of the methods
We discuss Multigrid methods at a simple level
Cauchy Boundary Conditions: φ and ∂φ/∂n given on a curve C, where n is the normal direction to the curve. This is appropriate for hyperbolic equations
[Figure: curve C bounding the region determined by the given boundary conditions]
φ and ∂φ/∂n cannot both be given on a closed curve, as the solution obtained by integrating up along the path X Z1 Y will then clash with that obtained by integrating down along X Z2 Y
Dirichlet Boundary Conditions: φ given on closed curve C
Neumann Boundary Conditions: ∂φ/∂n given on closed curve C
General Equation is of the form:
a ∂²φ/∂x² + b ∂²φ/∂x∂y + c ∂²φ/∂y² = f(x, y, t)
b² > 4ac: Hyperbolic Equations
b² = 4ac: Parabolic Equations
b² < 4ac: Elliptic Equations
Equation Type | Boundary Curve | Boundary Conditions | Example
Hyperbolic | Open | Cauchy | Wave Equation
Parabolic | Open | Dirichlet or Neumann | Diffusion Equation
Elliptic | Closed | Dirichlet or Neumann | Poisson Equation
There are many iterative methods which can be applied to solve any matrix equation, but they are particularly effective for sparse matrices as they directly exploit the "zero structure"
Here we look at the three simplest, or stationary, methods -- so called because the iteration equation is the same at each iteration
The Jacobi method is based on solving for every variable locally with respect to the other variables; one iteration of the method corresponds to solving for every variable once. The resulting method is easy to understand and implement, but convergence is slow.
The Gauss-Seidel method is like the Jacobi method, except that it uses updated values as soon as they are available. In general, it will converge faster than the Jacobi method, though still relatively slowly.
Successive Overrelaxation (SOR) can be derived from the Gauss-Seidel method by introducing an extrapolation parameter ω. For the optimal choice of ω, SOR converges faster than Gauss-Seidel by an order of magnitude.
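A minimal sketch (not from the foils; plain NumPy on a small dense test matrix, function names illustrative) of how the three updates differ -- a real sparse solver would loop only over the nonzero entries of each row:

```python
import numpy as np

def jacobi_step(A, b, x):
    # Every component is computed from the OLD vector x
    x_new = np.empty_like(x)
    for i in range(len(b)):
        off_diag = A[i, :] @ x - A[i, i] * x[i]
        x_new[i] = (b[i] - off_diag) / A[i, i]
    return x_new

def gauss_seidel_step(A, b, x):
    # Components are overwritten in place, so later rows see the NEW values
    x = x.copy()
    for i in range(len(b)):
        off_diag = A[i, :] @ x - A[i, i] * x[i]
        x[i] = (b[i] - off_diag) / A[i, i]
    return x

def sor_step(A, b, x, omega):
    # Gauss-Seidel value blended with the old value using the parameter omega
    x = x.copy()
    for i in range(len(b)):
        off_diag = A[i, :] @ x - A[i, i] * x[i]
        x_gs = (b[i] - off_diag) / A[i, i]
        x[i] = omega * x_gs + (1.0 - omega) * x[i]
    return x
```

Repeating any of these steps until the residual b - Ax is small gives the corresponding stationary method.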
The generic iteration strategy is to split the matrix as A = M - N
Ax = b then implies Mx = Nx + b
So write M x(k) = N x(k-1) + b   (*)
All iteration methods have this form for different choices of M and N for given A and b
We must ensure that iteration (*) converges, and we naturally prefer that it converges fast!
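A sketch of the splitting view (not from the foils; NumPy assumed, names illustrative). Choosing M = diag(A) gives Jacobi, and M = the lower triangle of A gives Gauss-Seidel:

```python
import numpy as np

def split_iteration(A, b, M, x0, num_iters):
    # A = M - N, so Ax = b becomes the iteration M x(k) = N x(k-1) + b
    N = M - A
    x = x0
    for _ in range(num_iters):
        x = np.linalg.solve(M, N @ x + b)   # one application of (*)
    return x

# Jacobi:       x = split_iteration(A, b, np.diag(np.diag(A)), x0, 100)
# Gauss-Seidel: x = split_iteration(A, b, np.tril(A), x0, 100)
```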
One can change A and b by preconditioning
M1⁻¹ A (M2⁻¹ M2) x = M1⁻¹ b
is the same equation as before for any choice of matrices M1 and M2
All these choices are designed to accelerate convergence of iterative methods
Anew = M1⁻¹ A M2⁻¹
xnew = M2 x
bnew = M1⁻¹ b
We choose M1 and M2 to make our standard methods perform better
There are specialized preconditioning ideas and perhaps better general approaches such as multigrid and Incomplete LU (ILU) decomposition
Anew xnew = bnew has the same form as before and we can apply any of the methods that we used on Ax = b
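As a minimal illustration (not from the foils), the simplest choice M1 = diag(A), M2 = I -- diagonal or "Jacobi" preconditioning -- can be formed explicitly:

```python
import numpy as np

def diagonal_precondition(A, b):
    # M1 = diag(A), M2 = I:  Anew = M1^-1 A,  bnew = M1^-1 b,  xnew = x
    d = np.diag(A)
    A_new = A / d[:, None]   # divide row i by A[i, i]
    b_new = b / d
    return A_new, b_new

# Any iterative method applied to (A_new, b_new) has the same solution x
# as A x = b, but possibly converges in fewer iterations.
```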
For Jacobi the iteration matrix G = M⁻¹N takes the form D⁻¹(D - A), where D is the diagonal part of A
One can calculate the eigenvalues analytically in this case and find that if one has N grid points, the largest eigenvalue of G is cos(π/N) ≈ 1 - π²/(2N²)
Thus if one requires an error of 10⁻⁶, then the number of iterations is given by a dreadfully large number K, where (1 - π²/(2N²))^K ≈ 10⁻⁶, i.e. K is of order N²
The slowest convergence is for a discrepancy (residual) that varies slowly from grid point to grid point.
For this smooth discrepancy to be "noticed" by the iterative method, points a long way apart need to affect each other. This takes of order N steps, as information moves slowly (one grid point per iteration) in Jacobi
Then, after the information arrives, one still needs to actually correct the value
Thus convergence takes of order N² iterations
On the other hand, if the residual varies rapidly with position, it will be corrected quickly, as only a few iterations are needed for the information to be communicated
Thus a rapidly varying residual corresponds to the smallest (in magnitude) eigenvalues of the iteration matrix
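A small numerical check of this (a sketch, not from the foils, assuming the standard 1D Laplacian on N interior points): the spectral radius of the Jacobi iteration matrix comes out as cos(π/(N+1)), which approaches 1 like 1 - π²/(2N²), consistent with the estimate above.

```python
import numpy as np

def jacobi_spectral_radius(N):
    # 1D Laplacian on N interior points: tridiagonal (-1, 2, -1)
    A = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    D = np.diag(np.diag(A))
    G = np.linalg.solve(D, D - A)            # Jacobi iteration matrix D^-1 (D - A)
    return max(abs(np.linalg.eigvals(G)))

for N in (15, 31, 63):
    print(N, jacobi_spectral_radius(N), np.cos(np.pi / (N + 1)))
```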
Jacobi and Gauss-Seidel give a formula for x(k) in terms of x(k-1); call this xbasic(k)
Overrelaxation forms xSOR(k) = ω xbasic(k) + (1 - ω) xSOR(k-1)
Typically only 0 < ω < 2 is sensible; ω < 1 is relaxation and 1 < ω < 2 is overrelaxation
It is "over" because if ω > 1, you go further in the direction of the new estimate than calculated
In "relaxation", you conservatively average the prediction with the old value
Unfortunately, the best value of ω cannot be calculated except in simple cases, and so SOR is not so reliable for general problems
One can show that for Jacobi, the best choice is ω = 1, i.e. the relaxation strategy is not very useful
For Gauss-Seidel, in the simple one-dimensional case with N grid points, one can show that there is an optimal choice ω = 2(1 - π/N), i.e. almost equal to 2
Then the number of iterations needed by optimal SOR to get a given error can be shown to be proportional to N, not N² as in Jacobi or Gauss-Seidel. So the ratio of iterations is:
Jacobi: 2N²/π²
Gauss-Seidel: N²/π²
SOR on Gauss-Seidel: N/π
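A sketch (not on the foils) of a numerical experiment exhibiting this scaling on the 1D Poisson model problem with N interior points; NumPy is assumed and the stopping tolerance is arbitrary.

```python
import numpy as np

def count_iterations(N, method, omega=1.0, tol=1e-6, max_iter=200000):
    # 1D Poisson problem -u'' = 1 on N interior points, u = 0 at both ends
    h2 = (1.0 / (N + 1)) ** 2
    b = np.full(N, h2)
    u = np.zeros(N)
    for k in range(1, max_iter + 1):
        new = np.empty_like(u) if method == "jacobi" else u   # Jacobi: separate array
        for i in range(N):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < N - 1 else 0.0
            u_basic = 0.5 * (left + right + b[i])
            new[i] = omega * u_basic + (1.0 - omega) * u[i]
        u = new
        # residual of the tridiagonal (-1, 2, -1) system
        r = b - 2.0 * u
        r[:-1] += u[1:]
        r[1:] += u[:-1]
        if np.max(np.abs(r)) < tol:
            return k
    return max_iter

N = 31
print("Jacobi       ", count_iterations(N, "jacobi"))
print("Gauss-Seidel ", count_iterations(N, "gs"))
print("SOR          ", count_iterations(N, "gs", omega=2.0 * (1.0 - np.pi / N)))
```

Jacobi takes roughly twice as many sweeps as Gauss-Seidel, while SOR with ω near 2(1 - π/N) needs far fewer.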
The remaining foils discuss some details that are not essential
Details on the convergence of Gauss-Seidel
Detailed derivation of the convergence of Jacobi in one dimension
Discussion of SOR with computation of the optimal ω for the Jacobi case
Motivation of iterative methods with an artificial time whose discretization gives the index k for the iteration
We remarked that the key problem with iterative methods is that they get the detail (short wavelength) structure correct quickly, but convergence is controlled by the coarse (long wavelength) structure
Then in simple methods one needs of order N² iterations to get good results
Ironically, one goes to large N to get the detail; if all you wanted was the coarse structure, a smaller mesh would be fine
The basic idea in multigrid is key in many areas of science
We get coarse structure from small N and fine detail from large N
Material taken from the tutorial by Ulrich Ruede at http://www.mgnet.org/mgnet/www/
Take Laplace's equation in the unit square with initial guess φ = 0 and boundary conditions that are zero except on one side
For an N = 31 by 31 grid it takes around 1000 (of order N²) iterations to get a reasonable answer
[Figures: boundary conditions and exact solution, and the iterate after 1, 10, 100 and 1000 iterations]
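A minimal sketch of this experiment (not the foils' code; NumPy assumed, with an illustrative choice of boundary values and stopping criterion):

```python
import numpy as np

N = 31                                    # interior grid points per side
u = np.zeros((N + 2, N + 2))              # array includes the boundary layer
u[0, :] = 1.0                             # boundary values: zero except on one side

for k in range(5000):
    # Jacobi sweep: each interior point becomes the average of its four OLD
    # neighbours; the right-hand side is evaluated before the assignment
    new = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:])
    change = np.max(np.abs(new - u[1:-1, 1:-1]))
    u[1:-1, 1:-1] = new
    if change < 1e-6:
        break

print("sweeps used:", k + 1)
```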
Suppose we have a finest level M(1) with N by N points (in 2D)
Then the k'th coarsest approximation M(k) to this has N/2^k by N/2^k points
One way to think about multigrid is that M(k+1) can form a preconditioner to M(k), so that one can replace the natural matrix A(k) by A⁻¹(k+1) A(k)
The basic issue is that A(k) and A(k+1) are of different sizes, so we need prolongation and restriction to map solutions between levels k and k+1
[Figure: a V-cycle -- relax on the finest grid, restrict to successively coarser grids with a relax at each level, then interpolate back up, relaxing again at each level]
In the picture, "relax" is the application of a standard iteration scheme (e.g. Jacobi, Gauss-Seidel, conjugate gradient) to "solve" for the short wavelength structure at a given level
"Interpolate" takes a solution on a coarser grid and interpolates it to find a solution at half the grid spacing
"Restrict" takes the solution on a given grid and averages it to find the solution on the coarser grid
procedure Cycle(level, u, f)
  if level = coarsest then
    solve A(level) u = f exactly
  else
    relax u a few times on A(level) u = f
    r = f - A(level) u
    restrict r to level + 1
    v = 0; repeat mc times: v = Cycle(level + 1, v, restricted r)
    interpolate v to level and correct u = u + v
    relax u a few times on A(level) u = f
  endif
endprocedure
The exact solution satisfies A(level) uexact = f; writing uexact = u + v, the correction v obeys A(level) v = r = f - A(level) u
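The same cycle as a runnable sketch (not the foils' code) for the 1D Poisson problem, assuming Gauss-Seidel smoothing, full-weighting restriction and linear interpolation; grid sizes are of the form 2^L - 1 so each coarser level has half as many points.

```python
import numpy as np

def relax(u, f, sweeps=3):
    # Gauss-Seidel sweeps on -u'' = f with u = 0 at both ends, h = 1/(len(u)+1)
    h2 = (1.0 / (len(u) + 1)) ** 2
    for _ in range(sweeps):
        for i in range(len(u)):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < len(u) - 1 else 0.0
            u[i] = 0.5 * (left + right + h2 * f[i])
    return u

def residual(u, f):
    h2 = (1.0 / (len(u) + 1)) ** 2
    left = np.concatenate(([0.0], u[:-1]))
    right = np.concatenate((u[1:], [0.0]))
    return f - (2.0 * u - left - right) / h2

def restrict(r):
    # full weighting onto the coarser grid (every other point)
    return 0.25 * r[0:-2:2] + 0.5 * r[1:-1:2] + 0.25 * r[2::2]

def interpolate(v, n_fine):
    # linear interpolation back to the finer grid
    u = np.zeros(n_fine)
    u[1::2] = v
    vpad = np.concatenate(([0.0], v, [0.0]))
    u[0::2] = 0.5 * (vpad[:-1] + vpad[1:])
    return u

def cycle(u, f, mc=1):
    if len(u) <= 3:                       # coarsest level: solve (essentially) exactly
        return relax(u, f, sweeps=50)
    u = relax(u, f)                       # pre-smoothing removes short wavelengths
    r = restrict(residual(u, f))          # residual moved to the coarser grid
    v = np.zeros(len(r))
    for _ in range(mc):                   # mc = 1 is a V-cycle, mc = 2 a W-cycle
        v = cycle(v, r, mc)
    u = u + interpolate(v, len(u))        # coarse-grid correction
    return relax(u, f)                    # post-smoothing

N = 31
f = np.ones(N)
u = np.zeros(N)
for _ in range(10):
    u = cycle(u, f)
print("residual:", np.max(np.abs(residual(u, f))))
```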
The parameter mc determines the exact strategy for how one iterates through the different meshes
One ends up with a full cycle as shown
[Figure: the Full Multigrid Cycle, sweeping repeatedly between the finest and coarsest grids]
One must exploit the special structure of the matrix
Gauss-Seidel for "general matrices" cannot be parallelized
[Figure: a Gauss-Seidel update stencil -- neighbours already visited in the sweep are at iteration k, the others still at iteration k-1]
This is the best approach to parallelizing Gauss-Seidel for the natural ordering along rows -- first in x, then y (and then z in 3D)
4 processors with a cyclic row decomposition
Consider points on lines parallel to the diagonal
All points in a given phase can be updated in parallel, as one only needs updated (iteration k) points from the previous phase and old (iteration k-1) points from the next phase
There is load imbalance, but at worst processors differ by 1 in load, and the effect goes away for large systems
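A serial sketch of this wavefront ordering for Laplace's equation with the 5-point stencil (not from the foils; NumPy assumed). It visits the points phase by phase along anti-diagonals and produces exactly the same values as the row-by-row sweep; in a parallel code the inner loop over a phase is what gets distributed.

```python
import numpy as np

def gauss_seidel_wavefront_sweep(u):
    # u includes the boundary layer; interior points are (i, j) with 1 <= i, j <= G
    G = u.shape[0] - 2
    # Phase p holds the interior points with i + j = p (a line parallel to the
    # anti-diagonal). Each such point only needs iteration-k values from phase
    # p-1 and iteration-(k-1) values from phase p+1, so a whole phase is independent.
    for p in range(2, 2 * G + 1):
        for i in range(max(1, p - G), min(G, p - 1) + 1):
            j = p - i
            u[i, j] = 0.25 * (u[i - 1, j] + u[i + 1, j] + u[i, j - 1] + u[i, j + 1])
    return u
```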
[Figure: suppose the conventional Gauss-Seidel ordering of the update -- along rows, starting at the bottom left. All points needed by a general point P at iteration k lie below the dotted line through P parallel to the diagonal; points are labelled by iteration number (k or k-1) and by phase number, which runs from 1 to 31]
The pipeline method has high communication costs (as one will in fact use a "block cyclic" decomposition to preserve locality) and is complex to implement well
Thus instead we note that we can get new versions of Gauss-Seidel by reordering the update order -- this could (in principle) make for a better or worse method (or, more likely, we won't know whether it is better or worse!)
There is a natural reordering which is typically as good if not better, and for which parallelism is "trivial"
This ONLY works for a nearest-neighbor stencil, but there are versions of red-black for similar stencils
K = (G²/2) + 1, where the grid is G by G
[Figure: the first 4 and the middle 4 labels of the Gauss-Seidel update order are shown]
In red-black ordering, you color every alternate point red and every other point black
This gives the checkerboard pattern shown on the previous foil
Now reorder the update so that all red points are updated first, then all black points
Updating red points to iteration k only requires black points at iteration k-1
Updating black points to iteration k requires NO black points, just red points at iteration k
So one can divide the parallel update into two phases
Do normal block decomposition
Parallel Phase I: Update all Red Points
Parallel Phase II: Update all Black Points
This has a similar efficiency analysis to Jacobi except a little more sensitive to latency
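A sketch of one red-black sweep for Laplace's equation (not from the foils; NumPy assumed). Each of the two phases is a single data-parallel operation; in a distributed-memory version each phase becomes an exchange of halo rows with neighbouring processors followed by a local update of that colour.

```python
import numpy as np

def red_black_sweep(u):
    # u includes the boundary layer; colour interior points by the parity of i + j
    i, j = np.indices(u.shape)
    interior = (i > 0) & (i < u.shape[0] - 1) & (j > 0) & (j < u.shape[1] - 1)
    red = interior & ((i + j) % 2 == 0)
    black = interior & ((i + j) % 2 == 1)

    for mask in (red, black):       # Phase I: red points, Phase II: black points
        avg = np.zeros_like(u)
        avg[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                  u[1:-1, :-2] + u[1:-1, 2:])
        u[mask] = avg[mask]         # red sees old black values; black sees new red values
    return u
```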
In electrical power system simulations and similar problems, one gets irregular sparse matrices and there is no way to get such a clean parallel algorithm
Fortunately, very irregular problems like power systems tend not to be huge.
In the PDE case, one can show for some elliptic problems that red-black ordering can be better than the original ordering
For some hyperbolic equations used in computational fluid dynamics, this is not so clear, and SOR methods are used
If one has more complex differencing schemes, e.g. fourth-order differencing, then red-black does not work, but a scheme with more colors (and more update phases) can be devised