
Foil 67: Compilers, Caches and Data Locality

From the Fox presentation, Fall 1995: CPS615 Basic Simulation Track for Computational Science -- Fall Semester 95/96/97, by Nancy McCracken and Geoffrey C. Fox


Because the owner's-compute rule is obeyed, a good parallelizing compiler should be able to find the "much better" algorithm "automatically", since inverting loops and blocking are standard optimization strategies.
Note that the "much better" parallel algorithm is also a correct sequential algorithm: it naturally uses each j value N-1 times, as the j block stays fixed in cache while the i values are cycled through.
  • Now the size of the block J is controlled by the cache size, not by the processor memory size as in the parallel case
  • Note that each i value is used J times and each j value is used N-1 times
The general lesson is that the amount of computation and the amount of data re-use are as important as the amount of communication.
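As a concrete illustration of the blocking described above, here is a minimal sketch in C of a generic O(N^2) pairwise computation (N-body style) with the j loop blocked for cache. The function name pairwise_blocked, the block size JBLOCK, and the toy interaction term are illustrative assumptions, not code from the course.

/*
 * Sketch of loop blocking for cache: a block of x[j] values is held
 * in cache while all i values are cycled through, so each x[j] in the
 * block is reused rather than refetched from memory.
 */
#include <stddef.h>

#define N      4096
#define JBLOCK 256              /* chosen so a block of x[j] fits in cache */

void pairwise_blocked(const double x[N], double f[N])
{
    /* Outer loop over blocks of j: each block of x[j] is brought into
       cache once and then reused for every i. */
    for (size_t jb = 0; jb < N; jb += JBLOCK) {
        size_t jend = (jb + JBLOCK < N) ? jb + JBLOCK : N;

        /* Cycle all i values against the cached j block: over the whole
           computation each x[j] is used N-1 times, and each x[i] is
           used JBLOCK times per block pass. */
        for (size_t i = 0; i < N; i++) {
            double acc = f[i];
            for (size_t j = jb; j < jend; j++) {
                if (i != j) {
                    double d = x[i] - x[j];
                    acc += 1.0 / (d * d + 1e-9);  /* toy pairwise term */
                }
            }
            f[i] = acc;
        }
    }
}

The same loop structure serves both the sequential (cache-blocked) and parallel (memory-blocked) cases; only the choice of block size changes.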



