
Foil 43 Compilers, Caches and Data Locality

From Fox Presentation Fall 1995: CPS615 Basic Simulation Track for Computational Science -- Fall Semester 97, by Geoffrey C. Fox


Because the owner-computes rule is obeyed, a good parallelizing compiler should be able to find the "much better" algorithm "automatically", since loop interchange (inverting loops) and blocking are standard optimization strategies
Note that the "much better" parallel algorithm is also a correct (and efficient) sequential algorithm, as it naturally uses each j value N-1 times: a block of j values is held fixed in cache while the i values are cycled through (see the sketch after this list)
  • Now the block size J is controlled by the cache size, not by the processor memory size as in the parallel case
  • Note each i value is used J times and each j value N-1 times
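
A minimal C sketch of the blocking idea, assuming a toy pairwise O(N^2) interaction; the names N, J, x, force and the interaction formula are illustrative assumptions, not taken from the foil:

    /* Naive version (assumed form): for each i, sweep over all j.
       Every j value is re-fetched from memory on each outer iteration. */
    #include <stddef.h>

    #define N 4096   /* problem size (assumed)                    */
    #define J 256    /* block size, chosen to fit a block in cache */

    void naive(double x[N], double force[N])
    {
        for (size_t i = 0; i < N; i++)
            for (size_t j = 0; j < N; j++)
                if (i != j)
                    force[i] += 1.0 / (x[i] - x[j]);  /* toy pair term */
    }

    /* Blocked ("much better") version: the loops are inverted and the
       j loop is tiled.  A block of J values of x[j] stays fixed in cache
       while all i values are cycled through, so each j value is reused
       N-1 times and each i value is reused J times per block.           */
    void blocked(double x[N], double force[N])
    {
        for (size_t jb = 0; jb < N; jb += J)          /* pick a j block     */
            for (size_t i = 0; i < N; i++)            /* cycle all i values */
                for (size_t j = jb; j < jb + J; j++)  /* reuse cached block */
                    if (i != j)
                        force[i] += 1.0 / (x[i] - x[j]);
    }

Both versions do the same arithmetic; only the order of memory accesses changes, which is exactly the transformation (loop interchange plus blocking) a parallelizing compiler is expected to apply.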
The general lesson is that the amount of computation and the amount of data reuse are as important as the amount of communication



© Northeast Parallel Architectures Center, Syracuse University, npac@npac.syr.edu
