Sequential and Parallel Performance I
To understand the performance of general algorithms in parallel or in caches, one must understand how the arrays are stored.
Note first that the labeling is not necessarily the same as the storage order:
if p(l) is any permutation of l=0…d-1, then we can store
This can be used to improve sequential performance by ensuring that, when a given digit l is being processed, its storage digit p(l) lies in the range 0 ≤ p(l) < L, where the cache can hold 2^L entries.
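
As a rough illustration of the idea above, the following Python sketch maps a d-digit binary label to a storage address by placing label digit l at storage digit p(l). The function names `storage_address` and `fits_in_cache`, the bit-placement convention, and the sample values of d, L, and p are assumptions made for this example; they are not taken from the original slides.

```python
def storage_address(label, p):
    """Return the storage address for a d-digit binary label.

    Assumed convention: binary digit l of `label` is placed at
    binary digit p[l] of the address, where p is a permutation
    of 0..d-1 given as a list.
    """
    d = len(p)
    if sorted(p) != list(range(d)):
        raise ValueError("p must be a permutation of 0..d-1")
    address = 0
    for l in range(d):
        bit = (label >> l) & 1   # extract label digit l
        address |= bit << p[l]   # place it at storage digit p(l)
    return address


def fits_in_cache(l, p, L):
    """True if processing digit l touches pairs of entries that lie
    within one aligned block of 2**L consecutive addresses."""
    return 0 <= p[l] < L


# Hypothetical example: d = 4 digits, cache holds 2**L = 4 entries.
d, L = 4, 2
p = [2, 3, 0, 1]  # label digits 2 and 3 are stored in digits 0 and 1

addresses = [storage_address(i, p) for i in range(2 ** d)]

# Two labels that differ only in digit 2 are stored 2**p[2] apart.
a = storage_address(0b0000, p)
b = storage_address(0b0100, p)
stride = b - a
```

With this choice of p, labels that differ only in digit 2 or digit 3 sit at most three addresses apart, so processing those digits stays inside a block of 2^L entries, while digits 0 and 1 would need a different permutation to get the same locality.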