Program: FIBMULT, a pseudo-random number generator using a multiplicative lagged Fibonacci algorithm, implemented in C for sequential and distributed memory parallel computers. Generates uniformly distributed pseudo-random floating point numbers in the interval [0.0, 1.0).
Version: 1.0, February 1995.
Author:
Paul Coddington,
Northeast Parallel Architectures Center
at Syracuse University,
paulc@npac.syr.edu/A>
Code: C code for the generator, plus
a test program which uses the generator, and a sample input and
output file, are available at
ftp://nhse.npac.syr.edu/pub/random/generators/fibmult/C/
A sequence of odd random integers X(n) is obtained from the formula
Unlike lagged Fibonnacci generators which use subtraction, addition, or
XOR, this generator seems to give "good" random numbers for any lag,
not necessarily large (see [Coddington]). However larger lags give
larger periods and empirically seem to give "better" random numbers
(see [Marsaglia], [Knuth], [Coddington]).
The period of this generator is
(2p-1)*2b-3,
where p is the
largest lag and b is the number of bits in the integers
X(n), i.e. b = log2(M).
The lag needs to be chosen large enough to give a period
much longer than the number of random numbers to be generated.
Only certain lags (p,q) will give the maximal period of the
generator (see [Marsaglia], [Knuth]). Below are some of those lags, with
the corresponding periods for 32-bit integers. As a comparison, note
that linear congruential generators have a period of about
10^{9} (which is too short for most purposes), or
10^{18} for 64-bit integers (which is fine for most purposes).
A Teraflop-year is about 10^{21} floating point operations.
Note that very little is known theoretically about this generator,
however it performs very well in all empirical statistical tests (see
[Marsaglia], [Coddington]).
This does not necessarily mean it will perform well for your
application.
It is a good idea to check your results by also running your program
a good linear congruential generator
using at least 48-bit arithmetic, such as RANF and DRAND48 which are
standard on Unix systems, or a combined linear congruential generator
such as the Lecuyer generator (see [James], [Coddington]).
The program accesses the "lag table", i.e. the array storing the
previous p values in the sequence, using a "circular list", i.e. the
elements of the array are not moved with every new addition, only the
pointers to the (n-p)th and (n-q)th value are changed.
In order for the sequence to be random, the initial values of the lag
table have to be "random enough". There have not to my knowledge been
any quantitive tests of this point, however initializing the lag table
using a good linear congruential generator (here I use the Unix/C
generator RAND) seems empirically to work fine.
The computation required to generate a new random number, apart from
updating 2 pointers, is 1 integer multiplication, which multiplies two
32-bit integers and returns the answer modulo 2^{32} (this is the
standard for unsigned ints in C), and 1 floating point multiplication,
to convert the integers into a floating point number in [0,1).
The routines in the program are called using a
standard interface.
These include routines for reading
and writing the lag table and lag pointers,
to allow checkpointing of the user program.
Note that you MUST call the initialization subroutine
before calling the generator.
F. James, "A Review of Pseudo-random Number Generators",
Comput. Phys. Comm. 60 (1990) 329.
D.E. Knuth, "The Art of Computer Programming Vol. 2: Seminumerical
Methods", (Addison-Wesley, Reading, Mass., 1981).
P. L'Ecuyer, "Random numbers for simulation", Comm. ACM 33:10 (1990) 85.
G.A. Marsaglia, "A current view of random number generators",
in Computational Science and Statistics: The Interface,
ed. L. Balliard (Elsevier, Amsterdam, 1985).
The Algorithm
The program implements a lagged Fibonnacci pseudo-random number
generator using multiplication, with lags p and q,
i.e. F(p,q,*) in the notation of [Marsaglia].
X(n) = X(n-p) * X(n-q) mod M
where M is a large integer (taken to be 231 here).
p q period
9689 4187 10^{2925}
4423 2098 10^{1340}
2281 1029 10^{695}
1279 418 10^{393}
607 273 10^{191}
521 168 10^{166}
250 103 10^{84}
127 63 10^{46}
97 33 10^{38}
55 24 10^{25}
43 22 10^{21}
31 13 10^{18}
24 10 10^{16}
17 5 10^{14}
7 3 10^{11}
5 2 10^{10}
I recommend using at least (55,24), and ideally much larger lags
such as (2281,1029). Larger lags should give higher quality random
numbers, with the only possible drawbacks being increased memory use
and initialization time. However the memory use is only of order
p words,
so p=2281 will add less than 10 kilobytes to the memory required
for the user program.
The Parallel Algorithm
The parallel code is exactly the same as the sequential code, except
a copy of the generator is run independently on each processor. The only
difference in the code is that the generators on each processor need to
be seeded differently. It is very important that the seeds for different
processors not be correlated.
Note that since different generators are being run on different
processors, the results will be dependent on the number of processors,
and a program that is run on a certain number of processors and
checkpointed (i.e. the current state of the random number generator is
stored) cannot be restarted using a different number of processors.
The Program
This program is based on the implementation of RANMAR in [James].
It is the same as RANMAR except it uses multiplication instead of
subtraction, the seeds are odd integers instead of reals, and it
does not include an extra Weyl generator.
References
P.D. Coddington, "Analysis of Random Number Generators Using Monte Carlo
Simulation", Int. J. Mod. Phys. C 5, 547 (1994).
NPAC technical report SCCS-526, available from
ftp://ftp.npac.syr.edu/pub/by_index/sccs/papers/ps/0500/sccs-0526.ps.Z