Homework Assignment 6
1. I wrote a program in C which generates a sequence of random numbers in the interval [0,1). It uses the linear congruential method. It accepts the number of random numbers to generate as a parameter from the command line. I ran the program for a number of numbers of numbers, in particular 5, 10, 50, 100, 500, 1000, 5000, 10000, and 15000.
The operation of the program is quite straightforward and is detailed with comments in the code itself. After getting a value for N from the command line, the program generates N random numbers, stores them in an array, and then calculates and outputs the appropiate statistical data. The routine rnd generates random numbers by forming the integer a*seed + c, truncating off the first 32-16=16 bits, and then dividing by 2^16 to get a floating point type in [0,1).

Plots of the deviation of the average and the variance (square of the standard deviation) from the expected values are in the file plot1.ps. The expected values are those for a constant probability distribution on the unit interval. The file plot1.ps is a postscript file, which can be viewed using ghostscript:
% gs plot1
If this is a problem please let me know (rideout@npac) as I can easily make hard copies. (The plots were generated using xmgr.)

After some expected large variations for small sample sizes, the errors in the average and variance quickly approach zero. This is what we would expect from the central limit theorem, which essentially states that these estimators for the mean and variance converge to the true values in the limit as N goes to infinity.

2. I modified my program to print the sequence of random numbers rather than the statistical information on the sequence, and to use the simple LCG given. For odd seeds, the period of the generator is four. For even seeds not divisible by four the period is two. For seeds divisible by four the period is one. (The period is the number of numbers generated before the LCG repeats.)
Why?

So in summary, seed*period*a mod m = seed . (For c=0.) This equation implicitly defines the period for the LCG's with c=0.

3. I modified the program to print pairs of adjacent numbers. Comments in the code describe the changes.
I generated 4096 random numbers for the two LCG's given, and plotted the successive pairs of numbers generated. The results are again in the form of postscript files.

The first LCG produced good results for the first few numbers, but then quickly converged to zero. Thus on the plot there is actually a very large number of points at (0,0). Obviously this is a very poor generator!

I believe that the problem is that the process of multiplying a large integer by an even number (66) does not give the correct answer. It in fact gives a number a little smaller than the correct answer, as I tested the multiplication by a hand calculation. By large I mean integers that would have to be represented by more than 32 bits. Somehow this causes the numbers to converge toward m divided by a power of two. Then each time this number is doubled, so that eventually we get to m itself. m is a 1 with lots of 0's after it, so truncated it turns into zero. I tried using m=2^31-1, but this only delays the arival of zero.

The second generator fails because a is too small. Each time the generator is called it returns a number 5 times as large (drop the integer part). This results in the adjacent pairs of numbers falling on lines whose slope is 5. These lines determine the next value for any given value. But random numbers are not supposed to be correlated with the previous numbers! Thus this also is a very poor generator.
Incidently, I checked to see if the odd value of m had something to do with this pattern, but I get the same type pattern for m=2^31.

A good random number generator should uniformly fill this plot, rather than prefering one region over another (or one line or point!). This would mean that there is no correlation between sucessive numbers.