The material shown here will be covered in much more detail in the forthcoming book, Concurrent Programming: The Java Programming Language, to be published by Oxford University Press in 1998.
Introduction
Sequential Example Programs
This is an introduction to using the Java programming language in concurrent or multithreaded applications. The context is operating systems courses and programming as opposed to software engineering. Topics covered are race conditions when threads share data, critical sections, mutual exclusion, semaphores, monitors, message passing, the rendezvous, remote procedure calls, distributed or network programming, and parallel processing. Solutions to the classical problems talked about in operating systems courses (the dining philosophers, the bounded buffer producers and consumers, and the database readers and writers) are shown in Java. Also shown is how to animate algorithms using the command set of the Xtango animation interpreter, animator.
All of the examples described and hyperlinked here may be retrieved by anonymous ftp from site ftp.mcs.drexel.edu in directory pub/shartley and file concProgsJava.tar.gz. Java is designed to be a platform-independent language, so all of these examples, including the animated ones, will run without change on Sun's Solaris 2.x UNIX for Sparc, Microsoft Windows 95/NT for Intel-based PCs, and the Apple Macintosh.
In this compressed (with GNU's gzip) tar archive, there is a directory lib that contains three subdirectories: Utilities, Synchronization, and XtangoAnimation. The path to the lib directory needs to be put into your CLASSPATH environment variable so that your Java programs can import the classes in the subdirectories of lib. For example, suppose you unpack the archive so that lib is in directory /home/you/Java. Then on a UNIX system, put the following line in your shell startup file (setenv is C-shell syntax, typically found in .cshrc):
setenv CLASSPATH /usr/local/JDK/lib/classes.zip:/home/you/Java/lib:.
On a Windows 95/NT system, put the following line in your AUTOEXEC.BAT file (adjusting the drive letters and paths to match your installation):
SET CLASSPATH=C:\JAVA\JDK\LIB\CLASSES.ZIP;D:\LIB;.
To test your CLASSPATH setting, try these commands.
java Utilities.GetOpt
java XtangoAnimation.XtangoAnimator
These examples illustrate some of the sequential features of the language. An on-line tutorial is available from Sun Microsystems for a more detailed discussion of the syntax and semantics of the Java language.
These three examples show how to write a single-class program containing one or more procedures or functions and a main() method where execution of the program begins.
This example shows how to access a method defined in another class.
This example shows how to read numbers and words (strings) from the keyboard (or stdin in UNIX terminology).
This example shows how to open a file and write binary floating-point numbers (not translated into ASCII) to the file. The numbers are read from the keyboard with a tokenizer as in the previous example. Then the file is read and the numbers printed out. The file is deleted in the finally block whether or not there is an IO error.
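The pattern described above can be sketched as follows. This is an illustrative sketch, not the linked example; the class and file names are made up:

```java
import java.io.*;

public class BinaryFileDemo {
    // Write doubles in binary form (not ASCII), read them back, and delete
    // the file in a finally block whether or not an IOException occurs.
    public static double[] writeThenRead(double[] values) throws IOException {
        File f = new File("numbers.bin");   // illustrative file name
        try {
            DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(f)));
            out.writeInt(values.length);
            for (double v : values) out.writeDouble(v);  // raw bits, not text
            out.close();

            DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(f)));
            int n = in.readInt();
            double[] result = new double[n];
            for (int i = 0; i < n; i++) result[i] = in.readDouble();
            in.close();
            return result;
        } finally {
            f.delete();   // runs on success and on IO error alike
        }
    }

    public static void main(String[] args) throws Exception {
        double[] back = writeThenRead(new double[] {1.5, -2.25, 3.125});
        if (back.length != 3 || back[1] != -2.25) throw new AssertionError();
        System.out.println("round trip ok; file deleted");
    }
}
```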
This last example shows how to read and parse a text file containing some input data in a specific format.
The GetOpt class can be used to process command line arguments when a Java program is run, like the following
javac aClass.java java aClass -a -f theFile -w -80 -h3.33 arg1 arg2
A genetic algorithm is an optimization technique that uses randomization instead of a deterministic search strategy. To maximize f(x) over some domain, the values of x in the domain are encoded in strings over some alphabet, usually {0,1}. A population of such strings, called chromosomes, is created at random. Then the genetic operators selection for reproduction, crossover, and mutation are applied to the population members. Selection is based on the fitness value of the chromosome, usually f(x). After this is repeated many times, the population will contain mostly highly fit chromosomes, one of which (it is hoped) is close to the maximum value of the function f(x) over the domain.
This example shows how to split a program up into packages. The Chromosome class is defined as a hierarchy of subclasses for each gene data type.
The XtangoAnimator class in the XtangoAnimation directory of the lib directory contains a collection of methods that implement the command set of the Xtango algorithm animation interpreter, animator. A Java program can create an instance of XtangoAnimator and call its methods to create graphical icons of various colors in a window and then move them around.
The XtangoAnimator class is designed to be used in a stand-alone application for algorithm animation. In contrast, the Xtango animator program reads commands from a text file or UNIX pipe and interprets them. A group at Duke University has implemented a Samba interpreter, called lambada. Samba is a superset of Xtango. The lambada interpreter also reads a text stream of commands from a file or pipe. The XtangoAnimator class defines methods that correspond to the Xtango animator commands and that can be called in a Java program.
Here are two example programs.
A sorting screen snapshot:
A thread is an execution or flow of control, maintained in the program counter register, in an address space. A process is a program in execution with at least one thread; a process can have more than one thread. All the threads in a process share its address space but have their own program counter and their own stack for local (also called automatic) variables and return addresses of invoked procedures.
In Java, a thread in the run-time interpreter calls the main() method of the class on the java command line. Each object created can have one or more threads, all sharing access to the data fields of the object.
The article An Introduction to Programming with Threads by Andrew D. Birrell (1989) motivates concurrent programming with threads:
The MyObject class contains some methods useful for multithreaded simulations:
This is a skeleton (model or template) for a multithreaded simulation.
Use a Scheduler object to get time-slicing of threads on Sun's Solaris 2.x. Java threads are already time-sliced on Windows 95/NT.
This program creates two windows, each a subclass of Frame, with some buttons to control the thread in each window that computes Fibonacci numbers. There is another thread in the Java virtual machine that is blocked, waiting for events like button clicks in the windows. When a button is clicked, the action() method is called by this thread to process the click. Meanwhile, the other two threads can be computing concurrently and displaying the Fibonacci numbers in their windows. Try coding this as a purely sequential program with only one thread or flow of control!
A screen snapshot:
This example shows how to send the output of each thread to a different file. This technique may be useful for debugging.
Two threads doing N=N+1 at about the same time is called a race condition, since one of the updates can get lost. In general, race conditions are possible when two or more threads access shared data at the same time and at least one of them modifies it.
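A minimal sketch of the lost-update race (the class name and iteration count are illustrative):

```java
public class RaceDemo {
    static int N = 0;   // shared, unsynchronized

    public static void main(String[] args) throws Exception {
        Runnable increments = () -> {
            for (int i = 0; i < 100000; i++) N = N + 1;  // read-update-write
        };
        Thread a = new Thread(increments), b = new Thread(increments);
        a.start(); b.start();
        a.join(); b.join();
        // Lost updates can only lower the total, never raise it, so N is
        // at most 200000; on most runs it will be less.
        if (N > 200000) throw new AssertionError();
        System.out.println("N = " + N + " (200000 if no updates were lost)");
    }
}
```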
Three different kinds of race conditions are illustrated with Java programs.
Multiple threads doing sum=fn(sum,m) to the shared variable sum is a race condition like N=N+1.
There is a fixed $10 million in this bank (10,000 accounts that each start out with $1000) but the auditor sometimes sees more, sometimes less. If the bank auditor is adding up the account balances while some funds are being moved from one account to another, an inaccurate total can be calculated. Can you explain how more money than is really there shows up sometimes in the sample output?
If two threads try to manipulate a queue at the same time, a node can get lost. Remember that the two threads may be executing in round-robin fashion on a shared CPU and that context switches can occur at any time.
The following sequence of queue snapshots shows node 4 not in the queue even though thread A appended it.
These examples show that concurrently executing threads that share data need to synchronize their operations and processing in order to avoid race conditions on shared data. Thread synchronization can be done with flag variables and busy waiting, as this example shows. Since it uses a lot of CPU cycles, busy waiting is inefficient. Blocking somehow would be better.
A critical section is a block of code in a thread that accesses one or more shared variables in a read-update-write fashion. In such a situation we want mutual exclusion: only one thread at a time can access (read-update-write) a shared variable. The mutual exclusion problem is how to keep two or more threads from being in their critical sections at the same time, where we make no assumptions about the number of CPUs or their relative speeds. A thread outside its critical section should not keep other threads outside their critical sections from entering, also called a ``safety'' property (absence of unnecessary delay). And no thread should have to wait forever to enter its critical section, also called a ``liveness'' property (eventual entry).
An atomic action ``makes an indivisible state transition: any intermediate state that might exist in the implementation of the action must not be visible to other threads'' (p. 60 Andrews' Concurrent Programming book). This means that nothing from another thread can be interleaved in the implementation of the action for it to be atomic. Critical sections need to be done as if they were one atomic action to avoid race conditions.
The mutual exclusion problem is to devise a pre-protocol and a post-protocol based on either hardware or software
Thread Ti, i = 1, 2, 3, ...
while (true) {
    outsideCS();
    wantToEnterCS(i);   // pre-protocol
    insideCS();
    finishedInCS(i);    // post-protocol
}
This sequence of examples shows successful and unsuccessful attempts to solve the mutual exclusion problem in software without specialized hardware instructions like test-and-set.
First Attempt: Strict Alternation.
Second Attempt: Check Other's Flag Variable, Then Set Own.
Third Attempt: Set Own Flag Variable, Then Check Other's.
Dekker's Solution: Take Turns Backing Off.
Semaphores can be used for mutual exclusion and thread synchronization. Instead of busy waiting and wasting CPU cycles, a thread can block on a semaphore (the operating system removes the thread from the CPU scheduling or ``ready'' queue) if it must wait to enter its critical section or if the resource it wants is not available.
Mutual exclusion pseudocode:
semaphore S = 1;
...
P(S);
N = N + 1;
V(S);
Condition synchronization pseudocode (resource availability):
semaphore tapeDrives = 7;
...
P(tapeDrives);
useTapeDrive();
V(tapeDrives);
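The pseudocode can be sketched in Java using the java.util.concurrent.Semaphore class, whose acquire() and release() correspond to P and V. That class postdates these notes; the Synchronization package classes shown later play the same role:

```java
import java.util.concurrent.Semaphore;

public class TapeDrives {
    // Counting semaphore initialized to the number of available drives.
    static final Semaphore tapeDrives = new Semaphore(7);

    static void useTapeDrive() throws InterruptedException {
        tapeDrives.acquire();        // P: blocks if all 7 drives are in use
        try {
            Thread.sleep(5);         // pretend to use the drive
        } finally {
            tapeDrives.release();    // V: return the drive
        }
    }

    public static void main(String[] args) throws Exception {
        Thread[] users = new Thread[10];   // more users than drives
        for (int i = 0; i < users.length; i++) {
            users[i] = new Thread(() -> {
                try { useTapeDrive(); } catch (InterruptedException e) {}
            });
            users[i].start();
        }
        for (Thread t : users) t.join();
        if (tapeDrives.availablePermits() != 7) throw new AssertionError();
        System.out.println("all drives returned");
    }
}
```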
Java has implicit binary semaphores of the form
Object mutex = new Object();
/*...*/
synchronized (mutex) {
    /*...*/
}
An Implicit Binary Semaphore Prevents Another Race Condition.
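As a minimal sketch of the idea (not the linked example), a synchronized block serializes the N=N+1 updates so no increment is lost:

```java
public class SynchronizedCounter {
    static int N = 0;
    static final Object mutex = new Object();   // implicit binary semaphore

    public static void main(String[] args) throws Exception {
        Runnable increments = () -> {
            for (int i = 0; i < 100000; i++) {
                synchronized (mutex) { N = N + 1; }  // one thread at a time
            }
        };
        Thread a = new Thread(increments), b = new Thread(increments);
        a.start(); b.start();
        a.join(); b.join();
        if (N != 200000) throw new AssertionError();  // no lost updates
        System.out.println("N = " + N);
    }
}
```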
Java does not have explicit binary and counting semaphores, so they are provided as classes in the Synchronization subdirectory of the lib directory. Their implementation will be shown later. Two explicit binary semaphores, one initialized to zero (impossible with an implicit binary semaphore), and one explicit counting semaphore are used to synchronize three threads so they obey the ``rules.'' The ``syntactic sugar'' in MyObject.java lets us write P(S) instead of S.P() for a semaphore S.
A producer thread deposits items and blocks if the bounded buffer fills up. A consumer thread fetches items and blocks if the bounded buffer is empty.
The implementation shown uses a circular array and can be used only with a single producer thread and a single consumer thread. Do you see why?
A linked list can be used to implement a first-in-first-out buffer that is not bounded and can be used by multiple producer threads and multiple consumer threads.
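A sketch of such an unbounded FIFO buffer, written as a Java monitor (explained in more detail in the monitor section later); the class name is illustrative:

```java
import java.util.LinkedList;

public class UnboundedBuffer {
    private final LinkedList<Object> queue = new LinkedList<Object>();

    public synchronized void deposit(Object item) {
        queue.addLast(item);
        notify();                   // wake one waiting consumer, if any
    }

    public synchronized Object fetch() throws InterruptedException {
        while (queue.isEmpty())     // while, not if: guard against barging
            wait();
        return queue.removeFirst();
    }

    public static void main(String[] args) throws Exception {
        final UnboundedBuffer b = new UnboundedBuffer();
        Thread producer = new Thread(() -> {
            for (int i = 0; i < 100; i++) b.deposit(Integer.valueOf(i));
        });
        producer.start();
        int sum = 0;
        for (int i = 0; i < 100; i++) sum += (Integer) b.fetch();
        producer.join();
        if (sum != 4950) throw new AssertionError();  // 0+1+...+99
        System.out.println("sum = " + sum);
    }
}
```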
Bounded buffers can be used to communicate information from one thread to another in a pipeline.
Semaphores can be used to solve the so-called ``classical'' synchronization problems found in many operating systems books: the sleeping barber, the five dining philosophers, and the database readers and writers.
A barber waits to cut hair. Customers enter the waiting room and take a seat if one is available. If the waiting room is full, they try again later. Otherwise, they wait until their turn for a hair cut.
Five philosophers sit around a table and think until hungry. Interspersed between the philosophers are five forks. A hungry philosopher must have exclusive access to both its left and right forks in order to eat. If they are not both free, the philosopher waits. The following algorithm does not deadlock: it never happens that all philosophers are hungry, each holding one fork and waiting for the other. It also allows maximal parallelism, an advantage: a philosopher never picks up and holds a fork while waiting for the other fork to become available when the held fork could be used for eating by its neighbor. However, it allows starvation: a philosopher's two neighbors can collaborate and alternate their eating so that the philosopher between them can never use the forks.
If a philosopher can hold a fork while waiting for the other fork, deadlock is possible, an extreme case of not having maximal parallelism. However, starvation is not possible. Each fork is represented by a semaphore and each hungry philosopher will do a ``P'' on its left fork and then its right fork.
We can fix the deadlock problem and retain no starvation, but we still do not have maximal parallelism. All philosophers pick up left then right except one designated philosopher who picks up right then left.
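The asymmetric scheme can be sketched with a counting semaphore per fork; java.util.concurrent.Semaphore stands in here for the Synchronization package's semaphore class, and each philosopher eats once so the program terminates:

```java
import java.util.concurrent.Semaphore;

public class AsymmetricDiners {
    static final int N = 5;
    static final Semaphore[] fork = new Semaphore[N];
    static final boolean[] ate = new boolean[N];

    static void philosopher(int i) throws InterruptedException {
        int left = i, right = (i + 1) % N;
        // Everyone picks up left then right, except philosopher N-1, which
        // picks up right then left; this breaks the circular wait, so the
        // system cannot deadlock.
        int first  = (i == N - 1) ? right : left;
        int second = (i == N - 1) ? left  : right;
        fork[first].acquire();
        fork[second].acquire();
        ate[i] = true;                    // eat
        fork[second].release();
        fork[first].release();
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < N; i++) fork[i] = new Semaphore(1);
        Thread[] t = new Thread[N];
        for (int i = 0; i < N; i++) {
            final int id = i;
            t[i] = new Thread(() -> {
                try { philosopher(id); } catch (InterruptedException e) {}
            });
            t[i].start();
        }
        for (Thread th : t) th.join();    // terminates: no deadlock possible
        for (int i = 0; i < N; i++)
            if (!ate[i]) throw new AssertionError();
        System.out.println("all philosophers ate; no deadlock");
    }
}
```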
A database can be accessed concurrently by threads that only want to read, but a writer thread must have exclusive access with respect to other readers and writers. The solution here allows writers to starve if readers keep arriving to read the database so that the number of current readers never drops to zero.
This sequence of attempts to implement counting semaphores with binary semaphores illustrates some of the subtleties of pure binary semaphores and thread scheduling. The first two attempts have a problem if a context switch occurs at point A since then V's on the binary semaphore blocked may get lost. The third solves that problem but introduces more context switching overhead. The last two are good solutions, with the latter being somewhat simpler.
Jurassic Park consists of a safari area, a number of single-passenger safari cars, some number of people, and a museum. Each person in the park will visit the museum for a random amount of time, then line up to take a safari ride, waiting for an empty car. Each car will wait for a passenger, then go out on safari for a random amount of time. The following solution has a major flaw. What is it? How can the flaw be fixed?
The XtangoAnimator class in the XtangoAnimation subdirectory of the lib directory can be used to animate the multiple producers and consumers bounded buffer and the dining philosophers.
A screen snapshot of the dining philosophers:
Semaphores are like goto's and pointers: mistake-prone; they work okay but lack structure and ``discipline.''
For example, a disastrous typo:
This leads to deadlock:
Nested critical sections can lead to deadlock:
P1: P(Q); P(S); ... V(S); V(Q);
P2: P(S); P(Q); ... V(Q); V(S);
A monitor is an object with some built-in mutual exclusion and thread synchronization capabilities. They are an integral part of the programming language so the compiler can generate the correct code to implement the monitor. Only one thread can be active at a time in the monitor, where ``active'' means executing a method of the monitor. Monitors also have condition variables, on which a thread can wait if conditions are not right for it to continue executing in the monitor. Some other thread can then get in the monitor and perhaps change the state of the monitor. If conditions are now right, that thread can signal a waiting thread, moving the latter to the ready queue to get back into the monitor when it becomes free.
Monitors can use either signal-and-exit or signal-and-continue signaling discipline. In the former, a signaling thread must leave the monitor immediately, at which point it is guaranteed that the signaled thread is the next one in the monitor. In the latter, the signaled thread is not guaranteed to be the next one in the monitor. In fact, barging can take place: some thread that has called a monitor method and is blocked until the monitor is free can get into the monitor before a signaled thread.
Here are monitors for the three ``classical'' problems using the signal-and-exit signaling discipline and condition variables. They are written in a Java-like but not Java pseudocode. The synchronized attribute in a method definition means that only one thread at a time can be active in any synchronized method.
Philosopher starvation is prevented by introducing a new state: very hungry. A philosopher is put into this state if it is hungry, if one of its neighbors puts down its forks, and if it cannot eat because the other fork is in use. A new rule is added: a hungry philosopher cannot eat if it has a very hungry neighbor. These changes will prevent a collaboration of two philosophers trying to starve the philosopher between them. Notice that signal-and-exit requires leaving and reentering the monitor to generate more than one signal.
Writer starvation is prevented by requiring readers that come along to read the database to wait if there is a waiting writer, even if other readers are currently reading the database. When the current readers finish, the waiting writer writes the database and then signals a waiting reader into the database. Each entering reader signals another waiting reader into the database.
Java uses the synchronized keyword to indicate that only one thread at a time can be executing in this or any other synchronized method of the object representing the monitor. A thread can call wait() to block and leave the monitor until a notify() or notifyAll() places the thread back in the ready queue to resume execution inside the monitor when scheduled. A thread that has been sent a signal is not guaranteed to be the next thread executing inside the monitor, compared to one that is blocked on a call to one of the monitor's synchronized methods. Also, it is not guaranteed that the thread that has been waiting the longest will be the one woken up with a notify(); an arbitrary thread is chosen by the JVM. Finally, when a notifyAll() is called to move all waiting threads back into the ready queue, the first thread to get back into the monitor is not necessarily the one that has been waiting the longest.
Each Java monitor has a single nameless anonymous condition variable on which a thread can wait() or signal one waiting thread with notify() or signal all waiting threads with notifyAll(). This nameless condition variable corresponds to a lock on the object that must be obtained whenever a thread calls a synchronized method in the object. Only inside a synchronized method may wait(), notify(), and notifyAll() be called.
Methods that are static can also be synchronized. There is a lock associated with the class that must be obtained when a static synchronized method is called.
A Java monitor may be designed with some methods synchronized and some not. The non-synchronized methods will form the public interface and will call the synchronized methods, which will be private.
An experiment was performed to determine if Java monitors are signal-and-exit or signal-and-continue. They use signal-and-continue. When a thread executes a notify(), Java does not necessarily move to the ready queue the thread that has been waiting the longest. Also, Java allows barging.
Here are Java monitors for the three ``classical'' problems. Because of signal-and-continue, the lack of named condition variables, and barging, there are some important things to be aware of. Most of the time it will be necessary to use a while loop instead of an if when doing a wait().
while (condition) try {wait();} catch (InterruptedException e) {}
The bounded buffer monitor can only be used by a single producer thread and a single consumer thread. The ``driver'' code is the same as that for the semaphore single-producer single-consumer bounded buffer. What could go wrong if more than one thread of each type used this monitor? How would you fix the monitor?
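One way to fix the monitor, sketched here, is to guard both operations with while loops and use notifyAll(), making it safe for multiple producers and multiple consumers (the class name is illustrative):

```java
public class BoundedBuffer {
    private final Object[] buffer;
    private int count = 0, front = 0, rear = 0;

    public BoundedBuffer(int size) { buffer = new Object[size]; }

    public synchronized void deposit(Object item) throws InterruptedException {
        while (count == buffer.length)  // while, not if: another producer may
            wait();                     // have refilled the slot after notify
        buffer[rear] = item;
        rear = (rear + 1) % buffer.length;
        count++;
        notifyAll();                    // no named conditions: wake everyone
    }

    public synchronized Object fetch() throws InterruptedException {
        while (count == 0)
            wait();
        Object item = buffer[front];
        front = (front + 1) % buffer.length;
        count--;
        notifyAll();
        return item;
    }

    public static void main(String[] args) throws Exception {
        final BoundedBuffer b = new BoundedBuffer(3);
        Runnable produce = () -> {
            try {
                for (int i = 0; i < 50; i++) b.deposit(Integer.valueOf(i));
            } catch (InterruptedException e) {}
        };
        Thread p1 = new Thread(produce), p2 = new Thread(produce);
        p1.start(); p2.start();
        int sum = 0;
        for (int i = 0; i < 100; i++) sum += (Integer) b.fetch();
        p1.join(); p2.join();
        if (sum != 2450) throw new AssertionError();  // 2 * (0+1+...+49)
        System.out.println("sum = " + sum);
    }
}
```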
Notice how inefficient the dining philosophers monitor is because a broadcast signal with notifyAll() must be sent whenever any philosopher puts down its forks due to Java's lack of named condition variables.
Since there are no named condition variables, another technique must be used to prevent starvation in the database readers and writers. The arrival times of readers forced to wait because of a waiting writer are maintained. When the waiting writer enters and then exits the database, all waiting readers that arrived before the writer finished writing are allowed to read the database.
As mentioned, Java does not have semaphores. Here is how they are implemented in the Synchronization package in the lib directory.
A binary semaphore can also be implemented in other ways, as this example shows. Compare and contrast the two implementations. Which do you prefer?
A lock acts like a binary semaphore except only the locking thread can unlock the lock.
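A sketch of such a lock as a Java monitor that records the owning thread; the class name is illustrative and the Synchronization package's version may differ in detail:

```java
public class SimpleLock {
    private Thread owner = null;   // null means the lock is free

    public synchronized void lock() throws InterruptedException {
        while (owner != null)      // wait until the lock is free
            wait();
        owner = Thread.currentThread();
    }

    public synchronized void unlock() {
        // Unlike V() on a binary semaphore, only the owner may unlock.
        if (owner != Thread.currentThread())
            throw new IllegalStateException("only the locking thread may unlock");
        owner = null;
        notify();                  // let one waiting thread acquire the lock
    }

    public static void main(String[] args) throws Exception {
        final SimpleLock lock = new SimpleLock();
        lock.lock();
        Thread intruder = new Thread(() -> {
            try { lock.unlock(); }                       // not the owner
            catch (IllegalStateException expected) {
                System.out.println("non-owner unlock refused");
            }
        });
        intruder.start();
        intruder.join();
        lock.unlock();                                   // owner may unlock
        System.out.println("done");
    }
}
```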
It is possible to use an object somewhat like a condition variable in a Java monitor. We can pull the code for the elements and spaces semaphores of the bounded buffer semaphore version into the bounded buffer implementation. The resulting bounded buffer can be used with multiple producer and multiple consumer threads.
A callback technique can be used to avoid waking up all the philosopher threads with a notifyAll(). An array of callback objects, convey, is used, one for each philosopher. If the forks are not available when the philosopher gets hungry, it waits inside its callback object for a notify().
The following implements a starvation-free synchronization algorithm for the readers and writers with a callback object for each thread to wait inside until it can access the database.
In contrast to named condition variables, it is not possible with this callback technique to wait in the middle of a monitor interface method for a signal and then continue executing inside the monitor interface method at that point after receiving the signal. The signaled thread has to reenter the monitor via an interface method.
A skeleton class for implementing named condition variables (exercise for the reader):
Here is the code for the XtangoAnimator Class.
Sometimes the phrase ``send a message to an object'' is used to describe a thread in one object calling a method in another object. Here, that phrase will be used to describe a thread in one object sending a message to a thread in another object, where the message is itself an object.
This technique is used for thread communication and synchronization in a computing environment where the threads do not have shared memory (since the threads reside in different virtual or physical machines). Hence the threads cannot share semaphores or monitors and cannot use shared variables to communicate. Message passing can still be used, of course, in a shared memory platform.
Messages are sent through a port or channel with an operation like send(port, message) and received from a port or channel with an operation like receive(port, message). Messages can be passed synchronously, meaning the sender blocks until the receiver does a receive and the receiver blocks until the sender does a send. Since the sender and receiver are at known points in their code at a known instant of time, synchronous message passing is also called a simple rendezvous with a one-way flow of information from the sender to the receiver.
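A simple rendezvous can be sketched as a Java monitor in which the sender does not return until a receiver has taken the message. This is an illustrative sketch, not one of the message passing classes described below:

```java
public class SyncChannel {
    private Object slot = null;
    private boolean full = false;

    // send blocks until a receiver has actually taken the message
    public synchronized void send(Object message) throws InterruptedException {
        while (full)                // wait for any earlier rendezvous to finish
            wait();
        slot = message;
        full = true;
        notifyAll();                // wake a waiting receiver
        while (full)                // block until the receiver takes it
            wait();
    }

    public synchronized Object receive() throws InterruptedException {
        while (!full)
            wait();
        Object message = slot;
        full = false;
        notifyAll();                // release the blocked sender
        return message;
    }

    public static void main(String[] args) throws Exception {
        final SyncChannel port = new SyncChannel();
        Thread sender = new Thread(() -> {
            try { port.send("hello"); } catch (InterruptedException e) {}
        });
        sender.start();
        Object m = port.receive();
        sender.join();              // sender was blocked until the receive
        if (!"hello".equals(m)) throw new AssertionError();
        System.out.println("received " + m);
    }
}
```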
In asynchronous message passing, the sender does not block. If there is not a receiver waiting to receive the message, the message is queued or buffered. The receiver still blocks if there is no queued or buffered message when a receive is executed.
In conditional message passing, the message remains queued until some condition, specified by the receiver, becomes true. At that time, the message is passed to the receiver, unblocking it.
A two-way flow of information, perhaps over the network, is called an extended rendezvous and can be implemented with a pair of sends and receives. Typically a client thread will use this technique to communicate with a server thread and request a service to be performed on its behalf. A similar situation is a worker thread contacting a master thread, asking for more work to do.
client or worker: send request; receive reply
server or master: receive request; perform service; send reply
Messages are objects and can be passed by reference within the same JVM,
or serialized through a pipe within the same JVM,
or serialized through a socket between JVMs that are on the same physical machine or on different physical machines.
The base data types, int, double, etc., can be sent as messages in binary or raw data format through a pipe or socket using the DataInputStream and DataOutputStream methods. They can also be sent as objects using the wrapper classes Integer, Double, etc.
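A sketch of sending raw binary data through a pipe within one JVM, using the DataInputStream and DataOutputStream methods (class name illustrative):

```java
import java.io.*;

public class RawDataPipe {
    public static void main(String[] args) throws Exception {
        PipedOutputStream pipeOut = new PipedOutputStream();
        PipedInputStream pipeIn = new PipedInputStream(pipeOut);
        final DataOutputStream out = new DataOutputStream(pipeOut);
        DataInputStream in = new DataInputStream(pipeIn);

        Thread sender = new Thread(() -> {
            try {
                out.writeInt(42);        // raw binary, not translated to ASCII
                out.writeDouble(3.25);
                out.close();
            } catch (IOException e) {}
        });
        sender.start();

        int i = in.readInt();            // blocks until the data arrives
        double d = in.readDouble();
        sender.join();
        if (i != 42 || d != 3.25) throw new AssertionError();
        System.out.println(i + " " + d);
    }
}
```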
Here is a collection of Java message passing classes. All of the message passing port classes implement the methods in the MessagePassing interface or the ConditionalMessagePassing interface. An exception is thrown when an error occurs in a port operation; a NotImplementedMethodException is used in implementing restricted rights ports (below). All classes except the conditional ones extend a common base class.
Asynchronous Port. A Vector is used to queue sent but not yet received messages.
Asynchronous Conditional Port. The receiver must pass an object that implements the Condition interface; that is, the object must contain a checkCondition() method that is used to determine which sent messages are eligible to be received.
Finite Buffer Asynchronous Port.
Receive-Only Rights Port. Send-Only Rights Port. These two filter classes can be wrapped around a message passing port to permit only sending or receiving on the port. This is done by overriding the restricted method with one that throws NotImplementedMethodException.
Integers and Floating-Point Numbers as Messages in a Pipe or Socket Port. The numbers are passed as binary or raw data types through a pipe within the same JVM or a socket between different JVMs.
Serialized Objects as Messages in a Pipe or Socket Port. The objects are serialized and deserialized using the writeObject() and readObject() methods through a pipe within the same JVM or a socket between different JVMs.
This is a simple example illustrating both synchronous and asynchronous message passing.
We can implement the bounded buffer producer and consumer with a set of empty messages representing the buffer slots. Will this same bounded buffer work with multiple producers and consumers?
This is a testing program for asynchronous, synchronous, finite buffer, and piped message passing within the same JVM. There are two types of threads in this collection of threads: those that produce work and those that perform or consume the produced work. A producer puts the work to be done into a message passing port that is called a bag of tasks because consumers reach into the bag to extract the next piece of work to do. If there is just a single thread producing work with many threads reaching into the bag, then this technique is called master/worker or worker crew.
Threads can use semaphores and monitors to handle mutual exclusion and condition synchronization. Suppose, though, that the threads do not share memory but are in nodes that have private memories and CPUs on a LAN, and suppose they still want to do condition synchronization to coordinate access to some shared resource. If all we have is message passing, can we implement some sort of distributed mutual exclusion algorithm? Suppose we also want to avoid a central server, which would be a bottleneck. We want to solve the N-node mutual exclusion problem so that mutual exclusion is guaranteed and no node is deadlocked or starved.
Basic Idea:
while (true) {
    outsideCS();
    chooseNumber();
    sendItToAllOtherNodes();
    waitForMessageFromAllOtherNodes();
    insideCS();
    postProtocol();
}
A node will send a ``reply'' or acknowledgement message to a node that has sent a request message, i.e., when ``asked'':
In the absence of shared memory, a collection of Java threads can use this technique to implement mutual exclusion. The threads send messages to all other threads, asking for permission to enter their critical sections. The threads are all in the same JVM, but we say memory is not shared here because the threads do not share any variables, semaphores, or monitors.
The quick sort algorithm can be parallelized for a shared memory multiple CPU machine by dedicating each CPU to a worker thread and using a message passing port as a bag of tasks. The main() method puts the whole array to be sorted into the bag. A worker extracts the task, chooses a pivot point, and partitions the array. Each of the two partitions is then put back into the bag for one of the workers to perform. Even though message passing is being used for a bag of tasks, shared memory is still required because the array is being sorted ``in place'' and the work requests being put into the bag are array index pairs and not pieces of the array itself.
AsyncMessagePassing task = new AsyncMessagePassing();
quicksort threads get work:
while (true) {
    m = (Task) receive(task);
    quickSort(id, m.left, m.right);
}
quicksort threads create work:
if (right-(l+1) > 0) send(task, new Task(l+1, right));
if ((l-1)-left > 0) send(task, new Task(left, l-1));
Animated Worker Crew Quick Sort.
These consumers are picky and will only conditionally accept messages that are smaller than some limit. This program tests both synchronous and asynchronous conditional message passing.
The distributed dining philosophers do not have a central server they can query for fork availability. Instead each philosopher has a servant who communicates with the two neighboring servants to negotiate the use of the forks. The servants pass needL, needR, passL, and passR messages back and forth. Each fork is always in the possession of some philosopher, one of the two on either side of the fork. When a philosopher finishes eating, it labels its two forks as dirty. A hungry philosopher's servant is required to give up a dirty fork in its possession, if asked for by its hungry neighbor's servant. This prevents starvation. Study carefully how conditional message passing is used. Does it matter whether synchronous message passing is used instead?
An extended rendezvous is also called a remote procedure call from a client to a server (or a worker to the master) because it resembles (and syntactic sugar can make it nearly identical to) a call to a procedure on a remote machine that is executed there. Typically the call represents a request for service, such as reading a file that resides on the remote machine. The server may handle the request in its main thread or the server may spawn a new thread to handle the request while the server's main thread handles additional requests for service from other clients. The latter gives greater throughput and efficiency since a lengthy request would otherwise delay the handling of requests from the other clients.
An addressing mechanism is needed so the client can contact an appropriate server. In the local case (everything in the same JVM), an object can be used as the place for the client and server to ``meet'' and establish a rendezvous. The server calls a method in the object and blocks until the client calls a method. At this point in time, both methods return a newly created object that the client and server subsequently use for the two-way flow of information. This object contains a message passing port shared by them. In the remote case, the client uses the server's machine name and a TCP/IP port number to address the server; the server ``listens'' on the TCP/IP port.
The extended rendezvous class implements this interface.
An object created from this class is used for the addressing described above. In the local case, one such object is used by both client and server. In the remote case, a client creates such an object using the server's machine name and port number in the object's constructor; the server uses just the port number.
When the rendezvous occurs, an object constructed from this class is returned to both the client and server. In the local case (within the same JVM), the client and server share this object and use it to transact (synchronous message passing of object references). In the remote case (between JVMs that might be on different physical machines), each gets its own object, and the object contains a socket to the other JVM (and machine). Objects are serialized through the socket. The case of sending raw data types through a pipe (same JVM) or a socket (different JVMs) is not implemented and is left as an exercise for the reader.
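The local-case transaction can be sketched from scratch with Java's built-in monitors. This is not the archive's ExtendedRendezvous class; it is a minimal illustration of the same idea, in which the client blocks in makeRequestAwaitReply() until the server has taken the request and deposited a reply:

```java
// Minimal local (same-JVM) rendezvous sketch, handling one
// transaction at a time. Not the archive's class; all names here
// are invented for illustration.
public class MiniRendezvous {
    private Object request, reply;
    private boolean haveRequest = false, haveReply = false;

    // Client side: deposit a request, block until the reply arrives.
    public synchronized Object makeRequestAwaitReply(Object req)
            throws InterruptedException {
        request = req; haveRequest = true;
        notifyAll();                       // wake a waiting server
        while (!haveReply) wait();
        haveReply = false;
        return reply;
    }

    // Server side: block until a request arrives, then take it.
    public synchronized Object awaitRequest() throws InterruptedException {
        while (!haveRequest) wait();
        haveRequest = false;
        return request;
    }

    // Server side: deposit the reply, waking the blocked client.
    public synchronized void makeReply(Object rep) {
        reply = rep; haveReply = true;
        notifyAll();
    }

    public static void main(String[] args) throws Exception {
        MiniRendezvous r = new MiniRendezvous();
        Thread server = new Thread(() -> {
            try {
                Integer id = (Integer) r.awaitRequest();
                r.makeReply(id + 1);       // toy service: add one
            } catch (InterruptedException e) { }
        });
        server.start();
        System.out.println(r.makeRequestAwaitReply(41)); // prints 42
        server.join();
    }
}
```

Because wait() releases the monitor, the client can block inside the synchronized method while the server enters awaitRequest() and makeReply() on the same object.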
This is a local case example. A command line option controls whether or not the server spawns off a new thread to handle the request. The clients and server all share an EstablishRendezvous object for addressing. Each time a client wants to rendezvous with the server, it calls the clientToServer() method to get an ExtendedRendezvous object whose clientMakeRequestAwaitReply() method is used to transact with the server. The client passes a reference to a RendezvousRequestReply object to the server; the object contains the data and a method for the server to call. The ExtendedRendezvous object is used only once by the client, although it could be reused for multiple clientMakeRequestAwaitReply() calls, as is done in the next example.
This is a remote case example. Suppose workstations named client0, client1, client2, client3, and client4 are connected on a local area network, along with a machine named server. The example compile and run shows, for UNIX, how to run each philosopher in its own JVM on a different physical machine. Each philosopher sends an Integer object containing its ID value to the server when it is hungry. Since this is a rendezvous, the philosopher blocks until it gets a reply indicating that its forks are available. The server spawns a new thread for each philosopher to handle the transactions. Each philosopher sends an Integer object containing the value -ID-1 when putting its forks down. Each philosopher has its own ExtendedRendezvous object whose clientMakeRequestAwaitReply() method it calls over and over again (in contrast to the previous example, in which the clients obtained a new ExtendedRendezvous object for each transaction with the server).
This program attempts to measure the amount of time it takes to transact a rendezvous. A client sends a message containing an array of length N to the server. The server adds one to each entry of the array and sends it back. The client does this M times and calculates the number of bytes sent per millisecond. The program can be run in two ways. The local run passes the message as a reference from the client to the server within the same JVM. The remote run serializes the message containing the array through a socket over the network to the server running in a different JVM, possibly on another physical machine.
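The shape of the measurement can be sketched locally with no sockets at all (class name, N, and M below are illustrative). Here the round trip is just a method call, so only the loop and array-update cost are measured, but the bytes-per-millisecond arithmetic matches the description:

```java
// Local sketch of the timing experiment (illustrative only; the real
// program passes the array through a rendezvous or serializes it
// through a socket). The "server" adds one to each entry; the client
// repeats M round trips and reports bytes sent per millisecond.
public class RendezvousTiming {
    static int[] serve(int[] a) {           // the server's work
        for (int i = 0; i < a.length; i++) a[i]++;
        return a;
    }

    public static void main(String[] args) {
        final int N = 100000, M = 1000;     // array length, repetitions
        int[] message = new int[N];

        long start = System.currentTimeMillis();
        for (int i = 0; i < M; i++)
            message = serve(message);       // one "round trip"
        long elapsed = Math.max(1, System.currentTimeMillis() - start);

        long bytes = (long) M * N * 4;      // 4 bytes per int, one way
        System.out.println(bytes / elapsed + " bytes/ms");
        System.out.println(message[0]);     // prints 1000: incremented M times
    }
}
```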
We have seen one example of parallelizing an algorithm to use multiple CPUs: the animated quick sort example. The algorithm requires shared memory, since the array is sorted ``in place'' and must be accessed by all the worker threads. A bag of tasks is used to distribute the work.
We can categorize parallel algorithms along several lines, using the abbreviations shown in parentheses:
In some cases the need for shared memory can be relaxed. If the shared data is read-only, it can be replicated or broadcast to each distributed memory and the copy stored there. In other situations, it might be possible to pass the shared data around as a message, to be updated by the currently owning thread. For example, it might be possible to replace
    /* shared */ int N;
    /* shared */ Object mutex = new Object();
    // ...
    synchronized (mutex) { N = N + 1; }

with

    /* local */ int N;
    // ...
    N = receive(port);
    N = N + 1;
    send(port, N);
FG incurs too much overhead and inefficiency unless the communication and synchronization tools are highly optimized, perhaps with hardware support. DM can be a network of workstations (NOW) or a specialized parallel architecture in which each CPU has its own memory and some kind of fast interconnect or switch connects them.
On a shared memory platform, a barrier can be used for thread synchronization. It is an extension of the semaphore idea. No thread can continue past the barrier until all threads have arrived at the barrier; in other words, each thread arriving at the barrier blocks until all the other threads have arrived.
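A barrier of this kind can be sketched directly with Java's built-in monitors. This is a from-scratch illustration, not the archive's barrier class:

```java
// Minimal reusable barrier sketch (illustrative, not the archive's
// class). Each of n threads calls gate(); the first n-1 callers
// block, and the last arrival releases everyone. A generation
// counter makes the barrier safely reusable for the next phase.
public class SimpleBarrier {
    private final int n;        // number of threads to wait for
    private int arrived = 0;
    private long generation = 0;

    public SimpleBarrier(int n) { this.n = n; }

    public synchronized void gate() throws InterruptedException {
        long myGeneration = generation;
        if (++arrived == n) {   // last thread to arrive
            arrived = 0;
            generation++;       // start the next phase
            notifyAll();        // release the other n-1 threads
        } else {
            while (generation == myGeneration) wait();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final int N = 4;
        SimpleBarrier b = new SimpleBarrier(N);
        Thread[] t = new Thread[N];
        for (int i = 0; i < N; i++) {
            final int id = i;
            t[i] = new Thread(() -> {
                try {
                    System.out.println("thread " + id + " before barrier");
                    b.gate(); // no one passes until all N have arrived
                    System.out.println("thread " + id + " after barrier");
                } catch (InterruptedException e) { }
            });
            t[i].start();
        }
        for (Thread th : t) th.join();
    }
}
```

The generation counter, rather than the arrival count, is what each waiting thread tests; this avoids the classic bug in which a fast thread re-enters the barrier for the next phase before slow threads have left the current one.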
These examples will be categorized according to CG or FG, SM or DM, MP or SY, WC or DP.
Calculate the first n prime numbers. Start up n filter threads. FG, SM or DM, MP, DP.
Sort an array of length n by creating a pipeline of length n. FG, SM or DM, MP, DP.
Sort an array of length n by creating two sorting threads and a merge thread. Send half the array to each sorting thread. FG, SM or DM, MP, WC. Can you finish the code?
Sort an array of length n by creating n/2 worker threads. FG, SM or DM, MP, DP. Can you finish the code?
For an N-by-N chess board, start up a thread for each row in column one in which a queen can be placed. For DM, the size of the board could be broadcast to each thread. CG, SM or DM, MP, DP.
For each CPU, start up a worker thread that reads a row number for the column one queen from the bag of tasks. CG, SM or DM, MP, WC.
Modify the previous example so each worker executes on a networked workstation. The rendezvous technique is used: each worker is like a client that asks the server for more work to do. The program can also be run entirely locally with the -w command line option. This tests the three kinds of EstablishRendezvous objects that can be constructed. CG, DM, MP, WC.
Sort an array of length n. In this example, every thread needs to communicate with every other thread, which is possible with SM, networked workstations, or a specialized DM architecture. FG, SM or DM, MP, DP.
FG, SM, MP and SY, DP.
FG, SM, SY, DP.
FG, SM, SY, DP.
Create a new worker thread each time a section of the array is partitioned. CG, SM, SY, DP.
FG, SM or DM, MP, DP.
Replace the barrier with additional message passing. FG, SM or DM, MP, DP.
Last modified 5 January 1997.
© 1997 Stephen J. Hartley
SJH shartley@mcs.drexel.edu