Many scheduling problems involve allowing for a number of pairwise restrictions on which jobs can be done simultaneously. For instance, in attempting to schedule classes at a university, two courses taught by the same faculty member cannot be scheduled for the same time slot. Similarly, two courses that are required by the same group of students also should not conflict. The problem of determining the minimum number of time slots needed subject to these restrictions is a graph coloring problem (GC).
GC has considerable application to a large variety of complex problems involving optimization. Among those problems is the one we have dealt with, which is a large scale academic class scheduling. Constraints for these kind of problems --class scheduling or timetabling-- are usually expressible in the form of pairs of incompatible objects (e.g., pairs of classes that cannot be assigned to the same room at the same time period). Such incompatibilities are usefully embodied through the structure of a graph. Each object (e.g. class) is represented by a node and each incompatibility is represented by an edge joining the two nodes. A coloring of this graph is then simply a partitioning of the objects into blocks (or colors) such that no two incompatible objects end up in the same block. Thus, optimal solutions to such problems may be found by determining minimal coloring for the corresponding graphs. Unfortunately, this may not always be accomplished in a reasonable amount of time.
Again, the mapping between a simple case of class scheduling and a graph-based representation goes as follows: The events are represented by the vertices of the graph, and a pair of vertices are joined by an indirected edge if and only if the corresponding events cannot take place at the same time. Scheduling the events subject to the given constraints is therefor equivalent to coloring the corresponding graph such that no two adjacent vertices are of the same color. The determination of the minimum number of intervals of time needed for the schedule is therefore that same as finding the minimum number of colors required for the graph. This is known as the chromatic number of the graph, and its determination for an arbitrary graph is yet unsolved problem.
When dealing with this kind of mapping between timetabling and graph coloring, the first algorithm that a researcher would investigate would be the simple saturation algorithm (DSatur, for short) of Brelaz. As we mentioned in the previous section that the timetabling problem need to be simple in order for DSatur to be effective. Before outlining this algorithm, we would like to briefly state some of the "characteristics" or structure of the problem we have dealt with to show our justification or reasoning for using this algorithm. In addition, we also believe that our observations apply to other similar instance of the problems.
Observation: After carrying out the mapping, we have noticed that graphs representing the courses tended to be rather loosely connected. This is mainly because any given professor was assigned at most four or five courses per semester, therefore, the graph would show fully-connected clumps of four or five vertices. That is we have fully (or almost fully) connected components within the overall graph. Also, we observed that those components were sparsely connected, resulting in a somewhat sparse or loosely connected overall graph.
Because of this observation we have made the choice of using DSatur algorithm for its proven ability to work quite well on sparse graphs. We also made two modifications to the algorithm to accommodate our problem setup.
Next, I'll outline two versions of the algorithm.
Let V the vertex set, and E the set of edges of the graph.
|
Here is another version of the same algorithm:
Let V the vertex set, and E the set of edges of the graph.
|
So, how did we modify the above algorithm to deal with our problem?
In scheduling courses, there is a fixed number of time slots per week, and it is preferable to distribute the courses among each of these time slots so that students have many options when selecting courses. Therefore, the algorithm was modified to select colors for vertices so that, among the previously used legal colors for a vertex in V, the one used the fewest number of times previously was selected. This small modification served to distribute the courses more evenly over the time slots. In a typical graph coloring setup, a user always tries to minimize the number of colors used.
Our second modification to the basic DSatur algorithm is that, due to our use of two sets of timeslots (lecture timeslots and lab timeslots) and our criterion of scheduling one set prior to the other, we had to modify the method to color the lecture-course set of vertices first, followed by the lab-course set of vertices. Of course, without this modification there is no distinction between various sets of vertices. Also, throughout scheduling we needed to do "re-coloring" of the graph we constructed to get a legal schedule.
First, lets outline the 5 basic steps of the sequential version:
Let V the vertex set, and E the set of edges of the graph.
|
The first Step can be parallelized quite easily since it is a one time shot of sorting. Perhaps using bucket sort in parallel with a run time of theta(n), where n is the number of elements. Any other parallel sorting algorithm will do.
The second Step is also done only once so parallelizing it may not be worth the effort.
The third step would involve communication between nodes to determine the one with the maximum degree. Let V = {v1, ..., vn}.
Step 4 is a follow-up to step 3, using the computed set F(vi) for which |F(vi) is maximum, vi will color itself -or by the master node- with the least or lowest color by examining the set it has of its neighbors' colors.
To terminate the coloring process, the master node has to determine that all vertices have been colored.
Overall, it is a 3-step parallel process, with steps 1 and 2 are only done once.
Most of overhead is incurred from communication between the master and slaves to determine the coloring, kind of message-passing parallelism. Run time of step 1 and 2 is about (constant * n) or O(n) where n is number of nodes. For an m-edge graph, with m >= n, its overall runtime without our abovementioned modification is theta(m).