2: Scheduling Type Optimization


A: Graph Coloring and Timetabling

Many scheduling problems involve allowing for a number of pairwise restrictions on which jobs can be done simultaneously. For instance, in attempting to schedule classes at a university, two courses taught by the same faculty member cannot be scheduled for the same time slot. Similarly, two courses that are required by the same group of students also should not conflict. The problem of determining the minimum number of time slots needed subject to these restrictions is a graph coloring problem (GC).

GC has considerable application to a large variety of complex problems involving optimization. Among those problems is the one we have dealt with, which is a large scale academic class scheduling. Constraints for these kind of problems --class scheduling or timetabling-- are usually expressible in the form of pairs of incompatible objects (e.g., pairs of classes that cannot be assigned to the same room at the same time period). Such incompatibilities are usefully embodied through the structure of a graph. Each object (e.g. class) is represented by a node and each incompatibility is represented by an edge joining the two nodes. A coloring of this graph is then simply a partitioning of the objects into blocks (or colors) such that no two incompatible objects end up in the same block. Thus, optimal solutions to such problems may be found by determining minimal coloring for the corresponding graphs. Unfortunately, this may not always be accomplished in a reasonable amount of time.

Again, the mapping between a simple case of class scheduling and a graph-based representation goes as follows: The events are represented by the vertices of the graph, and a pair of vertices are joined by an indirected edge if and only if the corresponding events cannot take place at the same time. Scheduling the events subject to the given constraints is therefor equivalent to coloring the corresponding graph such that no two adjacent vertices are of the same color. The determination of the minimum number of intervals of time needed for the schedule is therefore that same as finding the minimum number of colors required for the graph. This is known as the chromatic number of the graph, and its determination for an arbitrary graph is yet unsolved problem.


B: Graph Coloring Algorithm

When dealing with this kind of mapping between timetabling and graph coloring, the first algorithm that a researcher would investigate would be the simple saturation algorithm (DSatur, for short) of Brelaz. As we mentioned in the previous section that the timetabling problem need to be simple in order for DSatur to be effective. Before outlining this algorithm, we would like to briefly state some of the "characteristics" or structure of the problem we have dealt with to show our justification or reasoning for using this algorithm. In addition, we also believe that our observations apply to other similar instance of the problems.

Observation: After carrying out the mapping, we have noticed that graphs representing the courses tended to be rather loosely connected. This is mainly because any given professor was assigned at most four or five courses per semester, therefore, the graph would show fully-connected clumps of four or five vertices. That is we have fully (or almost fully) connected components within the overall graph. Also, we observed that those components were sparsely connected, resulting in a somewhat sparse or loosely connected overall graph.

Because of this observation we have made the choice of using DSatur algorithm for its proven ability to work quite well on sparse graphs. We also made two modifications to the algorithm to accommodate our problem setup.

Next, I'll outline two versions of the algorithm.

Let V the vertex set, and E the set of edges of the graph.

  1. Repeat until all vertices are colored:
    1. Choose the next vertex v from V to be colored by selecting the one whose neighbors have already used the most colors. Break ties by choosing the one with the most uncolored neighbors.
    2. Color vertex v with the first color in the set of all colors that has not already been used by one of the neighbors, x of V, such that v and x are not equal.
  2. End Repeat

Figure 2.1: DSatur Graph Coloring Algorithm - version 1

Here is another version of the same algorithm:

Let V the vertex set, and E the set of edges of the graph.

  1. Initially take any node with the largest possible number of neighbors and color it with the smallest color.
  2. In general, for each v of V, there is a set F(v) of forbidden colors (these may be colors which have already been assigned to neighbors of v).
  3. At each step, we choose a node x of V for which the cardinality, |F(x)|, of F(x) is maximum (highest connectivity with the neighbors), and we color it with the smallest possible color.

Figure 2.2: DSatur Graph Coloring Algorithm - version 2

Some Analysis

So, how did we modify the above algorithm to deal with our problem?

In scheduling courses, there is a fixed number of time slots per week, and it is preferable to distribute the courses among each of these time slots so that students have many options when selecting courses. Therefore, the algorithm was modified to select colors for vertices so that, among the previously used legal colors for a vertex in V, the one used the fewest number of times previously was selected. This small modification served to distribute the courses more evenly over the time slots. In a typical graph coloring setup, a user always tries to minimize the number of colors used.

Our second modification to the basic DSatur algorithm is that, due to our use of two sets of timeslots (lecture timeslots and lab timeslots) and our criterion of scheduling one set prior to the other, we had to modify the method to color the lecture-course set of vertices first, followed by the lab-course set of vertices. Of course, without this modification there is no distinction between various sets of vertices. Also, throughout scheduling we needed to do "re-coloring" of the graph we constructed to get a legal schedule.


C: Parallelizing DSatur Algorithm

First, lets outline the 5 basic steps of the sequential version:

Let V the vertex set, and E the set of edges of the graph.

  • Step 1: Arrange the vertices by decreasing order degree (In and Out).
  • Step 2: Color a vertex of maximal degree with color 1.
  • Step 3: Choose a vertex with a maximal saturation degree. If there is an equality, choose any vertex of maximal degree in the uncolored subgraph.
  • Step 4: Color the chosen vertex with the least possible (lowered numbered) color.
  • Step 5: If all the vertices are colored, stop. Otherwise, return to Step 3.

Figure 2.3: DSatur - The sequential version

Parallelization

The first Step can be parallelized quite easily since it is a one time shot of sorting. Perhaps using bucket sort in parallel with a run time of theta(n), where n is the number of elements. Any other parallel sorting algorithm will do.

The second Step is also done only once so parallelizing it may not be worth the effort.

The third step would involve communication between nodes to determine the one with the maximum degree. Let V = {v1, ..., vn}.

A possible way to parallelize this step (with a bit of overhead) is to have a master node (i.e. vi) and the rest are slaves reporting back their degrees. The master will choose the one node vi (including itself) which has the maximum degree. Such parallelism is not quite inefficient unless we have a very large graph to deal with. As the process goes one, the input set becomes smaller and smaller. Also, at this step, not only each node communicate its degree to the master but also its F(vi) set (see Figure 2.2).

Step 4 is a follow-up to step 3, using the computed set F(vi) for which |F(vi) is maximum, vi will color itself -or by the master node- with the least or lowest color by examining the set it has of its neighbors' colors.

To terminate the coloring process, the master node has to determine that all vertices have been colored.

Overall, it is a 3-step parallel process, with steps 1 and 2 are only done once.

Most of overhead is incurred from communication between the master and slaves to determine the coloring, kind of message-passing parallelism. Run time of step 1 and 2 is about (constant * n) or O(n) where n is number of nodes. For an m-edge graph, with m >= n, its overall runtime without our abovementioned modification is theta(m).


Saleh Elmohamed