Load balancing is needed for the polygon calculation. A scattered or cyclic distribution may work well; alternatively, balance the load dynamically by using the work distribution measured for the previous frame to balance the next frame.
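As an illustration of the two options, here is a minimal Python sketch. It assumes each polygon has a measured cost from the previous frame (for example, time spent on it); the function names and the greedy "heaviest polygon to least-loaded processor" rule are illustrative choices, not something prescribed by these notes.

```python
import heapq

def cyclic_assign(num_polygons, num_procs):
    """Scattered/cyclic distribution: polygon i goes to processor i mod P."""
    return [i % num_procs for i in range(num_polygons)]

def rebalance_from_previous_frame(prev_costs, num_procs):
    """Assign polygons for the next frame using costs measured last frame.

    Greedy heuristic (an assumption of this sketch): take polygons from
    most to least expensive and give each to the least-loaded processor.
    """
    heap = [(0.0, p) for p in range(num_procs)]  # (current load, processor)
    heapq.heapify(heap)
    owner = [None] * len(prev_costs)
    order = sorted(range(len(prev_costs)), key=lambda i: -prev_costs[i])
    for i in order:
        load, p = heapq.heappop(heap)
        owner[i] = p
        heapq.heappush(heap, (load + prev_costs[i], p))
    return owner

def loads(owner, costs, num_procs):
    """Total cost assigned to each processor."""
    out = [0.0] * num_procs
    for i, p in enumerate(owner):
        out[p] += costs[i]
    return out

# Hypothetical costs where a cyclic distribution happens to be unlucky.
costs = [10, 1, 10, 1, 10, 1, 10, 1]
cyclic_loads = loads(cyclic_assign(len(costs), 2), costs, 2)
dynamic_loads = loads(rebalance_from_previous_frame(costs, 2), costs, 2)
```

The dynamic scheme only helps if the cost of a polygon changes slowly from frame to frame, which is the assumption behind reusing the previous frame's data.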
Parallel overhead is incurred because two different data decompositions are used (polygon domain and pixel domain). Since polygons are irregular, they will overlap pixel regions owned by different processors. To handle this, polygons can be cut at processor boundaries.
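Cutting a polygon at processor boundaries can be sketched as follows. The sketch assumes, purely for illustration, that the pixel domain is divided into horizontal strips of equal height, one per processor, and it uses Sutherland-Hodgman style clipping against each strip; the notes do not say which pixel decomposition or clipping method is intended.

```python
def clip_half_plane(poly, y_cut, keep_above):
    """Clip a polygon (list of (x, y)) to y >= y_cut or to y <= y_cut."""
    def inside(p):
        return p[1] >= y_cut if keep_above else p[1] <= y_cut

    out = []
    n = len(poly)
    for k in range(n):
        a, b = poly[k], poly[(k + 1) % n]
        if inside(a):
            out.append(a)
        if inside(a) != inside(b):
            # The edge crosses the cut, so a[1] != b[1] and t is well defined.
            t = (y_cut - a[1]) / (b[1] - a[1])
            out.append((a[0] + t * (b[0] - a[0]), y_cut))
    return out

def area(poly):
    """Polygon area by the shoelace formula."""
    n = len(poly)
    s = 0.0
    for k in range(n):
        (x0, y0), (x1, y1) = poly[k], poly[(k + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def split_by_strips(poly, strip_height, num_procs):
    """Return {processor: piece of poly inside that processor's strip}."""
    pieces = {}
    for p in range(num_procs):
        y_lo, y_hi = p * strip_height, (p + 1) * strip_height
        piece = clip_half_plane(poly, y_lo, keep_above=True)
        piece = clip_half_plane(piece, y_hi, keep_above=False)
        if len(piece) >= 3 and area(piece) > 1e-12:  # drop degenerate pieces
            pieces[p] = piece
    return pieces

# A square spanning two strips is cut into two pieces of equal area.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
square_pieces = split_by_strips(square, strip_height=2, num_procs=2)

# A small triangle inside strip 0 is not cut at all.
triangle = [(0, 0), (1, 0), (0, 1)]
triangle_pieces = split_by_strips(triangle, strip_height=2, num_procs=2)
```

Each cut adds vertices and produces extra pieces to send, which is one concrete source of the parallel overhead mentioned above.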
Communication overhead must be minimized, since a frame has to be rendered and displayed every 40 msec for real-time animation. Little communication can be afforded if latencies and communication times are in the msec range. One option to consider is overlapping calculation with communication.
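A rough budget calculation shows why this matters. The sketch below uses a simple cost model that is an assumption of this example (per-message latency plus volume divided by bandwidth), and all numbers other than the 40 msec frame time are made up for illustration. The overlapped case is an idealized bound in which communication is hidden completely behind calculation.

```python
FRAME_BUDGET_MS = 40.0  # from the text: one frame every 40 msec

def frame_time_ms(compute_ms, num_messages, latency_ms,
                  bytes_sent, bandwidth_bytes_per_ms, overlap=False):
    """Estimated time per frame on one processor.

    Communication time is modeled as latency per message plus transfer
    time. With overlap=True, calculation and communication are assumed
    to proceed fully in parallel (best case).
    """
    comm_ms = num_messages * latency_ms + bytes_sent / bandwidth_bytes_per_ms
    if overlap:
        return max(compute_ms, comm_ms)
    return compute_ms + comm_ms

# Hypothetical numbers: 30 ms of calculation, 10 messages at 1 ms latency,
# 100000 bytes at 20000 bytes/ms (5 ms of transfer).
no_overlap = frame_time_ms(30.0, 10, 1.0, 100000, 20000)
with_overlap = frame_time_ms(30.0, 10, 1.0, 100000, 20000, overlap=True)

misses_budget = no_overlap > FRAME_BUDGET_MS
meets_budget = with_overlap <= FRAME_BUDGET_MS
```

With these illustrative numbers, ten messages at msec latency already consume a quarter of the frame time, so the frame misses the budget unless communication is reduced or overlapped with calculation.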