11. Scalability and Fault-tolerance in Collaborative Systems

 

Scalability and fault tolerance are among the most important issues in collaborative systems, as they are in distributed systems in general.

 

Replicating servers across participating hosts in a collaborative environment addresses the scalability problem that arises when the load on a server (the number of participating sessions) crosses a certain threshold. The current API supports two approaches. First, the notion of groups within a session allows us to define one object per group and to place these objects on different machines. Second, an Event Channel that exceeds its capacity can be split, and the resulting Event Channels connected to each other as supplier and consumer, since one Event Channel can act as a consumer or supplier of another Event Channel.
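The second approach can be illustrated with a minimal sketch. It is not the system's API or the CORBA Event Service interface: the classes EventChannel and Collector, the capacity parameter, and the policy of reserving the last consumer slot for a downstream channel are all assumptions made for illustration. The point it shows is only that a channel exposing the same push operation as any consumer can be attached to another channel.

```python
class Collector:
    """Hypothetical leaf consumer that records the events it receives."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def push(self, event):
        self.received.append(event)


class EventChannel:
    """Hypothetical push-style channel (illustrative, not a real API)."""

    def __init__(self, capacity):
        if capacity < 2:
            raise ValueError("capacity must be at least 2")
        self.capacity = capacity   # maximum number of direct consumers
        self.consumers = []        # any object with a push(event) method
        self.overflow = None       # downstream channel created on split

    def push(self, event):
        # A channel is itself a consumer: it forwards whatever it receives.
        for consumer in self.consumers:
            consumer.push(event)

    def connect(self, consumer):
        if self.overflow is not None:
            # Already split: new consumers go to the downstream channel.
            self.overflow.connect(consumer)
        elif len(self.consumers) < self.capacity - 1:
            self.consumers.append(consumer)
        else:
            # Assumed policy: the last slot is reserved for a second
            # channel, which this channel supplies with events.
            self.overflow = EventChannel(self.capacity)
            self.consumers.append(self.overflow)
            self.overflow.connect(consumer)


# Five participants on channels that each accept three direct consumers.
root = EventChannel(capacity=3)
participants = [Collector(f"user{i}") for i in range(5)]
for p in participants:
    root.connect(p)

root.push("join")
```

In this sketch no channel ever holds more than three direct consumers, yet an event pushed into the first channel reaches all five participants through the chain. In a real deployment the downstream channel would be placed on a different host.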

 

Fault tolerance in collaborative systems can be achieved by migrating sessions to a different participating host, with minimal disruption, whenever the machine hosting the server crashes. The ObjectServices agent, a distributed directory service, allows sessions to be migrated to a different participating host when a session terminates unexpectedly on one of the hosts. The Event Channel can store events persistently so that events are not lost on system failures.
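The interplay between the directory and persistent events can be sketched as follows. This is an illustration under stated assumptions, not the ObjectServices agent or the Event Channel implementation: the Directory and EventLog classes, the host and session names, the JSON-lines file format, and the rule of picking the first live host alphabetically are all hypothetical. The sketch also assumes that the log file resides on storage that the new host can reach.

```python
import json
import os
import tempfile


class EventLog:
    """Hypothetical append-only event log, one JSON object per line."""

    def __init__(self, path):
        self.path = path

    def append(self, event):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the event to disk before returning

    def replay(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


class Directory:
    """Hypothetical stand-in for a directory service: session -> host."""

    def __init__(self, hosts):
        self.alive = set(hosts)
        self.location = {}

    def register(self, session, host):
        self.location[session] = host

    def host_failed(self, host):
        """Reassign every session of a failed host; return the moves."""
        self.alive.discard(host)
        moved = {}
        for session, where in list(self.location.items()):
            if where == host:
                if not self.alive:
                    raise RuntimeError("no live host left for " + session)
                target = sorted(self.alive)[0]  # assumed selection rule
                self.location[session] = target
                moved[session] = target
        return moved


# A session runs on hostA and logs two events before hostA crashes.
log_path = os.path.join(tempfile.mkdtemp(), "whiteboard.log")
log = EventLog(log_path)
directory = Directory(["hostA", "hostB", "hostC"])
directory.register("whiteboard", "hostA")
log.append({"seq": 1, "op": "draw"})
log.append({"seq": 2, "op": "erase"})

moved = directory.host_failed("hostA")
# The new host reopens the log and replays events to rebuild session state.
restored = EventLog(log_path).replay()
```

After the simulated crash, the directory maps the session to a surviving host, and replaying the log on that host recovers both events in their original order.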