NORTHEAST PARALLEL ARCHITECTURES CENTER AT SYRACUSE UNIVERSITY
Written by Tomasz Stachowiak, NPAC, January 1997
The goals of this project are:
Initially, such a system was developed at NPAC based on InSoft's OpenDVE software development kit. Because the OpenDVE-based conferencing system had restrictions and limitations, we decided to develop our own conferencing software infrastructure. Reasons for creating the NPAC Conferencing System (NCS), independent of OpenDVE:
NCS was first implemented on the SGI Indy workstation. After it performed satisfactorily under the IRIX 5.X operating system, it was successfully ported to the PC Windows NT/95 platform.
A conference refers to a group of geographically dispersed nodes that are joined together and that are capable of exchanging audiographic and audiovisual information across various communication networks. Conference participants may have access to various types of media handling capabilities such as audio only (telephony), audio and data, audio and video, or audio, video and data.
A conference in NCS is based on the notion of a session. A session is a group of users on different machines connected by a virtual many-to-many channel. So far there can be only one session per machine, and only one user per machine can participate in the conference. Many applications can be attached to the session.
A conference participant can start a new session and invite others. The invitation is visible to the remote user provided that his Conference Manager is running. He can also enable the Conference Engine "watchdog" function, which automatically starts the Conference Manager and displays the incoming invitation. The remote user can accept or reject the invitation. If he accepts, he becomes a conference participant, and from that moment all his NCS applications collaborate within the current session. Each conference participant is free to invite other users to the conference. So far there is no other way to join a conference.
Each participant has his own user identification number, unique within the session. Every conference message includes this user id. There is always one special conference participant, the statekeeper. Although all participants hold all the information about the conference, one of them must be designated to provide that information to new participants and to resolve any state conflicts. Usually this is the conference initiator, but in some situations (e.g. the initiator has left) it may be another conference member.
A conference application is any application that is attached to the Conference Engine and collaborates within the session. Each participant can start a conference application. Applications can be started from the Conference Manager control panel or independently. If an application is started from the Conference Manager panel, it is started automatically on all participants' machines. Likewise, exiting the application from the Conference Manager panel closes it for all participants. If an application is started independently, the Conference Engine may reject its attempt to connect to the session.
The main design principle of NCS was modularity. A modular structure makes it easy and quick to modify and improve some parts of the system without affecting the others. New collaborative applications can be added on top of the core modules using the existing NCS mechanisms. This is especially important in the Internet environment, where the system context evolves dynamically.
The NPAC Conferencing System consists of two main modules. The first is the Conference Engine (CE), which contacts the remote hosts, processes control messages from them and passes event indications to the second module, the Conference Manager (CM). The Conference Engine also processes local messages from the Conference Manager. A set of API functions is provided for building a Conference Manager. There is also an API for conference application management.
NCS uses UDP (User Datagram Protocol) to speed up data exchange. For real-time applications UDP meets the performance requirements without affecting quality of service. It is up to the application to build a higher-level, more reliable protocol if necessary. However, implementing such a protocol for priority messages is also planned as an internal part of the Conference Engine.
The Conference Engine is the core of NCS. It performs the following functions:
The CE sends and receives all conference control messages; application data, however, is exchanged separately by the applications themselves. This avoids a bottleneck in the CE, which would otherwise be an intermediate point for all NCS messages. It is particularly important for real-time applications, for which this system was designed. For this reason applications keep copies of some necessary conference information (e.g. members' addresses), and the CE notifies them of any changes in that data.
Conference Engines on different hosts communicate over UDP sockets. Communication with the CM and with local applications uses the same type of connection, and even the same socket. To make this possible, a special control protocol was designed in which every message carries information about its sender, its receiver and its type.
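The report does not give the wire format of the control protocol. The following is only a minimal sketch of how a header carrying sender, receiver and message type could be packed in front of a payload. The field widths, the `RECEIVER_ALL` value and the function names are assumptions made for illustration and are not part of NCS.

```python
import struct

# Hypothetical header layout (not the real NCS format): sender id,
# receiver id and message type, each an unsigned 16-bit integer in
# network byte order.
HEADER = struct.Struct("!HHH")
RECEIVER_ALL = 0xFFFF  # assumed "all participants" marker


def pack_message(sender, receiver, msg_type, payload=b""):
    """Prefix the payload with the control header."""
    return HEADER.pack(sender, receiver, msg_type) + payload


def unpack_message(datagram):
    """Split a datagram into (sender, receiver, msg_type, payload)."""
    sender, receiver, msg_type = HEADER.unpack_from(datagram)
    return sender, receiver, msg_type, datagram[HEADER.size:]
```

Because every datagram names its sender and receiver, one socket can serve remote engines, the Conference Manager and local applications alike, and the receiver only has to inspect the header to route the message.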
The NCS API was created to facilitate the development of NCS-based applications. The Conference Engine and the API provide the backbone of the conferencing system. They allow NCS to be improved flexibly, a variety of new applications to be created and its functionality to be enriched. Depending on the type of application being implemented, there are two different kinds of API functions:
The API functions perform all message exchange and offer an easy-to-use callback mechanism for signalling NCS events. They also give access to all kinds of conference information that may be useful to an application.
First, the conference managing application must initialize the conference with the ncsSessionInitialize() function, which returns an error value if initialization fails. Initialization opens the control socket and informs the Conference Engine that the managing application is active.
Then the conference callbacks must be set with the ncsSessionAddCallback() function. Callbacks are a simple and convenient way of informing applications of conference events. The developer writes the procedures that handle events and adds them as callbacks to the conference system.
To react properly to system events, the application developer must implement a checking loop. Depending on the development platform, this may be an X Window work procedure, a Windows thread or an ordinary loop. Inside the loop he can obtain the application's file descriptor set, which may be passed as a parameter to the standard socket select() call. If at least one of the file descriptors is active, the function that processes incoming messages (ncsProcessIncoming()) should be invoked.
After that the application may wait for an invitation (which invokes a callback function) or start a new session (ncsSessionCreate()) and invite participants (ncsSessionAddMember()). The NCS session API also provides several other procedures for handling invitations, managing applications and obtaining necessary information.
To collaborate within the current session, an application must attach to the Conference Engine with the ncsAppAttach() call, which returns the application handle used in all other conference operations. The application API has a callback mechanism similar to that of the session API, and it handles incoming messages in a similar way. In addition, the function ncsAppGetData() must be invoked inside the loop to retrieve incoming messages from other participants.
An application can obtain some conference information (e.g. its id, the number of participants) and send a message to all conference participants or to a single one. Finally, the application should detach from the Conference Engine (ncsAppDettach()).
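The sequence described above (initialize, register callbacks, run a checking loop around select()) can be illustrated with a small stand-in. This is not the NCS library: the `Session` class, its methods and the one-byte event tag are hypothetical, chosen only to mirror the roles of ncsSessionInitialize(), ncsSessionAddCallback() and ncsProcessIncoming().

```python
import select
import socket


class Session:
    """Hypothetical stand-in for the NCS session API (not the real library)."""

    def __init__(self):
        # Plays the role of ncsSessionInitialize(): open the control socket.
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind(("127.0.0.1", 0))
        self.callbacks = {}

    def add_callback(self, event, handler):
        # Plays the role of ncsSessionAddCallback().
        self.callbacks.setdefault(event, []).append(handler)

    def process_incoming(self):
        # Plays the role of ncsProcessIncoming(): read one datagram and
        # dispatch it. The first byte is assumed to name the event.
        data, _addr = self.sock.recvfrom(2048)
        for handler in self.callbacks.get(data[:1], []):
            handler(data[1:])


def run_once(session, timeout=1.0):
    """One pass of the checking loop: select(), then process if readable."""
    readable, _, _ = select.select([session.sock], [], [], timeout)
    if readable:
        session.process_incoming()
    return bool(readable)
```

In a real manager `run_once` would be called repeatedly from a work procedure, a thread or a plain loop, so that user interface events and conference events are served from the same place.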
The Conference Manager provides the NCS user interface and interacts with the user to manage the conference. It uses the session API functions to:
Additionally, it can start and close NCS applications. Information about the system's applications is kept in an easily configurable text file, which ensures the required flexibility.
The server is an optional part of NCS; the system works properly without it. However, working with the server is much more convenient, because the server keeps track of all users who have NCS running. This information makes it possible to avoid sending invitations to users who do not exist from the NCS point of view. In the future the server will be extended to provide other useful information about NCS users.
Video Tool is an application implemented with the NCS API on top of the Conference Engine and Conference Manager. Video Tool has the capabilities of:
Since it is an application designed for the Internet, it adopts two very efficient video compression techniques:
Both options provide very good picture quality. The H.263 option uses less bandwidth but is more CPU-intensive, so the H.261 algorithm is recommended for slower machines. An additional feature restricts compression to INTRA frames only, which speeds up the coding process but significantly increases the bandwidth used.
Video Tool automatically recognizes the received video format and uses the appropriate decompression method. All video frames are captured and sent in QCIF format; however, it is possible to switch on the expand mode, in which QCIF frames are expanded to CIF format. The application includes a statistics module which provides information about bit rate, frame rate and other connection parameters.
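The report does not say how the expand mode enlarges a frame. As an illustration only, the sketch below doubles a QCIF frame (176 x 144) to CIF size (352 x 288) by simple pixel replication; the choice of method and the function name are assumptions.

```python
QCIF = (176, 144)  # width, height
CIF = (352, 288)


def expand_2x(frame):
    """Double a frame in both directions by pixel replication.

    `frame` is a list of rows, each row a list of samples.
    """
    out = []
    for row in frame:
        wide = [p for p in row for _ in (0, 1)]  # repeat every sample
        out.append(wide)
        out.append(list(wide))                   # repeat every row
    return out
```

Replication costs almost no CPU time, which matters when the same machine is also encoding and decoding video; an interpolating filter would look smoother at a higher cost.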
Initially the Stanford University H.261 encoder offered very poor performance, and it had to be improved before it could be used with NCS. In motion-compensated compression algorithms the most computationally intensive part is the motion vector search. Extensive performance testing showed that the motion estimation function took 70% of the compression time. The algorithm used in the Stanford codec was a full spiral search, which is very simple but inefficient. We therefore chose to replace the motion estimation algorithm with a faster one. The fastest motion search algorithm is 2D logarithmic search; however, it offers worse quality. Since high quality is not very important for rather static conference images and performance is the most crucial factor, we decided to use this method in our implementation. The source code for the logarithmic search algorithm was taken from the Berkeley MPEG encoder and adapted for our compressor. This improvement increased encoder performance almost three times.
Additionally, we enabled the H.261 encoder to run in "INTRA frames only" mode. This required removing the motion estimation part and the reference frame decoding part. It speeds up the compression process by another 20%.
Apart from the performance improvement, the H.261 codec required other changes to adapt it for videoconferencing:
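To make the trade-off concrete, here is a minimal sketch of a 2D logarithmic search. It is not the Berkeley code used in the project; the function names, the block size and the search range are assumptions, and frames are plain lists of rows of luminance values. At each stage only the centre and four points of a "+" pattern are tested, the pattern is re-centred on the best point, and the step is halved when the centre wins; a final stage tests the nine closest points.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def log_search(cur, ref, bx, by, n=16, search=7):
    """Return the motion vector (dx, dy) found by 2D logarithmic search."""
    h, w = len(ref), len(ref[0])

    def cost(dx, dy):
        # Reject candidates outside the search window or the frame.
        if abs(dx) > search or abs(dy) > search:
            return float("inf")
        if not (0 <= bx + dx and bx + dx + n <= w and
                0 <= by + dy and by + dy + n <= h):
            return float("inf")
        return sad(cur, ref, bx, by, dx, dy, n)

    cx, cy = 0, 0
    step = max(1, (search + 1) // 2)
    while step > 1:
        # Centre plus four points of a "+" pattern at the current step.
        cands = [(cx, cy), (cx + step, cy), (cx - step, cy),
                 (cx, cy + step), (cx, cy - step)]
        best = min(cands, key=lambda v: cost(*v))
        if best == (cx, cy):
            step //= 2      # minimum at the centre: refine the step
        else:
            cx, cy = best   # otherwise re-centre on the best point
    # Final stage: all nine points around the centre at step 1.
    cands = [(cx + i, cy + j) for j in (-1, 0, 1) for i in (-1, 0, 1)]
    return min(cands, key=lambda v: cost(*v))
```

A full search over a range of 7 evaluates 225 candidate vectors per block, while this procedure evaluates only a few dozen. The price is that the search follows the local slope of the error surface and can stop in a local minimum, which is the quality loss mentioned above.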
Since encoding a video stream is very CPU-intensive, a frame rate control mechanism had to be implemented so that the CPU could perform the other functions necessary for proper conference behavior (decoding streams from other participants, handling user events, running independent applications such as the audio tool). Several approaches to this problem were implemented and tested; so far the best one is based on decoding time.
After a frame from a conference participant was decoded, the decoding time was measured, and the time between encoding consecutive frames was calculated from that time, the number of participants and an experimentally chosen coefficient. However, this solution works well only if the encoding capabilities of the conference participants are similar. If one participant's frame rate is much higher than the others', their time gaps are not long enough to deal with the incoming frames. We plan to improve this method by including the frame rates of the other participants in the calculation.
Another implemented solution was based on acknowledgments of decoded frames that were sent back to the video sender. Only after the sender had received acknowledgments from all participants did he continue to capture video. This certainly slowed down the sending process, but it had the very convenient advantage of being self-regulating. The solution worked well in the OpenDVE environment, but because NCS application data exchange is based on raw UDP, some acknowledgment messages were lost, which caused all applications to hang after some time. The protocol could of course have been made more reliable, but that would have slowed it down further, below an acceptable level.
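The report names the inputs of the calculation but not the formula. One plausible form, shown here purely as an assumption, leaves enough time between two encoded frames to decode one frame from every other participant, scaled by the tuning coefficient. The function name and the default coefficient are hypothetical.

```python
def encode_gap(decode_time_s, participants, coeff=1.5):
    """Seconds to wait between encoding two consecutive frames.

    Assumed form: coefficient x measured decoding time x number of
    other participants. The real NCS formula is not given in the report.
    """
    others = max(participants - 1, 0)
    return coeff * decode_time_s * others
```

With a measured decoding time of 20 ms and four participants this gives a gap of about 90 ms. The sketch also shows the weakness described above: the gap depends only on local decoding time, not on how fast the other participants actually send.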
H.261 and H.263 require the YUV 411 video format as input. However, neither the SGI video capture library nor the PC capture library can capture video data in this format. The SGI format, YUV 422, is very similar, so the conversion is simple; the PC library provides raw RGB, which requires the conversion process described in chapter 2 of this report. Both conversion procedures were implemented and are used successfully in Video Tool.
Audio Tool is an application created to enable voice communication between conference participants. It was designed only for speech signals; however, it can also transfer low-bandwidth music. Since the application was intended for the Internet, high sound quality was sacrificed for bandwidth efficiency. Audio Tool adopts two compression options:
So far both options exchange sound sampled at 8000 samples per second with 16 bits per sample. The audio stream is coded and sent in chunks of 100 ms. Audio formats are recognized automatically and the appropriate decompression method is chosen.
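The conversion procedure itself is described in chapter 2 and is not reproduced here. As a rough illustration, the sketch below uses the widely used ITU-R BT.601 luminance weights with full-range chrominance and averages chrominance over 2 x 2 blocks. Whether Video Tool uses exactly these coefficients and this subsampling pattern is an assumption, and the function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr), BT.601 weights assumed."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + 0.564 * (b - y)
    cr = 128 + 0.713 * (r - y)
    # Round and clamp every component to the 8-bit range.
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))


def subsample_2x2(plane):
    """Average each 2 x 2 block of a chroma plane (even dimensions assumed)."""
    return [[(plane[y][x] + plane[y][x + 1] +
              plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
             for x in range(0, len(plane[0]), 2)]
            for y in range(0, len(plane), 2)]
```

The luminance plane keeps full resolution while each chrominance plane shrinks to a quarter of its samples, which is why the RGB path needs noticeably more work than the SGI path.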
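The figures above fix the size of every uncompressed chunk. The short sketch below works out the arithmetic and splits a raw sample buffer into chunks; the constant and function names are illustrative only.

```python
SAMPLE_RATE = 8000        # samples per second (from the report)
BYTES_PER_SAMPLE = 2      # 16-bit samples (from the report)
CHUNK_MS = 100            # chunk duration (from the report)

SAMPLES_PER_CHUNK = SAMPLE_RATE * CHUNK_MS // 1000        # 800 samples
BYTES_PER_CHUNK = SAMPLES_PER_CHUNK * BYTES_PER_SAMPLE    # 1600 bytes
RAW_BITRATE = SAMPLE_RATE * BYTES_PER_SAMPLE * 8          # 128000 bit/s


def split_chunks(pcm):
    """Split raw PCM bytes into full 100 ms chunks; a short tail is dropped."""
    full = len(pcm) - len(pcm) % BYTES_PER_CHUNK
    return [pcm[i:i + BYTES_PER_CHUNK] for i in range(0, full, BYTES_PER_CHUNK)]
```

Each uncompressed chunk is therefore 1600 bytes, and the raw stream runs at 128 kbit/s before compression, which is the figure the compression options have to bring down.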
On the SGI Indy, audio streams are multiplexed using internal mechanisms of the SGI audio library. A dedicated audio port is opened for each stream. When audio data is received, the sender ID is detected and the audio chunk is decoded and written to the appropriate port.
Most PC sound cards do not support multiplexing of data streams. Moreover, they usually support only half-duplex audio. To handle this situation, the PC implementation of Audio Tool takes a completely different approach. First of all, a special mechanism for switching between sending and receiving mode was created. Initially Audio Tool is in sending mode, and it stays there until the level of audio received from a participant rises above some threshold. This threshold can be set manually by the user through the application's graphical interface. The application then switches into receiving mode, and the participant that sent the switching packet becomes the "current sender". Until the next switch to sending mode, all audio packets from other participants are ignored. The application switches back when the received audio level stays below the threshold for a fixed amount of time.
This avoids the need for audio multiplexing. It does of course affect conference quality, but provided that the conference is conducted in a polite, calm way, it is not a big obstacle.
Buffering is also supported by an internal SGI mechanism. Each audio port has its own sample queue, whose length may be set by the developer. The playing function blocks until there is enough room for the sample buffer.
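The switching rule described above amounts to a small state machine. The sketch below is an illustration under assumptions, not the Audio Tool source: the class and attribute names are hypothetical, the audio level is an abstract number, and time is passed in explicitly so that the logic can be followed step by step.

```python
class HalfDuplexSwitch:
    """Sketch of the send/receive switch for half-duplex sound cards."""

    def __init__(self, threshold, hold_s):
        self.threshold = threshold    # user-adjustable level threshold
        self.hold_s = hold_s          # quiet time before switching back
        self.mode = "send"
        self.current_sender = None
        self.quiet_since = None

    def on_packet(self, sender, level, now):
        """Process one received packet; return True if it should be played."""
        if self.mode == "send":
            if level <= self.threshold:
                return False          # too quiet to interrupt sending
            self.mode, self.current_sender = "receive", sender
            self.quiet_since = None
            return True
        if sender != self.current_sender:
            return False              # other participants are ignored
        if level > self.threshold:
            self.quiet_since = None
            return True
        if self.quiet_since is None:
            self.quiet_since = now    # start of a quiet period
        if now - self.quiet_since >= self.hold_s:
            self.mode, self.current_sender = "send", None
            self.quiet_since = None
            return False              # quiet for long enough: back to send
        return True
```

Only one remote voice is ever played at a time, so no mixing is needed, at the cost of dropping anyone who speaks over the current sender.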
Unfortunately the PC does not have such a convenient feature; buffering is left to the developer. In Audio Tool a double buffering scheme is applied both to audio capture and to audio playback. When an audio chunk is received, it is decoded and prepared for playing, but the sample buffer is not sent to the speaker until the next audio packet is received. This next packet is stored in the other buffer while the previous one is being played. Similarly, one buffer is compressed and sent while the other is being captured.
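The playback half of this scheme can be sketched in a few lines. This is a simplified, single-threaded illustration with hypothetical names; in the real tool the sound card plays one buffer while the other is being filled, whereas here `play` is just a callable.

```python
class DoubleBufferedPlayer:
    """Hold each decoded chunk back until the following chunk has arrived."""

    def __init__(self, play):
        self.play = play        # callable that sends samples to the speaker
        self.pending = None     # decoded chunk waiting for its successor

    def on_chunk(self, samples):
        # The new chunk takes the free buffer; the previous one is released.
        ready, self.pending = self.pending, samples
        if ready is not None:
            self.play(ready)
```

Holding one chunk back adds one chunk of delay (100 ms in Audio Tool) but gives the player something to play while the next packet is still in transit.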
Whiteboard is a very convenient collaborative tool that allows conference participants to share drawings. The following operations are possible:
Each drawing element is stored in memory as a "graphic object". An object consists of the following information:
The application creates and manages a list of objects, which can also be saved to and retrieved from a file. Each object type has its own drawing procedure. Objects are exchanged between applications using NCS mechanisms. Received objects are handled in the same way as objects obtained from user input.
So far the whiteboard has been implemented only on SGI workstations, based on the Motif library. However, porting it to the PC platform is planned, as well as adding clipboard capabilities to the application.
Tango is another NPAC collaboratory system. However, Tango is more Web-oriented and does not support real-time applications. Since it was necessary to provide a collaboratory environment with the functionality of both Tango and NCS, the two systems were integrated. Because NCS works in this system in slave mode (all conference management is performed by Tango), the conversion of NCS could be done without knowledge of the Tango architecture. That architecture will therefore not be presented in this report.
To obtain seamless integration, the following activities were undertaken:
Figure 5. (a) NCS-user interaction (b) NCS-Tango integration
The system created during the internship provides a basic set of collaboratory mechanisms; however, they are far from perfect. Apart from improvements in user feedback and in the handling of unexpected situations, there are more important open problems, such as a more reliable control protocol, video rate control and audio quality. Other projects for the future are: