NORTHEAST PARALLEL ARCHITECTURES CENTER AT SYRACUSE UNIVERSITY



VIDEO ON DEMAND

- report

Tomasz Stachowiak


Syracuse 1996

    TABLE OF CONTENTS

    1. Introduction
    2. Technology
    2.1 H.263 video compression
    2.1.1 H.263 vs. H.261
    2.1.2 Negotiable options
    2.2 Audio compression
    2.2.1 GSM 06.10
    2.2.2 Intel/DVI ADPCM
    3. Project
    3.1 Audio-video client
    3.1.1 VOD system architecture overview
    3.1.2 Overview
    3.1.3 New elements
    3.1.4 Synchronization
    3.1.5 Random access
    3.2 H.263 stream offsetting

    1. Introduction

    The Video On Demand (VOD) project is only one part of my internship at NPAC. My primary responsibility is the Conferencing and Collaboration System. I needed to implement an audio and video synchronization algorithm for the videoconferencing tools, but I ran into problems starting this work at the conference level, so I decided to implement it first in a better-known environment, the VOD system. This also let me create tools that could later be integrated with this project. Afterwards I made further changes and improvements to the existing VOD software, related to the video and audio compression algorithms H.263, GSM 06.10 and ADPCM.

    2. Technology
    2.1 H.263 video compression
    2.1.1 H.263 vs. H.261

    With H.263 it is possible to achieve the same quality as H.261 using only 30-50% of the bits. Most of this gain comes from half-pel (half-pixel) prediction and from the negotiable options in H.263. H.263 also has less overhead and improved VLC (variable-length code) tables.
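
    Half-pel prediction can be illustrated with a small sketch. H.263 obtains sample values at half-pixel positions by bilinear interpolation of the neighbouring integer pixels, with rounding. The function name, the half-pixel coordinate convention and the tiny test image below are hypothetical and serve only as an illustration; this is not code from the VOD decoder.

```python
def half_pel_sample(img, x2, y2):
    """Return the sample at position (x2/2, y2/2) of a 2-D list of pixels.

    x2 and y2 are given in half-pixel units (an assumption of this sketch):
    even values fall on integer pixels, odd values halfway between two pixels.
    """
    x, y = x2 // 2, y2 // 2      # top-left integer pixel
    fx, fy = x2 % 2, y2 % 2      # 1 if the position is fractional on that axis
    a = img[y][x]
    b = img[y][x + fx]           # right neighbour (or a itself)
    c = img[y + fy][x]           # lower neighbour (or a itself)
    d = img[y + fy][x + fx]      # diagonal neighbour (or a itself)
    if fx and fy:
        return (a + b + c + d + 2) // 4   # centre of four pixels
    if fx:
        return (a + b + 1) // 2           # between two horizontal pixels
    if fy:
        return (a + c + 1) // 2           # between two vertical pixels
    return a                              # integer position


image = [[10, 20],
         [30, 41]]
centre = half_pel_sample(image, 1, 1)     # average of all four pixels, rounded
```

    Because the motion vector can point between pixels, the predicted block can match the real motion more closely, which leaves a smaller residual to encode.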

    2.1.2 Negotiable options

    H.263 defines several optional coding modes that are negotiable: the decoder signals to the encoder which of the options it is able to decode. If the encoder supports any of these options, it can then turn them on, and each option in use improves the quality of the decoded video sequence.

    2.2 Audio compression
    2.2.1 GSM 06.10

    GSM is a telephony standard defined by the European Telecommunications Standards Institute (ETSI). The GSM 06.10 compressor models the human speech system with two digital filters and an initial excitation. The linear-predictive short-term filter, which is the first stage of compression and the last stage of decompression, assumes the role of the vocal and nasal tract. It is excited by the output of a long-term predictive (LTP) filter, which turns its input, the residual pulse excitation (RPE), into a mixture of glottal wave and voiceless noise. The GSM encoder compresses 160 16-bit voice samples into a 264-bit GSM frame. GSM 06.10 is faster than code-book lookup algorithms such as CELP. Its output bit rate is about 13 kbit/s.
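
    The frame sizes quoted above can be checked with a little arithmetic. The sketch below assumes an 8 kHz sampling rate (the usual telephony rate, not stated in this report) and assumes that the 264-bit frame is the byte-aligned 33-byte packing of a 260-bit payload; all constant names are illustrative.

```python
SAMPLES_PER_FRAME = 160   # from the text
BITS_PER_SAMPLE = 16      # from the text
FRAME_BITS = 264          # from the text (33 bytes)
SAMPLE_RATE_HZ = 8000     # assumption: telephony-rate input
PAYLOAD_BITS = 260        # assumption: codec payload without the 4 extra bits

frames_per_second = SAMPLE_RATE_HZ // SAMPLES_PER_FRAME   # 50 frames, 20 ms each
input_bits = SAMPLES_PER_FRAME * BITS_PER_SAMPLE          # 2560 bits per frame
compression_ratio = input_bits / FRAME_BITS               # roughly 9.7 to 1
stored_bit_rate = FRAME_BITS * frames_per_second          # 13200 bit/s
payload_bit_rate = PAYLOAD_BITS * frames_per_second       # 13000 bit/s
```

    Under these assumptions the stored stream needs 13.2 kbit/s, and the payload alone gives the nominal 13 kbit/s.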

    2.2.2 Intel/DVI ADPCM

    The ADPCM compression algorithm uses the correlation between adjacent audio samples to reduce the bit rate. It transmits only the differences between the samples and their predicted values, which have a smaller dynamic range than the samples themselves. Predictor coefficients and reconstruction levels are calculated dynamically from the coded signal. This adaptation reduces the bandwidth but makes the technique more susceptible to transmission errors. There are several ADPCM standards, for example Intel/DVI and G.721. Intel/DVI is not very computationally intensive, yet it gives good quality even for music.
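
    The principle can be shown with a deliberately simplified adaptive differential codec. This is a toy sketch and not the Intel/DVI algorithm: the real codec uses fixed step-size and index tables, while the adaptation rule, the initial step of 16 and all function names here are assumptions chosen for brevity. What it does show is that the encoder and decoder derive the step size from the transmitted codes only, so both stay in step without side information.

```python
def _adapt(step, code):
    """Grow the step after large codes, shrink it after small ones."""
    if code & 7 >= 4:
        return min(step * 2, 4096)
    return max(1, step * 3 // 4)


def _reconstruct(predicted, step, code):
    """Apply one 4-bit code (sign bit 8, magnitude 0..7) to the prediction."""
    delta = (code & 7) * step
    if code & 8:
        delta = -delta
    return max(-32768, min(32767, predicted + delta))


def encode(samples):
    """Return (codes, reconstruction as the decoder will see it)."""
    predicted, step = 0, 16
    codes, recon = [], []
    for s in samples:
        diff = s - predicted
        code = min(7, abs(diff) // step) | (8 if diff < 0 else 0)
        predicted = _reconstruct(predicted, step, code)   # track the decoder
        step = _adapt(step, code)
        codes.append(code)
        recon.append(predicted)
    return codes, recon


def decode(codes):
    predicted, step = 0, 16
    out = []
    for code in codes:
        predicted = _reconstruct(predicted, step, code)
        step = _adapt(step, code)
        out.append(predicted)
    return out


signal = [0, 100, 400, 900, 1000, 950, 300, -200]
codes, recon = encode(signal)
```

    Because the decoder state depends on every earlier code, a single corrupted code changes the prediction and the step size for the samples that follow, which is the error sensitivity mentioned above.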

    3. Project
    3.1 Audio-video client
    3.1.1 VOD system architecture overview

    The VOD system consists of three major parts:

    The VOD system architecture is shown in Figure 1.

    Communication between the database and the clients is performed via Netscape Navigator, using standard HTTP and CGI mechanisms. Data from the server are transmitted over socket (TCP) connections. Transmission is controlled through an additional control connection based on the VOD client-server protocol.

    Figure 1 VOD System architecture

    3.1.2 Overview

    The AV client is implemented for the SGI Indy running IRIX 5.x. It uses H.263 video compression and either GSM or ADPCM audio compression. It supports the QCIF and CIF picture formats. It can play a movie, stop it, and jump to an arbitrary position in it (random access).

    3.1.3 New elements

    In the existing movie formats, audio and video data were carried in the same stream. Our goal, however, was to synchronize AV data from two independent streams. It was therefore necessary to open two server connections and to send and receive control messages on two separate channels. Unfortunately, the database system provides information only about the video stream, so the audio configuration had to be included in the data file itself. This was done by adding a special audio header, whose structure is presented in Table 1.

    Table 1 Audio header

    Name          Length (in bytes)  Description
    Title         12                 String "npac-audio". Indicates that the file is an NPAC audio stream
    Rate          2                  Rate in samples per second
    Channels      1                  Number of channels: 1 for mono, 2 for stereo
    Sample width  2                  Sample width in bits
    Code format   10                 String indicating the compression type; "adpcm" and "gsm" are supported

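    The layout in Table 1 can be written down and parsed as in the following sketch. The field widths come from the table; the big-endian byte order, the NUL padding of the two strings and the function names are assumptions, since the report does not specify them.

```python
import struct

# 12-byte title, 2-byte rate, 1-byte channels, 2-byte width, 10-byte format.
# ">" selects big-endian without padding (an assumption of this sketch).
HEADER_FORMAT = ">12sHBH10s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)   # 12 + 2 + 1 + 2 + 10 = 27


def pack_audio_header(rate, channels, sample_width, code_format):
    """Build a header; short strings are NUL-padded by struct.pack."""
    return struct.pack(HEADER_FORMAT, b"npac-audio", rate, channels,
                       sample_width, code_format.encode("ascii"))


def parse_audio_header(data):
    """Decode the first HEADER_SIZE bytes of an audio file."""
    title, rate, channels, width, code = struct.unpack(
        HEADER_FORMAT, data[:HEADER_SIZE])
    if title.rstrip(b"\0") != b"npac-audio":
        raise ValueError("not an NPAC audio stream")
    return {"rate": rate, "channels": channels, "sample_width": width,
            "code_format": code.rstrip(b"\0").decode("ascii")}


header = pack_audio_header(8000, 1, 16, "gsm")
info = parse_audio_header(header)
```

    With a fixed-size header of this kind, the client can read the audio configuration before the first audio portion arrives, without consulting the database.
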
    3.1.4 Synchronization

    Synchronization is based on the internal mechanisms of the SGI Audio Library. The procedure that sends audio samples to the speaker port blocks until all previously sent samples have been played. It is therefore enough to send one audio portion and then decode one video frame to achieve synchronization and maintain the frame rate. The only problem is that this requires decoding a single video frame at a time, which moves the responsibility for reading individual frames from the decoder to the AV client.

    To achieve this, some changes were necessary on both the decoder side and the client side:
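
    The idea of using the blocking audio write as the clock can be sketched as follows. The audio port here is a simulation driven by a virtual clock counted in samples; it is not the SGI Audio Library interface, and the class and function names are hypothetical.

```python
class SimulatedAudioPort:
    """Stand-in for a blocking audio output port with a virtual clock."""

    def __init__(self):
        self.now = 0          # virtual time, counted in played samples
        self.busy_until = 0   # time at which the queued samples finish

    def write(self, samples):
        # "Block" until everything sent earlier has been played ...
        self.now = max(self.now, self.busy_until)
        # ... then queue the new portion and return.
        self.busy_until = self.now + len(samples)


def play(port, audio_portions, video_frames, show):
    """Send one audio portion, then present exactly one video frame."""
    for portion, frame in zip(audio_portions, video_frames):
        port.write(portion)      # paces the loop at the audio rate
        show(frame, port.now)    # decode and display a single frame


shown = []
port = SimulatedAudioPort()
portions = [[0] * 800 for _ in range(4)]   # e.g. 0.1 s each at 8000 Hz
play(port, portions, ["f0", "f1", "f2", "f3"],
     lambda frame, t: shown.append((frame, t)))
```

    In this simulation each frame is presented exactly one audio portion after the previous one, so the video frame rate follows from the audio portion size without a separate timer.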

    3.1.5 Random access

    Since audio portions always have the same size, it is easy to find the beginning of any portion from that size. The situation is different for H.263 video. First, H.263 frames vary in length. Second, because H.263 is a predictive compression scheme, only INTRA frames can serve as access points. The solution to this problem is described in the chapter "H.263 stream offsetting".
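
    The difference between the two cases can be sketched as follows. For audio, the offset is a simple multiplication; for video, the client has to look up the last INTRA frame at or before the requested frame. The function names and the two parallel sorted lists standing in for the offset information are assumptions of this sketch.

```python
import bisect


def audio_offset(portion_index, portion_size, header_size):
    """Byte offset of a fixed-size audio portion inside the audio file."""
    return header_size + portion_index * portion_size


def video_seek_point(intra_frames, intra_offsets, target_frame):
    """Return (frame number, byte offset) of the last INTRA frame at or
    before target_frame. Both lists are sorted and have equal length."""
    i = bisect.bisect_right(intra_frames, target_frame) - 1
    if i < 0:
        raise ValueError("no INTRA frame at or before the target")
    return intra_frames[i], intra_offsets[i]


# Hypothetical stream with INTRA frames at frames 0, 30 and 60.
seek = video_seek_point([0, 30, 60], [0, 4100, 9000], 45)
```

    A request for frame 45 therefore starts decoding at frame 30, the nearest earlier frame that does not depend on its predecessors.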

    3.2 H.263 stream offsetting

    The H.263 system is intended to be used as a preview tool for MPEG movies, so H.263 sequences will be obtained by converting MPEG files. A preview tool must support random access, which in turn requires offset files to be generated.

    To obtain the offset files, some new features had to be added to the H.263 encoder:

    This mechanism allows the offset file to be generated in parallel with the movie conversion.
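
    Generating the offsets while the stream is being written can be sketched as below. The text format of the offset file (one "frame number, byte offset" pair per line), the function name and the placeholder frame data are assumptions; the report does not describe the actual file format.

```python
import io


def write_stream_with_offsets(frames, stream, offset_file):
    """Write encoded frames and record where each INTRA frame starts.

    frames yields (is_intra, encoded_bytes) pairs in coding order.
    """
    for number, (is_intra, data) in enumerate(frames):
        if is_intra:
            # Note the position before the frame itself is written.
            offset_file.write(f"{number} {stream.tell()}\n")
        stream.write(data)


# Placeholder data: two INTRA frames with two predicted frames in between.
frames = [(True, b"IIII"), (False, b"pp"), (False, b"p"), (True, b"IIIII")]
stream = io.BytesIO()
offsets = io.StringIO()
write_stream_with_offsets(frames, stream, offsets)
```

    Since the encoder already knows the type and size of every frame it emits, recording the offsets adds almost no work to the conversion.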