WWW: Beyond the Basics

11. Real-time Audio and Video

11.3 Digital Video

A video signal, like an audio signal, is analog. Before it can be processed by a computer or transmitted through a computer network, it must be digitized. Because digital video data is usually too large to transmit in raw form, compression techniques are very important. This section presents the concepts needed to understand motion video: how moving pictures are represented, how color is encoded, and the compression formats used for real-time video.

11.3.1 Moving Picture

A moving picture can be represented as a discrete sequence of still pictures (frames), as long as they are presented to the eye rapidly enough. This property is exploited in television and motion pictures. To give a convincing impression of visual reality, two conditions must be met. First, the rate of repetition of the images, or frame rate, must be high enough to guarantee smooth motion from frame to frame. Second, the rate must be high enough that the persistence of vision extends over the interval between successive images. Frame rate is measured in frames per second (fps).

It is well known that the human eye perceives continuous motion at frame rates above about 15 fps, and video motion appears smooth at around 30 fps.

There are two widely used standards for motion video signals: the National Television System Committee (NTSC) standard and the Phase Alternating Line (PAL) standard. NTSC is used in the Americas and Japan and specifies a frame rate of 30 fps. PAL is used in Europe, China, and Australia and adopts a frame rate of 25 fps.

11.3.2 Color Encoding

The human eye has three types of photoreceptors: one sensitive to red, one to green, and one to blue. Any color perceived by the eye is a mixture of these three. The three-dimensional color space whose axes correspond to these colors is called the RGB color space. The color encoding systems used for video are derived from this color space. The encoding schemes commonly used for video are:

  1. RGB signal
  2. YUV signal
  3. YIQ signal

An RGB signal consists of separate signals for the red, green, and blue components. Any color can be coded as a combination of these primary colors.

Because human perception is more sensitive to brightness than to color, a more suitable coding distinguishes between brightness and color. Instead of separating colors, the YUV signal separates the brightness (luminance) information (Y) from the color information (U and V). The YUV components are defined as:

Y = 0.30R + 0.59G + 0.11B
U = 0.493(B - Y)
V = 0.877(R - Y)

The resolution of the luminance component is more important than that of the color components. Therefore, the brightness values can be coded with more samples than the color values; for example, in a (4:2:2) YUV signal, U and V are sampled at half the horizontal rate of Y.
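
As a rough sketch (not part of the original text), the following Python fragment applies the YUV equations above to 8-bit RGB samples and then subsamples U and V at half the horizontal rate of Y, which is the idea behind a (4:2:2) signal. The function name and the short scanline of sample pixels are illustrative only.

    def rgb_to_yuv(r, g, b):
        # Luminance (Y) and color-difference (U, V) components, per the equations above
        y = 0.30 * r + 0.59 * g + 0.11 * b
        u = 0.493 * (b - y)
        v = 0.877 * (r - y)
        return y, u, v

    # A short scanline of four RGB pixels (values 0..255), purely illustrative
    scanline = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (128, 128, 128)]
    yuv = [rgb_to_yuv(r, g, b) for (r, g, b) in scanline]

    # (4:2:2)-style subsampling: keep Y for every pixel, but average U and V
    # over each horizontal pair of pixels, halving the amount of color data.
    ys = [y for (y, u, v) in yuv]
    us = [(yuv[i][1] + yuv[i + 1][1]) / 2 for i in range(0, len(yuv), 2)]
    vs = [(yuv[i][2] + yuv[i + 1][2]) / 2 for i in range(0, len(yuv), 2)]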

The YIQ signal is similar to the YUV signal and forms the basis of the NTSC format:

Y = 0.30R + 0.59G + 0.11B
I = 0.60R - 0.28G - 0.32B
Q = 0.212R - 0.52G + 0.31B
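
As a quick illustration of these equations (again a sketch, not part of the standard), applying them to a neutral gray pixel with R = G = B shows that Y equals the gray level while I and Q are nearly zero, since a gray pixel carries essentially no color information.

    def rgb_to_yiq(r, g, b):
        # NTSC luminance (Y) and chrominance (I, Q) components, per the equations above
        y = 0.30 * r + 0.59 * g + 0.11 * b
        i = 0.60 * r - 0.28 * g - 0.32 * b
        q = 0.212 * r - 0.52 * g + 0.31 * b
        return y, i, q

    # A neutral gray pixel (R = G = B) has essentially no chrominance:
    print(rgb_to_yiq(128, 128, 128))   # approximately (128.0, 0.0, 0.26)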


11.3.3 Compression Formats

Analog video must be converted to digital form before it can be processed by a computer. Each frame of video becomes a two-dimensional array of picture elements (pixels); a full-color picture is composed of three such 2-D arrays. The most common array sizes are 640x480 and 320x240, with each pixel quantized to 256 levels (8 bits). For example, the Sun Video Digitizer from Sun Microsystems captures an NTSC video signal with a frame resolution of 320x240 pixels, quantization of 8 bits/pixel, and a frame rate of 30 fps.
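
As a rough back-of-the-envelope sketch (illustrative, not from the original text), the following Python fragment computes the uncompressed data rate implied by these digitizer settings:

    # Uncompressed data rate for the digitizer settings above (rough estimate)
    width, height = 320, 240      # frame resolution in pixels
    bits_per_pixel = 8            # quantization: 8 bits/pixel
    frame_rate = 30               # NTSC frame rate in fps

    bits_per_second = width * height * bits_per_pixel * frame_rate
    print(bits_per_second / 1e6, "Mbps")   # about 18.4 Mbps before compression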

Transmitting uncompressed video data over a computer network requires very high bandwidth, so the video data must be compressed before transmission. In real-time applications, the speed of compression and decompression is also important: if compression or decompression consumes too many computational resources, the frame rate drops. Compression can be performed either in hardware or in software. Software compression is more flexible, but hardware compression is faster; for large frame sizes and high frame rates, hardware compression/decompression may be the only viable option.

Two commonly used compression formats for real-time video are summarized below. (See [Borko, 1995] for details of the compression schemes.)

ITU-T Recommendation H.261
H.261 was developed for encoding and decoding real-time video in audiovisual services such as videoconferencing. The Recommendation is intended to be used at video bit rates between 40 kbps and 2 Mbps.
The MPEG Motion Video Compression Standard
MPEG stands for the Moving Picture Experts Group. The MPEG standard was developed under the ISO/IEC standardization process to cover motion video as well as audio coding. MPEG-1's image size is either 160x120 or 320x240, with 24 bits per pixel (color), and the frame rate is 30 fps. It delivers acceptable video quality at compressed data rates between 1.0 and 1.5 Mbps while maintaining audio/video synchronization.
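
As a rough sketch (illustrative, based only on the figures above), the compression ratio implied by these numbers can be estimated as follows:

    # Compression ratio implied by the MPEG-1 figures above (rough estimate)
    uncompressed_bps = 320 * 240 * 24 * 30   # about 55.3 Mbps of raw video
    compressed_bps = 1.5e6                   # upper end of the 1.0-1.5 Mbps range
    print(uncompressed_bps / compressed_bps) # roughly 37:1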

For more information, see Steinmetz and Nahrstedt, and Standards Related to Desktop Videoconferencing.


