This talk will introduce the topics of graphics files and the various formats.
|
The main formats discussed are bitmap (PBM, PPM, PGM), GIF, JPEG, and a little on TIFF.
|
The compression techniques discussed are Run-Length Encoding, Huffman Coding, and Dictionary Systems.
|
References:
-
John Levine, "Programming for Graphics Files in C and C++", John Wiley & Sons, 1994.
-
James Murray and William vanRyper, "Encyclopedia of Graphics File Formats", Second Edition, O'Reilly, 1996.
|
Traditionally, graphics files have either been vector or bitmap.
|
Vector files have represented graphics objects in terms of lines, polygons, and other high-level structures. This kind of representation has also been extended in Scene Description files, used in languages such as VRML. These kinds of files are not covered in this talk.
|
We will be discussing the various bitmap formats, in which the graphics data is represented by an array of pixels. Each value in the array represents the color of the pixel, either to be shown on a screen, or printed on paper. The pixel depth is the number of bits used to represent each pixel, which determines how many colors can be used in the image.
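|
As a rough illustration of how pixel depth relates to color count and storage, here is a minimal Python sketch. The function names are made up for this example, and padding each row to a whole byte is an assumption that many, but not all, formats make.

```python
def colors_for_depth(bits_per_pixel):
    """Number of distinct colors that a given pixel depth can represent."""
    return 2 ** bits_per_pixel


def raw_image_bytes(width, height, bits_per_pixel):
    """Uncompressed size of the pixel array in bytes.

    Assumption: each row (scan line) is padded up to a whole number of bytes.
    """
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height


if __name__ == "__main__":
    for depth in (1, 4, 8, 24):
        print(depth, "bits:", colors_for_depth(depth), "colors,",
              raw_image_bytes(640, 480, depth), "bytes for 640x480")
```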
|
The pixel arrays can be stored by rows (scan lines), by interleaved rows, or by planes based on color components or bits of the color representation.
|
In the RGB model, each pixel is specified by 3 components, representing the amount of red, green and blue in the pixel.
|
This model is additive: each color is created by starting with black and adding in the given amounts of red, green and blue. Additive colors are self-luminous, and are used on color monitors.
|
Each component is usually given as a number from 0 to 255 (8 bits), resulting in a total of 24 bits per color.
|
The absence of all colors is black (0, 0, 0) and the presence of all colors is white (255, 255, 255). Bright red is (255, 0, 0). A darker red is produced by adding less red, such as (127, 0, 0), and a lighter red by adding some of the other colors to make a color closer to white.
|
CMY (Cyan, Magenta, Yellow) is a subtractive color system used by printers and photographers to render colors with ink or emulsion. When illuminated, each of the three colors absorbs its complementary light color: cyan absorbs red, magenta absorbs green, and yellow absorbs blue. By increasing the amount of yellow ink, for instance, the amount of blue in the image is decreased.
|
The model is subtractive: each color is created by starting with white and subtracting the given amount of cyan, magenta, and yellow.
|
In practice, the colors are created by mixing inks, and true black is difficult to produce by combining all three; the result is a dark brown. For this reason, black is treated as a separate, fourth component, K.
|
Colors may be given as a color triple as in RGB, except that (0, 0, 0) is white and (255, 255, 255) is (theoretically) black. The components may also be given as percentages from 0 to 100.
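|
The relationship between RGB and CMY triples can be sketched in a few lines of Python. This is the naive textbook conversion only; the function names are illustrative, and real printing systems use device-specific corrections that are not modeled here.

```python
def rgb_to_cmy(r, g, b):
    """Each CMY component is the complement of the matching RGB component."""
    return 255 - r, 255 - g, 255 - b


def cmy_to_cmyk(c, m, y):
    """Move the gray shared by all three inks into a separate K component.

    Assumption: simple textbook formula, not a calibrated printer profile.
    """
    k = min(c, m, y)
    if k == 255:                      # pure black: no colored ink needed
        return 0, 0, 0, 255
    scale = 255 / (255 - k)           # rescale what is left to the 0..255 range
    return (round((c - k) * scale),
            round((m - k) * scale),
            round((y - k) * scale),
            k)


if __name__ == "__main__":
    print(rgb_to_cmy(255, 0, 0))                 # bright red as a CMY triple
    print(cmy_to_cmyk(*rgb_to_cmy(0, 0, 0)))     # black uses only K
```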
|
Luminance/Chrominance models vary the properties of colors instead of mixing the colors themselves to create colors.
|
In HSV (Hue, Saturation, Value; the last is sometimes called Brightness), the hue is the actual color, the saturation is the amount of white in the hue (full saturation at 100% has no white), and the value is the amount of self-luminescence.
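|
To get a feel for HSV, the reds mentioned earlier can be converted with Python's standard `colorsys` module. The wrapper name and the choice of degrees and percentages for the output are assumptions made for this sketch.

```python
import colorsys


def rgb255_to_hsv(r, g, b):
    """Convert 0..255 RGB to (hue in degrees, saturation %, value %)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h * 360, s * 100, v * 100


if __name__ == "__main__":
    print(rgb255_to_hsv(255, 0, 0))      # bright red: full saturation, full value
    print(rgb255_to_hsv(127, 0, 0))      # darker red: same hue, lower value
    print(rgb255_to_hsv(255, 127, 127))  # lighter red: same hue, lower saturation
```

The darker red keeps the hue and saturation and lowers only the value, while the lighter red keeps the hue and value and lowers only the saturation.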
|
In YUV, or YCrCb, the Y component is the luminance, and Cr and Cb are the color (chrominance) in amounts of redness and blueness.
|
These models pay more attention to the way that the human eye sees color, namely, that the eye is quite sensitive to changes in brightness, but much less sensitive to changes in color.
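|
A minimal Python sketch of a luminance/chrominance conversion follows. It uses the full-range coefficients from the JFIF convention for JPEG files; other YUV variants use different scaling, so treat the exact numbers as one common choice rather than the only one. The function name is illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 0..255 RGB to (Y, Cb, Cr) using JFIF full-range coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b   # blueness
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b   # redness

    def clamp(value):
        # Round to an integer and keep the result inside 0..255.
        return max(0, min(255, round(value)))

    return clamp(y), clamp(cb), clamp(cr)


if __name__ == "__main__":
    print(rgb_to_ycbcr(255, 255, 255))  # white: full luminance, neutral chroma
    print(rgb_to_ycbcr(255, 0, 0))      # red: high Cr, low Cb
```

Grays of any brightness map to neutral chrominance (128, 128), which is why the chrominance components can be stored more coarsely than Y.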
|
Instead of storing the actual 24-bit color for each pixel in a file, individual pixels are often stored as index numbers into a color map.
|
The color map itself gives the 24-bit value for each index.
|
A typical color map has from 16 to 256 entries, so an index needs only 4 to 8 bits, compared to the full 24 bits, resulting in significant savings of space for images of typical size. (Of course, the color map itself also has to be stored.)
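|
The idea can be sketched in Python as follows. The helper names are made up for this example, and the palette is built simply in order of first appearance; choosing a good palette for an image with too many colors is a separate problem not shown here.

```python
def build_color_map(pixels, max_colors=256):
    """Replace each (r, g, b) pixel with an index into a color map."""
    color_map = []      # index -> 24-bit color
    index_of = {}       # color -> index
    indexed = []
    for color in pixels:
        if color not in index_of:
            if len(color_map) >= max_colors:
                raise ValueError("image has more colors than the map can hold")
            index_of[color] = len(color_map)
            color_map.append(color)
        indexed.append(index_of[color])
    return color_map, indexed


def storage_bits(num_pixels, map_entries, index_bits):
    """Bits for the indexed pixels plus the stored color map itself."""
    return num_pixels * index_bits + map_entries * 24


if __name__ == "__main__":
    pixels = [(255, 0, 0), (0, 0, 0), (255, 0, 0), (255, 255, 255)]
    color_map, indexed = build_color_map(pixels)
    print(color_map, indexed)
    # 640x480 image: 8-bit indices plus a 256-entry map, versus raw 24-bit color.
    print(storage_bits(640 * 480, 256, 8), "bits versus", 640 * 480 * 24)
```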
|
Color maps are also used in screen controllers, sometimes called frame buffers, to minimize the memory needed to store the image.
|
More sophisticated schemes are based on assigning a variable-length code to the different runs that occur in the image. For example, we would assign a short code of 4 bits or less to runs that occur often, and longer codes of up to 10 or 11 bits to the less common runs of pixel values.
|
Huffman coding is named after David Huffman, who established in 1952 how to assign codes, given the relative frequency of bytes in a file, so that the total coded length is minimized.
|
Huffman schemes require a table that assigns codes to runs. Either a fixed table is used, or the table is constructed and stored along with the image.
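|
A small Python sketch of building such a table from symbol frequencies follows. It works on any hashable symbols (bytes, characters, or run lengths); the function name and the tie-breaking rule are choices made for this example, so other implementations may produce different but equally short codes.

```python
import heapq
from collections import Counter


def huffman_codes(symbols):
    """Return a {symbol: bit string} table built from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:                      # a lone symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


if __name__ == "__main__":
    table = huffman_codes("aaaabbc")
    print(table)    # the most frequent symbol gets the shortest code
    print("".join(table[s] for s in "aaaabbc"))
```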
|
The CCITT Group 3 standard for bi-level images uses a fixed Huffman scheme and is widely used for FAX transmission. Examples of Group 3 encodings:
|
20 pixel black run: 0000 1101 000
|
100 pixel white run: 11011 0001 0101 (makeup code for 64 white pixels, then terminating code for 36)
|
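A run of up to 63 pixels has a single code of its own (called a terminating code). A longer run is made up of one or more makeup codes, each standing for a multiple of 64 pixels, followed by a terminating code for the remainder.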
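|
The way a long run is broken into makeup and terminating parts can be sketched in Python. This shows only the arithmetic, not the bit codes themselves. The function name is illustrative, and the value 2560 for the largest makeup run is an assumption taken from the extended Group 3 tables.

```python
MAX_MAKEUP = 2560   # assumption: largest run length that has a makeup code


def split_run(length):
    """Split a run length into (makeup run lengths, terminating run length).

    Makeup lengths are multiples of 64; the terminating length is 0..63.
    """
    makeups = []
    while length >= MAX_MAKEUP:         # very long runs repeat the largest code
        makeups.append(MAX_MAKEUP)
        length -= MAX_MAKEUP
    if length >= 64:
        makeups.append((length // 64) * 64)
        length %= 64
    return makeups, length


if __name__ == "__main__":
    print(split_run(20))    # short run: terminating code only
    print(split_run(100))   # 64-pixel makeup code plus a 36-pixel terminating code
```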
|
JPEG is capable of compressing continuous-tone image data with a pixel depth of 6 to 24 bits with reasonable speed and efficiency.
|
The compression is lossy, but it tries to limit the loss to what the human eye can't see. Good quality images can be reproduced from files compressed at 25:1.
|
Toolkit of compression methods: parameters can be chosen to vary image quality versus storage size.
|
Works best on photographs or natural images; it is not good on areas of a single color.
|
Compression scheme:
-
Transform the image into an optimal color space. Convert RGB to YCrCb.
-
Downsample the chrominance components. Since the eye is most sensitive to luminance, keep the Y value for each pixel, but keep only 1/4 of the Cr and Cb values: one average value for every 2x2 pixel block.
-
Apply a Discrete Cosine Transform to 8 by 8 blocks, separating high- and low-frequency information. This is the most time-consuming part of the compression.
-
Quantize each block with functions weighted for the human eye. This is the step most affected by the Q factor, which determines how much high-frequency information is discarded.
-
Encode resulting coefficients using a Huffman algorithm.
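|
The downsampling, transform, and quantization steps above can be sketched in plain Python. This is a deliberately simplified illustration, not a JPEG encoder: the function names are made up, the DCT is the slow direct formula, and a single uniform step size stands in for the per-frequency quantization table that a real encoder would use.

```python
import math


def downsample_2x2(plane):
    """Keep one average value per 2x2 block (even dimensions assumed)."""
    return [
        [(plane[y][x] + plane[y][x + 1]
          + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, len(plane[0]), 2)]
        for y in range(0, len(plane), 2)
    ]


def dct_8x8(block):
    """Direct two-dimensional DCT-II of an 8x8 block."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


def quantize(coeffs, step):
    """Divide by a step size and round; small coefficients become zero.

    Assumption: one uniform step instead of a table weighted for the eye.
    """
    return [[round(value / step) for value in row] for row in coeffs]


if __name__ == "__main__":
    print(downsample_2x2([[1, 3], [5, 7]]))
    flat = [[100] * 8 for _ in range(8)]
    # A flat block has all its energy in the single lowest-frequency term.
    print(quantize(dct_8x8(flat), 16)[0])
```

After quantization most high-frequency entries are zero, which is what makes the final Huffman step effective.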
|
GIF is a widely used format made available by CompuServe (subject to licensing restrictions by Unisys).
|
The image data stored in a GIF file is LZW compressed.
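|
LZW is a dictionary scheme: it replaces repeated sequences with codes for dictionary entries that both sides build as they go. The Python sketch below shows the basic algorithm on bytes only. It leaves out the GIF-specific details (clear and end-of-information codes, variable code widths, and packing codes into bytes), and the function names are illustrative.

```python
def lzw_compress(data):
    """Return a list of integer codes for the given bytes."""
    table = {bytes([i]): i for i in range(256)}   # start with all single bytes
    next_code = 256
    current = b""
    out = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate                   # keep extending the match
        else:
            out.append(table[current])
            table[candidate] = next_code          # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        out.append(table[current])
    return out


def lzw_decompress(codes):
    """Rebuild the original bytes, recreating the same table on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:                   # code defined by this very step
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


if __name__ == "__main__":
    codes = lzw_compress(b"ABABABA")
    print(codes)
    print(lzw_decompress(codes))
```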
|
The GIF formats allow for a header, logical screen descriptor and global color table at the beginning. Then for each image, there is a local image descriptor, a local color table and the image data. GIF89 adds other extension blocks of control information.
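|
As a sketch of what the start of a GIF file looks like, the following Python reads the 6-byte header and the 7-byte logical screen descriptor. The function name and the returned dictionary keys are choices made for this example; the sample bytes in the usage section are constructed by hand, not taken from a real file.

```python
import struct


def parse_gif_header(data):
    """Decode the header and logical screen descriptor (first 13 bytes)."""
    if len(data) < 13:
        raise ValueError("too short to be a GIF file")
    if data[:3] != b"GIF":
        raise ValueError("missing GIF signature")
    version = data[3:6].decode("ascii")           # "87a" or "89a"
    # Width and height are little-endian 16-bit values, then three single bytes.
    width, height, packed, background, aspect = struct.unpack("<HHBBB", data[6:13])
    has_global_table = bool(packed & 0x80)
    return {
        "version": version,
        "width": width,
        "height": height,
        "has_global_color_table": has_global_table,
        "color_resolution_bits": ((packed >> 4) & 0x07) + 1,
        "global_color_table_entries":
            2 ** ((packed & 0x07) + 1) if has_global_table else 0,
        "background_index": background,
        "aspect_ratio_byte": aspect,
    }


if __name__ == "__main__":
    sample = b"GIF89a" + struct.pack("<HHBBB", 320, 200, 0xF7, 0, 0)
    print(parse_gif_header(sample))
```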
|
One of the options for storing the image is as "interlaced". In this scheme, the rows are stored in the order of first and halfway, then the rows halfway in between those, and so on.
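|
Concretely, GIF interlacing stores the rows in four passes: every 8th row starting at row 0, every 8th row starting at row 4, every 4th row starting at row 2, and finally every 2nd row starting at row 1. A short Python sketch of the resulting row order (the function name is illustrative):

```python
def gif_interlaced_row_order(height):
    """Return the order in which an interlaced GIF stores its rows."""
    order = []
    # (first row, step) for each of the four passes.
    for start, step in ((0, 8), (4, 8), (2, 4), (1, 2)):
        order.extend(range(start, height, step))
    return order


if __name__ == "__main__":
    print(gif_interlaced_row_order(16))
```

A viewer can draw a coarse version of the whole image after the first pass and refine it as the later passes arrive.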
|
Allows from 1 to 8 bits of color per pixel.
|
Reasonably fast to decompress.
|