Given by Roman Markowski and Geoffrey Fox at CPS600 Spring 1995 on March 1995. Foils prepared July 6,1995
Abstract * Foil Index for this file
This set of foils describes image and video compression schemes concentrating on Wavelets which seem most powerful although MPEG using related but less efficient Fourier technology will be used much more widely initially |
Wavelets are described in detail for Image case where they aree discussed for Telemedicine application. |
JPEG JBIG Fractal H.261 Schemes are briefly reviewed |
This table of Contents
Abstract
Roman Markowski |
NPAC |
Syracuse University |
111 College Place |
Syracuse |
NY 13244-4100 |
This set of foils describes image and video compression schemes concentrating on Wavelets which seem most powerful although MPEG using related but less efficient Fourier technology will be used much more widely initially |
Wavelets are described in detail for Image case where they aree discussed for Telemedicine application. |
JPEG JBIG Fractal H.261 Schemes are briefly reviewed |
image 2000x2000x24 bpp = 12MB |
one second of NTSC-quality video requires 23 MB |
compression - eliminating of redundant or less critical information
|
decreases the time and cost of transmission and the requirements for storage |
lossless - removes the redundancy in the signal; ratio 3:1; the heights of every pixel are perfectly reproduced |
lossy - selectively discards "less important" information; ratio 100:1 |
controversy: the critical feature of any lossy compression is what is important and what is not. |
evaluation of several image compression technologies
|
Huffman encoding; Shannon-Fano encoding; arithmetic encoding
|
RLE -Run Length Encoding (TIFF, BMP)
|
LZW - Lepel-Ziv-Welch (TIFF, GIF)
|
DCT - Discrete Cosine Transform (MPEG, JPEG, H.261)
|
current international standard for image compression |
lossy algorithm: the threshold of visible difference 20:1 |
lossless compression mode: 2:1 ratio, 12bpp (bpp is bits per pixel) |
designed for compression of full-color (24 bit) or gray-scale digitized images of "natural" scenes (continuous tone) |
exploits the known limitations of the human eye (10,000 colors simultaneously) |
ratio can be varied |
DCT (Discrete Cosine Transformation) and Huffman coding |
source available (Independent JPEG Group) |
ftp://ftp.uu.net/graphics/jpeg/wallace.ps.Z |
First transfer the image into a suitable color space (RGB --> Image representations which separate luminance and chrominance YUV, YCbCr, etc.); the human eye is not as sensitive to high-frequency color as it is to high-frequency luminance
|
the luminance component is left at full resolution; color is usually reduced 2:1 (2v1h,2v2h)
|
Then group the pixels for each component into 8x8 blocks; transform each block through DCT
|
In each block, divide each of the 64 frequency components by a separate "quantization coefficient" and round to integers; fundamental lossy step |
Encode the reduced coefficients using Huffman or arithmetic coding (license) |
Add headers and output the result |
Losslessly compresses binary images (one bit/pixel) |
Effective for bi-level images (black and white) |
Extended to gray scales with upto 8 bits per pixel with good results upto 6 bits per pixel |
Based on IBM's Q-coder - patented technology (no source) |
lossy algorithm |
patented technology by Barnsley(no source) |
compression extremely slow (many hours) |
decompression fast |
theoretically very high ratio - 1000:1 |
based on Iterated Function Theory and Partitioned Iterated Function Theory |
for compression ratio up to 40:1 JPEG is better; quality worse than wavelets or JPEG |
details are generated when zooming in the advanced form of interpolation |
Birth of fractal geometry in paper by B.Mandelbrot "the Fractal Geometry of Nature", 1977 |
J. Hutchinson: Iterated Function Theory, 1981 |
M.Barnsley, "Fractals Everywhere", 1988 |
in the forward direction fractal mathematics is good for generating natural looking images (trees, clouds, mountains)
|
in reverse direction can be used to compress images |
inverse problem: to go from a given image to Iterated Function System that can generate the original (unsolved) |
there are not many fractal compression programs available |
the fractals that lurk within fractal image compression are not those of the complex plane (Mandelbrot, Julia), but of Iterated Function Theory |
example: Sierpinski's Triangle |
standard for compression (synchronized) audio and video |
4 parts: video encoding, audio encoding, "systems" (synchronization) and compliance testing |
defines 352x240 pixels (30 fps in USA, 25fps in Europe) |
MPEG-1- designed for bandwidth up to 1.5 Mbps |
MPEG-2 - higher quality; designed for bandwidth 4-10 Mbps |
MPEG-3 - does not longer exist; MPEG-4 - very low bitrate coding |
DCT (Discrete Cosine Transform ) done on 8x8 blocks |
lossy algorithm with compression in space (DCT) and time (frame to frame) |
3 types of frames
|
CCITT standard |
VideoCodec for Audiovisual Services at px64 Kbps (p=1..30) |
describes videosource coder, video multiplex coder and the transmission coder |
designed for ISDN |
defines two picture formats:
|
compression ratio is |
PSNR = Peak-signal-to-noise ratio (in dB) |
RMSE is Root Mean Standard Error |
new technology |
signal analysis - weighted sum of basis functions |
Infinitely many possible sets of wavelets |
coefficients contain information about the signal |
basis functions
|
Heisenberg inequality - resolution in time and in frequency cannot both be made arbitrarily small |
Image: low frequency events are spread out in time and high frequency events are localized in time |
Further structure of Image is not uniform over x or y direction of image but rather in blocks where say head or a particular detail is. This is why JPEG uses FFT's in blocks. Wavelets automatically build in block structure adaptively so get more detail where you need it. |
sines or cosines (FFT) - cannot provide information about the time behavior of signal because they have infinite support |
impulse function - frequency behavior is not described because of its infinitesimally small support |
wavelets - set of basis functions, each with finite support of a different width |
wide variety of wavelet-based image compression schemes reported in the literature |
FFT vs DWT -
|
Both fast, linear operations; invertible and orthogonal |
rotation in function space
|
wavelet - localized in space and frequency;
|
Representation of general functions in terms of simpler, fixed building blocks |
Wavelets = small waves |
continuous wavelet transform:
|
or Discrete wavelet transform |
Multiresolution analysis |
decorrelates the pixel values of the image and concentrates the image info into a relatively small number of coefficients |
can be implemented as a pair of Quadrature Mirror Filters (QMF) which splits a signal's bandwidth in half
|
Transform can be written for one of log2N steps as a matrix operation given on following foil. As we will see each step deals with half of data from previous step and resolution examined doubles at each step |
In this way we look at ALL resolutions i.e. all length scales |
Each smoothing is constructed in this particular wavelet formulation as an average over 4 points |
The matrix is orthogonal if: |
The matrix will zero first two moments on sets of four points if: |
This is basic matrix for 1D Wavelet transform where points are smoothed in groups of four |
The solution of these 4 equations is: |
The Smoothing Filter H is the average: |
The detail is contained in operator G: |
First step produces N/2 smooth and N/2 detail values.
|
This recursively leads to pyramid algorithm |
We have reviewed 1D Wavelet Transform |
There are Three steps in 2D Image Case
|
Best methods use direct 2D methods starting with 2 by 2 or similar pixel blocks. However one can compose in several ways 1D transforms. It is NOT best to first transform in x and then transform in y. Rather ... |
A nice method uses two separate but INTERMIXED 1D transforms
|
Recursively transform the average signal (depending on required ratio) |
We have matrix of coefficients (average signal and detail signals of each scale); no compression has been accomplished yet; in fact there has been an increase in amount of storage required |
compression is achieved by quantizing and encoding wavelet coefficients |
coefficients that corresponds to smooth parts of data become small and can be set to 0 |
we can eliminate (set to zero) these coefficients with small magnitudes without creating significant distortion in the reconstructed image |
energy invariance
|
we can eliminate all but a few percent of wavelet coefficients |
thresholding function - amount of compression controlled by "t" parameter |
Quantizing of non-zero wavelet coefficients
|
codec = encoder/decoder |
losslessly compressing and decompressing the sparse matrix of quantized coefficients |
wide variety of schemes |
compromise between: memory, execution speed, available bandwidth, reconstructed image quality |
example: Shapiro's Zero Tree encoding |
Massachusetts General Hospital: no clinically significant image degradation was identified in radiology images up to 30:1 |
wavelet-based compression technology is superior to all other compression technologies (keeps details, high compression ratio) |
from a signal processing standpoint, one may view the image as a signal that has high-frequency (high-spatial detail) and low-frequency (smooth) components.The algorithm filters the signal and then iterates the process. |
HCOMPRESS - astronomical images, Richard L.White, Space Telescope Science Institute (gray scale) |
EPIC - Efficient Pyramid Image Coder, Eero P.Simoncelli, MIT Media Library |
IMAGER LIBRARY - available on the Web via the Wavelet Digest home page |
Commercially available software for medical applications; Aware, Inc - AccuPress Library |
exploiting the temporal redundancies present in an image sequence (as done by MPEG) |
techniques: hierarchical motion compression, 3D subband coding |
very time consuming - 256x256x8bpp image on 66MHz 80486 - 0.25 sec inverse wavelet transform of a single full frame (4 frames per second) |
it is necessary to perform the complete inverse transform for each frame (in a slowly varying image sequence) |
16 frames per second if just transform changes in image -- fewer nonzero pixels and so faster |
wavelet@math.scarolina.edu |
http://www.mathsoft.com/wavelets.html |
"Numerical recipes", WH Press, SA Teukolsky, WT Veterling, BP Flannery |
"Ten Lectures on Wavelets", Ingrid Daubechies, 1992 |