CPS616 Compression Technology
	CPS600 Module on
	Compression
	
	March 4,1995
	

		Roman Markowski
		NPAC
		Syracuse University
		111 College Place
		Syracuse
		NY 13244-4100
	Abstract of CPS600 Compression Module
		This set of foils describes image and video compression schemes concentrating on Wavelets which seem most powerful although MPEG using related but less efficient Fourier technology will be used much more widely initially
		Wavelets are described in detail for Image case where they aree discussed for Telemedicine application.
		JPEG JBIG Fractal H.261 Schemes are briefly reviewed
	Compressing Still and Moving Images 
		image 2000x2000x24 bpp = 12MB                                                                                          
		one second of NTSC-quality video requires 23 MB
		compression - eliminating of redundant or less critical information
			Spatial redundancy: values of neighboring pixels are strongly correlated
			Spectral redundancy: the spectral values for the same pixel location are correlated
			temporal redundancy: frames show very little change in the sequence
		decreases the time and cost of  transmission and the requirements for  storage
	Image/Video Compression
		lossless - removes the redundancy in the signal; ratio 3:1; the heights of every pixel are perfectly reproduced
		lossy - selectively discards "less important" information; ratio 100:1
		controversy: the critical feature of any lossy compression is what is important and what is not.
		evaluation of several image compression technologies
			JPEG - the leading standard in visual compression (compresses the image block by block), lossy, full color, block by block
			JBIG - binary images, lossless, gray scale
			Fractal - compression slow, decompression fast, lossy, 1000:1, bad quality
			PhotoCD - Kodak, 96x64, 192x128, 384x256, 768x512, 1536x1024,3072x2048 
			Wavelet - discovered in 1987, lossy, ratio 100:1, compresses the image as a whole
	Huffman and Other Compression Techniques
		Huffman encoding; Shannon-Fano encoding; arithmetic encoding
			input is sliced into fixed units  while the corresponding output comes in chunks of variable size
		RLE -Run Length Encoding (TIFF, BMP)
			lossless: Replace upto 127 identical characters by two bytes -- first byte is number of identical characters in string, second byte is character itself
			For example: AAAAbbbbbccct   --->   4A5b3c1t
		LZW - Lepel-Ziv-Welch (TIFF, GIF)
			Directory based encoding algorithm
			compress, zoo, lha,  pkzip. arj
			LZ77, LZ78
			input is divided into units of variable length (words) and coded in a fixed-length output code
			shorter bit patterns for most common characters
		DCT - Discrete Cosine Transform (MPEG, JPEG, H.261)
			lossy; converts the spatial image representation into a frequency map
			DCT scheme is effective only for compressing continuous-tone images (from "Graphics File Formats", O'Reilly and Associates Inc., page 162 in description of JPEG algorithm)
	JPEG - Joint Photographic Experts Group
		current international standard for image compression
		lossy algorithm: the threshold of visible difference 20:1
		lossless compression mode: 2:1 ratio, 12bpp (bpp is bits per pixel)
		designed for compression of full-color (24 bit) or gray-scale digitized images of "natural" scenes  (continuous tone)
		exploits the known limitations of the human eye (10,000 colors simultaneously)
		ratio can be varied
		DCT (Discrete Cosine Transformation) and Huffman coding
		source available (Independent JPEG Group)
		ftp://ftp.uu.net/graphics/jpeg/wallace.ps.Z
	JPEG Algorithm Specification -- I
		First transfer the image into a suitable color space (RGB --> Image representations which separate luminance and chrominance YUV, YCbCr, etc.); the human eye is not as sensitive to high-frequency color as it is to high-frequency luminance
			YUV is used in European TV standards and corresponds to YIQ in American NTSC. Y specifies gray-scale or luminance; U and V the chrominance
			YCbCr is another color space where again Y is component of luminancy but Cb and Cr are respectively ther color components in blue(Cb) and red(Cr)
		the luminance component is left at full resolution; color is usually reduced 2:1 (2v1h,2v2h)
			This means chrominance reduced 2:1 vertically wiuth no horizontal reduction(2v1h) or reduced both horizontally and vertically (2v2h)
	JPEG Algorithm Specification -- II
		Then group the pixels for each component into 8x8 blocks; transform each block through DCT
			Note for later -- need blocks as natural frequencies varies over image and not constant on a line of pixels as would correspond to Fourier transforms over full x or y of image. Wavelets "naturally" chooses block size. JPEG has fixed blocks
		In each block, divide each of the 64 frequency components by a separate "quantization coefficient" and round to integers; fundamental lossy step
		Encode the reduced coefficients using Huffman or arithmetic coding (license)
		Add headers and output the result
	JBIG - Joint-bi-level Image Experts Group
		Losslessly compresses binary images (one bit/pixel)
		Effective for bi-level images (black and white)
		Extended to gray scales with upto 8 bits per pixel with good results upto 6 bits per pixel
		Based on IBM's Q-coder - patented technology (no source)
	Fractal Compression -- I 
		lossy algorithm
		patented technology by  Barnsley(no source)
		compression extremely slow (many hours)
		decompression fast
		theoretically very high  ratio - 1000:1
		based on Iterated Function Theory and Partitioned Iterated Function Theory
		for compression ratio up to 40:1 JPEG is better; quality worse than wavelets or JPEG
		details are generated when zooming in the advanced form of interpolation
	Origins of Fractal Compression
		Birth of fractal geometry in paper by B.Mandelbrot "the Fractal Geometry of Nature", 1977
		J. Hutchinson: Iterated Function Theory, 1981
		M.Barnsley, "Fractals Everywhere", 1988
		in the forward direction fractal mathematics is good for generating natural looking images (trees, clouds, mountains)
			Used in Computer Graphics (Fractal trees,Mountains etc.)
		in reverse direction can be used to compress images
		inverse problem: to go from a given image to Iterated Function System that can generate the original (unsolved)
		there are not many fractal compression programs available
		the fractals that lurk within fractal image compression are not those of the complex plane (Mandelbrot, Julia), but of Iterated Function Theory
		example: Sierpinski's Triangle
	MPEG - Moving Picture Experts Group
		standard for compression (synchronized) audio and video
		4 parts: video encoding, audio encoding, "systems" (synchronization) and compliance testing
		defines 352x240 pixels (30 fps in USA, 25fps in Europe)
		MPEG-1- designed for bandwidth up to 1.5 Mbps
		MPEG-2 - higher quality; designed for bandwidth 4-10 Mbps
		MPEG-3 - does not longer exist; MPEG-4 - very low bitrate coding
		DCT (Discrete Cosine Transform ) done on 8x8 blocks
		lossy algorithm with compression in space (DCT) and time (frame to frame)
		3 types of frames
			I (Intra) frames - sent every 10-12 frames as still images
			P (Predicted) frames - deltas from the most recent I or P frame
			B (Bidirectional) frames - interpolation between I and P frames
	H.261 - similar to but not compatible with MPEG
		CCITT standard
		VideoCodec for Audiovisual Services at px64 Kbps (p=1..30)
		describes videosource coder, video multiplex coder and the transmission coder
		designed for ISDN
		defines two picture formats:                                                                                         
			CIF (Common Intermedia Format) - 288x360 luminance, 144x180 chrominance
			QCIF (Quarter CIF) - 144x180 of luminance and 72x90 of chrominance
	Performance Measures
		compression ratio is
		
		
		
		
		
		
		
		
		

		PSNR = Peak-signal-to-noise ratio (in  dB)
		RMSE is Root Mean Standard Error
	Introduction to Wavelets (1)
		new technology
		signal analysis - weighted sum of basis functions 
		Infinitely many possible sets of wavelets
		coefficients contain information about the signal
		basis functions
			impulse function reveals information only about the time domain behavior of the signal
			Fourier representation reveals information about  signal's frequency domain behavior
			we want to have representation which contains info about both the time and frequency (frequency content of the signal at the particular instant of time)
		Heisenberg inequality - resolution in time and in frequency cannot both be made arbitrarily small   
	Introduction to Wavelets (2)
		Image: low frequency events are spread out in time and high frequency events are localized in time
		Further structure of Image is not uniform over x or y direction of image but rather in blocks where say head or a particular detail is. This is why JPEG uses FFT's in blocks. Wavelets automatically build in block structure adaptively so get more detail where you need it.
		sines or cosines (FFT) - cannot provide information about the time behavior of signal because they have infinite support
		impulse function - frequency behavior is not described because of its infinitesimally small support
		wavelets - set of basis functions, each with finite support of a different width
		wide variety of wavelet-based image compression schemes reported in the literature
	Discrete Wavelet Transform
		FFT vs DWT -
			Individual wavelet localized pretty well in both time (space) and frequency
			Fourier expansion functions localized very well in frequency but not all in space or time
		Both fast, linear operations; invertible and orthogonal
		
		
		
		
		
		

		rotation in function space
			From Impulse (delta function) space to that of coefficients of Fourier or Wavelet functions
			Wavelet Transformation on N points performed as log2N rotations in multiresolution fashion
		wavelet - localized in space and frequency; 
			Basis consists of scalings and translations of Mother functions 
	Structure of Wavelet analysis
		Representation of general functions in terms of simpler, fixed building blocks
		Wavelets = small waves
		continuous wavelet transform:
			 translation (b) and dilation (a) parameters vary continuously
		or Discrete wavelet transform
		Multiresolution analysis
	Wavelet Transform Characteristics
		decorrelates the pixel values of the image and concentrates the image info into a relatively small number of coefficients
		can be implemented as a pair of Quadrature Mirror Filters (QMF) which splits a signal's bandwidth in half
			lowpass or smoothing
			 filter (H)
			highpass filter (G)
			which expresses detail
	Daubechie's Mother wavelets 
		 
	Mathematical Structure of Discrete Wavelet Transform-I
		Transform can be written for one of log2N steps as a matrix operation given on following foil. As we will see each step deals with half of data from previous step and resolution examined doubles at each step
		In this way we look at ALL resolutions i.e. all length scales
		Each smoothing is constructed in this particular wavelet formulation as an average over 4 points
		

		The matrix is orthogonal if:
		

		The matrix will zero first two moments on sets of four points if: 
	Matrix Structure of a Simple Wavelet Transformation
		This is basic matrix for 1D Wavelet transform where points are smoothed in groups of four
	Mathematical Structure of Discrete Wavelet Transform-II
		The solution of these 4 equations is:
		
		
		

		The Smoothing Filter H is the average:
		

		The detail is contained in operator G:
		

		First step produces N/2 smooth and N/2 detail values.
			Then we take N/2 smooth values and repeat step getting
			N/4 smooth and N/4 further detail results at double spatial resolution
		This recursively leads to pyramid algorithm
	Pyramid Algorithm in one dimension
		 
	How Image wavelet compression works
		We have reviewed 1D Wavelet Transform
		There are Three steps in 2D Image Case
			transformation of the image data using a predefined set of basis functions (multiresolution and orthogonal)
			quantization of the basis function coefficients (lossy)
			coding of the resulting data set to remove redundancy (Huffman lossless encoding)
	How wavelet compression works
	Pictorially
		 
	2D Forward/inverse wavelet transform
		Best methods use direct 2D methods starting with 2 by 2 or similar pixel blocks. However one can compose in several ways 1D transforms. It is NOT best to first transform in x and then transform in y. Rather ...
		A nice method uses two separate but INTERMIXED 1D transforms
			Image filtered along the  x  dimension
			Downsample by 2 along x
			Filter along the y dimension
			Downsample by 2 along y
		Recursively transform the average signal (depending on required ratio)
		We have matrix of coefficients (average signal and detail signals of each scale); no compression has been accomplished yet; in fact there has been an increase in amount of storage required
	2D Forward wavelet transform 
		 
	2D Inverse wavelet transform
		 
	Wavelets -- Quantization (1)
		compression is achieved by quantizing and encoding wavelet coefficients
		coefficients that corresponds to smooth parts of data become small and can be set to 0
	Wavelets -- Quantization (2)
		we can eliminate (set to zero) these coefficients with small magnitudes without creating significant distortion in the reconstructed image
		energy invariance
			total amount of energy in the image does not change when the wavelet transform is applied
			any change to wavelet coefficients will result in proportional changes in the pixel values of the reconstructed image
		we can eliminate all but a few percent of wavelet coefficients
		thresholding function - amount of compression controlled by "t" parameter
	Wavelets -- Quantization (3)
		Quantizing of non-zero wavelet coefficients
			many-to-one staircase functions
			decision points {Di, i=0,...,n}
			reconstruction levels {Ri, i=0,...,n}
			input value 'x' is mapped to a reconstruction level 'Ri' if 'x' lies in the interval  (Di, Di+1]
			separate quantizer for each scale
	Wavelets -- Coding
		codec = encoder/decoder
		losslessly compressing and decompressing the sparse matrix of quantized coefficients
		wide variety of schemes
		compromise between: memory, execution speed, available bandwidth, reconstructed image quality
		example: Shapiro's Zero Tree encoding
	Wavelets in Telemedicine
		Massachusetts General Hospital: no clinically significant image degradation was identified in radiology images up to 30:1
		wavelet-based compression technology is superior to all other compression technologies (keeps details, high compression ratio)
		from a signal processing standpoint, one may view the image as a signal that has high-frequency (high-spatial detail) and low-frequency (smooth) components.The algorithm filters the signal and then iterates the process.
	Wavelets -existing software
		HCOMPRESS - astronomical images, Richard L.White, Space Telescope Science Institute (gray scale)
		EPIC - Efficient Pyramid Image Coder, Eero P.Simoncelli, MIT Media Library
		IMAGER LIBRARY - available on the Web via the Wavelet Digest home page
		Commercially available software for medical applications; Aware, Inc - AccuPress Library
	Comparison W6+VLC, Biorthogonal+VLC, JPEG image coders
		 
	Wavelets -- Video compression
		exploiting the temporal redundancies present in an image sequence (as done by MPEG)
		techniques: hierarchical motion compression, 3D subband coding
		very time consuming - 256x256x8bpp image on 66MHz 80486 - 0.25 sec inverse wavelet transform of a single full frame (4 frames per second)
		it is necessary to perform the complete inverse transform for each frame (in a slowly varying image sequence)
		16 frames per second if just transform changes in image -- fewer nonzero pixels and so faster
	Block diagram of the video encoder
		 
	Block diagram of the video  decoder
		 
	Wavelet references
		wavelet@math.scarolina.edu
		http://www.mathsoft.com/wavelets.html
		"Numerical recipes", WH Press, SA Teukolsky, WT Veterling, BP Flannery
		"Ten Lectures on Wavelets", Ingrid Daubechies, 1992