TR#99: Orthogonal pyramid transforms for image coding

E.H. Adelson, E. Simoncelli, R. Hingorani

Published in pages 50-58 of:
Proceedings of SPIE, Vol. 845
Visual Communication and Image Processing II
27-29 October 1987
Cambridge, MA

We describe a set of pyramid transforms that decompose an image into a set of basis functions that are (a) spatial-frequency tuned, (b) orientation tuned, (c) spatially localized, and (d) self-similar. For computational reasons the set is also (e) orthogonal and lends itself to (f) rapid computation. The systems are derived from concepts in matrix algebra, but are closely connected to decompositions based on quadrature mirror filters (QMFs). Our computations take place hierarchically, leading to a pyramid representation in which all of the basis functions have the same basic shape, and appear at many scales. By placing the high-pass and low-pass kernels on staggered grids, we can derive odd-tap QMF kernels that are quite compact. We have developed pyramids using separable, quincunx, and hexagonal kernels. Image data compression with the pyramids gives excellent results, both in terms of MSE and visual appearance. A non-orthogonal variant allows good performance with 3-tap basis kernels and the appropriate inverse sampling kernels.
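
The hierarchical, orthogonal decomposition described above can be illustrated with a minimal one-dimensional sketch. It assumes the 2-tap Haar pair, which is the simplest orthogonal low-pass/high-pass pair, and not the odd-tap staggered-grid kernels of the report, whose coefficients are not given here. All function names are hypothetical. The sketch splits a signal into low-pass and high-pass bands, recurses on the low-pass band to form a pyramid, and then inverts the pyramid.

```python
import math

# Haar normalization: with this scale the analysis matrix is orthogonal.
# Assumption: Haar stands in for the report's kernels, which are not listed here.
S = 1.0 / math.sqrt(2.0)


def analyze(signal):
    """Split an even-length signal into subsampled low-pass and high-pass bands."""
    low = [S * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    high = [S * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return low, high


def synthesize(low, high):
    """Invert analyze(); for an orthogonal transform this is the transpose."""
    out = []
    for l, h in zip(low, high):
        out.append(S * (l + h))
        out.append(S * (l - h))
    return out


def build_pyramid(signal, levels):
    """Apply analyze() repeatedly to the low-pass band (finest band first)."""
    bands = []
    low = list(signal)
    for _ in range(levels):
        low, high = analyze(low)
        bands.append(high)
    return low, bands


def collapse_pyramid(low, bands):
    """Rebuild the signal from the coarsest level up to the finest."""
    for high in reversed(bands):
        low = synthesize(low, high)
    return low


def energy(values):
    """Sum of squares; an orthogonal transform leaves this unchanged."""
    return sum(v * v for v in values)


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
low, bands = build_pyramid(signal, 3)
restored = collapse_pyramid(low, bands)
coefficient_energy = energy(low) + sum(energy(b) for b in bands)
```

Because the transform is orthogonal, `coefficient_energy` matches `energy(signal)` up to rounding and `restored` matches `signal`. The same basis shape recurs at every level at a different scale, which is the self-similarity property in (d). A two-dimensional separable version would apply the same split along rows and then columns.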