FRAME BUFFERS AND COLORMAPS
Before describing the hardware that is used in color image processing, it is helpful to define what we mean by "color". For our purposes, we will define a color to be an ordered triple of color components in the Red-Green-Blue color space: (Cr, Cg, Cb). Each component of the color is an integer in the range [0, 255], requiring 8 bits of memory. This is a common representation for colors in the image processing world. It is not the same as the NTSC video standard used by broadcast television in the United States.
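As a small illustration (the names below are ours, not part of any standard), such a color can be represented as a simple triple of 8-bit integers:

```python
# A color is an ordered triple (Cr, Cg, Cb); each component is an integer in [0, 255].
white = (255, 255, 255)
black = (0, 0, 0)
pure_red = (255, 0, 0)

def is_valid_color(c):
    """Check that a triple is a legal 8-bit-per-component RGB color."""
    return len(c) == 3 and all(isinstance(x, int) and 0 <= x <= 255 for x in c)
```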
FRAME BUFFERS
A frame buffer, or picture memory, is a computer memory organized into an m x n rectangular grid of dots, called picture elements, or pixels for short. Each pixel requires a certain number of bits, varying from 1 in bit-map displays to 24 or more in high-quality color displays. We call the number of bits per pixel the "depth" of the frame buffer.
A common frame buffer size is m=n=512, at a depth of 8 bits: 512x512x8. A picture this size takes up 256 Kilobytes. When interfaced to a processor, it is possible to read and write into the frame buffer much like one reads and writes into standard computer memories. The main difference is that this memory is scanned, usually 30 times per second, to generate a video picture which is displayed on a CRT monitor. A magnified view of the pixels in a color image is shown in figure 1. Further explanation of frame buffers, color displays, and other computer graphics hardware can be found in Newman and Sproull [36].
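The storage figure above follows directly from the dimensions and depth; a quick sketch of the arithmetic (Python, for illustration only):

```python
def frame_buffer_bytes(m, n, depth_bits):
    """Storage required for an m x n frame buffer at the given depth in bits per pixel."""
    return m * n * depth_bits // 8

# A 512 x 512 frame buffer at 8 bits per pixel occupies 256 Kilobytes.
assert frame_buffer_bytes(512, 512, 8) == 256 * 1024
```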
There are several different raster-scan frame buffer architectures commercially available at present. It is useful to distinguish between two classes of color frame buffers: let's call them "segregated" and "integrated". In segregated frame buffers, there are three independent memories for the red, green, and blue components of an image; typically each of these memories stores 8 bits per pixel. A color lookup table for each component is inserted between the picture memory and the display device to allow contrast correction.
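The following is a minimal sketch of how such per-component lookup tables might be used for contrast (gamma) correction; the gamma value and table layout are illustrative assumptions, not a description of any particular hardware:

```python
GAMMA = 2.2  # assumed display gamma, chosen only for illustration

def make_gamma_table(gamma=GAMMA):
    """Build a 256-entry lookup table mapping stored values to corrected intensities."""
    return [round(255 * (v / 255.0) ** (1.0 / gamma)) for v in range(256)]

# One table per primary; here all three are identical, but in general each is independent.
red_lut, green_lut, blue_lut = make_gamma_table(), make_gamma_table(), make_gamma_table()

def display_color(r, g, b):
    """Map the stored component values through the three lookup tables."""
    return (red_lut[r], green_lut[g], blue_lut[b])
```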
An integrated color frame buffer, on the other hand, stores a single color number at each pixel rather than three separate components. These color numbers (called pixel values in the remainder of the paper) are used as addresses into a single color lookup table. This "colormap" provides a level of indirection between the data in the picture memory and the actual displayed image. Also, it should be realized that the order of colors in a colormap is arbitrary.
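A minimal sketch of this indirection: the picture memory holds small pixel values, and a single colormap turns each one into a full color at display time (the 4-entry colormap below is purely illustrative):

```python
# Picture memory stores pixel values (colormap indices), not colors.
picture = [
    [0, 0, 1],
    [2, 1, 3],
]

# The colormap provides the indirection: pixel value -> (R, G, B).
colormap = [
    (0, 0, 0),        # slot 0: black
    (255, 0, 0),      # slot 1: red
    (0, 255, 0),      # slot 2: green
    (255, 255, 255),  # slot 3: white
]

# The displayed image is the picture memory looked up through the colormap.
displayed = [[colormap[p] for p in row] for row in picture]
```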
With an integrated frame buffer of 3*n bits per pixel, one can simulate a segregated frame buffer having n bits per primary; an integrated frame buffer is more general than a segregated one. Unfortunately, the memory required for the colormap of a deep integrated frame buffer gets to be prohibitive. The colormap of a 24-bit integrated frame buffer, if one were to be built, would require at least 48 Megabytes of memory. This would make the colormap larger than the picture memory, which would require approximately 0.8 Megabytes. For this reason, one rarely finds integrated frame buffers with more than 8 or 9 bits per pixel.
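Both memory figures can be checked directly; the 0.8 Megabyte estimate assumes a 512x512 picture memory:

```python
def colormap_bytes(bits_per_pixel, bits_per_color=24):
    """An integrated colormap needs one bits_per_color entry per possible pixel value."""
    return (2 ** bits_per_pixel) * bits_per_color // 8

# A 24-bit integrated frame buffer would need a 48 Megabyte colormap ...
assert colormap_bytes(24) == 48 * 2 ** 20
# ... far larger than its own 512 x 512 x 24-bit picture memory (about 0.75 Megabytes).
assert 512 * 512 * 24 // 8 == 786432
```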
Currently, segregated frame buffers are much more common than integrated systems in the image processing world, probably because of their simplicity and higher image quality. This may change, however. At this moment, shallow, low-resolution integrated graphics terminals such as the Apple are making their way into the home computer market. They may become much more common than the huge frame buffers one finds in universities. If this were to happen, the demand for color image quantization programs would certainly increase.
Most people feel that 6 to 8 bits per component are required for high quality grayscale images. With fewer than 100 quantization levels, the eye is distracted by the artificial edges called "contours." Figure 2 shows a 2-bit quantized image.
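To make the effect concrete, the sketch below maps 8-bit gray values onto a small number of evenly spaced levels; the spacing rule is an assumption made only for illustration:

```python
def quantize_gray(value, bits):
    """Quantize an 8-bit gray value (0-255) to 2**bits evenly spaced output levels."""
    levels = 2 ** bits
    step = 256 // levels
    index = min(value // step, levels - 1)
    return index * 255 // (levels - 1)

# With only 4 levels (2 bits), a smooth ramp collapses into four visible bands.
ramp = [quantize_gray(v, 2) for v in range(256)]   # output values: 0, 85, 170, 255
```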
Note that the colormap also allows some redundancy. For example, there are many ways of displaying a totally black image. One could black out the entire colormap and pepper the picture memory with random pixel values, or one could zero the picture memory and only set slot 0 of the color map to be black (R=G=B=0).
CURRENT HARDWARE
The algorithms outlined in chapter IV were implemented for the Ramtek 9300 Raster Display Frame Buffers at the Architecture Machine Group (AMG) at MIT. These are integrated 9-bit systems with 24-bit colors in the colormap. This implies that the colormap (also called a color matrix) is a 512-entry table of 24-bit colors: 9 bits in, 24 bits out. Colormaps for the Ramtek require 2K of memory. The resolution of these picture memories is m=640 pixels horizontally by n=480 lines vertically, or a total of N=m x n = 307,200 pixels.
Color image separations come from several sources: photographs digitized by vidicon through red, green, and blue filters, digitized video frames, and images from other labs recorded on magnetic tape. The quantization programs described later take the three 8-bit separations and quantize them down to 8 or 9 bits per pixel.
OPTIMIZING THE COLORMAP
When an image is quantized, each of the 3-dimensional colors must be encoded into a single pixel value. At the AMG, there are 307,200 24-bit colors in the original image, each of which must be transformed into a pixel value. This is the quantizing step, which will be done once (in software) before saving the image on disk or tape. When the image is loaded into the frame buffer, the display processor in the frame buffer does the decoding for us through the colormap. This transforms our color numbers (pixel values) back into a 3-dimensional color video signal.
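A hedged sketch of the two halves of this process, assuming images stored as rows of (R, G, B) triples; the nearest-color search here is the simplest possible encoder, shown only to make the data flow concrete, not the method developed later:

```python
def nearest_slot(color, colormap):
    """Encode one 24-bit color as the index of the closest colormap entry."""
    return min(range(len(colormap)),
               key=lambda i: sum((c - m) ** 2 for c, m in zip(color, colormap[i])))

def quantize_image(original, colormap):
    """Encoding step, done once in software: colors -> pixel values."""
    return [[nearest_slot(color, colormap) for color in row] for row in original]

def decode_image(pixels, colormap):
    """Decoding step, done by the display processor: pixel values -> colors."""
    return [[colormap[p] for p in row] for row in pixels]
```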
We wish the quantized, displayed image to approximate as closely as possible the 24-bit original. Although we cannot (at the AMG) display the 24-bit original for subjective comparison with the quantized image, the "quality" of the approximation can be measured with a distortion formula: a measure of the similarity of the quantized image to the original. The formula we will use is

    D = d(C1, C1') + d(C2, C2') + ... + d(CN, CN')

where Ci is the color of pixel i in the original image, Ci' is the color of the corresponding pixel in the quantized image, and d is the distance between two colors in RGB space.
We try to minimize D by varying the colormap. In other words, we want to approximate a 24-bit-per-pixel image with an 8 or 9-bit one by minimizing the sum of the distances between corresponding pixels in the original and quantized images. This is a difficult optimization problem with no fast solution.
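Under this distortion measure, a candidate colormap can be evaluated directly (if slowly) by summing the per-pixel distances; a minimal sketch, assuming the Euclidean distance in RGB space and images stored as rows of (R, G, B) triples:

```python
import math

def distortion(original, quantized):
    """Sum over all pixels of the RGB distance between corresponding colors."""
    total = 0.0
    for row_orig, row_quant in zip(original, quantized):
        for c, q in zip(row_orig, row_quant):
            total += math.sqrt(sum((a - b) ** 2 for a, b in zip(c, q)))
    return total
```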
ARE WE USING THE PROPER DISTORTION MEASURE?
Image quality and image similarity are very difficult physiological and psychological problems. We will not attempt any new discoveries in this area. Instead, we will use the simple distance formula in order to speed the quantization algorithms. A complete evaluation of image similarity would include such complex topics as texture, continuity, contrast, spatial frequency, color balance, and activity. Fortunately, one can achieve relatively good results by ignoring most of these effects and comparing only corresponding pixels in the original and quantized images. That is the approach used here.
C. F. Hall [19] and others have achieved excellent color image compression through the use of perceptual transform coding. By transforming into a perceptual space, redundant spectral and spatial data are compressed. Hall achieved good results even at 1 bit per pixel. Unfortunately, these results are not applicable to our frame buffer quantization problem, because state-of-the-art frame buffers cannot yet compute spatial transforms in real time. The best one can do within the constraints of the integrated frame buffer architecture is to choose the color quantization cells based on a perceptual distortion model of color differences; no spatial effects are relevant.
What sort of color distortions can we expect when we compress images? When highly continuous and colorful images are quantized, it is not possible to match every color exactly, so we anticipate contouring effects. The contouring effect can be alleviated by the addition of color noise. Experiments with noise addition are presented in chapter IV.
We can now proceed to reformulate our image processing problem in mathematical terms. Let K be the number of colors to which we are quantizing. One could try all possible colormaps and choose the one with the least distortion, but this is a horrendous computation which would require eons of computer time. The natural approach is instead to derive our colormap from the color distribution of the image being quantized.
The distribution of colors in an image is analogous to a distribution of N points in the RGB color cube. The color cube is a discrete lattice because the colors of our original image have integer components between 0 and 255. Thus there are 256x256x256 colors in the lattice, or roughly 16.8 million in total. Although the number of possible colors is many times greater than the number of pixels in an image, a single image can contain at most N distinct colors.
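Since a single image can contain at most N of the roughly 16.8 million lattice colors, its color distribution can be gathered into a compact histogram; a minimal sketch, again assuming images stored as rows of (R, G, B) triples:

```python
from collections import Counter

def color_distribution(image):
    """Count how often each (R, G, B) color occurs; the result has at most N entries."""
    return Counter(color for row in image for color in row)
```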
It is understood that we intend to generate a different colormap for each image depending on its color range. In one sentence, this is two-pass, adaptive intraframe tapered quantization.