Introduction: The Ideal Image Compression
Lossy Compression Common Practices
Most lossy compression schemes are very complicated and involve complex mathematical transformations. All lossy algorithms work on the same principle: change the original data just enough to make the compression more effective, but not too much. Despite the complexity of these algorithms, only a few have succeeded in achieving very good compression rates while keeping a fair representation of the original data at low CPU usage. But these algorithms are so complex that they are difficult to implement, difficult to improve, and it is difficult to add new features to the resulting images.
Lossy compression of sound uses a clever and easy-to-understand concept: the "psycho-acoustic model". Research has been done to understand the human ear well enough to compute how much a sound sample can be modified without a listener noticing. The mathematical transformations involved are still complex because of the nature of sound, but most people can understand what the "model" is about.
There is no such thing as an "aesthetico-visual model" for lossy compression of images. The traditional measurement of image degradation is done by pixel-to-pixel differences, in order to compute a so-called peak signal-to-noise ratio (PSNR). This is supposed to give an estimate of how much damage has been done to the original image. The PSNR is measured in dB (decibels), a logarithmic scale: each 6dB lost doubles the root-mean-square pixel-to-pixel difference between the two images. A PSNR of 48dB means that there is a mean difference of about 1 on each pixel of an 8-bit-per-pixel image. A JPEG image of quality 75 has a PSNR of about 33dB compared to its original, which is about the same as using 5 or 6 bits per pixel. Very good quality is usually around 40dB, where the degradation is not visible, which is about the same as encoding each pixel on 6 or 7 bits.
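For reference, the PSNR of a pair of 8-bit images can be computed as in the following minimal sketch; it uses NumPy and assumes a peak value of 255, and both the function and the example arrays are purely illustrative:

```python
import numpy as np

def psnr(original, degraded, peak=255.0):
    """Peak signal-to-noise ratio between two 8-bit images."""
    # Mean squared error over all pixels.
    diff = original.astype(np.float64) - degraded.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    # 10*log10(peak^2 / mse); halving the RMS error gains about 6 dB.
    return 10.0 * np.log10(peak ** 2 / mse)

# A uniform difference of 1 on every pixel gives about 48 dB:
a = np.zeros((64, 64), dtype=np.uint8)
b = a + 1
print(psnr(a, b))  # ~48.13
```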
Lossy Compression Common Weaknesses
However handy for estimating the efficiency of an algorithm, these measures are unfortunately only remotely representative of what a human being will perceive as an aesthetic or an annoying alteration of an image. For example, JPEG compression is famous for its "wavy" artefacts in color gradients. Wavelet-based compression programs are more respectful of gradients, but they deteriorate the borders of highly contrasted areas of the image. Fractal-based compression algorithms have a little of both problems. This is why the image compression "gurus" keep enhancing lossy compression algorithms by refining their mathematical representations, up to the point where the compressed data no longer bears any obvious relation to an actual image.
We think that a simpler algorithm for lossy compression of images would have many advantages over the usual algorithms: easy implementation on almost any platform, valuable added features, more contributions from users for improvements, and more controllable results. The obvious requirement for that simpler algorithm to be really useful is that it gives results similar to or better than the usual algorithms, namely JPEG.
There is, however, one simple model for degrading an image in a manner that is not too annoying: making it blurry. The algorithm proposed in this paper tries to leverage this idea by blurring the "irrelevant" areas of the image while keeping the "relevant" areas sharp.
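As a purely illustrative sketch of that idea (not the algorithm defined later in this paper), one could blend a blurred copy of the image into the original wherever a hypothetical "relevance" mask is low:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def selective_blur(image, relevance, sigma=2.0):
    """Blend a blurred copy into the original where relevance is low.

    image:     2-D float array of grayscale pixels
    relevance: 2-D float array in [0, 1]; 1 = keep sharp, 0 = blur fully
    """
    blurred = gaussian_filter(image, sigma=sigma)
    return relevance * image + (1.0 - relevance) * blurred
```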
Staying On The Path
To be really useful, a compression algorithm must be efficient at all stages. For instance, the JPEG algorithm applies Huffman lossless compression after its DCT and quantization steps. The proposed algorithm is only intended to perform a 2D decorrelation of the data and does not try to define a way to compress the resulting data stream. There are many lossless compression algorithms for binary data that can be used by feeding the stream to an external program: gzip, bzip2, an arithmetic coder, etc.
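For example, the decorrelated stream could simply be piped through gzip as an external program; the file names below are hypothetical:

```python
import subprocess

# Treat the decorrelated stream as opaque bytes and let gzip do the
# lossless entropy coding.
with open("decorrelated.raw", "rb") as raw, open("image.gz", "wb") as out:
    subprocess.run(["gzip", "-9"], stdin=raw, stdout=out, check=True)
```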
Besides the visual degradations, the proposed algorithm also tries to address several flaws commonly present in image compression algorithms.
- Most importantly, the weight (as in kilobytes) of the compressed image must depend mainly on the amount of detail present in the original image, not on the size of the original image. Although these two values seem to be related, they are not. A simple experiment is to double the width and height of an image with a fair pixel-doubling algorithm, such as bicubic interpolation (not a "smart" algorithm that tries to invent textures); see the sketch after this list. Since no new detail has been added to the image, the two images, once compressed, should have roughly the same weight (depending on how much the compression algorithm "likes" the interpolation). Most compression algorithms keep the 4:1 ratio between the two compressed images (fractal-based algorithms being an exception).
- The degradation should not create details that need to be compressed: when an already compressed image is compressed again, the resulting image should be the same as if it had been compressed at the same ratio from the original image. Most compression algorithms put artefacts into the image, creating fake details that will be treated as actual details if another compression pass is done, degrading the image even more on the second pass by wasting compressed data on them.
- Also, when one wants to display any subpart of an image, it should not be necessary to decompress the whole image and crop the wanted portion.
- Finally, due to the composite nature of images, it should be possible to use different compression parameters on different areas of an image.
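As a concrete illustration of the pixel-doubling experiment from the first point above, a quick test could be run with the Pillow library; the file names and quality setting are arbitrary:

```python
import os
from PIL import Image

# Bicubic 2x upscaling adds no new detail, so an "ideal" codec should not
# produce a compressed file roughly four times larger for the big version.
original = Image.open("photo.png").convert("RGB")
doubled = original.resize((original.width * 2, original.height * 2),
                          Image.BICUBIC)

original.save("small.jpg", quality=75)
doubled.save("big.jpg", quality=75)

print(os.path.getsize("small.jpg"), os.path.getsize("big.jpg"))
# With JPEG, the second file typically comes out close to 4x the first.
```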
Last modification: March 2002, © Eric GAUDET, 2002