ABSTRACT

Most (but not all) modern data compression problems are of the following form: you have a long binary word (or “file”) W which you wish to transform into a shorter binary word U in such a way that W is recoverable from U, or, in ways to be defined case by case, almost or substantially recoverable from U. In case W is completely recoverable from U, we say we have lossless compression. Otherwise, we have lossy compression. The compression ratio is lgth(W)/lgth(U). The “compression ratio achieved by a method” is the average compression ratio obtained, using that method, with the average taken over all instances of W in the cases where the method is used. (This taking of the average is usually hypothetical, not actual.)
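
As a concrete (and purely illustrative) instance of these definitions, the following Python sketch losslessly compresses a repetitive byte string with the standard-library zlib module, which here stands in for an arbitrary lossless method, and computes the resulting compression ratio lgth(W)/lgth(U); the choice of zlib and of the example string W are assumptions made for the sake of illustration.

    import zlib

    # W: an illustrative "file" -- a highly repetitive byte string,
    # which a generic lossless compressor should shorten considerably.
    W = b"abab" * 1000

    # U: the shorter binary word produced by the (lossless) method.
    U = zlib.compress(W)

    # Lossless means W is completely recoverable from U.
    assert zlib.decompress(U) == W

    # Compression ratio lgth(W)/lgth(U); measuring lengths in bits
    # (8 per byte), the factor of 8 cancels in the ratio.
    ratio = (8 * len(W)) / (8 * len(U))
    print(f"lgth(W) = {8 * len(W)} bits, lgth(U) = {8 * len(U)} bits")
    print(f"compression ratio = {ratio:.2f}")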

Sometimes the file W is sitting there, available for leisurely perusal and sampling. Sometimes the file W is coming at you at thousands of bits per second, with immediate compression required and with no way of foretelling with certainty what the bit stream will be like 5 seconds from now. Therefore, our compression methods will be distinguished not only by the compression ratio they achieve and by how much information they preserve, but also by how fast they work and by how they deal with fundamental changes in the stream W (such as a change from a stream in which the digits 0, 1 occur approximately randomly to one which is mostly 0’s).
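
To make that last point concrete, here is a small sketch in the same spirit, with zlib again standing in for a generic method (an assumption, not a claim about any particular method discussed later): a stream whose bits occur approximately randomly is essentially incompressible, so its ratio comes out near 1, while a stream that is mostly 0’s compresses very well.

    import os
    import zlib

    def ratio(W: bytes) -> float:
        """Compression ratio lgth(W)/lgth(U) under zlib."""
        U = zlib.compress(W)
        return len(W) / len(U)

    # A stream in which 0's and 1's occur approximately randomly:
    random_stream = os.urandom(10_000)

    # A stream which is mostly 0's (here: all zero bytes):
    mostly_zeros = bytes(10_000)

    print(f"random stream: ratio ~ {ratio(random_stream):.2f}")
    print(f"mostly 0's:    ratio ~ {ratio(mostly_zeros):.2f}")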