Image and Video Compression: An Overview

ABSTRACT

We are witnessing an explosion of information, generated in forms such as text, audio, image, and video, which are commonly referred to as multimedia content. The availability of inexpensive yet powerful digital cameras and video recorders in the consumer market has, within a very short period, further accelerated research and development in processing these media. One of the first tasks in processing such data is to compress them prior to storage and/or transmission. Compression reduces the size required for their representation. For images and videos, this is usually achieved through an alternative representation in a domain other than the original one, which is spatial for images and spatiotemporal for videos. Two popular representations of this kind are formed by the discrete cosine transform (DCT) [3] and the discrete wavelet transform (DWT) [92]. We refer to these domains as compressed domains or domains of compression. Since images and videos are increasingly available in these forms, there is a need to analyze them directly in those spaces. This can make processing faster by avoiding the overhead of decompressing and recompressing the data. Further, the smaller data size in the compressed domain has the additional advantage of keeping the memory requirement low. However, a careful cost-benefit analysis of actual computation cost and memory requirement should precede any proposal for equivalent processing in these domains. Even though direct processing in the compressed domain avoids decompression and recompression, the equivalent computations on transform coefficients may be costlier than the corresponding spatial-domain operations.
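As a small illustration of what direct processing in a compressed domain can mean, the following sketch blends two image blocks once in the spatial domain and once on their DCT coefficients, and compares the results. It is only a minimal sketch under stated assumptions: the 8x8 block size, the orthonormal DCT-II scaling, and all function names (`dct_matrix`, `dct2`, `idct2`, `blend`, `max_abs_diff`) are illustrative choices, not taken from the text, and no quantization or entropy coding is modeled.

```python
import math

N = 8  # assumed block size; 8x8 is common in DCT-based codecs


def dct_matrix(n=N):
    """Return the n x n orthonormal DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


C = dct_matrix()
CT = transpose(C)


def dct2(block):
    """2-D DCT of a square block: C * block * C^T."""
    return matmul(matmul(C, block), CT)


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * coeffs * C (valid because C is orthonormal)."""
    return matmul(matmul(CT, coeffs), C)


def blend(a, b, alpha):
    """Elementwise alpha * a + (1 - alpha) * b; works on pixels or coefficients."""
    return [[alpha * p + (1 - alpha) * q for p, q in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def max_abs_diff(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


if __name__ == "__main__":
    # Two synthetic 8x8 blocks (hypothetical data, for illustration only).
    x = [[float((3 * i + 5 * j) % 17) for j in range(N)] for i in range(N)]
    y = [[float((7 * i * j + 2) % 23) for j in range(N)] for i in range(N)]

    spatial = blend(x, y, 0.3)                         # blend pixels
    compressed = idct2(blend(dct2(x), dct2(y), 0.3))   # blend coefficients
    print("max difference:", max_abs_diff(spatial, compressed))
```

Because the DCT is linear, the same elementwise rule applies to pixels and to coefficients, so this is a favorable case for compressed-domain processing. Nonlinear operations such as clipping do not generally reduce to an elementwise rule on coefficients, which is where the cost-benefit analysis mentioned above becomes important.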