Discrete cosine transform

A discrete cosine transform (DCT) expresses a finite sequence of data points as a sum of cosine functions oscillating at different frequencies. DCTs are important to numerous applications in science and engineering, from lossy compression of images (e.g. JPEG and HEIF, where small high-frequency components can be discarded), video (e.g. MPEG and AVC), and audio (e.g. MP3 and AAC), to spectral methods for the numerical solution of partial differential equations. The use of cosine rather than sine functions is critical for compression, since fewer cosine functions are needed to approximate a typical signal (as described below), whereas for differential equations the cosines express a particular choice of boundary conditions.

In particular, a DCT is a Fourier-related transform similar to the discrete Fourier transform (DFT), but using only real numbers. DCTs are generally related to the Fourier series coefficients of a periodically and symmetrically extended sequence, whereas DFTs are related to the Fourier series coefficients of a sequence that is only periodically extended.
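
The link between a DCT and the DFT of a symmetrically extended sequence can be checked numerically. The sketch below is illustrative only: it assumes the unnormalized type-II convention X_k = sum over n of x_n cos(pi (n + 1/2) k / N), and the function names (dct2_direct, dft, dct2_via_dft) and the sample signal are made up for this example. It mirrors the input to length 2N, takes a naive DFT, and undoes the half-sample phase shift.

```python
import cmath
import math


def dct2_direct(x):
    """Unnormalized DCT-II computed straight from the cosine-sum definition."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
        for k in range(n)
    ]


def dft(y):
    """Naive O(M^2) DFT, written for clarity rather than speed."""
    m = len(y)
    return [
        sum(y[t] * cmath.exp(-2j * math.pi * t * k / m) for t in range(m))
        for k in range(m)
    ]


def dct2_via_dft(x):
    """DCT-II obtained from the DFT of the even-symmetric extension of x."""
    n = len(x)
    mirrored = list(x) + list(reversed(x))  # length 2N, symmetric about the midpoint
    spectrum = dft(mirrored)
    # The symmetry point sits half a sample before index 0, so each bin carries
    # a phase factor exp(i*pi*k/(2N)); removing it leaves a real value equal to
    # twice the DCT-II coefficient.
    return [
        0.5 * (cmath.exp(-1j * math.pi * k / (2 * n)) * spectrum[k]).real
        for k in range(n)
    ]


signal = [1.0, 2.0, 4.0, 3.0, 0.5, -1.0]  # arbitrary sample data
direct = dct2_direct(signal)
via_dft = dct2_via_dft(signal)
max_diff = max(abs(a - b) for a, b in zip(direct, via_dft))
print("largest difference between the two methods:", max_diff)
```

The two results should agree to within floating-point rounding. Under this convention the extension has exactly twice the input length; other DCT types and conventions lead to slightly different extension lengths.
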
DCTs are equivalent to DFTs of roughly twice the length, operating on real data with even symmetry (since the Fourier transform of a real and even function is real and even); in some variants the input and/or output data are shifted by half a sample. There are eight standard DCT variants, of which four are common.

The most common variant is the type-II DCT, often called simply 'the DCT'. Its inverse, the type-III DCT, is correspondingly often called 'the inverse DCT' or 'the IDCT'. Two related transforms are the discrete sine transform (DST), which is equivalent to a DFT of real and odd functions, and the modified discrete cosine transform (MDCT), which is based on a DCT of overlapping data. Multidimensional DCTs (MD DCTs) extend the concept of the DCT to multidimensional signals. Several algorithms exist for computing MD DCTs, and a variety of fast algorithms have been developed to reduce the computational complexity of implementing the DCT.

The DCT was pioneered in a 1974 research paper by Nasir Ahmed, T. Natarajan and K. R. Rao. It was a benchmark publication and has been cited as a fundamental development in many works since. The research and events that led to the development of the DCT were summarized in a later publication by Ahmed, 'How I came up with the Discrete Cosine Transform'.

Later research papers developed the DCT further. A 1977 paper by Wen-Hsiung Chen, C. H. Smith and S. C. Fralick presented a fast DCT algorithm, with further developments in a 1978 paper by M. J. Narasimha and A. M. Peterson and a 1984 paper by B. G. Lee. These papers, along with the original 1974 paper by Ahmed, Natarajan and Rao, were cited by the Joint Photographic Experts Group as the basis for JPEG's lossy image compression algorithm in 1992. In 1988, the first DCT-based video coding format, H.261, was introduced.
The DCT was later also used for the MPEG-1 format, introduced in 1992, and subsequently became the standard for all of the MPEG video coding formats that followed. The modified discrete cosine transform (MDCT) was proposed by Princen, Johnson and Bradley in 1987, following earlier work by Princen and Bradley in 1986. The MDCT is used in modern audio compression formats such as MP3 and AAC.

The DCT, and in particular the DCT-II, is often used in signal and image processing, especially for lossy compression, because it has a strong 'energy compaction' property: in typical applications, most of the signal information tends to be concentrated in a few low-frequency components of the DCT. For strongly correlated Markov processes, the DCT can approach the compaction efficiency of the Karhunen-Loève transform (which is optimal in the decorrelation sense). As explained below, this stems from the boundary conditions implicit in the cosine functions.
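
Energy compaction can be illustrated with a small experiment. The following is a minimal sketch, not a production codec: it assumes the orthonormal scaling of the DCT-II (so that the sum of squared coefficients equals the sum of squared samples), uses a simple ramp as a stand-in for a smooth, strongly correlated signal, and the names dct2_ortho, idct2_ortho and keep are hypothetical. It measures how much energy the first few coefficients hold, then reconstructs the signal from those coefficients alone using the inverse transform (a scaled DCT-III).

```python
import math


def dct2_ortho(x):
    """Orthonormal DCT-II: preserves the sum of squares of the input."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(
            scale * sum(x[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
        )
    return out


def idct2_ortho(coeffs):
    """Inverse of dct2_ortho, i.e. an orthonormally scaled DCT-III."""
    n = len(coeffs)
    out = []
    for t in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * coeffs[k] * math.cos(math.pi / n * (t + 0.5) * k)
        out.append(total)
    return out


signal = [float(i) for i in range(32)]  # smooth ramp, chosen only for illustration
coeffs = dct2_ortho(signal)

total_energy = sum(c * c for c in coeffs)
keep = 4  # number of low-frequency coefficients retained
kept_fraction = sum(c * c for c in coeffs[:keep]) / total_energy

# Discard the high-frequency coefficients and reconstruct.
truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
approx = idct2_ortho(truncated)
max_error = max(abs(a - b) for a, b in zip(signal, approx))

# Keeping every coefficient should reproduce the input up to rounding.
roundtrip_error = max(abs(a - b) for a, b in zip(signal, idct2_ortho(coeffs)))

print("fraction of energy in first", keep, "coefficients:", kept_fraction)
print("largest reconstruction error after truncation:", max_error)
print("largest round-trip error with all coefficients:", roundtrip_error)
```

For this ramp, almost all of the energy lands in the first few coefficients, which is the behavior lossy coders exploit when they discard or coarsely quantize high-frequency components. Signals with sharp edges or noise compact less well, so the retained fraction depends on the input.
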
