A lifting-based system for the compression/classification trade-off in the JPEG2000 framework
|Article code||Publication year||English article||Persian translation||Word count|
|22402||2004||18-page PDF||available to order||not calculated|
Publisher : Elsevier - Science Direct
Journal : Journal of Visual Communication and Image Representation, Volume 15, Issue 2, June 2004, Pages 145–162
In this paper, we propose a design for a novel lifting-based wavelet system that achieves the best trade-off between compression and classification performance. The proposed system is based on bi-orthogonal filters and can operate in a scalable compression framework. The trade-off point between compression and classification is determined by the system; however, the user can also fine-tune the relative performance using two controllers (one for compression and one for classification). Extensive simulations have been performed to demonstrate the compression and/or classification performance of our system in the context of the recent image compression standard, JPEG2000. Our simulation results show that the lifting-based kernels generated by the proposed system achieve superior compression performance compared to the default kernels adopted in the JPEG2000 standard (at a classification rate of 70%). The generated kernels can also achieve compression quality comparable to the JPEG2000 kernels whilst providing 99% classification performance. In other words, the proposed lifting-based system achieves the best trade-off between compression and classification performance at the compressed bit-stream level in the wavelet domain.
The rapid growth of visual media in many applications has led to the proliferation of a variety of compression standards, including the recent MPEG-4 (ISO/IEC JTC1/SC29/WG11, 1998) and JPEG2000 (Taubman, 2000) standards for image/video compression. It is therefore likely that visual media will be increasingly stored in compressed format. Visual indexing techniques are also becoming important because of the requirement to retrieve visual information from multimedia databases. Potential applications include multimedia information systems, digital libraries, interactive television, etc. The upcoming MPEG-7 (ISO, 2000) standard proposes content descriptors, which succinctly describe the visual content for the purposes of efficient retrieval. Since not all images/videos are indexed prior to compression, there is a requirement for sophisticated compressed-domain indexing techniques, where the visual information is retrieved based on compressed-domain features. Classification is an important step in visual indexing; in this paper, we use the terms classification and indexing interchangeably. Combined compression and classification is therefore becoming an important research issue in the context of efficient storage and retrieval of visual media in a variety of applications. Exploring classification in the compressed domain has the advantage of faster search and retrieval; it also reduces the memory required for storing on-line data. Several combined compression and indexing techniques, which employ compressed-domain features as indices, have recently been reported in the literature. In (Mandal et al., 1997), a wavelet-based compression and indexing system has been presented that employs Legendre moments of the wavelet coefficients as an index. In (Mandal, 1998), an indexing technique based on histograms in the wavelet domain has been detailed, where only features derived from the high-frequency bands are used to distinguish between various textures.
Acceptable retrieval results have been obtained using this approach; however, it is computationally expensive because of the up-sampling process required for the high-frequency bands. In (Liang and Jay Kuo, 1999), a wavelet-based image representation and description approach has been presented, where the images are indexed and compressed simultaneously. This greatly simplifies the image database management problem; however, the feature descriptors generated during the encoding process are based only on the sub-band energies, which do not effectively describe the image content. In (Chang and Jay Kuo, 1993), a texture analysis scheme has been presented based on an irregular tree decomposition structure, where the middle-resolution sub-band coefficients are used for texture matching. In this scheme, a J-dimensional feature vector is generated, consisting of the energies of the J most important sub-bands. Indexing is performed by matching the feature vector of the query image with those of the target images in the database. In (Bhalod et al., 2000), a still-texture object indexing scheme is proposed for use in the MPEG-4 framework; retrieval is based on the autocorrelation values of the objects in all the wavelet channels. In (Bhalod, 2000), a texture classification approach has been presented that uses the Mallat exponential algorithm (Mallat, 1989) to model textures at all wavelet levels. A technique for texture classification using co-occurrence features was first proposed in (Haralick et al., 1973) and later applied in the wavelet domain in (Van de Wouwer et al., 1999), where second-order statistics are extracted from the co-occurrence signatures. Good retrieval results were obtained with this approach, in addition to its low complexity. Another improved technique for texture classification that uses Mallat exponentials is reported in (Bhalod et al., 2001).
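The sub-band-energy indexing idea running through several of the schemes above can be sketched in a few lines. This is a minimal illustration, not any of the cited implementations: the Haar kernel, the three decomposition levels, and the Euclidean matching rule are simplifying assumptions made here for clarity.

```python
import numpy as np

def haar2d(img):
    """One level of a 2D Haar decomposition (an illustrative stand-in
    for the wavelet filters used in the cited indexing schemes)."""
    a = img[0::2, :] + img[1::2, :]          # vertical sums
    d = img[0::2, :] - img[1::2, :]          # vertical differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0     # approximation band
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0     # horizontal details
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0     # vertical details
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0     # diagonal details
    return ll, lh, hl, hh

def energy_features(img, levels=3):
    """J-dimensional feature vector: mean energy of each
    high-frequency sub-band over `levels` decompositions."""
    feats = []
    band = np.asarray(img, dtype=float)
    for _ in range(levels):
        band, lh, hl, hh = haar2d(band)
        feats += [np.mean(lh**2), np.mean(hl**2), np.mean(hh**2)]
    return np.array(feats)

def retrieve(query, database):
    """Return the index of the database image whose feature vector
    is closest (Euclidean) to the query's -- the matching step
    described in the text."""
    q = energy_features(query)
    dists = [np.linalg.norm(q - energy_features(t)) for t in database]
    return int(np.argmin(dists))
```

A query is then answered by computing its feature vector once and scanning the stored database features; the full images never need to be reconstructed, which is what makes compressed-domain retrieval attractive.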
We note that most of the texture classification approaches presented in the literature are computationally expensive, due to the fact that the texture features are mostly extracted from the high-frequency regions, which entails extra complexity. In most of the combined compression and classification approaches reported in the literature, there is a synergistic relation between the two tasks, as they both rely on the underlying content. Increasing the number of wavelet decomposition levels results in more frequency localizations in the decomposed image (up to the fourth level (Mao and Jain, 1992)). While the additional frequency localizations improve the bit-rate/PSNR trade-off to achieve better compression performance, they also increase the size of the wavelet sub-band feature vector, which in turn enhances the classification performance (Chang and Jay Kuo, 1993). Table 1 provides a summary of the classification performance (of the histogram technique (Mandal, 1998)) for different classes of images (from the Vistex database of the Vision and Modeling group at the MIT Media Lab) at varying levels of decomposition, using the same filter kernel for all images and all decomposition levels.

Most of the combined compression and classification approaches reported in the literature rely on extracting indices from the transform coefficients (wavelet or DCT domain). However, with the increased complexity of the individual tasks (after transformation) involved in the compression process, and in order to obviate the need to decode the compressed bit-stream back to the transform-coefficient level, there is a need for combined compression and indexing approaches that operate directly on the compressed bit-stream rather than on the transform coefficients. We note that there is a strong relation between the values of the transform coefficients and the content of the image.
At the compressed bit-stream level, however, this relationship is much harder to exploit, because the compressed bit-stream does not directly relate to the content. Hence, while compression and classification can be treated jointly at the transform-domain level, there is a non-synergistic relation between them at the compressed bit-stream level: good classification performance is typically obtained at the expense of compression performance, and vice versa, as shown in Fig. 9 and in (Bhalod et al., 2001). It is therefore important to investigate the best trade-off between compression and classification, and this is the principal motivation for the research work introduced in this paper. A number of techniques in the signal processing literature investigate the trade-off between compression and classification in the vector quantization (VQ) domain. These techniques explore the trade-off in the encoder/decoder design, by designing a decoder that classifies at a given bit rate or decodes at a specific classification error. In (Perlmutter et al., 1996; Li et al., 1999), Bayes VQ is employed in the design of the encoder and classifier, where the system computes a quantized index for each input vector; this quantization index can also be treated as a classification label. In (Baras and Dey, 1999), a novel coder for joint compression and classification has been presented for speech data, based on a modified learning vector quantization (LVQ) algorithm. In (Srinivasamurthy and Ortega, 2001), a compression and classification algorithm has been presented that employs a Lagrangian multiplier to control the trade-off between compression and classification performance.
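The Lagrangian formulation mentioned above amounts to scoring each candidate operating point by a cost J = D + λ·E and letting the multiplier λ set the balance between distortion D and classification error E. A minimal sketch follows; the (distortion, classification-error) pairs are purely hypothetical numbers for illustration, not results from this paper or any cited work.

```python
# Hypothetical operating points as (distortion, classification_error).
# These numbers are illustrative only.
operating_points = [
    (1.0, 0.40),  # best compression, worst classification
    (1.4, 0.20),
    (2.0, 0.08),
    (3.5, 0.01),  # worst compression, best classification
]

def best_tradeoff(points, lam):
    """Pick the operating point minimizing the Lagrangian cost
    J = distortion + lam * classification_error."""
    return min(points, key=lambda p: p[0] + lam * p[1])
```

Sweeping λ from 0 upward traces the whole trade-off curve: λ = 0 ignores classification entirely and picks the best-compressing point, while a large λ buys classification accuracy at the cost of compression distortion.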
Although extensive research has been carried out in the area of wavelet-based compression, relatively little work has been done on exploring the trade-off between compression and classification at the compressed bit-stream level. Since JPEG2000 is expected to be widely used for data compression, and considering that wavelet-based coding is becoming the technique of choice for compression, we have explored the concept of joint compression and indexing at the compressed bit-stream level in the wavelet domain. Most existing wavelet image coders employ bi-orthogonal filters. This provides the advantages of selective frequency localization and perfect reconstruction (PR), in contrast to classical orthogonal filters (Cohen et al., 1992); it also obviates the need for any phase compensation in the pyramidal filter structure. Hence our proposed wavelet compression system is based on bi-orthogonal filters. Lifting has emerged as a powerful scheme for bi-orthogonal wavelet construction (Sweldens, 1995). Lifting exploits the similarity of the coefficients in the low-pass and high-pass filters, resulting in a faster implementation than conventional convolution-based coders. In addition, lifting offers in-place calculation and reversibility, and has been adopted in many recent wavelet-based implementations, including MPEG-4 and JPEG2000. We have therefore adopted a lifting-based implementation in the proposed system. In this paper, we propose a novel lifting-based joint compression/classification coder. A Lagrangian multiplier is employed to achieve the best trade-off between compression and classification performance. The proposed system achieves superior compression performance at a fixed classification rate, or the best possible classification at a fixed compression ratio. Details of the proposed system are presented in Section 2.
The applicability of the proposed system in the JPEG2000 framework is shown in Section 3. Experimental results that demonstrate the superior compression performance of the proposed scheme, as well as its superior classification performance at a compression performance comparable to JPEG2000, are presented in Section 4, followed by the conclusion in Section 5.
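For concreteness, the predict/update structure and in-place reversibility that motivate the lifting-based design can be seen in the reversible LeGall 5/3 kernel adopted in JPEG2000. The sketch below follows the standard integer lifting steps on a one-dimensional, even-length signal; the boundary handling is a simplified mirror extension, not a full implementation of the standard's extension rules.

```python
def lifting_53_forward(x):
    """One level of the reversible LeGall 5/3 lifting transform
    (the integer wavelet of JPEG2000) on an even-length signal.
    Predict: odd samples become high-pass details d.
    Update:  even samples become the low-pass approximation s."""
    n = len(x)
    assert n % 2 == 0
    # Predict step, with mirror extension at the right edge.
    d = [x[2 * i + 1] - (x[2 * i] + x[2 * i + 2 if 2 * i + 2 < n else n - 2]) // 2
         for i in range(n // 2)]
    # Update step, with mirror extension at the left edge.
    s = [x[2 * i] + (d[i - 1 if i > 0 else 0] + d[i] + 2) // 4
         for i in range(n // 2)]
    return s, d

def lifting_53_inverse(s, d):
    """Invert by undoing the lifting steps in reverse order --
    the in-place reversibility noted in the text."""
    n2 = len(s)
    # Undo the update step to recover the even samples.
    even = [s[i] - (d[i - 1 if i > 0 else 0] + d[i] + 2) // 4
            for i in range(n2)]
    x = [0] * (2 * n2)
    x[0::2] = even
    # Undo the predict step to recover the odd samples.
    for i in range(n2):
        nxt = even[i + 1] if i + 1 < n2 else even[n2 - 1]
        x[2 * i + 1] = d[i] + (even[i] + nxt) // 2
    return x
```

The round trip is exact despite the integer arithmetic, because each lifting step is undone by the same expression it applied; this is the property that allows lossless operation in JPEG2000 without any phase compensation.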
Conclusion
In this paper, we have presented a lifting-based system for joint compression and classification at the compressed bit-stream level. The proposed system has two control switches, one for compression and one for classification, which control the trade-off between them. The system maintains the perfect reconstruction property of the coding system and improves the regularity of the generated filters. When the classification factor is disabled, the proposed system generates bi-orthogonal lifting kernels that achieve superior compression performance compared to the existing kernels defined in the JPEG2000 standard. With a classification rate of over 99%, we have been able to obtain compression results comparable to those of the existing compression kernels. Our system can help the user choose the best filters for both compression and classification.