Speeding up similarity search in high-dimensional image databases through multiscale filtering and dynamic programming
Article code | Year of publication | Length of the English article |
---|---|---|
24967 | 2006 | 12 pages (PDF) |
Publisher : Elsevier - Science Direct
Journal : Image and Vision Computing, Volume 24, Issue 5, 1 May 2006, Pages 424–435
Abstract
This paper presents a scalable content-based image indexing and retrieval system based on a new multiscale filter. Image databases often represent image objects as high-dimensional feature vectors and access them via those feature vectors and a similarity measure. A similarity measure based on the proposed multiscale filtering technique is defined to reduce the computational complexity of similarity search in high-dimensional image databases. Moreover, special attention is paid to solving the problem of feature value correlation by dynamic programming. This problem arises from changes in the images due to database updates or from considering spatial layout when constructing the feature vectors. The computational complexity of the similarity measure in a high-dimensional image database is very high, which restricts the applications of image retrieval to certain areas. To demonstrate the effectiveness of the proposed algorithm, we conducted extensive experiments and compared its performance with IBM's query by image content (QBIC) and with Jain and Vailaya's method. The experimental results demonstrate that the proposed method outperforms both methods in retrieval accuracy and noise immunity. The proposed method also executes much faster than QBIC and achieves good retrieval accuracy compared with both Jain's method and QBIC.
Introduction
Image databases often represent image objects as vectors of d numeric features and access them via those feature vectors and a similarity measure. The feature vectors of typical vector-based descriptors have quite large dimensions. This high dimensionality leads to high computational complexity in the distance calculations for similarity retrieval, and to inefficiency in indexing and search. To make content-based image retrieval truly scalable to large image databases, efficient multidimensional indexing techniques need to be explored. Several methods have been proposed to overcome these problems [1]. The techniques can be roughly categorized into the following classes [2] and [3]: (1) the dimensionality reduction (DR) approach, (2) the multidimensional indexing approach, and (3) the filter-based approach. An indexing algorithm may combine two or more of these classes. For example, one promising approach is to first perform dimension reduction and then use an appropriate multidimensional indexing technique. Even though the dimension of the feature vectors in image retrieval is normally very high, the embedded dimension is much lower [4] and [5]. Principal component analysis (PCA) [4] and [5] is a good way to condense most of the information in a data set into a few dimensions. It is worth pointing out that blind dimension reduction is dangerous, since information can be lost if the reduction goes below the embedded dimension. Although Smith and Chang [6] recently developed a post-verification technique to avoid the problem of blind reduction, the other limitations of the dimension reduction approach remain. Because an image retrieval system is dynamic and new images are continuously added to the image collection, the computation of the dimension reduction is expensive [7].
Popular existing multidimensional indexing techniques include the bucketing algorithm, k–d tree, priority k–d tree, quad-tree, K–D–B tree, hB-tree, R-tree and its variants RC-tree and R*-tree. In addition to the above approaches, clustering and neural nets, widely used in pattern recognition, are also promising indexing techniques. Very good reviews and comparisons of various indexing techniques in image retrieval can be found in [8]. Multidimensional indexing methods treat d-dimensional feature vectors as points in a d-dimensional vector space, and the similarity measure can be viewed as a measure of distance within that space. The multidimensional indexing approach faces a challenge when applied to image databases: the performance of existing multidimensional indexing schemes degrades dramatically as the dimensionality increases [3] and [9]. The filter-based approach searches for the k nearest neighbors of a query by filtering the vectors so that only a small portion of them must be visited. The percentage of vectors visited during a search depends on the strategy used to design the filter. As an example, the LPC-file [2] partitions the vector space into rectangular cells, and these cells are used to generate a bit-encoded approximation for each vector. The k-NN queries are processed by first scanning the entire approximation file and filtering the vast majority of vectors out of the search based only on these approximations. The drawbacks of the filter-based approach are: (1) designing the approximations is not trivial, and the precision of the approximations is poor when they are applied to image data with good locality; (2) additional information must be added to the approximations to keep the filtering rate high as the database grows larger. This paper presents a scalable content-based image indexing and retrieval system based on a new multiscale filter.
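The filter-and-refine idea described above can be illustrated with a minimal Python sketch. It is not the multiscale filter proposed in the paper, whose design is given in Section 3 and is not reproduced here. It assumes an L1 distance and builds coarse approximations by summing consecutive groups of feature values; by the triangle inequality, the L1 distance between two coarse vectors never exceeds the L1 distance between the full vectors, so a candidate can be discarded safely whenever its coarse distance already exceeds the current k-th best distance. The names `coarsen`, `build_pyramid`, `knn_filtered`, and the block sizes are hypothetical.

```python
import heapq


def l1(a, b):
    """L1 (city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def coarsen(vec, block):
    """Sum consecutive groups of `block` values (the last group may be shorter)."""
    return [sum(vec[i:i + block]) for i in range(0, len(vec), block)]


def build_pyramid(vec, blocks=(8, 2)):
    """Coarse-to-fine approximations of `vec` (block sizes are an assumption)."""
    return [coarsen(vec, b) for b in blocks]


def knn_filtered(query, database, k, blocks=(8, 2)):
    """Exact k-NN under L1, skipping vectors rejected by a coarse lower bound.

    Returns (sorted list of (distance, index), number of full distance evaluations).
    """
    q_pyr = build_pyramid(query, blocks)
    # In a real system these approximations would be precomputed and stored.
    pyramids = [build_pyramid(v, blocks) for v in database]

    best = []  # max-heap emulated with negated distances: (-distance, index)
    full_evals = 0
    for idx, vec in enumerate(database):
        kth = -best[0][0] if len(best) == k else float("inf")
        # Each coarse distance is a lower bound on the true distance, so a
        # bound above the current k-th best distance rules the vector out.
        if any(l1(qc, vc) > kth for qc, vc in zip(q_pyr, pyramids[idx])):
            continue
        full_evals += 1
        d = l1(query, vec)
        if len(best) < k:
            heapq.heappush(best, (-d, idx))
        elif d < kth:
            heapq.heapreplace(best, (-d, idx))
    return sorted((-nd, i) for nd, i in best), full_evals
```

Calling `knn_filtered(query, database, k)` returns the same distances as a brute-force scan, together with a count of how many full-resolution distances were actually computed. Like the method described in the conclusion, this sketch still visits every database entry, so its running time grows linearly with the database size; the filter only reduces the cost per entry.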
The image objects are represented as high-dimensional feature vectors, and users can access them via the feature vectors and a similarity measure. A similarity measure based on the proposed multiscale filtering technique is defined to reduce the computational complexity of similarity search in high-dimensional image databases. Moreover, special attention is paid to solving the problem of feature value correlation by dynamic programming [10]. Dynamic programming is a well-known technique that returns the optimal alignment between two feature sequences, where one comes from a query image and the other from a database image. Consider an area in the query image that is similar to an area in a database image but that, because of the fixed parameters used in the segmentation algorithm, is segmented into two nearby regions in the query image and into a single region in the database image. In this case, the two regions in the query image become two neighboring nodes in the feature sequence representing the spatial layout of the query image. The corresponding region, on the other hand, is just one node in the feature sequence representing the spatial layout of the database image. The recursive dynamic programming equations proposed in this work properly align both nodes of the query image with the single node of the database image. In conclusion, the advantages of the proposed dynamic programming approach for content-based image retrieval are twofold: (1) it measures image similarity in terms of the spatial layout of region properties; (2) it reduces the effect of inaccurate region segmentation caused by lighting, viewpoint, and the threshold values used in image retrieval. The computational complexity of the similarity measure in a high-dimensional image database is very high, which limits the applicability of image retrieval.
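The two-regions-to-one-region case described above can be sketched in Python. The paper's own recursive equations appear in its Section 2 and are not reproduced here; this sketch substitutes a standard dynamic-time-warping style recurrence, which is one common way to let several nodes of one sequence align with a single node of the other. The function names and the L1 node cost are assumptions made for illustration.

```python
def node_cost(a, b):
    """Dissimilarity between two region feature vectors (L1, an assumption)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def align(query_seq, db_seq):
    """Align two non-empty feature sequences with a DTW-style recurrence.

    D[i][j] is the minimum cost of aligning the first i query nodes with the
    first j database nodes. Each node must be matched, and a node may be
    matched to several consecutive nodes of the other sequence.
    Returns (total cost, list of (query index, database index) pairs).
    """
    n, m = len(query_seq), len(db_seq)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = node_cost(query_seq[i - 1], db_seq[j - 1])
            D[i][j] = c + min(
                D[i - 1][j - 1],  # one-to-one match
                D[i - 1][j],      # another query node joins the same database node
                D[i][j - 1],      # another database node joins the same query node
            )

    # Walk back from the end to recover which nodes were aligned.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        )
    path.reverse()
    return D[n][m], path
```

For a query sequence whose first two nodes have similar features and a database sequence in which a single node covers the same area, the recovered path pairs both query nodes with that one database node instead of forcing a poor one-to-one match.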
To demonstrate the effectiveness of the proposed algorithm, we conducted extensive experiments and compared its performance with IBM's query by image content (QBIC) [11] and with Jain and Vailaya's method [12]. The experimental results demonstrate that the proposed method outperforms both methods in retrieval accuracy and noise immunity. The proposed method also executes much faster than QBIC and achieves good retrieval accuracy compared with both Jain's method and QBIC. The remainder of this paper is organized as follows. Section 2 describes the method for computing the similarity between two images by dynamic programming. Section 3 presents the design of the proposed multiscale filter. Section 4 presents the analysis of the proposed image retrieval strategy. Experimental tests illustrating the effectiveness of the proposed image retrieval method are shown in Section 5. Finally, conclusions are drawn in Section 6.
Conclusion
In this paper, we have presented a scalable content-based image indexing and retrieval system based on a new multiscale filter. A similarity measure based on the proposed multiscale filtering technique has been defined to reduce the computational complexity of similarity search in high-dimensional image databases. Moreover, special attention has been paid to solving the problem of feature value correlation by dynamic programming. Compared with QBIC and Jain's method, the experimental results demonstrate that the proposed method outperforms both in retrieval accuracy, noise immunity, and execution speed. The proposed method has one drawback: the search time depends linearly on the number of images in the database. Future work should address this problem.