ISI article No. 28637

Title
Recognition of degraded characters using dynamic Bayesian networks
Article code: 28637
Publication year: 2008
Length of the English article: 12 pages (PDF)
Source

Publisher: Elsevier - ScienceDirect

Journal: Pattern Recognition, Volume 41, Issue 10, October 2008, Pages 3092–3103

Keywords
Markovian models, Hidden Markov models, Dynamic Bayesian networks, Historical documents, Broken character recognition

Abstract

In this paper, we investigate the application of dynamic Bayesian networks (DBNs) to the recognition of degraded characters. DBNs are an extension of one-dimensional hidden Markov models (HMMs) which can handle several observation and state sequences. In our study, characters are represented by the coupling of two HMM architectures into a single DBN model. The interacting HMMs are a vertical HMM and a horizontal HMM whose observable outputs are the image columns and image rows, respectively. Various couplings are proposed where interactions are achieved through the causal influence between state variables. We compare non-coupled and coupled models on two tasks: the recognition of artificially degraded handwritten digits and the recognition of real degraded old printed characters. Our models show that coupled architectures perform more accurately on degraded characters than basic HMMs, the linear combination of independent HMM scores, as well as discriminative methods such as support vector machines (SVMs).
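
The two observation streams mentioned in the abstract can be illustrated with a small sketch. It assumes a binary image stored as a list of pixel rows and uses raw pixel vectors as observations; the features actually used in the paper are not given in this text, and the name `column_and_row_streams` is hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = ink, 0 = background
Stream = List[Tuple[int, ...]]


def column_and_row_streams(image: Image) -> Tuple[Stream, Stream]:
    """Return (columns, rows), the two observation sequences of one image.

    Columns feed the vertical HMM and rows feed the horizontal HMM.
    Raw pixel vectors are an illustrative assumption, not the paper's features.
    """
    if not image or any(len(row) != len(image[0]) for row in image):
        raise ValueError("image must be a non-empty rectangular grid")
    rows = [tuple(row) for row in image]            # scanned top to bottom
    columns = [tuple(col) for col in zip(*image)]   # scanned left to right
    return columns, rows


# Toy 4x3 vertical stroke with one missing pixel row (a "broken" character).
toy = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
]
columns, rows = column_and_row_streams(toy)
```

In this toy image the break wipes out one row observation entirely, while the middle column still carries three of its four ink pixels, which is the kind of complementarity between streams that the coupled models are meant to exploit.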

Introduction

Since the seminal work of Rabiner [1], stochastic approaches such as hidden Markov models (HMMs) have been widely applied to speech recognition, handwriting [2,3] and degraded text recognition [4,5]. This is largely due to their ability to cope with incomplete information and non-linear distortions. These models can handle variable-length observation sequences and offer joint segmentation and recognition, which avoids having to segment cursive words into characters [6]. However, HMMs may also be used as classifiers for single characters [7,8] or for characters segmented from words by an “explicit” segmentation method [9]: the scores output for each character and each class are combined at the word level. Another property of HMMs is that they belong to the class of generative models. Generative models cope better with degradation since they rely on scores output for each character and each class, whereas discriminative models, such as neural networks and support vector machines (SVMs), are powerful at separating classes through decision boundaries. Under degradation, characters are expected to still be classified correctly by generative models, even if they receive lower scores. Noisy and degraded text recognition is still a challenging task for a classifier [10]. In the field of historical document analysis, old printed documents have a high occurrence of degraded characters, especially broken characters due to ink fading. When dealing with broken characters, several options are generally considered: restoring and enhancing characters [11–13] or recovering characters through sub-graphs within a global word graph optimization scheme [14]. Another solution is to combine classifiers or to combine data. Several methods can be used for combining classifiers [15]; one of them consists of multiplying or summing the output scores of each classifier. In [16,17], two HMMs are combined to recognize words.
A first HMM, modeling pixel columns, proposes word hypotheses and the corresponding word segmentation into characters. The hypothesized characters or sub-segments are then given to a second HMM modeling pixel rows. This second HMM normalizes and classifies single characters. The results of both HMMs are combined by a weighted voting approach or by multiplying scores. Our approach differs from restoration methods in that it aims at improving the classification of characters without restoring them. This is motivated by the fact that preprocessing may introduce distortions into character images. In our previous work [18], we compared data and decision fusion and showed that data fusion yields better accuracy than decision fusion for HMM-based printed character recognition. The present dynamic Bayesian network (DBN) approach is a data fusion scheme which couples two data streams, image columns and image rows, into a single DBN classifier. It differs from the approach presented in [16,17], where two classifiers are coupled (one classifier per stream) in a decision fusion scheme, and from a data fusion scheme consisting of a multi-stream HMM, which would require large, full covariance matrices in order to take into account dependencies between the streams [18]. Our study consists of building DBN models which include two sequences of observations in a single classifier: the pixel rows and the pixel columns. It can be seen as coupling two HMMs into a single DBN classifier, as opposed to combining the scores of two basic HMM classifiers in a decision fusion scheme. The two HMM architectures, each including an observation stream associated with state variables, are linked in a graph-based representation. The two streams are observed jointly and the model parameters (state transition matrices) reflect the spatial correlations between these observations. We apply the DBN models to broken character recognition.
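
The decision fusion baseline described above, in which the scores of a column HMM and a row HMM are combined, can be sketched as a weighted sum of per-class log-likelihoods, which is equivalent to a weighted product of likelihoods. This is a minimal illustration: the function names, the weight of 0.5 and the toy scores are assumptions, not values from the paper.

```python
import math
from typing import Dict


def fuse_log_scores(
    col_scores: Dict[str, float],
    row_scores: Dict[str, float],
    weight: float = 0.5,
) -> Dict[str, float]:
    """Weighted sum of per-class log-likelihoods from two independent HMMs."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if col_scores.keys() != row_scores.keys():
        raise ValueError("both classifiers must score the same classes")
    return {
        label: weight * col_scores[label] + (1.0 - weight) * row_scores[label]
        for label in col_scores
    }


def classify(scores: Dict[str, float]) -> str:
    """Pick the class with the highest (log-)score."""
    return max(scores, key=scores.get)


# Hypothetical likelihoods for one degraded digit image.
col = {"3": math.log(0.20), "8": math.log(0.30)}  # column HMM prefers "8"
row = {"3": math.log(0.40), "8": math.log(0.10)}  # row HMM prefers "3"
fused = fuse_log_scores(col, row)
print(classify(col), classify(fused))  # prints: 8 3
```

Here the two classifiers are trained and scored separately and only their outputs interact, whereas the DBN approach described in the text lets the two streams influence each other inside a single model.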
As generative models, DBNs are well suited to degraded character recognition. These models also provide a certain robustness to degradation due to their ability to cope with missing information. They are able to exploit spatial correlations between observations, so that a corrupted observation in the image can be compensated for by an uncorrupted one. We compare several DBN architectures among themselves, with other fusion models such as the combination of independent HMMs, and with an SVM classifier. The paper is organized as follows. In Section 2, we briefly introduce Bayesian networks (BNs) and DBNs. In Section 3, we present several independent or coupled models. In Section 4, we apply these models to the problem of broken character recognition (artificial and real). We conduct several experiments to show the advantages of DBNs by comparing their performance with the combination of HMM scores and with an SVM classifier. Conclusions are drawn in Section 5.

Conclusion

We have presented a new approach for off-line character recognition, based on DBNs. The modeling consists of coupling two HMMs in various DBN architectures. The observations for these HMMs are the image rows and the image columns, respectively. Interactions between rows and columns are modeled through state transitions or state/observation transitions. This results in finer representations of character images and in an improvement over the basic HMM framework. We first investigated independent HMM and auto-regressive (AR) models. We showed that vertical models perform better than horizontal ones since columns of character images are more discriminating than rows. Secondly, we coupled these independent models into single models that outperform both the non-coupled models and the combination of the scores of the independent HMMs. We also demonstrated that coupling through states, as in ST_CPL, is more effective than coupling from state to observation, as in GNL_CPL. The AR-coupled architecture, which dynamically links observations in time, gives the best recognition results. We applied this approach to the recognition of handwritten digits and old printed characters. We demonstrated the robustness of this approach in the presence of artificial and real-world degradations. Our experiments show that coupled architectures cope better with highly broken characters than both basic HMMs and discriminative methods such as SVMs. This is because coupled architectures are able to predict missing information and may provide at least one uncorrupted stream within time slices. The proposed coupled DBN architectures are thus particularly effective for the recognition of broken characters. We expect further improvements from an accurate initialization of the parameters.
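
Coupling through states can be illustrated with a generic two-chain forward recursion. This sketch is not the paper's ST_CPL model, whose exact links are not given in this text. It assumes discrete observation symbols, two streams of equal length, and a row chain whose transition is conditioned on the previous state of the column chain. All parameter values are invented for the example.

```python
from itertools import product
from typing import List, Sequence

Matrix = List[List[float]]


def coupled_forward(
    obs_col: Sequence[int],
    obs_row: Sequence[int],
    pi_col: List[float],
    pi_row: List[float],
    trans_col: Matrix,
    trans_row: List[Matrix],
    emit_col: Matrix,
    emit_row: Matrix,
) -> float:
    """Joint likelihood of two observation streams under a state-coupled model.

    trans_col[p][i]    = P(column state i | previous column state p)
    trans_row[p][q][j] = P(row state j | previous column state p,
                                         previous row state q)
    """
    if len(obs_col) != len(obs_row) or not obs_col:
        raise ValueError("streams must be non-empty and of equal length")
    states = list(product(range(len(pi_col)), range(len(pi_row))))
    # First time slice: the two chains start independently.
    alpha = {
        (i, j): pi_col[i] * emit_col[i][obs_col[0]]
        * pi_row[j] * emit_row[j][obs_row[0]]
        for i, j in states
    }
    # Later slices: sum over all previous joint states (p, q).
    for t in range(1, len(obs_col)):
        alpha = {
            (i, j): emit_col[i][obs_col[t]] * emit_row[j][obs_row[t]]
            * sum(alpha[p, q] * trans_col[p][i] * trans_row[p][q][j]
                  for p, q in states)
            for i, j in states
        }
    return sum(alpha.values())


# Hypothetical two-state, left-to-right parameters with binary symbols.
PI_COL, PI_ROW = [0.6, 0.4], [0.5, 0.5]
TRANS_COL = [[0.7, 0.3], [0.0, 1.0]]
TRANS_ROW = [
    [[0.8, 0.2], [0.0, 1.0]],  # column chain was in state 0
    [[0.4, 0.6], [0.0, 1.0]],  # column chain was in state 1
]
EMIT_COL = [[0.9, 0.1], [0.2, 0.8]]
EMIT_ROW = [[0.7, 0.3], [0.1, 0.9]]

likelihood = coupled_forward([0, 1, 1], [0, 0, 1], PI_COL, PI_ROW,
                             TRANS_COL, TRANS_ROW, EMIT_COL, EMIT_ROW)
```

For classification, one such model would be trained per character class and the class with the highest joint likelihood selected. The recursion runs over the product state space, so its cost per time slice grows with the square of the number of joint states, which stays small for the few states used here.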