Download English ISI article No. 28991
Persian translation of the article title

آموزش شبکه بیزی ساختار مبتنی بر فاصله هیستوگرام: یک روش خاص طبقه بندی نظارت شده

English title
Histogram distance-based Bayesian Network structure learning: A supervised classification specific approach
Article code: 28991
Publication year: 2009
Length of English article: 11 pages (PDF)
Source

Publisher: Elsevier - ScienceDirect

Journal: Decision Support Systems, Volume 48, Issue 1, December 2009, Pages 180–190

Persian translation of keywords
شبکه بیزی - فاصله هیستوگرام - طبقه بندی نظارت شده - یادگیری ماشین - آموزش ساختار
English keywords
Bayesian Network, Histogram distance, Supervised classification, Machine learning, Structure learning

English abstract

In this work we introduce a methodology based on histogram distances for the automatic induction of Bayesian Networks (BN) from a file containing cases and variables related to a supervised classification problem. The main idea consists of learning the Bayesian Network structure for classification purposes taking into account the classification itself, by comparing the class distribution histogram distances obtained by the Bayesian Network after classifying each case. The structure is learned by applying eight different measures or metrics: the Cooper and Herskovits metric for a general Bayesian Network and seven different statistical distances between pairs of histograms. The results obtained confirm the hypothesis of the authors about the convenience of having a BN structure learning method which takes into account the existence of the special variable (the one corresponding to the class) in supervised classification problems.

English introduction

Almost any practical intelligent application has to deal with uncertainty. This uncertainty may stem from the inherent complexity of the problem, from the technical limitations of the data collection equipment (interval error in individual measurements, low resolution in images, etc.), from safety considerations (a radioactive tracer could give good information about a patient, but it cannot be applied), or from the impossibility of collecting or managing all the data needed to perform the reasoning. Until recently, the application of strict probabilistic approaches to reasoning was considered impractical because of the difficulty of computing the joint probability distribution of the large number of random variables involved. However, the emergence of the concept of conditional (in)dependence made it possible to simplify the calculations involved and enabled the development of automatic methods for reasoning under uncertainty based on probability theory.

The last decade has seen significant theoretical advances and an increasing interest in probabilistic graphical models (PGMs), the most widely used of the probability-based methods. These models represent dependency relationships within a set of random variables, where the random variables are represented as nodes in a graph. The absence of an arc between two variables corresponds to independence, and its presence means possible dependence. One of the most popular types of graphical model is the Bayesian Network (BN). In this type of model the arcs are directed, and the graph must not contain any directed cycle [11], [37] and [46]. Much research has been devoted to the BN structure learning task [13], [14] and [25]. However, the problem of acquiring good BN structures in general, and in particular structures that serve as classification models, remains open.
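To make concrete why conditional independence simplifies the calculation of the joint distribution, recall the standard factorization of a Bayesian Network over variables X_1, ..., X_n. The notation Pa(X_i) for the parents of node X_i in the graph is introduced here for illustration and does not appear in the text above.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```

For instance, for a chain A → B → C of binary variables, the joint becomes P(A) P(B | A) P(C | B), which needs 1 + 2 + 2 = 5 free parameters instead of the 7 required by an unrestricted joint table.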
Most structure learning algorithms need two components (score + search): the learning algorithm itself, which guides the search, and the metric used to evaluate the structure at each step of the learning process. The objective of this work is to look for new metrics for Bayesian Network structure learning algorithms. We concentrate on supervised classification problems and, more specifically, on the use of histogram distances to measure the difference between the a posteriori class distribution that a given network structure assigns to a case and the real category that case belongs to.
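As a rough illustration of the kind of score described above, the following Python sketch compares the class posterior a network assigns to each case with a histogram of the true class, and sums a histogram distance over all cases. It is not the authors' implementation. The form of the Jeffrey divergence used here (the symmetrised variant common in histogram-comparison work), the one-hot encoding of the true class, the aggregation by summation, and the names `jeffrey_divergence`, `one_hot` and `structure_score` are all assumptions made for the example. The posteriors are taken as given; computing them from a BN is out of scope.

```python
import math
from typing import List, Sequence


def jeffrey_divergence(h: Sequence[float], k: Sequence[float]) -> float:
    """Symmetrised Jeffrey divergence between two histograms (assumed form)."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for h_i, k_i in zip(h, k):
        m_i = (h_i + k_i) / 2.0  # bin-wise mean of the two histograms
        # An empty bin contributes 0, the limit of x * log(x) as x -> 0.
        if h_i > 0:
            total += h_i * math.log(h_i / m_i)
        if k_i > 0:
            total += k_i * math.log(k_i / m_i)
    return total


def one_hot(label: int, n_classes: int) -> List[float]:
    """Histogram that puts all mass on the true class (assumed encoding)."""
    return [1.0 if c == label else 0.0 for c in range(n_classes)]


def structure_score(posteriors: Sequence[Sequence[float]],
                    labels: Sequence[int]) -> float:
    """Sum of distances between predicted posteriors and true classes.

    Lower is better: 0 means every case received probability 1 on its class.
    """
    n_classes = len(posteriors[0])
    return sum(
        jeffrey_divergence(p, one_hot(y, n_classes))
        for p, y in zip(posteriors, labels)
    )


if __name__ == "__main__":
    labels = [0, 1]
    # Hypothetical posteriors produced by two candidate structures.
    confident = [[0.9, 0.1], [0.2, 0.8]]
    uninformative = [[0.5, 0.5], [0.5, 0.5]]
    print(structure_score(confident, labels))
    print(structure_score(uninformative, labels))
```

In a score + search setting, a search procedure would prefer the candidate structure with the lower score; here the structure with confident, correct posteriors scores lower than the uninformative one.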

English conclusion

In this paper, a new method for BN structure learning is proposed with the aim of improving the behavior of the BN when the resulting network model is used for supervised classification problems. Existing structure learning metrics, although commonly used, generally do not take into account the existence of a special variable, namely the class, that is of central interest for the intended model. The newly proposed metrics do take into account the final objective of models for supervised classification problems and measure the goodness of the model being acquired in terms of its classification capabilities. To validate the method, the relationship between the value each metric assigns to a network structure and the classification performance of that structure has been analyzed, not only for the training datasets but also for the data used to validate the model. This relationship showed that the metrics proposed in this paper correlate better with the classification capabilities of the models than the other metrics do. They also improve the generalization capabilities of the models. The results obtained confirm that the presented methodology is appropriate and that some of the tested metrics are very well suited to measuring the classification capabilities of a BN model. The Jeffrey distance stands out, showing good generalization capabilities and high percentages in the subsequent validation of the obtained models. In fact, the Jeffrey divergence is one of the most successful distance metrics in mobile robot localization, where whole-image histogram features are used to identify the robot's position in real time against a database of previously recorded images of locations in its environment [56]. Still, the effect of the α parameter has to be analyzed. This parameter can affect not only the correlation between the metric and the performance of the network, but also the classification accuracy itself.
Using a fixed value of α for all the databases, we obtained good classification results, but selecting a specific α value for each classification problem could be a way of improving the accuracy obtained. Further work should include the use of more sophisticated paradigms as structure learning algorithms and the application of Estimation of Distribution Algorithms [50] in conjunction with the newly proposed metric. A Feature Selection process [12] could also be used. It would also be interesting, and highly advisable, to put more effort into developing classifier comparison methods in ML, in the direction proposed by [7], and to test the approach with incomplete databases [60].
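The validation idea described in the conclusion, relating the value a metric gives to each network structure to that structure's classification performance, could be sketched as follows. The text does not say which correlation coefficient was used, so Pearson's coefficient is an assumption here, as is the function name `pearson`. The numbers in the usage part are made-up placeholders that only show how the function is called; they are not results from the paper.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values: one metric score and one validation accuracy
    # per candidate structure (illustrative only, not from the paper).
    metric_values = [0.10, 0.25, 0.40, 0.55]
    accuracies = [0.92, 0.85, 0.80, 0.71]
    # For a distance-style metric (lower is better), a strongly negative
    # correlation means the metric tracks classification accuracy well.
    print(pearson(metric_values, accuracies))
```

Computing this once on training data and once on held-out data mirrors the two-sided check mentioned in the conclusion.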