درباره عملکرد طبقه بندی TAN و شبکه های بیزی عمومی
کد مقاله | سال انتشار | تعداد صفحات مقاله انگلیسی |
---|---|---|
28985 | 2009 | 7 صفحه PDF |
Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)
Journal : Knowledge-Based Systems, Volume 22, Issue 7, October 2009, Pages 489–495
چکیده انگلیسی
Over a decade ago, Friedman et al. introduced the Tree Augmented Naïve Bayes (TAN) classifier, with experiments indicating that it significantly outperformed Naïve Bayes (NB) in terms of classification accuracy, whereas general Bayesian network (GBN) classifiers performed no better than NB. This paper challenges those claims, using a careful experimental analysis to show that GBN classifiers significantly outperform NB on datasets analyzed, and are comparable to TAN performance. It is found that the poor performance reported by Friedman et al. are not attributable to the GBN per se, but rather to their use of simple empirical frequencies to estimate GBN parameters, whereas basic parameter smoothing (used in their TAN analyses but not their GBN analyses) improves GBN performance significantly. It is concluded that, while GBN classifiers may have some limitations, they deserve greater attention, particularly in domains where insight into classification decisions, as well as good accuracy, is required.
مقدمه انگلیسی
This paper examines the performance of Bayesian networks as classifiers, comparing their performance to that of the Naïve Bayes (NB) classifier and the Tree Augmented Naïve Bayes (TAN) classifier, both of which make strong assumptions about interactions between domain variables. In the experiments performed for this work, described below in Section 3, standard Bayesian networks (referred to as General Bayesian Networks, GBNs, to distinguish them from NB and TAN) are compared with NB and TAN classifiers on 28 standard benchmark datasets. Our experiments indicate that the GBN classifier is substantially better than NB, with performance closer to that of TAN. This contrasts with the conclusions drawn in the landmark paper on Bayesian network classifiers by Friedman et al. [14]. That paper presented results on many of the same datasets, showing that GBNs constructed using the minimum description length (MDL) score tend to perform no better than NB. That result has been widely noted by other authors (e.g. [16] and [18]); in one case the result was interpreted as indicating that NB “easily outperforms” GBN. Our contention is that it has become ‘accepted wisdom’ that GBN classification performance is no better than that of NB, and significantly worse than TAN (ignoring other considerations such as computational complexity or interpretability). Our results indicate that GBN’s classification performance is superior to that of NB and much closer to that of TAN, when the same parameter estimation procedure is used for all. It turns out that Friedman et al. used simple frequency counts for parameter estimation in constructing GBN classifiers, whereas they used parameter smoothing in constructing TAN classifiers (see Section 2.3 for details). Our experiments show that if frequency counts are used for both GBN and TAN, neither is much better than NB (Section 3.3, Fig. 5), but if parameter smoothing is used for both, they both perform similarly well (Fig. 4). Furthermore, since GBN classifiers are commonly constructed through heuristic search, it is possible for improved GBN construction algorithms to lead to improved performance. Full-size image (26 K) Fig. 5. Relative accuracies of: (a) unsmoothed GBN-HC vs. unsmoothed NB, (b) unsmoothed TAN vs. unsmoothed NB. Figure options Full-size image (11 K) Fig. 4. Relative accuracies of TAN and GBN-K2. Figure options The structure of the paper is as follows. Section 2 reviews Bayesian networks and the algorithms for constructing GBN and TAN classifiers that are used in this paper. Section 3 presents experiments applying NB, TAN and two GBN algorithms to classification problems on 28 standard datasets, and identifies why the results of this paper are at odds with those of Friedman et al. as mentioned above. Finally, Section 4 draws general conclusions about the suitability of GBNs as classifiers.
نتیجه گیری انگلیسی
The results of the preceding section have shown that, when TAN and GBN-K2 classifiers are compared under careful experimental procedures and using the same parameter estimation procedure for both, there is little to distinguish between them in terms of classification accuracy. An advantage of TAN is its low computational complexity, which is O(n2 N). However, for a fixed maximum number of parents per node, to complexity of the GBN-K2 algorithm is O(n3 N r), which is just a factor (n r) worse (here, r is the maximum number of different values any node may have). Nonetheless, if GBN classifiers are more expensive to construct than TAN classifiers and do not offer greater classification accuracy, why use them? There are other possible drawbacks of GBNs as classifiers: 1. It is often observed in Machine Learning that we should not solve a more general problem than required, so why build a full GBN if all that is required is a classifier? 2. GBNs are in general more complex than TAN classifiers, though not necessarily; in fact, as discussed below, A GBN classifier may end up with fewer arcs than a TAN classifier on the same domain, since not all nodes might be within the Markov blanket. 3. The GBN that best describes the domain as a whole does not necessarily correspond to the one that best discriminates between classes and the classification node might potentially be unconnected from the network. While aware of these drawbacks, we propose three reasons for their use: 1. Insightful analysis: In many practical domains, particularly where it is required to convince end-users such as scientists or engineers that classification decisions are reasonable and logical, it is as important to gain insight into the problem as it is to achieve high classification accuracy. GBN classifiers support this by modelling the distribution, allowing more complex interactions between nodes to be represented than with TAN and also potentially identifying nodes that are outside the classification node’s Markov blanket. They also aid in identifying conditional independencies in the data, which may also be useful for domain insight. 2. Representational power: Ling and Zhang [20] have examined the representational power of discrete BNs and have concluded that, if each node has at most u parents, a BN can represent parity functions of maximum order u. This implies, as would be expected, that GBN has greater representational power than TAN which in turn has greater representational power than NB. 3. Appropriate complexity: As noted above, a GBN classifier may have fewer arcs than a TAN classifier for the same domain. In TAN, nodes are must have the class node as a parent, and a full tree of arcs between non-class nodes, so all but two nodes have exactly two parents each. In GBN, there are no such constraints; a node may have no parents or several. On the Adult dataset for example, the typical GBN had 13 arcs with 0–3 parents per node, which is the same number of arcs as Naïve Bayes for the dataset, which has exactly one parent per node. The TAN classifier for the Adult dataset was more complex, with 25 arcs. On the Connect 4 dataset, Naïve Bayes has 13 arcs, TAN has 83 arcs and GBN has a median of 74 arcs. GBN approaches are not as widely used for classification tasks as TAN. Notable exceptions include the work of Cheng and Greiner [7], the application by Baesens et al. [2] of Monte Carlo Markov Chain search to constructing GBN classifiers, and Grossman and Domingos’ [16] algorithm for learning GBNs that maximize conditional likelihood. However, a larger number of researchers have analysed TAN and proposed improvements. Examples include the work of Keogh and Pazzani [18] who proposed the use of classification accuracy rather than conditional mutual information in building TAN-style classifiers; Zhang and Ling [24], who extended Keogh & Pazzani’s work by using AUC measures; Cerquides and de Mántaras [6], who identified theoretical weaknesses in the TAN approach and proposed corrections for them; and Garg and Roth [15] who addressed the question of why classifiers such as TAN that make often inaccurate assumptions tend to perform well. Although the experiments here have shown that the GBN-K2 algorithm has quite good classification performance, it is likely that other algorithms would perform even better. Given the relative complexity of GBN construction compared to TAN construction, improving the performance of GBN classifiers would appear to be a topic with some potential for research. A limitation of GBN-K2 is that it requires an ordering on nodes. In specific applications it may be possible to determine a reasonable node ordering from domain knowledge, but it would be interesting to analyse the performance of other algorithms that do not require node ordering. That being said, GBN-HC does not require node ordering and its performance on the test datasets was slightly weaker than that of GBN-K2, but its simple hill-climbing search without restarts is quite limited. In the future, it is hoped to analyse more sophisticated algorithms, particularly the algorithm of Silander and Myllymäki [23], which searches for a globally optimal network. In order to address the issue noted earlier in this section that the optimal GBN for a domain is not necessarily the optimal one for classification, it would be necessary to develop an approach that constructs a Markov blanket around the classification node. Overall, we believe that GBNs may deserve greater attention as classifiers, particularly in problem domains where data is plentiful and insight into the domain, as well as high accuracy, is required, although work remains to be done to optimize them for classification tasks.