A model for parameter setting based on Bayesian networks
Article code | Publication year | Length of the English article |
---|---|---|
28613 | 2008 | 12 pages (PDF) |
Publisher: Elsevier - ScienceDirect
Journal : Engineering Applications of Artificial Intelligence, Volume 21, Issue 1, February 2008, Pages 14–25
English abstract
One of the difficulties that the user faces when using a model to solve a problem is that, before running the model, a set of parameter values has to be specified. Deciding on an appropriate set of parameter values is not an easy task. Over the years, several standard optimization methods, as well as various alternative approaches tailored to the problem at hand, have been proposed for parameter setting. These techniques have their merits and demerits, but they usually have a fairly restricted application range, owing to a lack of generality or the need for user supervision. This paper proposes a meta-model that generates recommendations about the best parameter values for the model of interest. Its main characteristic is that it is an automatic meta-model that can be applied to any model. For evaluation purposes, and in order to be able to compare our results with results obtained by others, a real geometric problem was selected. The experiments show the validity of the proposed adjustment model.
English introduction
There are a variety of situations in which a researcher is faced with a modeling problem. Modeling is a process through which a model M (function or algorithm) is constructed to explain the behavior of the system and to predict unknown answers. Once the model of the studied system is established based on a set of parameters Θ, the parameter values that make the model generate the best results must be determined. In this work, we are interested in the parameter setting of models used in the artificial intelligence (AI) field, such as artificial neural networks, genetic algorithms (GAs), clustering algorithms and so on (Duda et al., 2001). Parameter setting of a model M(Θ) can be carried out either by applying specific mechanisms to the model (Friedrichs and Igel, 2004) or by applying general optimization techniques used in system modeling (Fletcher, 2000 and Rao, 1996). In many other cases, the choice of the parameters involves consulting the specialized technical literature or simply resorting to trial and error. This work is useful in the engineering field, where a variety of AI models and techniques are used to solve a whole range of problems, but where such techniques are not known in depth. Our work could help in these situations. As an example to illustrate its usefulness, a graphic design problem was selected (the root identification problem).

On closer analysis, three approaches to the problem of parameter setting in AI can be distinguished: the evolutionary approach (used by evolutionary algorithms: GAs (Goldberg, 1989); evolution strategies (Schwefel, 1995); and evolutionary programming (Fogel, 1999)), the model selection approach and the statistical approach. The evolutionary approach is based on adapting the parameters during the evolutionary algorithm run. Parameter control techniques can be sub-divided into three types: deterministic, adaptive, and self-adaptive (Eiben et al., 1999).
In deterministic control, the parameters are changed according to deterministic rules without using any feedback from the search. Adaptive control takes place when there is some form of feedback that influences the parameter specification. Examples of adaptive control are the works of Davis (1989), Julstrom (1995) and Smith and Smuda (1995). Finally, self-adaptive control is based on the idea that evolution can also be applied in the search for good parameter values. In this type of control, the operator probabilities are encoded together with the corresponding solution, and undergo recombination and mutation. Self-adaptive evolution strategies (Beyer and Schwefel, 2002) are an example of the application of this type of parameter control.

In the model selection approach, a model selection criterion can be used to select parameters and, more generally, to compare and choose among models that may have a different capacity or competence. When there is only a single parameter, one can easily explore how its value affects the model selection criterion: typically one tries a finite number of values of the parameter and picks the one that gives the lowest value of the model selection criterion. Most model selection criteria have been proposed for selecting a single parameter that controls the "complexity" of the class of functions in which the learning algorithm finds a solution, e.g., structural risk minimization (Vapnik, 1982), the Akaike Information Criterion (Akaike, 1974), or the generalized cross-validation criterion (Craven and Wahba, 1979). Another type of criterion is based on held-out data, such as the cross-validation estimates of generalization error (Kohavi and John, 1995). These are almost unbiased estimates of generalization error (Vapnik, 1982) obtained by testing M(Θ) on data not used to choose the parameters Θ. This approach is less applicable when there are several parameters to optimize simultaneously.
In this case, the conditions that must be satisfied are more restrictive (Bengio, 2000).

Finally, the general goal of the statistical approach (or statistical estimation theory) is to estimate some unknown parameter from the observation of a set of random variables, referred to as "the data". Maximum likelihood (ML), maximum a posteriori (MAP), expectation maximization (EM) and hidden Markov models (HMMs) are examples of estimation techniques. Briefly, ML is a classical technique that estimates a parameter directly from the data, assuming that the data are distributed according to some parameterized probability distribution function (Jeffreys, 1983). The MAP technique is a Bayesian approach, i.e., a priori available statistical information on the unknown parameters is also exploited for their estimation (DeGroot, 1970). When it is assumed that the data are drawn from a mixture of parameterized probability distribution functions, where not just the parameters but also the mixture components have to be estimated, the EM algorithm is useful (Dempster et al., 1977, Redner and Walker, 1984 and Jordan and Jacobs, 1994). Lastly, HMMs assume that the mixtures evolve in time according to a Markov process (Rabiner, 1989).

In this paper we suggest the use of a meta-model M* that generates recommendations about the best parameter values Θ for the model of interest M and that differs from other systems in that it is:
• general, in the sense of being a system that can be applied to any model M(Θ);
• automatic, in the sense of being a system built from a set of data, without any supervision by the model user.
More generally, meta-modeling is the analysis, construction and development of the frames, rules, constraints, models and theories applicable and useful for modeling in a predefined class of problems.
In the context of this work, a model can be viewed as an abstraction of phenomena in the real world, and a meta-model is yet another abstraction, highlighting properties of the model itself. The rest of the work is focused on defining such a system for parameter setting. As will be shown in the following sections, Bayesian networks (BNs) are identified as a suitable formalism to define the meta-model M*. Moreover, the proposed model has the added benefit of removing the need for, or at least reducing the effect of, user decisions about parameter values.

The structure of the paper is as follows. Section 2 is devoted to describing the proposed model for parameter setting. In Section 3, notions about BNs are briefly presented, including learning from databases, and we outline their suitability for building a framework for parameter setting. After this, Section 4 describes the problem used throughout the paper to illustrate the application of the proposed adjustment model, the experiments carried out to evaluate it and the results attained. Finally, Section 5 presents some conclusions and directions for future work.
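
The model selection approach described in the introduction (try a finite number of values of a single parameter and keep the one with the lowest criterion value on held-out data) can be illustrated with a minimal sketch. The model used here (a k-nearest-neighbour regressor), the synthetic data and all function names are assumptions chosen for illustration; none of them comes from the paper.

```python
import random

def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def cv_error(data, k, folds=5):
    """Mean squared error of k-NN, estimated by `folds`-fold cross-validation."""
    total, count = 0.0, 0
    for f in range(folds):
        held_out = data[f::folds]  # points whose index i satisfies i % folds == f
        train = [p for i, p in enumerate(data) if i % folds != f]
        for x, y in held_out:
            total += (knn_predict(train, x, k) - y) ** 2
            count += 1
    return total / count

def select_parameter(data, candidates, folds=5):
    """Try each candidate value and keep the one with the lowest CV error."""
    scores = {k: cv_error(data, k, folds) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic, illustrative data: a noisy quadratic sampled at 60 points.
rng = random.Random(0)
data = [(x / 10, (x / 10) ** 2 + rng.gauss(0, 0.5)) for x in range(60)]

best_k, scores = select_parameter(data, candidates=[1, 3, 5, 9, 15])
print("selected k:", best_k)
```

With a single parameter this exhaustive search is cheap. With several parameters the number of candidate combinations grows multiplicatively, which is one practical reason the approach becomes less attractive in that setting.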
English conclusion
In this paper a meta-model for parameter setting based on BNs has been presented. The adjustment model has been designed with the aim of reducing the effort of the user who needs to optimize the parameters Θ of the model M(Θ) that solves the problem P. The main novelty of the proposed model is that it is general, i.e., domain independent, and automatic, that is to say, no user supervision is necessary. Another advantage is that the proposed model is easy to implement. It was also observed that basing the adjustment model on BNs may be an appropriate solution. Learning a BN from data allows the acquisition of the knowledge implicit in the domain variables and the successive updating of this knowledge as new cases are collected. Also, the inference mechanism over a BN allows recommendations about the best parameter values to be given under uncertain conditions.

Our experiments showed that the proposed model could be successfully applied to GAs that operate as a selector mechanism in the root identification problem. The results obtained were similar to those provided by Barreiro's work and to those extracted from an analysis of the database. Compared with these other two methods (STAT and DB), the adjustment model has the advantage of being automatic. Moreover, it does not require user supervision and is adaptive, so that the more data are collected, the more precise the results are.

As for future work, and taking into account that the right parameter values of the model M(Θ) may change according to the problem at hand, it would be interesting if the described adjustment model allowed these values to be obtained from characteristics of the problem. This might be an effective way of extending the utility of the proposed model. To put this idea into practice, an adequate solution would be to combine the adjustment model with a database of several instances of the same problem, in such a way that each case deals with a specific BN.
In this sense, the case-based reasoning (CBR) methodology is considered a suitable mechanism for guiding that integration (Aamodt and Plaza, 1994, Kolodner, 1993 and Watson and Marir, 1994). CBR systems have produced good results in other domains when solving similar problems. They are therefore considered an appropriate methodology for setting the parameters of a model, provided that different instances of the particular problem to be solved are available.
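
The idea of learning a probabilistic model from recorded runs and then querying it for recommended parameter values can be sketched as follows. This is not the authors' implementation: it assumes a fixed, very simple network structure (the outcome node is the sole parent of every parameter node, as in naive Bayes) instead of learning the structure from data, and the parameter names, value labels and records are hypothetical.

```python
from collections import Counter, defaultdict

def learn_cpts(records, alpha=1.0):
    """Estimate P(value | outcome) for every parameter, with Laplace smoothing.

    records: list of (params, outcome) pairs collected from past runs, where
    params maps each parameter name to a discrete value.
    Assumed structure: outcome -> each parameter (fixed, not learned).
    """
    domains = defaultdict(set)
    counts = defaultdict(Counter)  # (parameter, outcome) -> value counts
    outcome_counts = Counter()
    for params, outcome in records:
        outcome_counts[outcome] += 1
        for name, value in params.items():
            domains[name].add(value)
            counts[(name, outcome)][value] += 1
    cpts = {}
    for name, values in domains.items():
        for outcome, n in outcome_counts.items():
            denom = n + alpha * len(values)
            cpts[(name, outcome)] = {
                v: (counts[(name, outcome)][v] + alpha) / denom for v in values
            }
    return cpts

def recommend(cpts, target="good"):
    """Most probable value of each parameter given the evidence outcome=target."""
    return {
        name: max(table, key=table.get)
        for (name, outcome), table in cpts.items()
        if outcome == target
    }

# Hypothetical records of GA runs: discretized parameter values and run quality.
records = [
    ({"population_size": "large", "mutation_rate": "low"}, "good"),
    ({"population_size": "large", "mutation_rate": "low"}, "good"),
    ({"population_size": "large", "mutation_rate": "high"}, "bad"),
    ({"population_size": "small", "mutation_rate": "low"}, "good"),
    ({"population_size": "small", "mutation_rate": "high"}, "bad"),
    ({"population_size": "small", "mutation_rate": "low"}, "bad"),
]

cpts = learn_cpts(records)
print(recommend(cpts))
```

Because the parameters are conditionally independent given the outcome under this assumed structure, the most probable configuration can be read off one parameter at a time. Calling `learn_cpts` again on an extended list of records updates the tables as new cases are collected, which mirrors the adaptive behaviour described in the conclusion.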