Logistic regression using covariates obtained by product-unit neural network models
| Article code | Publication year | English article | Persian translation | Word count |
|---|---|---|---|---|
| 24724 | 2007 | 13-page PDF | Available to order | Not calculated |
Publisher : Elsevier - Science Direct
Journal : Pattern Recognition, Volume 40, Issue 1, January 2007, Pages 52–64
We propose a logistic regression method based on the hybridization of a linear model and product-unit neural network models for binary classification. In the first step, we use an evolutionary algorithm to determine the basic structure of the product-unit model; afterwards, we apply logistic regression in the new space of the derived features. This hybrid model has been applied to seven benchmark data sets and a new microbiological problem. The hybrid model outperforms both the linear part and the nonlinear part, obtaining a good compromise between them, and performs well compared to several other learning classification techniques. We obtain a binary classifier with very promising results in terms of classification accuracy and the complexity of the classifier.
There are many fields of study, such as medicine and microbiology, where it is very important to predict a binary response variable, or equivalently the probability of an event's occurrence, in terms of the values of a set of explanatory variables related to it. Therefore, in binary supervised learning problems, the goal is to learn how to distinguish between examples from two classes (herein labeled y = 0 and y = 1) on the basis of k observed predictor variables (also known as features or covariates) x1, x2, …, xk. The logistic regression (LR) model has been widely used in statistics for many years and has recently been the object of extensive study in the machine learning community. This traditional statistical tool arises from the desire to model the posterior probabilities of the class level, given an observation, via linear functions in the predictor variables. In this way, the LR model admirably serves the purpose of predicting a binary response variable, and it is the most widely used model in these cases. LR is a simple and useful procedure, although it poses problems when applied to real classification problems, where frequently we cannot make the stringent assumption of additive and purely linear effects of the covariates. A traditional technique to overcome these difficulties is to augment or replace the vector of inputs with additional variables, basis functions, which are transformations of the input variables, and then to use linear models in this new space of derived input features. The beauty of this method is that once the basis functions have been determined, the models are linear in these new variables and the fitting is a standard procedure. Methods like sigmoidal feed-forward neural networks, projection pursuit learning, generalized additive models and multivariate adaptive regression splines (MARS) can be seen as different choices of basis functions.
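The basis-function idea described above can be sketched in a few lines: expand the inputs with a fixed transformation and fit an ordinary logistic model in the derived space. The sketch below is illustrative rather than the paper's method; it uses a simple quadratic expansion and plain gradient ascent, and all names (`fit_logistic`, `quadratic_basis`) are our own.

```python
import numpy as np

def sigmoid(z):
    # Logistic link: maps a linear predictor to a probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def quadratic_basis(X):
    # Illustrative basis expansion: the original inputs plus all squares
    # and pairwise products (a second-order polynomial augmentation).
    cols = [X]
    k = X.shape[1]
    for i in range(k):
        for j in range(i, k):
            cols.append((X[:, i] * X[:, j])[:, None])
    return np.hstack(cols)

def fit_logistic(X, y, lr=0.5, n_iter=5000):
    # Maximum-likelihood logistic regression via batch gradient ascent;
    # a bias column is prepended and the coefficient vector returned.
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        beta += lr * Xb.T @ (y - sigmoid(Xb @ beta)) / len(y)
    return beta
```

Fitted on the XOR pattern, which no linear model in the raw inputs separates, the derived features (in particular the x1·x2 product) make the classes linearly separable in the new space, so the standard linear fitting machinery applies unchanged.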
The major drawback of these approaches is having to specify the number and typology of the corresponding basis functions. The simplest way to build basis functions is to augment the inputs with polynomial terms to achieve higher-order Taylor expansions, for example with quadratic terms and multiplicative interactions. Note, however, that the number of variables grows exponentially in the degree of the polynomial. Our approach overcomes the nonlinear effects of the covariates by proposing an LR model based on the hybridization of linear and product-unit models (LRLPU), introducing into the model nonlinear basis functions constructed as products of the inputs raised to arbitrary powers. These basis functions can express strong interactions between the covariates, since the exponents are not fixed and may even take real values. Moreover, we avoid the huge number of coefficients involved in the polynomial model. The nonlinear basis functions of the proposed model correspond to a special class of feed-forward neural network, namely product-unit neural networks (PUNNs), introduced by Durbin and Rumelhart. They are an alternative to standard sigmoidal neural networks and are based on multiplicative nodes instead of additive ones. The error surface associated with a PUNN is extremely convoluted, with numerous local optima and plateaus, because small changes in the exponents can cause large changes in the total error surface. The estimation of the coefficients is carried out in several steps. In the first step, an evolutionary algorithm (EA) is applied to design the structure and train the weights of a PUNN. The evolutionary process determines the number of basis functions in the model and the corresponding exponents. The complexity of the error surface of the proposed model justifies the use of an EA as part of the process of estimating the model coefficients. That step can be seen as a global search in the space of model coefficients.
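A single product unit is simple to state in code: each basis function is a product of the inputs raised to real-valued exponents. A minimal sketch with our own illustrative names, assuming strictly positive inputs so that real powers are well defined:

```python
import numpy as np

def product_unit(X, exponents):
    # One product-unit basis function: prod_i x_i ** w_i with real-valued
    # exponents w_i. Inputs must be strictly positive for real powers.
    return np.prod(X ** exponents, axis=1)

def product_unit_features(X, W):
    # One row of exponents per basis function; returns the matrix of
    # derived features on which the logistic model is then fitted.
    return np.column_stack([product_unit(X, w) for w in W])
```

For example, the exponent vector (2, -0.5) yields the basis function x1² / √x2, a strong interaction that a fixed-degree polynomial expansion cannot represent exactly; this is why product units avoid the coefficient blow-up of polynomial models.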
On the other hand, it is well known that EAs are efficient at exploring an entire search space; however, they are relatively poor at finding the precise optimum in the region to which the algorithm converges. In order to improve on this lack of precision, we use a local optimization algorithm in a second step. More precisely, once the basis functions have been determined by the EA, the model is linear in these new variables together with the initial covariates, and the fitting proceeds with the standard maximum-likelihood optimization method for LR. Finally, we apply a backward method to select the covariates that best explain the response. By controlling the number of coefficients in the final model we decrease the risk of building overly complex models that overfit the training data, and therefore obtain simpler models. It should be pointed out that most classification techniques are principally used to improve the precision of the classifier, while their comprehensibility and interest are of secondary importance. Comprehensibility, however, is becoming more and more important for researchers who need to be able to make a sensitivity analysis of each and every covariate of the model, which is why recent years have seen articles dealing specifically with comprehensibility and others that address it alongside precision. Thus, throughout this paper we will do our best to obtain the maximum precision in classification while maintaining the simplest models possible as far as the number of model coefficients is concerned. We evaluate the performance of our methodology on seven two-class data sets taken from the UCI repository and a microbiological classification problem.
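The two-step estimation can be sketched end to end: a global search over the exponent matrix, with each candidate scored by the likelihood of a logistic fit on the linear covariates plus the derived product-unit features. The toy (1+λ)-style mutation loop below merely stands in for the paper's full EA (which also evolves the number of basis functions), the backward covariate selection is omitted, and all names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_lr(F, y, lr=0.5, n_iter=500):
    # Step 2: standard maximum-likelihood logistic regression (batch
    # gradient ascent) on the derived feature matrix F, bias included.
    Fb = np.hstack([np.ones((F.shape[0], 1)), F])
    beta = np.zeros(Fb.shape[1])
    for _ in range(n_iter):
        beta += lr * Fb.T @ (y - sigmoid(Fb @ beta)) / len(y)
    return beta

def log_lik(F, y, beta):
    # Fitness of a candidate structure: log-likelihood of the fitted model.
    p = sigmoid(np.hstack([np.ones((F.shape[0], 1)), F]) @ beta)
    eps = 1e-12
    return float(np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def pu_features(X, W):
    # Product-unit basis functions; each row of W is a vector of exponents.
    return np.column_stack([np.prod(X ** w, axis=1) for w in W])

def evolve_exponents(X, y, n_basis=2, pop=10, gens=15, seed=0):
    # Step 1 (stand-in): (1+lambda)-style mutation of the exponent matrix,
    # keeping the candidate whose hybrid model has the highest likelihood.
    rng = np.random.default_rng(seed)
    best_W = rng.normal(0.0, 1.0, (n_basis, X.shape[1]))
    feats = np.hstack([X, pu_features(X, best_W)])
    best_fit = log_lik(feats, y, fit_lr(feats, y))
    for _ in range(gens):
        for _ in range(pop):
            W = best_W + rng.normal(0.0, 0.3, best_W.shape)
            feats = np.hstack([X, pu_features(X, W)])
            fit = log_lik(feats, y, fit_lr(feats, y))
            if fit > best_fit:  # keep only improving mutants
                best_fit, best_W = fit, W
    return best_W
```

The division of labor mirrors the text: the mutation loop is the cheap global search over the convoluted exponent surface, while the convex logistic fit supplies the precise local optimization the EA lacks.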
The empirical results show that the proposed hybrid method is very promising in terms of classification accuracy and simplicity, as well as very efficient in terms of the total number of coefficients and basis functions needed to construct the final binary classifiers, yielding state-of-the-art performance. It is interesting to point out that the proposed hybrid model outperforms both the linear model constructed by means of LR with the initial covariates and the nonlinear part of the model obtained with a logistic regression with all product-unit covariates (LRPU). In this way, the hybrid model (LRLPU) strikes a good balance between the linear and nonlinear parts. The paper is arranged as follows: Section 2 briefly reviews and discusses some related papers. Section 3 introduces LR and our model in depth. Section 4 describes the process of coefficient estimation. Section 5 introduces the data sets and explains the experiments carried out, and finally Section 6 summarizes the conclusions of our work.
English conclusion
In this paper, we focus on binary classification problems. We propose an LR method based on the hybridization of linear models and a special class of feed-forward neural network, namely product-unit neural networks. The nonlinear basis functions express the possible strong interactions between the covariates. The methodology for estimating the coefficients of the model combines an EA, which determines the basic structure of the product-unit model, with a local optimization procedure carried out by standard logistic regression to find the final model. We include a backward stepwise method to select the covariates that best explain the response. In this way, we control the number of coefficients in the final model and decrease the risk of building overly complex models that overfit the training data. The proposed model has been applied to seven benchmark data sets and a hard real-world microbiological problem. The results obtained outperform both the linear model constructed by means of logistic regression with the initial covariates and the nonlinear part of the model obtained with a logistic regression with all the product-unit covariates. In this way, the hybrid model strikes a good balance between the linear and nonlinear parts. Furthermore, the empirical results show that the proposed hybrid model is very promising in terms of classification accuracy and simplicity, as well as very efficient in terms of the total number of coefficients and basis functions needed to construct the final binary classifiers, yielding state-of-the-art performance. Moreover, we can present the best model for each classification problem. It is important to note that the number of coefficients of the best models is considerably lower than in the alternative techniques considered.