English ISI article No. 24133
Persian translation of the article title

بهبود عملکرد شبکه های عصبی در طبقه بندی با استفاده از رگرسیون خطی فازی

English title
Improving the performance of neural networks in classification using fuzzy linear regression
Article code: 24133
Publication year: 2001
Length of the English article: 6 pages (PDF)
Source

Publisher: Elsevier - ScienceDirect

Journal: Expert Systems with Applications, Volume 20, Issue 2, February 2001, Pages 201–206

Keywords (Persian translation)
شبکه های عصبی - رگرسیون خطی فازی - طبقه بندی
Keywords (English)
Neural network, Fuzzy linear regression, Classification
Article preview

English abstract

In this paper, we apply fuzzy linear regression (FLR) with fuzzy interval analysis to a neural network classification model. The FLR works as a data handler and separates the data sample into two groups. By training two independent neural networks with these two groups, we can describe the distribution space of the corresponding data sample with two different functions rather than only one, which gives a better description. The experimental result shows that our approach improves the accuracy of classification.

English introduction

The classification problem is concerned with categorizing observations into different groups. The performance of the classification process depends on how well the discriminant function for the specific problem performs. A discriminant function is developed to minimize the misclassification rate on given samples of input and output vector pairs, which are referred to as the “training data set”. This discriminant function is then used to classify new observations into the previously defined groups and to test the accuracy of the prediction. In this research, we consider a binary classification problem, in which observations are classified into two groups. Although our results are obtained for the binary case, they can be easily generalized to the general case.
Many methods have been developed for the classification problem. They include neural networks (NN), multivariate discriminant analysis (MDA), decision trees, logistic regression, and k-nearest neighbor, among others (Bhattacharyya and Pendharkar, 1998; Han et al., 1996; Tam and Kiang, 1992). These methods have been applied and tested in different domains, such as managerial decisions, financial forecasting, bankruptcy prediction, image recognition, text-to-speech matching, and medical diagnosis (Han et al., 1996; Markham and Ragsdale, 1995; Salchenberger et al., 1992; Tam and Kiang, 1992; Wilson and Sharda, 1994). Although several research studies suggest that the neural network approach has higher classification ability than many other methods (Archer and Wang, 1993; Bhattacharyya and Pendharkar, 1998; Tam and Kiang, 1992; Wilson and Sharda, 1994), its predictive capability still has potential for further improvement. Han et al. (1996) indicate that the relative performance of different classification techniques may depend on the data conditions of the training data set.
Specifically, in a training data set, some observations may have similar input vector values but different output vector values. These observations are referred to as “bad” observations. If “bad” observations are used in the training process, they may adversely affect the performance of the resulting neural network. The objective of this study is to propose a way to improve the accuracy of the neural network by separating “good” data from “bad” data for training.
Our model comprises two phases. In Phase I, the fuzzy regression method with fuzzy interval analysis is applied. In Phase II, two simple backpropagation neural networks are constructed as the final classification engines. By using fuzzy linear regression with fuzzy interval analysis, we separate the training data into two groups based on the fuzzy interval. The two separated training data sets are then used to train two neural networks, one per group. With two neural networks, we formulate two different functions to describe the distribution space of the data. In our experiment, our model is compared with the conventional backpropagation neural network. The result shows that using two different functions to describe the distribution space of the observations gives a more accurate classification result.
The paper is organized as follows. In Section 2, Tanaka's model (Tanaka, Uejima & Asai, 1982), the modified Tanaka's model (Tanaka & Ishibuchi, 1992) with fuzzy interval analysis, and the multilayer feedforward backpropagation neural network are introduced. In Section 3, the sample data and the methodology used are described. In Section 4, the details of our model are explained. In Section 5, the results and the comparison between our model and the conventional backpropagation neural network are reported, and synthetic data are used to show the advantages of the new model. Finally, a conclusion is given in Section 6.
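The notion of “bad” observations described above (similar input vectors, different output values) can be illustrated with a small sketch. This is not the paper's procedure, which relies on FLR with fuzzy intervals. It is a simple pairwise check, assuming Euclidean distance as the similarity measure and a hypothetical `radius` threshold, and the function name `find_bad_observations` is invented for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def find_bad_observations(inputs, labels, radius):
    """Return sorted indices of observations that have a neighbour within
    `radius` (assumed similarity threshold) carrying a different label."""
    bad = set()
    for i in range(len(inputs)):
        for j in range(i + 1, len(inputs)):
            close = dist(inputs[i], inputs[j]) <= radius
            if close and labels[i] != labels[j]:
                bad.update((i, j))  # both members of the conflicting pair
    return sorted(bad)


# Toy binary data: the first two points nearly coincide but disagree on the label.
inputs = [(0.10, 0.20), (0.11, 0.21), (0.90, 0.80), (0.50, 0.50)]
labels = [0, 1, 1, 0]

print(find_bad_observations(inputs, labels, radius=0.05))  # prints [0, 1]
```

The check is quadratic in the number of observations, which is acceptable for small samples only. It serves to make the definition concrete and is not a substitute for the fuzzy interval analysis.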

English conclusion

This paper has proposed a new classification model. Phase I of our model finds a fuzzy interval that separates the training data into two groups, according to whether a data instance lies inside or outside the interval. The objective of Phase I is to minimize the effect of vague data in the training data by separating the certain data and the vague data into two groups. Using the fixed parameters and the estimated unknowns of the FLR with fuzzy interval model, the testing data sample is then also separated into two groups. In Phase II, two single-hidden-layer BPNN models are used to build the classification engines. The two independent NNs allow us to formulate two different non-linear discriminant functions to classify the data. The conventional method uses one NN to describe the distribution space of the data. Although an NN can formulate a highly non-linear function, it is hard for a single NN to describe a very complex distribution space. The new model provides two independent functions to describe the distribution space of the data sets; therefore, the ability to describe the distribution space is improved.
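The Phase I split described above can be sketched as follows. The sketch assumes a fuzzy linear model with symmetric triangular coefficients, a common form in Tanaka-style FLR, in which each coefficient has a center and a non-negative spread. The excerpt does not give the exact model, so the coefficient values, the `h` level parameter, and the function names are all hypothetical. The coefficients are taken as already estimated, and fitting them is outside the scope of this example.

```python
def fuzzy_interval(x, centers, spreads, h=0.0):
    """h-level output interval of a fuzzy linear model whose coefficients are
    symmetric triangular fuzzy numbers (assumed form, not taken from the paper)."""
    center = sum(a * xj for a, xj in zip(centers, x))
    spread = (1.0 - h) * sum(c * abs(xj) for c, xj in zip(spreads, x))
    return center - spread, center + spread


def split_by_interval(samples, centers, spreads, h=0.0):
    """Split labelled (x, y) samples by whether y lies inside the interval."""
    inside, outside = [], []
    for x, y in samples:
        low, high = fuzzy_interval(x, centers, spreads, h)
        (inside if low <= y <= high else outside).append((x, y))
    return inside, outside


# Hypothetical coefficients; the leading 1.0 in each x is the intercept term.
centers = [0.0, 1.0]
spreads = [0.2, 0.0]
samples = [((1.0, 0.1), 0), ((1.0, 0.9), 1), ((1.0, 0.5), 1), ((1.0, 0.4), 0)]

inside, outside = split_by_interval(samples, centers, spreads)
print(len(inside), len(outside))
```

Each of the two groups would then be used to train its own backpropagation network in Phase II. The sketch covers only the split of labelled data.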