Fuzzy estimation of regression parameters in linear regression for imprecise input and output data
| Article code | Publication year | Length of English article |
|---|---|---|
| 24154 | 2003 | 15 pages (PDF) |
Publisher: Elsevier - Science Direct
Journal: Computational Statistics & Data Analysis, Volume 42, Issues 1–2, 28 February 2003, Pages 203–217
Abstract
A method is proposed for obtaining fuzzy estimates of regression parameters with the help of the “Resolution Identity” in fuzzy set theory. The α-level least-squares estimates are obtained from the usual linear regression model by using the α-level real-valued data of the corresponding fuzzy input and output data. The membership functions of the fuzzy estimates of the regression parameters are then constructed from these α-level least-squares estimates according to the form of the “Resolution Identity”. To obtain the membership degree of any given value taken from a fuzzy estimate, optimization problems have to be solved. Two computational procedures are provided to solve these optimization problems.
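As a compact restatement of the construction described in the abstract, the display below writes out Zadeh's resolution identity and one way it could be applied to a regression estimate. The symbols $\hat\beta^{L}_{j\alpha}$, $\hat\beta^{U}_{j\alpha}$, $A_\alpha$ and $\mu$ are notation assumed here for illustration, as is the pairing of lower endpoints with lower endpoints; the paper's own equations are not reproduced on this page.

```latex
% Resolution identity: a fuzzy set \tilde{a} is recovered from its \alpha-level sets \tilde{a}_\alpha.
\mu_{\tilde{a}}(r) \;=\; \sup_{\alpha \in [0,1]} \alpha \cdot 1_{\tilde{a}_\alpha}(r)

% Assumed application to the j-th regression parameter:
% \hat\beta^{L}_{j\alpha} and \hat\beta^{U}_{j\alpha} are ordinary least-squares estimates
% computed from the lower and the upper endpoints of the \alpha-level sets of the data.
A_\alpha \;=\; \Bigl[\min\bigl\{\hat\beta^{L}_{j\alpha}, \hat\beta^{U}_{j\alpha}\bigr\},\;
                     \max\bigl\{\hat\beta^{L}_{j\alpha}, \hat\beta^{U}_{j\alpha}\bigr\}\Bigr],
\qquad
\mu_{\tilde{\beta}_j}(r) \;=\; \sup_{\alpha \in [0,1]} \alpha \cdot 1_{A_\alpha}(r)
```

Under this reading, evaluating the membership degree at a given value $r$ amounts to finding the largest $\alpha$ for which $r \in A_\alpha$, which is the kind of optimization problem the abstract refers to.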
Introduction
In the real world, data sometimes cannot be recorded or collected precisely. For instance, the water level of a river cannot be measured exactly because it fluctuates, and the temperature in a room cannot be measured precisely for a similar reason. Fuzzy set theory is therefore a natural tool for building statistical models when fuzzy data have been observed. A more appropriate way to describe the water level is to say that it is around a given value; the phrase “around” that value can be regarded as a fuzzy number. This is the main concern of this paper.
Since Zadeh (1965) introduced the concept of fuzzy sets, applications of fuzzy data to regression models have been proposed in the literature. Tanaka et al. (1982) initiated this research topic and later generalized their approach in Tanaka and Watada (1988), Tanaka et al. (1989), and Tanaka and Ishibuchi (1991). The collection of papers edited by Kacprzyk and Fedrizzi (1992) gave an insightful survey. Tanaka et al. (1982) considered L-R fuzzy data and minimized the index of fuzziness of the fuzzy linear regression model. Yager (1982) used a linguistic variable to represent imprecise information in regression models. Moskowitz and Kim (1993) proposed a method to assess the H-value in the fuzzy linear regression model of Tanaka et al. (1982). Redden and Woodall (1994) compared various fuzzy regression models and described the differences between fuzzy regression analysis and usual regression analysis; they also pointed out some weaknesses of the approaches proposed by Tanaka et al. Chang and Lee (1994) pointed out another weakness of those approaches. Wang and Tsaur (2000) proposed a new model to improve the predictability of Tanaka's model.
Bárdossy (1990) proposed many different measures of fuzziness, each to be minimized subject to some suggested constraints. Peters (1994) introduced a new fuzzy linear regression model based on Tanaka's approach by considering a fuzzy linear programming problem. Diamond (1988) introduced a metric on the set of fuzzy numbers by invoking the Hausdorff metric on the compact α-level sets, and used this metric to define a least-squares criterion function that is minimized in the usual sense. Ma et al. (1997) generalized Diamond's approach by embedding the set of fuzzy numbers into a Banach space isometrically and isomorphically. Näther (1997, 2000), Näther and Albrecht (1990), and Körner and Näther (1998) introduced the concept of random fuzzy sets (fuzzy random variables) into the linear regression model and developed an estimation theory for the parameters. Chang and Ayyub (2001) described the differences between fuzzy regression and ordinary regression analysis, and Kim et al. (1996) compared fuzzy regression and statistical regression conceptually and empirically. Chang (2001) proposed a method for hybrid fuzzy least-squares regression by defining weighted fuzzy arithmetic and using the well-accepted least-squares fitting criterion. Celminš (1987, 1991) proposed a methodology for fitting differentiable fuzzy model functions by minimizing a least-squares objective function. Chang and Lee (1996) proposed a fuzzy regression technique based on the least-squares approach to estimate the modal value and the spreads of L-R fuzzy numbers. Dunyak and Wunsch (2000) described a method for nonlinear fuzzy regression using a special training technique for fuzzy number neural networks. D'Urso and Gastaldi (2000) proposed a doubly linear adaptive fuzzy regression model based on a core regression model and a spread regression model.
D'Urso (2002) also developed unconstrained and constrained least-squares estimation procedures. Jajuga (1986) calculated the linear fuzzy regression coefficients using a generalized version of the least-squares method, by considering a fuzzy classification of a set of observations and obtaining homogeneous classes of observations. Kim and Bishu (1998) used the criterion of minimizing the difference in membership degrees between the observed and estimated fuzzy numbers. Sakawa and Yano (1992) introduced three indices for equalities between fuzzy numbers, and from these three indices formulated three types of multiobjective programming problems. Tanaka and Lee (1998) used a quadratic programming approach to obtain the possibility and necessity regression models simultaneously. The advantage of a quadratic programming approach is that it integrates the central-tendency property of least squares with the possibilistic property of fuzzy regression.
In this paper, we first obtain the α-level least-squares estimates from the usual linear regression model by using the α-level real-valued data of the corresponding fuzzy input and output data, and then construct the fuzzy estimates of the regression parameters according to the form of the “Resolution Identity” in fuzzy set theory, which was introduced by Zadeh (1975). To obtain the membership degree of any value taken from a fuzzy estimate, optimization problems have to be solved, and we develop two computational procedures to solve them. In Section 2, we give some properties of fuzzy numbers. In Section 3, we obtain the α-level least-squares estimates from the usual linear regression model by using the α-level real-valued data of the corresponding fuzzy input and output data. The membership functions of the fuzzy estimates are then constructed from these α-level least-squares estimates according to the form of the “Resolution Identity”.
In Section 4, we develop two computational procedures to obtain the membership degree of any given value taken from the fuzzy estimates. We also provide a methodology for handling the predicted fuzzy output data. In Section 5, numerical examples are given to clarify the theoretical results and to show possible applications in linear regression analysis for imprecise data.
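The pipeline outlined above (α-level data, ordinary least squares at each level, then a membership degree obtained by optimizing over α) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the data are assumed to be triangular fuzzy numbers given as `(left, core, right)`, the model is a single regressor with intercept, lower endpoints are paired with lower endpoints and upper with upper, and a plain grid search over α stands in for the paper's two computational procedures, which are not described on this page. All function names are hypothetical.

```python
import numpy as np


def alpha_cut(tri, alpha):
    """Endpoints of the alpha-level sets of triangular fuzzy numbers.

    tri has shape (n, 3) with columns (left, core, right); this
    triangular representation is an assumption of the sketch.
    """
    tri = np.asarray(tri, dtype=float)
    lower = tri[:, 0] + alpha * (tri[:, 1] - tri[:, 0])
    upper = tri[:, 2] - alpha * (tri[:, 2] - tri[:, 1])
    return lower, upper


def ols(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns array([b0, b1])."""
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def alpha_level_estimates(x_tri, y_tri, alpha):
    """Fit once on the lower endpoints and once on the upper endpoints."""
    x_lo, x_up = alpha_cut(x_tri, alpha)
    y_lo, y_up = alpha_cut(y_tri, alpha)
    return ols(x_lo, y_lo), ols(x_up, y_up)


def membership_degree(r, j, x_tri, y_tri, n_grid=101, tol=1e-9):
    """Approximate sup{alpha : r in A_alpha} for parameter j by grid search.

    A_alpha is the interval spanned by the lower and upper alpha-level
    estimates of parameter j. Returns 0.0 if r lies in no such interval.
    The grid search is a stand-in, not one of the paper's procedures.
    """
    best = 0.0
    for alpha in np.linspace(0.0, 1.0, n_grid):
        b_lo, b_up = alpha_level_estimates(x_tri, y_tri, alpha)
        left, right = min(b_lo[j], b_up[j]), max(b_lo[j], b_up[j])
        if left - tol <= r <= right + tol:
            best = max(best, float(alpha))
    return best
```

A call such as `membership_degree(0.5, 0, x_tri, y_tri)` returns the approximate membership degree of the value 0.5 in the fuzzy estimate of the intercept. The resolution of the answer is limited by `n_grid`, so a finer grid or a one-dimensional search over α would be needed for higher accuracy.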
Conclusion
We have obtained the fuzzy estimates of regression parameters with the help of the “Resolution Identity”. That is, the fuzzy estimates are constructed from the α-level least-squares estimates, which use the α-level real-valued data of the corresponding fuzzy input and output data. To obtain the membership degree of any given value taken from the fuzzy estimates of the regression parameters, optimization problems have to be solved, and we propose two computational procedures to solve them. Finally, a numerical example is provided to clarify the theoretical discussion.
One of the referees pointed out some weak points. Consider the simplest regression problem Y=βX. Given a single pair of triangular fuzzy data with . Then the fuzzy estimate degenerates to the real number 1. This situation will rarely occur, for the following reason. From Eq. (4), the fuzzy estimates degenerate to crisp numbers (real numbers) if for all α∈[0,1]. We denote the matrices by , , and . Then from (1), the estimate is the jth element of the vector and is the jth element of the vector . Therefore, for all α∈[0,1] will rarely occur. If it does happen, we can still take this crisp number as a fuzzy estimate, since a real number is just a degenerate case of a fuzzy number. Alternatively, we can change some of the fuzzy data slightly without changing their core values to avoid this situation, for example by changing their spread values a little. On the other hand, for the same problem Y=βX the estimate is a genuine fuzzy number when the single pair of triangular fuzzy data is taken from and . This example shows that different fuzzy data sets (with the same fuzziness) for the same problem may yield extremely different fuzzy estimates: the crisp case and the fuzzy case, respectively.
Hence the fuzziness of the fuzzy estimates depends not only on the fuzziness of the fuzzy data but also on the core values (the position) of the data. This seemingly strange situation can be explained by the fact that the fuzzy estimate is a fuzzy number. When we talk about a number, we should regard it as a whole (a single united item). It is therefore natural to treat the fuzzy number (i.e. the fuzzy estimate) as a whole rather than considering its core value and its fuzziness separately. From this viewpoint, different fuzzy data sets give different fuzzy estimates (fuzzy numbers), including the crisp case, since a crisp number is just a special case of a fuzzy number.
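The dependence on core position can be checked numerically for the one-observation model Y=βX. The sketch below is illustrative only: the referee's actual data are not legible in this copy, so the triangular numbers `(1, 2, 3)` and `(3, 4, 5)` are hypothetical values chosen to have identical spreads. It also assumes positive supports and that the lower and upper α-level estimates are computed from the lower and upper endpoints, respectively.

```python
def alpha_cut(tri, alpha):
    """Endpoints of the alpha-level set of a triangular fuzzy number (left, core, right)."""
    left, core, right = tri
    return left + alpha * (core - left), right - alpha * (right - core)


def beta_interval(x_tri, y_tri, alpha):
    """Interval spanned by the two alpha-level estimates for Y = beta * X.

    Assumes a single observation with strictly positive support.
    """
    x_lo, x_up = alpha_cut(x_tri, alpha)
    y_lo, y_up = alpha_cut(y_tri, alpha)
    # With one data pair, least squares gives beta = x * y / x**2 = y / x.
    b_lo, b_up = y_lo / x_lo, y_up / x_up
    return min(b_lo, b_up), max(b_lo, b_up)


# Hypothetical data: same spreads in both cases, different core for Y.
for alpha in (0.0, 0.5, 1.0):
    same_core = beta_interval((1, 2, 3), (1, 2, 3), alpha)
    shifted_core = beta_interval((1, 2, 3), (3, 4, 5), alpha)
    print(alpha, same_core, shifted_core)
```

With X and Y both equal to `(1, 2, 3)`, the interval collapses to the single point 1 at every level, so the estimate is crisp. Shifting only the core of Y to 4 gives an interval that narrows from [5/3, 3] at α=0 to the point 2 at α=1, so the estimate is a genuine fuzzy number even though the spreads are unchanged.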