Nonparametric bivariate copula estimation based on shape-restricted support vector regression
Article code | Publication year | Length of the English article |
---|---|---|
25777 | 2012 | 10 pages (PDF) |
Publisher : Elsevier - ScienceDirect
Journal : Knowledge-Based Systems, Volume 35, November 2012, Pages 235–244
English abstract
Copulas have become a standard tool for describing dependence between random variables. This paper proposes a nonparametric bivariate copula estimation method based on shape-restricted ϵ-support vector regression (ϵ-SVR). The method explicitly supplements the classical ϵ-SVR with constraints for three shape restrictions (grounded, marginal and 2-increasing), which together are necessary and sufficient for a bivariate function to be a copula. The method can be reformulated as a convex quadratic program, which is computationally tractable. Experiments on five artificial data sets and on three international stock indexes showed that it achieves significantly better performance than common parametric models and the kernel smoother.
English introduction
Since [1], the copula has become a standard tool for dependence modeling in multivariate statistical analysis [2] and [3]. A copula summarizes all dependence information between random variables and separates the marginal components of a joint distribution from its dependence structure. In the last 20 years, copulas have been widely applied in areas such as engineering, finance, insurance and economics; see, for example, the recent monographs [4], [5] and [6] (and references therein).

In what follows we focus on the bivariate copula for simplicity. Let $X_1$ and $X_2$ be two continuous random variables of interest with joint distribution function $H$ and marginal distributions $F_1$ and $F_2$, respectively. When the marginal distributions $F_1$ and $F_2$ are continuous, Sklar's theorem [1] ensures that there exists a unique copula function $C:[0,1]^2 \to [0,1]$ that satisfies

$$H(x_1, x_2) = C\big(F_1(x_1), F_2(x_2)\big).$$

With the copula technique, the estimation of the joint distribution $H$ separates into two steps: construction of the marginal distributions and estimation of the copula. Suppose $H$ is to be estimated from i.i.d. observations drawn from $H$: $(x_{i1}, x_{i2}) \in \mathbb{R}^2$, $i = 1, \ldots, T$. The first step estimates the two marginal distributions, $F_1$ and $F_2$, from the data sets $x_{i1}$, $i = 1, \ldots, T$, and $x_{i2}$, $i = 1, \ldots, T$, respectively. Let $\hat{F}_1$ and $\hat{F}_2$ be the estimated cumulative distribution functions of $X_1$ and $X_2$. The second step estimates the copula function $C$ from the data set $\big(\hat{F}_1(x_{i1}), \hat{F}_2(x_{i2})\big)$, $i = 1, \ldots, T$. Let $\hat{C}$ denote the estimate of $C$; the estimate of $H$ is then

$$\hat{H}(x_1, x_2) = \hat{C}\big(\hat{F}_1(x_1), \hat{F}_2(x_2)\big).$$

Because univariate cumulative distribution estimation has been extensively researched in statistics, this paper focuses on copula estimation and does not address the first step. Most often, the copula is obtained by Maximum Likelihood Estimation (MLE).
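The two-step procedure described above starts by mapping each observation to a pair of estimated marginal probabilities. The following is a minimal sketch of that mapping, not the paper's procedure: the function name `pseudo_observations` and the rescaled-rank estimator rank / (T + 1) are assumptions made for illustration, since the text leaves the choice of marginal estimator open.

```python
def pseudo_observations(sample):
    """Map observations (x_i1, x_i2) to (F1_hat(x_i1), F2_hat(x_i2)).

    Assumption: each marginal CDF is estimated by the rescaled rank
    rank / (T + 1); the source does not fix a particular estimator.
    Ties are not handled, since the variables are assumed continuous.
    """
    T = len(sample)
    columns = list(zip(*sample))           # one tuple of values per variable
    scaled = []
    for col in columns:
        # Indices of the observations sorted by value of this variable.
        order = sorted(range(T), key=lambda i: col[i])
        ranks = [0.0] * T
        for rank, i in enumerate(order, start=1):
            ranks[i] = rank / (T + 1)      # strictly inside (0, 1)
        scaled.append(ranks)
    return list(zip(*scaled))              # back to one pair per observation


sample = [(3.0, 10.0), (1.0, 30.0), (2.0, 20.0)]
u = pseudo_observations(sample)
print(u)  # [(0.75, 0.25), (0.25, 0.75), (0.5, 0.5)]
```

The resulting pairs live in the unit square and are the input to the second step, copula estimation.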
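Assume $C$ comes from a copula family indexed by a real-valued parameter $\theta$. The MLE for $\theta$ can be obtained by

$$\hat{\theta} = \arg\max_{\theta} \sum_{i=1}^{T} \log c\big(\hat{F}_1(x_{i1}), \hat{F}_2(x_{i2}); \theta\big),$$

where $c(u_1, u_2; \theta) = \partial^2 C(u_1, u_2; \theta) / \partial u_1 \partial u_2$ is the copula density. Commonly used copula families in practice are:

• Gaussian

$$C(u_1, u_2; \rho) = \Phi_\rho\big(\Phi^{-1}(u_1), \Phi^{-1}(u_2)\big),$$

where $\rho \in (-1, 1)$, $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution, and $\Phi_\rho(\cdot, \cdot)$ is the bivariate normal distribution function with standard normal marginal distributions and correlation $\rho$.

• Student's t

$$C(u_1, u_2; \rho, \nu) = t_{\rho,\nu}\big(t_\nu^{-1}(u_1), t_\nu^{-1}(u_2)\big),$$

where $\rho \in (-1, 1)$, $\nu \in \mathbb{N}$, $t_\nu(\cdot)$ is the Student's t distribution function with $\nu$ degrees of freedom, and $t_{\rho,\nu}$ is the bivariate Student's t distribution function with correlation $\rho$ and $\nu$ degrees of freedom.

• Clayton

$$C(u_1, u_2; \theta) = \big(u_1^{-\theta} + u_2^{-\theta} - 1\big)^{-1/\theta}, \quad \theta > 0.$$

Although the MLE method has tractable computational complexity and nice asymptotic statistical properties, its performance depends heavily on the assumed copula family. For example, financial risk will be greatly underestimated if the dependence between financial returns is assumed to follow the Gaussian copula, which has zero lower tail dependence. Indeed, the Gaussian copula assumption in the pricing of Collateralized Debt Obligations (CDOs) [7] has been criticized as one of the key reasons behind the global 2008–2009 Subprime Crisis.

To overcome the model specification error of MLE, some nonparametric methods have been proposed to estimate the underlying copula. The empirical copula, introduced by [8] and [9], extends the idea of the empirical distribution of a univariate variable to the copula as follows:

$$\hat{C}(u_1, u_2) = \frac{1}{T} \sum_{i=1}^{T} \mathbf{1}\big\{\hat{F}_1(x_{i1}) \le u_1,\ \hat{F}_2(x_{i2}) \le u_2\big\}. \tag{9}$$

Because the empirical copula is highly discontinuous and wiggly, several methods have been proposed to smooth it, including the kernel smoother [10], splines [11], [12], [13] and [14] and wavelets [15], [16] and [17]. However, nonparametric copula estimation with an explicit 2-increasing shape restriction can hardly be found.

One obvious obstacle for nonparametric copula estimation is that the estimators are often not valid copula functions. The necessary and sufficient conditions for a bivariate function $C:[0,1]^2 \to [0,1]$ to be a copula are:

• Grounded: for all $u_1, u_2 \in [0, 1]$,

$$C(u_1, 0) = C(0, u_2) = 0. \tag{10}$$

• Marginal: for all $u_1, u_2 \in [0, 1]$,

$$C(u_1, 1) = u_1, \quad C(1, u_2) = u_2. \tag{11}$$

• 2-increasing: for all $0 \le u_1 \le u_1' \le 1$ and $0 \le u_2 \le u_2' \le 1$,

$$C(u_1', u_2') - C(u_1', u_2) - C(u_1, u_2') + C(u_1, u_2) \ge 0. \tag{12}$$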
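The three shape restrictions (grounded, marginal and 2-increasing) can be checked numerically on a grid. The sketch below is illustrative only: the helper names `empirical_copula`, `clayton` and `check_shape`, the grid size `m` and the tolerance `tol` are assumptions, and a grid check can only detect violations at the grid points, not prove validity everywhere.

```python
def empirical_copula(u_obs):
    """Empirical copula built from pseudo-observations (u_i1, u_i2)."""
    T = len(u_obs)

    def C_hat(u1, u2):
        # Fraction of observations falling in the rectangle [0, u1] x [0, u2].
        return sum(1 for a, b in u_obs if a <= u1 and b <= u2) / T

    return C_hat


def clayton(theta):
    """Clayton copula with parameter theta > 0."""
    def C(u1, u2):
        if u1 == 0.0 or u2 == 0.0:
            return 0.0                      # limit value on the lower edges
        return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)

    return C


def check_shape(C, m=10, tol=1e-9):
    """Return (grounded, marginal, two_increasing) on an (m+1) x (m+1) grid."""
    grid = [k / m for k in range(m + 1)]
    grounded = all(abs(C(g, 0.0)) <= tol and abs(C(0.0, g)) <= tol
                   for g in grid)
    marginal = all(abs(C(g, 1.0) - g) <= tol and abs(C(1.0, g) - g) <= tol
                   for g in grid)
    # The C-volume of every grid cell must be nonnegative; volumes of larger
    # grid-aligned rectangles are sums of cell volumes.
    two_increasing = all(
        C(grid[i + 1], grid[j + 1]) - C(grid[i + 1], grid[j])
        - C(grid[i], grid[j + 1]) + C(grid[i], grid[j]) >= -tol
        for i in range(m) for j in range(m))
    return grounded, marginal, two_increasing


print(check_shape(clayton(2.0)))
u_obs = [(0.75, 0.25), (0.25, 0.75), (0.5, 0.5)]
print(check_shape(empirical_copula(u_obs)))
```

On this tiny hypothetical sample the Clayton copula passes all three checks, while the empirical copula passes the grounded and 2-increasing checks but fails the marginal one, which illustrates why a nonparametric estimator may need explicit shape constraints.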
Unlike MLE, whose assumed functional form ensures that its estimator is valid, nonparametric methods must explicitly append constraints for the above three shape restrictions. These shape restrictions can be regarded as prior knowledge that can be exploited to improve fitting performance. Their contribution is especially clear when the training set is small, in which case common nonparametric estimators are very likely to violate the shape restrictions.

Shape-restricted regression dates back to the literature on isotonic regression [18] and [19]. There is a large literature on estimating monotone, concave or convex regression functions. Because some of these estimators are not smooth, much effort has been devoted to finding simple, smooth and efficient estimators of a shape-restricted regression function. Typical applications of shape-restricted regression include the study of dose-response experiments in medicine and the study of utility functions, production functions, profit functions and cost functions in microeconomics. Shape-restricted regression has been incorporated into wavelets [20], splines [21] and [22] and Bernstein polynomials [23] and [24].

In the machine learning area, shape restrictions have also been incorporated into support vector regression, for example monotone least squares support vector regression [25], monotone kernel quantile regression [26], boundary derivatives kernel regression [27] and convex support vector regression [28]. Among them, [28] can tackle multivariate regression, while the others apply only to univariate regression. [29] solved a support vector regression problem with prior knowledge, but it can only handle monotonicity or concavity at certain points.

The main contribution of this paper is to propose a nonparametric copula estimation method based on shape-restricted ϵ-support vector regression.
In this paper, the method is called Kernel Copula Regression (KCR). The estimator is obtained by fitting a bivariate function with ϵ-support vector regression to the samples $(u_i, C_i)$, $i = 1, \ldots, T$, where $u_i = \big(\hat{F}_1(x_{i1}), \hat{F}_2(x_{i2})\big)$ and $C_i$ is the corresponding value of the empirical copula (9). To make the estimator satisfy the three shape restrictions (10), (11) and (12), additional constraints are explicitly imposed on the classical ϵ-support vector regression [30] and [31]. To make the KCR estimator grounded and marginal, equality constraints are imposed on points spaced evenly on the four boundaries. To make it 2-increasing, constraints requiring a nonnegative second-order mixed derivative are imposed on equidistant grid points of $[0, 1]^2$. We expect to obtain a better estimate by spanning a network of grounded, marginal and 2-increasing points.

KCR has several advantages. First, it is a nonparametric estimation method, which can handle any complex dependence structure. Second, the estimator is smooth, which is a clear advantage over the empirical copula estimator. Third, the estimator satisfies the three shape restrictions of a copula (grounded, marginal and 2-increasing), provided that sufficiently many shape constraints are appended. Fourth, its training involves a convex quadratic program, which is computationally tractable.

The structure of the paper is as follows. Section 2 introduces how to apply the classical ϵ-support vector regression (ϵ-SVR) to estimate a copula; a toy data set is used to show its qualitative shortcomings. Section 3 presents the novel nonparametric method for copula estimation, detailing how to impose the additional constraints for the three shape restrictions and how to transform the problem into a convex quadratic program. Section 4 presents numerical comparisons of KCR with other state-of-the-art methods. Section 5 concludes the paper.
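The shape constraints described above stay linear in the regression coefficients, which is why the training problem remains a quadratic program. The sketch below illustrates this under stated assumptions and is not the paper's formulation: it assumes a Gaussian RBF kernel with a hypothetical width `sigma` and an estimator of the usual kernel-expansion form $f(u) = \sum_i \beta_i k(u_i, u) + b$; the helper names and the grid size `m` are also illustrative.

```python
import math


def rbf(x, u, sigma):
    """Gaussian RBF kernel k(x, u) on the unit square (assumed kernel)."""
    d2 = (x[0] - u[0]) ** 2 + (x[1] - u[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def rbf_mixed_derivative(x, u, sigma):
    """Second-order mixed derivative of k(x, u) with respect to u1 and u2."""
    return rbf(x, u, sigma) * (x[0] - u[0]) * (x[1] - u[1]) / sigma ** 4


def boundary_targets(m):
    """Evenly spaced boundary points mapped to the value required there."""
    targets = {}
    for k in range(m + 1):
        g = k / m
        targets[(g, 0.0)] = 0.0   # grounded: C(u1, 0) = 0
        targets[(0.0, g)] = 0.0   # grounded: C(0, u2) = 0
        targets[(g, 1.0)] = g     # marginal: C(u1, 1) = u1
        targets[(1.0, g)] = g     # marginal: C(1, u2) = u2
    return targets


def mixed_derivative_row(centers, u, sigma):
    """Coefficients a_i such that the mixed derivative of f at u equals
    sum_i a_i * beta_i; the bias term b drops out after differentiation."""
    return [rbf_mixed_derivative(x, u, sigma) for x in centers]


centers = [(0.75, 0.25), (0.25, 0.75), (0.5, 0.5)]   # hypothetical u_i
row = mixed_derivative_row(centers, (0.5, 0.4), sigma=0.5)
print(len(boundary_targets(4)), len(row))
```

Each grid point contributes one inequality of the form `sum(a * b for a, b in zip(row, beta)) >= 0`, and each boundary point contributes one equality on the value of `f`; both are linear in `beta` and `b`.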
English conclusion
This paper proposes a novel nonparametric bivariate copula estimation method based on shape-restricted ϵ-support vector regression. The estimator smooths the empirical copula with ϵ-SVR. Unlike the classical ϵ-SVR, however, the method has additional constraints for the three shape restrictions of copula functions. The additional shape constraints mainly determine the shape of the estimated surface, while the classical constraints determine its location. The novelty of the estimator lies in three points. First, it has nonlinear smooth regression capability thanks to the kernel trick. Second, it imposes equality constraints on boundary points to make the estimator grounded and marginal. Third, it imposes 2-increasing constraints on equidistant grid points to make its surface 2-increasing. Its dual problem is a convex quadratic program, which is computationally tractable. Qualitative results of experiments on a toy data set showed that the KCR estimator nearly satisfied the three shape restrictions of copula functions: grounded, marginal and 2-increasing. We also compared its performance, measured by average RMSE, with the kernel smoother, which is regarded as a state-of-the-art copula estimator. The out-of-sample results showed that the KCR method achieved significantly better performance. The pairwise returns of three major stock indexes, S&P500, FTSE100 and HSI, were also used to test its applicability in financial risk management. Experimental results demonstrated that this estimator achieved significantly better performance than six parametric models and the kernel smoother.