In this paper we present a method for improving the generalization performance of a radial basis function (RBF) neural network. The method uses a statistical linear regression technique based on the orthogonal least squares (OLS) algorithm. We first discuss a modified way to determine the centers and widths of the hidden layer neurons. Then, substituting a QR algorithm for the traditional Gram–Schmidt algorithm, we find the connection weights of the hidden layer neurons. Cross-validation is utilized to determine the stopping criterion for training. The generalization performance of the network is further improved using a bootstrap technique. Finally, the method is applied to a simulated problem and a real problem. The results demonstrate the improved generalization performance of our algorithm over existing methods.
Radial basis function (RBF) neural networks have recently attracted extensive research interest (Huang et al., 2008; Kurt et al., 2008; Lee, 2007; Schwenker et al., 2001). An RBF network can be regarded as an amalgamation of a data modeling technique for high-dimensional spaces and a universal approximation scheme, popularly known as an artificial neural network (ANN). RBF neural networks are a powerful technique for generating multivariate nonlinear mappings (Bishop, 1991).
An RBF network approximates an unknown mapping function $f:\mathbb{R}^n \rightarrow \mathbb{R}^m$. First, the input data undergo a nonlinear transformation via the basis functions in the network's hidden layer; the basis function responses are then linearly combined to give the network output. The overall input–output transfer function of the RBF network can be written as follows:
$$f_i(\mathbf{x}) = \hat{y}_i = \sum_{j=1}^{J} \phi_j\!\left(\left\|\mathbf{x}-\mathbf{c}_j\right\|,\rho\right)\theta_{ji}, \qquad i=1,\ldots,m, \tag{1}$$
where $\theta_{ji}$ denotes the adjustable network weights connecting the hidden nodes with the network outputs, and $\phi_j\!\left(\left\|\mathbf{x}-\mathbf{c}_j\right\|,\rho\right)$ is the activation function of the hidden layer neurons (the hidden kernel nodes) of the RBF network. Typically this is a Gaussian function, as in Eq. (2). Here $\mathbf{x}$ is the input variable, $\mathbf{c}_j$ is the center of the activation function, $\rho$ is the width of the activation function, and $\|\cdot\|$ denotes the Euclidean norm:
$$\phi_j\!\left(\left\|\mathbf{x}-\mathbf{c}_j\right\|,\rho\right) = \exp\!\left(-\frac{\left\|\mathbf{x}-\mathbf{c}_j\right\|^2}{\rho^2}\right). \tag{2}$$
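To make the mapping of Eqs. (1) and (2) concrete, the following is a minimal Python/NumPy sketch of the forward pass of such a network. The function name `rbf_forward` and the assumption of a single width $\rho$ shared by all hidden nodes are illustrative choices, not part of the paper.

```python
import numpy as np

def rbf_forward(X, centers, width, weights):
    """Evaluate the RBF network of Eqs. (1)-(2).

    X       : (N, n) array of input samples
    centers : (J, n) array of hidden-node centers c_j
    width   : scalar width rho shared by all hidden nodes (assumption)
    weights : (J, m) array of output weights theta_ji
    """
    # Squared Euclidean distances ||x - c_j||^2 for every sample/center pair
    d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)  # (N, J)
    # Gaussian hidden responses, Eq. (2)
    Phi = np.exp(-d2 / width ** 2)                                   # (N, J)
    # Linear output layer, Eq. (1)
    return Phi @ weights                                             # (N, m)
```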
As is well known, the performance of an RBF network depends critically on the number and placement of the hidden layer neuron centers. The most natural choice is to associate a center with each data point in the training set (Bishop, 1991; Chen & Billings, 1992), so that all the training data are center candidates. In this case, the number of degrees of freedom in the network equals the number of training data, and the network function fits each data point exactly. If the data behave regularly but are contaminated by noise, overfitting occurs, which leads to poor generalization performance of the network. An efficient approach for improving generalization performance is to construct a small network using the parsimonious principle (Chen & Billings, 1992; Kavli, 1993). However, the parsimonious principle strategy is not entirely immune to overfitting (Chen & Billings, 1992). Regularization is a technique that can be used to overcome overfitting (Bishop, 1991), and some researchers have combined regularization techniques with the parsimonious principle. For example, Barron and Xiao (1991) proposed a first-order regularized stepwise selection of subset regression models, Orr (1993) derived a regularized forward selection algorithm, and Chen and Billings (1992) combined zero-order regularization with the orthogonal least squares (OLS) method to derive a regularized OLS (ROLS) algorithm for RBF networks.
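As an illustration of the zero-order regularization idea behind ROLS (not the cited authors' actual subset-selection algorithms), a ridge penalty $\lambda\|\boldsymbol{\theta}\|^2$ can be added to the least-squares cost, giving the closed-form solution $\boldsymbol{\theta} = (\Phi^{\mathsf{T}}\Phi + \lambda I)^{-1}\Phi^{\mathsf{T}}Y$. A minimal NumPy sketch with hypothetical names:

```python
import numpy as np

def ridge_weights(Phi, Y, lam=1e-3):
    """Zero-order regularized least squares:
    minimize ||Phi @ W - Y||^2 + lam * ||W||^2.

    Phi : (N, J) hidden-layer response matrix
    Y   : (N, m) target outputs
    lam : regularization parameter lambda (illustrative default)
    """
    J = Phi.shape[1]
    # Ridge-modified normal equations; the penalty shrinks the weights and
    # curbs overfitting when all training points are used as centers.
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(J), Phi.T @ Y)
```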
In this study we present a method for improving the generalization ability of an RBF neural network using a statistical linear regression technique based on the OLS algorithm. We first discuss a modified way to determine the centers and widths of the hidden layer neurons. Then, substituting the QR algorithm for the traditional Gram–Schmidt algorithm, we solve for the connection weights of the hidden layer neurons. Cross-validation is utilized to find the stopping criterion for training, yielding a smaller network.
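The linear-algebra step behind this substitution can be sketched in isolation: the hidden-layer response matrix is factorized as $\Phi = QR$, and the output weights follow by back-substitution. This is only a schematic (assuming $\Phi$ has full column rank), not the paper's complete selection procedure.

```python
import numpy as np

def qr_weights(Phi, Y):
    """Least-squares solution of Phi @ W = Y via a QR factorization.

    Using QR instead of classical Gram-Schmidt orthogonalization
    improves the numerical stability of the orthogonal decomposition.
    """
    Q, R = np.linalg.qr(Phi)            # economy-size QR: Phi = Q @ R
    return np.linalg.solve(R, Q.T @ Y)  # back-substitute R @ W = Q^T @ Y
```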
The rest of this paper is organized as follows. In Section 2 we describe the development of the proposed method. In Section 3 the examples and results are discussed. In Section 4 some conclusions are offered.
The RBF training process can be divided into two stages. In the first stage the centers and widths of the hidden nodes are determined, while in the second stage the weights of the hidden nodes are calculated. In this study we use a novel dynamic method to determine the centers and widths, and the QR algorithm to calculate the connection weights. To improve the generalization performance we also use early stopping and statistical linear regression techniques.
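The interaction of the two stages with early stopping can be sketched as a simple growth loop: hidden nodes are added one at a time and training stops once the validation error no longer improves. The loop below is a simplified illustration only; the candidate centers are assumed to be supplied in a pre-ranked order and the weights are recomputed by ordinary least squares, whereas the procedure developed in this paper ranks candidates with an OLS criterion and uses cross-validation.

```python
import numpy as np

def grow_rbf(X_tr, Y_tr, X_val, Y_val, candidates, width, patience=3):
    """Add hidden nodes one at a time; stop when validation error stalls."""
    selected, best, stall = [], {"err": np.inf, "C": None, "W": None}, 0
    for c in candidates:                       # candidate centers, pre-ranked
        selected.append(c)
        C = np.asarray(selected)
        # Gaussian hidden responses for the training and validation sets
        Phi_tr = np.exp(-np.sum((X_tr[:, None] - C[None]) ** 2, axis=2) / width**2)
        Phi_va = np.exp(-np.sum((X_val[:, None] - C[None]) ** 2, axis=2) / width**2)
        # Ordinary least-squares weights for the current set of centers
        W, *_ = np.linalg.lstsq(Phi_tr, Y_tr, rcond=None)
        err = np.mean((Phi_va @ W - Y_val) ** 2)   # validation MSE
        if err < best["err"]:
            best, stall = {"err": err, "C": C.copy(), "W": W}, 0
        else:
            stall += 1
            if stall >= patience:              # stop-training criterion
                break
    return best
```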