Water quality prediction plays an important role in modern intensive river crab aquaculture management. Because water quality indicator series are nonlinear and non-stationary, the accuracy of commonly used conventional methods, including regression analysis and neural networks, has been limited. This paper proposes a prediction model based on support vector regression (SVR) to solve the aquaculture water quality prediction problem. To build an effective SVR model, the SVR parameters must be set carefully. This study presents a hybrid approach, real-value genetic algorithm support vector regression (RGA–SVR), which searches for the optimal SVR parameters using a real-value genetic algorithm and then uses those parameters to construct the SVR model. The approach is applied to aquaculture water quality data collected from the aquatic factories of YiXing, China. The experimental results show that RGA–SVR outperforms the traditional SVR and back-propagation (BP) neural network models in terms of root mean square error (RMSE) and mean absolute percentage error (MAPE), indicating that the RGA–SVR model is an effective approach to predicting aquaculture water quality.
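For readers unfamiliar with the two error measures named above, the following is a minimal sketch of RMSE and MAPE using their standard textbook definitions. The exact formulas used in the study are not shown in this section, so the definitions here are an assumption, and the sample values are made up for illustration and are not taken from the YiXing data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical observations and forecasts (illustrative only).
actual = [8.0, 10.0]
predicted = [6.0, 11.0]
print(rmse(actual, predicted), mape(actual, predicted))
```

Lower values of both measures indicate a better fit. RMSE is expressed in the units of the indicator, while MAPE is scale-free, which is one reason the two are often reported together.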
Water is a central part of the river crab’s habitat in intensive river crab breeding, and water quality directly determines growth status and product quality. Once the water quality deteriorates, the crabs are exposed to a poor environment in which disease outbreaks occur easily, product quality declines, and large numbers of crabs may die within a short time; this causes great economic losses to farmers if remedial measures are not taken in a timely manner. Using modern information technology to provide early warning of water conditions and to allow the water to be adjusted dynamically is therefore an urgent and important matter.
Aquaculture water is an open, nonlinear, dynamic, complex system. Water quality is affected by many physical, chemical, hydraulic, biological, meteorological, and human factors, and because these factors interact, the water quality parameters are nonlinear, time-varying, random, and delayed. Thus, it is difficult to describe them quantitatively with exact mathematical models or to establish an accurate nonlinear prediction model using traditional methods.
Water quality prediction research, both in China and elsewhere, has focused mainly on lakes, rivers, reservoirs, estuaries, and other large bodies of water, using gray system theory, neural networks, statistical analysis methods, and time series models. Partalas et al. studied the greedy ensemble selection family of algorithms for ensembles of regression models to forecast water quality [1]; Feifei Li et al. established back-propagation (BP) and autoregressive (AR) short-term forecasting models to predict dissolved oxygen [2]; Eun Hye Naa et al. designed a dynamic three-dimensional water quality model to predict phytoplankton growth patterns in time and space [3]; and Han presented a flexible structure radial basis function neural network (FS-RBFNN) to predict the wastewater biochemical oxygen demand (BOD) index [4].
Palani et al. developed a neural network model to forecast the amount of dissolved oxygen in seawater [5]; Bikash Sarkar proposed a water quality model to predict the changes of temperature in an indoor fish pond [6]; Yu Deng adopted a wavelet neural network model based on wavelet theory and neural network theory to forecast the drinking water permanganate index [7]. However, neural networks suffer from a few weaknesses, which include the need for numerous controlling parameters, difficulty in obtaining a stable solution, and the danger of over-fitting.
Support vector regression (SVR) is a novel learning machine based on statistical learning theory and the structural risk minimization principle, and it has been used successfully for nonlinear system modeling [8]. Yunrong Xiang employed a least squares support vector machine (LS-SVM) and a particle swarm optimization model to predict the quality of a drinking water source [9]. Compared with artificial neural networks, an SVM provides more reliable and better performance under the same training conditions [10] and [11]. Despite these strengths, the use of SVR in academic research and industrial applications is limited because the user must define several parameters appropriately. The SVR parameters must be set carefully in order to construct the SVR model efficiently [12], [13] and [14]. Inappropriately chosen SVR parameters result in over-fitting or under-fitting, and different parameter settings may also cause significant differences in performance [15]. Thus, selecting the optimal parameters is an important step in SVR design. However, no general guidelines are available to help in selecting these parameters [16], [17] and [18]. In this study, we therefore develop a hybrid approach that adopts a real-value genetic algorithm (RGA) to determine the free parameters of SVR, with the aim of improving generalization ability and forecasting accuracy. The approach is used to forecast water quality in a high-density crab culture setting. The traditional SVR model and a BP neural network were also investigated for comparison. The experimental results show that the proposed approach improves both predictive accuracy and generalization capability.
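The idea of letting a real-value genetic algorithm search the SVR free parameters can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the operators (binary tournament selection, arithmetic crossover, uniform mutation, elitism), the rates, the parameter names and bounds, and the function `stand_in_error` are all hypothetical. In a real setting the fitness function would train an SVR with the candidate parameters and return its validation error; a simple stand-in is used here so that the sketch runs with the standard library alone.

```python
import random


def rga_minimize(fitness, bounds, pop_size=20, generations=40,
                 crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimize `fitness` over real-valued chromosomes drawn from `bounds`."""
    rng = random.Random(seed)
    # Each chromosome is a list of floats, one gene per tuned parameter.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    best, best_score = None, float("inf")

    def tournament(scored):
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] <= b[0] else b[1]

    for _ in range(generations):
        scored = [(fitness(c), c) for c in population]
        gen_score, gen_best = min(scored, key=lambda s: s[0])
        if gen_score < best_score:
            best, best_score = gen_best[:], gen_score
        children = [best[:]]  # elitism: carry the best chromosome forward
        while len(children) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if rng.random() < crossover_rate:
                w = rng.random()  # arithmetic crossover: weighted average
                child = [w * x + (1 - w) * y for x, y in zip(p1, p2)]
            else:
                child = p1[:]
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)  # uniform mutation
            children.append(child)
        population = children
    return best, best_score


def stand_in_error(params):
    # Hypothetical placeholder for "validation error of an SVR trained
    # with these parameters"; its minimum location is arbitrary.
    c, gamma, epsilon = params
    return ((c - 10.0) / 10.0) ** 2 + (gamma - 0.5) ** 2 + (epsilon - 0.1) ** 2


# Assumed search ranges for three commonly tuned SVR parameters.
BOUNDS = [(1.0, 100.0), (0.01, 2.0), (0.01, 1.0)]
best_params, best_error = rga_minimize(stand_in_error, BOUNDS)
print(best_params, best_error)
```

Encoding each parameter directly as a floating-point gene is what distinguishes a real-value genetic algorithm from a binary-coded one: no decoding step is needed, and crossover can blend parent values rather than swap bit strings.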
The structure of the paper is as follows. In Section 2, we introduce the real-value genetic algorithm (RGA) and support vector regression (SVR), and then propose the hybrid RGA–SVR model. Section 3 describes the data source and experimental setting and explains the process for determining the parameters of the RGA and SVR models. Section 4 discusses the results and analysis of the hybrid RGA–SVR model used in on-site aquaculture water quality prediction. Section 5 concludes the study and suggests directions for future investigations.
Water quality prediction is very important for intensive aquaculture. It can provide early warning of changes in water quality and reduce aquaculture losses. The method introduced here employs a hybrid RGA–SVR approach for forecasting aquaculture water quality, in which a real-value genetic algorithm is used to select suitable parameters for SVR. The genetic algorithm maintains a population of chromosomes, each of which represents a potential solution to the problem to be solved. Experiments using monitored aquaculture water quality data from the aquatic factories of YiXing in China show that the hybrid approach of support vector regression with genetic algorithm optimization can provide reliable water quality predictions for large-scale intensive aquaculture. The experimental results also suggest that artificial intelligence techniques are well suited to forecasting nonlinear time series. The RGA–SVR forecasting method can help avoid, to a certain extent, the economic losses caused by water quality problems. However, operating the genetic algorithm during training of the RGA–SVR model is difficult: different types and rates of crossover and mutation need to be set for different problems. How to use more advanced techniques to select appropriate features and parameters for the proposed model is therefore an important direction for future work.