A genetic algorithm to refine input data selection for air temperature prediction using artificial neural networks
Article code | Publication year | Length of the English article |
---|---|---|
8173 | 2013 | 8 pages (PDF) |
Publisher : Elsevier - Science Direct
Journal : Applied Soft Computing, Volume 13, Issue 5, May 2013, Pages 2253–2260
English abstract
The accurate prediction of air temperature is important in many areas of decision-making, including agricultural management, transportation and energy management. Previous research has focused on the development of artificial neural network (ANN) models to predict air temperature from one to twelve hours in advance. The inputs to these models included a constant duration of prior data with a fixed resolution for all environmental variables and all prediction horizons. The overall goal of this research was to develop more accurate ANN models that could predict air temperature for each prediction horizon. The specific objective was to determine whether ANN model accuracy could be improved by applying a genetic algorithm (GA) for each prediction horizon to determine the preferred duration and resolution of prior input data for each environmental variable. The ANN models created with this GA-based approach had smaller errors than the models created with the existing constant duration and fixed data resolution approach for all twelve prediction horizons. Except for a few cases, the GA generally included a longer duration for prior air temperature data and shorter durations for other environmental variables. The mean absolute errors (MAEs) for the evaluation input patterns of the one-, four-, eight-, and twelve-hour prediction models that were based on this GA approach were 0.564 °C, 1.264 °C, 1.766 °C and 2.018 °C, respectively. These MAEs were improvements of 3.98%, 4.59%, 2.55% and 1.70% compared to the models that were created with the existing approach for the same prediction horizons. Thus, the GA-based approach to determining the duration and resolution of prior input data resulted in more accurate ANN models than the existing ones for air temperature prediction. Future work could examine the effects of various GA and fitness evaluation parameters that were part of the approach used in this study.
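The error measure and the percentage improvements quoted in the abstract can be illustrated with a minimal Python sketch. It assumes that "improvement" means the relative reduction in MAE with respect to the existing model, which the abstract suggests but does not state explicitly. The function names and the temperature values are hypothetical and are not taken from the paper.

```python
def mean_absolute_error(observed, predicted):
    """Average of |observed - predicted| over all evaluation patterns."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def percent_improvement(baseline_mae, new_mae):
    """Relative reduction in MAE, as a percentage of the baseline MAE (assumed definition)."""
    return 100.0 * (baseline_mae - new_mae) / baseline_mae


# Made-up temperatures in degrees Celsius, for illustration only.
observed = [10.0, 12.0, 11.0, 9.0]
baseline_pred = [11.0, 11.0, 12.0, 10.0]  # hypothetical existing model
ga_pred = [10.5, 11.5, 11.5, 9.5]         # hypothetical GA-based model

baseline_mae = mean_absolute_error(observed, baseline_pred)
ga_mae = mean_absolute_error(observed, ga_pred)
improvement = percent_improvement(baseline_mae, ga_mae)
print(baseline_mae, ga_mae, improvement)
```

Under this definition, a model that halves the error of the existing model would be reported as a 50% improvement.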
English introduction
Air temperature is one of the most important weather variables that affect crop growth and development, and it is one of the primary inputs to models that simulate crop production and food security, especially under a changing climate [1]. It is also important in other aspects of our daily life. Absalon and Slesak [2] stated that air temperature should be carefully monitored and included in the assessment of the quality of human life in an urban area. Stull [3] used air temperature along with relative humidity to calculate wet-bulb temperature at standard sea level pressure. White-Newsome et al. [4] used outdoor air temperature and dew point temperature to predict indoor heat in order to mitigate the effects of indoor heat exposure among elderly people in Detroit. The 2007 spring freeze in the eastern U.S. significantly affected agriculture and killed newly formed leaves, shoots, and developing flowers and fruits [5]. The severity of frost damage is influenced by the intensity and duration of low temperatures, the rates of temperature decrease and short-term temperature variations [6]. Therefore, it is necessary to accurately predict air temperature to help farmers prevent crops from being damaged by freezing temperatures. The Georgia Automated Environmental Monitoring Network (AEMN), established in 1991 [7], currently consists of more than 80 weather stations distributed throughout the state of Georgia, USA. Each weather station is solar-powered and monitors weather data including air temperature, dew point temperature, relative humidity, vapor pressure, wind speed, wind direction, solar radiation and rainfall at a one-second frequency. These data were summarized into hourly averages until March 1996. Subsequently they have been aggregated into fifteen-minute averages or totals, depending on the atmospheric parameter.
The observations are downloaded at least hourly to a server and immediately made available on the website www.georgiaweather.net for use by the general public, including growers and producers. Jain et al. [8] developed Artificial Neural Network (ANN) models to predict air temperature during the winter months. These models were trained on patterns that included six hours of prior weather information such as air temperature, relative humidity, wind speed, and solar radiation, as well as the time of day. Smith et al. [9] improved the prediction accuracies of winter-specific air temperature models by including seasonal information in the input pattern and extending the duration of prior data to 24 h. Smith et al. [10] also developed ANN models to predict air temperature throughout the year using the data collected through 2005. These ANN models were implemented on www.georgiaweather.net as tools for temperature prediction. Shank et al. [11] created ANN models to predict dew point temperature up to 12 h in advance based on the weather variables dew point temperature, relative humidity, solar radiation, air temperature, wind speed, and vapor pressure. Shank et al. [12] created ensemble ANN models to improve the accuracy of dew point temperature prediction. These ANN models were also implemented on www.georgiaweather.net, where predictions of both air and dew point temperature are available for each automated weather station in Georgia. Hourly predictions are made from one to twelve hours ahead and updated every fifteen minutes, once new data have been received from each weather station. Chevalier et al. [13] created a decision support system that interprets the air temperature and dew point temperature predictions, along with the observed wind speed, as one of five frost warnings for blueberries and peaches.
The current ANN models that have been implemented are based on a Ward-style ANN architecture and were trained using the well-known error back-propagation algorithm. Preferred values for the ANN parameters such as the learning rate, number of hidden nodes, and initial weight range were determined by iterative search. The observations collected by weather stations were partitioned into different datasets for model development and evaluation purposes. In the work by Smith et al. [10], the duration of prior weather information for the inputs to the ANN model was determined by a limited iterative search. The durations considered were 2, 4, 6, 12, 18, 24, 30, 36, and 48 h of prior data. A single duration was used to include the prior data for all five weather variables. Although the observed data were available for every fifteen minutes, Smith et al. [10] always included the data in one-hour intervals. They did not explore the effects of including the prior data with various resolutions, i.e., a shorter or longer interval than one hour. Thus, all twelve existing models included 24 h of prior data for each weather variable in one-hour intervals, resulting in a constant 258 input variables to the ANN models. Evolutionary algorithms, which are inspired by the biological evolutionary process, have been widely combined with ANNs to evolve the network architecture, connection weights and input features. Montana and Davis [14] employed a genetic algorithm (GA) to evolve the connection weights of an ANN for the sonar image classification problem. They reported that the learning algorithm based on the GA outperformed the traditional back-propagation algorithm. Stanley and Miikkulainen [15] presented a method named NEAT (NeuroEvolution of Augmenting Topologies), which enabled parallel evolution of both network architecture and connection weights. Aijun et al. [16] used a GA to optimize the chemical vapor infiltration (CVI) processing parameters of carbon/carbon composites.
The fitness function of their GA evaluated ANNs based on the candidate input parameters of the network. Saxena and Saad [17] applied a GA to select the preferred combination of features to develop an ANN fault classification model for condition monitoring of mechanical systems. This GA also evolved the structure of the ANN in terms of the number of hidden nodes. Mohebbi et al. [18] coupled a GA with an ANN to estimate the moisture content of dried banana. Their GA evolved ANN parameters such as the number of hidden layers and, for each hidden layer, the number of hidden nodes, the learning rate and the momentum. Čongradac and Kulić [19] created a model to reduce the electricity consumption of chillers by coupling an ANN with a GA. The ANN was used to create a chiller model and then the GA was applied to optimize the chiller model parameters. Irani and Nasimi [20] used a hybrid GA-ANN strategy to predict the permeability of the Mansuri Bangestan reservoir. They used the GA to search for the best set of initial ANN weights for training with the back-propagation algorithm and showed that the hybrid approach outperformed the traditional gradient-descent-based approach to ANN training. For the research herein, it was hypothesized that the information associated with each weather variable could contribute in varying degrees to the prediction accuracy of the model. Also, including too much unnecessary information could have a negative effect on the prediction accuracy. Tahai et al. [21] claimed that incorporating too many noise variables as inputs to an ANN prediction model would result in poor ANN generalization capability. The amount of input information to the ANN model associated with a weather variable can be controlled with the duration and resolution of prior data for that particular weather variable, where resolution denotes the interval between included observations. A longer duration or a higher resolution means that more information is included.
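How duration and resolution together control the amount of prior data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_prior_values` and its parameters are hypothetical, the series is assumed to be sampled every fifteen minutes with the newest observation last, and the current observation is assumed to be included alongside the prior values. It does not attempt to reproduce the 258 inputs of the existing models.

```python
def select_prior_values(series, duration_hours, resolution_minutes, sample_minutes=15):
    """Return the current value plus prior values, newest first.

    series: observations ordered oldest to newest, one every sample_minutes.
    duration_hours: how far back in time to look.
    resolution_minutes: spacing between the values that are kept.
    """
    if resolution_minutes % sample_minutes != 0:
        raise ValueError("resolution must be a multiple of the sampling interval")
    step = resolution_minutes // sample_minutes
    n_prior = (duration_hours * 60) // resolution_minutes
    if len(series) < n_prior * step + 1:
        raise ValueError("not enough history for the requested duration")
    last = len(series) - 1
    return [series[last - k * step] for k in range(n_prior + 1)]


# Dummy series: the index stands in for a fifteen-minute observation.
series = list(range(200))

hourly_24h = select_prior_values(series, duration_hours=24, resolution_minutes=60)
half_hourly_6h = select_prior_values(series, duration_hours=6, resolution_minutes=30)
print(len(hourly_24h), len(half_hourly_6h))
```

In this sketch, 24 h at a one-hour resolution keeps 25 values of a variable, while 6 h at a thirty-minute resolution keeps 13, so a shorter duration can offset a finer resolution.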
The time series nature of the weather data also makes it intuitively appealing to explore variable resolution in prior data. The overall goal of this study was to improve the air temperature prediction accuracy of the ANN models that were developed by Smith et al. [10] and implemented on www.georgiaweather.net by optimizing the duration and resolution of the prior data that are included as input. The specific objective was to use evolutionary algorithms to search for the preferred duration and resolution of prior data to be included for each input weather variable and for each of the twelve prediction horizons.
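The kind of search described in this objective can be sketched as a small genetic algorithm in Python. This is a generic illustration, not the authors' implementation: the variable list is an illustrative subset, the candidate resolutions, the operators, the rates and all function names are assumptions, and `placeholder_error` is an arbitrary stand-in for the expensive step of training an ANN with the chosen inputs and measuring its error. Only the list of candidate durations is taken from the text above.

```python
import random

# Illustrative subset of weather variables; one (duration, resolution) gene each.
VARIABLES = ["air_temperature", "relative_humidity", "wind_speed", "solar_radiation"]
DURATIONS_H = [2, 4, 6, 12, 18, 24, 30, 36, 48]  # durations considered in [10]
RESOLUTIONS_MIN = [15, 30, 60, 120]              # assumed candidate intervals


def random_individual(rng):
    return [(rng.choice(DURATIONS_H), rng.choice(RESOLUTIONS_MIN)) for _ in VARIABLES]


def placeholder_error(individual):
    """Stand-in for 'train an ANN with these inputs and return its MAE'."""
    target = [(48, 15), (6, 60), (6, 60), (12, 60)]  # arbitrary, not from the paper
    return sum(abs(d - td) / 48 + abs(r - tr) / 120
               for (d, r), (td, tr) in zip(individual, target))


def tournament(population, rng, k=3):
    """Pick the lowest-error individual among k random candidates."""
    return min(rng.sample(population, k), key=placeholder_error)


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return [ga if rng.random() < 0.5 else gb for ga, gb in zip(a, b)]


def mutate(individual, rng, rate=0.2):
    """Replace each gene with a random one with probability `rate`."""
    return [(rng.choice(DURATIONS_H), rng.choice(RESOLUTIONS_MIN))
            if rng.random() < rate else gene
            for gene in individual]


def run_ga(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = min(population, key=placeholder_error)
        history.append(placeholder_error(elite))
        children = [elite]  # elitism: the best individual always survives
        while len(children) < pop_size:
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
    best = min(population, key=placeholder_error)
    history.append(placeholder_error(best))
    return best, history


best, history = run_ga()
for name, (duration, resolution) in zip(VARIABLES, best):
    print(f"{name}: {duration} h of prior data at {resolution} min intervals")
```

Because a real fitness evaluation would require training at least one network per candidate, the population size and the number of generations would be bounded by the available computing resources. A separate search of this kind would be run for each prediction horizon.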
English conclusion
In this study, ANN models were developed to predict air temperature by performing a GA search to determine the optimal duration and resolution of prior data for each weather variable that was considered as a potential input variable for the model. These ANN models had higher accuracies than the ANN models that were developed based on the existing approach. This approach identified the contributive roles of various weather variables in predicting air temperature by using resource-intensive computational intelligence techniques. For a fair comparison, the ANN models based on the existing constant duration and fixed resolution (CDFR) approach were recreated using the same datasets that were used to create the ANN models based on the new approach. The GA-based approach with a restricted parameter setting for the fitness evaluation produced more accurate models for the one- through ten-hour and twelve-hour prediction horizons, but did not produce a more accurate model for the eleven-hour prediction horizon. A limited study was performed that ran the GA with an increased number of ANN training patterns for the fitness evaluation for the one-, four-, eight-, eleven-, and twelve-hour prediction horizons. Except for the eight-hour prediction horizon, the final ANN models developed using this extended GA-based approach were the most accurate models developed in this study for their respective prediction horizons. Using the extended GA-based approach, the highest improvement in accuracy was achieved for the four-hour prediction horizon, with a 4.59% improvement compared to the accuracy of the model created based on the existing approach. However, the methodology used in this study could be further improved by exploring and fine-tuning various computational parameters. The extended GA runs showed that the GASDR model accuracies were generally improved by increasing the number of ANN training patterns used for the fitness evaluation.
With additional computational resources, the number of ANN training patterns and the number of random network instantiations could be further increased for the GA fitness evaluation. Other possible parameters that could be explored include the GA population size, the crossover and the mutation operators and their probabilities, and the number of segments in the prior data for a weather variable. Another possible study is to explore the effects of applying other search algorithms such as particle swarm optimization to determine the duration and resolution of prior data for each weather variable.