Functional clustering and linear regression for peak load forecasting
|Article code||Publication year||English article||Persian translation||Word count|
|24308||2010||12-page PDF||Available to order||Not calculated|
Publisher: Elsevier - Science Direct
Journal: International Journal of Forecasting, Volume 26, Issue 4, October–December 2010, Pages 700–711
In this paper we consider the problem of short-term peak load forecasting using past heating demand data in a district-heating system. Our data-set consists of four separate periods, with 198 days in each period and 24 hourly observations in each day. We can detect both an intra-daily seasonality and a seasonality effect within each period. We take advantage of the functional nature of the data-set and propose a forecasting methodology based on functional statistics. In particular, we use a functional clustering procedure to classify the daily load curves. Then, on the basis of the groups obtained, we define a family of functional linear regression models. To make forecasts we assign new load curves to clusters, applying a functional discriminant analysis. Finally, we evaluate the performance of the proposed approach in comparison with some classical models.
Load demand forecasting is becoming more and more important as power generation costs increase and market competition intensifies: accurate forecasts are relevant to energy systems for scheduling generator maintenance and choosing an optimal mix of on-line capacity. The literature on load forecasting considers three main problems: long-term forecasts for system planning, medium-term forecasts for maintenance programs, and short-term prediction for the day-to-day operation, scheduling and load-shedding plans of power utilities. A central role is played by the intra-daily pattern of the load demand, known as the load curve, which describes the amount of energy consumed to satisfy the load demand of customers over the course of the day. The focus of the present work is on short-term (i.e. around 24 h) forecasting of the daily peak load in a district-heating (or “teleheating”) system, which is the maximum of the daily demand for heating. A district-heating system involves distributing the heat for residential and commercial requirements via a network of insulated pipes. We analyze here data on heat consumption in a major Italian centre, Turin, where the district heating is produced through a co-generation system. This technology allows for energy saving and reduces emissions compared to old technologies; it is therefore treated as a renewable energy source and delivers substantial economic and environmental benefits. In the recent literature concerning prediction in district-heating systems (see, for example, Dotzauer, 2002, and Nielsen & Madsen, 2006, for some applications and references), the algorithms employed are usually similar to those used in the prediction of electrical-power loads. A review of statistical methods for electrical load forecasting has been given by Weron (2006), for instance. These methods are mainly based on ARIMA models, regression models, exponential smoothing, and generalizations of these. 
Typically, weather variables are used for the prediction of electricity loads, and a wide range of modeling approaches is presented in the literature. Developments in forecasting methodologies are also reflected in the contributions to the special issue of the International Journal of Forecasting on energy forecasting that appeared in 2008. For instance, Dordonnat, Koopman, Ooms, Dessertaine, and Collet (2008) develop a multi-equation model with time-varying parameters, while Alves da Silva, Ferreira, and Velasquez (2008) propose automated input selection procedures for forecasting models based on neural networks. Whereas electric and wind power, gas consumption and electricity price forecasting are discussed in the cited issue, the present paper deals with heat consumption in a residential area. In this case, the strong correlation between heating and external temperature has led us to a parsimonious model that forecasts satisfactorily without exogenous weather variables.
In this paper we propose a method based on a functional statistics approach. This approach has attracted increasing attention from researchers and practitioners in recent years, since it can be applied when the data are collections of discrete observations made on curves, images or shapes. A survey of these techniques can be found in the monographs of Ramsay and Silverman (2005) and Ferraty and Vieu (2006), for instance.
Our data-set consists of hourly observations of the heat consumption over four separate periods, with 198 days in each period, over the years 2001-2005. The load data are recorded in megawatts, and have been rescaled to lie between 0 and 1. The goal of the present work is to build and estimate models on the basis of the first three periods, and then to evaluate the out-of-sample performance of different forecasts of the peak load over the whole fourth period.
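The data layout just described (hourly readings, 24 per day, rescaled to between 0 and 1, with the previous day's curve used to predict the next day's peak) can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the rescaling was done, so min-max scaling is assumed here, and all function names and the toy values are hypothetical.

```python
from typing import List, Tuple

HOURS_PER_DAY = 24


def rescale_unit_interval(values: List[float]) -> List[float]:
    """Min-max rescale to [0, 1] (assumed scheme; the source only says
    the data were rescaled to lie between 0 and 1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def to_daily_curves(hourly: List[float]) -> List[List[float]]:
    """Split a flat hourly series into daily load curves of 24 points."""
    if len(hourly) % HOURS_PER_DAY != 0:
        raise ValueError("series length must be a multiple of 24")
    return [hourly[i:i + HOURS_PER_DAY]
            for i in range(0, len(hourly), HOURS_PER_DAY)]


def lagged_pairs(curves: List[List[float]]) -> List[Tuple[List[float], float]]:
    """Pair the load curve of day i with the peak (maximum) of day i + 1."""
    return [(curves[i], max(curves[i + 1])) for i in range(len(curves) - 1)]


# Toy data: three synthetic days of hourly loads in MW (hypothetical values).
toy_mw = [100.0 + 10.0 * day + hour for day in range(3) for hour in range(24)]
scaled = rescale_unit_interval(toy_mw)
curves = to_daily_curves(scaled)
pairs = lagged_pairs(curves)
```

With real data, the pairs would presumably be built within each period separately, so that the last day of one period is not paired with the first day of the next.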
First, we define a sample formed by the daily curves of heat consumption (we will refer to them as “load curves”), and consider a functional linear regression model where the peak demand on a given day is the scalar response and the load curve of the previous day is the functional regressor. Noting that intra-daily effects change with the season within each period, we then propose a methodology for improving the forecasting ability of the functional regression model: we partition the observed curves into homogeneous groups using the functional clustering technique presented by Abraham, Cornillon, Matzner-Lober, and Molinari (2003). The aim of our clustering procedure is to find characteristic patterns in the data-set that may determine changes in the heat demand peaks. Classifying the load curves into groups addresses the problem of modeling the seasonality effect within each period. Then, we estimate a specific functional linear coefficient for each group, obtaining a family of functional regression models. Note that various clustering techniques for classifying similar load patterns have been considered in the literature; for instance, Chicco, Napoli, and Piglione (2001) and Amin-Naseri and Soroush (2006) use neural network clustering, whereas here we use clustering in a functional context. In order to assign new curves to clusters in the forecasting procedure, we use a functional linear discriminant analysis, as was done by James and Hastie (2001). Finally, we compare the out-of-sample performance of our models with classical regression approaches in which the functional nature of the data is not taken into account. All routines are implemented in R.
Let us also consider some related works which have recently appeared in the literature on forecasting with functional regression. Hyndman and Ullah (2007) propose a robust methodology for forecasting functional time series; their method, which is applied in the demographic context in particular, differs from our approach in that they forecast a smoothed function, whereas we forecast a scalar. In addition, Kargin and Onatski (2008) and, in an electricity load context, Antoch, Prchal, De Rosa, and Sarda (2008) focus on forecasting functional data which are assumed to be generated by a functional autoregressive process. Regarding the forecasting of scalars, Sood, James, and Tellis (2009) apply functional linear regression to predict the market consumption of new products, showing the advantages of functional techniques in comparison to standard ones. Besides addressing a different application, in which the data-set consists of time series, our method differs from theirs in that it uses functional clustering not only to describe the data, but also for prediction.
The paper is organized as follows. In Section 2 we describe the data-set, introduce the notation and present the problem considered in this work. In Section 3 we propose and discuss the functional forecasting methodology we adopt. Section 4 is devoted to a numerical comparison of the performance of the models considered with some simple non-functional competitors. Final remarks conclude the paper.
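As a reading aid, the cluster-wise functional linear model outlined in this introduction can be written in symbols. The notation below is an assumption made for illustration and is not taken from the paper: X_i(t) is the load curve of day i on the interval [0, 24] hours, Y_{i+1} is the peak of the following day, g(i) is the cluster to which X_i is assigned, G is the number of clusters, \beta_g is the functional coefficient of cluster g, and \varepsilon is an error term. The presence of a cluster-specific intercept \alpha_g is also assumed.

```latex
\[
Y_{i+1} = \alpha_{g(i)} + \int_{0}^{24} \beta_{g(i)}(t)\, X_i(t)\, dt + \varepsilon_{i+1},
\qquad
Y_{i+1} = \max_{t \in [0,24]} X_{i+1}(t),
\qquad
g(i) \in \{1, \dots, G\}.
\]
```

With G = 1 this reduces to a single functional linear regression on the previous day's curve. For a new curve the label g(i) is not observed, which is where the functional discriminant analysis step comes in.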
Conclusion
In this paper we have proposed a method based on functional data analysis to approach a specific problem in short-term forecasting in a district-heating system. A family of functional linear models, selected by means of curve classification procedures, has been used to make peak load forecasts: the daily peak of heating demand is predicted on the basis of the “load curve” of the previous day. The technique presented, which generalizes the classical multiple regression model, is relatively simple and takes into consideration the functional nature of the problem. In fact, the method could also be implemented when load curves are observed at a larger number of points in time, and also when the points are not equally spaced, a situation in which a multivariate approach could not be employed. The proposed model has performed well in comparison with competing non-functional models. Moreover, functional clustering and functional linear discriminant analysis allow us to take into account the way in which the intra-daily pattern evolves over time without introducing a cumbersome specification with dummy variables or other complex procedures. The forecasting results using the functional techniques are promising.
From a technical point of view, weather variables such as temperature could be added to the functional model as predictors; however, because of the information contained in the heating data of the problem considered, a parsimonious model that does not require exogenous variables turns out to be satisfactory in the forecasting exercise. The functional method may also be extended to a non-parametric approach by employing, for instance, non-parametric regression models, or by using non-parametric unsupervised classification, as illustrated by Ferraty and Vieu (2006).
The functional techniques discussed in this paper could be extended in several directions. It would be possible, first, to assume that the errors are dependent on time, which may also better capture mid-season months like April and October; second, to build forecasts of the entire daily load curve; and, finally, to provide distributional forecasts (as, for example, in Hyndman & Fan, 2008).
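The remark that the method still applies when the observation points are not equally spaced can be illustrated with a small sketch. Once a coefficient function has been estimated, a forecast only needs the integral of the coefficient times the load curve, and that integral can be approximated on any increasing time grid. The trapezoidal rule, the function names, the constant coefficient and the toy grid below are all assumptions chosen for illustration; the paper's own estimation procedure is not reproduced here.

```python
from typing import Callable, Sequence


def trapezoid_inner_product(times: Sequence[float],
                            curve: Sequence[float],
                            beta: Callable[[float], float]) -> float:
    """Approximate the integral of beta(t) * X(t) dt by the trapezoidal
    rule on a possibly irregular, strictly increasing time grid."""
    if len(times) != len(curve) or len(times) < 2:
        raise ValueError("need at least two matching (time, value) points")
    total = 0.0
    for k in range(len(times) - 1):
        width = times[k + 1] - times[k]
        if width <= 0:
            raise ValueError("times must be strictly increasing")
        left = beta(times[k]) * curve[k]
        right = beta(times[k + 1]) * curve[k + 1]
        total += 0.5 * width * (left + right)
    return total


def predict_peak(times: Sequence[float],
                 curve: Sequence[float],
                 beta: Callable[[float], float],
                 intercept: float) -> float:
    """Scalar forecast: intercept plus the approximated integral term."""
    return intercept + trapezoid_inner_product(times, curve, beta)


# Hypothetical irregular grid (hours) and a toy rescaled load curve.
times = [0.0, 1.0, 2.5, 6.0, 12.0, 24.0]
curve = [t / 24.0 for t in times]
# Hypothetical constant coefficient function and intercept.
forecast = predict_peak(times, curve, beta=lambda t: 0.05, intercept=0.1)
```

A regression on the 24 hourly values as separate covariates would need every day to be observed at the same 24 time points, whereas the integral form above only needs each curve's own grid.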