Title

Support vector regression for link load prediction
Article code: 24963
Publication year: 2009
Length: 11 pages (PDF)
Source

Publisher: Elsevier - ScienceDirect

Journal: Computer Networks, Volume 53, Issue 2, 13 February 2009, Pages 191–201

Keywords
Network forecast, Traffic measurement, Supervised learning, Support vector machines

Abstract

From weather to networks, forecasting techniques constitute an interesting challenge: rather than giving a faithful description of the current reality, as a looking glass would do, researchers seek crystal-ball models to speculate on the future. This work explores the use of Support Vector Regression (SVR) for link load forecasting. SVR works well in many learning situations because it generalizes to unseen data and is amenable to continuous and adaptive online learning, an extremely desirable property in network environments. Motivated by the encouraging results recently obtained with SVR in other networking applications, our aim is to clarify whether SVR is also successful at predicting network link load at short time scales. We consider the problem of forecasting link load based only on its past measurements, which is referred to as “embedded process” regression in the SVR lingo, and adopt a hands-on approach to evaluate SVR performance. In more detail, we perform a sensitivity analysis of the parameters involved, assess the computational complexity of training and validation, dig into the correlation structure of the prediction errors and evaluate techniques to extend the forecasting horizon. Our finding is that accuracy results are close enough to be tempting, but not enough to be convincing. Yet, as SVR exhibits a number of advantages, such as good robustness and flexibility at the price of limited complexity, we speculate on which directions can be pursued to improve its performance in this context.

Introduction

It is fairly well accepted that, as a result of the evolution of network services and Internet applications, network traffic is becoming increasingly complex. On the one hand, transport networks are challenged by the current convergence of voice, video and data services on an all-IP network, and by the fact that user mobility will likely translate into service mobility as well. On the other hand, the explosion of Internet telephony, television and gaming applications implies that we may be forced to rethink what we mean by “data” traffic. Moreover, the widespread usage of application-layer overlays directly translates into a much higher variability of the data traffic injected into the network. In this paper, we question whether such variability can be efficiently forecast and, if so, with what level of accuracy. The supervised prediction technique we selected is Support Vector Machines (SVM), a set of classification and regression techniques introduced in the early nineties [1] and grounded in the framework of statistical learning theory. Basically, Support Vector Regression (SVR) uses training data to build a forecast model that works well in many learning situations because it generalizes to unseen data and is amenable to continuous and adaptive online learning, an extremely desirable property in network environments. Initially bound to the optical character recognition context, the use of SVM rapidly spread to other fields, including time series prediction [2] and, more recently, networking [3], [4], [5] and [6]. Motivated by such encouraging results, we focus on link load forecasting based only on past measurements, following an approach known as “embedded process” [2]. This problem is of great interest in networking for both capacity planning and self-management applications (e.g. bandwidth provisioning, admission control, triggering of backpressure mechanisms, etc.).
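As an illustration of the “embedded process” formulation, the following minimal Python sketch turns a series of load measurements into (past window, next value) training pairs, which is the input format a regressor such as SVR would need. The function name `embed`, the window length `d` and the sample values are assumptions made for this example; they are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def embed(series: Sequence[float], d: int) -> Tuple[List[List[float]], List[float]]:
    """Build (past-window, next-value) pairs from a load series.

    Each input holds the d most recent samples; the target is the sample
    that immediately follows them.
    """
    if d < 1 or d >= len(series):
        raise ValueError("d must satisfy 1 <= d < len(series)")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for t in range(d, len(series)):
        inputs.append(list(series[t - d:t]))  # samples t-d .. t-1
        targets.append(series[t])             # sample t, to be forecast
    return inputs, targets


# Made-up link utilisation samples, for illustration only.
load = [0.42, 0.45, 0.51, 0.48, 0.55, 0.60]
X, y = embed(load, d=3)
```

Any regression model can then be trained on `X` and `y`; the choice of `d` is one of the parameters a sensitivity analysis would have to explore.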
Though the SVM approach is also well suited to longer time scales, which are more of a concern for capacity planning, in this paper we focus on the estimation of load variation at short time scales: adopting a hands-on approach to SVM regression, we evaluate the effectiveness of SVR for link load forecasting by exploring a rather extensive parameter and design space. Our aim is twofold: first, we want to evaluate the accuracy and robustness of SVR and, second, we want to provide useful insights on the tuning of its parameters, an aspect not always made clear in previous work. We compare its performance with that achievable using Moving Average and Auto-Regressive models: our results show that, despite good agreement with the actual data, the gain of SVR over simple prediction methods is not enough to justify its deployment for link load prediction at short time scales. Yet, we must credit SVR with a number of extremely positive aspects: for instance, SVR models are rather robust to parameter variation, and their computational complexity is far from prohibitive, which makes them suitable for online prediction. Moreover, we experimentally verify that errors calculated over consecutive samples are independent and identically distributed, which allows the evaluation of confidence intervals. Finally, we also investigate methods to extend the forecast horizon by using forecasted values as input for a new prediction: interestingly, this recursive SVR approach may significantly extend the achievable forecast horizon while entailing only a very limited accuracy degradation. The remainder of the paper is organized as follows. After discussing related work in Section 2, we briefly overview Support Vector Regression theory in Section 3. In Section 4 we specify the methodology we follow in applying SVR models to link load forecasting, and describe the other forecasting techniques that will be used for comparison purposes.
A complete and extensive sensitivity analysis of SVR performance is reported in Section 5, whereas further details on the temporal evolution of the error, computational complexity considerations and the results of recursive SVR are reported in Section 6. Finally, concluding remarks and future work are addressed in Section 7.
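The idea of extending the forecast horizon by feeding forecasted values back as inputs can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `recursive_forecast` and its parameters are hypothetical names, and a simple moving average stands in for a trained one-step SVR model so that the example runs with the standard library alone.

```python
from typing import Callable, List, Sequence


def moving_average(window: Sequence[float]) -> float:
    """One-step baseline predictor: the mean of the window."""
    return sum(window) / len(window)


def recursive_forecast(
    history: Sequence[float],
    predict_one: Callable[[Sequence[float]], float],
    d: int,
    horizon: int,
) -> List[float]:
    """Forecast `horizon` steps ahead with a one-step predictor.

    After each step the forecast is appended to the input window and the
    oldest sample is dropped, so later forecasts rely on earlier ones.
    """
    if d < 1 or d > len(history):
        raise ValueError("d must satisfy 1 <= d <= len(history)")
    window = list(history[-d:])
    forecasts: List[float] = []
    for _ in range(horizon):
        nxt = predict_one(window)
        forecasts.append(nxt)
        window = window[1:] + [nxt]  # slide the window over the forecast
    return forecasts


# Made-up samples; the moving average is a stand-in for a trained model.
load = [1.0, 2.0, 3.0]
out = recursive_forecast(load, moving_average, d=3, horizon=2)
```

Because `predict_one` is just a callable taking the last `d` values, any one-step model, including a fitted SVR, could be substituted for `moving_average` without changing the loop.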

Conclusion

This paper explores the use of Support Vector Regression for link load forecasting: using a hands-on approach, we tune the SVR performance and compare it with that achievable using Moving Average (MA) and Auto-Regressive (AR) models. Our results show that, despite good agreement with the actual data, the gain of SVR over simple prediction methods such as MA or AR is not sufficient to justify its deployment for link load prediction at short time scales. Yet, we have to pay tribute to SVR for a number of extremely positive aspects: for instance, (i) SVR models are rather robust to parameter variation, (ii) their computational complexity is far from prohibitive, and (iii) the cascading of SVR models may significantly extend the achievable forecast horizon while entailing only a very limited accuracy degradation. It is our belief that this work constitutes a starting point for further investigation, whose directions are highlighted in the following. First, in order to gather more robust results, different traces representative of rather different network scenarios should be used to validate the extent of the above analysis. Then, the question remains of what can be done to improve the performance of SVR at short time scales: preliminary results seem to suggest that a manipulation of the time series (e.g., differentiation, statistical properties, etc.) may bring a significant benefit in terms of forecast accuracy, in which case a comparison against more sophisticated time series forecasting techniques would be needed. Another open issue is whether the use of other kernels (e.g., multi-linear, or others that can take into account the characteristics of the time series) improves the SVR accuracy, which could avoid the burden of costly time-series manipulation.
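One possible reading of the time-series manipulation mentioned above is first-order differencing, where the model is trained on sample-to-sample changes rather than on absolute load. The sketch below assumes that interpretation; the helper names `difference` and `undifference` and the sample values are illustrative only.

```python
from typing import List, Sequence


def difference(series: Sequence[float]) -> List[float]:
    """First-order differences: series[t] - series[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def undifference(last_value: float, predicted_deltas: Sequence[float]) -> List[float]:
    """Rebuild load levels from predicted changes, starting at last_value."""
    levels: List[float] = []
    level = last_value
    for delta in predicted_deltas:
        level += delta
        levels.append(level)
    return levels


# Made-up load samples (integers keep the arithmetic exact).
load = [10, 12, 11, 15]
deltas = difference(load)
rebuilt = undifference(load[0], deltas)
```

A forecaster trained on `deltas` would predict changes, and `undifference` maps those predictions back to load values for comparison with the original series.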
Finally, other interesting directions for this research could involve the evaluation of forecast targets other than the average link load (such as the peak load or the 95th percentile) as well as the analysis of longer time scales. Indeed, concerning the latter point, it could be interesting to investigate whether feeding SVR with features such as time-of-day and day-of-week would help in forecasting periodic load fluctuations (such as lunch breaks and weekends).
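As a rough sketch of how time-of-day and day-of-week could be supplied to a regressor alongside past load samples, the example below encodes both as sine and cosine pairs. The cyclic encoding, the function name `calendar_features` and the sample values are assumptions made for illustration; the text above does not prescribe any particular encoding.

```python
import math
from datetime import datetime
from typing import List


def calendar_features(ts: datetime) -> List[float]:
    """Encode time-of-day and day-of-week as points on a circle.

    The sine/cosine pairs keep 23:59 close to 00:00 and Sunday close to
    Monday, which a raw hour or weekday number would not.
    """
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    day_angle = 2 * math.pi * seconds / 86400
    week_angle = 2 * math.pi * ts.weekday() / 7
    return [
        math.sin(day_angle),
        math.cos(day_angle),
        math.sin(week_angle),
        math.cos(week_angle),
    ]


# Made-up past load samples, extended with calendar features for the
# instant being forecast.
window = [0.48, 0.55, 0.60]
features = window + calendar_features(datetime(2009, 2, 13, 0, 0, 0))
```

The resulting `features` vector has the lagged samples followed by four calendar values, and could replace the plain lagged window as model input.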