Analysis of the trade-off between penetration rate and sampling frequency of mobile phone sensors in traffic state estimation
Article code | Year of publication | Number of pages (English article) |
---|---|---|
26571 | 2014 | 19-page PDF |
Publisher : Elsevier - ScienceDirect
Journal : Transportation Research Part C: Emerging Technologies, Volume 46, September 2014, Pages 132–150
English abstract
The rapid growth of smartphones with embedded navigation systems such as GPS modules provides new ways of monitoring traffic. These devices can register and send a great amount of traffic-related data, which can be used for traffic state estimation. In such a case, the amount of data collected depends on two variables: the penetration rate of devices in the traffic flow (P) and their data sampling frequency (z). Referring to data composition as the way in which a certain number of observations is collected, in terms of P and z, we need to understand the relation between the amount and composition of data collected and the accuracy achieved in traffic state estimation. This was accomplished through an in-depth analysis of two datasets of vehicle trajectories on freeways. The first dataset consists of trajectories over a real freeway, while the second dataset is obtained through microsimulation. Hypothetical scenarios of data sent by equipped vehicles were created, based on the composition of data collected. Different values of P and z were used, and each unique combination defined a specific scenario. Traffic states were estimated through two simple methods and a more advanced one that incorporates traffic flow theory. A measure to quantify the data to be collected was proposed, based on travel time, number of vehicles, penetration rate and sampling frequency. The error of this measure was below 6% for every scenario in each dataset. Also, increasing the amount of data reduced the variability of the data count estimation. The performance of the different estimation methods varied across datasets and scenarios. Since the same number of observations can be gathered with different combinations of P and z, the effect of data composition was analyzed (a trade-off between penetration rate and sampling frequency). Different situations were found. In some, an increase in penetration rate is more effective in reducing the estimation error than an increase in sampling frequency, considering an equal increase in observations.
In others, the opposite relationship was found. Between these two regions, an indifference curve was found. In fact, this curve is the solution to the optimization problem of minimizing the error given any fixed number of observations. As a general result, increasing sampling frequency (penetration rate) is more beneficial when the current sampling frequency (penetration rate) is low, independent of the penetration rate (sampling frequency).
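To illustrate how the same number of observations can arise from different data compositions, the following minimal sketch assumes that the expected count is the product of penetration rate, number of vehicles, average travel time and sampling frequency. The abstract only names these four variables; the product form, the function names and the numbers used are assumptions made for illustration, not the measure proposed in the article.

```python
def expected_observations(penetration, n_vehicles, travel_time_s, freq_hz):
    """Expected number of observations under an ASSUMED product form:
    (number of probes) x (reports sent by each probe while on the segment)."""
    n_probes = penetration * n_vehicles
    reports_per_probe = travel_time_s * freq_hz
    return n_probes * reports_per_probe


def equal_count_compositions(target, n_vehicles, travel_time_s, penetrations):
    """For each penetration rate P, return the sampling frequency z (Hz)
    that yields `target` observations under the assumed product form."""
    return [(p, target / (p * n_vehicles * travel_time_s)) for p in penetrations]


if __name__ == "__main__":
    # Hypothetical segment: 1000 vehicles, 60 s average travel time.
    for p, z in equal_count_compositions(600.0, 1000, 60.0, [0.05, 0.1, 0.2]):
        print(f"P = {p:.2f}, z = {z:.3f} Hz (one report every {1 / z:.0f} s)")
```

Each (P, z) pair printed by the example produces the same expected count, which is the situation in which the composition of the data, rather than its amount, drives any difference in estimation error.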
English introduction
The arrival of mobile internet and the incorporation of satellite navigation systems (such as GPS and GLONASS) into smartphones have enabled a new approach for monitoring and estimating traffic, since these devices are able to collect and send traffic data. Unlike infrastructure-based technology, the use of smartphones as traffic sensors provides good spatial and temporal coverage of the transportation network at a relatively low cost, since the cost of the devices is borne by the user and information systems are inexpensive (Sanwal and Walrand, 1995). For these reasons, traffic monitoring and traffic state estimation based on data provided by probe vehicles have been analyzed in the context of freeways (Herrera and Bayen, 2007; Herrera and Bayen, 2010; Herrera et al., 2010; Work et al., 2008; Nanthawichit et al., 2003) and urban networks (Sun and Ban, 2013; Herring et al., 2010; Hofleitner et al., 2012; Ban et al., 2011; Jenelius and Koutsopoulos, 2013; Feng et al., 2014). In spite of the clear benefits this technology provides, there are some inherent downsides in using it to monitor or estimate traffic (Herrera et al., 2010). First, not every vehicle on the road acts as a probe vehicle, because not all of them are equipped with this technology and able to send data. Second, energy consumption rises because the navigation system is turned on all the time. Moreover, the frequency with which data are sent (sampling frequency) also affects energy consumption, as analyzed by Frank et al. (2012). The sampling frequency also relates to internet data consumption, which may carry a cost for the user or count against a monthly data limit. Finally, because of the nature of this technology, it could be privacy-invasive. This is an important topic to address in order to achieve a high participation rate in the system (Rose, 2006).
Because of the previous downsides, the amount of data received from probe vehicles is limited, and essentially depends on:
(i) The penetration rate of devices in the traffic flow: that is, how many vehicles act as probe vehicles and send data. This rate depends not only on the penetration of this technology in the population, but also on people's willingness to share their data. Smartphone adoption has been increasing steadily across the globe, with smartphones estimated to account for 63% of total new handset sales in the USA and 37% globally in 2014 (GSMA, 2011). In Chile, smartphone connections increased by 50% (from 2.6 to 4 million) during 2012, reaching a penetration rate of 22.8 smartphones per 100 inhabitants (Subsecretaría de Telecomunicaciones, 2013). At the same time, the popular crowdsourcing application and social network Waze has 1.9 million users in Chile (Ministerio de Transportes y Telecomunicaciones, 2013).
(ii) The strategy and frequency of data sampling: that is, the way in which data are sampled and how frequently each probe vehicle sends data. For the reasons stated before, probe vehicles do not send data continuously. The way in which they send data (i.e. the way data are sampled) is called the sampling strategy, and there are at least three different strategies (Herrera, 2009; Mohan et al., 2008; Nanthawichit et al., 2003): temporal, spatial and event-based sampling. The temporal sampling strategy sends data periodically in time. The time elapsed between two consecutive packets of data is called the sampling interval. Spatial sampling sends data at specific geographical locations that might be of interest. Examples of this are the Virtual Trip Lines or VTLs (Hoh et al., 2012), which also help to preserve privacy. Finally, in event-based or triggered sampling, data are sent after a certain action is performed, such as braking (detected by an accelerometer) or a horn sound (detected by a microphone).
This article will focus on the effects of the penetration rate and sampling frequency on traffic state estimation, specifically on reconstructing the velocity field. Assuming temporal sampling, penetration rates and sampling intervals can take different values, yielding different total amounts of data sent. It is of interest to thoroughly understand the effect of different penetration rates and sampling intervals on the accuracy of traffic state estimates. Herrera (2009) presents preliminary evidence suggesting that sampling more vehicles sporadically is preferable to sampling fewer vehicles more frequently. However, as stated by the author, further analysis needs to be conducted in order to be conclusive in this matter. Trade-offs between inductive loops and probe vehicles have been addressed previously for the purpose of travel time estimation (Mazaré et al., 2012), but no analysis of the trade-off between penetration rate and sampling frequency (the inverse of the sampling interval) was found in the literature. Studies have used different penetration rates and sampling frequencies (Piccoli et al., 2012), but using relatively small sampling intervals (from 1 to 3 s) and focusing on error degradation and model calibration, rather than on the trade-off between variables. The method used to estimate traffic states is of relevance. Its performance depends on the amount of data sent by probe vehicles and also on how these data are incorporated in the model. To the best of our knowledge, only one study investigates the benefits of using advanced traffic models to estimate traffic states in contrast to simpler models in relation to penetration rate, but not sampling frequency (Work et al., 2008, use ten VTLs and vary only the penetration rate). It is expected that above a certain amount of data provided by probe vehicles, advanced models would not be needed to produce accurate estimates.
It would also be useful to quantify the performance of the method for scenarios in which the amount of data is below the previous threshold. Under the hypothesis that the proportion of probe vehicles in the traffic flow and the frequency with which they send data influence the quality of traffic state estimates, it is of interest to examine the relation between the amount and composition of data collected and the way traffic states are estimated. Specifically, this article has the following goals:
• Define a measure to quantify the amount of data that is collected as a function of the penetration rate of sensors and the sampling frequency.
• Compare the estimation performance of three traffic state estimation methods for several scenarios with different amounts and compositions of data.
• Obtain a metric to infer the error for the three traffic state estimation methods, as a function of the penetration rate of sensors and the sampling frequency.
• Study the trade-off between penetration rate and sampling frequency in traffic state estimation, since the same quantity of data can be collected from different penetration rates and sampling frequencies.
These goals will be addressed through the systematic analysis of two vehicle trajectory datasets. The first is the Next Generation Simulation dataset (FHWA, 2007), commonly known by its acronym NGSIM, of the US 101 Highway. The second dataset corresponds to trajectories generated by microsimulation. The rest of the article is organized as follows: Section 2 presents the proposed methodology to analyze the vehicle trajectory datasets and to achieve the goals. In Section 3, the methodology is applied to the previous datasets and results are analyzed. Finally, the main conclusions and future perspectives of this investigation are presented in Section 4.
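The idea of building scenarios from a trajectory dataset under temporal sampling can be sketched as follows. This is only an illustration under stated assumptions: the trajectory layout (a dictionary mapping a vehicle id to time-ordered `(t, x, v)` tuples), the function name `build_scenario` and the rule that each probe reports at its first record and then once per sampling interval are all hypothetical, and the article's actual procedure may differ.

```python
import random


def build_scenario(trajectories, penetration, interval_s, seed=0):
    """Return the observations probe vehicles would send in one scenario.

    trajectories: dict {vehicle_id: [(t, x, v), ...]} sorted by time t (assumed layout).
    penetration:  fraction of vehicles acting as probes (P).
    interval_s:   sampling interval in seconds (the inverse of the frequency z).
    """
    rng = random.Random(seed)  # seeded so the scenario is reproducible
    vehicle_ids = sorted(trajectories)
    n_probes = round(penetration * len(vehicle_ids))
    probes = rng.sample(vehicle_ids, n_probes)

    observations = []
    for vid in probes:
        next_report = None
        for t, x, v in trajectories[vid]:
            # Report at the first record, then once every interval_s seconds.
            if next_report is None or t >= next_report - 1e-9:
                observations.append((vid, t, x, v))
                next_report = t + interval_s
    return observations


if __name__ == "__main__":
    # Toy data: 10 vehicles at a constant 20 m/s, one record per second for 60 s.
    toy = {vid: [(t, 20.0 * t, 20.0) for t in range(60)] for vid in range(10)}
    obs = build_scenario(toy, penetration=0.5, interval_s=10, seed=1)
    print(len(obs), "observations from", len({o[0] for o in obs}), "probes")
```

Running the function over a grid of `penetration` and `interval_s` values gives one scenario per unique combination, which is the kind of input a traffic state estimation method would then be evaluated on.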
English conclusion
An in-depth analysis of different scenarios of data collected from probe vehicles and their relation to traffic state estimation has been made. Two vehicle trajectory datasets have been used to obtain results. The first dataset uses real vehicle trajectories travelling along a short stretch of freeway, where high vehicular demand creates congestion. The second dataset uses simulated vehicle trajectories travelling along a longer stretch of freeway, in which a 20-min lane closure generates congestion. Three traffic state estimation methods were used. The first two are simpler and easier to implement (HA and IA), and the third one (FBH-v) is a heuristic based on traffic theory. A metric to determine the number of observations provided by probe vehicles was proposed. This metric is a function of the penetration rate, the sampling frequency, the number of vehicles and the average probe travel time. Results showed that the number of observations to be collected can be effectively estimated, with errors below 6% for both datasets. This metric is key to the study of these variables. The variability of the estimation depends on the number of observations estimated: scenarios with low penetration rate and sampling frequency exhibit higher variability than scenarios with high penetration rate and sampling frequency. In terms of the estimation performance of the three traffic state estimation methods, different results were found. These differences are mainly related to the way in which each method works and to the network's features. For the NGSIM dataset, FBH-v was always superior in terms of the estimation error. Compared to the IA method, this superiority is more pronounced in scenarios with an intermediate number of observations. The HA method yielded the worst results. For the second dataset, results were different. The IA method estimated poorly when probe observations were few, because boundary conditions are then not representative of the state on the road.
Estimation improves greatly when observations increase and tends to converge when the data count is high. On the other hand, for scenarios with a small number of observations the HA method outperformed the FBH-v method. When the number of observations is high, we recommend FBH-v, even though differences are not considerable. The performance of the methods is also related to traffic conditions on the road. If traffic conditions change frequently (i.e. several waves travelling the segment, as in the NGSIM case), the HA method performs poorly in comparison to the other two methods. If the opposite happens (as with the microsimulation dataset), the performance of the HA method increases and is comparable to that of the FBH-v method. Considering both datasets used, the overall success rate of the FBH-v method is superior to that of the other two methods, suggesting the benefit of using a model that incorporates traffic theory. Fusing data provided from other sources, such as geo-referenced social networks, can be useful to easily identify incidents. All three methods would benefit and their error would decrease, especially the FBH-v method, which propagates congestion through traffic theory. Based on economic theory, a function for inferring an expected estimation error was proposed. This metric allows a systematic analysis of changes in penetration rate or sampling frequency and their effect on the estimation performance. This is important since the number of observations, by itself, is not a good predictor of the method performance. The trade-off between penetration rate and sampling frequency was analyzed through the elasticities $E_{\varepsilon,P}$ and $E_{\varepsilon,z}$, and the ratio $\Delta = E_{\varepsilon,P} / E_{\varepsilon,z}$. The value of the ratio Δ indicates the convenience of increasing P or z in order to improve the estimation. If Δ is greater (or less) than one, increasing the penetration rate (or sampling frequency) is Δ (or 1/Δ) times more effective at reducing the error than increasing the sampling frequency (or penetration rate).
If Δ equals one, changes to either variable yield the same error reduction. This locus is defined as an indifference curve. All these different cases were present in each dataset and estimation method. The indifference curve is also the solution to the optimization problem of minimizing the error given any fixed number of observations, giving as a result an optimal combination of penetration rate and sampling interval. These results are useful if gathering or processing data is restricted by cost. This is the case, for example, if hardware capacity cannot process more than a certain number of observations, or if a traffic management center has to pay users for data sampling (and the cost is related to the number of observations sent). In the future, it would be interesting to carry out a similar analysis for different sampling strategies, such as event-based or triggered sampling. Specifically, a sampling strategy in which data are sent by probe vehicles when the acceleration (deceleration) is greater (lesser) than a certain threshold. We believe these data are at least as valuable as data sent by a temporal sampling strategy, because this information is more closely related to shockwave identification. Therefore, each observation could provide more information related to traffic state estimation. This is important if observations are costly and are paid for per observation sent.
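The role of the ratio Δ can be illustrated numerically. The sketch below uses a purely hypothetical error surface, ε(P, z) = a/P + b/z, chosen only because it decreases in both variables; it is not the error function proposed in the article, and the coefficients `a` and `b` are made-up values. Elasticities are approximated by central finite differences on relative changes.

```python
def error_model(p, z, a=0.02, b=0.01):
    """HYPOTHETICAL estimation error as a function of P and z.
    Not the article's function; used only to exercise the elasticity ratio."""
    return a / p + b / z


def elasticity(f, p, z, wrt, h=1e-6):
    """Elasticity of f with respect to P or z: (df/dx) * (x / f),
    approximated with a central difference on a relative step h."""
    if wrt == "P":
        up, down = f(p * (1 + h), z), f(p * (1 - h), z)
    elif wrt == "z":
        up, down = f(p, z * (1 + h)), f(p, z * (1 - h))
    else:
        raise ValueError("wrt must be 'P' or 'z'")
    return (up - down) / (2 * h * f(p, z))


def delta(f, p, z):
    """Ratio of elasticities: Delta = E_{eps,P} / E_{eps,z}."""
    return elasticity(f, p, z, "P") / elasticity(f, p, z, "z")


if __name__ == "__main__":
    for p, z in [(0.05, 0.1), (0.2, 0.1), (0.4, 0.05)]:
        d = delta(error_model, p, z)
        if abs(d - 1) < 1e-3:
            verdict = "indifferent"
        else:
            verdict = "raise P" if d > 1 else "raise z"
        print(f"P = {p}, z = {z}: Delta = {d:.3f} -> {verdict}")
```

For this assumed surface, Δ reduces analytically to (a/P)/(b/z), so the locus Δ = 1 is the line a/P = b/z. If the number of observations is taken to be proportional to the product P·z (also an assumption), minimizing this error at a fixed product leads to the same condition, which mirrors the statement that the indifference curve solves the constrained minimization problem.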