# Development of an adapted Markov chain modelling heuristic and simulation framework in the context of transportation research

Paper code | Publication year | English paper | Persian translation | Word count |
---|---|---|---|---|
11422 | 2005 | 13-page PDF | Available to order | 8060 words |

**Publisher:** Elsevier - Science Direct

**Journal:** Expert Systems with Applications, Volume 28, Issue 1, January 2005, Pages 105-117

#### English abstract

This paper develops and evaluates the implementation of an adapted Markov Chain modelling heuristic and simulation framework in the context of transportation research. In order to gain insight into the travel patterns of individuals and into the decision making that people use to choose a transport mode, a new methodology is presented to extract knowledge from data. The presented approach shows two ways to store the sequential information (sequences of activities and travel) that is typically incorporated in activity diary data. The approach is novel, especially with respect to storing information in ‘codebooks’, a term introduced to reflect that the stored information represents the combinations of activities that typically occur sequentially in a person's diary. In order to test the validity of the heuristic, new data are simulated and compared with the original observed data. The new data are generated by means of Monte Carlo simulation, and the empirically derived information from the codebooks is used as a constraint in the simulations. In order to evaluate the simulated diaries thoroughly, different performance indicators were considered, using pattern-, trip- and activity-level measures. The paper shows that the results are satisfactory and that the developed framework holds considerable promise, both for gaining insights into behavioural decision making and for simulating activity diary data that can assist practitioners and researchers in the calibration of travel demand models.

#### English introduction

For the last decade, activity-based transportation models have set the standard for modelling travel demand. The most important characteristic of these models is that travel demand is derived from the activities that individuals and households need or wish to perform. The main advantage is that travel no longer has an isolated existence in these models; it is perceived as a way to perform activities and to realize particular goals in life. The aim of this paper is not to describe these models in detail, but it is important to realize that the increased complexity of an activity-based transportation model is also reflected in the data required to estimate such a model. Indeed, the collected travel information needs to be directly associated with the activities in which the respondent reports being engaged. This implies that data requirements are considerably higher. Fortunately, this wealth of information also has the potential to yield a better understanding of the behavioural mechanisms and principles that individuals and households use to organize activities and to travel. The aim of this paper is therefore to present an adapted Markov Chain modelling heuristic that can satisfy both needs: it should be possible both to (i) capture and increase insight into behavioural decision making and (ii) provide a simulation tool for activity diary data that can assist practitioners and researchers in the calibration of activity-based transportation models. The work in this paper is a thorough extension of previously published work (Janssens, Wets, Brijs, & Vanhoof, 2004). Other approaches that deal with scarcity of data and provide insights into behavioural decision making are described in Arentze et al. (2001), Greaves and Stopher (2000), and Stopher, Greaves, and Bullock (2003).
The idea of an adapted Markov Chain modelling heuristic arose after it became clear that a straightforward application of the available Markov Chain modelling theory is infeasible in this application area. Typically, data are collected by means of activity diaries, since it has been repeatedly shown (Koppelman, 1981; Robinson, 1985) that traditional travel surveys in particular under-report off-peak, non-home-based trips of short duration. Stopher (1992), Clarke et al. (1981) and Niemi (1993) have argued that activity diaries outperform travel surveys in this respect. Moreover, it is claimed (Clarke et al., 1981) that although only out-of-home activities generate traffic, collecting in-home activities also seems warranted, since it corresponds most closely to the way people naturally store information and plan activities. It is assumed in this paper that each diary consists of a set of correlated successive observations of a random variable. To this end, a discrete random variable Xt is considered, taking values in the finite set {1,…,m}, where each value in this set represents an activity that occurs in a person's diary. Travelling is also considered an activity; in this case, however, the transport mode is added as an additional attribute. Our goal is to simulate (predict) the value taken by Xt as a function of the values taken by previous observations of this variable. Markov chains are probabilistic models that are commonly used to model this type of dependency in data. However, as noted above, their application raised additional difficulties in this research context (discussed later in the paper). For this reason, a new adapted Markov Chain modelling heuristic was developed.
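
To make the setting concrete, the following is a minimal sketch of how a plain first-order Markov chain could be estimated from activity diaries, i.e. the baseline that the paragraph above says runs into difficulties. It is not the paper's adapted heuristic. The function name `estimate_transitions`, the toy diaries, and the labelling of travel as `travel:<mode>` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def estimate_transitions(diaries):
    """Estimate first-order probabilities P(X_t = j | X_{t-1} = i) by counting."""
    counts = defaultdict(Counter)
    for diary in diaries:
        # Count every pair of consecutive activities within one diary.
        for prev, cur in zip(diary, diary[1:]):
            counts[prev][cur] += 1
    # Normalise each row of counts into a probability distribution.
    return {
        prev: {cur: n / sum(nxt.values()) for cur, n in nxt.items()}
        for prev, nxt in counts.items()
    }


# Hypothetical toy diaries; travel is an activity with the mode as an attribute.
diaries = [
    ["home", "travel:car", "work", "travel:car", "home"],
    ["home", "travel:bike", "shop", "travel:bike", "home"],
    ["home", "travel:car", "work", "travel:car", "shop", "travel:car", "home"],
]

P = estimate_transitions(diaries)
for prev, row in P.items():
    print(prev, row)
```

Each row of `P` sums to one. A model like this conditions only on the immediately preceding activity, which is the low-order extreme of the dependence on previous observations described above.
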
The presented approach shows two different ways to store the sequential information (sequences of activities) in ‘codebooks’, a term introduced to reflect that the stored information represents low- and high-order combinations of activities that typically occur sequentially in one particular diary. In the limit, low-order combinations assume that the present value taken by Xt is entirely explained by the first lag (Activity t−1), while high-order combinations assume that the correct present value of Xt can only be explained by the last k−1 observations (Activity t−1, Activity t−2, …, Activity t−(k−1)), in which k represents the length of the diary. However, high-order combinations tend to overfit the sample dataset, while low-order combinations extract knowledge that is insufficiently tuned to the sample data, so that the resulting insights into activity-travel patterns are unreliable. This trade-off is explored in the paper. Subsequently, new data are simulated based on the sequential information stored in the codebooks. The simulated data are then compared with the originally observed data. The aim of this simulation framework is two-fold: (i) it serves as a validation measure for the developed heuristic and (ii) it is a simulation tool for activity diary data that can assist practitioners and researchers in the calibration of activity-based transportation models.
The remainder of the paper is organized as follows. Section 2 briefly introduces Markov Chains in order to provide the necessary theoretical background. Section 3 elaborates on the problems that were encountered when using Markov chains and describes the reasons for developing the new heuristic. Section 4 introduces the core of the algorithm. Section 5 describes the procedure used to simulate new data from the information stored in the codebooks; a controlled Monte Carlo simulation technique is introduced. In Section 6, the developed method is tested and empirically validated by means of pattern-, trip- and activity-level performance indicators. Conclusions and topics for future research are given in Section 7.
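
The codebook and simulation ideas outlined in this introduction can be illustrated with a rough sketch. It assumes that a codebook is simply a table mapping each observed run of `order` consecutive activities to the counts of the activity that followed, and that simulation draws the next activity at random in proportion to those counts. The excerpt does not specify the paper's actual codebook layout or its controlled Monte Carlo procedure, so `build_codebook`, `simulate_diary`, the `<start>`/`<end>` markers, and the toy diaries are all hypothetical.

```python
import random
from collections import Counter, defaultdict

START, END = "<start>", "<end>"  # hypothetical padding markers, not from the paper


def build_codebook(diaries, order):
    """Map each run of `order` consecutive activities to next-activity counts.

    `order` must be at least 1; larger values store higher-order combinations.
    """
    codebook = defaultdict(Counter)
    for diary in diaries:
        seq = [START] * order + list(diary) + [END]
        for t in range(order, len(seq)):
            codebook[tuple(seq[t - order:t])][seq[t]] += 1
    return codebook


def simulate_diary(codebook, order, rng, max_len=50):
    """Draw one diary by sampling each next activity from the codebook counts."""
    history = [START] * order
    diary = []
    while len(diary) < max_len:  # guard against very long low-order walks
        # Every context reached here was observed with a successor, because it
        # is the tail of an observed (context, next activity) combination.
        options = codebook[tuple(history[-order:])]
        activities = list(options)
        weights = [options[a] for a in activities]
        nxt = rng.choices(activities, weights=weights)[0]
        if nxt == END:
            break
        diary.append(nxt)
        history.append(nxt)
    return diary


# Hypothetical toy diaries; travel is an activity with the mode as an attribute.
diaries = [
    ["home", "travel:car", "work", "travel:car", "home"],
    ["home", "travel:bike", "shop", "travel:bike", "home"],
    ["home", "travel:car", "work", "travel:car", "shop", "travel:car", "home"],
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
codebook = build_codebook(diaries, order=2)
simulated = [simulate_diary(codebook, 2, rng) for _ in range(200)]
print(Counter(tuple(d) for d in simulated).most_common(3))
```

Because every sampled step comes from an observed combination, the simulated diaries contain only activity transitions that occur in the observed data. Raising `order` makes the simulated diaries reproduce the sample more closely (the overfitting risk noted above), while `order=1` reduces the sketch to a first-order chain.
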
