Towards testing auditory–vocal interfaces and detecting distraction while driving: A comparison of eye-movement measures in the assessment of cognitive workload
Article code | Publication year | Pages in English article |
---|---|---|
38806 | 2015 | 12-page PDF |
Publisher: Elsevier - Science Direct
Journal : Transportation Research Part F: Traffic Psychology and Behaviour, Volume 32, July 2015, Pages 23–34
English abstract
Abstract Recently, there has been a growing need among researchers to understand the problem of cognitive workload induced by auditory–verbal–vocal tasks while driving in realistic conditions. This is due to the fact that we need (a) valid methods to evaluate in-vehicle electronic devices using voice control systems and (b) experimental data to build more reliable driver state monitoring systems. In this study, we examined the effects of cognitive workload induced by the delayed digit recall task (n-back) while driving. We used a high-fidelity driving simulator and a highway scenario with moderate traffic to study eye movements in realistic driving conditions. This study included 46 participants, and the results indicate that a change in pupil size is most sensitive for measuring changes in cognitive demand in auditory–verbal–vocal tasks. Less sensitive measures included changes in fixation location and blink rate. Fixation durations and the driving performance metrics did not provide sensitive measures of graded levels of cognitive demand.
English introduction
Introduction
Every day, approximately 3500 people are killed in road accidents globally. This amounts to more than 1.2 million deaths on the world’s roads per year. Moreover, millions of people are injured and very often disabled. The World Health Organization (World Health Organization, 2013) estimated that road traffic injuries will become one of the leading causes of death in the next decade. Many of these accidents emanate from human error. However, driving itself is an extremely complex activity. It can be evaluated from the perspective of different driving behavior theories and models (Groeger, 2002, Michon, 1985 and Reason et al., 1990). This is because driving outcomes result from the interaction of different factors. Nevertheless, researchers have always been interested in the specific human factors underlying traffic accidents. An analysis using the 100-car Naturalistic Driving Study Data and a data set of 2,000,000 miles conducted by the National Highway Traffic Safety Administration (NHTSA) indicated that inattention and distraction are major factors contributing to near-crash situations and road incidents (Klauer, Dingus, Neale, Sudweeks, & Ramsey, 2006). This phenomenon is not a new problem in traffic safety science. However, due to the presence of new types of distractors, there is a growing need among researchers to better understand this cognitive mechanism. In the scope of this study, the theoretical background is closest to the concept of diverting attention toward competing activities and the resulting cognitive workload (Regan, Hallett, & Gordon, 2011). Cognitive workload itself is a complex construct. Apart from the studies on attention, it has always been central within the working memory domain. This might be because there may be one common discrete resource mediating working memory storage and the capacity limits of attention (Ester, Vogel, & Awh, 2012).
However, in applied psychology and human factors studies, cognitive workload usually refers to the concept of imposing demands on humans’ limited mental resources. These resources are limited, and when the limit is exceeded, task performance decreases significantly. Cognitive workload is often studied in one of two paradigms, single-task demand or dual-task demand. Multiple tasks are related to the concept of multiple resource theory, which deals with time-sharing efficiency when performing concurrent tasks (Wickens, 2002 and Wickens et al., 2008). This theory indicates different dimensions of information processing related to the sensory modalities, codes, responses, and stages underlying dual-task performance. Drivers are heavily exposed to multitasking. When we are driving, we perform concurrent tasks. The act of driving is mainly processed in the visual–spatial–manual pathway, but it can also be affected by engaging in a secondary auditory–verbal–vocal task, such as a phone conversation (Ho & Spence, 2008). There is also a second use case for this type of information processing. The evolution of human–machine interfaces toward auditory interfaces in recent years has affected the design of in-vehicle devices. In some studies, these interfaces are labeled as easier and more satisfying to use than visual interfaces (Sodnik, Dicke, Tomazič, & Billinghurst, 2008). However, this evolution has also raised questions about road safety implications. Lee, Caven, Haake, and Brown (2001) reported that speech-based systems simulating the work of e-mail significantly increase reaction time and introduce a major cognitive workload. Concerns over workload induced by voice-based interactions are also reflected in the study by Reimer and Mehler (2013), where the authors concluded that these interactions can also lead to increased visual demand. There are several ways to measure cognitive workload, both objective and subjective.
Among the most popular with researchers are still subjective assessments conducted using the NASA task load index (NASA-TLX) (Hart & Staveland, 1988) and the Subjective Workload Assessment Technique (SWAT) (Reid & Nygren, 1988). Their popularity is due to their low cost, non-intrusiveness, ease of use, known sensitivity, and validity (Zhang & Luximon, 2005). On the other hand, there exists a group of objective techniques aimed at measuring cognitive workload, which seem to be more applicable to the real-time monitoring of the driving activity. The first group of objective measurement techniques can be classified as performance indicators, and includes steering performance metrics, lane keeping, speed management, vehicle following, response time, and steering grip (Ariën et al., 2013 and Minin et al., 2012). The second group includes physiological and neurological measures. In the context of multitasking while driving, heart rate and skin conductance showed consistent patterns in detecting an increased workload (Mehler, Reimer, Coughlin, & Dusek, 2009). Another good indicator of workload in this group can be the amplitude of the P3b component in electroencephalography (EEG) analysis (Lei, Welke, & Roetting, 2009). Other activation patterns in the human brain that correspond to the working memory load and visual attention load can be detected when using functional magnetic resonance imaging (fMRI) (Tomasi, Chang, Caparelli, & Ernst, 2007). However, it should be noted that this technique is not easily applicable to driving studies, although its use is not impossible (Hunga et al., 2014). Given developments in the automotive industry and implementations in other related industries, such as those that involve earth-moving equipment, eye-tracking can be considered a very promising direction for real-time monitoring of the operator’s state.
Within the eye-tracking domain, researchers use different variables to measure cognitive workload, such as changes in the diameter of the pupil, blink rate, blink duration, saccadic movements, dwell rate, fixation durations, fixation rate, and fixation locations. These measures are defined by standardization bodies and researchers (e.g., ISO, 2002; National Highway Traffic Safety Administration, 2012; Holmqvist, 2011). The size of the pupil is one of the most commonly used workload indicators, and it has long been recognized in a variety of cognitive tasks, such as arithmetic problems and linguistic tasks. It is usually calculated as a percentage change in pupil diameter because pupil diameter varies among humans. Apart from cognitive activity, pupil size is affected by lighting conditions. In stable lighting conditions, pupil size increases with an increasing cognitive workload (Iqbal et al., 2004 and Tsai et al., 2007). For instance, the pupil diameter in highly demanding arithmetic tasks can change by more than 20%, but the percentage of this change varies between different types of tasks and cognitive efforts (Holmqvist, 2011). Results by Recarte and Nunes (2000) indicate that pupil size can change during the spatial-imagery and verbal tasks performed while driving. However, Demberg, Sayeed, Castronovo, and Müller (2013) conclude that there are more reliable and more sensitive pupil measures than simple changes in pupil size, for example, the Index of Cognitive Activity (ICA) (Marshall, 2000 and Marshall, 2007). This complex measurement system calculates rapid and small pupil dilations, and has proved to be a good indicator of cognitive workload in linguistic tasks and digit-span tasks in simulated driving conditions (Demberg et al., 2013 and Schwalm et al., 2008). Blink rate and blink duration are widely debated measures of cognitive workload.
Both measures seem to be correlated with the number of errors in visual tasks (Van Orden, Jung, & Makeig, 2000). Tsai et al. (2007) found that blink frequency increases in the dual-task paradigm among drivers when the second task is an auditory task. In the same study, the effect on blink duration was not statistically significant. Other experimental results indicate that blink duration and blink frequency are highly dependent on the visual demand imposed by a single task or two concurrent tasks (Recarte, Pérez, Conchillo, & Nunes, 2008). In their flight simulator study, Veltman and Gaillard (1996) found that blinking is not related to the cognitive workload when operators have to process a significant portion of visual information in a limited amount of time. Therefore, the extent to which a task requires visual search should be considered when using blink rate as an indicator of general cognitive workload. Fixation analysis is also applied to the evaluation of mental effort. For instance, shorter fixation durations can be indicative of higher stress and an associated high cognitive workload (Holmqvist, 2011 and Miura, 1990). As with the blink measures, this relationship can vary across different tasks. The complexity of the visual scene triggers shorter fixations and increased visual inspection. Recarte and Nunes (2000) found that spatial-imagery tasks increased fixation durations; however, they did not find an effect of verbal tasks on the same measure. In the same study, the authors found that the horizontal and vertical visual fields decrease while performing mental tasks. Hence, fixation measures taken while driving should be interpreted in relation to driving experience and the characteristics of the given scenario (e.g., road classification). Novice drivers have different fixation patterns than experienced drivers, which has an impact on horizontal visual search processes (Crundall et al., 2012).
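The narrowing of the horizontal and vertical visual fields reported by Recarte and Nunes (2000) is often quantified as the spread of fixation locations along each axis. The following is a minimal Python sketch of one such measure, assuming the standard deviation as the spread statistic (the Results section of this paper uses the standard deviation of fixation location). The function name `gaze_spread`, the coordinate units, and the sample fixations are hypothetical illustrations, not the authors' code or data.

```python
from statistics import stdev

def gaze_spread(fixations):
    """Return (horizontal, vertical) spread of fixation locations.

    `fixations` is a list of (x, y) positions, one per fixation, in any
    consistent unit. Spread is the sample standard deviation per axis
    (an assumption; the source does not say sample vs. population).
    """
    xs = [x for x, _ in fixations]
    ys = [y for _, y in fixations]
    return stdev(xs), stdev(ys)

# Hypothetical fixation positions for two conditions (made-up values)
baseline = [(100, 50), (300, 90), (500, 130), (700, 170)]
loaded = [(350, 100), (400, 110), (450, 120), (500, 130)]

h_base, v_base = gaze_spread(baseline)
h_load, v_load = gaze_spread(loaded)
print(f"baseline: horizontal={h_base:.1f}, vertical={v_base:.1f}")
print(f"loaded:   horizontal={h_load:.1f}, vertical={v_load:.1f}")
```

In this made-up example the loaded condition has a smaller spread on both axes, which is the pattern a narrowed visual field would produce.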
There are two practical implications of this study. First, traffic safety agencies are becoming more interested in the evaluation of speech-based interfaces. For example, the NHTSA plans to issue Driver Distraction Guidelines for auditory–vocal human–machine interfaces. Therefore, there is a need for more data from experiments with varied driving conditions to validate the methodology to assess these types of interfaces. Second, automotive manufacturers are introducing more sophisticated methods to assess the mental state of drivers, especially drowsiness and cognitive workload. Researchers are continuously developing quantitative algorithms that can classify operator workload more accurately. On-board systems will be able to perform complex analysis of driving behavior based on driving performance, psychophysiological arousal and eye-tracking measures (Yang, Reimer, Mehler, & Dobres, 2013). Existing literature still brings mixed results when assessing the usefulness of eye-tracking variables. In some cases, findings were based on a very small sample size and in very artificial driving situations. In the present paper, we want to verify selected measures under conditions of changing demands imposed by the auditory–verbal–vocal task.
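The introduction notes that pupil size is usually expressed as a percentage change because absolute pupil diameter varies between people. A minimal Python sketch of that normalisation follows. It assumes the reference value is the participant's mean diameter over the whole run (the Results section describes the baseline condition relative to "average size during the entire experimental run"); the function name `pupil_percent_change` and the diameters in millimetres are hypothetical.

```python
from statistics import mean

def pupil_percent_change(diameters_by_condition):
    """Percentage change of each condition's mean pupil diameter
    relative to the participant's mean over all samples in the run.

    `diameters_by_condition` maps a condition label to a list of
    pupil-diameter samples for one participant.
    """
    # Reference: mean over every sample from every condition (assumption)
    all_samples = [d for samples in diameters_by_condition.values()
                   for d in samples]
    overall = mean(all_samples)
    return {
        condition: 100.0 * (mean(samples) - overall) / overall
        for condition, samples in diameters_by_condition.items()
    }

# Hypothetical samples (mm) for one participant
participant = {
    "baseline": [3.8, 3.8],
    "0-back": [3.9, 3.9],
    "1-back": [4.1, 4.1],
    "2-back": [4.2, 4.2],
}

for condition, change in pupil_percent_change(participant).items():
    print(f"{condition}: {change:+.2f}%")
```

Because each participant is compared with their own average, a person with naturally large pupils and a person with naturally small pupils contribute on the same scale.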
English conclusion
3. Results
We used repeated-measures analysis of variance (ANOVA) on the dependent variables, with the Huynh–Feldt correction when the sphericity assumption was violated and the Bonferroni correction for post hoc multiple comparisons. The Bonferroni correction was used because of the small number of groups (four or fewer). The eta-squared measure (η²) was used to assess the effect size for group mean differences. All analyses were conducted in the PASW 21 Statistical Package. All descriptive statistics for each eye-movement variable are presented in Table 1.
3.1. Task characteristic analysis
In the first part of the analysis, we wanted to find differences in task characteristics between n-back levels. In this part, we referred to the “percentage of correct answers” in the n-back tasks and the subjective assessment of task workload measured by the NASA-TLX questionnaire. A repeated-measures ANOVA on the percentage of correct answers revealed a general effect across the n-back tasks, F(2, 90) = 17.82; p < 0.01, η² = 0.26 (see Fig. 3). Post hoc comparisons showed differences in the percentage of correct answers between 0-back and the other two conditions, 1-back and 2-back. However, only a trend was found between the 1-back and 2-back runs at the assumed level of significance (p < 0.05), which can indicate that both tasks were similar in difficulty for participants in terms of generating correct digits while driving. A repeated-measures ANOVA on the subjectively assessed mean task workload revealed a significant main effect across n-back experimental conditions, F(1.24, 55.69) = 69.74; p < 0.01, η² = 0.61 (see Fig. 4). As the next step of the analysis, we compared the simple effects of the mean task workload at each n-back level, which showed that all three conditions significantly differed from each other (p < .001). The mean workload increased with task demand.
First, the 0-back task (M = 21.5) was assessed as easier than the 1-back (M = 28.65) and 2-back (M = 46.17) conditions. Moreover, the difference between the subjectively assessed moderate cognitive workload level (1-back) and the high cognitive workload level (2-back) was also significant. These results showed that the assessed difficulty level and the number of errors increased with n across the 0-back, 1-back, and 2-back conditions. This is in line with previous studies on n-back tasks (Jaeggi et al., 2010) (see Table 2).
Fig. 3. Influence of the different levels of n-back tasks on the percentage of correct answers generated in the secondary task.
Fig. 4. Influence of the different levels of n-back tasks on the subjectively assessed mean workload.
Table 2. Means and standard deviations for eye-movement variables in four conditions: (1) Primary task (baseline driving); (2) secondary task inducing minor cognitive workload (0-back); (3) secondary task inducing moderate cognitive workload (1-back); and (4) secondary task inducing major cognitive workload (2-back).
Measurement | Baseline driving M | Baseline driving SD | 0-back M | 0-back SD | 1-back M | 1-back SD | 2-back M | 2-back SD |
---|---|---|---|---|---|---|---|---|
Horizontal search | 145.09 | 54.33 | 130.94 | 70.49 | 103.8 | 58.80 | 111.64 | 62.14 |
Vertical search | 117.52 | 36.18 | 92.11 | 37.11 | 71.03 | 36.58 | 73.97 | 30.26 |
Fixation duration | 305.87 | 65.15 | 310.42 | 86.44 | 342.78 | 116.4 | 339.92 | 119.21 |
Blink rate | 27.50 | 12.98 | 29.33 | 11.01 | 31.35 | 12.83 | 32.85 | 11.42 |
Change in pupil size | −5.28 | 3.30 | −2.38 | 2.43 | 2.55 | 3.20 | 5.10 | 3.94 |
3.2. Fixations analysis
The second part of the analysis included fixation location and fixation duration as dependent variables. Eye position corresponds to the spatial locus of cognitive processing and it can be affected by the cognitive workload (Irwin, 2004).
Different levels of cognitive workload induced by n-back conditions significantly impacted the standard deviation of fixation locations along the horizontal axis. We found a main task effect for the horizontal search, F(3, 135) = 8.24; p < 0.01, η² = 0.16 (see Fig. 5). Next, a comparison of simple effects was conducted, which identified significant differences between the baseline condition (driving without a secondary task) and the moderate and high levels of cognitive workload (1-back, 2-back). We did not find significant differences between baseline driving and the 0-back task, between the 1-back and 2-back conditions, or between the 2-back and 0-back conditions. A similar repeated-measures ANOVA was conducted on fixation locations along the vertical axis, revealing the main effect of the secondary task, F(2.66, 119.70) = 33.98; p < 0.01, η² = 0.43 (see Fig. 6). It appears that the vertical axis was a more sensitive measure of cognitive workload than the horizontal axis. Baseline driving (M = 117.52) was characterized by a significantly increased vertical search in comparison with the 0-back (M = 92.11), 1-back (M = 71.03) and 2-back (M = 73.97), p < .05. The same significant difference in the simple effects comparison was observed for 1-back and 2-back.
Fig. 5. Influence of the different levels of n-back task on the horizontal search indicated by the standard deviation of fixation location along the horizontal axis.
Fig. 6. Influence of the different levels of n-back task on the vertical search indicated by the standard deviation of fixation location along the vertical axis.
Next, we analyzed the duration of fixations, which is commonly associated with deeper cognitive processing.
However, it should be noted that there might be some exceptions (e.g., high and low arousal, experience, neurological impairments, and fast-moving stimulus) (Holmqvist, 2011). Although the main effect of the task was found for fixation durations, F(2.25, 101.33) = 3.66; p < 0.05, η² = 0.08, in simple effect comparisons with the Bonferroni test, we did not find differences between any pair of task conditions at the assumed level of significance, p < .05.
3.3. Pupillometry
The next part of our analysis used changes in pupil diameter. This popular indicator of cognitive workload proved to be a very sensitive measure of the increasing demand imposed by secondary tasks in stable lighting conditions. We calculated this measurement as a percentage of the individual change in each condition because pupil size varied among participants (between 3 and 5 mm). The main effect of the task was significant for the change in the pupil diameter, F(2.52, 101.36) = 71.31; p < 0.01, η² = 0.61. The next step of the analysis included simple effect comparisons. Our comparisons showed that the baseline driving task differed from all conditions with secondary tasks inducing an additional cognitive workload: 0-back, 1-back, and 2-back. Pupil size in the baseline condition was significantly smaller than in any other task condition (p < 0.001); it was 5.28% smaller than its average size during the entire experimental run. Increasing the cognitive workload during auditory–verbal–vocal processing increased the diameter of the pupil across all three levels. In the 2-back condition, the pupil size was larger than the average value in all other conditions. This relationship is illustrated in Fig. 7.
Fig. 7. Influence of the different levels of n-back task on the average percentage change in the diameter of the pupil.
3.4. Blink rates
Blink rate is a highly controversial measurement that can indicate cognitive workload only in certain tasks that do not interfere with visual search. Even though the primary driving task requires the driver to keep his or her eyes on the road to identify potential hazards, we were able to find different blink rates when the operator had to perform a delayed digit recall task. A repeated-measures ANOVA showed the main effect of the task on blink rates across baseline driving and n-back levels, F(2.63, 118.16) = 2.96; p < 0.05; η² = 0.06. In the next step, we compared the simple effects; however, we did not find significant differences between pairs of conditions. Only the baseline and 2-back conditions showed a trend (p = 0.08) suggesting that an increased blink rate can be related to an increased auditory cognitive workload (see Fig. 8).
Fig. 8. Influence of the different levels of n-back task on the blink rate.
3.5. Driving performance analysis
None of the performance variables measured in this experiment showed an effect of the different tasks. We analyzed the lateral and longitudinal control of the vehicle. The mean speed and standard deviation of speed could not discriminate between different levels of n-back tasks. Moreover, the standard deviation of steering wheel angle and the steering reversal rate did not differ significantly between workload levels. It should be noted that we did not analyze more sophisticated lateral measures, such as the high-frequency component of steering angle, or, owing to the nature of the driving task, lane performance indicators. These measurements can be, in some experimental conditions, valid indicators of cognitive workload (Minin et al., 2012).
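The F statistics reported throughout this section come from one-way repeated-measures ANOVA. A minimal pure-Python sketch of that computation is given below, under stated assumptions: it applies no sphericity correction (the Huynh–Feldt adjustment used in the paper is not implemented), it reports partial eta-squared, which may differ from the η² the authors computed, and the function name `rm_anova` and the data are hypothetical.

```python
def rm_anova(data):
    """One-way repeated-measures ANOVA without sphericity correction.

    `data` is a list of rows, one per participant, each row holding one
    score per condition. Returns (F, df_conditions, df_error,
    partial_eta_squared).
    """
    n = len(data)        # number of participants
    k = len(data[0])     # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)

    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Partition the total sum of squares
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    partial_eta_sq = ss_cond / (ss_cond + ss_error)
    return f_value, df_cond, df_error, partial_eta_sq

# Hypothetical scores: 4 participants x 3 conditions (made-up values)
scores = [
    [10, 12, 15],
    [11, 14, 15],
    [9, 12, 17],
    [12, 13, 16],
]

f_value, df1, df2, eta = rm_anova(scores)
print(f"F({df1}, {df2}) = {f_value:.2f}, partial eta-squared = {eta:.2f}")
```

Removing the between-participant sum of squares from the error term is what distinguishes this design from a between-groups ANOVA: stable individual differences, such as a habitually high blink rate, do not inflate the error.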