Younger and older users' recognition of virtual agent facial expressions
Article code | Publication year | English article page count
---|---|---
37804 | 2015 | PDF pages

Publisher: Elsevier - Science Direct
Journal: International Journal of Human-Computer Studies, Volume 75, March 2015, Pages 1–20
Abstract
As technology advances, robots and virtual agents will be introduced into the home and healthcare settings to assist individuals, both young and old, with everyday living tasks. Understanding how users recognize an agent's social cues is therefore imperative, especially in social interactions. Facial expression, in particular, is one of the most common non-verbal cues used to display and communicate emotion in on-screen agents (Cassell et al., 2000). Age is important to consider because age-related differences in emotion recognition of human facial expression have been supported (Ruffman et al., 2008), with older adults showing a deficit for recognition of negative facial expressions. Previous work has shown that younger adults can effectively recognize facial emotions displayed by agents (Bartneck and Reichenbach, 2005, Courgeon et al., 2009, Courgeon et al., 2011 and Breazeal, 2003); however, little research has compared in depth younger and older adults' ability to label a virtual agent's facial emotions, an important consideration because social agents will be required to interact with users of varying ages. If such age-related differences exist for recognition of virtual agent facial expressions, we aim to understand whether those differences are influenced by the intensity of the emotion, the dynamic formation of the emotion (i.e., a neutral expression developing into an expression of emotion through motion), or the type of virtual character, which differs in human-likeness. Study 1 investigated the relationship between age-related differences, the implication of dynamic formation of emotion, and the role of emotion intensity in recognition of the facial expressions of a virtual agent (iCat). Study 2 examined age-related differences in recognition of emotions expressed by three types of virtual characters differing in human-likeness (non-humanoid iCat, synthetic human, and human). Study 2 also investigated the role of configural and featural processing as a possible explanation for age-related differences in emotion recognition. First, our findings show age-related differences in the recognition of emotions expressed by a virtual agent, with older adults showing lower recognition for the emotions of anger, disgust, fear, happiness, sadness, and neutral. These age-related differences might be explained by older adults having difficulty discriminating similarity in the configural arrangement of facial features for certain emotions; for example, older adults often mislabeled fear as the similar emotion surprise. Second, our results did not provide evidence that dynamic formation improves emotion recognition; however, in general, greater emotion intensity improved recognition. Lastly, we learned that emotion recognition, for both older and younger adults, differed by character type, from best to worst: human, synthetic human, and then iCat. Our findings provide guidance for design, as well as the development of a framework of age-related differences in emotion recognition.
Introduction
People have been fascinated with the concept of intelligent agents for decades. Science fiction machines, such as Rosie from the Jetsons or C3PO from Star Wars, are idealized representations of advanced forms of technology coexisting with humans. Recent research and technology advancements have shed light on the possibility of intelligent machines becoming a part of everyday living and socially interacting with human users. As such, understanding fluid and natural social interaction should not be limited to the study of human–human interaction. Social interaction is also involved when humans interact with an agent, such as a robot or an animated software agent (e.g., a virtual agent).

The label "agent" is widely used and there is no agreed-upon definition. Robots and virtual agents can both be broadly categorized as agents; however, there is differentiation between the terms. A robot is a physical computational agent (Murphy, 2000 and Sheridan, 1992). A virtual agent does not have physical properties; rather, it is embodied as a computerized 2D or 3D software representation (Russell and Norvig, 2003). Whether an agent is robotic or virtual, it can be broadly defined as a hardware or software computational system that may have autonomous, proactive, reactive, and social ability (Wooldridge and Jennings, 1995). The social ability of the agent may be further defined as social interaction with either other agents or people.

It is generally accepted that people are willing to apply social characteristics to technology. Humans have been shown to apply social characteristics to computers, even though users admit they believe these technologies do not possess actual human-like emotions, characteristics, or "selves" (Nass et al., 1994). Humans have been shown to mindlessly exhibit social behaviors toward computers (Nass and Moon, 2000), as well as to treat computers as teammates with personalities, similar to human–human interaction (Nass et al., 1995 and Nass et al., 1996).

How are social cues communicated? Facial expressions are one of the most important media through which humans communicate emotional state (Collier, 1985), and a critical component of successful social interaction. Similarly, facial expression is one of the most common non-verbal cues used to display emotion in on-screen agents (Cassell et al., 2000). Humans learn and remember hundreds (if not thousands) of faces throughout a lifetime. Face processing may be special for humans and primates due to the social importance placed on facial expressions. Emotional facial expressions may be defined as configurations of facial features that represent discrete states recognizable across cultures and norms (Ekman and Friesen, 1975). These discrete states, as proposed by Ekman and Friesen, are often referred to as 'basic emotions.' Research on emotions has evolved over the decades, with the exact number and definition of basic emotions debated. In later research, Ekman considered as many as 28 emotions to have some or all of the criteria for being considered basic (Ekman and Cordaro, 2011). Nonetheless, six of these emotions (anger, disgust, fear, happiness, sadness, and surprise) have been studied in detail (Ekman and Friesen, 2003, Calder et al., 2003 and Sullivan and Ruffman, 2004). Internationally standardized photograph sets (Beaupre and Hess, 2005 and Ekman and Friesen, 1978) make it possible to compare results across studies for this set of emotions.
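To make the idea of "emotions as configurations of facial features" concrete, the sketch below represents each basic emotion as a set of coarse feature parameters. This is a minimal illustration only; the feature names and values are assumptions for this sketch and are not drawn from Ekman and Friesen's coding scheme or from the studies reported here.

```python
# Minimal sketch: discrete emotions as configurations of facial feature
# parameters. Feature names and values are illustrative assumptions, not
# FACS action units or data from the studies described in this article.
from dataclasses import dataclass

@dataclass(frozen=True)
class FaceConfig:
    brow_raise: float   # 0 = relaxed, 1 = fully raised
    brow_furrow: float  # 0 = relaxed, 1 = fully furrowed
    eye_open: float     # 0 = closed, 1 = wide open
    mouth_open: float   # 0 = closed, 1 = wide open
    mouth_curve: float  # -1 = strong frown, +1 = strong smile

BASIC_EMOTIONS = {
    "happiness": FaceConfig(0.3, 0.0, 0.6, 0.4, +0.9),
    "sadness":   FaceConfig(0.2, 0.4, 0.4, 0.1, -0.7),
    "anger":     FaceConfig(0.0, 0.9, 0.7, 0.3, -0.5),
    "fear":      FaceConfig(0.9, 0.3, 1.0, 0.5, -0.2),
    "surprise":  FaceConfig(1.0, 0.0, 1.0, 0.8,  0.0),
    "disgust":   FaceConfig(0.1, 0.6, 0.4, 0.2, -0.4),
}

# Note how fear and surprise share raised brows and wide-open eyes; this is
# the kind of configural similarity that can make the two easy to confuse.
```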
An agent may make use of facial expressions to facilitate social interaction and communication and to express emotional state, requiring the user to interpret its facial expressions. To facilitate social interaction, a virtual agent will need to demonstrate emotional facial expression effectively to depict its intended message. Emotion is thought to create a sense of believability by allowing the viewer to assume that a social agent is capable of caring about its surroundings (Bates, 1994) and to create a more enjoyable interaction (Bartneck, 2003). The ability of an agent to express emotion may play a role in the development of intelligent technology. Picard (1997) stressed that emotion is a critical component and active part of intelligence. More specifically, Picard stated that "computers do not need affective abilities for the fanciful goal of becoming humanoids; they need them for a meeker and more practical goal: to function with intelligence and sensitivity toward humans" (p. 247). Social cues, such as emotional facial expression, are not only critical in creating intelligent agents that are sensitive and reactive toward humans, but will also affect the way in which people respond to the agent. The impetus of this research was to investigate how basic emotions may be displayed by virtual agents so that they are appropriately interpreted by humans.

1.1. Emotion expression of virtual and robotic agents

Many agents, developed for a variety of applications, have been designed to express emotion. Virtual agents, such as eBay's chatbots 'Louise and Emma,' have been used to provide users with instructional assistance when using a web-based user interface. Video game users both young and old engage in virtual worlds, interacting with other agents or avatars in a social manner. Previous research has shown that participants' recognition of facial emotion of robotic characters and virtual agents is similar (Bartneck et al., 2004), and commercial robot toys, such as Tiger Electronics and Hasbro's Furby, have been designed to express emotive behavior and social cues. The development of agents with social capabilities affords future applications in social environments, requiring collaborative interaction with humans (Breazeal, 2002 and Breazeal et al., 2004). In fact, a growing trend in intelligent agent research is the development of socially engaging agents that may serve as assistants in home or healthcare settings (Broekens et al., 2009 and Dautenhahn et al., 2005). Assistive agents are expected to interact with users of all ages; however, the development of assistive intelligent technology holds particular promise for increasing the quality of life of older adults. Previous research suggests that older adults are willing to consider having robotic agents in their homes (Ezer, 2008 and Ezer et al., 2009). For example, one such home/healthcare robot, known as Pearl, included a reminder system, a telecommunication system, a surveillance system, the ability to provide social interaction, and a range of movements to complete daily household tasks (Davenport, 2005). Many agent applications may require some level of social interaction with the user. The development of social agents has a long academic history, with a number of computational systems developed to generate agent facial emotion.
Some of these generation systems (e.g., Breazeal, 2003 and Fabri et al., 2004) have been modeled on psychologically based models of emotional facial expression, such as FACS (Ekman and Friesen, 1978) or the circumplex model of affect (Russell, 1980). Another approach has focused on making use of animation principles (e.g., exaggeration, slow in/out, arcs, timing; Lasseter, 1987) to depict emotional facial expression (e.g., Becker-Asano and Wachsmuth, 2010) and other non-verbal social cues (Takayama et al., 2011). Emotion generation systems are generally developed with an emphasis on creating "believable" agent emotive expression. Although some previous research has based the generation of agent facial expression on the human emotion expression literature (i.e., how humans demonstrate emotion), the design of virtual agents can be further informed by the literature on human emotion recognition (i.e., how humans recognize and label emotion).

Can agents be designed to express facial expression effectively? Studies investigating humans' recognition of agent emotional facial expressions have suggested that agent faces are recognized categorically (Bartneck and Reichenbach, 2005 and Bartneck et al., 2004) and can be labeled effectively with a limited number of facial cues (Fabri et al., 2004). As previously mentioned, emotion recognition is integral to humans' everyday activities; however, humans' ability to recognize agent expressions likely depends on many factors. For example, wrinkles, facial angle, and gaze are all design factors known to affect recognition and perceptions of expressivity for virtual agents (Courgeon et al., 2009, Courgeon et al., 2011 and Lance and Marsella, 2010). Given that assistive agents are likely to interact with a range of users, other factors such as the user's age, perceptual capability, and the intensity of the agent's emotion must be considered as well.

1.2. Understanding human facial expressions of emotion

How humans interpret other humans' facial expressions may provide some insight into how they might recognize agent facial expressions. Due to the importance of emotion recognition in social interaction, it is not surprising that considerable research has investigated how accurately people recognize human expression of the six basic emotions. The literature suggests that emotion recognition is largely dependent on four factors: age, facial features, motion, and intensity.

1.2.1. Age-related differences in emotion recognition

Age brings about many changes, such as cognitive, perceptual, and physical declines. Such declines can potentially be eased or mitigated by socially assistive agents and technological interventions (Beer et al., 2012 and Broadbent et al., 2009). However, an open question remains: how well can older adults recognize the emotions of agents? It is easy to imagine social virtual and robotic agents assisting older adults with household tasks, therapy, shopping, and gaming. The social effectiveness of these agents will depend on whether the older adult recognizes the expressions the agent is displaying. The literature suggests that the ability to recognize emotional facial expressions differs across adulthood, with many studies investigating differences between younger adults, typically aged 18–30, and older adults, typically aged 65 and older. Isaacowitz et al. (2007) summarized the research in this area conducted within the past 15 years.
Their summary (tabulating the percentages of studies with significant age group differences) found that, of the reviewed studies, 83% showed an age-related decrement for the identification of anger, 71% for sadness, and 55% for fear. No consistent differences were found between age groups for the facial expressions of happiness, surprise, and disgust. These trends in age group differences were also reported in a recent meta-analysis (Ruffman et al., 2008). Older and younger adults' differences in labeling emotional facial expressions appear to be relatively independent of the cognitive changes that occur with age (Keightley et al., 2006 and Sullivan and Ruffman, 2004).

Theories of emotional-motivational changes with age posit that shifts in emotional goals and strategies occur across adulthood. One such motivational explanation, socioemotional selectivity theory, suggests that time horizons influence goals (Carstensen et al., 1999), resulting in many outcomes, including a positivity effect. That is, as older adults near the end of life, their goals shift so that they are biased to attend to and remember positive emotional information compared to negative information (Carstensen and Mikels, 2005, Mather and Carstensen, 2003 and Mather and Carstensen, 2005). Other related motivational accounts suggest that emotional-motivational changes are a result of compensatory strategies to adapt to age-related declines in resources (e.g., Labouvie-Vief, 2003) or of emotion-regulation strategies becoming less effortful with age (e.g., Scheibe and Blanchard-Fields, 2009). The positivity effect has been suggested as a possible explanation for age-related decrements in recognition of negative emotions such as anger, fear, and sadness (Ruffman et al., 2008 and Williams et al., 2006); however, it does not explain why older adults have sometimes been shown to recognize disgust just as well as, if not better than, younger adults (Calder et al., 2003). Note that the literature generally indicates that older adults show a deficit in recognition of negative emotions; however, at least one study has examined whether these differences persist when the cognitive demands of the task are minimized (Mienaltowski et al., 2013).

1.2.2. Configural and featural processing of human facial expressions of emotion

Early studies provided evidence that faces are processed holistically, meaning that faces are more easily recognized when presented as a whole rather than as isolated parts or features (Tanaka and Farah, 1993 and Neath and Itier, 2014). Additional work by Farah et al. (1995) showed that inverted faces were harder to recognize than inverted objects, suggesting that the spatial relations among features are important in face recognition. Similarly, the recognition of facial emotions is influenced by both configural and featural processing of human facial expressions (McKelvie, 1995). The arrangement of the features of the face (e.g., mouth, eyebrows, eyelids) influences, to some degree, both holistic processing of the face and processing of its individual features (McKelvie, 1995). As such, facial features may be a critical factor in processing an emotional facial expression (see Frischen et al., 2008 for a summary).
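As a rough illustration of the distinction between featural and configural information, the sketch below represents a face both as a set of individual feature descriptors and as the spatial relations among facial landmarks. The landmark names and the use of pairwise distances are simplifying assumptions for this sketch, not the stimuli or measures used in the studies cited above.

```python
# Illustrative sketch (assumed, simplified representation): featural vs.
# configural information about a face. Landmark names and the pairwise
# distance measure are hypothetical, not taken from the studies above.
from itertools import combinations
from math import dist

# Featural information: properties of individual features in isolation.
features = {
    "left_eye_openness": 0.8,
    "right_eye_openness": 0.8,
    "mouth_curvature": -0.4,
    "brow_furrow": 0.7,
}

# Landmark positions (x, y) in normalized face coordinates.
landmarks = {
    "left_eye": (0.35, 0.40),
    "right_eye": (0.65, 0.40),
    "nose_tip": (0.50, 0.55),
    "mouth_center": (0.50, 0.75),
}

# Configural information: spatial relations among features, captured here
# as pairwise distances between landmarks. Inverting or rearranging the
# face alters these relations even when each individual feature is intact.
configural = {
    (a, b): dist(landmarks[a], landmarks[b])
    for a, b in combinations(landmarks, 2)
}

print(features["brow_furrow"])                    # a featural cue
print(configural[("left_eye", "right_eye")])      # a configural cue
```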
Despite the support for holistic processing of faces, some posit that age-related differences in emotion recognition may be explained by the way in which older and younger adults perceive or attend to the individual facial components (e.g., facial features) that convey an expression. In other words, some suggest that age-related differences in emotion recognition might be attributed to biological changes in the perceptual systems and visuospatial abilities involved in processing facial features (Calder et al., 2003, Phillips et al., 2002, Ruffman et al., 2008 and Suzuki et al., 2007). For example, older adults may focus their attention on the mouth region of the face rather than the eye region (Sullivan et al., 2007). One explanation for this bias is that the mouth is less threatening than the eyes; because negative facial emotions are generally assumed to be more distinguishable from changes around the eyes, attending to the mouth region may result in less accurate identification of negative emotions (Calder et al., 2000).

1.2.3. Intensity and motion of human facial expressions of emotion

While most of the research discussed thus far concerns emotion recognition of static facial expressions, in everyday interactions emotions are depicted dynamically, with faces transitioning between emotions and various intensities of emotion through motion. Although emotional facial expressions vary in intensity in day-to-day living, they are usually subtle (Ekman, 2003). Investigating people's recognition of facial emotion across all intensities (i.e., subtle to intense) provides a better understanding of how people process and interpret emotional facial expressions in everyday interactions. For example, understanding at what level of intensity a facial emotion should be expressed to support accurate recognition is an important consideration for both human–human and human–agent interaction. For both virtual agents (Bartneck and Reichenbach, 2005 and Bartneck et al., 2004) and human faces (Etcoff and Magee, 1992), recognition of emotional facial expressions improves as the intensity of the emotion increases, but in a curvilinear fashion; the relationship between recognition and intensity is not 1:1. Typically, once a facial emotion reaches some threshold of intensity, recognition reaches a ceiling, suggesting categorical perception of facial expressions (Bartneck and Reichenbach, 2005 and Etcoff and Magee, 1992). However, it is unclear how the intensity of facial expressions displayed by a virtual agent may influence age-related differences in emotion recognition, and whether the threshold for emotion recognition differs across age groups.
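The curvilinear, ceiling-limited relationship between intensity and recognition described above can be sketched with a simple saturating function. The functional form and parameter values below are assumptions chosen purely for illustration, not a model fitted to the data from these studies.

```python
# Illustrative sketch (assumed functional form and parameters): recognition
# accuracy rising with emotion intensity in a curvilinear way and reaching a
# ceiling past some threshold, rather than increasing 1:1 with intensity.
import math

def recognition_accuracy(intensity, threshold=0.4, steepness=12.0,
                         floor=1 / 6, ceiling=0.95):
    """Hypothetical accuracy for an emotion shown at `intensity` in [0, 1].

    `floor` is chance level for a six-alternative labeling task; `ceiling`
    caps accuracy once the expression is intense enough to be categorized.
    """
    logistic = 1.0 / (1.0 + math.exp(-steepness * (intensity - threshold)))
    return floor + (ceiling - floor) * logistic

# Print the assumed accuracy curve from subtle (0.0) to extreme (1.0).
for i in range(0, 11):
    intensity = i / 10
    print(f"intensity {intensity:.1f} -> accuracy {recognition_accuracy(intensity):.2f}")
```

Under these assumptions, accuracy stays near chance for very subtle expressions, rises steeply around the threshold, and flattens near the ceiling; whether that threshold differs by age group is exactly the open question raised above.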
Not only are emotional facial expressions displayed at varying intensities, but in everyday interaction the expression of emotion also changes dynamically, both in intensity and from one emotion to another. Thus, the formation of an emotional facial expression may contain information beyond that available in static pictures (Bould et al., 2008). Seeing facial emotions in motion (i.e., dynamic formation) may facilitate recognition of facial expressions at less intense levels. Bould and Morris (2008) found that younger adults had higher recognition of emotions when viewing the dynamic formation of emotions as opposed to seeing a single static picture or several static pictures in sequence (a multi-static condition). The multi-static condition contained the same number of frames as the dynamic condition but had a mask between each frame to remove the perception of motion. Their findings, as well as others, suggest that some aspect of motion aids emotion recognition more than viewing the same number of static frames in sequence (Ambadar et al., 2005 and Bould and Morris, 2008).

To date, research investigating older adults' recognition of dynamic emotions is largely lacking. Adding motion information may help individuals, even older adults, to recognize emotional facial expressions at more subtle intensities. In particular, older adults have been shown in laboratory studies to have difficulty identifying anger, sadness, and fear from static pictures of human faces (Ruffman et al., 2008). The dynamic formation of facial emotion is more reflective of how emotional expressions form in daily life, with which older adults are more familiar than with the static faces used in the laboratory. Motion could provide additional information, potentially making facial emotions less ambiguous.

1.3. Goal of current research

As outlined in the literature review, emotional facial expression is a critical component of successful social interaction, and thus will be critical in designing agents with social ability. Although research has investigated the role of social cues in human–agent interaction (Bartneck and Reichenbach, 2005, Becker-Asano and Wachsmuth, 2010, Breazeal, 2003 and Fabri et al., 2004), little research has compared in depth younger and older adults' ability to label the facial emotion displayed by a social agent, an important consideration because socially assistive agents will be required to interact with users of varying ages. In contrast to previous work, the goal of the current research is to understand age-related differences in emotion recognition; we believe it is critical to understand how people of all ages interpret the social cues an agent displays, particularly facial expressions of emotion. Overall, age-related differences in emotion recognition of human facial expressions have been supported (Ruffman et al., 2008). If such age-related differences exist for recognition of virtual agent facial expressions, we aim to understand whether those differences are related to the intensity of the emotion, the dynamic formation of the emotion, or the character type. The literature suggests that emotion expression is an important component of communication in all cultures, particularly the expression and recognition of the six basic emotions (Ekman and Friesen, 1975 and Ekman and Friesen, 2003). Although we recognize that humans express and recognize hundreds of expressions, we focus on the six basic emotions that have been examined previously in detail; they also represent the emotions for which age-related differences have been most prevalently found. Even if people do apply social characteristics to technology in the same way in which they do to other humans (e.g., Nass et al., 1996 and Nass and Moon, 2000), it is unknown whether certain variables (i.e., intensity, dynamic formation, and character type) affect the recognition of facial expressions displayed by a virtual agent, and, importantly, to what degree the effects of these variables differ across age groups. Specifically, to better understand age-related differences in facial emotion recognition, our research aimed to:

• Assess age-related differences in emotion recognition of virtual agent emotional facial expressions. Furthermore, we assessed misattributions older and younger adults made when identifying agent emotions to better understand the nature of their perceptions [Study 1 and Study 2] (see the illustrative sketch after this list).
• Consider the implication of dynamic formation of emotional facial expressions (i.e., motion), with further consideration of possible age-related differences in emotion recognition [Study 1].
• Examine the role of facial emotion intensity in emotion recognition, with emotions ranging from subtle to extreme, and whether emotions of varying intensity increase or decrease age-related differences [Study 1].
• Investigate age-related differences in emotion recognition of facial expressions displayed by a variety of agent characters. That is, the agent characters ranged in human-like features and facial arrangement, so that we may better understand the role of configural and featural processing in emotion recognition [Study 2].
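One common way to quantify the misattributions mentioned in the first aim is a per-age-group confusion matrix of intended versus selected emotion labels. The sketch below is a generic illustration under assumed data structures; the variable names and toy responses are hypothetical and are not the data or analysis code from these studies.

```python
# Illustrative sketch (assumed toy data, not from the studies above): building
# per-age-group confusion counts of displayed vs. chosen emotion labels,
# one way to summarize misattributions such as fear -> surprise.
from collections import Counter, defaultdict

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"]

# Each trial: (age_group, emotion_displayed_by_agent, label_chosen_by_user).
trials = [
    ("older", "fear", "surprise"),
    ("older", "fear", "fear"),
    ("older", "happiness", "happiness"),
    ("younger", "fear", "fear"),
    ("younger", "sadness", "sadness"),
    ("older", "disgust", "anger"),
]

# confusion[group][(displayed, chosen)] -> count of trials
confusion = defaultdict(Counter)
for group, displayed, chosen in trials:
    confusion[group][(displayed, chosen)] += 1

def recognition_rate(group, emotion):
    """Proportion of trials for `emotion` that `group` labeled correctly."""
    counts = confusion[group]
    total = sum(n for (disp, _), n in counts.items() if disp == emotion)
    return counts[(emotion, emotion)] / total if total else float("nan")

print(recognition_rate("older", "fear"))          # 0.5 in this toy data
print(confusion["older"][("fear", "surprise")])   # 1 fear->surprise misattribution
```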